Join Jeff Schoolcraft, Senior Architect and TDD Evangelist at RGII Technologies, for an engaging session on regular expressions. This talk covers the core concepts of regex, offering both theoretical foundations and practical applications. Expect insights into effective usage, best practices, and hands-on demonstrations. Learn how regular expressions can enhance your data validation, searching, and string manipulation tasks, while avoiding common pitfalls. The session concludes with a Q&A segment to clarify your queries about this powerful tool.
Regular Expressions: Theory and Practice • Jeff Schoolcraft • MDCFUG 12/13/2005
Who am I? • Jeff Schoolcraft • Senior Architect / Operations Manager at RGII Technologies. • Speaker at Usergroups • President WinProTeam Vienna Usergroup (.NET) • TDD Evangelist • Tool guy
What can you expect? • “The gist” in 60 seconds or less. • Theory • Practical Usage • Best Practices • Hands On • A sermon • Q & A
The Gist • Regular Expressions (regex) describe patterns in strings and are often used for data validation, searching and text transformations.
Theory • Basics: A regular expression, often called a pattern, is an expression that describes a set of strings without actually listing its elements. • Say what? The set of strings {“dog”, “bog”, “fog”} can be described by this regular expression (regex): [bdf]og
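To make the [bdf]og example concrete, here is a minimal sketch using Python's standard `re` module. The talk names no language, so Python is an assumption, and the helper name `in_language` and the candidate word list are made up for illustration.

```python
import re

# The character class [bdf] matches exactly one of b, d, or f,
# so the pattern describes the set {"dog", "bog", "fog"}.
PATTERN = re.compile(r"[bdf]og")

def in_language(word):
    """Return True if the whole word is described by the pattern."""
    return PATTERN.fullmatch(word) is not None

# Hypothetical candidates: three members of the set and two non-members.
candidates = ["dog", "bog", "fog", "log", "dogs"]
matches = [w for w in candidates if in_language(w)]
print(matches)  # ['dog', 'bog', 'fog']
```

`fullmatch` is used rather than `search` so that "dogs" is rejected: the pattern has to describe the entire string, not just part of it.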
Formal Language Theory • A regular language is any language whose complete set of strings can be described by a regular expression.
Formal Language Theory (cont’d) • Regular expressions consist of constants and operators that denote sets of strings and operations over these sets, respectively. Given a finite alphabet Σ the following constants are defined: • (empty set) ∅ denoting the set ∅ • (empty string) ε denoting the set {ε} • (literal character) a in Σ denoting the set {a} • and the following operations: • (concatenation) RS denoting the set { αβ | α in R and β in S }. For example {"ab", "c"}{"d", "ef"} = {"abd", "abef", "cd", "cef"}. • (alternation) R|S denoting the set union of R and S. • (Kleene star) R* denoting the smallest superset of R that contains ε and is closed under string concatenation. This is the set of all strings that can be made by concatenating zero or more strings in R. For example, {"ab", "c"}* = {ε, "ab", "c", "abab", "abc", "cab", "cc", "ababab", ... }. • Source: http://en.wikipedia.org/wiki/Regular_expression
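The three operations can be tried out directly on small sets of strings. The sketch below is an illustration in Python, not part of the talk. The function names are made up, and because R* is infinite, `kleene_star` only builds a finite approximation capped at `max_repeats` concatenations (an assumption needed to make it runnable).

```python
from itertools import product

def concat(r, s):
    """Concatenation RS: every string of R followed by every string of S."""
    return {a + b for a, b in product(r, s)}

def alternate(r, s):
    """Alternation R|S: the set union of R and S."""
    return r | s

def kleene_star(r, max_repeats):
    """Finite slice of R*: strings made by concatenating 0..max_repeats strings of R."""
    result = {""}   # zero repetitions gives the empty string (epsilon)
    layer = {""}
    for _ in range(max_repeats):
        layer = concat(layer, r)   # one more string of R appended
        result |= layer
    return result

R = {"ab", "c"}
S = {"d", "ef"}

print(sorted(concat(R, S)))      # ['abd', 'abef', 'cd', 'cef']
print(sorted(alternate(R, S)))   # ['ab', 'c', 'd', 'ef']
print(sorted(kleene_star(R, 2), key=len))
```

With `max_repeats=2` the last call yields the empty string plus "ab", "c", "abab", "abc", "cab" and "cc", which are the first members listed in the Kleene star example above.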
DEMO • All binary numbers • All binary numbers that start and end in 1 • All binary numbers that have 00, then any other bits, followed by 111 and any other bits.
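The patterns shown live in the demo are not recorded in the slides, so the following is one possible set of answers, written in Python as a sketch. Three assumptions are baked in: a binary number has at least one digit, the single digit "1" counts as starting and ending in 1, and the third exercise means "contains 00 somewhere, and a 111 somewhere after it".

```python
import re

# All binary numbers: one or more of the digits 0 and 1.
ALL_BINARY = re.compile(r"[01]+")

# Starts and ends in 1: either a lone 1, or 1, any bits, then 1.
STARTS_ENDS_1 = re.compile(r"1([01]*1)?")

# Contains 00, then any bits, then 111, then any bits
# (leading bits before the 00 are allowed under the assumed reading).
HAS_00_THEN_111 = re.compile(r"[01]*00[01]*111[01]*")

def matches(pattern, text):
    """True only if the pattern describes the entire string."""
    return pattern.fullmatch(text) is not None

print(matches(ALL_BINARY, "0101"))          # True
print(matches(ALL_BINARY, "012"))           # False
print(matches(STARTS_ENDS_1, "101"))        # True
print(matches(STARTS_ENDS_1, "10"))         # False
print(matches(HAS_00_THEN_111, "1001011110"))  # True
print(matches(HAS_00_THEN_111, "11100"))    # False: the 111 comes before the 00
```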
Practical Usage • In order of popularity: • Searching: much nicer than * or % • String Manipulation: parsing, replacement • Validation: input validation, database check constraints
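Each of these uses maps onto a standard call in most regex libraries. The sketch below uses Python's `re` module as an example. The sample log line, the ZIP code rule and the function name `is_zip` are invented for illustration and do not come from the talk.

```python
import re

# Hypothetical input text for the examples.
log = "2005-12-13 ERROR disk full; 2005-12-14 INFO backup ok"

# Searching: find every date-shaped substring.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", log)

# Parsing: pull named fields out of the first entry.
entry = re.search(r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+)", log)

# Replacement: rewrite YYYY-MM-DD as MM/DD/YYYY using backreferences.
us_style = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1", log)

# Validation: the whole input must be a 5-digit ZIP, optionally with +4.
def is_zip(text):
    return re.fullmatch(r"\d{5}(-\d{4})?", text) is not None

print(dates)                 # ['2005-12-13', '2005-12-14']
print(entry.group("level"))  # ERROR
print(us_style)              # 12/13/2005 ERROR disk full; 12/14/2005 INFO backup ok
print(is_zip("22180"), is_zip("2218"))  # True False
```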
Best Practices • The most important thing to remember: • Regular expression quantifiers are greedy by default • Make the most explicit match possible • Just because some implementations allow the lazy quantifier *?, don’t fall back on that.
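A small illustration of the greedy point, sketched in Python (one of the implementations that supports lazy quantifiers). The sample markup is made up. The three patterns compare a greedy match, the lazy variant, and the explicit negated character class that the advice above favours.

```python
import re

# Hypothetical input: find each tag in a fragment of markup.
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as it can, so the match runs from the
# first "<" all the way to the last ">".
greedy = re.findall(r"<.+>", html)

# Lazy: +? stops at the first ">" it can, which works here but
# relies on implementation support and on backtracking.
lazy = re.findall(r"<.+?>", html)

# Explicit: say exactly what is allowed inside a tag (anything but ">").
explicit = re.findall(r"<[^>]+>", html)

print(greedy)    # ['<b>bold</b> and <i>italic</i>']
print(lazy)      # ['<b>', '</b>', '<i>', '</i>']
print(explicit)  # ['<b>', '</b>', '<i>', '</i>']
```

The explicit version gives the same result as the lazy one without depending on the `*?` / `+?` extension, and it states the intent in the pattern itself.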
A Sermon • Some people develop a religious fascination with new tools & technologies (design patterns, regex, whatever). • Use the tools that make the most sense for your problem/solution.
Email Validation with REGEX? • Are you kidding me? • See email.regex • Multi-tiered approach: a regex to test the format, then some code to test the validity of the email address.
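A rough sketch of the multi-tiered idea in Python. This is not the pattern from email.regex, which is not reproduced in the slides. The loose format pattern, the function names, and the `KNOWN_DOMAINS` stand-in are all assumptions. In practice the second tier might be a DNS lookup or a confirmation message rather than a fixed set.

```python
import re

# Tier 1 (assumed pattern): something@something.something, no spaces,
# exactly one "@". Deliberately loose; it only screens the format.
FORMAT = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def looks_like_email(address):
    """Cheap format check using the regex."""
    return FORMAT.fullmatch(address) is not None

def is_valid_email(address, domain_check):
    """Run the format check first, then hand the domain to ordinary code."""
    if not looks_like_email(address):
        return False
    domain = address.rsplit("@", 1)[1]
    return domain_check(domain)

# Tier 2 stand-in (hypothetical): a real check would ask the network.
KNOWN_DOMAINS = {"thequeue.net"}

def known_domain(domain):
    return domain.lower() in KNOWN_DOMAINS

print(is_valid_email("jeff@thequeue.net", known_domain))     # True
print(is_valid_email("jeff@example.invalid", known_domain))  # False
print(is_valid_email("not an email", known_domain))          # False
```

Passing the second tier in as a function keeps the regex small and lets the expensive or environment-specific check be swapped out or skipped.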
Further Resources • Mastering Regular Expressions: http://www.oreilly.com/catalog/regex/ • A website (there are 1000s on Google): http://www.regular-expressions.info/ • Me: http://thequeue.net/blog/ • http://regexadvice.com/blogs/jschoolcraft/ • jeff@thequeue.net
Tools • The Regulator (http://regex.osherove.com/) • Expresso (http://www.ultrapico.com/Expresso.htm)