Greedy vs Lazy in Regular Expressions

A greedy quantifier (such as .+) consumes as many characters as it can, then backtracks only as far as needed for the rest of the pattern to match.
A lazy quantifier (such as .+?) consumes as few characters as it can, then expands one character at a time until the rest of the pattern matches.

irb(main):037:0> "<EM>first</EM>"[/<.+>/] # greedy
=> "<EM>first</EM>"
irb(main):038:0> "<EM>first</EM>"[/<.+?>/] #lazy
=> "<EM>"
irb(main):039:0> "<EM>first</EM>"[/<[^<>]+>/] # better solution
=> "<EM>"
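
The difference is easier to see when more than one tag sits on the same line. The following is a minimal sketch, not part of the original snippet; the helper name first_tag_matches and the sample string are made up for illustration.

```ruby
# Hypothetical helper: runs the three patterns from the snippet
# against the same text and returns the first match of each.
def first_tag_matches(text)
  {
    greedy:  text[/<.+>/],      # .+ runs to the end of the line, then backs up to the last '>'
    lazy:    text[/<.+?>/],     # .+? grows one character at a time, stops at the first '>'
    negated: text[/<[^<>]+>/]   # [^<>]+ can never cross an angle bracket
  }
end

sample = "<EM>first</EM> and <B>second</B>"   # made-up input
result = first_tag_matches(sample)

p result[:greedy]            # => "<EM>first</EM> and <B>second</B>"
p result[:lazy]              # => "<EM>"
p result[:negated]           # => "<EM>"
p sample.scan(/<[^<>]+>/)    # => ["<EM>", "</EM>", "<B>", "</B>"]
```

On well-formed input the lazy and negated-class patterns agree; only the greedy one swallows everything between the first '<' and the last '>'.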

source: Regular Expression Quick Start [regular-expressions.info]

Comments on this post

rmills posts on Feb 04, 2008 at 09:20
Can you explain further why you consider the 3rd option a better solution? I'm a bit unclear about it.

Thanks!
jrobertson posts on Feb 04, 2008 at 10:33
re:rmills

In short it's just a bit more specific.

1) it's easier for the trained eye to read: within the first 3 characters of the pattern it's clear that the match shouldn't proceed past either '<' or '>'.
2) it's more *specific* about what's valid. It doesn't use the dot character, which as you may know is a wildcard, and that could lead to less reliable results as almost any character would be treated as valid.

eg. (trying out data which the regex might not expect)
irb(main):042:0> "<EM</EM>"[/<.+?>/] # this returns a result with 3 angle brackets
=> "<EM</EM>"
irb(main):046:0> "<EM</EM>"[/<[^<>]+>/] # this returns (fortunately) a result with the proper 2 angle brackets
=> ""
jrobertson posts on Feb 04, 2008 at 10:40
here's the correct last bit of code I tried ..
irb(main):046:0> "<EM</EM>"[/<[^<>]+>/] # this returns (fortunately) a result with the proper 2 angle brackets
=> "</EM>"
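
To see where each pattern actually starts matching on the malformed input, here is a rough sketch using String#match and MatchData#begin. The helper name compare_on_malformed is hypothetical and only serves the illustration.

```ruby
# Hypothetical helper: reports the matched text and its start offset
# for the lazy and the negated-class patterns.
def compare_on_malformed(text)
  lazy    = text.match(/<.+?>/)
  negated = text.match(/<[^<>]+>/)
  {
    lazy:       lazy && lazy[0],
    lazy_at:    lazy && lazy.begin(0),
    negated:    negated && negated[0],
    negated_at: negated && negated.begin(0)
  }
end

r = compare_on_malformed("<EM</EM>")
p r[:lazy]        # => "<EM</EM>"  (starts at offset 0, spans 3 angle brackets)
p r[:lazy_at]     # => 0
p r[:negated]     # => "</EM>"     (skips the unterminated "<EM")
p r[:negated_at]  # => 3

# When nothing matches, String#[] returns nil rather than an empty string.
p "<EM"[/<[^<>]+>/]   # => nil
```

The negated class fails at offset 0 because it hits a second '<' before any '>', so the engine moves on and finds the first well-formed tag instead.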
