Showing 1-10 of 14 total  RSS 

Finding your match with Ruby

This example finds an email subject in a string and assigns it to a variable called 'subject'.
subject = (/^Subject\: (.+)$/).match(email)[1]

Prior to this code I would have used the following:
  email[/^Subject\: (.+)$/]
  subject = $1
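
Both forms above raise or leave a stale value when the string has no Subject line. The following is a minimal sketch of a nil-safe variant; the sample message and the variable names other than `subject` are assumptions made for illustration.

```ruby
# Hypothetical sample message; any string with a "Subject:" header line works.
email = "From: alice@example.com\nSubject: Lunch on Friday\n\nSee you at noon."

# Regexp#match returns nil when nothing matches, so guard before indexing.
match = /^Subject: (.+)$/.match(email)
subject = match ? match[1] : nil

# String#[] with a capture index returns nil instead of raising.
subject_short = email[/^Subject: (.+)$/, 1]
missing       = "no headers here"[/^Subject: (.+)$/, 1]
```

The second form keeps the one-line style of the original while avoiding a NoMethodError on messages without a subject.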

Reference: Ruby SMTP Server - Save to Database [dzone.com]

Lookaround in Regular Expressions

The following output from an irb session tries out lookahead and lookbehind in Ruby regular expressions.

irb(main):002:0> "question"[/q(?=u)/]
=> "q"
irb(main):003:0> "qu>"[/qu(?=>)/]
=> "qu"
irb(main):004:0> "qu"[/qu(?=>)/]
=> nil
irb(main):006:0> "accumulator"[/(?=c)cumulator/]
=> "cumulator"
irb(main):007:0> "accumulator"[/a(?=c)cumulator/]
=> nil
irb(main):009:0> "abc"[/abc(?=c)/]
=> nil
irb(main):010:0> "abc"[/ab(?=c)/]
=> "ab"

irb(main):011:0> "abd"[/ab(?!c)/]
=> "ab"
irb(main):012:0> "abc"[/ab(?!c)/]
=> nil

irb(main):013:0> "abc"[/a(?<=b)c/] # lookbehind is rejected by this Ruby's regex engine (Ruby 1.9 and later support it)
SyntaxError: compile error
(irb):13: undefined (?...) sequence: /a(?<=b)c/
        from (irb):13
        from :0
irb(main):014:0> "abc"[/(?<=a)bc/] # lookbehind is rejected by this Ruby's regex engine (Ruby 1.9 and later support it)
SyntaxError: compile error
(irb):14: undefined (?...) sequence: /(?<=a)bc/
        from (irb):14
        from :0

irb(main):022:0> "how's the weather?"[/how.*(?=the)/]
=> "how's the wea"
irb(main):024:0> "how's the weather?"[/how.*(?=[the])/]
=> "how's the weath"
irb(main):025:0> "how's the weather?"[/how.*(?=t)/]
=> "how's the wea"
irb(main):037:0> "how's the weather?"[/how.+(?=\bt)/]
=> "how's "

irb(main):057:0> "how's the weather?"[/ho.*(?=[a-z])/]
=> "how's the weathe"
irb(main):061:0> "how's the weather?"[/ho.*(?=['])/]
=> "how"
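
The compile errors in the session look like the ones an older Ruby (likely 1.8) produces. As a hedged sketch, assuming Ruby 1.9 or later, all four lookaround forms can be tried like this; the sample strings and variable names are made up for illustration.

```ruby
# Assumes Ruby 1.9 or later, where the regex engine accepts lookbehind.
ahead      = "width: 100px"[/\d+(?=px)/]   # digits that are followed by "px"
not_ahead  = "abd"[/ab(?!c)/]              # "ab" that is not followed by "c"
behind     = "abc"[/(?<=a)bc/]             # "bc" that is preceded by "a"
not_behind = "abc"[/(?<!a)bc/]             # "bc" that is not preceded by "a"
price      = "cost: $42"[/(?<=\$)\d+/]     # digits that follow a dollar sign
```

In each case the lookaround text is checked but not included in the returned match.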

reference: Regular Expressions Quick Start [regular-expressions.info]

Greedy vs Lazy in Regular Expressions

A greedy quantifier (such as + or *) first takes as many characters as it can, then gives characters back one at a time until the rest of the pattern matches.
A lazy quantifier (such as +? or *?) first takes as few characters as it can, then takes one more at a time until the rest of the pattern matches.

irb(main):037:0> "<EM>first</EM>"[/<.+>/] #greedy 
=> "<EM>first</EM>"
irb(main):038:0> "<EM>first</EM>"[/<.+?>/] #lazy
=> "<EM>"
irb(main):039:0> "<EM>first</EM>"[/<[^<>]+>/] # better solution
=> "<EM>"
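
The difference is easier to see when every match is collected. This is a small sketch using String#scan on the same input; the variable names are illustrative only.

```ruby
html = "<EM>first</EM>"

greedy  = html.scan(/<.+>/)       # one match spanning both tags
lazy    = html.scan(/<.+?>/)      # each tag on its own
negated = html.scan(/<[^<>]+>/)   # same result as lazy, without backtracking through the text
```

The negated character set is usually preferred because it can never run past a closing bracket.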

source: Regular Expression Quick Start [regular-expressions.info]

Backreferences in Regular Expressions

This irb session example shows backreferences: the text matched by a group enclosed in round brackets is captured and can be reused. Inside the pattern a capture is referenced as \1, \2 and so on. After a match Ruby also exposes it as a global variable, written as a dollar ($) symbol followed by its index number, starting at 1.

#match capture = STRONG
irb(main):652:0> "<STRONG>text here</STRONG>"[/<(.*)>.*<\/\1>/]
=> "<STRONG>text here</STRONG>"

irb(main):655:0> "abttba"[/^(.)(.).*\2\1$/]
=> "abttba"
irb(main):658:0> "ab123ttba"[/^(.)(.)(123).*\2\1\3$/]
=> nil
irb(main):659:0> "ab123ttba123"[/^(.)(.)(123).*\2\1\3$/]
=> "ab123ttba123"
irb(main):663:0> "how are you"[/(o)*\1/]
=> nil
irb(main):664:0> "how are you"[/(o).*\1/]
=> "ow are yo"
irb(main):685:0> "how are you"[/(o)[^\1]+\1/]
=> "ow are yo"
irb(main):686:0> "how are you"[/(o)[^\1]+\1u/]
=> "ow are you"
irb(main):692:0> "how is Lucy?"[/(o)[^\1]+\1/]
=> nil
irb(main):693:0> "how is Lucy?"[/(o)[^\1]+\1?/]
=> "ow is Lucy?"
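
As a further sketch of \1 inside a pattern, the lines below find a doubled word and check that a closing tag matches its opening tag. The sample strings are assumptions. The session results above also suggest that [^\1] is read as an ordinary character set rather than "anything but the captured text", so it is not used here.

```ruby
# \1 must repeat exactly what the first group captured.
doubled    = "this is is a test"[/\b(\w+)\s+\1\b/]

# The second argument to String#[] selects a capture group.
tag        = "<STRONG>text here</STRONG>"[/<(\w+)>.*<\/\1>/, 1]

# No match when the closing tag differs from the opening tag.
mismatched = "<STRONG>text</EM>"[/<(\w+)>.*<\/\1>/]
```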

irb(main):702:0> "I eat now?".gsub(/(.)\s(\w+)\s(\w+)(.)/,'\1')
=> "I"
irb(main):708:0> "I eat now?".gsub(/(.)\s(\w+)\s(\w+)(.)/,'\3 \2')
=> "now eat"
irb(main):711:0> "I eat now?".gsub(/(.)\s(\w+)\s(\w+)(.)/,'\3 \1 \2\4')
=> "now I eat?"

irb(main):712:0> $1
=> "I"
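
A replacement written in double quotes as "#{$1}" is interpolated before gsub runs, so it sees the groups of an earlier match, not the current one. The sketch below shows two forms that always use the current match; the variable names are illustrative.

```ruby
text    = "I eat now?"
pattern = /(.)\s(\w+)\s(\w+)(.)/

# In a single-quoted replacement, \1..\4 are expanded by gsub itself.
with_escapes = text.gsub(pattern, '\3 \1 \2\4')

# The block runs once per match, so $1..$4 refer to the current match.
with_block = text.gsub(pattern) { "#{$3} #{$1} #{$2}#{$4}" }
```

Both forms give the same result in a fresh session, with no earlier match required.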


After a match, Ruby also sets the following special variables:

$` returns everything before the matched string.
$' returns everything after the matched string.
$+ returns the text matched by the last capture group.
$& returns the entire matched string.

irb(main):713:0> $&
=> "I eat now?"

irb(main):715:0> "I eat now? Mother?".gsub(/(.)\s(\w+)\s(\w+)(.)/,'\3 \1 \2\4')
=> "now I eat? Mother?"

irb(main):716:0> $'
=> " Mother?"
irb(main):721:0> $'[/M/]
=> "M"

irb(main):732:0> "I'm hungry ... I eat now? Mother?".gsub(/[^.](.)\s(\w+)\s(\w+)(.)/,'\3 \1 \2\4')
=> "I'm hungry ...now I eat? Mother?"
irb(main):733:0> $`
=> "I'm hungry ..."
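
The same information is available without global variables from the MatchData object that Regexp#match returns. This is a minimal sketch on the last string from the session; the variable names are illustrative.

```ruby
m = /[^.](.)\s(\w+)\s(\w+)(.)/.match("I'm hungry ... I eat now? Mother?")

before   = m.pre_match    # same text as $`
whole    = m[0]           # same text as $&
after    = m.post_match   # same text as $'
last     = m[-1]          # final group; equals $+ here because every group took part
captures = m.captures     # all groups as an array
```

Holding the MatchData in a local variable avoids surprises when a later match overwrites the globals.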


Note: These Ruby examples should work from Rubular [rubular.com].
Reference: Unix Regular Expressions: Backreferences [webreference.com]

Repetition in Regular Expressions

This irb session example helps demonstrate repetition in regular expressions using * ? + {}

Definitions:

? - the preceding token is optional
+ - find the token 1 or more times
* - find the token 0 or more times
{n} - repeat the token exactly n times
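
The definitions above can be sketched side by side as follows. The sample strings are made up, and the {n,m} range form is an addition that is not listed above.

```ruby
spellings  = "favourite favorite".scan(/favou?rite/)   # ? makes the "u" optional
star_empty = "bbb"[/a*/]                               # * may match zero times, giving ""
plus_none  = "bbb"[/a+/]                               # + needs at least one "a"
exact      = "test 123"[/\d{2}/]                       # exactly two digits
ranged     = "1 22 333 4444".scan(/\b\d{2,3}\b/)       # whole numbers of two or three digits
```

Note that * succeeds with an empty string where + returns nil, which matters when the result is used in a condition.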


irb(main):058:0> "this song is my favourite"[/favou?rite/]
=> "favourite"
irb(main):059:0> "this song is my favorite"[/favou?rite/]
=> "favorite"

irb(main):100:0> "this is my favourite\s"[/this\s((song|video)\s)?is/]
=> "this is"
irb(main):101:0> "this song is my favourite\s"[/this\s((song|video)\s)?is/]
=> "this song is"
irb(main):102:0> "this flower is my favourite\s"[/this\s((song|video)\s)?is/]
=> nil
irb(main):103:0> "this video is my favourite\s"[/this\s((song|video)\s)?is/]
=> "this video is"

irb(main):105:0> "anything you say"[/.*/]
=> "anything you say"
irb(main):119:0> "will be repeated"[/.*/]
=> "will be repeated"
irb(main):116:0> "good times</p>"[/.*[^<\/p>]/]
=> "good times"

irb(main):220:0> "test 123 test 123"[/\d+/]
=> "123"
irb(main):224:0> "test 123 test 123"[/\d{2}/]
=> "12"
irb(main):226:0> "test 123 test 123"[/\w+/]
=> "test"

irb(main):233:0> "test 123 test 123"[/\w+\s\d+/]
=> "test 123"
irb(main):248:0> "test 123 test 123"[/\w+\s\d+.*/]
=> "test 123 test 123"
irb(main):249:0> "test this test 123"[/\w+\s\d+.*/]
=> "test 123"
irb(main):250:0> "test this test that"[/\w+\s\d+.*/]
=> nil
irb(main):279:0> "try this test that"[/try\s\w[^\s]+/]
=> "try this"
irb(main):280:0> "try this test that"[/test\s\w[^\s]+/]
=> "test that"

irb(main):310:0> "try this test that"[/test\s\w\B+/]
=> "test tha"
irb(main):308:0> "try this test that"[/try\s\w\B+./]
=> "try this"
irb(main):309:0> "try this test that"[/test\s\w\B+./]
=> "test that"

irb(main):322:0> "try this test that"[/try\s\w+\b/]
=> "try this"
irb(main):323:0> "try this test that"[/test\s\w+\b/]
=> "test that"

irb(main):008:0> "try this test that"[/try.\b\w+/]
=> "try this"
irb(main):009:0> "try this test that"[/test.\b\w+/]
=> "test that"

irb(main):010:0> "try this test that"[/test.\w+/]
=> "test that"
irb(main):011:0> "try this test that"[/try.\w+/]
=> "try this"

irb(main):012:0> "try this test that"[/try\s\w+/]
=> "try this"
irb(main):013:0> "try this test that"[/test\s\w+/]
=> "test that"

irb(main):015:0> "try this test that"[/try\s\w+\s\w+/]
=> "try this test"

irb(main):041:0> "try this test that"[/\w+\s\w+$/]
=> "test that"
irb(main):043:0> "try this test that"[/^\w+\s\w+/]
=> "try this"



Note: As with CSS selectors, the more specific an expression is, the less ambiguous its matches are.

Alternation in Regular Expressions

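The following code from an irb session demonstrates alternation: matching one of several patterns with the OR operator, written as a bar (|). The leftmost match in the string wins, regardless of the order of the alternatives. Inside square brackets the bar is a literal character, so [^sleeping|are] is a character set, not a negated alternation.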

irb(main):003:0> "chickens are sleeping"[/chick|sleeping/]
=> "chick"
irb(main):004:0> "chickens are sleeping"[/sleeping/]
=> "sleeping"
irb(main):005:0> "chickens are sleeping"[/sleeping|chicken/]
=> "chicken"
irb(main):006:0> "chickens are sleeping"[/sleeping|are|chicken/]
=> "chicken"
irb(main):007:0> "chickens are sleeping"[/sleeping|are/]
=> "are"
irb(main):008:0> "chickens are sleeping"[/[^sleeping|are]/]
=> "c"
irb(main):009:0> "chickens are sleeping"[/[^sleeping|are].[a-z]*\s/]
=> "chickens "
irb(main):011:0> "chickens are sleeping"[/[^chickens|are].[a-z]*\s/]
=> " are "
irb(main):015:0> "chickens are sleeping"[/[sleeping|are].[a-z]*\s/]
=> "ickens "
irb(main):020:0> "scared chickens are sleeping"[/are|sleeping/]
=> "are"
irb(main):026:0> "scared chickens are sleeping"[/(are|sleeping)../]
=> "ared "
irb(main):023:0> "scared chickens are sleeping"[/\b(are|sleeping)\b/]
=> "are"
irb(main):025:0> "scared chickens are sleeping"[/\b(are|sleeping)\b.../]
=> "are sl"
irb(main):032:0> "scared chickens are sleeping"[/(monkeys|dancing)|(sheep|laughing)/]
=> nil
irb(main):033:0> "scared chickens are sleeping"[/(monkeys|dancing)|(sheep|laughing)|(chickens)/]
=> "chickens"
irb(main):034:0> "scared chickens are sleeping"[/(monkeys|sleeping)|(sheep|laughing)|(chickens)/]
=> "chickens"
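
A short sketch of the points the session illustrates: the leftmost match wins, word boundaries restrict alternatives to whole words, and a bar inside square brackets is just a character. The (?:...) non-capturing group is an addition used so that scan returns plain strings.

```ruby
sentence = "scared chickens are sleeping"

first_hit   = sentence[/sleeping|are/]                # leftmost hit is the "are" inside "scared"
first_index = sentence =~ /sleeping|are/              # index of that hit
whole_words = sentence.scan(/\b(?:are|sleeping)\b/)   # whole words only
bar_literal = "a|b"[/[x|y]/]                          # the bar is matched as a character
negated_set = "chickens are sleeping"[/[^sleeping|are]/]   # first character not in the set
```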

Anchors in Regular Expressions

Anchors match positions rather than characters, such as the start or end of a line or the edge of a word.

This example from an irb session helps to show what \b, \w, ^, $ and \B do. \b matches at a word boundary, \w matches a word character, ^ matches at the beginning of a line, $ matches at the end of a line, and \B matches at any position that is not a word boundary. Inside square brackets \b means a backspace character instead of a word boundary.

irb(main):010:0> "paella"[/\ba/]
=> nil
irb(main):013:0> "artistic paella wake"[/\wa/]
=> "pa"
irb(main):026:0> "artistic paella wake"[/\B.[a-z]a$/]
=> nil
irb(main):027:0> "artistic paella wake"[/\B.[a-z]$/]
=> "ke"
irb(main):039:0> "artistic paella wake"[/\B./]
=> "r"
irb(main):040:0> "artistic paella wake"[/\Ba./]
=> "ae"
irb(main):044:0> "artistic paella wake"[/[^\B.{3}]/]
=> "a"
irb(main):046:0> "artistic paella wake"[/[^\B.]{3}/]
=> "art"
irb(main):047:0> "artistic paella wake"[/[^\w]{3}/]
=> nil
irb(main):048:0> "artistic paella wake"[/[^\br]{3}/]
=> "tis"
irb(main):050:0> "artistic paella wake"[/[^\b]{3}/]
=> "art"
irb(main):051:0> "artistic paella wake"[/[^\ba]{3}/]
=> "rti"
irb(main):054:0> "artistic paella wake"[/\ba/]
=> "a"
irb(main):055:0> "artistic paella wake"[/\ba./]
=> "ar"
irb(main):058:0> "people paella wake"[/\bp./]
=> "pe"
irb(main):060:0> "apostrophe paella wake"[/\bpa./]
=> "pae"
irb(main):061:0> "apostrophe paella wake"[/\bp./]
=> "pa"
irb(main):062:0> "apostrophe paella wake"[/\Bp./]
=> "po"
irb(main):064:0> "pancake paella wake"[/\Bp./]
=> nil
irb(main):065:0> "anticipated paella wake"[/\Bp./]
=> "pa"
irb(main):066:0> "anticipated paella wake"[/\Bp../]
=> "pat"
irb(main):067:0> "anticipated paella wake"[/\wp../]
=> "ipat"
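
The following sketch contrasts the anchors on the same sample text. The \A and \z anchors (start and end of the whole string) are an addition that does not appear in the session above.

```ruby
text = "artistic paella wake"

word_start = text.scan(/\ba\w*/)     # words that begin with "a"
inner_a    = text.scan(/\Ba/).length # every "a" that is not at the start of a word

lines        = "one\ntwo"
line_match   = lines[/^two$/]        # ^ and $ work per line in Ruby
string_match = lines[/\Atwo\z/]      # \A and \z anchor to the whole string
```

Because ^ and $ are line-based in Ruby, \A and \z are the safer choice when a whole string has to match.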

Shorthand Character Classes in Regular Expressions

Learning regular expressions can seem tedious. However, with a little knowledge you can programmatically select or validate specific text in a few lines of code. Key: \d = digit; \w = word character (letter, digit or underscore); \s = whitespace (space, tab or line break)

irb(main):462:0> "happy"[/\d/]
=> nil
irb(main):463:0> "happy"[/\w/]
=> "h"
irb(main):464:0> "123 happy"[/\w/]
=> "1"
irb(main):465:0> "123 happy"[/\d/]
=> "1"
irb(main):466:0> "123 happy"[/\d\s/]
=> "3 "
irb(main):467:0> "123 happy"[/\d{3}/]
=> "123"
irb(main):468:0> "123 happy"[/\d{3}\s\w{3}/]
=> "123 hap"
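
As a small sketch of selecting and validating with these classes, assuming made-up sample strings; the \A and \z anchors are added so the validation covers the whole string.

```ruby
digits     = "123 happy".scan(/\d/)    # every digit, one per element
identifier = "snake_case 42"[/\w+/]    # \w also accepts the underscore
whitespace = "a\tb"[/\s/]              # \s matches a tab as well as a space

# Validation: the whole string must be exactly four digits.
year_ok  = !("2024" =~ /\A\d{4}\z/).nil?
year_bad = !("20x4" =~ /\A\d{4}\z/).nil?
```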

Character sets in Regular Expressions

Here is a listing from my irb session, in which I'm learning character sets with regular expressions. A set in square brackets matches exactly one character, which may be any of the characters listed. So /h[ae]y/ matches 'h', then either 'a' or 'e', then 'y'.

irb(main):155:0> "hello"[/[ae]/]
=> "e"
irb(main):156:0> "hello"[/h[ae]y/]
=> nil
irb(main):157:0> "hay"[/h[ae]y/]
=> "hay"
irb(main):158:0> "howdy"[/h[ae]y/]
=> nil
irb(main):159:0> "hardy"[/h[ae]y/]
=> nil
irb(main):160:0> "hardy"[/h[ae]/]
=> "ha"
irb(main):161:0> "hardy"[/h[ae]y/]
=> nil
irb(main):162:0> "hoy"[/h[ae]y/]
=> nil
irb(main):165:0> "they"[/h[ae]y/]
=> "hey"
irb(main):166:0> "they would"[/h[ae]y/]
=> "hey"
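
The same set can be checked against several words at once. The sketch below also shows a negated set and a range, which are additions to the session above; the word list is made up.

```ruby
words   = %w[hay hey hoy they howdy]
matches = words.select { |w| w =~ /h[ae]y/ }   # words containing "hay" or "hey"

negated = "hoy"[/h[^ae]y/]    # [^ae] matches any one character except "a" or "e"
ranged  = "h4y"[/h[0-9]y/]    # [0-9] matches any one digit
```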

Singleton in Ruby

# simple singleton

class Singleton
  # Hide the constructor so instances can only come from create.
  private_class_method :new
  @@singleton = nil

  # Returns the single shared instance, creating it on first use.
  def self.create
    @@singleton ||= new
  end
end
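
Ruby's standard library also ships a Singleton module that provides the same behaviour. The sketch below is an alternative to the hand-written class, not a drop-in replacement: `Registry` is a hypothetical class name, and a class named Singleton like the one above would conflict with the module if both were loaded.

```ruby
require 'singleton'

# Registry is a hypothetical class name used only for illustration.
class Registry
  include Singleton   # adds Registry.instance and makes Registry.new private
end

first  = Registry.instance
second = Registry.instance
same   = first.equal?(second)   # both names refer to one object

# Calling new directly is rejected because the module makes it private.
new_blocked = begin
  Registry.new
  false
rescue NoMethodError
  true
end
```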