<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DZone Snippets: expressions code</title>
    <link>http://snippets.dzone.com/posts</link>
    <pubDate>Fri, 25 Jul 2008 02:08:35 GMT</pubDate>
    <description>DZone Snippets: expressions code</description>
    <item>
      <title>Using Regular Expressions to validate a fixed length numerical string.</title>
      <link>http://snippets.dzone.com/posts/show/4627</link>
      <description>Using Ruby and regular expressions, this code validates that a string contains exactly 3 digits.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;a = '547'&lt;br /&gt;/^\d\d\d$/ =~ a&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Tue, 09 Oct 2007 11:55:10 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4627</guid>
      <author>jrobertson (James Robertson)</author>
    </item>
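    <!--
    Editor's sketch for the fixed-length check in the item above, written in
    Python rather than the snippet's Ruby. The names THREE_DIGITS and
    is_three_digits are illustrative assumptions, not part of the original post.
    It uses fullmatch, so no ^ or $ anchors are needed.

```python
import re

# ASCII digits only; with a str pattern, \d would also accept other Unicode digits.
THREE_DIGITS = re.compile(r"[0-9]{3}")


def is_three_digits(text):
    """Return True only when text consists of exactly three ASCII digits."""
    # fullmatch requires the whole string to match, including any trailing newline.
    return THREE_DIGITS.fullmatch(text) is not None
```

    Calling is_three_digits("547") should accept the value, while shorter or
    longer strings are rejected.
    -->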
    <item>
      <title>Using Regular Expressions to validate a numerical string.</title>
      <link>http://snippets.dzone.com/posts/show/4626</link>
      <description>Using Ruby and regular expressions, this code checks that a string contains only digits. If it does, =~ returns 0 (the index where the match starts); otherwise it returns nil. Because * allows zero repetitions, an empty string also passes.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;a = '134323'&lt;br /&gt;/^[0-9]*$/ =~ a&lt;br /&gt;# returns 0, indicating success&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Tue, 09 Oct 2007 11:45:13 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4626</guid>
      <author>jrobertson (James Robertson)</author>
    </item>
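    <!--
    Editor's sketch contrasting the item's pattern with a stricter variant, in
    Python rather than Ruby. LOOSE mirrors the original pattern; is_all_digits
    is a hypothetical helper that demands at least one digit and nothing else.
    In Ruby itself, ^ and $ anchor at line boundaries, so \A and \z are the
    usual choice when the whole string has to match.

```python
import re

# Same shape as the snippet: zero or more digits between ^ and $.
LOOSE = re.compile(r"^[0-9]*$")


def is_all_digits(text):
    """Return True when text is one or more ASCII digits and nothing else."""
    # + rules out the empty string; fullmatch rules out a trailing newline.
    return re.fullmatch(r"[0-9]+", text) is not None
```

    The loose pattern accepts both "" and "123\n" in Python, because $ can
    match just before a final newline; the stricter helper rejects both.
    -->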
    <item>
      <title>Haskell Regular Expression Matcher</title>
      <link>http://snippets.dzone.com/posts/show/4434</link>
      <description>Basic implementation of Regular Expressions based on "Derivatives of Regular Expressions" by Janusz A. Brzozowski (Journal of the Association for Computing Machinery, October 1964)&lt;br /&gt;&lt;br /&gt;Not really intended for serious use. Just a proof of concept.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;module Regexp&lt;br /&gt;where&lt;br /&gt;&lt;br /&gt;import Data.Set (Set)&lt;br /&gt;import Data.Map (Map)&lt;br /&gt;import Monad&lt;br /&gt;import List&lt;br /&gt;import Maybe&lt;br /&gt;import qualified Data.Set as Set&lt;br /&gt;import qualified Data.Map as Map&lt;br /&gt;&lt;br /&gt;data Regexp = &lt;br /&gt;    Zero&lt;br /&gt;  | Match Char       -- matches a single char&lt;br /&gt;  | Not Regexp       -- matches the negation of its argument&lt;br /&gt;  | Prod [Regexp]    -- matches a concatenation of its arguments&lt;br /&gt;  | Sum (Set Regexp) -- matches either of its arguments&lt;br /&gt;  | Star Regexp      -- matches repetitions of its argument (including 0 repetitions)&lt;br /&gt;  deriving (Eq, Ord)&lt;br /&gt;&lt;br /&gt;instance Show Regexp where&lt;br /&gt;  show Zero = "0"&lt;br /&gt;  show (Match c) = [c]&lt;br /&gt;  show (Not x)   = '~' : show x&lt;br /&gt;  show (Prod x) = join . (map show)  $ x&lt;br /&gt;  show (Sum x) = "(" ++ ( join . intersperse "|" . (map show) . Set.toList $ x ) ++ ")"&lt;br /&gt;  show (Star x) = "(" ++ show x ++ ")*" &lt;br /&gt;&lt;br /&gt;-- Flagrant abuse of type classes to allow implicit conversion of datatypes into regular&lt;br /&gt;-- expressions.
&lt;br /&gt;class Match a where&lt;br /&gt;  match :: a -&gt; Regexp&lt;br /&gt;&lt;br /&gt;instance Match Char where&lt;br /&gt;  match c = Match c&lt;br /&gt;&lt;br /&gt;instance (Match a) =&gt; Match [a] where&lt;br /&gt;  match = con&lt;br /&gt;&lt;br /&gt;instance Match Regexp where&lt;br /&gt;  match = id&lt;br /&gt;&lt;br /&gt;-- "smart" versions of the constructors, which perform normalisation of the datatype.&lt;br /&gt;-- As long as all regular expressions are built up using these and the match instance&lt;br /&gt;-- for char we can guarantee that structural equality of terms == similarity.&lt;br /&gt;-- This is important to make sure we only generate a finite number of states.&lt;br /&gt;zero :: Regexp &lt;br /&gt;zero = Zero &lt;br /&gt;&lt;br /&gt;one :: Regexp&lt;br /&gt;one = Prod []&lt;br /&gt;&lt;br /&gt;(&lt;+&gt;) :: (Match a, Match b) =&gt; a -&gt; b -&gt; Regexp&lt;br /&gt;x &lt;+&gt; y = &lt;br /&gt;  case (match x, match y) of&lt;br /&gt;    (Zero, b)      -&gt; b&lt;br /&gt;    (a, Zero)      -&gt; a&lt;br /&gt;    (Sum a, Sum b) -&gt; Sum (Set.union a b)&lt;br /&gt;    (Sum a, b)     -&gt; Sum (Set.insert b a)&lt;br /&gt;    (a, Sum b)     -&gt; Sum (Set.insert a b)    &lt;br /&gt;    (a, b)         -&gt; Sum $ Set.fromList [a, b]&lt;br /&gt;  &lt;br /&gt;oneOf :: (Match a) =&gt; [a] -&gt; Regexp&lt;br /&gt;oneOf = foldr (&lt;+&gt;) zero&lt;br /&gt;&lt;br /&gt;(&lt;*&gt;) :: (Match a, Match b) =&gt; a -&gt; b -&gt; Regexp&lt;br /&gt;u &lt;*&gt; v = &lt;br /&gt;  case (match u, match v) of &lt;br /&gt;    (Zero, _)         -&gt; zero&lt;br /&gt;    (_, Zero)         -&gt; zero&lt;br /&gt;    (Prod x, Prod y)  -&gt; Prod (x ++ y)&lt;br /&gt;    (Prod x, y)       -&gt; Prod (x ++ [y])&lt;br /&gt;    (x, Prod y)       -&gt; Prod (x : y)&lt;br /&gt;    (x, y)            -&gt; Prod [x, y]&lt;br /&gt;&lt;br /&gt;con :: (Match a) =&gt; [a] -&gt; Regexp&lt;br /&gt;con = foldr (&lt;*&gt;) one&lt;br /&gt;
&lt;br /&gt;neg :: (Match a) =&gt; a -&gt; Regexp&lt;br /&gt;neg x = &lt;br /&gt;  case (match x) of&lt;br /&gt;  (Not y) -&gt; y&lt;br /&gt;  y       -&gt; Not y&lt;br /&gt;&lt;br /&gt;star :: (Match a) =&gt; a -&gt; Regexp&lt;br /&gt;star x =&lt;br /&gt;  case (match x) of&lt;br /&gt;    (Zero)   -&gt; Zero&lt;br /&gt;    (Star y) -&gt; Star y&lt;br /&gt;    y        -&gt; Star y&lt;br /&gt;&lt;br /&gt;-- Returns if the regex matches the empty string.&lt;br /&gt;del :: Regexp -&gt; Bool&lt;br /&gt;del (Zero)    = False&lt;br /&gt;del (Sum x)   = or . map del $ Set.toList x&lt;br /&gt;del (Prod x)  = and . map del $ x&lt;br /&gt;del (Match _) = False&lt;br /&gt;del (Not x)   = not $ del x;&lt;br /&gt;del (Star _)  = True&lt;br /&gt;&lt;br /&gt;-- The derivative of a regular language A with respect to a character&lt;br /&gt;-- c is dA/dc = { s : cs \in A } &lt;br /&gt;diff :: Char -&gt; Regexp -&gt; Regexp&lt;br /&gt;diff _ (Zero)  = zero&lt;br /&gt;diff c (Match d) | (c == d) = one&lt;br /&gt;diff c (Match d) = zero&lt;br /&gt;diff c (Sum x) = oneOf $ (map $ diff c) (Set.toList x)&lt;br /&gt;diff c (Prod []) = zero&lt;br /&gt;diff c (Prod (x:xs)) | del x = (diff c x &lt;*&gt; xs) &lt;+&gt; diff c (Prod xs)&lt;br /&gt;diff c (Prod (x:xs)) = diff c x &lt;*&gt; xs&lt;br /&gt;diff c (Not x) = Not (diff c x)&lt;br /&gt;diff c (Star x) = diff c x &lt;*&gt; Star x&lt;br /&gt;&lt;br /&gt;flattenSet :: (Ord a) =&gt; Set (Set a) -&gt; Set a&lt;br /&gt;flattenSet = Set.fold Set.union Set.empty&lt;br /&gt;&lt;br /&gt;(/&gt;&gt;=) :: (Ord a, Ord b) =&gt; Set a -&gt; (a -&gt; Set b) -&gt; Set b&lt;br /&gt;x /&gt;&gt;= f = flattenSet (Set.map f x)&lt;br /&gt;&lt;br /&gt;-- The alphabet of all characters that appear in this regexp&lt;br /&gt;alphabet :: Regexp -&gt; Set Char&lt;br /&gt;alphabet (Zero) = Set.empty&lt;br /&gt;alphabet (Sum x) = flattenSet (Set.map alphabet x) &lt;br /&gt;alphabet (Prod x) = Set.unions $ map alphabet x
&lt;br /&gt;alphabet (Not x) = alphabet x&lt;br /&gt;alphabet (Star x) = alphabet x&lt;br /&gt;alphabet (Match c) = Set.singleton c&lt;br /&gt;&lt;br /&gt;-- Set of all derivatives of a regular expression (including itself, and higher order derivatives).&lt;br /&gt;derivatives :: Regexp -&gt; [Regexp]&lt;br /&gt;derivatives exp = Set.toList $ enlarge (Set.singleton exp) (Set.singleton exp) &lt;br /&gt;  where&lt;br /&gt;    alpha = alphabet exp&lt;br /&gt;    firstDerivatives x = Set.map (`diff` x) alpha &lt;br /&gt;    enlarge :: Set Regexp -&gt; Set Regexp -&gt; Set Regexp&lt;br /&gt;    enlarge new found = &lt;br /&gt;      if Set.null new&lt;br /&gt;        then found&lt;br /&gt;        else&lt;br /&gt;          let nextNew   = (new /&gt;&gt;= firstDerivatives) Set.\\ found&lt;br /&gt;              nextFound = found `Set.union` nextNew&lt;br /&gt;          in enlarge nextNew nextFound&lt;br /&gt;&lt;br /&gt;-- A simple finite state machine type &lt;br /&gt;data FSM = State { transitions :: (Map Char FSM), isFinal :: Bool } &lt;br /&gt;&lt;br /&gt;-- Converts a Regexp into a finite state machine by using the derivatives&lt;br /&gt;-- with respect to specific characters as the transitions. Essentially at &lt;br /&gt;-- each stage we build up a regular expression that the remaining characters&lt;br /&gt;-- have to match. Due to Cunning Mathematics, only finitely many such regular&lt;br /&gt;-- expressions (up to similarity) result.&lt;br /&gt;compile :: Regexp -&gt; FSM&lt;br /&gt;compile x = fromJust $ Map.lookup x states&lt;br /&gt;  where&lt;br /&gt;    states :: Map Regexp FSM&lt;br /&gt;    states = Map.fromList $&lt;br /&gt;      do re &lt;- derivatives x -- Totally gratuitous use of list monad. 
:)&lt;br /&gt;         let trans = do c &lt;- Set.toList $ alphabet re&lt;br /&gt;                        let d = diff c re&lt;br /&gt;                        return (c, fromJust $ Map.lookup d states)&lt;br /&gt;         let state = State (Map.fromList trans) (del re) &lt;br /&gt;         return (re, state) &lt;br /&gt;&lt;br /&gt;runFSM :: FSM -&gt; String -&gt; Bool&lt;br /&gt;runFSM x []     = isFinal x&lt;br /&gt;runFSM x (c:cs) = case (Map.lookup c $ transitions x) of&lt;br /&gt;                    Nothing -&gt; False&lt;br /&gt;                    Just y  -&gt; runFSM y cs&lt;br /&gt;               &lt;br /&gt;matches :: (Match a) =&gt; String -&gt; a -&gt; Bool&lt;br /&gt;matches cs exp = runFSM (compile $  match exp) cs&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Sun, 19 Aug 2007 20:57:56 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4434</guid>
      <author>DRMacIver (David R. MacIver)</author>
    </item>
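    <!--
    Editor's sketch of the derivative idea behind the Haskell item above, in
    Python. It is a simplified variant, not a port: it takes derivatives on the
    fly for each input character instead of compiling a state machine, and it
    omits the Not constructor and the normalising smart constructors. All class
    and function names here are illustrative assumptions.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regular expression nodes."""


@dataclass(frozen=True)
class Empty(Regex):
    """Matches nothing (the snippet's Zero)."""


@dataclass(frozen=True)
class Eps(Regex):
    """Matches only the empty string (the snippet's one)."""


@dataclass(frozen=True)
class Char(Regex):
    c: str


@dataclass(frozen=True)
class Alt(Regex):
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Seq(Regex):
    first: Regex
    second: Regex


@dataclass(frozen=True)
class Star(Regex):
    inner: Regex


def nullable(r):
    """Return True when r matches the empty string (the snippet's del)."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Seq):
        return nullable(r.first) and nullable(r.second)
    return False  # Empty and Char never match the empty string


def deriv(c, r):
    """Return a regex for the strings s such that c + s is matched by r."""
    if isinstance(r, Char):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Alt):
        return Alt(deriv(c, r.left), deriv(c, r.right))
    if isinstance(r, Seq):
        head = Seq(deriv(c, r.first), r.second)
        # If the first part can be empty, c may also start the second part.
        return Alt(head, deriv(c, r.second)) if nullable(r.first) else head
    if isinstance(r, Star):
        return Seq(deriv(c, r.inner), r)
    return Empty()  # Empty and Eps have no non-empty matches


def matches(r, text):
    """Differentiate r by each character, then test for the empty string."""
    for c in text:
        r = deriv(c, r)
    return nullable(r)
```

    Without normalisation the intermediate terms grow with the input, which is
    the problem the snippet's smart constructors are there to avoid.
    -->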
    <item>
      <title>Handling Accented Characters with Python Regular Expressions</title>
      <link>http://snippets.dzone.com/posts/show/1588</link>
      <description>[A-z] just isn't good enough!&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;import re&lt;br /&gt;string = 'rich&#233;'&lt;br /&gt;print string&lt;br /&gt;rich&#233;&lt;br /&gt;&lt;br /&gt;richre = re.compile('([A-z]+)')&lt;br /&gt;match = richre.match(string)&lt;br /&gt;print match.groups()&lt;br /&gt;('rich',)&lt;br /&gt;&lt;br /&gt;richre = re.compile('(\w+)',re.LOCALE)&lt;br /&gt;match = richre.match(string)&lt;br /&gt;print match.groups()&lt;br /&gt;('rich',)&lt;br /&gt;&lt;br /&gt;richre = re.compile('([&#233;\w]+)')&lt;br /&gt;match = richre.match(string)&lt;br /&gt;print match.groups()&lt;br /&gt;('rich\xe9',)&lt;br /&gt;&lt;br /&gt;richre = re.compile('([\xe9\w]+)')&lt;br /&gt;match = richre.match(string)&lt;br /&gt;print match.groups()&lt;br /&gt;('rich\xe9',)&lt;br /&gt;&lt;br /&gt;richre = re.compile('([\xe9-\xf8\w]+)')&lt;br /&gt;match = richre.match(string)&lt;br /&gt;print match.groups()&lt;br /&gt;('rich\xe9',)&lt;br /&gt;&lt;br /&gt;string = 'rich&#233;&#241;'&lt;br /&gt;match = richre.match(string)&lt;br /&gt;print match.groups()&lt;br /&gt;('rich\xe9\xf1',)&lt;br /&gt;&lt;br /&gt;richre = re.compile('([\u00E9-\u00F8\w]+)')&lt;br /&gt;print match.groups()&lt;br /&gt;('rich\xe9\xf1',)&lt;br /&gt;&lt;br /&gt;matched = match.group(1)&lt;br /&gt;print matched&lt;br /&gt;rich&#233;&#241;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Mon, 27 Feb 2006 21:20:20 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/1588</guid>
      <author>offspinner ()</author>
    </item>
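    <!--
    Editor's sketch for the accented-character item above, assuming Python 3
    (the original session is Python 2). In Python 3, str patterns are
    Unicode-aware by default, so \w already covers accented letters. The names
    ASCII_RANGE, WORD_LETTERS and SAMPLE are illustrative assumptions.

```python
import re

# "rich" followed by e-acute and n-tilde, written with escapes to stay encoding-neutral.
SAMPLE = "rich\u00e9\u00f1"

# [A-z] spans code points 65 to 122, so it admits [ \ ] ^ _ ` but no accented letters.
ASCII_RANGE = re.compile(r"[A-z]+")

# \w minus decimal digits and the underscore: roughly "letters" for str patterns.
WORD_LETTERS = re.compile(r"[^\W\d_]+")


def leading_word(pattern, text):
    """Return the text matched at the start of the string, or None."""
    found = pattern.match(text)
    return found.group() if found else None
```

    With these definitions, ASCII_RANGE stops at "rich" while WORD_LETTERS
    covers the whole sample string.
    -->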
    <item>
      <title>Regular Expression for WikiWords</title>
      <link>http://snippets.dzone.com/posts/show/59</link>
      <description>This RegEx should return all the Wiki Words (i.e. any word in Camel Case):&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;[^A-Za-z]*([A-Z][a-z]+[A-Z][a-z]+([A-Z][a-z]+)*)[^A-Za-z]*&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 07 Apr 2005 23:07:07 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/59</guid>
      <author>sottey (Sean Ottey)</author>
    </item>
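    <!--
    Editor's sketch for the WikiWords item above, in Python. It is a variant of
    the posted pattern, not the same expression: the group is non-capturing so
    findall returns plain strings, and \b replaces the [^A-Za-z]* padding, which
    means digits and underscores next to a word prevent a match. WIKI_WORD and
    find_wiki_words are hypothetical names.

```python
import re

# Two or more chunks, each a capital followed by lowercase letters,
# bounded by \b so only whole words are reported.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def find_wiki_words(text):
    """Return every CamelCase word found in text, in order of appearance."""
    return WIKI_WORD.findall(text)
```

    For example, find_wiki_words("See FrontPage and WikiWord") picks out the
    two CamelCase words and skips "See", which has only one capitalised chunk.
    -->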
  </channel>
</rss>
