CSV parsing regex
For a good general overview of CSV, see The Comma Separated Value (CSV) File Format.
A complete Ruby CSV parsing library is FasterCSV (sudo gem install fastercsv).
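For comparison with the regex approach, here is a minimal sketch of the library route. It assumes Ruby 1.9 or later, where FasterCSV became the standard `csv` library (on Ruby 1.8 you would require `fastercsv` and use the `FasterCSV` constant instead). The `sample` variable and its three records are illustrative, copied from the test data used in this snippet.

```ruby
require "csv"

# Three records borrowed from the test data; `sample` is an illustrative name.
sample = <<~EOS
  "Date","Pupil","Grade"
  "25 May","Bloggs, Fred","C"
  "John ""Da Man""",Repici,120 Jefferson St.
EOS

# CSV.parse returns an array of rows, each row an array of field strings.
# Quoted commas and doubled double-quotes are handled by the library.
rows = CSV.parse(sample)
rows.each { |row| p row }
```

The library also copes with newlines inside quoted fields, which is the hardest case for a split-based regex.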
csv_data = <<-EOS
fname,lname,age,salary
nancy,davolio,33,$30000
erin,borakova,28,$25250
tony,raphael,35,$28700
"Date","Pupil","Grade"
"25 May","Bloggs, Fred","C"
"25 May","Doe, Jane","B"
"15 July","Bloggs, Fred","D"
123456789,"Carr, Lisa",100000.00
444556666,"Barr, Clark",87000.00
777227878,"Parr, Jack",123000.00
998877665,"Charr, Lee",123000.00
Conference room 1, "John,
Please bring the M. Mathers file for review
-J.L.
"
10/18/2002,...
John,Doe,120 jefferson st.,Riverside, NJ, 08075
Jack,McGinnis,220 hobo Av.,Phila, PA,09119
"John ""Da Man""",Repici,120 Jefferson St.,Riverside, NJ,08075
Stephen,Tyler,"7452 Terrace ""At the Plaza"" road",SomeTown,SD, 91234
,Blankman,,SomeTown, SD, 00298
"Joan ""the bone"", Anne",Jet,"9th, at Terrace plc",Desert City,CO,00123
XXXX,D,3-May-02,83.01,83.58,71.13,78.04,9645300
XXXX,D,2-May-02,82.47,85.76,82.05,83.84,7210000,
XXXX,D,1-May-02,86.80,90.83,81.74,85.50,14253300
"1997",car model,E350
1997,car model,E350," Super luxurious truck "
1997,car model,E350,"Go get one now
they are going fast"
1997,car model,E350,"Super ""luxurious"" truck"
1997,car model,E350,"Super, luxurious truck"
1997,car model,E350,"ac, abs, moon",3000.00
1999, car model,"Venture ""Extended Edition""",,4900.00,
1996, car model,Old Car,"BEYOND REPAIR!
air, moon roof, loaded",4799.00
This,is,a test,CSV, file," from ""http://lorance.freeshell.org/csv/test.csv""."
It contains,"quoted text",and,numbers 1234,5678
It also has,"quoted text with an embedded quote""<- right there"
Then there are a few,,blank fields like these here ->,,,
A quoted blank field,"",<- there.
A quoted blank field with newline,"\n",<- there.
This next one causes an error if newline handling is turned off.
"There is a newline here ->
<- and it should be processed correctly."
ABCD
"And here,,, is an""Error - no"
"And here,,, is an"Error - yes
"And here,,, is an",Error - no
1,2,3
ab,"c,d","e""f", "g"",""","h
jk",kl
"aaa","bbb","ccc"
zzz,yyy,xxx
"aaa","b
bb","ccc"
zzz,yyy,xxx
"aaa","b""bb","ccc"
EOS

# Split on a comma or line break only when an even number of double-quotes
# follows, i.e. when the separator is not inside a quoted field.
csv_data.split(/(,|\r\n|\n|\r)(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/m).each do |csv|
#csv_data.split(/[,\n\r]+(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/m).each do |csv|
  next if csv.empty?
  csv = csv.strip

  # A field that is quoted at only one end is malformed.
  if csv =~ /\A(".*[^"]|[^"].*")\z/m then
    # examples: csv => "ab\nc"def or abc"de\nf"
    puts
    puts "Error:"
    p csv
    puts csv[/\A./mu], csv[/.\z/mu]
    #puts csv[0..0], csv[-1..-1]
    puts
    next
  end

  # remove double-quotes at string beginning & end
  if csv =~ /\A".*"\z/m then csv.gsub!(/\A"(.*)"\z/m, '\1') end
  # remove a double-quote from double double-quotes
  if csv =~ /""/m then csv.gsub!(/""/m, '"') end

  p csv
end
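The lookahead is easier to follow on a single record. The sketch below is not part of the original snippet: `split_csv_line` is a hypothetical helper that applies the same "even number of quotes ahead" test to one line, uses a plain comma as the separator, and passes `-1` to `split` so that blank fields are kept rather than skipped.

```ruby
# Hypothetical helper (illustration only): split ONE CSV record into fields.
def split_csv_line(line)
  # A comma counts as a separator only if the rest of the line holds an even
  # number of double-quotes, meaning the comma is outside any quoted field.
  separator = /,(?=(?:[^"]*"[^"]*")*(?![^"]*"))/

  # The -1 limit keeps trailing empty fields instead of dropping them.
  line.split(separator, -1).map do |field|
    field = field.strip
    if field =~ /\A"(.*)"\z/m
      # Drop the enclosing quotes, then collapse doubled quotes to one.
      field = $1.gsub('""', '"')
    end
    field
  end
end

p split_csv_line('ab,"c,d","e""f"')      # => ["ab", "c,d", "e\"f"]
p split_csv_line(',Blankman,,SomeTown')  # => ["", "Blankman", "", "SomeTown"]
```

Because the test counts quotes up to the end of the string, the helper only works on one complete record with balanced quotes. A stray quote, as in the "Error - yes" record of the test data, makes every comma before it fail the lookahead.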