<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DZone Snippets: method code</title>
    <link>http://snippets.dzone.com/posts</link>
    <pubDate>Thu, 24 Jul 2008 21:13:07 GMT</pubDate>
    <description>DZone Snippets: method code</description>
    <item>
      <title>A simple XHTML submit form for ProjectX</title>
      <link>http://snippets.dzone.com/posts/show/5354</link>
      <description>Preparing ProjectX API requests in the browser's address bar can get quite messy; entering the request through a simple form makes it much easier to read.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"&lt;br /&gt;  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;&lt;br /&gt;&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;&lt;br /&gt;  &lt;head&gt;&lt;br /&gt;    &lt;title&gt;ProjectX API&lt;/title&gt;&lt;br /&gt;    &lt;meta http-equiv="Content-Type" content="text/html;charset=utf-8"/&gt;&lt;br /&gt;  &lt;/head&gt;&lt;br /&gt;  &lt;body&gt;&lt;br /&gt;    &lt;h1&gt;ProjectX API form&lt;/h1&gt;&lt;br /&gt;    &lt;p&gt;Enter the ProjectX API XML to send a request to the server.&lt;/p&gt;&lt;br /&gt;    &lt;form action="http://rorbuilder.info/api/projectx.cgi" method="post" id="projectx_form"&gt;&lt;br /&gt;    &lt;fieldset&gt;&lt;legend&gt;xml_project&lt;/legend&gt;&lt;textarea id="xml_project" name="xml_project" cols="104" rows="20"&gt;&lt;/textarea&gt;&lt;/fieldset&gt;&lt;br /&gt;    &lt;div&gt;&lt;button type="submit"&gt;Submit&lt;/button&gt;&lt;/div&gt;&lt;br /&gt;    &lt;/form&gt;&lt;br /&gt;  &lt;p&gt;&lt;br /&gt;    &lt;a href="http://validator.w3.org/check?uri=referer"&gt;&lt;img&lt;br /&gt;        src="http://www.w3.org/Icons/valid-xhtml10"&lt;br /&gt;        alt="Valid XHTML 1.0 Strict" height="31" width="88" style="float:right;  border:0 "/&gt;&lt;/a&gt;&lt;br /&gt;  &lt;/p&gt;&lt;br /&gt;  &lt;p style="clear:both"&gt;last updated: 13th April 2008&lt;/p&gt;&lt;br /&gt;  &lt;br /&gt;  &lt;/body&gt;&lt;br /&gt;&lt;/html&gt;&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The web page can be seen at http://rorbuilder.info/r/projectx-api/index.html&lt;br /&gt;When submitted, the following XML request should return an XML result containing the return value and the name of the method executed.
&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;project name='whiteboardqueue'&gt;&lt;br /&gt;  &lt;methods&gt;&lt;br /&gt;    &lt;method name='get_user_id'&gt;&lt;br /&gt;      &lt;params/&gt;&lt;br /&gt;    &lt;/method&gt;&lt;br /&gt;  &lt;/methods&gt;&lt;br /&gt;&lt;/project&gt;&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;eg.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;result method="rtn_get_user_id"&gt;&lt;br /&gt;  &lt;get_user_id&gt;36539&lt;/get_user_id&gt;&lt;br /&gt;&lt;/result&gt;&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Sun, 13 Apr 2008 22:32:48 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/5354</guid>
      <author>jrobertson (James Robertson)</author>
    </item>
    <item>
      <title>Declaring a private method in Ruby</title>
      <link>http://snippets.dzone.com/posts/show/5090</link>
      <description>The following example code shows how to make a method private in Ruby. Source: &lt;a href="http://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Classes"&gt;Ruby Programming/Syntax/Classes&lt;/a&gt; [wikibooks.org]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Simple example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt; class Example&lt;br /&gt;   def methodA&lt;br /&gt;   end&lt;br /&gt;   &lt;br /&gt;   private # all methods that follow will be made private: not accessible to outside objects&lt;br /&gt;   &lt;br /&gt;   def methodP&lt;br /&gt;   end&lt;br /&gt; end&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;If private is invoked without arguments, it sets access to private for all subsequent methods. It can also be invoked with method names as arguments.&lt;br /&gt;&lt;br /&gt;Named private method example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt; class Example&lt;br /&gt;   def methodA&lt;br /&gt;   end&lt;br /&gt;   &lt;br /&gt;   def methodP&lt;br /&gt;   end&lt;br /&gt;   &lt;br /&gt;   private :methodP&lt;br /&gt; end&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Here private was invoked with an argument, altering the visibility of methodP to private.&lt;br /&gt;</description>
      <pubDate>Sat, 02 Feb 2008 20:18:28 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/5090</guid>
      <author>jrobertson (James Robertson)</author>
    </item>
    <item>
      <title>How to call a base class method</title>
      <link>http://snippets.dzone.com/posts/show/5082</link>
      <description>This Ruby example uses the keyword super to call the superclass's implementation of the same method.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;class Claw&lt;br /&gt;  def grab(item)&lt;br /&gt;    puts item + ' grabbed'&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;class Hand &lt; Claw&lt;br /&gt;  def grab(item)&lt;br /&gt;    super(item)&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;h = Hand.new&lt;br /&gt;h.grab('apple')&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Output:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;apple grabbed&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;For more information: http://www.google.com/search?q=ruby+keyword+super</description>
      <pubDate>Sat, 02 Feb 2008 00:51:59 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/5082</guid>
      <author>jrobertson (James Robertson)</author>
    </item>
    <item>
      <title>Unify the handling of XML records</title>
      <link>http://snippets.dzone.com/posts/show/4833</link>
      <description>This Ruby code creates, updates or deletes an XML record, using a hash, record handling objects, and XML to invoke the correct method.  &lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;#file: recordx_handler.rb&lt;br /&gt;require 'recordx'&lt;br /&gt;&lt;br /&gt;class RecordX_Update &lt; RecordX&lt;br /&gt;  def call(params)&lt;br /&gt;    #todo: write the code to read the xml parameter string&lt;br /&gt;    update_record(h)&lt;br /&gt;    save_file &lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;class RecordX_Create &lt; RecordX&lt;br /&gt;  def call(params)&lt;br /&gt;    #todo: write the code to read the xml parameter string  &lt;br /&gt;    create_record(h)&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;class RecordX_Delete &lt; RecordX&lt;br /&gt;  def call(params)&lt;br /&gt;    doc = Document.new(params)&lt;br /&gt;    node = doc.root.elements["param[@var='id']"]&lt;br /&gt;    puts node&lt;br /&gt;    id = node.attributes.get_attribute('val').to_s&lt;br /&gt;    delete_record(id)&lt;br /&gt;    puts 'deleted record ' + id&lt;br /&gt;    save_file&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;class RecordX_handler&lt;br /&gt;  def invoke(method, params)&lt;br /&gt;    h = Hash.new&lt;br /&gt;    h["create"] = RecordX_Create.new&lt;br /&gt;    h["update"] = RecordX_Update.new&lt;br /&gt;    h["delete"] = RecordX_Delete.new&lt;br /&gt;    h[method].call(params)&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;if __FILE__ == $0&lt;br /&gt;  xml_method = "&lt;method name='delete'&gt;&lt;params&gt;&lt;param var='id' val='17648' /&gt;&lt;/params&gt;&lt;/method&gt;"&lt;br /&gt;  doc = Document.new(xml_method)&lt;br /&gt;  method = doc.root.attributes.get_attribute('name').to_s&lt;br /&gt;  params = doc.root.elements['params'].to_s&lt;br /&gt;&lt;br /&gt;  rh = RecordX_handler.new&lt;br /&gt;  rh.invoke(method, params)&lt;br /&gt;end&lt;br /&gt;&lt;/code&gt;&lt;br 
/&gt;&lt;br /&gt;This code is intended to be called by a Ruby CGI script which can simply relay the cgi post argument to the recordx_handler object.&lt;br /&gt;</description>
      <pubDate>Sun, 02 Dec 2007 13:50:18 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4833</guid>
      <author>jrobertson (James Robertson)</author>
    </item>
    <item>
      <title>UTF8-aware string methods in Ruby</title>
      <link>http://snippets.dzone.com/posts/show/4527</link>
      <description>Author:  ntk&lt;br /&gt;License:    &lt;a href="http://www.opensource.org/licenses/mit-license.php"&gt;The MIT License&lt;/a&gt;, Copyright (c) 2007 ntk&lt;br /&gt;Description:  some basic UTF8-aware string methods for Ruby's String class (Ruby 1.8.6)&lt;br /&gt;Requirements: save this snippet to a UTF-8 encoded file and set the character set encoding of Terminal.app &lt;br /&gt;              to UTF-8 (on Mac OS X: Terminal menu -&gt; Window Settings -&gt; Display -&gt; Character Set Encoding; to enable additional features see &lt;a href="http://smyck.de/2007/06/06/great-stuff-being-able-to-type-utf-8-characters-in-a-terminal-on-os-x/"&gt;here&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Further tools:&lt;br /&gt;- &lt;a href="http://www.yoshidam.net/Ruby.html"&gt;rbuconv&lt;/a&gt;, a pure Ruby library for Unicode translation&lt;br /&gt;- &lt;a href="http://www.yoshidam.net/unicode.txt"&gt;unicode&lt;/a&gt;, a library for Unicode Normalization (sudo gem install unicode); for a Windows version see &lt;a href="http://www.ruby.org.ee/wiki/Unicode_in_Ruby/Rails"&gt;Unicode in Ruby on Rails&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://icu4r.rubyforge.org"&gt;ICU4R&lt;/a&gt;, a Ruby C-extension binding for the &lt;a href="http://www.icu-project.org"&gt;ICU&lt;/a&gt; library&lt;br /&gt;- &lt;a href="http://billposer.org/Software/msort.html"&gt;Msort&lt;/a&gt;, a command-line sorting program&lt;br /&gt;- &lt;a href="http://raa.ruby-lang.org/project/punycode4r/"&gt;punycode4r&lt;/a&gt;, a pure Ruby implementation of Punycode (RFC 3492; sudo gem install punycode4r)&lt;br /&gt;- &lt;a href="http://www.flexiguided.de/publications.utf8proc.en.html"&gt;utf8proc&lt;/a&gt;, library for processing UTF-8 encoded Unicode strings, (sudo gem install utf8proc)&lt;br /&gt;- &lt;a href="http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt"&gt;Oniguruma&lt;/a&gt;, Ruby's regular expression engine; cf. 
&lt;a href="http://www.igvita.com/blog/2007/04/11/secure-utf-8-input-in-rails/"&gt;Secure UTF-8 Input in Rails&lt;/a&gt; and &lt;a href="http://woss.name/2006/10/25/migrating-your-rails-application-to-unicode/"&gt;Migrating your Rails application to Unicode&lt;/a&gt;&lt;br /&gt;- &lt;a href="http://rubyforge.org/projects/char-encodings/"&gt;character-encodings&lt;/a&gt;, seamless integration of character encodings into Ruby's String class, (sudo gem install character-encodings)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;br /&gt;class String&lt;br /&gt;&lt;br /&gt;   require 'iconv' &lt;br /&gt;   require 'open-uri'      # cf. http://www.ruby-doc.org/stdlib/libdoc/open-uri/rdoc/index.html&lt;br /&gt;&lt;br /&gt;   # taken from: http://www.w3.org/International/questions/qa-forms-utf-8&lt;br /&gt;   UTF8REGEX = /\A(?:                               # ?: non-capturing group (grouping with no back references)&lt;br /&gt;                 [\x09\x0A\x0D\x20-\x7E]            # ASCII&lt;br /&gt;               | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte&lt;br /&gt;               |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs&lt;br /&gt;               | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte&lt;br /&gt;               |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates&lt;br /&gt;               |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3&lt;br /&gt;               | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15&lt;br /&gt;               |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16&lt;br /&gt;               )*\z/mnx&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;#  create UTF-8 character arrays (as class instance variables)&lt;br /&gt;#&lt;br /&gt;#  mapping tables: - http://www.unicode.org/Public/UCA/latest/allkeys.txt&lt;br /&gt;#                  - http://unicode.org/Public/UNIDATA/UnicodeData.txt &lt;br /&gt;#                  - http://unicode.org/Public/UNIDATA/CaseFolding.txt
&lt;br /&gt;#                  - http://www.decodeunicode.org &lt;br /&gt;#                  - ftp://ftp.mars.org/pub/ruby/Unicode.tar.bz2&lt;br /&gt;#                  - http://camomile.sourceforge.net&lt;br /&gt;#                  - Character Palette (Mac OS X)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   # test data&lt;br /&gt;   @small_letters_utf8 = ["U+00F1", "U+00F4", "U+00E6", "U+00F8", "U+00E0", "U+00E1", "U+00E2", "U+00E4", "U+00E5", "U+00E7", "U+00E8", "U+00E9", "U+00EA", "U+00EB", "U+0153"].map { |x| u = [x[2..-1].hex].pack("U*"); u =~ UTF8REGEX ? u : nil }&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   @capital_letters_utf8 = ["U+00D1", "U+00D4", "U+00C6", "U+00D8", "U+00C0", "U+00C1", "U+00C2", "U+00C4", "U+00C5", "U+00C7", "U+00C8", "U+00C9", "U+00CA", "U+00CB", "U+0152"].map { |x| u = [x[2..-1].hex].pack("U*"); u =~ UTF8REGEX ? u : nil }&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   @other_letters_utf8 = ["U+03A3", "U+0639", "U+0041", "U+F8D0", "U+F8FF", "U+4E2D", "U+F4EE", "U+00FE", "U+10FFFF", "U+00A9", "U+20AC", "U+221E", "U+20AC", "U+FEFF", "U+FFFD", "U+00FF", "U+00FE", "U+FFFE", "U+FEFF"].map { |x| u = [x[2..-1].hex].pack("U*"); u =~ UTF8REGEX ? u : nil }&lt;br /&gt;&lt;br /&gt;   if @small_letters_utf8.size != @small_letters_utf8.nitems then raise "Invalid UTF-8 char in @small_letters_utf8!" end&lt;br /&gt;   if @capital_letters_utf8.size != @capital_letters_utf8.nitems then raise "Invalid UTF-8 char in @capital_letters_utf8!" end&lt;br /&gt;   if @other_letters_utf8.size != @other_letters_utf8.nitems then raise "Invalid UTF-8 char in @other_letters_utf8!" 
end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   @unicode_array = []&lt;br /&gt;   #open('http://unicode.org/Public/UNIDATA/UnicodeData.txt') do |f| f.each(nil) { |line| line.scan(/^[^;]+/) { |u| @unicode_array &lt;&lt; u } }  end&lt;br /&gt;   #open('http://unicode.org/Public/UNIDATA/UnicodeData.txt') do |f|                                                                               &lt;br /&gt;   #   f.each do |line| line =~ /LATIN|GREEK|CYRILLIC/  ?  ( line.scan(/^[^;]+/) { |u| @unicode_array &lt;&lt; u } )  :  next  end&lt;br /&gt;   #end&lt;br /&gt;&lt;br /&gt;   #@letters_utf8 = @unicode_array.map { |x| u = [x.hex].pack("U*"); u =~ UTF8REGEX ? u : nil }.compact   # code points from UnicodeData.txt&lt;br /&gt;   @letters_utf8 = @small_letters_utf8 + @capital_letters_utf8 + @other_letters_utf8                      # test data only&lt;br /&gt;&lt;br /&gt;   # Hash[*array_with_keys.zip(array_with_values).flatten]&lt;br /&gt;   @downcase_table_utf8 = Hash[*@capital_letters_utf8.zip(@small_letters_utf8).flatten]&lt;br /&gt;   @upcase_table_utf8 = Hash[*@small_letters_utf8.zip(@capital_letters_utf8).flatten]&lt;br /&gt;   @letters_utf8_hash = Hash[*@letters_utf8.zip([]).flatten]    #=&gt; ... "\341\272\242"=&gt;nil ...
&lt;br /&gt;&lt;br /&gt;   class &lt;&lt; self &lt;br /&gt;      attr_accessor :small_letters_utf8&lt;br /&gt;      attr_accessor :capital_letters_utf8&lt;br /&gt;      attr_accessor :other_letters_utf8&lt;br /&gt;      attr_accessor :letters_utf8&lt;br /&gt;      attr_accessor :letters_utf8_hash&lt;br /&gt;      attr_accessor :unicode_array&lt;br /&gt;      attr_accessor :downcase_table_utf8&lt;br /&gt;      attr_accessor :upcase_table_utf8&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   def each_utf8_char&lt;br /&gt;      scan(/./mu) { |c| yield c }&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def each_utf8_char_with_index&lt;br /&gt;      i = -1&lt;br /&gt;      scan(/./mu) { |c| i+=1; yield(c, i) }&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def length_utf8&lt;br /&gt;      #scan(/./mu).size&lt;br /&gt;      count = 0&lt;br /&gt;      scan(/./mu) { count += 1 }&lt;br /&gt;      count&lt;br /&gt;   end&lt;br /&gt;   alias :size_utf8 :length_utf8&lt;br /&gt;&lt;br /&gt;   def reverse_utf8&lt;br /&gt;      split(//mu).reverse.join&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def reverse_utf8!&lt;br /&gt;      split(//mu).reverse!.join&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def swapcase_utf8&lt;br /&gt;     gsub(/./mu) do |char|  &lt;br /&gt;         if !String.downcase_table_utf8[char].nil? then String.downcase_table_utf8[char]&lt;br /&gt;         elsif !String.upcase_table_utf8[char].nil? then String.upcase_table_utf8[char]&lt;br /&gt;         else char.swapcase&lt;br /&gt;         end&lt;br /&gt;      end&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def swapcase_utf8!&lt;br /&gt;      gsub!(/./mu) do |char|  &lt;br /&gt;         if !String.downcase_table_utf8[char].nil? then String.downcase_table_utf8[char]&lt;br /&gt;         elsif !String.upcase_table_utf8[char].nil? 
then String.upcase_table_utf8[char]&lt;br /&gt;         else ret = char.swapcase end&lt;br /&gt;      end&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def downcase_utf8&lt;br /&gt;      gsub(/./mu) do |char|  &lt;br /&gt;         small_char = String.downcase_table_utf8[char]&lt;br /&gt;         small_char.nil? ? char.downcase : small_char&lt;br /&gt;      end&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def downcase_utf8!&lt;br /&gt;      gsub!(/./mu) do |char|  &lt;br /&gt;         small_char = String.downcase_table_utf8[char]&lt;br /&gt;         small_char.nil? ? char.downcase : small_char&lt;br /&gt;      end&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def upcase_utf8&lt;br /&gt;      gsub(/./mu) do |char|  &lt;br /&gt;         capital_char = String.upcase_table_utf8[char]&lt;br /&gt;         capital_char.nil? ? char.upcase : capital_char&lt;br /&gt;      end&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def upcase_utf8!&lt;br /&gt;      gsub!(/./mu) do |char|  &lt;br /&gt;         capital_char = String.upcase_table_utf8[char]&lt;br /&gt;         capital_char.nil? ? char.upcase : capital_char&lt;br /&gt;      end&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def count_utf8(c)&lt;br /&gt;      return nil if c.empty?&lt;br /&gt;      r = %r{[#{c}]}mu&lt;br /&gt;      scan(r).size&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def delete_utf8(c)&lt;br /&gt;      return self if c.empty?&lt;br /&gt;      r = %r{[#{c}]}mu&lt;br /&gt;      gsub(r, '')&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def delete_utf8!(c)&lt;br /&gt;      return self if c.empty?
&lt;br /&gt;      r = %r{[#{c}]}mu&lt;br /&gt;      gsub!(r, '')&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def first_utf8&lt;br /&gt;      self[/\A./mu]&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def last_utf8&lt;br /&gt;      self[/.\z/mu]&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def capitalize_utf8&lt;br /&gt;     return self if self =~ /\A[[:space:]]*\z/m&lt;br /&gt;     ret = ""&lt;br /&gt;     split(/\x20/).each do |w| &lt;br /&gt;         count = 0&lt;br /&gt;         w.gsub(/./mu) do |char|  &lt;br /&gt;            count += 1&lt;br /&gt;            capital_char = String.upcase_table_utf8[char]&lt;br /&gt;            if count == 1 then &lt;br /&gt;               capital_char.nil? ? char.upcase : char.upcase_utf8&lt;br /&gt;            else&lt;br /&gt;               capital_char.nil? ? char.downcase : char.downcase_utf8&lt;br /&gt;            end&lt;br /&gt;         end&lt;br /&gt;         ret &lt;&lt; w + ' '&lt;br /&gt;     end&lt;br /&gt;     ret =~ /\x20\z/ ? ret.sub!(/\x20\z/, '') : ret  &lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def capitalize_utf8!&lt;br /&gt;     return self if self =~ /\A[[:space:]]*\z/m &lt;br /&gt;     ret = ""&lt;br /&gt;     split(/\x20/).each do |w| &lt;br /&gt;         count = 0&lt;br /&gt;         w.gsub!(/./mu) do |char|  &lt;br /&gt;            count += 1&lt;br /&gt;            capital_char = String.upcase_table_utf8[char]&lt;br /&gt;            if count == 1 then &lt;br /&gt;               capital_char.nil? ? char.upcase : char.upcase_utf8&lt;br /&gt;            else&lt;br /&gt;               capital_char.nil? ? char.downcase : char.downcase_utf8&lt;br /&gt;            end&lt;br /&gt;         end&lt;br /&gt;         ret &lt;&lt; w + ' '&lt;br /&gt;     end&lt;br /&gt;     ret =~ /\x20\z/ ? ret.sub!(/\x20\z/, '') : ret&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   def index_utf8(s)&lt;br /&gt;&lt;br /&gt;      return nil unless !self.empty? 
&amp;&amp; (s.class == Regexp || s.class == String)&lt;br /&gt;      #raise(ArgumentError, "Wrong argument for method index_utf8!", caller) unless !self.empty? &amp;&amp; (s.class == Regexp || s.class == String)&lt;br /&gt;&lt;br /&gt;      if s.class == Regexp&lt;br /&gt;         opts = s.inspect.gsub(/\A(.).*\1([eimnosux]*)\z/mu, '\2')&lt;br /&gt;         if  opts.count('u') == 0 then opts = opts + "u" end&lt;br /&gt;         str = s.source&lt;br /&gt;         return nil if str.empty?&lt;br /&gt;         str = "%r{#{str}}" + opts&lt;br /&gt;         r = eval(str)&lt;br /&gt;         l = ""&lt;br /&gt;         sub(r) { l &lt;&lt; $`; " " }  # $`: The string to the left of the last successful match (cf. http://www.zenspider.com/Languages/Ruby/QuickRef.html)&lt;br /&gt;         l.empty? ? nil : l.length_utf8&lt;br /&gt;&lt;br /&gt;      else&lt;br /&gt;&lt;br /&gt;         return nil if s.empty?&lt;br /&gt;         r = %r{#{s}}mu&lt;br /&gt;         l = ""&lt;br /&gt;         sub(r) { l &lt;&lt; $`; " " }&lt;br /&gt;         l.empty? ? nil : l.length_utf8&lt;br /&gt;&lt;br /&gt;# this would be a non-regex solution&lt;br /&gt;=begin &lt;br /&gt;         return nil if s.empty?&lt;br /&gt;         return nil unless self =~ %r{#{s}}mu&lt;br /&gt;         indices = []&lt;br /&gt;         s.split(//mu).each do |x|&lt;br /&gt;            ar = []&lt;br /&gt;            self.each_utf8_char_with_index { |c,i| if c == x then ar &lt;&lt; i end  }   # first get all matching indices c == x&lt;br /&gt;            indices &lt;&lt; ar unless ar.empty?&lt;br /&gt;         end&lt;br /&gt;         if indices.empty?
&lt;br /&gt;            return nil&lt;br /&gt;         elsif indices.size == 1 &lt;br /&gt;            indices.first.first&lt;br /&gt;         else &lt;br /&gt;            #p indices&lt;br /&gt;            ret = []&lt;br /&gt;            a0 = indices.shift&lt;br /&gt;            a0.each do |i|&lt;br /&gt;               ret &lt;&lt; i&lt;br /&gt;               indices.each { |a| if a.include?(i+1) then i += 1; ret &lt;&lt; i else ret = []; break end  }&lt;br /&gt;               return ret.first unless ret.empty?&lt;br /&gt;            end&lt;br /&gt;            ret.empty? ? nil : ret.first&lt;br /&gt;         end&lt;br /&gt;=end&lt;br /&gt;&lt;br /&gt;      end&lt;br /&gt;   end   &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   def rindex_utf8(s)&lt;br /&gt;&lt;br /&gt;      return nil unless !self.empty? &amp;&amp; (s.class == Regexp || s.class == String)&lt;br /&gt;      #raise(ArgumentError, "Wrong argument for method index_utf8!", caller) unless !self.empty? &amp;&amp; (s.class == Regexp || s.class == String)&lt;br /&gt;&lt;br /&gt;      if s.class == Regexp&lt;br /&gt;         opts = s.inspect.gsub(/\A(.).*\1([eimnosux]*)\z/mu, '\2')&lt;br /&gt;         if  opts.count('u') == 0 then opts = opts + "u" end&lt;br /&gt;         str = s.source&lt;br /&gt;         return nil if str.empty?&lt;br /&gt;         str = "%r{#{str}}" + opts&lt;br /&gt;         r = eval(str)&lt;br /&gt;         l = ""&lt;br /&gt;         scan(r) { l = $` }  &lt;br /&gt;         #gsub(r) { l = $`; " " }  &lt;br /&gt;         l.empty? ? nil : l.length_utf8&lt;br /&gt;      else&lt;br /&gt;         return nil if s.empty?&lt;br /&gt;         r = %r{#{s}}mu&lt;br /&gt;         l = ""&lt;br /&gt;         scan(r) { l = $` }  &lt;br /&gt;         #gsub(r) { l = $`; " " }&lt;br /&gt;         l.empty? ? 
nil : l.length_utf8&lt;br /&gt;      end&lt;br /&gt;&lt;br /&gt;   end   &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   # note that the i option does not work in special cases with back references&lt;br /&gt;   # example: "&#224;&#192;".slice_utf8(/(.).*?\1/i) returns nil whereas "aA".slice(/(.).*?\1/i) returns "aA"&lt;br /&gt;   def slice_utf8(regex)   &lt;br /&gt;      opts = regex.inspect.gsub(/\A(.).*\1([eimnosux]*)\z/mu, '\2')&lt;br /&gt;      if  opts.count('u') == 0 then opts = opts + "u" end&lt;br /&gt;      s = regex.source&lt;br /&gt;      str = "%r{#{s}}" + opts&lt;br /&gt;      r = eval(str)&lt;br /&gt;      slice(r)&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def slice_utf8!(regex)   &lt;br /&gt;      opts = regex.inspect.gsub(/\A(.).*\1([eimnosux]*)\z/mu, '\2')&lt;br /&gt;      if  opts.count('u') == 0 then opts = opts + "u" end&lt;br /&gt;      s = regex.source&lt;br /&gt;      str = "%r{#{s}}" + opts&lt;br /&gt;      r = eval(str)&lt;br /&gt;      slice!(r)&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def cut_utf8(p,l)    # (index) position, length&lt;br /&gt;      raise(ArgumentError, "Error: argument is not Fixnum", caller) if p.class != Fixnum or l.class != Fixnum&lt;br /&gt;      s = self.length_utf8&lt;br /&gt;      #if p &lt; 0 then p = s - p.abs end&lt;br /&gt;      if p &lt; 0 then p.abs &gt; s ? (p = 0) : (p = s - p.abs) end      #  or:  ... p.abs &gt; s ? (return nil) : ...&lt;br /&gt;      return nil if l &gt; s or p &gt; (s - 1)&lt;br /&gt;      ret = ""&lt;br /&gt;      count = 0&lt;br /&gt;      each_utf8_char_with_index do |c,i| &lt;br /&gt;         break if count &gt;= l&lt;br /&gt;         if i &gt;= p &amp;&amp; count &lt; l then count += 1; ret &lt;&lt; c; end&lt;br /&gt;      end&lt;br /&gt;      ret&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def starts_with_utf8?(s)&lt;br /&gt;      return nil if self.empty? or s.empty?
&lt;br /&gt;      cut_utf8(0, s.size_utf8) == s &lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def ends_with_utf8?(s)&lt;br /&gt;      return nil if self.empty? or s.empty?&lt;br /&gt;      cut_utf8(-(s.size_utf8), s.size_utf8) == s&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def insert_utf8(i,s)                                  # insert_utf8(index, string)&lt;br /&gt;      return self if s.empty?&lt;br /&gt;      l = self.length_utf8&lt;br /&gt;      if l == 0 then return s end&lt;br /&gt;      if i &lt; 0 then i.abs &gt; l ? (i = 0) : (i = l - i.abs) end          #  or:  ... i.abs &gt; l ? (return nil) : ...&lt;br /&gt;      #return nil if i &gt; (l - 1)                         # return nil ...&lt;br /&gt;      spaces = ""&lt;br /&gt;      if i &gt; (l-1) then spaces = " " * (i - (l-1)) end   # ... or add spaces&lt;br /&gt;      str = self &lt;&lt; spaces&lt;br /&gt;      s1 = str.cut_utf8(0, i)&lt;br /&gt;      s2 = str.cut_utf8(i, l - s1.length_utf8)&lt;br /&gt;      s1 &lt;&lt; s &lt;&lt; s2&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def split_utf8(regex)&lt;br /&gt;      opts = regex.inspect.gsub(/\A(.).*\1([eimnosux]*)\z/mu, '\2')&lt;br /&gt;      if  opts.count('u') == 0 then opts = opts + "u" end&lt;br /&gt;      s = regex.source&lt;br /&gt;      str = "%r{#{s}}" + opts&lt;br /&gt;      r = eval(str)&lt;br /&gt;      split(r)&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def scan_utf8(regex)&lt;br /&gt;      opts = regex.inspect.gsub(/\A(.).*\1([eimnosux]*)\z/mu, '\2')&lt;br /&gt;      if  opts.count('u') == 0 then opts = opts + "u" end&lt;br /&gt;      s = regex.source&lt;br /&gt;      str = "%r{#{s}}" + opts&lt;br /&gt;      r = eval(str)&lt;br /&gt;      if block_given? 
then scan(r) { |a,*m| yield(a,*m) } else scan(r) end&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def range_utf8(r)&lt;br /&gt;&lt;br /&gt;      return nil if r.class != Range&lt;br /&gt;      #raise(ArgumentError, "No Range object given!", caller) if r.class != Range&lt;br /&gt;&lt;br /&gt;      a = r.to_s[/^[\+\-]?\d+/].to_i&lt;br /&gt;      b = r.to_s[/[\+\-]?\d+$/].to_i&lt;br /&gt;      d = r.to_s[/\.+/]&lt;br /&gt;&lt;br /&gt;      if d.size == 2 then d = 2 else d = d.size end &lt;br /&gt;&lt;br /&gt;      l = self.length_utf8&lt;br /&gt;&lt;br /&gt;      return nil if b.abs &gt; l || a.abs &gt; l || d &lt; 2 || d &gt; 3&lt;br /&gt;&lt;br /&gt;      if a &lt; 0 then a = l - a.abs end&lt;br /&gt;      if b &lt; 0 then b = l - b.abs end&lt;br /&gt;      &lt;br /&gt;      return nil if a &gt; b&lt;br /&gt;&lt;br /&gt;      str = ""&lt;br /&gt;&lt;br /&gt;      each_utf8_char_with_index do |c,i|&lt;br /&gt;         break if i &gt; b&lt;br /&gt;         if d == 2&lt;br /&gt;            (i &gt;= a &amp;&amp; i &lt;= b) ? str &lt;&lt; c : next&lt;br /&gt;         else&lt;br /&gt;            (i &gt;= a &amp;&amp; i &lt; b) ? str &lt;&lt; c : next&lt;br /&gt;         end&lt;br /&gt;      end&lt;br /&gt;&lt;br /&gt;      str&lt;br /&gt;&lt;br /&gt;   end&lt;br /&gt; &lt;br /&gt;   def utf8?&lt;br /&gt;     self =~ UTF8REGEX&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def clean_utf8&lt;br /&gt;       t = ""&lt;br /&gt;       self.scan(/./um) { |c| t &lt;&lt; c if c =~ UTF8REGEX }&lt;br /&gt;       t&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   def utf8_encoded_file?   # check (or rather guess) if (HTML) file encoding is UTF-8 (experimental, so use at your own risk!)
&lt;br /&gt;&lt;br /&gt;      file = self&lt;br /&gt;      str = ""&lt;br /&gt;&lt;br /&gt;      if file =~ /^http:\/\//&lt;br /&gt;&lt;br /&gt;         url = file&lt;br /&gt;&lt;br /&gt;         if RUBY_PLATFORM =~ /darwin/i   # Mac OS X 10.4.10&lt;br /&gt;          &lt;br /&gt;            seconds = 30  &lt;br /&gt;&lt;br /&gt;            # check if web site is reachable&lt;br /&gt;            # on Windows try to use curb, http://curb.rubyforge.org (sudo gem install curb)&lt;br /&gt;            var = %x{ /usr/bin/curl -I -L --fail --silent --connect-timeout #{seconds} --max-time #{seconds+10} #{url}; /bin/echo -n $? }.to_i&lt;br /&gt;&lt;br /&gt;            #return false unless var == 0&lt;br /&gt;            raise "Failed to create connection to web site: #{url}  --  curl error code: #{var}  --  " unless var == 0&lt;br /&gt;&lt;br /&gt;            str = %x{ /usr/bin/curl -L --fail --silent --connect-timeout #{seconds} --max-time #{seconds+10} #{url} | \&lt;br /&gt;                      /usr/bin/grep -Eo -m 1 \"(charset|encoding)=[\\"']?[^\\"'&gt;]+\" | /usr/bin/grep -Eo \"[^=\\"'&gt;]+$\" }&lt;br /&gt;            p str&lt;br /&gt;            return true if str =~ /utf-?8/i&lt;br /&gt;            return false if !str.empty? &amp;&amp; str !~ /utf-?8/i&lt;br /&gt;&lt;br /&gt;            # solutions with downloaded file&lt;br /&gt;&lt;br /&gt;            # download HTML file&lt;br /&gt;            #downloaded_file = "/tmp/html"&lt;br /&gt;            downloaded_file = "~/Desktop/html"&lt;br /&gt;            downloaded_file = File.expand_path(downloaded_file)&lt;br /&gt;            %x{ /usr/bin/touch #{downloaded_file} 2&gt;/dev/null }&lt;br /&gt;            raise "No valid HTML download file (path) specified!" 
unless File.file?(downloaded_file)&lt;br /&gt;            %x{ /usr/bin/curl -L --fail --silent --connect-timeout #{seconds} --max-time #{seconds+10} -o #{downloaded_file} #{url} }&lt;br /&gt;            &lt;br /&gt;            simple_test = %x{ /usr/bin/file -ik #{downloaded_file} }    #  cf. man file&lt;br /&gt;            p simple_test &lt;br /&gt;&lt;br /&gt;            # read entire file into a string&lt;br /&gt;            File.open(downloaded_file).read.each(nil) do |str| &lt;br /&gt;               #return true if str =~ /(charset|encoding) *= *["']? *utf-?8/i&lt;br /&gt;               str.utf8? ? (return true) : (return false) &lt;br /&gt;            end &lt;br /&gt;&lt;br /&gt;            #check each line of the downloaded file&lt;br /&gt;            #count_lines = 0&lt;br /&gt;            #count_utf8 = 0&lt;br /&gt;            #File.foreach(downloaded_file) { |line| return true if line =~ /(charset|encoding) *= *["']? *utf-?8/i; count_lines += 1;  count_utf8 += 1 if line.clean_utf8.utf8?; break if count_lines != count_utf8 }&lt;br /&gt;            #count_lines == count_utf8 ? (return true) : (return false)&lt;br /&gt;            &lt;br /&gt;&lt;br /&gt;            # in-memory solutions&lt;br /&gt;&lt;br /&gt;            #html_file_cleaned_utf8 = %x{ /usr/bin/curl -L --fail --silent --connect-timeout #{seconds} --max-time #{seconds+10} #{url} }.clean_utf8&lt;br /&gt;            #p html_file_cleaned_utf8.utf8?&lt;br /&gt;&lt;br /&gt;            count_lines = 0&lt;br /&gt;            count_utf8 = 0&lt;br /&gt;            #%x{ /usr/bin/curl -L --fail --silent --connect-timeout #{seconds} --max-time #{seconds+10} #{url} }.each(nil) do |line|    # read entire file into string&lt;br /&gt;            %x{ /usr/bin/curl -L --fail --silent --connect-timeout #{seconds} --max-time #{seconds+10} #{url} }.each('\n') do |line| &lt;br /&gt;               #return true if line =~ /(charset|encoding) *= *["']? 
*utf-?8/i&lt;br /&gt;               count_lines += 1 &lt;br /&gt;               count_utf8 += 1 if line.utf8?&lt;br /&gt;               break if count_lines != count_utf8&lt;br /&gt;            end&lt;br /&gt;            count_lines == count_utf8 ? (return true) : (return false)&lt;br /&gt;&lt;br /&gt;         else&lt;br /&gt;&lt;br /&gt;            # check each line of the HTML file (or the entire HTML file at once)&lt;br /&gt;            # cf. http://www.ruby-doc.org/stdlib/libdoc/open-uri/rdoc/index.html&lt;br /&gt;            count_lines = 0&lt;br /&gt;            count_utf8 = 0&lt;br /&gt;            open(url) do |f|   &lt;br /&gt;               # p f.meta, f.content_encoding, f.content_type&lt;br /&gt;               cs = f.charset&lt;br /&gt;               return true if cs =~ /utf-?8/i&lt;br /&gt;               #f.each(nil) do |str| str.utf8? ? (return true) : (return false) end  # read entire file into string&lt;br /&gt;               f.each_line do |line| &lt;br /&gt;                  count_lines += 1 &lt;br /&gt;                  count_utf8 += 1 if line.utf8?&lt;br /&gt;                  break unless count_lines == count_utf8&lt;br /&gt;               end&lt;br /&gt;            end&lt;br /&gt;            count_lines == count_utf8 ? (return true) : (return false)&lt;br /&gt;&lt;br /&gt;         end&lt;br /&gt;&lt;br /&gt;      else&lt;br /&gt;&lt;br /&gt;         return false unless File.file?(file)&lt;br /&gt;&lt;br /&gt;         if RUBY_PLATFORM =~ /darwin/i then str = %x{ /usr/bin/file -ik #{file} }; return true if str =~ /utf-?8/i end&lt;br /&gt;&lt;br /&gt;         # read entire file into a string&lt;br /&gt;         #File.open(file).read.each(nil) do |str| return true if str =~ /(charset|encoding) *= *["']? *utf-?8/i; str.utf8? ? 
(return true) : (return false) end &lt;br /&gt;&lt;br /&gt;         # check each line of the file&lt;br /&gt;         count_lines = 0&lt;br /&gt;         count_utf8 = 0&lt;br /&gt;         File.foreach(file) do |line| &lt;br /&gt;            return true if line =~ /(charset|encoding) *= *["']? *utf-?8/i&lt;br /&gt;            count_lines += 1;  &lt;br /&gt;            count_utf8 += 1 if line.utf8?; &lt;br /&gt;            break if count_lines != count_utf8 &lt;br /&gt;         end&lt;br /&gt;&lt;br /&gt;         count_lines == count_utf8 ? (return true) : (return false)&lt;br /&gt;         &lt;br /&gt;      end   &lt;br /&gt;&lt;br /&gt;      str =~ /utf-?8/i ? true : false&lt;br /&gt;&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   # cf. Paul Battley, http://po-ru.com/diary/fixing-invalid-utf-8-in-ruby-revisited/&lt;br /&gt;   def validate_utf8&lt;br /&gt;      Iconv.iconv('UTF-8//IGNORE', 'UTF-8', (self + ' ') ).first[0..-2]&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   # cf. Paul Battley, http://www.ruby-forum.com/topic/70357&lt;br /&gt;   def asciify_utf8&lt;br /&gt;       return nil unless self.utf8?&lt;br /&gt;       #Iconv.iconv('US-ASCII//IGNORE//TRANSLIT', 'UTF-8', (self + ' ') ).first[0..-2]&lt;br /&gt;       # delete all punctuation characters inside words except "-" in words such as up-to-date&lt;br /&gt;       Iconv.iconv('US-ASCII//IGNORE//TRANSLIT', 'UTF-8', (self + ' ') ).first[0..-2].gsub(/(?!-.*)\b[[:punct:]]+\b/, '')&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def latin1_to_utf8     # ISO-8859-1 to UTF-8&lt;br /&gt;      ret = Iconv.iconv("UTF-8//IGNORE", "ISO-8859-1", (self + "\x20") ).first[0..-2]&lt;br /&gt;      ret.utf8? ? ret : nil&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def cp1252_to_utf8     # CP1252 (WINDOWS-1252) to UTF-8&lt;br /&gt;      ret = Iconv.iconv("UTF-8//IGNORE", "CP1252", (self + "\x20") ).first[0..-2]&lt;br /&gt;      ret.utf8? ? 
ret : nil&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   # cf. Paul Battley, http://www.ruby-forum.com/topic/70357 &lt;br /&gt;   def utf16le_to_utf8&lt;br /&gt;       ret = Iconv.iconv('UTF-8//IGNORE', 'UTF-16LE', (self[0,(self.length/2*2)] + "\000\000") ).first[0..-2]&lt;br /&gt;       ret =~ /\x00\z/ ?  ret.sub!(/\x00\z/, '') : ret&lt;br /&gt;       ret.utf8? ? ret : nil&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def utf8_to_utf16le&lt;br /&gt;      return nil unless self.utf8?&lt;br /&gt;      ret = Iconv.iconv('UTF-16LE//IGNORE', 'UTF-8', self ).first&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def utf8_to_unicode&lt;br /&gt;      return nil unless self.utf8?&lt;br /&gt;      str = ""&lt;br /&gt;      scan(/./mu) { |c| str &lt;&lt; "U+" &lt;&lt; sprintf("%04X", c.unpack("U*").first) }&lt;br /&gt;      str&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def unicode_to_utf8&lt;br /&gt;      return self if self =~ /\A[[:space:]]*\z/m&lt;br /&gt;      str = ""&lt;br /&gt;      #scan(/U\+([0-9a-fA-F]{4,5}|10[0-9a-fA-F]{4})/) { |u| str &lt;&lt; [u.first.hex].pack("U*") }&lt;br /&gt;      #scan(/U\+([[:digit:][:xdigit:]]{4,5}|10[[:digit:][:xdigit:]]{4})/) { |u| str &lt;&lt; [u.first.hex].pack("U*") }&lt;br /&gt;      scan(/(U\+(?:[[:digit:][:xdigit:]]{4,5}|10[[:digit:][:xdigit:]]{4})|.)/mu) do        # for mixed strings such as "U+00bfHabla espaU+00f1ol?"&lt;br /&gt;         c = $1&lt;br /&gt;         if c =~ /^U\+/&lt;br /&gt;            str &lt;&lt; [c[2..-1].hex].pack("U*")&lt;br /&gt;         else&lt;br /&gt;            str &lt;&lt; c&lt;br /&gt;         end       &lt;br /&gt;      end&lt;br /&gt;      str.utf8? ? str : nil&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   # dec, hex, oct conversions (experimental!)&lt;br /&gt;&lt;br /&gt;   def utf8_to_dec&lt;br /&gt;      return nil unless self.utf8?
&lt;br /&gt;      str = ""&lt;br /&gt;      scan(/./mu) do |c| &lt;br /&gt;         if c =~ /^\x00$/&lt;br /&gt;            str &lt;&lt; "aaa\x00"  # encode \x00 as "aaa"&lt;br /&gt;         else&lt;br /&gt;            str &lt;&lt; sprintf("%04X", c.unpack("U*").first).hex.to_s &lt;&lt; "\x00"   # convert to decimal&lt;br /&gt;         end&lt;br /&gt;      end     &lt;br /&gt;      str[0..-2]&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def dec_to_utf8   # \x00 is encoded as "aaa"&lt;br /&gt;      return self if self.empty?&lt;br /&gt;      return nil unless self =~ /\A[[:digit:]]+\x00/ &amp;&amp; self =~ /\A[a[:digit:]\x00]+\z/&lt;br /&gt;      str = ""&lt;br /&gt;      split(/\x00/).each do |c|&lt;br /&gt;         if c.eql?("aaa")&lt;br /&gt;            str &lt;&lt; "\x00"&lt;br /&gt;         else&lt;br /&gt;            str &lt;&lt; [c.to_i].pack("U*")&lt;br /&gt;         end&lt;br /&gt;      end&lt;br /&gt;      str&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   def utf8_to_dec_2&lt;br /&gt;      return nil unless self.utf8?&lt;br /&gt;      str = ""&lt;br /&gt;      tmpstr = ""&lt;br /&gt;      null_str = "\x00"&lt;br /&gt;      scan(/./mu) do |c| &lt;br /&gt;         if c =~ /^\x00$/&lt;br /&gt;            str &lt;&lt; "aaa\x00\x00"  # encode \x00 as "aaa"&lt;br /&gt;         else&lt;br /&gt;            tmpstr = ""&lt;br /&gt;            c.each_byte { |x| tmpstr &lt;&lt; x.to_s &lt;&lt; null_str }      # convert to decimal&lt;br /&gt;            str &lt;&lt; tmpstr &lt;&lt; null_str&lt;br /&gt;         end&lt;br /&gt;      end     &lt;br /&gt;      str[0..-3]&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def dec_to_utf8_2   # \x00 is encoded as "aaa"&lt;br /&gt;      return self if self.empty?
&lt;br /&gt;      return nil unless self =~ /\A[[:digit:]]+\x00/ &amp;&amp; self =~ /[[:digit:]]+\x00\x00/ &amp;&amp; self =~ /\A[a[:digit:]\x00]+\z/&lt;br /&gt;      str = ""&lt;br /&gt;      split(/\x00\x00/).each do |c|&lt;br /&gt;         if c =~ /\x00/&lt;br /&gt;            c.split(/\x00/).each { |x| str &lt;&lt; x.to_i.chr }&lt;br /&gt;         elsif c.eql?("aaa")&lt;br /&gt;            str &lt;&lt; "\x00"&lt;br /&gt;         else&lt;br /&gt;            str &lt;&lt; c.to_i.chr&lt;br /&gt;         end&lt;br /&gt;      end&lt;br /&gt;      str&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   def utf8_to_hex&lt;br /&gt;      return nil unless self.utf8?&lt;br /&gt;      str = ""&lt;br /&gt;      tmpstr = ""&lt;br /&gt;      null_str = "\x00"&lt;br /&gt;      scan(/./mu) do |c| &lt;br /&gt;         if c =~ /^\x00$/&lt;br /&gt;            str &lt;&lt; "aaa\x00\x00"    # encode \x00 as "aaa"&lt;br /&gt;         else&lt;br /&gt;            tmpstr = ""&lt;br /&gt;            c.each_byte { |x| tmpstr &lt;&lt; sprintf("%X", x) &lt;&lt; null_str }      # convert to hexadecimal&lt;br /&gt;            str &lt;&lt; tmpstr &lt;&lt; null_str&lt;br /&gt;         end&lt;br /&gt;      end     &lt;br /&gt;      str[0..-3]&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def hex_to_utf8   # \x00 is encoded as "aaa"&lt;br /&gt;      return self if self.empty?
&lt;br /&gt;      return nil unless self =~ /\A[[:xdigit:]]+\x00/ &amp;&amp; self =~ /[[:xdigit:]]+\x00\x00/ &amp;&amp; self =~ /\A[a[:xdigit:]\x00]+\z/&lt;br /&gt;      str = ""&lt;br /&gt;      split(/\x00\x00/).each do |c|&lt;br /&gt;         if c =~ /\x00/&lt;br /&gt;            c.split(/\x00/).each { |x| str &lt;&lt; x.hex.chr }&lt;br /&gt;         elsif c.eql?("aaa")&lt;br /&gt;            str &lt;&lt; "\x00"&lt;br /&gt;         else&lt;br /&gt;            str &lt;&lt; c.hex.chr&lt;br /&gt;         end&lt;br /&gt;      end&lt;br /&gt;      str&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;   def utf8_to_oct&lt;br /&gt;      return nil unless self.utf8?&lt;br /&gt;      str = ""&lt;br /&gt;      tmpstr = ""&lt;br /&gt;      null_str = "\x00"&lt;br /&gt;      scan(/./mu) do |c| &lt;br /&gt;         if c =~ /^\x00$/&lt;br /&gt;            str &lt;&lt; "aaa\x00\x00"   # encode \x00 as "aaa"&lt;br /&gt;         else&lt;br /&gt;            tmpstr = ""&lt;br /&gt;            c.each_byte { |x| tmpstr &lt;&lt; sprintf("%o", x) &lt;&lt; null_str }      # convert to octal&lt;br /&gt;            str &lt;&lt; tmpstr &lt;&lt; null_str&lt;br /&gt;         end&lt;br /&gt;      end     &lt;br /&gt;      str[0..-3]&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   def oct_to_utf8   # \x00 is encoded as "aaa"&lt;br /&gt;      return self if self.empty?
&lt;br /&gt;      return nil unless self =~ /\A[[:digit:]]+\x00/ &amp;&amp; self =~ /[[:digit:]]+\x00\x00/ &amp;&amp; self =~ /\A[a[:digit:]\x00]+\z/&lt;br /&gt;      str = ""&lt;br /&gt;      split(/\x00\x00/).each do |c|&lt;br /&gt;         if c =~ /\x00/&lt;br /&gt;            c.split(/\x00/).each { |x| str &lt;&lt; x.oct.chr }&lt;br /&gt;         elsif c.eql?("aaa")&lt;br /&gt;            str &lt;&lt; "\x00"&lt;br /&gt;         else&lt;br /&gt;            str &lt;&lt; c.oct.chr&lt;br /&gt;         end&lt;br /&gt;      end&lt;br /&gt;      str&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;   # cf. http://node-0.mneisen.org/2007/03/13/email-subjects-in-utf-8-mit-ruby-kodieren/&lt;br /&gt;   def email_subject_utf8&lt;br /&gt;      return nil unless self.utf8?&lt;br /&gt;      "=?utf-8?b?#{[self].pack("m").delete("\n")}?="&lt;br /&gt;   end&lt;br /&gt;&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts String.downcase_table_utf8.to_s&lt;br /&gt;&lt;br /&gt;#puts String.letters_utf8.to_s&lt;br /&gt;#String.letters_utf8.each { |c| puts "#{c.inspect} ::  #{c}" }&lt;br /&gt;&lt;br /&gt;str = "&#338;uvres Compl&#232;tes"&lt;br /&gt;str = "&#338;uvres \000Compl&#232;tes"&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;str = str.validate_utf8; p str&lt;br /&gt;str = str.clean_utf8; p str&lt;br /&gt;str.utf8?  ? 
"#{str}: UTF-8 string seems OK!\n".display : "#{str}: No valid UTF-8 string!\n".display&lt;br /&gt;puts str.asciify_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;str_in_utf8 = "\303\251"&lt;br /&gt;print "UTF-16:   "; p Iconv.iconv('UTF-16', 'UTF-8', str_in_utf8 ).first&lt;br /&gt;print "UTF-16BE: "; p Iconv.iconv('UTF-16BE', 'UTF-8', str_in_utf8 ).first&lt;br /&gt;print "UTF-16LE: "; p str_in_utf8.utf8_to_utf16le&lt;br /&gt;str_in_utf16le = "c\000a\000f\000\351\000"&lt;br /&gt;puts str_in_utf16le.utf16le_to_utf8&lt;br /&gt;puts str_in_utf16le.utf16le_to_utf8.asciify_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts str.upcase_utf8&lt;br /&gt;puts str.downcase_utf8&lt;br /&gt;puts str.capitalize_utf8&lt;br /&gt;puts str.capitalize_utf8!&lt;br /&gt;puts str.swapcase_utf8&lt;br /&gt;puts "&#224;cA&#32459;f&#233;&#224;".swapcase_utf8&lt;br /&gt;puts "&#224;cA&#32459;f&#233;&#224;".swapcase_utf8!&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts str.slice_utf8(/../i)&lt;br /&gt;puts str.slice_utf8(/(.).*?\1/i)&lt;br /&gt;puts "&#224;&#192;".slice_utf8(/(.).*?\1/i)   # =&gt; nil despite the i option!
&lt;br /&gt;puts "aA".slice(/(.).*?\1/i)        # =&gt; aA&lt;br /&gt;puts "&#224;&#192; &#224;&#192;".slice_utf8!(/([&#224;&#192;]).*?\1/i)&lt;br /&gt;puts "&#224;&#192; &#224;&#192;".slice_utf8!(/(.).*?\1/ium)&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".slice_utf8!(/(.).*?\1/ium)&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;str.capitalize_utf8.each_utf8_char_with_index { |c,i| puts "#{i}: #{c}" }&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts str.range_utf8(0..2)&lt;br /&gt;puts str.range_utf8(0..-2)&lt;br /&gt;puts str.range_utf8(-4..-1)&lt;br /&gt;puts str.range_utf8(-3..-1)&lt;br /&gt;puts str.range_utf8(-3...-1)&lt;br /&gt;puts str.range_utf8([-3..-1])&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;p str.scan_utf8(/./)&lt;br /&gt;"&#224;cA&#32459;f&#233;&#224;".scan_utf8(/./) { |c| puts c }&lt;br /&gt;"&#224;cA&#32459;f&#233;&#224;".scan_utf8(/(.)(.)?/) { |a,b| print a,b,"\n" }&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;p "&#224;cA&#32459;f&#233;&#224;".index_utf8('&#32459;')&lt;br /&gt;p "&#224;cA&#32459;f&#233;&#224;".index_utf8('&#32459;f')&lt;br /&gt;p "&#224;cA&#32459;f&#233;&#224;".index_utf8('z')&lt;br /&gt;p "kf&#233;&#224; &#32459;f &#224;c &#32459; 9h&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f 9&#32459;!fz".index_utf8('9&#32459;!fz')&lt;br /&gt;p "kf&#233;&#224; &#32459;f &#224;c &#32459; 9h&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f 9&#32459;!ofz 9&#32459;!fz".index_utf8(/9&#32459;!fz/)&lt;br /&gt;p "kf&#233;&#224; &#32459;f &#224;c &#32459; 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f 9&#32459;!ofz 9&#32459;!fz kf&#233;&#224; &#32459;f &#224;c &#32459; 9h&#32459;!fz 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f".index_utf8(//)&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;p "kf&#233;&#224; &#32459;f &#224;c &#32459; 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f 9&#32459;!ofz 9&#32459;!fz kf&#233;&#224; &#32459;f &#224;c &#32459; 9h&#32459;!fz 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f".rindex_utf8('9&#32459;!fz')
&lt;br /&gt;p "kf&#233;&#224; &#32459;f &#224;c &#32459; 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f 9&#32459;!ofz 9&#32459;!fz kf&#233;&#224; &#32459;f &#224;c &#32459; 9h&#32459;!fz 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f".rindex_utf8(/9&#32459;!fz/)&lt;br /&gt;p "kf&#233;&#224; &#32459;f &#224;c &#32459; 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f 9&#32459;!ofz 9&#32459;!fz kf&#233;&#224; &#32459;f &#224;c &#32459; 9h&#32459;!fz 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f".rindex_utf8(/9..fz/)&lt;br /&gt;p "kf&#233;&#224; &#32459;f &#224;c &#32459; 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f 9&#32459;!ofz 9&#32459;!fz kf&#233;&#224; &#32459;f &#224;c &#32459; 9h&#32459;!fz 9&#32459;!fz A&#32459;kf&#233;&#224; &#32459;f".rindex_utf8(//)&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts "&#224;cA&#32459;f&#233;&#224;".utf8_to_utf16le.utf16le_to_utf8&lt;br /&gt;puts "&#224;cA&#32459;f&#233;&#224;".utf8_to_utf16le.utf16le_to_utf8.asciify_utf8&lt;br /&gt;puts "&#224;&#192;".slice_utf8(/../i)&lt;br /&gt;puts "&#224;&#192;".slice_utf8!(/../i)&lt;br /&gt;&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".count_utf8('&#32459;')&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".count_utf8('&#224;&#192;')&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".count_utf8('z')&lt;br /&gt;puts "&#32459; &#224;&#192;/ ^&#32459; &#224;&#192;".count_utf8('/&#32459;^')&lt;br /&gt;puts "&#32459; &#224;&#192;/ ^&#32459; &#224;&#192;".count_utf8('^/&#32459;^')  # count all chars except those specified; note that the leading ^ will result in the regex: /[^\/&#32459;^]/u&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".delete_utf8('&#224;&#192; ')&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192; &#32459; &#224;&#192; &#32459; &#224;&#192;".delete_utf8!('&#607;&#32459;&#224; &#230;&#165;')&lt;br /&gt;&lt;br /&gt;puts str.cut_utf8(0,5)&lt;br /&gt;puts str.cut_utf8(-5,5)
&lt;br /&gt;puts str.cut_utf8(-10,50)&lt;br /&gt;&lt;br /&gt;puts str.length_utf8&lt;br /&gt;puts str.size_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".first_utf8&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".last_utf8&lt;br /&gt;p "&#32459; &#224;&#192; &#32459; &#224;&#192;\n".last_utf8&lt;br /&gt;puts "".first_utf8&lt;br /&gt;&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".starts_with_utf8?('&#32459;')&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".ends_with_utf8?('k')&lt;br /&gt;puts "".ends_with_utf8?('k')&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;&#192;".ends_with_utf8?('')&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;".starts_with_utf8?('&#32459; &#224;&#192; &#32459; &#224;&#192;')&lt;br /&gt;&lt;br /&gt;puts "&#32459; &#224;&#192; &#32459; &#224;".insert_utf8(20, "abc")&lt;br /&gt;puts "&#32459;&#224;&#192;&#32459;&#224;".insert_utf8(2, "abc")&lt;br /&gt;puts "&#32459;&#224;&#192;&#32459;&#224;".insert_utf8(-2, "abc")&lt;br /&gt;puts "&#32459;&#224;&#192;&#32459;&#224;".insert_utf8(-200, "abc")&lt;br /&gt;puts "&#32459;&#224;&#192;&#32459;&#224;".insert_utf8(200, "abc")&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;p "Hello, world!".utf8_to_unicode&lt;br /&gt;p "&#32459;&#224;&#192;&#32459;&#224;".utf8_to_unicode&lt;br /&gt;p "&#32459;&#224;&#192;&#32459;&#224;&#66374;".utf8_to_unicode&lt;br /&gt;&lt;br /&gt;puts "Hello, world!".utf8_to_unicode.unicode_to_utf8&lt;br /&gt;puts "&#32459;&#224;&#192;&#32459;&#224;&#66374;".utf8_to_unicode.unicode_to_utf8&lt;br /&gt;puts "&#32459;&#224;&#192;&#32459;&#224;&#66374;".size_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;encoded_file = "/ISO-8859-Latin-1.txt"&lt;br /&gt;encoded_file = "/cp1252.txt"&lt;br /&gt;&lt;br /&gt;File.open(encoded_file).read.each(nil) do |str| &lt;br /&gt;   p str&lt;br /&gt;   #str = str.latin1_to_utf8&lt;br /&gt;   str = str.cp1252_to_utf8
&lt;br /&gt;   p str&lt;br /&gt;   puts str&lt;br /&gt;   str.utf8? ? (puts "UTF-8 conversion - YES") : (puts "UTF-8 conversion - NO") &lt;br /&gt;end &lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts "U+00bfHabla espaU+00f1ol?".unicode_to_utf8&lt;br /&gt;&lt;br /&gt;# cf. http://www.decodeunicode.org/en/miscellaneous_symbols&lt;br /&gt;code_points = &lt;&lt;-EOS&lt;br /&gt;U+2603   SNOWMAN&lt;br /&gt;U+2708   AIRPLANE&lt;br /&gt;U+00a9   COPYRIGHT SIGN&lt;br /&gt;U+2615   HOT BEVERAGE&lt;br /&gt;U+2602   UMBRELLA&lt;br /&gt;U+2614   UMBRELLA WITH RAIN DROPS&lt;br /&gt;U+261D   WHITE UP POINTING INDEX&lt;br /&gt;U+2620   SKULL AND CROSSBONES&lt;br /&gt;U+262F   YIN YANG&lt;br /&gt;U+262E   PEACE SYMBOL&lt;br /&gt;U+263A   WHITE SMILING FACE&lt;br /&gt;EOS&lt;br /&gt;&lt;br /&gt;puts code_points.unicode_to_utf8&lt;br /&gt;&lt;br /&gt;# see:&lt;br /&gt;# - http://intertwingly.net/stories/2004/04/14/i18n.html (I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;n)&lt;br /&gt;# - http://www.intertwingly.net/blog/1763.html (Unicode and weblogs)&lt;br /&gt;# - http://www.intertwingly.net/blog/1768.html (UTF-8 musings)&lt;br /&gt;&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;n".asciify_utf8&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;n".utf8_to_unicode&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;n".utf8_to_unicode.unicode_to_utf8&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;n".size_utf8&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;n".upcase_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;# NOTE: To convert the following UTF-8 strings containing a \x00 to dec, hex or oct you have to add \x00 to UTF8REGEX:  [\x00\x09\x0A\x0D\x20-\x7E]            # ASCII &lt;br /&gt;p "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_dec&lt;br /&gt;puts 
"I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_dec&lt;br /&gt;p "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_dec.dec_to_utf8&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_dec.dec_to_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;p "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_hex&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_hex&lt;br /&gt;p "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_hex.hex_to_utf8&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_hex.hex_to_utf8&lt;br /&gt;    &lt;br /&gt;puts&lt;br /&gt;p "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_oct&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_oct&lt;br /&gt;p "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_oct.oct_to_utf8&lt;br /&gt;puts "I&#241;t&#235;rn&#226;ti&#244;n&#224;liz&#230;ti&#248;\x00n".utf8_to_oct.oct_to_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;puts '"Hello, world" in Portuguese: "Ol&#225; Mundo" or "Al&#244; Mundo" (Portugu&#234;s)'.email_subject_utf8&lt;br /&gt;&lt;br /&gt;puts&lt;br /&gt;file = "http://www.ruby-forum.com"&lt;br /&gt;file = "http://blade.nagaokaut.ac.jp"&lt;br /&gt;file = "http://blade.nagaokaut.ac.jp/ruby/ruby-talk/index.shtml"&lt;br /&gt;file = "http://www.columbia.edu/kermit/utf8.html"   #  UTF-8 SAMPLER&lt;br /&gt;&lt;br /&gt;p file.utf8_encoded_file?
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;require 'open-uri'  &lt;br /&gt;  &lt;br /&gt;# UnicodeData.txt&lt;br /&gt;unicode_array = []&lt;br /&gt;&lt;br /&gt;open('http://unicode.org/Public/UNIDATA/UnicodeData.txt') do |f| &lt;br /&gt;   #f.each(nil) do |line| line.scan(/^[^;]+/) { |u| unicode_array &lt;&lt; u } end       # all code points&lt;br /&gt;   f.each do |line| line =~ /LATIN|GREEK|CYRILLIC/ ?  ( line.scan(/^[^;]+/) { |u| unicode_array &lt;&lt; u } ) : next end&lt;br /&gt;end&lt;br /&gt;unicode_array.each { |x| u = [x.hex].pack("U*"); u.utf8? ? (puts "U+#{x} ::  #{u.inspect}  ::  #{u}") : (puts "U+#{x} ::  #{u.inspect}  ::  #{u}  :: NO!") } &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;class Array&lt;br /&gt;   def dups_indices   # cf. http://www.ruby-forum.com/topic/122008 and http://snippets.dzone.com/posts/show/4148&lt;br /&gt;      (0...self.size).to_a - self.uniq.map{ |x| index(x) }&lt;br /&gt;   end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;#  CaseFolding.txt&lt;br /&gt;capital_letters_utf8 = []&lt;br /&gt;small_letters_utf8 = []&lt;br /&gt;&lt;br /&gt;open('http://www.unicode.org/Public/UNIDATA/CaseFolding.txt') do |f| &lt;br /&gt;   f.each do |line| &lt;br /&gt;      if line =~ /.*/ &lt;br /&gt;      #if line =~ /LATIN|GREEK|CYRILLIC/ &lt;br /&gt;         line.scan(/^([^;#]+); +\S+ ([^;\s]+)/) { capital_letters_utf8 &lt;&lt; [$1.hex].pack("U*"); small_letters_utf8 &lt;&lt; [$2.hex].pack("U*") }&lt;br /&gt;      end&lt;br /&gt;   end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;puts small_letters_utf8.size, capital_letters_utf8.size&lt;br /&gt;deleted_pairs = []&lt;br /&gt;small_letters_utf8.dups_indices.reverse.each do |i|   # small_letters_utf8 will be array_with_keys below&lt;br /&gt;   deleted_pairs &lt;&lt; [small_letters_utf8.at(i), capital_letters_utf8.at(i)]&lt;br /&gt;   small_letters_utf8.delete_at(i); capital_letters_utf8.delete_at(i)&lt;br /&gt;end&lt;br /&gt;puts small_letters_utf8.size, capital_letters_utf8.size&lt;br 
/&gt;&lt;br /&gt;# Hash[*array_with_keys.zip(array_with_values).flatten]&lt;br /&gt;upcase_table_utf8 = Hash[*small_letters_utf8.zip(capital_letters_utf8).flatten]&lt;br /&gt;#upcase_table_utf8.each_pair { |k,v| puts "#{k} :: #{v}" }&lt;br /&gt;&lt;br /&gt;puts upcase_table_utf8["a"]&lt;br /&gt;puts upcase_table_utf8["&#7834;"]&lt;br /&gt;puts upcase_table_utf8.value?("A")&lt;br /&gt;&lt;br /&gt;deleted_pairs.each { |s,c| puts "deleted:  #{s}   ::   #{c}" }&lt;br /&gt;&lt;br /&gt;upcase_table_utf8.size.times do |i|&lt;br /&gt;#20.times do |i|&lt;br /&gt;   puts "array index #{i}  ::  #{small_letters_utf8.at(i)}  ::  #{capital_letters_utf8.at(i)}"&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;</description>
      <pubDate>Tue, 11 Sep 2007 18:09:13 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4527</guid>
      <author>ntk ()</author>
    </item>
    <item>
      <title>Ruby - "meta method" execute</title>
      <link>http://snippets.dzone.com/posts/show/4452</link>
      <description>//&lt;br /&gt;//&lt;br /&gt;// attempt to execute a method indirectly, I don't know if&lt;br /&gt;// it's possible, but I suspect it is, and probably just&lt;br /&gt;// have the syntax wrong&lt;br /&gt;//&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;ar = %w(apples bananas oranges)&lt;br /&gt;&lt;br /&gt;print "\n ar.class = #{ar.class}"&lt;br /&gt;&lt;br /&gt;print "\n ar.methods = #{ar.methods.sort}"&lt;br /&gt;puts "======================================================="&lt;br /&gt;&lt;br /&gt;mets = ar.methods.sort&lt;br /&gt;&lt;br /&gt;#&lt;br /&gt;# this doesn't work ... is there a syntax for doing this?&lt;br /&gt;#                       i.e. calling a 'meta' method ?&lt;br /&gt;#&lt;br /&gt;# call each method of 'ar' (an array)&lt;br /&gt;#&lt;br /&gt;#&lt;br /&gt;&lt;br /&gt;mets.each {|method| ar.method} &lt;br /&gt;&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 23 Aug 2007 19:19:12 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4452</guid>
      <author>mstram (Mike Stramba)</author>
    </item>
    <item>
      <title>Constructor overloading in Java and PHP 5</title>
      <link>http://snippets.dzone.com/posts/show/3237</link>
      <description>Constructor overloading in Java&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;class Hoge {&lt;br /&gt;&lt;br /&gt;    public Hoge(){&lt;br /&gt;        System.out.println("constructor 0");&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public Hoge(int a){&lt;br /&gt;        System.out.println("constructor 1:" + a);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public Hoge(int a, String[] hoge){&lt;br /&gt;        // Arrays.toString prints the elements rather than the array's identity hash&lt;br /&gt;        System.out.println("constructor 2:" + a + ":" + java.util.Arrays.toString(hoge));&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public Hoge(A a, B b, C c){&lt;br /&gt;        System.out.println("constructor 3:" + a + ":" + b + ":" + c);&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;PHP 5 has no constructor overloading, but it can be emulated by dispatching on the number of arguments:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;class Hoge {&lt;br /&gt;&lt;br /&gt;    public function __construct(){&lt;br /&gt;        $num = func_num_args();&lt;br /&gt;        $args = func_get_args();&lt;br /&gt;        switch($num){&lt;br /&gt;        case 0:&lt;br /&gt;            $this-&gt;callConstructor('__construct0', $args);&lt;br /&gt;            break;&lt;br /&gt;        case 1:&lt;br /&gt;            $this-&gt;callConstructor('__construct1', $args);&lt;br /&gt;            break;&lt;br /&gt;        case 2:&lt;br /&gt;            $this-&gt;callConstructor('__construct2', $args);&lt;br /&gt;            break;&lt;br /&gt;        case 3:&lt;br /&gt;            $this-&gt;callConstructor('__construct3', $args);&lt;br /&gt;            break;&lt;br /&gt;        default:&lt;br /&gt;            throw new Exception("unsupported number of arguments: " . $num);&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public function __construct0(){&lt;br /&gt;        echo "constructor 0" . PHP_EOL;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public function __construct1($a){&lt;br /&gt;        echo "constructor 1: " . $a . PHP_EOL;&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public function __construct2($a, array $hoge){&lt;br /&gt;        echo "constructor 2: " . $a . PHP_EOL;&lt;br /&gt;        var_dump($hoge);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    public function __construct3(A $a, B $b, C $c){&lt;br /&gt;        echo "constructor 3: " . PHP_EOL;&lt;br /&gt;        var_dump($a, $b, $c);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // a plain private helper, not the magic __call, which must be public as of PHP 5.3&lt;br /&gt;    private function callConstructor($name, array $args){&lt;br /&gt;        return call_user_func_array(array($this, $name), $args);&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Example (PHP 5)&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;interface C {&lt;br /&gt;    const C = __CLASS__;&lt;br /&gt;}&lt;br /&gt;class A {&lt;br /&gt;    const A = __CLASS__;&lt;br /&gt;    private $a = array(1, 2, 3);&lt;br /&gt;}&lt;br /&gt;class B extends A{&lt;br /&gt;    const B = __CLASS__;&lt;br /&gt;    private $b = array(1, 2, 3);&lt;br /&gt;}&lt;br /&gt;class D extends B implements C {&lt;br /&gt;    const D = __CLASS__;&lt;br /&gt;    private $c = array(1, 2, 3);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;$a = new Hoge();&lt;br /&gt;$b = new Hoge(1);&lt;br /&gt;$c = new Hoge(777, array(1,2,3));&lt;br /&gt;$d = new Hoge(new A(), new B(), new D());&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 04 Jan 2007 15:39:12 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/3237</guid>
      <author>nowel (hata)</author>
    </item>
    <item>
      <title>#post method in tests with a different controller</title>
      <link>http://snippets.dzone.com/posts/show/2880</link>
      <description>I wanted a #login method in test_helper that would let me log in easily from any of my functional tests. However, #post always uses the controller stored in the @controller instance variable, which is set in your test's #setup, and offers no way to name a different one. Looking at how the &lt;a href="http://api.rubyonrails.org/classes/ActionController/TestProcess.html#M000046"&gt;#process method&lt;/a&gt; works, you can see that it simply grabs the controller from @controller. Swap that out temporarily, and you're good to go:&lt;br /&gt;&lt;code&gt;old_controller = @controller&lt;br /&gt;@controller = LoginController.new&lt;br /&gt;post(&lt;br /&gt;  :attempt_login,&lt;br /&gt;  {:user =&gt; {:name =&gt; 'joe', :password =&gt; 'password'}}&lt;br /&gt;)&lt;br /&gt;@controller = old_controller&lt;/code&gt;&lt;br /&gt;If you have several login methods, such as #login_admin and #login_regular, you can write a wrapper to reduce duplication:&lt;br /&gt;&lt;code&gt;def wrap_with_controller( new_controller = LoginController )&lt;br /&gt;  old_controller = @controller&lt;br /&gt;  @controller = new_controller.new&lt;br /&gt;  yield&lt;br /&gt;ensure&lt;br /&gt;  # restore the original controller even if the block raises&lt;br /&gt;  @controller = old_controller&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;def login_admin&lt;br /&gt;  wrap_with_controller do&lt;br /&gt;    post(&lt;br /&gt;      :attempt_login,&lt;br /&gt;      {:user =&gt; {:name =&gt; 'root', :password =&gt; 'password'}}&lt;br /&gt;    )&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;def login_regular&lt;br /&gt;  wrap_with_controller do&lt;br /&gt;    post(&lt;br /&gt;      :attempt_login,&lt;br /&gt;      {:user =&gt; {:name =&gt; 'joe', :password =&gt; 'password'}}&lt;br /&gt;    )&lt;br /&gt;  end&lt;br /&gt;end&lt;/code&gt;&lt;br /&gt;</description>
      <pubDate>Tue, 24 Oct 2006 18:54:12 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/2880</guid>
      <author>moneypenny ()</author>
    </item>
    <item>
      <title>Get the name of the calling method</title>
      <link>http://snippets.dzone.com/posts/show/2787</link>
      <description>caller_method_name returns the name of the method that called the current method.&lt;br /&gt;parse_caller also extracts the file and line number of the call, so those are available as well.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;br /&gt;def caller_method_name&lt;br /&gt;    parse_caller(caller(2).first).last&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;def parse_caller(at)&lt;br /&gt;    if /^(.+?):(\d+)(?::in `(.*)')?/ =~ at&lt;br /&gt;        file = Regexp.last_match[1]&lt;br /&gt;        line = Regexp.last_match[2].to_i&lt;br /&gt;        method = Regexp.last_match[3]&lt;br /&gt;        [file, line, method]&lt;br /&gt;    end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 05 Oct 2006 21:21:00 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/2787</guid>
      <author>derbumi (Michael)</author>
    </item>
    <item>
      <title>Get the currently running method name in Ruby</title>
      <link>http://snippets.dzone.com/posts/show/2785</link>
      <description>From: http://www.ruby-forum.com/topic/75258&lt;br /&gt;&lt;br /&gt;Author: Robert Klemme&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;br /&gt;module Kernel&lt;br /&gt;  private&lt;br /&gt;  def this_method_name&lt;br /&gt;    caller[0] =~ /`([^']*)'/ and $1&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;class Foo&lt;br /&gt;  def test_method&lt;br /&gt;    this_method_name&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;puts Foo.new.test_method    # =&gt; test_method&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 05 Oct 2006 19:00:47 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/2785</guid>
      <author>ntk ()</author>
    </item>
  </channel>
</rss>
