<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DZone Snippets: words code</title>
    <link>http://snippets.dzone.com/posts</link>
    <pubDate>Fri, 16 May 2008 15:05:10 GMT</pubDate>
    <description>DZone Snippets: words code</description>
    <item>
      <title>Ruby word count</title>
      <link>http://snippets.dzone.com/posts/show/5112</link>
      <description>&lt;code&gt;&lt;br /&gt;module StringExtensions&lt;br /&gt;  def words&lt;br /&gt;    s = self.dup&lt;br /&gt;    s.gsub!(/\w+/, 'X')&lt;br /&gt;    s.gsub!(/\W+/, '')&lt;br /&gt;    s.length&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Wed, 06 Feb 2008 19:29:34 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/5112</guid>
      <author>sikelianos (Zeke Sikelianos)</author>
    </item>
    <item>
      <title>Ruby Title Case</title>
      <link>http://snippets.dzone.com/posts/show/4702</link>
      <description>Capitalize all words in a string with the briefest of snippets!&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;'some string here'.gsub(/\b\w/){$&amp;.upcase}&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Sat, 27 Oct 2007 00:41:10 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4702</guid>
      <author>eliazar (eliazar parra)</author>
    </item>
    <item>
      <title>Elegant way to shorten a text string</title>
      <link>http://snippets.dzone.com/posts/show/4578</link>
      <description>This method shortens a plain-text string to the complete words that fit within a given character count.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;def shorten(string, count = 30)&lt;br /&gt;  # Strings that already fit are returned unchanged&lt;br /&gt;  if string.length &gt; count&lt;br /&gt;    shortened = string[0, count]&lt;br /&gt;    splitted = shortened.split(/\s/)&lt;br /&gt;    words = splitted.length&lt;br /&gt;    # Drop the last, possibly cut-off, word&lt;br /&gt;    splitted[0, words-1].join(" ") + ' ...'&lt;br /&gt;  else&lt;br /&gt;    string&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 27 Sep 2007 11:47:15 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4578</guid>
      <author>labuschin (Martin Labuschin)</author>
    </item>
    <item>
      <title>Ruby dictionary username generation</title>
      <link>http://snippets.dzone.com/posts/show/4536</link>
      <description>Generate a new random name from dictionary words.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;DICT_PATH = '/usr/share/dict/words'&lt;br /&gt;DICT_SIZE = 234936&lt;br /&gt;&lt;br /&gt;def self.generated_name(words = 2, length = 23)&lt;br /&gt;  name = 'a'*(length+1)&lt;br /&gt;  # Retry until the joined name fits within length&lt;br /&gt;  while name.length &gt; length&lt;br /&gt;    # sed line numbers start at 1, so shift rand's 0-based result&lt;br /&gt;    name = (1..words).map{%x[sed -n '#{rand(DICT_SIZE) + 1} {p;q;}' '#{DICT_PATH}'].chomp.capitalize}.join&lt;br /&gt;  end&lt;br /&gt;  # A while loop evaluates to nil, so return the name explicitly&lt;br /&gt;  name&lt;br /&gt;end&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 13 Sep 2007 13:57:59 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4536</guid>
      <author>elliottcable (elliott cable)</author>
    </item>
    <item>
      <title>Simple Haskell script for word counting</title>
      <link>http://snippets.dzone.com/posts/show/4263</link>
      <description>This is just a simple piece of code I put together to play with some Haskell when I realised I've not been writing nearly enough of the stuff. &lt;br /&gt;&lt;br /&gt;It reads text from stdin and prints the words it finds together with how many times each one occurred.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;module Main&lt;br /&gt;where&lt;br /&gt;&lt;br /&gt;import Data.List (group, sort, sortBy)&lt;br /&gt;import Control.Arrow&lt;br /&gt;&lt;br /&gt;type Comparator a = (a -&gt; a -&gt; Ordering)&lt;br /&gt;&lt;br /&gt;ascending :: (Ord a) =&gt; (b -&gt; a) -&gt; Comparator b&lt;br /&gt;ascending f x y = compare (f x) (f y)&lt;br /&gt;&lt;br /&gt;descending :: (Ord a) =&gt; (b -&gt; a) -&gt; Comparator b&lt;br /&gt;descending = flip . ascending&lt;br /&gt;&lt;br /&gt;secondary :: Comparator a -&gt; Comparator a -&gt; Comparator a&lt;br /&gt;secondary f g x y = case f x y of {&lt;br /&gt;                    EQ -&gt; g x y;&lt;br /&gt;                    z  -&gt; z; }&lt;br /&gt;&lt;br /&gt;-- Returns a list of unique elements together with their frequency. Listed in decreasing order of frequency, followed by&lt;br /&gt;-- increasing order of the elements.&lt;br /&gt;count :: (Ord a) =&gt; [a] -&gt; [(a, Int)]&lt;br /&gt;count = map (head &amp;&amp;&amp; length) . sortBy (descending length `secondary` ascending head) . group . sort&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = interact $ unlines . map (\(x, y) -&gt; (take 20 $ x ++ repeat ' ')  ++ " : " ++ show y) . count . words&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Thu, 05 Jul 2007 13:52:45 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4263</guid>
      <author>DRMacIver (David R. MacIver)</author>
    </item>
    <item>
      <title>Fast stop word detection in Ruby</title>
      <link>http://snippets.dzone.com/posts/show/4236</link>
      <description>Requires &lt;a href="http://snippets.dzone.com/posts/show/4235"&gt;BloominSimple&lt;/a&gt; (a pure Ruby Bloom filter class).&lt;br /&gt;&lt;br /&gt;List of stop words obtained from &lt;a href="http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words"&gt;http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;# Detect stop words QUICKLY&lt;br /&gt;# Uses a bloom filter instead of searching literally through a list of stopwords&lt;br /&gt;# for &gt; 3x speed increase&lt;br /&gt;# &lt;br /&gt;#    using bloom filter: 2.580000   0.030000   2.610000 (  2.698829)&lt;br /&gt;#  using literal search: 7.850000   0.120000   7.970000 (  8.181684)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;require 'bloominsimple'&lt;br /&gt;require 'benchmark'&lt;br /&gt;require 'digest/sha1'&lt;br /&gt;require 'pp'&lt;br /&gt;&lt;br /&gt;# Create a simple bloom filter that uses a SHA1 hash (more effective than BloominSimple's default hashing)&lt;br /&gt;b = BloominSimple.new(50000) do |word|&lt;br /&gt;  Digest::SHA1.digest(word.downcase.strip).unpack("VVV")&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;# Add stopwords to the bloom filter!&lt;br /&gt;stopwords = []&lt;br /&gt;File.open('stopwords').each { |a| b.add(a); stopwords &lt;&lt; a.downcase.strip }&lt;br /&gt;&lt;br /&gt;# Read in a whole dictionary of regular words&lt;br /&gt;words = File.open('/usr/share/dict/words').read.split.collect{|a| a.downcase.strip }&lt;br /&gt;&lt;br /&gt;# Define two ways to detect stopwords for comparison..
&lt;br /&gt;using_filter = lambda { |word| b.includes?(word) }&lt;br /&gt;using_array = lambda { |word| stopwords.include?(word.downcase.strip) }&lt;br /&gt;techniques = [using_filter, using_array]&lt;br /&gt;&lt;br /&gt;# Run stopword comparisons with both techniques&lt;br /&gt;t = techniques.collect { |l| words.collect { |a| l[a] } }&lt;br /&gt;&lt;br /&gt;# See how effective the bloom filter has been compared to the literal search&lt;br /&gt;if t[0] == t[1]&lt;br /&gt;  puts "GOOD"&lt;br /&gt;else&lt;br /&gt;  words.zip(t[0],t[1]).each do |x|&lt;br /&gt;    puts x.first if x[1] != x[2]&lt;br /&gt;  end&lt;br /&gt;end&lt;br /&gt;&lt;br /&gt;# Now do speed benchmarks..&lt;br /&gt;techniques.each { |l| puts Benchmark.measure { words.each { |a| l[a] } } }&lt;/code&gt;</description>
      <pubDate>Mon, 02 Jul 2007 03:10:16 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/4236</guid>
      <author>peter (Peter Cooperx)</author>
    </item>
    <item>
      <title>words function</title>
      <link>http://snippets.dzone.com/posts/show/1101</link>
      <description>&lt;code&gt;&lt;br /&gt;	words: func [&lt;br /&gt;		"Returns block of object words, excluding self directive."&lt;br /&gt;		object [object!] "Target object."&lt;br /&gt;	][&lt;br /&gt;	   next first object&lt;br /&gt;	]&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Tue, 10 Jan 2006 05:45:26 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/1101</guid>
      <author>gregg.irwin (Gregg Irwin)</author>
    </item>
    <item>
      <title>Truncate text with word boundaries in Ruby</title>
      <link>http://snippets.dzone.com/posts/show/804</link>
      <description>&lt;code&gt;&lt;br /&gt;  def truncate_words(text, length = 30, end_string = ' &#8230;')&lt;br /&gt;    words = text.split()&lt;br /&gt;    words[0..(length-1)].join(' ') + (words.length &gt; length ? end_string : '')&lt;br /&gt;  end&lt;br /&gt;&lt;/code&gt;</description>
      <pubDate>Wed, 12 Oct 2005 04:17:01 GMT</pubDate>
      <guid>http://snippets.dzone.com/posts/show/804</guid>
      <author>chao (Chao Lam)</author>
    </item>
  </channel>
</rss>
