Never been to DZone Snippets before?

Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world

« Newer Snippets
Older Snippets »
Showing 1-8 of 8 total  RSS 

Ruby word count

module StringExtensions
  def words
    s = self.dup
    s.gsub!(/\w+/, 'X')
    s.gsub!(/\W+/, '')
    s.length
  end
end

Ruby Title Case

Capitalize all words in a string with the briefest of snippets!

'some string here'.gsub(/\b\w/){$&.upcase}

Elegant way of shorten a text string

this method shortens a plain text string down to complete words contained in given scope (count)

def shorten (string, count = 30)
	if string.length >= count 
		shortened = string[0, count]
		splitted = shortened.split(/\s/)
		words = splitted.length
		splitted[0, words-1].join(" ") + ' ...'
	else 
		string
	end
end

Ruby dictionary username generation

Generate a new random name from dictionary words.

DICT_PATH = '/usr/share/dict/words'
DICT_SIZE = 234936

def self.generated_name words = 2, length = 23
  name = 'a'*(length+1)
  while name.length > length
    name = (1..words).map{%x[sed -n '#{rand(DICT_SIZE)} {p;q;}' '#{DICT_PATH}'].chomp.capitalize}.join
  end
end

Simple Haskell script for word counting

This is just a simple piece of code I put together to play with some Haskell when I realised I've not been writing nearly enough of the stuff.

It reads text from stdin and prints the words it finds together with how many times each one occurred.

module Main
where

import List
import Control.Arrow

type Comparator a = (a -> a -> Ordering)

ascending :: (Ord a) => (b -> a) -> Comparator b
ascending f x y = compare (f x) (f y)

descending :: (Ord a) => (b -> a) -> Comparator b
descending = flip . ascending

secondary :: Comparator a -> Comparator a -> Comparator a
secondary f g x y = case f x y of {
                    EQ -> g x y;
                    z  -> z; }

-- Returns a list of unique elements together with their frequency. Listed in decreasing order of frequency, followed by
increasing order of the elements.
count :: (Ord a) => [a] -> [(a, Int)]
count = map (head &&& length) . sortBy (descending length `secondary` ascending head) . group . sort

main :: IO ()
main = interact $ unlines . map (\(x, y) -> (take 20 $ x ++ repeat ' ')  ++ " : " ++ show y) . count . words

Fast stop word detection in Ruby

Requires BloominSimple (a pure Ruby Bloom filter class).

List of stop words obtained from http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words

# Detect stop words QUICKLY
# Uses a bloom filter instead of searching literally through a list of stopwords
# for > 3x speed increase
# 
#    using bloom filter: 2.580000   0.030000   2.610000 (  2.698829)
#  using literal search: 7.850000   0.120000   7.970000 (  8.181684)


require 'bloominsimple'
require 'digest/sha1'
require 'pp'

# Create a simple bloom filter that uses a SHA1 hash (more effective than BloominSimple's default hashing)
b = BloominSimple.new(50000) do |word|
  Digest::SHA1.digest(word.downcase.strip).unpack("VVV")
end

# Add stopwords to the bloom filter!
stopwords = []
File.open('stopwords').each { |a| b.add(a); stopwords << a.downcase.strip }

# Read in a whole dictionary of regular words
words = File.open('/usr/share/dict/words').read.split.collect{|a| a.downcase.strip }

# Define two ways to detect stopwords for comparison..
using_filter = lambda { |word| b.includes?(word) }
using_array = lambda { |word| stopwords.include?(word.downcase.strip) }
techniques = [using_filter, using_array]

# Run stopword comparisons with both techniques
t = techniques.collect { |l| words.collect { |a| l[a] } }

# See how effective the bloom filter has been compared to the literal search
if t[0] == t[1]
  puts "GOOD"
else
  words.zip(t[0],t[1]).each do |x|
    puts x.first if x[1] != x[2]
  end
end

# Now do speed benchmarks..
techniques.each { |l| puts Benchmark.measure { words.each { |a| l[a] } } }

words function

	words: func [
		"Returns block of object words, excluding self directive."
		object [object!] "Target object."
	][
	   next first object
	]

Truncate text with word boundaries in Ruby

  def truncate_words(text, length = 30, end_string = '')
    words = text.split()
    words[0..(length-1)].join(' ') + (words.length > length ? end_string : '')
  end
« Newer Snippets
Older Snippets »
Showing 1-8 of 8 total  RSS