Never been to DZone Snippets before?

Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world.

« Newer Snippets
Older Snippets »
Showing 1-4 of 4 total  RSS 

itcrowdquote.rb

// Prints a quote from Channel 4's "The IT Crowd"

#!/usr/bin/env ruby

require "rubygems"
require "open-uri"
require "hpricot"
require "htmlentities"

coder=HTMLEntities.new()

doc=open("http://www.channel4.com/entertainment/tv/microsites/I/itcrowd/quote_generator/") { |f| Hpricot(f) }

section=doc/"blockquote"/"p"
(section/"cite").remove()
quote=section.inner_html

# remove leading whitespace
quote=quote.gsub(/^\s+/, "")

# remove trailing whitespace
quote=quote.gsub(/\s+$/, $/)

# remove dash
quote=quote.gsub(/\s\-\s+$/, $/).chomp

# decode HTML entities
quote=coder.decode(quote)

puts quote
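
The cleanup steps above (trim whitespace, drop the trailing dash, decode entities) can also be sketched with the standard library alone. This is only a rough sketch: `clean_quote` is a hypothetical helper name, the sample string is made up, and `CGI.unescapeHTML` decodes only the basic entities, so it is not a full replacement for the htmlentities gem.

```ruby
require "cgi"

# Hypothetical helper: tidy the raw inner HTML of a quote paragraph.
def clean_quote(raw)
  text = raw.strip                # drop leading and trailing whitespace
  text = text.sub(/\s*-\z/, "")   # drop a trailing attribution dash, if present
  CGI.unescapeHTML(text)          # decode basic entities such as &amp; and &quot;
end

# Made-up sample input, shaped like the scraped paragraph after the cite is removed
puts clean_quote("\n   Have you tried turning it off &amp; on again? - \n")
```

Keeping the cleanup in one small method makes it easy to try against sample strings without hitting the site.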

Mechanize / Hpricot / Scraping setup

require 'rubygems'
require 'cgi'
require 'open-uri'
require 'hpricot'
require 'mechanize'

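# the_url is a placeholder for the address of the page to fetch
agent = WWW::Mechanize.new # Mechanize 1.0 and later: use Mechanize.new
doc = Hpricot(agent.get(the_url).parser.to_s)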

Scraping Google Search Results with Hpricot

// snagged from http://g-module.rubyforge.org/

require 'rubygems'
require 'cgi'
require 'open-uri'
require 'hpricot'

q = %w{meine kleine suchanfrage}.map { |w| CGI.escape(w) }.join("+")
url = "http://www.google.com/search?q=#{q}"
doc = Hpricot(open(url).read)
lucky_url = (doc/"div[@class='g'] a").first["href"]
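# single quotes do not interpolate #{...}; pass the URL as its own argument,
# which also keeps it away from the shell (`open` is the macOS launcher)
system "open", lucky_url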
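
The query string built by hand above could probably also be produced with `URI.encode_www_form` from the standard library, which handles the escaping and the `+` joining in one call. A minimal sketch; `search_url` is a hypothetical helper name, not part of the original snippet:

```ruby
require "uri"

# Hypothetical helper: build a Google search URL from a list of words.
def search_url(words)
  # encode_www_form escapes reserved characters and turns spaces into "+"
  query = URI.encode_www_form("q" => words.join(" "))
  "http://www.google.com/search?#{query}"
end

puts search_url(%w{meine kleine suchanfrage})
```

The result can be passed to `open(url)` the same way as the hand-built string.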

Parse XML with Hpricot

From http://errtheblog.com/post/8
Simple XML is basically HTML with random tags, yeah? Parse it with Hpricot!

Your XML:
<Export>
  <Product>
    <SKU>403276</SKU>
    <ItemName>Trivet</ItemName>
    <CollectionNo>0</CollectionNo>
    <Pages>0</Pages>
  </Product>
</Export>


The code:
require 'hpricot'
FIELDS = %w[SKU ItemName CollectionNo Pages]

doc = Hpricot.parse(File.read("my.xml"))
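# Product is assumed to be an ActiveRecord-style model with one attribute per field
(doc/:product).each do |xml_product|
  product = Product.new
  FIELDS.each do |field|
    # take the text of the first child element named after the field
    product[field] = (xml_product/field.intern).first.innerHTML
  end
  product.save
end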
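
If Hpricot is not available, the same field extraction can be sketched with REXML, which ships with Ruby. This is an alternative under stated assumptions, not the original author's method: `parse_products` and `SAMPLE_XML` are hypothetical names, and plain hashes stand in for the `Product` model.

```ruby
require "rexml/document"

FIELDS = %w[SKU ItemName CollectionNo Pages]

# Same sample document as above, inlined so the sketch is self-contained
SAMPLE_XML = <<-XML
<Export>
  <Product>
    <SKU>403276</SKU>
    <ItemName>Trivet</ItemName>
    <CollectionNo>0</CollectionNo>
    <Pages>0</Pages>
  </Product>
</Export>
XML

# Hypothetical helper: return one hash per <Product> element.
def parse_products(xml)
  doc = REXML::Document.new(xml)
  doc.get_elements("//Product").map do |node|
    # read the text of the child element named after each field
    FIELDS.map { |field| [field, node.elements[field].text] }.to_h
  end
end

parse_products(SAMPLE_XML).each { |product| puts product.inspect }
```

Unlike the Hpricot version, REXML matches element names case-sensitively, so the XPath uses `Product` exactly as it appears in the file.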