Jyte Spy feed scraper

Scrapes the Jyte Spy data feed (JSON whose values are HTML fragments) and turns it into an array of Hashes, one per event, sorted by position in the feed.

# stdlib includes
require 'net/http'
require 'digest/md5'

# RubyGems includes
require 'rubygems'
require 'hpricot'
require 'json'

# If we don't use UTF8, Jonathan Rascher's evil byte order mark claim title
# hack will break the JSON parser. $KCODE only has an effect on Ruby 1.8, so
# it is set only there.
$KCODE = 'UTF8' if RUBY_VERSION < '1.9'

module Jyte; module Spy

  JYTE_SPY_URL = URI.parse('http://jyte.com/spy/update')

  # Gets an array of Hashes representing events in the Jyte Spy data feed,
  # sorted by their position in the feed.
  def self.get_events
    events = []
    data   = JSON.parse(Net::HTTP.get(JYTE_SPY_URL))

    data.each do |key, value|
      # Only String values are HTML fragments describing an event.
      next unless value.is_a?(String)

      html = Hpricot(value)

      # Work out what kind of event this fragment describes.
      if html.at("div/span[@class='agreed']")
        action = :agreed
      elsif html.at("div/span[@class='disagreed']")
        action = :disagreed
      elsif value =~ /<\/a> commented on/
        action = :comment
      else
        action = :claim
      end

      first_link  = html.at("div/a")
      second_link = html.at("div/a[2]")

      # Skip fragments without both links instead of failing on nil.
      next unless first_link && second_link

      # New claims link to the claim first and the user second; every other
      # event links to the user first and the claim second.
      claim_link, user_link =
        action == :claim ? [first_link, second_link] : [second_link, first_link]

      claim = {
        :name => claim_link.inner_html.strip,
        :url  => claim_link['href']
      }

      user = {
        :name    => user_link.inner_html.strip,
        :openid  => user_link['title'],
        :profile => user_link['href']
      }

      events << {
        :action   => action,
        :claim    => claim,
        :hash     => Digest::MD5.hexdigest(value),  # fingerprint of the raw fragment
        :position => key[/\d+/].to_i,               # numeric part of the feed key
        :user     => user
      }
    end

    events.sort_by { |item| item[:position] }
  end

end; end
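
Each event carries a :hash, the MD5 of its raw HTML fragment. The snippet does not say what it is for, but one plausible use is spotting events already seen on an earlier poll. Below is a minimal sketch of that idea, using only the standard library. The new_events and sample_event helpers and the sample fragments are hypothetical and not part of the original snippet; the sample Hashes only imitate the shape returned by Jyte::Spy.get_events.

```ruby
require 'digest/md5'
require 'set'

# Hypothetical helper: keeps only the events whose :hash is not yet in
# `seen`, recording each new hash as it goes. Set#add? returns nil when the
# element is already present, so repeated events are filtered out.
def new_events(events, seen)
  events.select { |event| seen.add?(event[:hash]) }
end

# Hypothetical stand-in for one event Hash from Jyte::Spy.get_events.
def sample_event(position, fragment)
  {
    :action   => :claim,
    :hash     => Digest::MD5.hexdigest(fragment),
    :position => position
  }
end

seen = Set.new

# Two made-up polls; the second repeats one fragment from the first.
first_poll  = [sample_event(1, '<div>one</div>'), sample_event(2, '<div>two</div>')]
second_poll = [sample_event(1, '<div>two</div>'), sample_event(2, '<div>three</div>')]

fresh_first  = new_events(first_poll, seen)
fresh_second = new_events(second_poll, seen)

puts fresh_first.length
puts fresh_second.length
```

With real data, the events passed to new_events would come from Jyte::Spy.get_events on each poll, and the seen Set would be kept between polls.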
