Scraping Google Search Results With Hpricot
# snagged from http://g-module.rubyforge.org/
require 'rubygems'
require 'cgi'
require 'open-uri'
require 'hpricot'
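# Escape each search term and join the terms with "+"
q = %w{meine kleine suchanfrage}.map { |w| CGI.escape(w) }.join("+")
url = "http://www.google.com/search?q=#{q}"
# Fetch and parse the results page (on Ruby 3.0+, call URI.open(url) instead of open(url))
doc = Hpricot(open(url).read)
# Take the href of the first link inside a div with class "g"
lucky_url = (doc/"div[@class='g'] a").first["href"]
# Pass the URL as a separate argument: '#{...}' is not interpolated inside single quotes
system 'open', lucky_url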
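The URL-building and launch steps can be tried without Hpricot or network access. The following is a minimal sketch, not part of the original snippet: `google_search_url` and `open_command` are hypothetical helper names, and `URI.encode_www_form_component` from the standard library stands in for `CGI.escape`. It also shows that Ruby only interpolates `#{...}` inside double-quoted strings.

```ruby
require 'uri'

# Hypothetical helper: build the search URL from a list of words.
# URI.encode_www_form_component is used here in place of CGI.escape.
def google_search_url(words)
  query = words.map { |w| URI.encode_www_form_component(w) }.join('+')
  "http://www.google.com/search?q=#{query}"
end

# Hypothetical helper: argument list for the macOS `open` command.
# Passing the URL as its own argument to `system` bypasses the shell,
# so characters such as "&" or "?" in the URL need no quoting.
def open_command(url)
  ['open', url]
end

url = google_search_url(%w[meine kleine suchanfrage])
puts url

puts 'open #{url}' # single quotes: the text #{url} is printed literally
puts "open #{url}" # double quotes: the value of url is interpolated

# To actually launch the browser (macOS only), one could call:
#   system(*open_command(url))
```

Keeping the launch step as an argument list rather than one command string is a design choice: a scraped URL is untrusted input, and it never passes through a shell this way.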

Comments
Peter Szinek replied on Wed, 2007/06/13 - 1:10am
require 'rubygems'
require 'scrubyt'

google_data = Scrubyt::Extractor.define do
  fetch 'http://www.google.com/ncr'
  fill_textfield 'q', 'ruby'
  submit
  link "Ruby Programming Language/@href"
  next_page "Next", :limit => 2
end

puts google_data.to_xml

Result:

http://www.ruby-lang.org/
http://www.ruby-lang.org/en/20020101.html
http://en.wikipedia.org/wiki/Ruby_programming_language
http://en.wikipedia.org/wiki/Ruby
http://www.rubyonrails.org/
http://www.rubycentral.com/
http://www.rubycentral.com/book/
http://www.w3.org/TR/ruby/
http://www.zenspider.com/Languages/Ruby/QuickRef.html
http://poignantguide.net/
http://www.rubynz.com/
http://www.ruby-doc.org/
http://tryruby.hobix.com/
http://www.rubycentral.org/
http://www.gemstone.org/ruby.html
http://whytheluckystiff.net/ruby/pickaxe/
http://intertwingly.net/blog/
http://lotusmedia.org/
http://rubyforge.org/frs/?group_id=167
http://www.oreillynet.com/ruby/

For those who think this is not robust (it isn't indeed, since if you change the search query, it breaks), scRUBYt! is able to export a production extractor:

require 'rubygems'
require 'scrubyt'

google_data = Scrubyt::Extractor.define do
  fetch("http://www.google.com/ncr")
  fill_textfield("q", "anything else")
  submit
  link "/html/body/div/div/div/a"
  next_page "Next", :limit => 2
end

puts google_data.to_xml
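The robustness point in this comment can be illustrated without scRUBYt! itself. The sketch below uses REXML, which ships with standard Ruby installations, on a hypothetical and heavily simplified results page (not real Google markup). The helper names `sample_page`, `links_by_text`, and `links_by_path` are made up for illustration, and the code does not reproduce scRUBYt!'s own export logic. It only contrasts selecting a link by its visible text with selecting links by their position in the tree.

```ruby
require 'rexml/document'

# Hypothetical, simplified stand-in for a results page.
def sample_page
  <<~HTML
    <html><body><div><div>
      <div><a href="http://www.ruby-lang.org/">Ruby Programming Language</a></div>
      <div><a href="http://www.rubyonrails.org/">Ruby on Rails</a></div>
    </div></div></body></html>
  HTML
end

# Example-based selection: find links whose visible text equals `text`.
# This depends on that exact text being on the page, so it stops
# matching as soon as the search query (and thus the results) changes.
def links_by_text(doc, text)
  REXML::XPath.match(doc, '//a')
              .select { |a| a.text == text }
              .map { |a| a.attributes['href'] }
end

# Structure-based selection: take every link found at a given XPath,
# regardless of what its text says.
def links_by_path(doc, path)
  REXML::XPath.match(doc, path).map { |a| a.attributes['href'] }
end

doc = REXML::Document.new(sample_page)
puts links_by_text(doc, 'Ruby Programming Language')
puts links_by_path(doc, '/html/body/div/div/div/a')
```

The structural path is still tied to the page layout, so it trades sensitivity to the query for sensitivity to markup changes.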