By rob_dimarco
via innovationontherun.com
Submitted: Sep 05 2007 / 18:01
Scraping static web sites to verify functionality or to access data has been around as long as there has been a web (for example, scraping a static web page with Ruby). But with the advent of AJAX and other techniques that use JavaScript to dynamically insert HTML into a web page, scraping has become more challenging. With its 1.12 release, the headless web browser HtmlUnit can now parse and execute JavaScript, and combined with JRuby it is a great technology for easily constructing a script that parses a dynamic site.
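To see why dynamically inserted HTML is a problem for a static scraper, here is a minimal sketch in plain Ruby. It is not taken from the linked article: the `PAGE` markup and the `static_links` helper are made up for illustration, and the regular expression is a crude stand-in for a real HTML parser. The page holds one link in its markup and one that a script would add at load time.

```ruby
# Hypothetical page: one link is in the markup, the other is added by JavaScript.
PAGE = <<~HTML
  <html>
    <body>
      <a href="/static.html">Static link</a>
      <div id="target"></div>
      <script>
        var a = document.createElement("a");
        a.setAttribute("href", "/dynamic.html");
        document.getElementById("target").appendChild(a);
      </script>
    </body>
  </html>
HTML

# Crude static extraction (illustrative helper): collect the href values of
# anchor tags found in the raw text. Nothing here executes the script.
def static_links(html)
  html.scan(/<a\s[^>]*href="([^"]*)"/i).flatten
end

puts static_links(PAGE).inspect  # prints ["/static.html"]
```

Only the link present in the raw markup is found. The second link exists only after a browser runs the script, which is the gap a JavaScript-capable headless browser such as HtmlUnit is meant to close.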