Extract The Body Of An HTML Document

01.04.2007
For example, to print just the contents of the body element of Google's home page:

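use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

my $ua  = LWP::UserAgent->new;
my $res = $ua->get('http://www.google.com/');
die 'Request failed: ' . $res->status_line . "\n" unless $res->is_success;

# decoded_content applies the charset declared in the response headers
my $tree = HTML::TreeBuilder->new_from_content($res->decoded_content);
$tree->elementify();
my $body = $tree->find('body');

# content_list returns child elements and plain text strings,
# so call as_HTML only on the element objects
foreach my $e ($body->content_list()) {
  print ref $e ? $e->as_HTML() : encode_entities($e);
}

# Free the tree explicitly; older HTML::Element versions need this
$tree->delete;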