Snippets has posted 5883 posts at DZone.

Perl Crawler

04.11.2006
The following code is designed to print all the links found on the Google home page. I found it lying around in my old source-code folder; it may not be fully working.

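#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket;

# Open a plain TCP connection to the web server on port 80.
my $socket = IO::Socket::INET->new(PeerAddr => 'google.com',
				   PeerPort => 80,
				   Proto    => 'tcp',
				   Type     => SOCK_STREAM)
	or die "Couldn't connect: $@";
# HTTP header lines end in CRLF; an empty line terminates the request.
print $socket "GET / HTTP/1.0\r\nHost: google.com\r\n\r\n";
while (defined(my $line = <$socket>)) {
	# In list context, /g returns every href on the line, not just the first.
	print "$_\n" for $line =~ m{href="(.*?)"}ig;
}
close($socket);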
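
The link-matching step can be tried without a network connection. The sketch below is only an illustration: the helper name extract_links and the sample HTML string are made up for this example and are not part of the original snippet. It uses the same href regex, called in list context so that every match in the input is returned.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original snippet): return every
# double-quoted href value found in a chunk of HTML.
sub extract_links {
    my ($html) = @_;
    # When called in list context, a /g match returns all captures of group 1.
    return $html =~ m{href="(.*?)"}ig;
}

# Made-up sample input, used only for illustration.
my $sample = '<a href="/a">A</a> <A HREF="http://example.com/b">B</A>';
print "$_\n" for extract_links($sample);
```

A regex like this only sees href attributes written with double quotes, and only when the whole attribute sits in the text being matched. For anything beyond a quick experiment, a real HTML parser is likely the safer choice.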