Link Details

Link 896473 thumbnail
User 478055 avatar

By mitchp
Published: Dec 26 2012 / 10:14

I've been experimenting with using Pig on some Fannie-Mae MBS data lately. While I don't mind writing MapReduce programs to process data (especially the fairly simple tasks I'm doing now), I really do appreciate the "magic" Pig does under the blanket, you might say. If you don't know, Pig, a member of the Hadoop ecosystem (and now a first-class Apache project at, is a framework for analyzing large data sets. In this mini-tutorial we'll see how Pig works with Hadoop and HDFS, and just how much you can accomplish with only a few lines of script. I am using Pig version 0.10.0 on Hadoop 1.1.0 (on Ubuntu 12.04, on VirtualBox 4.2.4, on Windows 7SP1, on the third floor of a tri-level at 1728 m above sea level, but that could change -- see this story about another "PIG").
  • 3
  • 0
  • 891
  • 909

Add your comment

Html tags not supported. Reply is editable for 5 minutes. Use [code lang="java|ruby|sql|css|xml"][/code] to post code snippets.

Voters For This Link (3)

Voters Against This Link (0)

    Apache Hadoop
    Written by: Piotr Krewski
    Featured Refcardz: Top Refcardz:
    1. Play
    2. Akka
    3. Design Patterns
    4. OO JS
    5. Cont. Delivery
    1. Play
    2. Java Performance
    3. Akka
    4. REST
    5. Java