By mitchp
via architects.dzone.com
Published: Dec 26 2012 / 10:14

I've been experimenting with Pig on some Fannie Mae MBS (mortgage-backed securities) data lately. While I don't mind writing MapReduce programs to process data (especially for the fairly simple tasks I'm doing now), I really do appreciate the "magic" Pig does under the blanket, you might say. If you don't know it, Pig is a framework for analyzing large data sets, a member of the Hadoop ecosystem, and now a first-class Apache project at pig.apache.org. In this mini-tutorial we'll see how Pig works with Hadoop and HDFS, and just how much you can accomplish with only a few lines of script. I am using Pig version 0.10.0 on Hadoop 1.1.0 (on Ubuntu 12.04, on VirtualBox 4.2.4, on Windows 7 SP1, on the third floor of a tri-level at 1728 m above sea level, but that could change; see this story about another "PIG").
