By mitchp
via architects.dzone.com
Published: Dec 26 2012 / 10:14

I've been experimenting with Pig on some Fannie Mae MBS data lately. While I don't mind writing MapReduce programs to process data (especially for the fairly simple tasks I'm doing now), I really do appreciate the "magic" Pig does under the covers, you might say. If you don't know it, Pig is a framework for analyzing large data sets; it is a member of the Hadoop ecosystem and now a top-level Apache project at pig.apache.org. In this mini-tutorial we'll see how Pig works with Hadoop and HDFS, and just how much you can accomplish with only a few lines of script. I am using Pig version 0.10.0 on Hadoop 1.1.0 (on Ubuntu 12.04, on VirtualBox 4.2.4, on Windows 7 SP1, on the third floor of a tri-level at 1728 m above sea level, but that could change; see this story about another "PIG").