
Link Details


By mitchp
via architects.dzone.com
Published: Dec 26 2012 / 10:14

I've been experimenting lately with running Pig on some Fannie Mae MBS (mortgage-backed securities) data. While I don't mind writing MapReduce programs to process data (especially for the fairly simple tasks I'm doing now), I really do appreciate the "magic" Pig does under the blanket, you might say. If you don't know it, Pig is a member of the Hadoop ecosystem (and now a top-level Apache project at pig.apache.org) and a framework for analyzing large data sets. In this mini-tutorial we'll see how Pig works with Hadoop and HDFS, and just how much you can accomplish with only a few lines of script. I am using Pig version 0.10.0 on Hadoop 1.1.0 (on Ubuntu 12.04, on VirtualBox 4.2.4, on Windows 7 SP1, on the third floor of a tri-level at 1728 m above sea level, but that could change; see this story about another "PIG").
