Link Details

By krisjava
Published: Nov 13 2012 / 09:17

I have written my first Pig script longer than three lines and started experimenting with it, using the Mathematica Stack Exchange data dump as an example. I chose that dump because it is quite small compared to some other sites, especially Stack Overflow (80% of all analyzed data). Because the SE data comes as XML files (after unpacking), I had to write my own custom Pig data loader (the XML loader from piggybank didn't work for me)...

