Link Details

By krisjava
Published: Nov 13 2012 / 09:17

I have created my first Pig script (the first one longer than three lines, that is) and started playing with it, using the Mathematica Stack Exchange data dump as an example. I chose it because it's actually quite small compared to some other sites, especially Stack Overflow (80% of all analyzed data). Because the SE data is available in the form of XML files (after unpacking), I had to create my own custom Pig data loader (the XML loader from piggybank didn’t work for me)...

