
Link Details


By mitchp
via drbunsen.org
Published: Dec 05 2012 / 08:06

Few tools are more indispensable to my work than Unix. Manipulating data into different formats, performing transformations, and conducting exploratory data analysis (EDA) are the lingua franca of data science. The coffers of Unix hold many simple tools that are powerful on their own and, when chained together, facilitate complex data manipulations. Unix's use of functional composition eliminates much of the tedious boilerplate of I/O and text parsing found in scripting languages. This design creates a simple and succinct interface for manipulating data and a foundation upon which custom tools can be built. Although languages like R and Python are invaluable for data analysis, I find Unix to be superior in many scenarios for quick and simple data cleaning, idea prototyping, and understanding data. This post is about how I use Unix for EDA.
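
To make the idea of chaining small tools concrete, here is a minimal sketch of a frequency count over one column of comma-separated data. The sample records and the variable names (`data`, `top_cities`) are made up for illustration and do not come from the original post; only standard utilities (`tail`, `cut`, `sort`, `uniq`, `head`) are assumed.

```shell
# Hypothetical sample data: a header line, then one "city,temp" record per line.
data='city,temp
Boston,41
Austin,73
Boston,39
Denver,52
Austin,75
Boston,44'

# Each stage does one small job and passes its output to the next:
#   tail -n +2   drop the header line
#   cut -d, -f1  keep only the first (city) column
#   sort         group identical values on adjacent lines
#   uniq -c      collapse each group into "count value"
#   sort -rn     rank by count, largest first
#   head -n 2    keep the two most frequent values
top_cities=$(printf '%s\n' "$data" | tail -n +2 | cut -d, -f1 | sort | uniq -c | sort -rn | head -n 2)

printf '%s\n' "$top_cities"
```

With this sample, Boston ranks first with three records and Austin second with two. The exact padding in front of each count depends on the `uniq` implementation. No stage knows anything about the others, so any one of them can be swapped or removed without touching the rest of the pipeline.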