Eric Gregory 04/26/13
1686 views
0 replies

Understanding Bayes Theorem with Mario Kart

Trying to understand Bayes' theorem? Here are some resources, including a quick and concise video that uses it to analyze banana-related kart accidents.

Christopher Taylor 04/26/13
270 views
0 replies

The Widening Gap of Loyalty Programs

As loyalty programs engage (or fail to engage) with massive quantities of customer data, some prosper while others falter.

Eric Gregory 04/25/13
2953 views
0 replies

Make Yourself a Data Scientist

Troy Sadkowsky runs through some common challenges in becoming a data scientist, how to overcome them, and his own professional story.

Arthur Charpentier 04/25/13
787 views
0 replies

Data News: Dangerous Predictions, Killing Jargon, and More

In this installment of Arthur Charpentier's data news roundup, we look at when predictions become dangerous, hunt down jargon and kill it, and dive even deeper into Reinhart-Rogoff-gate.

Eric Gregory 04/25/13
1325 views
0 replies

Being a Data Scientist at Tumblr and Kickstarter

Data scientists from Tumblr, Kickstarter, and other sites discuss leveraging big data in a startup situation, in this panel from DataGotham 2012.

John Cook 04/25/13
668 views
0 replies

Playful and Purposeful, Pure and Applied

Insight on applied and pure science from Edwin Land, inventor of the Polaroid camera.

Eric Gregory 04/24/13
2763 views
0 replies

Realtime Analytics for Big Data: A Facebook Case Study

This deep dive into analytics at Facebook explores their choice of HBase over Cassandra, and how to learn from Facebook's choices.

John Sonmez 04/24/13
2975 views
1 reply

Privacy is Dead. Time to Prepare.

Before you worry that I am going all political on you: I’m not condoning the invasion of personal privacy or its eradication, but I’m not supporting it either. I’m simply looking at the patterns that emerge as our society’s technology advances, which make this transformation inevitable.

John Cook 04/24/13
1105 views
0 replies

Why j for Imaginary Unit?

Electrical engineers use j for the square root of -1 while nearly everyone else uses i. The usual explanation is that EEs do this because they use i for current. But here’s one advantage to using j that has nothing to do with electrical engineering.

Prathap Givanth... 04/24/13
310 views
0 replies

MySQL Stored Procedure

These days I'm involved in a new project at my office, and in this project we are using MySQL stored procedures heavily. So I decided to write up a post on MySQL stored procedures.

Rob J Hyndman 04/23/13
1756 views
0 replies

My New Forecasting Book is Finally Finished

My new online forecasting book (written with George Athanasopoulos) is now completed. I previously described it on this blog nearly a year ago.

Justin Bozonier 04/23/13
11000 views
1 reply

Algorithm of the Week: Generate Music Algorithmically

This week's algorithm is the Markov chain, a technique that uses a little probability to do some light machine learning. Here, you input a song and the computer creates an original work based on the patterns you’ve taught it.
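As a rough illustration of the idea (not the article's own code), here is a minimal first-order Markov chain over note names. The function names `build_transitions` and `generate`, the toy input `song`, and the choice to model only note-to-note transitions are all assumptions made for this sketch.

```python
import random
from collections import defaultdict


def build_transitions(notes):
    """Map each note to the list of notes that followed it in the input."""
    transitions = defaultdict(list)
    for current, following in zip(notes, notes[1:]):
        transitions[current].append(following)
    return transitions


def generate(transitions, start, length, rng=None):
    """Walk the chain from `start`, picking each next note at random
    from the notes that followed the current one in the training song."""
    rng = rng or random.Random()
    melody = [start]
    while len(melody) < length:
        choices = transitions.get(melody[-1])
        if not choices:  # note never had a successor in the input
            break
        melody.append(rng.choice(choices))
    return melody


# Hypothetical training "song" (note names only, no rhythm).
song = ["C", "D", "E", "C", "E", "G", "E", "D", "C"]
table = build_transitions(song)
melody = generate(table, "C", 8, random.Random(0))
print(melody)
```

Because successors are stored with repetition, notes that followed more often in the input are proportionally more likely to be chosen, which is what lets the output echo the patterns of the source song.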

Arthur Charpentier 04/23/13
1248 views
0 replies

I'll Be in Vegas, Trying to Win Against the House

Suppose I go to Las Vegas with an initial wealth s (say $100). The goal is to find the strategy that maximizes the probability of leaving Las Vegas with 2s (here $200). Should I play big or small? A probabilist goes to Vegas.
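One way to get a feel for the question is the classic gambler's-ruin formula. The sketch below assumes repeated even-money bets of a fixed size with win probability p per round (for example p = 18/38, a bet on red at an American roulette table); the game, the bet sizes, and the function name `prob_double` are assumptions for illustration, not taken from the article.

```python
def prob_double(units, p):
    """Probability of reaching 2*units before hitting 0, starting with
    `units` betting units and staking one unit per even-money round
    that is won with probability p (gambler's ruin)."""
    if p == 0.5:
        return 0.5
    r = (1 - p) / p  # odds against winning a single round
    return 1 / (1 + r ** units)


p_red = 18 / 38  # assumed game: red at American roulette

# $100 split into 1, 10, or 100 betting units ($100, $10, or $1 bets).
bold = prob_double(1, p_red)
medium = prob_double(10, p_red)
timid = prob_double(100, p_red)
print(bold, medium, timid)
```

Under these assumptions, with an unfavorable game (p below 1/2) the chance of doubling shrinks as the stake is split into more, smaller bets, so a single large bet gives the best odds.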

Kay Cichini 04/23/13
289 views
0 replies

Getting CORINE Land Cover Seamless Vector Data with R

A script to programmatically grab seamless vector land data with R.

Steven Lott 04/22/13
2644 views
1 reply

Legacy Code Preservation

When software is revised for a new framework, operating system, or database, or when an algorithm is converted to a new language, we're "converting" (or "migrating") software. We're preserving the code and the knowledge it encodes.

Eric Gregory 04/22/13
844 views
0 replies

Cloud Deployments: Using Hadoop on Clouds

Packt Publishing has provided Chapter 10 of their forthcoming Hadoop MapReduce Cookbook for DZone readers, covering Hadoop and Amazon Elastic MapReduce.

Erich Styger 04/22/13
514 views
0 replies

Why I don’t like printf()

I have a strong opinion, and a rule for using printf(): don't use it.

Eric Gregory 04/22/13
606 views
0 replies

Actuarial Analytics with R

Jim Guszcza from the Wisconsin School of Business leads this tutorial on actuarial analytics with R.

Eric Gregory 04/21/13
1917 views
0 replies

Statistical Aspects of Data Mining with R

David Mease's "Statistical Aspects of Data Mining" course, taught a few years back at both Stanford and Google, is a great introduction to data mining and R.

Arthur Charpentier 04/21/13
1165 views
0 replies

Data News: "Reverse Causality," the Online Population, and More

In this data link roundup from Arthur Charpentier, there's more on Reinhart-Rogoff, a look at "reverse-causality," plus: what percentage of the world population is actually online?

Christopher Taylor 04/20/13
2363 views
0 replies

Weighing Privacy in the Age of Ubiquitous Data

The speed with which the Boston Marathon bombing suspects were identified was a remarkable sign that we’re in the age of ubiquitous photos and video of the public square, albeit at a major international event.

Arthur Charpentier 04/20/13
1281 views
0 replies

Data News: Reinhart-Rogoff, Rule by Algorithm, and More

Lots of data news lately: Arthur Charpentier's roundup covers Reinhart-Rogoff, Kaggle, what algorithms tell us about the language of news, and much more.

Eric Gregory 04/19/13
4339 views
0 replies

Links You Don't Want To Miss (Apr. 19)

Today: Mozilla's pluggable collaboration tool, CISPA, homemade drones, a radical new CSS best practice, and Code Monkey Saves World.

John Cook 04/19/13
2187 views
0 replies

Moments of Mixtures in Python

I needed to compute the higher moments of a mixture distribution for a project I’m working on. I’m writing up the code here in case anyone else finds this useful. (And in case I’ll find it useful in the future.)
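For readers who want the gist before clicking through, the raw moments of a mixture are the weighted sum of the components' raw moments. The sketch below is not the author's code; it assumes normal components and uses the standard recursion for normal raw moments, and the function names are made up for this example.

```python
def normal_raw_moment(k, mu, sigma):
    """k-th raw moment E[X^k] of a Normal(mu, sigma^2) variable, via the
    recursion m_k = mu * m_{k-1} + (k - 1) * sigma^2 * m_{k-2}."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, mu  # m_0 and m_1
    for j in range(2, k + 1):
        prev, cur = cur, mu * cur + (j - 1) * sigma ** 2 * prev
    return cur


def mixture_raw_moment(k, weights, mus, sigmas):
    """k-th raw moment of a normal mixture: the weighted sum of the
    component raw moments (weights are assumed to sum to 1)."""
    return sum(
        w * normal_raw_moment(k, m, s)
        for w, m, s in zip(weights, mus, sigmas)
    )


# Hypothetical example: equal mix of Normal(-1, 1) and Normal(1, 1).
weights, mus, sigmas = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
moments = [mixture_raw_moment(k, weights, mus, sigmas) for k in range(1, 5)]
print(moments)
```

Central moments can then be obtained from the raw moments by the usual binomial expansion around the mixture mean.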

Eric Gregory 04/19/13
1269 views
0 replies

8 Predictive Analytics Questions Answered by the Guy Who Wrote the Book

Eric Siegel, author of the recent Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, takes on eight Big Questions.