Trying to understand Bayes' theorem? Here are some resources, including a quick and concise video that uses it to analyze banana-related kart accidents.
As loyalty programs engage (or fail to engage) with massive quantities of customer data, some prosper while others falter.
Troy Sadkowsky runs through some common challenges in becoming a data scientist, how to overcome them, and his own professional story.
In this installment of Arthur Charpentier's data news roundup, we look at when predictions become dangerous, hunt down jargon and kill it, and dive even deeper into Reinhart-Rogoff-gate.
Data scientists from Tumblr, Kickstarter, and other sites discuss leveraging big data in a startup situation, in this panel from DataGotham 2012.
Insight on applied and pure science from Edwin Land, inventor of the Polaroid camera.
This deep dive into analytics at Facebook explores their choice of HBase over Cassandra, and how to learn from Facebook's choices.
Before you get worried that I am going all political on you, don’t be: I’m not condoning the invasion of personal privacy or the eradication of it, but at the same time I’m not supporting it either. I’m simply looking at the patterns that are emerging as the technology of our society advances, making this transformation inevitable.
Electrical engineers use j for the square root of -1 while nearly everyone else uses i. The usual explanation is that EEs do this because they use i for current. But here’s one advantage to using j that has nothing to do with electrical engineering.
These days I'm involved with a new project at my office, and in this project we are using MySQL stored procedures heavily. So I decided to write up a post on MySQL stored procedures.
My new online forecasting book (written with George Athanasopoulos) is now completed. I previously described it on this blog nearly a year ago.
The algorithm of the week is the Markov chain. Using this technique, you leverage a little bit of probability to do some light machine learning. In this case, input a song and have the computer create an original work based on the patterns you’ve taught it.
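As a rough illustration of the Markov chain idea in the teaser above, here is a minimal sketch that learns which note tends to follow which and then generates a new sequence. The note names, the training melody, and the function names `build_chain` and `generate` are all hypothetical, not taken from the linked post.

```python
import random
from collections import defaultdict


def build_chain(notes):
    """Map each note to the list of notes that followed it in the training melody."""
    chain = defaultdict(list)
    for current, following in zip(notes, notes[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng=None):
    """Walk the chain from `start`, picking each next note at random among observed followers."""
    rng = rng or random.Random()
    melody = [start]
    while len(melody) < length:
        options = chain.get(melody[-1])
        if not options:  # dead end: this note was never followed by anything
            break
        melody.append(rng.choice(options))
    return melody


# Hypothetical training melody (assumption, for illustration only)
training = ["C", "D", "E", "C", "E", "G", "E", "D", "C"]
chain = build_chain(training)
print(generate(chain, "C", 8, random.Random(0)))
```

Because repeated followers appear multiple times in each list, `rng.choice` samples them in proportion to how often they occurred, which is what makes this a first-order Markov chain rather than a uniform random walk.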
Suppose I go to Las Vegas with an initial wealth s (say $100). The goal is to find the strategy that maximizes the probability of leaving Las Vegas with 2s (here $200). Should I play big, or small? A probabilist goes to Vegas.
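To make the "big or small" question concrete, here is a minimal sketch comparing two extreme strategies under stated assumptions: every bet pays even money and wins with probability p, and p = 18/38 (a red/black bet in American roulette) is chosen purely for illustration. The linked post may use a different game or a different analysis. "Small" play stakes $1 per round and uses the classical gambler's ruin formula; "big" play stakes the whole $100 once.

```python
def timid_play(s, target, p):
    """Probability of reaching `target` before 0, betting 1 unit per round (gambler's ruin)."""
    if p == 0.5:
        return s / target
    r = (1 - p) / p  # ratio of losing to winning probability
    return (1 - r ** s) / (1 - r ** target)


def bold_play_double(p):
    """Probability of doubling the stake by betting everything on one even-money bet."""
    return p


p = 18 / 38  # assumption: even-money roulette bet, for illustration
print("timid:", timid_play(100, 200, p))
print("bold: ", bold_play_double(p))
```

Under these assumptions the single big bet is far more likely to succeed, because every extra round played in an unfavorable game gives the house edge another chance to act.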
A script to programmatically grab seamless vector land data with R.
When software is revised for a new framework or operating system or database or when an algorithm is converted to a new language, then we're "converting" (or "migrating") software. We're preserving code, and preserving the knowledge encoded.
Packt Publishing has provided Chapter 10 of their forthcoming Hadoop MapReduce Cookbook for DZone Readers, covering Hadoop and Amazon ElasticMapReduce.
I have a strong opinion, and a rule for using printf(): don't use it.
Jim Guszcza from the Wisconsin School of Business leads this tutorial on actuarial analytics with R.
David Mease's "Statistical Aspects of Data Mining" course, taught a few years back at both Stanford and Google, is a great introduction to data mining and R.
In this data link roundup from Arthur Charpentier, there's more on Reinhart-Rogoff, a look at "reverse-causality," plus: what percentage of the world population is actually online?
The speed with which the Boston Marathon bombing suspects were identified was a remarkable sign that we’re in the age of ubiquitous photos and video of the public square, albeit at a major international event.
Lots of data news lately: Arthur Charpentier's roundup covers Reinhart-Rogoff, Kaggle, what algorithms tell us about the language of news, and much more.
Today: Mozilla's pluggable collaboration tool, CISPA, homemade drones, a radical new CSS best practice, and Code Monkey Saves World.
I needed to compute the higher moments of a mixture distribution for a project I’m working on. I’m writing up the code here in case anyone else finds this useful. (And in case I’ll find it useful in the future.)
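The linked post's own code is not reproduced here. As a hedged sketch of the general idea: the k-th raw moment of a mixture is the weighted sum of the components' k-th raw moments. The example below assumes normal components (an assumption, since the teaser does not say which distributions are mixed), and the function names are hypothetical.

```python
def normal_raw_moment(mu, sigma, k):
    """k-th raw moment E[X^k] of N(mu, sigma^2), via the recursion
    E[X^j] = mu * E[X^(j-1)] + (j - 1) * sigma^2 * E[X^(j-2)]."""
    if k == 0:
        return 1.0
    m_prev, m = 1.0, float(mu)  # E[X^0], E[X^1]
    for j in range(2, k + 1):
        m_prev, m = m, mu * m + (j - 1) * sigma ** 2 * m_prev
    return m


def mixture_raw_moment(weights, params, k):
    """k-th raw moment of a mixture: weighted sum of the component raw moments."""
    return sum(w * normal_raw_moment(mu, sigma, k)
               for w, (mu, sigma) in zip(weights, params))


# Hypothetical two-component mixture: 0.5 * N(0, 1) + 0.5 * N(2, 1)
weights = [0.5, 0.5]
params = [(0.0, 1.0), (2.0, 1.0)]
for k in range(1, 5):
    print(k, mixture_raw_moment(weights, params, k))
```

Central moments, skewness, and kurtosis of the mixture can then be derived from these raw moments with the usual binomial expansion around the mixture mean.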
Eric Siegel, author of the recent Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, takes on eight Big Questions.