Arthur Charpentier 05/18/13
932 views
0 replies
Arthur Charpentier's regular data link roundup takes a look at the mass media era, how often the oldest living person dies, Pi as random number generator, and much more.
Mike Driscoll 05/17/13
4324 views
1 reply
In this tutorial, we’ll take a look at some of the barcodes that Reportlab can generate. If you don’t already have Reportlab, go to their website and get it before jumping into the article.
Chris Spagnuolo 05/17/13
2956 views
0 replies
If you haven't heard of him before, Geoff Moore writes and speaks about the technology adoption lifecycle and the marketing and business strategies for successfully navigating this lifecycle.
Christopher Taylor 05/17/13
2645 views
0 replies
What’s truly changed is the amount of information that we now have on every imaginable demographic. Today’s systems track and store every purchase by every consumer, which moves us from extrapolation to intimate knowledge.
Rob J Hyndman 05/17/13
1653 views
0 replies
I’ve come across this problem before in my consulting work, although I don’t think I’ve ever published my solution. So here it is. If x is your monthly time series, then you can construct annual totals as follows...
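As a rough illustration of the idea (not the author's own solution, which the teaser cuts off before showing), annual totals can be formed by summing consecutive blocks of 12 monthly values. The function name `annual_totals` and the sample data are hypothetical, and the sketch assumes the series starts in the first month of a year and that an incomplete final year should be dropped.

```python
def annual_totals(x, period=12):
    """Sum consecutive blocks of `period` values, dropping an incomplete final block."""
    n_full = len(x) // period  # number of complete years available
    return [sum(x[i * period:(i + 1) * period]) for i in range(n_full)]


# Made-up data: 30 monthly observations (two full years plus six extra months).
monthly = list(range(1, 31))
totals = annual_totals(monthly)
print(totals)  # [78, 222]
```

If the series starts mid-year, the leading partial year would need to be trimmed first.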
Arthur Charpentier 05/17/13
1691 views
0 replies
Is it possible to reproduce the random generator? Yes, we can. And it is quite simple, if you use the appropriate library and the appropriate function.
John Cook 05/17/13
1340 views
0 replies
A week ago I wrote about Perrin numbers, numbers Pn defined by a recurrence relation similar to Fibonacci numbers. If n is prime, Pn mod n = 0, and the converse is nearly always true. That is, if Pn mod n = 0, n is usually prime. The exceptions are called Perrin pseudoprimes.
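The test described above can be sketched in a few lines. This is a minimal illustration, not the author's code: it assumes the usual Perrin definition (P0 = 3, P1 = 0, P2 = 2, Pk = Pk-2 + Pk-3), and the helper names `perrin_mod` and `is_prime` are hypothetical. The smallest Perrin pseudoprime is 271441 = 521², so for small n the test agrees exactly with primality.

```python
def perrin_mod(n):
    """Return P_n mod n for the Perrin sequence P_0=3, P_1=0, P_2=2."""
    a, b, c = 3 % n, 0, 2 % n  # (P_0, P_1, P_2) reduced mod n
    for _ in range(n):
        # Slide the window forward: P_{k+3} = P_{k+1} + P_k
        a, b, c = b, c, (a + b) % n
    return a


def is_prime(n):
    """Trial division, adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# Every n in this range that passes the Perrin test is prime, and vice versa.
passes = [n for n in range(2, 50) if perrin_mod(n) == 0]
print(passes == [n for n in range(2, 50) if is_prime(n)])  # True

# The first exception: a composite number that still passes.
print(perrin_mod(271441), is_prime(271441))  # 0 False
```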
Arthur Charpentier 05/16/13
869 views
0 replies
Arthur Charpentier's regular data link roundup explores the paradox of the proof, the sex in economics, the long wait for natural language processing, and much more.
John Cook 05/16/13
124 views
0 replies
Perspective and success for the applied mathematician.
Eric Gregory 05/15/13
3056 views
0 replies
Today: The Universal Bytecode, an old math problem solved, the sound of sorting (algorithms), and a solution for automating development environments. Plus: the long history of selfish generations.
Mark Needham 05/15/13
1766 views
0 replies
Nate Silver is famous for having correctly predicted the winner of all 50 states in the 2012 United States elections, and Sid recommended his book so I could learn more about statistics for the A/B tests that we were running.
Marko Rodriguez 05/15/13
1097 views
0 replies
New data processing technologies and theories in education are moving much of the learning experience into the digital space — into massive open online courses (MOOCs). Two years ago Pearson contacted Aurelius about applying graph theory and network science to this burgeoning space.
John Cook 05/15/13
1667 views
0 replies
Suppose you want to know when your great-grandmother was born. You can’t find the year recorded anywhere. But you did discover an undated letter from her father that mentions her birth and one curious detail: the 13-year and 17-year cicadas were swarming.
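The arithmetic behind this puzzle is a least common multiple: because 13 and 17 are coprime, the two broods coincide only once every 13 × 17 = 221 years. The sketch below is an illustration rather than the author's solution; the function `coincidence_years` and the anchor year are hypothetical placeholders, not real brood data.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def coincidence_years(anchor, start, end, cycle):
    """List years in [start, end] that differ from `anchor` by a multiple of `cycle`."""
    first = anchor - ((anchor - start) // cycle) * cycle  # earliest match >= start
    return list(range(first, end + 1, cycle))


cycle = lcm(13, 17)
print(cycle)  # 221

# Hypothetical anchor: pretend both broods were known to emerge together in 2000.
print(coincidence_years(2000, 1500, 2100, cycle))  # [1558, 1779, 2000]
```

With a real joint-emergence year as the anchor, at most one candidate would fall within a plausible lifetime, which is what pins down the birth year.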
Nick Johnson 05/14/13
4623 views
0 replies
A secure permutation is one in which an attacker, given any subset of the permutation, cannot determine the order of any other elements. A simple example of this would be to take a cryptographically secure pseudo-random number generator, seed it with a secret key, and use it to shuffle your sequence.
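The simple example mentioned above, a keyed shuffle, might look like the sketch below. It is an illustration under stated assumptions, not the author's implementation and not vetted cryptography: HMAC-SHA256 in counter mode stands in for the "cryptographically secure pseudo-random number generator", and the names `keyed_shuffle`, `_keyed_stream`, and `_uniform_below` are hypothetical.

```python
import hashlib
import hmac


def _keyed_stream(key):
    """Yield 64-bit integers from HMAC-SHA256 run in counter mode under `key`."""
    counter = 0
    while True:
        block = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for i in range(0, 32, 8):  # four 64-bit words per 32-byte digest
            yield int.from_bytes(block[i:i + 8], "big")
        counter += 1


def _uniform_below(stream, n):
    """Draw an integer in [0, n) without modulo bias, using rejection sampling."""
    limit = (1 << 64) - ((1 << 64) % n)  # largest multiple of n that fits in 64 bits
    while True:
        r = next(stream)
        if r < limit:
            return r % n


def keyed_shuffle(items, key):
    """Fisher-Yates shuffle driven by the keyed stream; same key gives same order."""
    out = list(items)
    stream = _keyed_stream(key)
    for i in range(len(out) - 1, 0, -1):
        j = _uniform_below(stream, i + 1)
        out[i], out[j] = out[j], out[i]
    return out


secret = b"example-secret-key"  # placeholder key for illustration only
shuffled = keyed_shuffle(range(10), secret)
print(sorted(shuffled) == list(range(10)))  # True: still a permutation
```

Note that this materializes the whole sequence in memory, so it only suits sequences small enough to hold at once.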
Chase Seibert 05/14/13
2439 views
0 replies
Though there is some decent documentation, I found setting up Hive with an HBase back-end to be somewhat fiddly. Hopefully this guide will help you get started more quickly.
Michael Mccandless 05/14/13
1890 views
1 reply
For the past few weeks I've been building a simple Lucene search application, searching all Lucene and Solr Jira issues, and using it instead of Jira's search whenever I need to go find an issue.
Arthur Charpentier 05/14/13
174 views
0 replies
The good thing is that even complex functions (logistic regression, regression trees, etc) produce the same kind of outputs. But we found a problem that we could not fix: generating identical training subsets of observations…
Paul Miller 05/13/13
1796 views
0 replies
"Everything should be made as simple as possible, but not simpler." These words have resonated with me recently, as I’ve heard pitches from one company after another, all of which are trying to cut through the complexity of data to make it accessible.
John Cook 05/13/13
1534 views
0 replies
The floor of a real number x is the largest integer n ≤ x, written ⌊x⌋. The ceiling of a real number x is the smallest integer n ≥ x, written ⌈x⌉. The floor and ceiling have the following symmetric relationship
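The teaser breaks off before stating the relationship. A standard symmetry between the two functions is ⌊−x⌋ = −⌈x⌉ and ⌈−x⌉ = −⌊x⌋; assuming that is the relationship meant, it can be checked numerically as follows. The function name `floor_ceiling_symmetric` and the sample values are illustrative only.

```python
import math


def floor_ceiling_symmetric(x):
    """Check floor(-x) == -ceil(x) and ceil(-x) == -floor(x) for one value."""
    return math.floor(-x) == -math.ceil(x) and math.ceil(-x) == -math.floor(x)


samples = [-2.5, -1.0, 0.0, 0.3, 1.0, 2.75]
print(all(floor_ceiling_symmetric(x) for x in samples))  # True
```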
Arthur Charpentier 05/13/13
631 views
0 replies
In my courses on R, I usually show how to insert a picture as a background for a graph. But it is also possible to treat the picture as an object, and to insert it in a graph anywhere we like.
Arthur Charpentier 05/13/13
428 views
0 replies
In my course on claims reserving techniques, I mentioned the use of Poisson regression, even if incremental payments were not integers. For instance, we did consider incremental triangles...
Christopher Taylor 05/12/13
3028 views
0 replies
Matt Schumpert took the stage at the InterOp Big Data Workshop in Las Vegas yesterday to talk about the myths and realities of Big Data. Matt is Director of Solutions Engineering at Datameer and has customers that include Visa, Sears, and three of the world's five largest banks. He’s an expert on Big Data and brought the following insights...
Arthur Charpentier 05/12/13
917 views
0 replies
Arthur Charpentier's data link roundup takes a look at the mathematics of life in the city, the Batman equation, an accurate geek CT scan, and much more.
Eli Bendersky 05/11/13
2365 views
0 replies
After months of intensive discussion (more than 1,000 emails in dozens of threads spread over two mailing lists, and a couple of hundred additional private emails), PEP 435 has been accepted and Python will finally have an enumeration type in 3.4!
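For readers who have not seen the resulting type, here is a minimal sketch of the standard-library `enum` module that came out of PEP 435. The `Color` class and its members are illustrative names chosen for this example.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3


# Members are singletons that can be looked up by value or by name.
print(Color(2) is Color.GREEN)      # True
print(Color["BLUE"] is Color.BLUE)  # True

# Each member carries a name and a value, and the class iterates in definition order.
print([c.name for c in Color])      # ['RED', 'GREEN', 'BLUE']

# Plain Enum members do not compare equal to their underlying integers.
print(Color.RED == 1)               # False
```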
Christopher Taylor 05/11/13
1602 views
1 reply
The Big Data Workshop at InterOp Las Vegas wrapped up the morning with a presentation on Big Data requirements by John West, CTO and Founder of Fabless Labs. John kicked off with the challenge of having your enormous data set all ready to work with when you discover any one of the following problems...