Big Data/Analytics Zone
Linda Gimmeson 10/17/14
2541 views
0 replies

FAQ of Executives Regarding Apache Hadoop

Apache Hadoop has slowly been infiltrating the mainstream business world, but many executives still doubt whether adopting Hadoop is a sound strategy for their organization. Is Hadoop enterprise-friendly? Is it economical for an organization to use?

Tomasz Sobczak 10/16/14
702 views
1 reply

Review of "Scaling Apache Solr" Book

Review of "Scaling Apache Solr" book.

Alec Noller 10/15/14
6147 views
1 reply

Dev of the Week: Ashwini Kuntamukkala

Every week here and in our newsletter, we feature a new developer/blogger from the DZone community to catch up and find out what he or she is working on now and what's coming next. This week we're talking to Ashwini Kuntamukkala, Software Architect at SciSpike, Inc.

Adam Diaz 10/15/14
5281 views
0 replies

Hadoop and the mystery of the version number

When I’m working with people on Hadoop, I ask what you would think is a simple question: what version of Hadoop are you using? In reality, though, it’s not as straightforward as you might think.

Mikio Braun 10/14/14
2704 views
0 replies

Parts But No Car

One question which pops up again and again when I talk about streamdrill is whether that cannot be done by X, where X is one of Hadoop, Spark, Go, or some other piece of Big Data infrastructure. The truth is that there’s a huge gap between “in principle” and “in reality”, and I’d like to spell this difference out in this post.

David Mai 10/11/14
1561 views
0 replies

22 Big Data & BI Events (U.S.) that You Must Attend Before the End of 2014

With so many events taking place it can be a very daunting task finding the one that perfectly fits your interests and needs. That being said, I’ve done some research and compiled a comprehensive list of 22 Big Data and Business Intelligence events that you must attend during Q4 of 2014.

Kevin Daly 10/11/14
5472 views
0 replies

Hadoop 2.0 as Part of a Data Platform: It’s Not Just About MapReduce!

What exactly is a data platform? Get a better understanding of big data and its application. In this article I’ll be talking about the Hortonworks Data Platform as a reference platform.

Borislav Iordanov 10/10/14
5357 views
0 replies

Jayson Skima - Validating JavaScript Object Notation Data

A crash course on JSON Schema. A nearly complete coverage of the Draft 4 specification, in brief.

Mark Needham 10/10/14
6484 views
1 reply

R: A first attempt at linear regression

I’ve been working through the videos that accompany the Introduction to Statistical Learning with Applications in R book and thought it’d be interesting to try out the linear regression algorithm against my meetup data set.

David Mai 10/10/14
2191 views
0 replies

9 Influential Women Writers in Big Data and Business Intelligence

In my own experience as an editor who covers BI, I read numerous BI articles, and I have found that despite the disproportionately low number of women in technology, many of the articles that I’ve read were authored by women. In BI, the works of women have provided great insight and thought leadership to the BI community, and I personally want to list nine of the top women writers who have helped shape my view on BI.

Mark Needham 10/09/14
5345 views
0 replies

R: Deriving a new data frame column based on containing string

I’ve been playing around with R data frames a bit more and one thing I wanted to do was derive a new column based on the text contained in the existing column.

Arthur Charpentier 10/09/14
826 views
0 replies

How to Import Some Parts of a Large Database

In the introduction of Computational Actuarial Science with R, there was a short paragraph on how we could import only some parts of a large database by selecting specific variables.

Mark Needham 10/08/14
5600 views
0 replies

R: Filtering data frames by column type ('x' must be numeric)

I’ve been working through the exercises from An Introduction to Statistical Learning and one of them required you to create a pairwise correlation matrix of variables in a data frame.

John Cook 10/08/14
1616 views
0 replies

The great reformulation of algebraic geometry

At the Heidelberg Laureate Forum I had a chance to interview John Tate. In his remarks below, Tate briefly comments on his early work on number theory and cohomology. Most of the post consists of his comments on the work of Alexander Grothendieck.

Veeresham Kardas 10/06/14
1103 views
0 replies

CSV Operations using OpenCSV

OpenCSV is one of the best tools for CSV operations. We will see how to use OpenCSV for basic reading and writing operations.

Adam Diaz 10/02/14
3680 views
0 replies

The Evolution of MapReduce and Hadoop

Recently I authored a section of the DZone Guide for Big Data 2014. I wrote about MapReduce and the evolution of Hadoop.

Sander Mak 10/01/14
5632 views
0 replies

The Developer’s Guide to Data Science

When developers talk about using data, they are usually concerned with ACID, scalability, and other operational aspects of managing data. But data science is not just about making fancy business intelligence reports for management. Data drives the user experience directly, not after the fact.

Isaac Sacolick 10/01/14
4149 views
0 replies

Solving the Data Scientist Shortfall by Deploying a Self Service BI Program

Want to learn more about "self-service" BI programs? Find out why many organizations are looking to leverage these technologies and programs on their quest to become more data-driven.

Mark Needham 09/29/14
3213 views
0 replies

R: ggplot - Plotting multiple variables on a line chart

In my continued playing around with meetup data I wanted to plot the number of members who join the Neo4j group over time. I wanted to plot the actual count alongside a rolling average for which I created the following data frame:

Linda Gimmeson 09/28/14
3027 views
0 replies

10 Big Data Tools

Hadoop isn't the only big data tool out there. Check out this list of big data tools available.

Armel Nene 09/27/14
7835 views
0 replies

Big Data Architecture Best Practices

The marketing departments of software vendors have done a good job of making Big Data go mainstream, whatever that means. The promise is that we can achieve anything if we make use of Big Data: business insight and beating our competition into submission. Yet there is no well-publicised successful Big Data implementation. The question is: why not?

Adam Diaz 09/26/14
5918 views
0 replies

The Evolution of MapReduce and Hadoop

With MapReduce, companies no longer need to delete old logs that are ripe with insights—or dump them onto unmanageable tape storage—before they’ve had a chance to analyze them. Today, the Apache Hadoop project is the most widely used implementation of MapReduce.

Evert Pot 09/25/14
2167 views
0 replies

Accessing protected properties from objects that share the same ancestry.

I realized something odd about accessing protected properties the other day. It's possible in PHP to access protected properties from other objects, as long as they are from the same class, as illustrated here:

Benjamin Ball 09/25/14
3821 views
0 replies

The No Fluff Introduction to Big Data

Due to the obstacles presented by large-scale data management, the goal for developers and data scientists is twofold: first, systems must be created to handle large-scale data, and second, business intelligence and insights should be acquired from analysis of the data.

Alec Noller 09/24/14
11887 views
0 replies

Dev of the Week: Sander Mak

This week we're talking to Sander Mak, Senior Software Engineer at Luminis Technologies, JavaOne Rockstar, and featured author in DZone's 2014 Guide to Big Data.