TL;DR: Google Analytics stores a massive amount of statistical data from websites across the globe. Retrieving reports quickly from such a large amount of data requires Google to use a custom solution that is easily scalable whenever more data needs to be stored.
Over the last 6 months, my colleagues and I have been running hands-on Neo4j-based sessions every few weeks, and I was recently asked if I could write up the lessons we’ve learnt. So, in no particular order, here are some of them.
We consistently hear that getting started with MongoDB is easy, but scaling to large configurations that include replication and sharding can be challenging. With MMS, it is now much easier.
Shindig is a mobile app (iOS, Android) that helps you explore new drinks and share them with the world. Take a picture of what you’re drinking, tag it with taste tags, share it, earn rewards and gamification points, follow famous mixologists and drink aficionados, and search for the best drinks nearby.
Many enterprises are turning to us to help add a cache to an existing application or evolve applications to next-generation technologies. For these level-two cache implementations, we’ve helped develop a data access layer for applications in the Spring project.
In this post we will go through some recommendations for running a sharded cluster at scale. Scalability is one of the core benefits of sharding in MongoDB, but this can give you a false sense of security; even with that flexibility, you still have to make smart decisions about how and when you deploy resources.
An incremental software development process requires an incremental database migration strategy.
We’ve started running some sessions on graph modelling in London, and during the first session it was pointed out that the process I’d described was very similar to the process of modelling for a relational database.
In this post I will present how to connect to MongoDB from a stateless Java EE application, taking advantage of the built-in pool of connections to the database offered by the MongoDB Java Driver. This might be the case if you develop a REST API that executes operations against a MongoDB database.
Make sure you didn't miss anything with this list of the Best of the Week in the NoSQL Zone. This week's best include part 1 of a series on MongoDB sharding pitfalls, the release of Redis Cluster as a minimum viable product, a new source for MEAN stack resources, and more.
Redis is blazing fast and can easily handle hundreds of thousands to millions of operations per second (of course, YMMV depending on your setup), but there are cases in which you may feel that it is underperforming.
An approach to modeling that I often see while working with Neo4j users is creating very generic relationships (e.g. HAS, CONTAINS, IS) and filtering on a relationship property or on a property/label at the end node.
The MongoDB Aggregation pipeline is a framework for data aggregation. Documents enter a multi-stage pipeline that transforms them into aggregated results. It was introduced in MongoDB 2.2 to perform aggregation operations without needing to use map-reduce.
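The multi-stage idea can be illustrated without a running server. The following is a minimal sketch in plain Python, not MongoDB's API: the helpers `match`, `group_sum`, and `run_pipeline` and the sample `orders` data are hypothetical names invented for illustration. They only mimic how each stage receives the previous stage's documents, in the spirit of a filter stage followed by a grouping stage.

```python
from collections import defaultdict


def match(docs, predicate):
    """Filter stage: keep only the documents that satisfy the predicate."""
    return [d for d in docs if predicate(d)]


def group_sum(docs, key, field):
    """Grouping stage: sum `field` per distinct value of `key`."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[field]
    # Sort by group key so the output order is deterministic.
    return [{"_id": k, "total": v} for k, v in sorted(totals.items())]


def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return docs


# Hypothetical sample documents.
orders = [
    {"customer": "ann", "status": "shipped", "amount": 30},
    {"customer": "bob", "status": "shipped", "amount": 20},
    {"customer": "ann", "status": "pending", "amount": 15},
    {"customer": "ann", "status": "shipped", "amount": 5},
]

result = run_pipeline(orders, [
    lambda docs: match(docs, lambda d: d["status"] == "shipped"),
    lambda docs: group_sum(docs, "customer", "amount"),
])
print(result)
```

Here the shipped orders are filtered first and then totalled per customer, so the grouping stage never sees the pending order.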
One of the important roles operations has is going to an existing server and checking if everything is fine. This is routine maintenance stuff. It can be things like checking if we have enough disk space for our expected growth, or if we don’t have too many indexes.
Recently, I had the pleasure of doing a talk at the Brussels Data Science meetup. Some really cool people there, with interesting things to say. My talk was about how graph databases like Neo4j can contribute to HR Analytics. Here are the slides of the talk.
Unit testing requires isolating individual components from their dependencies. Dependencies are replaced with mocks, which simulate certain use cases.
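As a rough illustration of replacing a dependency with a mock, here is a minimal sketch using Python's standard `unittest.mock`. The `ReportService` class and its `find_orders` repository method are hypothetical, made up for this example; the mock simulates one use case (a customer with two orders) so the component can be tested without a real database.

```python
from unittest.mock import Mock


class ReportService:
    """Hypothetical component under test; it depends on a repository."""

    def __init__(self, repository):
        self.repository = repository

    def total_for(self, customer):
        # The only interaction with the dependency.
        orders = self.repository.find_orders(customer)
        return sum(order["amount"] for order in orders)


# The mock stands in for the real repository and simulates one use case.
repo = Mock()
repo.find_orders.return_value = [{"amount": 30}, {"amount": 5}]

service = ReportService(repo)
total = service.total_for("ann")

# Verify both the result and the interaction with the dependency.
repo.find_orders.assert_called_once_with("ann")
print(total)
```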
This week, DZone released its latest Refcard. For anyone interested in learning more about MongoDB or sharpening their skills, we decided to dig into the DZone archives and find some of the most popular posts we've had on the topic.
Many clients don’t quite realize how much powerful ad-hoc query capability they’re losing by leaving SQL. But how can we possibly have the best of both worlds? Well, luckily for us, Postgres is working on a very handy solution.
After looking at all the pretty pictures, let us take a look at what is available to us behind the covers for ops. The first such change is abandoning performance counters.
Sharding is a popular feature in MongoDB, primarily used for distributing data across clusters for horizontal scaling. But as you add complexity to a distributed system, you increase the chances of hitting a problem.
One of my favourite functions in Neo4j’s Cypher query language is COLLECT, which allows us to group items into an array for later consumption. However, I’ve noticed that people sometimes have trouble working out how to collect multiple items with COLLECT and struggle to find a way to do so.
Make sure you didn't miss anything with this list of the Best of the Week in the NoSQL Zone. This week's best include the rise (and fall?) of NoSQL, a look at using MongoDB with Go and mgo, the dissection of Fall 2014's NoSQL benchmark, and more.
It all comes down to preferences. While there are Redis users who are familiar with the Redis command line interface (CLI) and rely on it, there are those who prefer using a GUI. There are several Redis GUIs available, for different platforms, and in this article I'll try to review a few of them.
That is one scary headline, isn’t it? A customer called me in a state of panic: their database was not loading, and nothing they tried worked. Here is the story as I got it from the customer in question, only embellished to give the proper context for the story.