In a recent MongoDB and R project, the author faced a new problem. He was using R to process source data stored in MongoDB, and when he passed a large number of documents to R for analysis, processing slowed down and became a bottleneck.
Since version 2.1, Neo4j provides out-of-the-box support for CSV ingestion. But hear my words of advice before you jump directly into using it. There are some tweaks and configuration aspects that you should know to be successful on the first run.
One question that keeps coming up, according to Jonathan Lacefield, is why shared storage is not recommended for Cassandra. The short version? Performance suffers, and it introduces a single point of failure. Lacefield's explanation, however, aims to clarify what "performance suffers" really means.
UbiGraph is a graph rendering server that is controlled remotely and interactively through an XML-RPC API (which is a weird choice). It comes with example clients in Java, Python, Ruby and C. In this article, you'll learn the basics of UbiGraph and how to render Neo4j with it.
Make sure you didn't miss anything with this list of the Best of the Week in the NoSQL Zone. This week's best include a look at performing CRUD operations on MongoDB in a Node.js app, the need for DBaaS in the app economy, and thoughts on how to control an ever-expanding MongoDB database.
If you're looking for something new and a bit different when it comes to NoSQL solutions, you might be interested in Cayley, an open source graph database written in Go and inspired by the graph database behind Freebase and Google's Knowledge Graph.
Thumbtack published an excellent blog post highlighting the preliminary results of performance tests executed with Couchbase Server, MongoDB and DataStax Enterprise (Apache Cassandra). The final results will be included in a benchmark report.
Take the following scenario. You have a time-series data application for which you would like to store a rolling period of data. With basic MongoDB, you would likely create a collection with a “TTL”, or “time to live” index. While simple to use, this solution can run into performance problems.
There are a number of drivers created by the community to interact with MongoDB from a Node.js app. The official mongodb driver seems to be the simplest of them. In this post, we will learn to perform simple CRUD operations on a MongoDB document store using the mongodb driver.
You may have heard about Jonathan Ellis criticizing Thumbtack Technology's NoSQL benchmarks. In short, he suggested that the benchmarks were improperly configured and understated Cassandra's performance. Well, Ben Engber at Thumbtack Technology heard about it, and according to his response, Ellis is way off.
Everybody loves comparing databases. Not everybody agrees on how to do it, though. One prime example is Thumbtack Technology's benchmarks comparing Cassandra, Couchbase, MongoDB, and Aerospike. The problem, according to Jonathan Ellis, is that the benchmarks give Cassandra a raw deal.
The Python team at MongoDB is partially rewriting PyMongo. The next version, 3.0, aims to be faster, more flexible, and more maintainable than the current 2.x series. One strategy is to minimize methods, period. In this article, you'll find the author's rules of thumb when it comes to methods and functions.
The key question asked by DJ Walker-Morgan in this recent post from MongoHQ is an important one: do you actually know how big your database is? As Walker-Morgan points out, most people probably have a number they can point to, but the number may not be communicating exactly what they think it is.
NoSQL is a buzzword nowadays among developers and software professionals. In this article, you'll find a quick guide to NoSQL, including what it is, where to use it, advantages and disadvantages, and some of the more popular NoSQL options.
Make sure you didn't miss anything with this list of the Best of the Week in the NoSQL Zone. This week's best include a tutorial on building a TV show tracker with MongoDB, Node.js, and AngularJS; a look at MongoDB and Grails; 16 of the top NoSQL and NewSQL databases; and more.
Developers are continually upping the ante by creating better, smarter and more valuable apps. However, these apps also have increasingly sophisticated data requirements, and the ability to take them to the next level may be stymied by an archaic approach to databases.
Last time, the author gave a technical explanation of the performance characteristics of partitioned collections in TokuMX 1.5 and partitioned tables in relational databases. Given those characteristics, in this post, he will present some best practices when using this feature in TokuMX or TokuDB.
While building the Neo4j World Cup Graph, the author has been using the LOAD CSV clause, and he frequently found himself needing to do different things depending on the value in one of the columns. In this article, he explores how to handle such conditionals.
In TokuMX 1.5, which is right around the corner, the big feature will be partitioned collections. This feature is similar to partitioned tables in Oracle, MySQL, SQL Server, and Postgres. A question many have is “why should I use partitioned tables?” In short, it’s complicated.
While replicating data from RavenDB to SQL Server or the like does make sense, every report can take a while to generate. Replicating to Elasticsearch provides a real-time view of the data and fast reporting capabilities on it. Now, how do we get data to it from a RavenDB database?
So recently, I had a requirement to store unstructured JSON data that was coming back from a web service.
From Doug Henschen comes a list of the top 16 NoSQL and NewSQL databases, each complete with a profile including description, notable customers, company type, and comments on some notable aspects of the offering.
In his spare time, the author's been working on a Neo4j application that runs on top of meetup.com's API, and recently he learned how to wire up some of the queries to use the RNeo4j library. In this article, you'll see how it's done.
By default, Mule uses in-memory object-stores behind the scenes. Things get more interesting, however, when your Mule application is distributed across multiple Mule nodes. In this blog post, the author shows a simple example of how to synchronize object-stores across multiple applications using MongoDB.
Feedback from the author's last post made him realize that some users may not immediately understand the differences between partitioning a collection and sharding a collection. In this post, he hopes to clear that up.