Category Archives: Random

Delivering a Graph Based Search solution to slightly wrong data

When it comes to databases, having good clean data is always important. That is even more true of graphs, which model concepts as nodes along with the relationships between them. Inevitably, you will run into messy data and have to deal with it. Many of our customers' projects involve connecting multiple data sources to arrive at a “golden record” or single source of truth. It is a lofty goal, and sometimes an impossible one, but we can use the relationships in the data to help us come close.

One option is to extract the features (or tags) of a composite object and see whether any other object shares most of them. If so, the two are possibly the same object and should be merged rather than stored as a new record. A partial subgraph match is akin to a recommendation engine in Neo4j and pretty trivial to write. Take a look back at a few old blog posts for ideas.
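
To make the idea concrete outside of Neo4j, here is a minimal sketch in plain Python of matching on shared features. Everything in it is an assumption for illustration: the record ids, the feature values, the `shared_fraction` and `find_probable_duplicates` helpers, and the 0.75 threshold standing in for "most of these features". It is not the query from the full post.

```python
def shared_fraction(candidate, existing):
    """Fraction of the candidate's features that the existing record also has."""
    if not candidate:
        return 0.0
    return len(candidate & existing) / len(candidate)


def find_probable_duplicates(candidate, records, threshold=0.75):
    """Return (record_id, score) pairs at or above the threshold, best match first."""
    scored = (
        (record_id, shared_fraction(candidate, features))
        for record_id, features in records.items()
    )
    matches = [(rid, score) for rid, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: each record is reduced to a set of feature strings.
records = {
    "rec-1": {"acme corp", "chicago", "60601", "312-555-0100"},
    "rec-2": {"acme inc", "boston", "02108", "617-555-0199"},
}
incoming = {"acme corp", "chicago", "60601", "312-555-0111"}

# rec-1 shares three of the four incoming features, so it is a merge candidate.
print(find_probable_duplicates(incoming, records))
```

In a graph, the sets would be the feature nodes each object is connected to, and the overlap count falls out of a two-hop pattern match instead of a set intersection.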

Benchmarks and Superchargers

For the most part, I hate competitive benchmarks. The vendor who publishes them always seems to come out on top regardless. The numbers are always amazing, but once you start digging in a little you see faults in what is actually being measured, and it never applies to real-world workloads. For example, you have Cassandra claiming 1 million writes per second on 300 servers. Then Aerospike claims 1 million writes per second on 50 servers. MongoDB claims almost 32k writes per second on a single server, while claiming Cassandra can only do 6k w/s and Couch can only do 1.2k w/s on a single server… Then ScyllaDB posts almost 2 million writes per second on 3 servers, blowing everybody away.

Modeling Airline Flights in Neo4j

Actor Leonardo DiCaprio as Frank Abagnale in the Steven Spielberg movie “Catch Me If You Can”

If you’ve come to any of the Neo4j Data Modeling classes I’ve taught, you must have heard me say “your model depends on both your data and your queries” about a million times. Let us take a closer look at what this means by seeing how one might model airline flight data in Neo4j.

Importing the Hacker News Interest Graph

Graphs are everywhere. Think about the computer networks that allow you to read this sentence, the road or train networks that get you to work, the social network that surrounds you and the interest graph that holds your attention. Everywhere you look, graphs. If you manage to look somewhere and you don’t see a graph, then you may be looking at an opportunity to build one. Today we are going to do just that. We are going to make use of the new Neo4j Import tool to build a graph of the things that interest Hacker News.

Caching Immutable Id lookups in Neo4j

If you’ve been following my blog for a while, you probably know I like using YourKit and Gatling for testing end to end requests in Neo4j. Today however we are going to do something a little different. We are going to be micro-benchmarking a very small piece of code within our Unmanaged Extension using a Java library called JMH.

Kickstarting a Neo4j Video Series

Learn how to build high performance @neo4j applications with this video training course.

I’m on Kickstarter to ask for your help in order to create a set of videos to teach you how to build high performance Neo4j applications. I am going to capture the lessons I’ve learned over the past 4 years working with graph databases and share them with you.

These videos will teach you everything you need to know about building high performance applications using Neo4j.

Online Payment Risk Management with Neo4j

I really like this saying by Corey Lanum:

Finding the relationships that should not be there is a great use case for Neo4j, and today I want to highlight an example of why. When you purchase something online, the merchant hands off your information to the payment gateway, which processes your actual payment. Before the gateway accepts the transaction, it runs it through a series of risk management tests to validate that it is a real transaction and to protect itself from fraud. One of the hardest things for SQL-based systems to do is cross-check the incoming payment information against existing data, looking for relationships that shouldn’t be there.
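
As a rough illustration of that kind of cross-check, the sketch below indexes past payments by attribute value and flags an incoming payment whose card, email, or IP address is already tied to a different account. The field names, sample values, and the `build_index` and `suspicious_links` helpers are all hypothetical, and this is plain Python, not the Neo4j approach the full post describes.

```python
from collections import defaultdict

ATTRIBUTES = ("card", "email", "ip")  # assumed fields on each payment


def build_index(payments):
    """Map each (attribute, value) pair to the set of accounts that used it."""
    index = defaultdict(set)
    for payment in payments:
        for attribute in ATTRIBUTES:
            index[(attribute, payment[attribute])].add(payment["account"])
    return index


def suspicious_links(incoming, index):
    """Return {attribute: other accounts} for values already used by another account."""
    links = {}
    for attribute in ATTRIBUTES:
        seen_by = index.get((attribute, incoming[attribute]), set())
        others = seen_by - {incoming["account"]}
        if others:
            links[attribute] = others
    return links


# Hypothetical history and incoming payment.
history = [
    {"account": "a1", "card": "card-0001", "email": "ann@example.com", "ip": "10.0.0.1"},
    {"account": "a2", "card": "card-0002", "email": "bob@example.com", "ip": "10.0.0.2"},
]
incoming = {"account": "a3", "card": "card-0001", "email": "carl@example.com", "ip": "10.0.0.2"}

# The new account reuses a1's card and a2's IP address.
print(suspicious_links(incoming, build_index(history)))
```

Each flagged entry is a relationship that shouldn’t be there: two supposedly unrelated accounts connected through a shared value.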

The Power of Open Source Software

One of the benefits of Open Source Software is that if you want to change how something is done, you can. At Neo Technology, we have a small team of “Field Engineers” who don’t really work ON the product but rather WITH the product. We help our customers with issues of all kinds, answer questions, give suggestions, and do whatever we need to do to make people’s projects successful. A little while back I had a support ticket from a customer whose traversal was taking longer than they hoped it would.

Think about a social network: one of the things you may want to do is tell users how big their friends network is. But why stop there? How about their friends of friends, or even their friends of friends of friends network? These are the kinds of questions graph databases excel at compared to relational databases. Let’s take a look at what they were doing:
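
The customer's traversal itself is not reproduced in this excerpt. As a generic illustration of the question being asked (not their code, and not the Neo4j traversal API), here is a minimal breadth-first sketch in Python that counts the distinct people within a given number of hops. The `friends` adjacency map and the `network_size` helper are made up for the example.

```python
def network_size(friends, start, max_depth):
    """Count distinct people within max_depth hops of start, excluding start."""
    seen = {start}
    frontier = {start}
    for _ in range(max_depth):
        # Expand one hop, keeping only people not reached at a shallower depth.
        frontier = {
            friend
            for person in frontier
            for friend in friends.get(person, ())
            if friend not in seen
        }
        if not frontier:
            break
        seen |= frontier
    return len(seen) - 1


# Hypothetical friendships, stored in both directions.
friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "dan"],
    "dan": ["bob", "cat", "eve"],
    "eve": ["dan"],
}

for depth in (1, 2, 3):
    print(depth, network_size(friends, "ann", depth))
```

The `seen` set is what keeps dan from being counted twice even though he is reachable through both bob and cat.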

Connected

Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives is a mind-bending look at how, no matter how individual we think we are, the people around us have a great deal of influence on our lives. One of the authors, James Fowler, was at GraphConnect 2012 and gave a presentation on this idea:

Neo4j 2.0 is coming

House Neo4j of Graph Databases is one of the Great Houses of NOSQL and the principal noble house of The Graph; many lesser houses are sworn to them. In days of old they ruled as Kings of the Graph; since the Aggregate Store Conquest they have been Wardens of the Path. Their seat, San Mateo, is an ancient castle renowned for its sushi. Their sigil is an octopus racing across a field of white, and their words are “Neo4j 2.0 Is Coming,” one of only a few house mottoes to be a warning rather than a boast. Members of the family tend to be lean of build and long of face, with golden hair and blue eyes.
