Tag Archives: performance

Building a Boolean Logic Rules Engine in Neo4j

A boolean logic rules engine was the first project I did for Neo4j before I joined the company some 5 years ago. I was working for some start-up at the time but took a week off to play consultant. I had never built a rules engine before, but as far as I know ignorance has never stopped anyone from trying. Neo4j shipped me to the client site and put me in a room with a projector and a whiteboard, where I live-coded with an audience of developers staring at me, analyzing every keystroke and cringing at every typo and failed unit test. I forgot what sleep was, but managed to figure it out, and I lost all sense of fear after that experience.

The data model chained fact nodes together with crisscrossing relationships. Each chain carried the same path id property, which we followed until we reached an end node that triggered a rule. There were a few complications along the way and more complexity near the end for ordering and partial matches. The traversal ended up being some 40 lines of the craziest Gremlin code I ever wrote, but it worked. After the proof of concept, the project was rewritten using the Neo4j Java API because at the time only a handful of people could look at a 40-line Gremlin script and not shudder in horror. I think we’re up to two handfuls now.
Continue reading

Finding Triplets with Neo4j

A user had an interesting Neo4j question on Stack Overflow the other day:

I have two types of nodes in my graph. One type is Testplan and the other is Tag. Testplans are tagged to Tags. I want most common pairs of Tags that share the same Testplans with a Tag having a specific name. I have been able to achieve the most common Tags sharing the same Testplan with one Tag, but getting confused when trying to do it for pairs of Tags.
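To make the question concrete, here is a minimal sketch of the counting it describes, written in plain Python rather than Cypher. The sample data, the tag names and the `common_tag_pairs` function are all hypothetical, and this is only an illustration of the problem, not the query from the post or from Stack Overflow.

```python
from collections import Counter
from itertools import combinations


def common_tag_pairs(testplan_tags, tag_name, top=3):
    """Count pairs of other tags that share a testplan with tag_name."""
    pair_counts = Counter()
    for tags in testplan_tags.values():
        # Only testplans carrying the named tag are of interest.
        if tag_name not in tags:
            continue
        # Sort so each pair is always counted under the same key.
        others = sorted(set(tags) - {tag_name})
        for pair in combinations(others, 2):
            pair_counts[pair] += 1
    return pair_counts.most_common(top)


# Hypothetical sample data: testplan name -> set of tag names.
testplans = {
    "tp1": {"smoke", "api", "login"},
    "tp2": {"smoke", "api", "login", "ui"},
    "tp3": {"smoke", "ui"},
    "tp4": {"api", "login"},
}

result = common_tag_pairs(testplans, "smoke")
print(result)
```

With this data the pair ("api", "login") comes out on top, because it appears together with "smoke" on two testplans.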

Continue reading

Using a Cuckoo Filter for Unique Relationships

We often see a pattern in Neo4j applications where a user wants to create one and only one relationship between two nodes. For example, a User follows another User on a social network. We don’t want to accidentally create a second follows relationship, because that may cause errors such as duplicate entries on their feed, problems unfollowing or blocking them, or even skewed recommendation algorithms. It is also just plain wasteful, and while an occasional duplicate relationship won’t be a big deal, millions of them could be.

So how do we deal with this?
Continue reading

Flight Search with Neo4j

I think I am going to take the opportunity to explain why I love graphs in this blog post. I’m going to try to explain why looking at problems from the graph point of view opens you up to creative solutions and makes back-end development fun again. The context of our post is flight search, but our true mission is to figure out how to traverse a graph quickly and efficiently so we can apply our knowledge to other problems.

A long while back, I showed you different ways to model airline flight data. When it comes to modeling in graphs, the lesson to take away is that there is no right way. The optimal model is heavily dependent on the queries you want to ask. Just to prove the point, I’m going to show you yet another way to model the airline flight data that is truly optimized for flight search. If you recall, our last model looked like:
Continue reading

Building a Twitter Clone with Neo4j – Part Eight

In our last post we started the front end of our Twitter Clone application and managed to register and log in a user. Now we need to build the actual functionality of our application. We’re going to need a screen to display the timeline of the logged-in user, a screen to display a single user’s posts, and a screen to display the followers of a user and the users being followed. All of these should fit within the same main template, so maybe we can start with that.

Continue reading

Building a Twitter Clone with Neo4j – Part Two

One of the aspects of my job that I love is the week-long proof of concept bootcamps. What it entails is me (or one of my team members) coming onsite to work with your team to build out a POC in just one week. They all vary somewhat, but I try to stick to a formula that works for me. I spend the first day with the whole team ironing out the Model. This is the trickiest part to get right, because if the model is right, the queries will fall right into place. If the model has to be changed significantly on day 3, let’s say, then a ton of work has to be redone or at least greatly modified. The goal for the end of day one is to have something that looks like the following:
Continue reading

Building a Twitter Clone with Neo4j – Part One

Would you believe there is no shortage of Twitter Clone example applications… maybe because they are easy to replicate (ba dum tss, I’ll be here all week). The earliest one I remember was written by Salvatore Sanfilippo, creator of Redis.
It’s a pretty good read, where he explains the basics of Redis (a Key Value store on steroids) and how to model a social network in it. One of the interesting bits to me is how the status updates (tweets) are handled.
Continue reading

Searching for objects using multiple dimensions

Let’s take a look at a scenario where you are trying to search for things by their attributes, not their description. They can be users, documents, or any object that could be described by discrete values in multiple dimensions. What does that mean exactly? Well, let me give you an example: searching for a dog. My family includes two four-legged furry creatures named Tyler and Ronnie. They are my half lab, half golden retrievers. Dogs come in all shapes and sizes, from teacup breeds with adult weights around 5 lbs, to giant Mastiff breeds over 150 lbs. But most people don’t care exactly how much a dog weighs, only its general size.


Continue reading

Neo4j is faster than MySQL in performing recursive query

A user on Stack Overflow was wondering about the performance difference between Neo4j and MySQL when performing a recursive query. They started with Neo4j performing the query in 240 seconds. Then an optimized Cypher query got them down to 40 seconds. Then I got them down to…
Continue reading

Our own Multi-Model Database – Part 6

Back in Part 2 we ran some JMH tests to see how many empty nodes we could create. Let’s try that test one more time, but adding some properties. Our nodes will have a username, an age and a weight randomly assigned. It’s not a long test, but just enough to give us a ballpark.
Continue reading
