Category Archives: Random

Observations of the LDBC SNB Benchmark

RageDB Mascot running

I’m still trying to figure out the right look for the language folks will use to talk to RageDB. Instead of waiting until I have it figured out, I decided I should write all the queries for the LDBC SNB Benchmark to prepare for a full run in the next few months. Now that we added “stored procedures” to RageDB, the benchmark code is trivial. I send a post request to the Lua url with the name of the query plus any parameters it may need which are in a CSV file. Here is Short Query 4 for example and they all look like this besides the different parameters:

Continue reading
Tagged , , , , , ,

When the demo is over

At the end of the NODES conference opening Keynote, Neo4j presents a demo of sub 20ms query performance on a graph of 11 shards, 101 shards and 1129 shards. Quite an impressive feat. Then the CEO asks “is this too good to be True?”. TLDR: Yes. Well, let’s find out why.

The “code” for the demo was released on github so we can dive in. We will start with the weirdest part of the code:

Continue reading

Declarative Query Languages are the Iraq War of Computer Science

It’s Memorial Day weekend in the United States. Some people are staying home, others are observing the holiday quietly and others still are using it as an excuse to party because they have seemed to have forgotten that the entire world is once again at war. At war with a tiny enemy, so small some people think it’s a hoax. The worst part is the enemy is in each other, our friends and neighbors. But Memorial day is not about remembering the wars, but rather remembering the fallen. To remember those who gave all. Whatever you may think of war, all are terrible, some were necessary. I never served, so that’s about all I get to say about that.

About 14 years ago Ted Neward wrote a very long blog post on “The Vietnam of Computer Science”. There is a follow up, and a short summary by Jeff Atwood as well. If you have never read them, I ask you to do so now…and with that, I believe Query Languages are the Iraq War of Computer Science.

Continue reading

Tagged , , , , , , , , ,

Going Faster in 2020

Everybody loves benchmarks. Well let me rephrase that, everyone who publishes benchmarks loves the benchmark they publish. Nobody publishes benchmarks that make them look bad, it would be a terrible idea. But you my friend are in luck. I’m full of terrible ideas and today I’m going to publish some benchmarks that makes us look bad. Why? Because we are actively trying to improve this part of Neo4j and I want to experiment with some ideas and see what a ballpark theoretical limit should look like…and I’ll throw in some “inside baseball” about the graph database space from my point of view if you stick around until the end.
Continue reading

Tagged , , , , , , , ,

Neo4j Stored Procedures for Devs that don’t know Java (yet)

When I joined Neo4j, I didn’t know how to write Java. I was a SQL developer who knew some Ruby and that’s about it. Luckily I had Michael Hunger, Stefan Armbruster, David Montag and others to help me out. I realize however that you may not be so lucky. So today I’m going to share with you a set of slides to help you start you on your journey of using the full power of Neo4j.
Continue reading

Tagged , , , , , , , , , ,

Multiple origin multiple destination 3 relationships queries for knowledge graphs using Neo4j

The multiple-origin-multiple-destination (MOMD) problem is an NP-Hard problem sometimes seen in logistics planning where paths can stretch out really far. A far simpler problem presents itself when we limit the size of the paths. Now you may be wondering, why would we do that? Well… outside logistics we have plenty of graphs where relevance drops as we get further and further away. Think about an Article on Wikipedia. It has links to many other articles that are relevant, and those have links to other articles that are relevant to them but less relevant to our starting Article, and those have links to other articles that may be relevant to them, but have very little to do with our starting Article. I think if we keep going we end up in Philosophy or something like that.
Continue reading

Tagged , , , , , , , , , ,

Adding gRPC to Neo4j

You are probably sick of me saying it, but one of the things I love about Neo4j is that you can customize it any way you want. Extensions, stored procedures, plugins, custom indexes, custom apis, etc. If you want to do it, then you can do it with Neo4j.

So the other day I was like what about this gRPC thing? Many companies standardize their backend using RESTful APIs, others are trying out GraphQL, and some are using gRPC. Neo4j doesn’t support gRPC out of the box, partially because we have our own custom binary protocol “Bolt”, but we can add a rudimentary version of gRPC support quite easily.
Continue reading

Tagged , , , , , , , ,

Building a Boolean Logic Rules Engine in Neo4j

A boolean logic rules engine was the first project I did for Neo4j before I joined the company some 5 years ago. I was working for some start-up at the time but took a week off to play consultant. I had never built a rules engine before, but as far as I know ignorance has never stopped anyone from trying. Neo4j shipped me to the client site, and put me in a room with a projector and a white board where I live coded with an audience of developers staring at me, analyzing every keystroke and cringing at every typo and failed unit test. I forgot what sleep was, but managed to figure it out and I lost all sense of fear after that experience.

The data model chained together fact nodes with criss crossing relationships each chain containing the same path id property we followed until reaching an end node which triggered a rule. There were a few complications along the way and more complexity near the end for ordering and partial matches. The traversal ended up being some 40 lines of the craziest Gremlin code I ever wrote, but it worked. After the proof of concept, the project was rewritten using the Neo4j Java API because at the time only a handful of people could look at a 40 line Gremlin script and not shudder in horror. I think we’re up to two handfuls now.
Continue reading

Tagged , , , , , , , , , ,

Our own Multi-Model Database – Part 6

shitty6

Back in Part 2 we ran some JMH tests to see how many empty nodes we could create. Let’s try that test one more time, but adding some properties. Our nodes will have a username, an age and a weight randomly assigned. It’s not a long test, but just enough to give us a ballpark.
Continue reading

Tagged , , , , , , ,

Our own Multi-Model Database – Part 3

shitty3

If you haven’t read part 1 and part 2 then do that first or you’ll have no clue what I’m doing, and I’d like to be the only one not knowing what I’m doing.

We’ve built the beginnings of this database but so far it’s just a library and for it to be a proper database we need to be able to talk to it. Following the Neo4j footsteps, we will wrap a web server around our database and see how it performs.

There are a ton of Java based frameworks and micro-frameworks out there. Not as bad as the Javascript folks, but that still leaves us with a lot of choices. So as any developer would do I turn to benchmarks done by other people of stuff that doesn’t apply to me, and you won’t believe what I found –scratch that, yes you will, I got benchmarks.
Continue reading

Tagged , , , , , , ,