Tag Archives: graph databases

Let’s build something Outrageous – Part 13: Finding Things Faster

In the previous blog post we started at 32 requests per second and managed to reach 724. I ended it by saying I would investigate a way to vectorize the search so we can check multiple node properties in batches instead of one at a time. Well, I found a way and the results are nothing short of outrageous.

In our Find method we were looking at a list of property values and comparing them to a single value one at a time. Computers since the early 90s have been able to perform SIMD (Single Instruction Multiple Data) operations and we’re going to take advantage of that.
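
To make the batching idea concrete, here is a minimal, portable sketch rather than the actual RageDB code. The name `find_equal` and the batch width of 8 are assumptions made for illustration. Each batch is compared in a branch-free inner loop, which is the kind of loop a compiler may turn into SIMD compare instructions; a real implementation might use explicit intrinsics or a SIMD library instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not RageDB's API): returns the positions in `values`
// whose element equals `target`, checking the values in fixed-size batches.
std::vector<std::size_t> find_equal(const std::vector<std::int64_t>& values,
                                    std::int64_t target) {
    constexpr std::size_t kBatch = 8;  // assumed batch width
    std::vector<std::size_t> found;
    std::size_t i = 0;

    for (; i + kBatch <= values.size(); i += kBatch) {
        // Branch-free comparison of one whole batch against the target.
        std::uint8_t hit[kBatch];
        for (std::size_t j = 0; j < kBatch; ++j) {
            hit[j] = static_cast<std::uint8_t>(values[i + j] == target);
        }
        // Collect the positions that matched within this batch.
        for (std::size_t j = 0; j < kBatch; ++j) {
            if (hit[j]) found.push_back(i + j);
        }
    }

    // Scalar tail for the leftover values that do not fill a batch.
    for (; i < values.size(); ++i) {
        if (values[i] == target) found.push_back(i);
    }
    return found;
}
```

Calling `find_equal(ages, 30)` on a vector of ages returns the positions of every node whose age is 30, in ascending order.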


Let’s build something Outrageous – Part 12: Finding Things

Knowing yourself is the beginning of all wisdom

Aristotle

At some point in your life you will look in the mirror and not recognize the person staring back at you. It’s not only the way you look, but the choices you made and the results of those choices that will seem puzzling. You may decide right then and there to embark on the most important quest of your life. Finding yourself.

Nodes and Relationships don’t get the luxury of having mirrors, but we still need to find them every once in a while. So far all we are able to do is get a node by a label and key or use the id to get a node or relationship. This is great and covers a lot of use cases, but sometimes the primary id is not what we are looking for. We may be looking for all nodes of a type that share a property value equal to something, or greater/less than something else. How can we go about finding the answers to these types of queries in our graph?
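
As a rough sketch of what such a query could look like, assume one property of one node type is held in a single vector indexed by node id. The names `Operation`, `satisfies` and `find_nodes` are hypothetical and not taken from RageDB. The simplest answer is a linear scan that tests every value against the requested comparison:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical comparison operators a find query might support.
enum class Operation { EQ, GT, LT };

// Returns true when `value` satisfies `op` against `target`.
bool satisfies(std::int64_t value, Operation op, std::int64_t target) {
    switch (op) {
        case Operation::EQ: return value == target;
        case Operation::GT: return value > target;
        case Operation::LT: return value < target;
    }
    return false;
}

// Scans one property column (indexed by node id, an assumption of this
// sketch) and returns the ids of the nodes that satisfy the comparison.
std::vector<std::size_t> find_nodes(const std::vector<std::int64_t>& property,
                                    Operation op, std::int64_t target) {
    std::vector<std::size_t> ids;
    for (std::size_t id = 0; id < property.size(); ++id) {
        if (satisfies(property[id], op, target)) ids.push_back(id);
    }
    return ids;
}
```

A scan like this touches every node of the type, so its cost grows with the number of nodes.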


Let’s build something Outrageous – Part 11: Testing

Helene has been testing software since we met in college back in 1999. Today she is the head of QA at S&P Market Intelligence managing hundreds of people. I don’t know how she does it but she is the kind of person that breaks every piece of software she touches… even when she doesn’t want to… like the in-flight entertainment system on a long flight to France. About 15 years ago, I made a terrible mistake and asked her to test a web application I had written. She ripped it apart in minutes. I felt like the worst developer in the world, and I probably wasn’t far off. I never wanted to feel that way again, so I got better at writing tests. Now I’m only ranked second worst, right after that dev who doesn’t test at all.


Let’s build something Outrageous – Part 10: Nulls

We decided that our Nodes and Relationships will have Schema in our database. If we create a User Node Type and set an “age” property type as an Integer, then all User nodes will have that property as an Integer. This idea seems simple enough, but what happens if we truly don’t know the value of a property for a node? What if the user hasn’t told us their age, or for one of many perfectly valid reasons does not know what their age is? Or what if that age property that was once set is deleted? What do we do now?

In Part 5 of this series we talked about how we store properties, but not really how we delete them. In my first design I had a “tombstone” (a value indicating this data is no longer here) for each type. So for Strings it would be an empty string, for Integers it was the lowest negative value allowed, etc… but what do we do about Booleans? A Boolean is either true or false; it doesn’t lend itself to a tombstone style value… and for the types that do, does it even make sense?
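
One common way around the Boolean problem, shown here as an assumption rather than as the design this post settles on, is to keep a separate “is it set?” flag next to each value instead of reserving a tombstone value. The class `BoolColumn` and its methods are hypothetical names:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Boolean property column: a second vector records whether each
// slot currently holds a value, so both true and false stay usable as data.
class BoolColumn {
public:
    // Stores `value` for node `id`, growing the column if needed.
    void set(std::size_t id, bool value) {
        if (id >= values_.size()) {
            values_.resize(id + 1, false);
            present_.resize(id + 1, false);
        }
        values_[id] = value;
        present_[id] = true;
    }

    // Deletes the property by clearing the flag; the old value is ignored.
    void remove(std::size_t id) {
        if (id < present_.size()) present_[id] = false;
    }

    // True when node `id` currently has a value for this property.
    bool has(std::size_t id) const {
        return id < present_.size() && present_[id];
    }

    // Returns the stored value, or `fallback` when the property is null.
    bool value_or(std::size_t id, bool fallback) const {
        return has(id) ? static_cast<bool>(values_[id]) : fallback;
    }

private:
    std::vector<bool> values_;
    std::vector<bool> present_;
};
```

With this layout a node whose property was never set and a node whose property was deleted look the same: `has(id)` returns false for both.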


Let’s build something Outrageous – Part 9: Docker

In the movie “Field of Dreams”, a voice can be heard saying “If you build it, he will come”. For some reason people got that mixed up with “if you build it, they will come” and it became a bit of a trap that many engineers fall for. The myth is that if you build a better contraption everyone will want to use it. But that is not how the world works. You have to win the hearts and minds of the people who may want to use your product… just ask Apple, which posted record results yet again.


Let’s build something Outrageous – Part 8: Queries with Lua

Jamie Brandon wrote a monster of a blog post the other day against SQL. This is my favorite part:

To take an example close to my heart: Differential dataflow is a dataflow engine that includes support for automatic parallel execution, horizontal scaling and incrementally maintained views. It totals ~16kloc and was mostly written by a single person. Materialize adds support for SQL and various data sources. To date, that has taken ~128kloc (not including dependencies) and I estimate ~15-20 engineer-years. Just converting SQL to the logical plan takes ~27kloc, more than the entirety of differential dataflow.

Similarly, sqlite looks to have ~212kloc and duckdb ~141kloc. The count for duckdb doesn’t even include the parser that they (sensibly) borrowed from postgres, which at ~47kloc is much larger than the entire ~30kloc codebase for lua.

Jamie Brandon – Against SQL

Let’s build something Outrageous – Part 7: Performance

I don’t know why, but I need things to be fast. Nothing drives me up the wall more than waiting a long time for some database query to finish. It’s some kind of disease I tell you. So today we’re going to do a little performance checking to see where RageDB is at. So far all we can do is create nodes and relationships, but that’s enough to get us started.


Let’s build something Outrageous – Part 6: Relationships

Good relationships are hard. They don’t just happen. They take time, patience and about two thousand lines of code to work together. I’m not convinced I have it right; maybe one of you out there has a better design we can implement. In the original design I was storing the full relationships complete with starting/ending node ids, properties and type information. This time Relationships are only temporarily created when requested, and we’re just going to store their pieces in different vectors. Let’s dive into the code:
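
Before the real code, here is a deliberately tiny sketch of the “pieces in different vectors” idea. It is an illustration under assumptions, not the RageDB implementation: the names `Relationship` and `RelationshipStore` are hypothetical, and properties and type information are left out. The starting and ending node ids live in parallel vectors, and a `Relationship` object only exists for as long as the caller holds the value returned by `get`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Temporary view of one relationship, assembled only when requested.
struct Relationship {
    std::uint64_t id;
    std::uint64_t start;
    std::uint64_t end;
};

// Hypothetical store for one relationship type: each piece of a relationship
// lives in its own vector, and the relationship id is the shared index.
class RelationshipStore {
public:
    // Appends the pieces and returns the new relationship's id.
    std::uint64_t add(std::uint64_t start, std::uint64_t end) {
        starts_.push_back(start);
        ends_.push_back(end);
        return static_cast<std::uint64_t>(starts_.size() - 1);
    }

    // Builds a Relationship from its pieces; throws std::out_of_range
    // (from vector::at) if the id does not exist.
    Relationship get(std::uint64_t id) const {
        const std::size_t index = static_cast<std::size_t>(id);
        return Relationship{id, starts_.at(index), ends_.at(index)};
    }

private:
    std::vector<std::uint64_t> starts_;
    std::vector<std::uint64_t> ends_;
};
```

Nothing is stored per relationship beyond its slots in the vectors, so adding another piece later, such as a property, means adding another vector with the same indexing.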


Let’s build something Outrageous – Part 5: Properties

Whatever your views on the game Monopoly, you play by getting properties, changing them and making sure anybody that uses them pays a tidy sum. That’s also true of graph databases. Finding a property, changing a property, filtering on a property and sometimes even retrieving properties can be really expensive. Part of the issue is that it was decided at some point that nodes of the same label could have different properties, and that even when they shared property keys, they didn’t necessarily have to share property types. This means one Person node could have a height property and another not. One node’s height property could be an integer representing inches or centimeters while another could use a float and another just write out “170 cm” or “5 foot 6” as strings. Dealing with properties was completely left to the developer. Many times, there isn’t even an option to enforce property types.


Let’s build something Outrageous – Part 4: Creating and Retrieving Nodes

When I was first introduced to graph databases I had a hard time trusting them. When a node gets created, where does it go? There are no tables in graph databases, so I was missing that loving embrace, I mean arrangement of rows and columns. It made me a little paranoid, like what if I lost them? I built some projects storing all the data in Postgres first, so if anything happened to the graph I could rebuild it. That warm protective security blanket is something we’re bringing back. We’re taking another walk through a door and arranging all nodes and properties of each type in a set of Vectors (well, one per Shard of course).
