It is interesting to see how node, relationship and property records are stored differently on disk and in the cache.
On disk, everything is a linked list of fixed-size records. Properties are stored as a linked list of property records, each holding a key and a value and pointing to the next property. Each node and each relationship references its first property record. Each node also references the first relationship in its relationship chain. Each relationship references its start node and its end node, as well as the previous and next relationship records in the chains of both the start node and the end node.
On disk, most of the information is contained in the relationship records, with the nodes just referencing their first relationship. In the node cache this is turned around: each node holds references to all of its relationships, and the relationships are simple, holding only their properties. The relationships of each node are grouped by RelationshipType to allow fast traversal of a specific type. All references (dotted arrows) are by ID, and traversals do an indirect lookup through the cache.
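To make the on-disk layout concrete, here is a minimal Python sketch of the three record types and how a node's relationship chain and property chain can be walked. It is an illustration only, not Neo4j's actual store format: the class names, field names, the `NO_ID` sentinel and the in-memory dictionaries standing in for the store files are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

NO_ID = -1  # hypothetical sentinel meaning "end of chain"


@dataclass
class PropertyRecord:
    key: str
    value: Any
    next_prop: int = NO_ID  # next property in the chain


@dataclass
class NodeRecord:
    first_rel: int = NO_ID   # head of this node's relationship chain
    first_prop: int = NO_ID  # head of this node's property chain


@dataclass
class RelationshipRecord:
    start: int
    end: int
    rel_type: str
    start_prev: int = NO_ID  # previous/next in the start node's chain
    start_next: int = NO_ID
    end_prev: int = NO_ID    # previous/next in the end node's chain
    end_next: int = NO_ID
    first_prop: int = NO_ID


def relationship_chain(node_id: int,
                       nodes: Dict[int, NodeRecord],
                       rels: Dict[int, RelationshipRecord]) -> List[int]:
    """Follow the linked list of relationship records for one node."""
    chain = []
    rel_id = nodes[node_id].first_rel
    while rel_id != NO_ID:
        chain.append(rel_id)
        rel = rels[rel_id]
        # Each record sits in two chains; pick the pointer for this node.
        rel_id = rel.start_next if rel.start == node_id else rel.end_next
    return chain


def property_chain(first_prop: int,
                   props: Dict[int, PropertyRecord]) -> Dict[str, Any]:
    """Follow the linked list of property records from its head."""
    result = {}
    prop_id = first_prop
    while prop_id != NO_ID:
        prop = props[prop_id]
        result[prop.key] = prop.value
        prop_id = prop.next_prop
    return result


# Tiny example graph: node 0 -KNOWS-> node 1, node 0 -LIKES-> node 2.
props = {0: PropertyRecord("name", "Alice", next_prop=1),
         1: PropertyRecord("age", 30)}
rels = {0: RelationshipRecord(0, 1, "KNOWS", start_prev=1),
        1: RelationshipRecord(0, 2, "LIKES", start_next=0)}
nodes = {0: NodeRecord(first_rel=1, first_prop=0),
         1: NodeRecord(first_rel=0),
         2: NodeRecord(first_rel=1)}
```

Calling `relationship_chain(0, nodes, rels)` walks node 0's chain from its first relationship, and `property_chain(nodes[0].first_prop, props)` collects its properties. Note that the node record itself stores only the two head pointers; everything else is reached by following links.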
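The inverted cache layout can be sketched in a few lines of Python as well. This is a simplified illustration under stated assumptions, not Neo4j's object cache: the `Cache`, `CachedNode` and `CachedRelationship` names are made up, and the cached relationship keeps its two endpoint IDs here (an assumption of this sketch) so that a traversal can find the node at the other end.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class CachedRelationship:
    start: int  # endpoint IDs kept so a traversal can find the other node
    end: int
    props: Dict[str, Any] = field(default_factory=dict)


@dataclass
class CachedNode:
    props: Dict[str, Any] = field(default_factory=dict)
    # Relationship IDs grouped by type name, e.g. {"KNOWS": [0, 3]}
    rels_by_type: Dict[str, List[int]] = field(default_factory=dict)


class Cache:
    """Hypothetical cache: every reference is an ID resolved via a lookup."""

    def __init__(self) -> None:
        self.nodes: Dict[int, CachedNode] = {}
        self.rels: Dict[int, CachedRelationship] = {}

    def add_node(self, node_id: int, props: Dict[str, Any] = None) -> None:
        self.nodes[node_id] = CachedNode(dict(props or {}))

    def add_relationship(self, rel_id: int, rel_type: str,
                         start: int, end: int,
                         props: Dict[str, Any] = None) -> None:
        self.rels[rel_id] = CachedRelationship(start, end, dict(props or {}))
        # Register the relationship ID on both endpoints (once for a loop).
        for node_id in {start, end}:
            groups = self.nodes[node_id].rels_by_type
            groups.setdefault(rel_type, []).append(rel_id)

    def neighbours(self, node_id: int, rel_type: str) -> List[int]:
        """Traverse one relationship type without touching the others."""
        result = []
        for rel_id in self.nodes[node_id].rels_by_type.get(rel_type, []):
            rel = self.rels[rel_id]  # indirect lookup by ID
            result.append(rel.end if rel.start == node_id else rel.start)
        return result


cache = Cache()
for node_id in (0, 1, 2):
    cache.add_node(node_id)
cache.add_relationship(0, "KNOWS", 0, 1)
cache.add_relationship(1, "LIKES", 0, 2)
```

With this layout, `cache.neighbours(0, "KNOWS")` only looks at the relationships in the `KNOWS` group of node 0, which is the point of grouping by type: a traversal over one type never has to scan the node's other relationships.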
View a video presentation of these slides on Skills Matter.
[…] Max De Marzi points to the slides for a presentation by Tobias Lindaaker on Neo4j internals (January, 2012). […]