Parallel K-Hop Counts

As a foreigner I was a little perplexed the first time I went to IHOP. You are served a stack of pancakes 3-5 high. How do you eat them? Do you pour syrup over the top and cut down through all the layers and eat them that way… or do you unstack them, pour syrup over each one and eat one at a time? If you are American, you eat them stacked. If you see someone eat them one at a time, you know they are shape-shifting lizard people. But doesn’t that mean the bottom layers are dry and don’t get any butter or syrup on them? Well you would think, but Americans are an ingenious people and they found a way to fix that problem. More syrup, more and more, and then a bit more to be sure… and a side of bacon. Now that you know all about IHOP, let’s switch gears to KHOP. Let’s say you wanted to find out how many nodes there were k-hops away from a starting node. What would be the best way to do that?

Finding Motifs in Cypher for Fun and Profit

If you are friends with Jessie, and Jessie is friends with Amy, there is a good chance you’ll eventually become friends with Amy too. In terms of a graph, this would be like a graph with three nodes and two relationships eventually building a third relationship to form a clique. This simple concept is one of the basis for recommendation engines. There are fancy terms for it, like “triadic closure” but basically it just means we are making triangles. But what about Amy’s friend Delilah? Is there a good chance now that you are friends with Amy that you’ll become friends with her? What about Jessie and Delilah? Can we extend the pattern to four nodes or five nodes and go beyond our simple triangle? Continue reading

Graph Analytics Book Jupyter Notebook for Chapter 8

If you’ve been going through the free Graph Algorithms book from Mark Needham and Amy E. Hodler you’ll eventually get to “Chapter 8: Using Graph Algorithms to Enhance Machine Learning”. This is a long chapter which walks us through how to use Graph Features to build and improve machine learning models. If you need a little help with it, take advantage of this public Jupyter notebook on Anaconda. Give it a shot, and let me know if you run into any issues.

Modeling Events in Neo4j

No. Not modeling events, I’m talking about modeling events. Things that happen at different times typically in some known sequence. If you are a long time follower of my blog you know I love promoting the date property of an event into the relationship type to make use of Neo4j’s individual Node-RelationshipType partitioning to speed up my queries, but I’m going to show you something different today.

Filtering Connected Dynamic Forms

Sometimes I contrast Neo4j against relational databases by saying Neo4j is more like a dynamic typed language, and relational databases are more like a static typed language. In Neo4j you don’t have Tables or table definitions, any property can be of any valid value (Java primitives, arrays of Java primitives as well as time and spatial types). Two nodes with the same Label can have completely different properties, and any key can be of any type for different nodes. So for example a User labeled node may have the “id” property be “xyz”, while the “id” property for a Location labeled node may be a spatial type… but another User labeled node may also have the “id” property be a number or an array of floats, or whatever. This kind of freedom can drive people crazy, but it can also be leveraged to make very dynamic applications easy.

Network Routing in Neo4j

People use Neo4j to manage enterprise architectures all the time. If you haven’t seen this presentation from Thomas Lawrence from Amadeus, then you owe it to yourself to watch it. But what about lower level networks? Can we use Neo4j to model routing in a physical network? Of course we can, and today I’ll show you how.

Calculating the best Rail Road paths in Neo4j

Did you know that Chicago is the most important railroad center in North America? Chicago has more lines of track radiating in more directions than from any other city. The windy city has long been the most important interchange point for freight traffic between the nation’s major railroads and it is the hub of Amtrak, the intercity rail passenger system. You may not realize it, but railroad tracks and graph theory have a history together. Back in the mid 1950s the US Military had an interest in finding out how much capacity the Soviet railway network had to move cargo from the Western Soviet Union to Eastern Europe. This lead to the Maximum Flow problem and the Ford–Fulkerson algorithm to solve it.

Neo4j Stored Procedures for Devs that don’t know Java (yet)

When I joined Neo4j, I didn’t know how to write Java. I was a SQL developer who knew some Ruby and that’s about it. Luckily I had Michael Hunger, Stefan Armbruster, David Montag and others to help me out. I realize however that you may not be so lucky. So today I’m going to share with you a set of slides to help you start you on your journey of using the full power of Neo4j.