Extending Neo4j

One of the great things about Neo4j is how easy it is to extend it. You can extend Neo4j with Plugins and Unmanaged Extensions. Two great examples of plugins are the Gremlin Plugin (which lets you use the Gremlin library with Neo4j) and the Spatial Plugin (which lets you perform spatial operations like searching for data within specified regions or within a specified distance of a point of interest).

Plugins are meant to extend the capabilities of the database, nodes, or relationships. Unmanaged extensions are meant to let you do anything you want. This great power comes with great responsibility, so be careful what you do here. David Montag cooked up an unmanaged extension template for us to use on github so lets give it a whirl. We are going to clone the project, compile it, download Neo4j, configure Neo4j to use the extension, test the extension and tweak it a bit.
Continue reading

Tagged , , , , , , , , , ,

CrunchBase on Neo4j

NeoTechnology was featured on TechCrunch after raising a Series B round, and it has an entry on CrunchBase. If you look at CrunchBase closely you’ll notice it’s a graph. Who invested in what, who co-invested, what are the common investment themes between investors, how are companies connected by board members, etc. These are questions we can ask of the graph and are well suited for graph databases.
Continue reading

Tagged , , , , , , ,

Matches are the New Hotness

How do you help a person without a job find one online? A search screen. How do you help a person find love online? A search screen. How do you find which camera to buy online? A search screen. How do you help a sick person self diagnose online? I have no idea, I go to the doctor. Doesn’t matter, what I want to tell you is that there is another way.
Continue reading

Hubway Data Visualization Challenge with Neo4j

Michael Hunger imported the Hubway Challenge dataset into a Neo4j graph database, and made it available for us to play with.
Continue reading

Tagged , , , , , , ,

Hunting Trolls with Neo4j!

Allison Sparrow shared a link to Patentula, a company interested in finding better ways to explore patent data and hunt patent trolls. What caught my attention is this quote from the video below:

What we tried to do with it, is bypass any sort of keyword processing in order to find similar patents. The reason we’ve done this is to avoid the problems encountered by other systems that rely on natural language processing or semantic analysis simply because patents are built to avoid detection by similar keywords…we use network topology (specifically citation network topology) to mine the US patent database in order to predict similar documents.

Continue reading

Tagged , , , ,

Networks, Crowds, and Markets


I’ve had “Networks, Crowds, and Markets: Reasoning About a Highly Connected World” by David Easley and Jon Kleinberg on my bookshelf for a few months now, and a conversation with a client reminded me that I hadn’t finished reading it (barely started really). It is available from Cambridge University Press, but also on the web and in PDF format.
Continue reading

Tagged , , , , , , ,

NeoSocial: Connecting to Facebook with Neo4j

Social applications and Graph Databases go together like peanut butter and jelly. I’m going to walk you through the steps of building an application that connects to Facebook, pulls your friends and likes data and visualizes it. I plan on making a video of me coding it one line at a time, but for now let’s just focus on the main elements.
Continue reading

Tagged , , , , , , , ,

Getting a Big Neo4j Test Box for Cheap!

When embarking on a new Neo4j project, one of the things you have to figure out is where to run it. Most of the time the answer is just your laptop. Other times, using Heroku works great. However, if you are at the stage of your testing where you have billions of nodes and relationships, you need something a little bigger.

If you are not ready to commit to purchasing a 100k server for testing, then I suggest you borrow one for a short time. You can try to spin up an Amazon EC2 instance, the high memory large ones go up to 60 gigs of RAM. But what if you need more? Lots more?
Continue reading

Tagged , ,

Neo4j Internals

An overview of Neo4j Internals
View more presentations from Tobias Lindaaker

It is interesting to see how node, relationship and property records are stored differently on disk and in the cache.

It is all linked lists of fixed size records on disk. Properties are stored as a linked list of property records, each holding a key and value and pointing to the next property. Each node and relationship references its first property record. The Nodes also reference the first relationship in its relationship chain. Each Relationship references its start and end node. It also references the previous and next relationship record for the start and end node respectively.
Continue reading

Tagged , ,

Summarize Opinions with a Graph – Part 1

How does the saying go? Opinions are like bellybuttons, everybody’s got one? So let’s say you have an opinion that NOSQL is not for you. Maybe you read my blog and think this Graph Database stuff is great for recommendation engines and path finding and maybe some other stuff, but you got really hard problems and it can’t help you.

I am going to try to show you that a graph database can help you solve your really hard problems if you can frame your problem in terms of a graph. Did I say “you”? I meant anybody, specially Ph.D. students. One trick is to search for “graph based approach to” and your problem.
Continue reading

Tagged , ,
Follow

Get every new post delivered to your Inbox.

Join 791 other followers