neo4j | Max De Marzi

May 13 2013

Knowledge Bases in Neo4j

From the second we are born we are collecting a wealth of knowledge about the world. This knowledge is accumulated and interrelated inside our brains and it represents what we know. If we could export this knowledge and give it to a computer, it would look like ConceptNet. ConceptNet is a semantic network that…

…is built from nodes representing concepts, in the form of words or short phrases of natural language, and labeled relationships between them. These are the kinds of things computers need to know to search for information better, answer questions, and understand people’s goals.

Continue reading →

Tagged bloom filter, graph database, knowledge base, neo4j, network, ruby, software, technology, wikipedia, word graph

Mar 25 2013

1 Comment

Java, Problems

Permission Resolution with Neo4j – Part 3

Let’s add a couple of performance tests to the mix. We learned about Gatling in a previous blog post, and we’re going to use it again here. The first test will randomly choose users and documents (from the graph we created in part 2) and write the results to a file; the second test will reuse the results of the first one and run consistently, so we can change hardware, change Neo4j parameters, tune the JVM, etc., and see how they affect our performance.

The full code for the Random Permissions test is here; I’ll just highlight the main parts:
Continue reading →

Tagged cypher, github, graph, graph database, neo4j, nosql, performance, testing

Mar 24 2013

1 Comment

Neography, Problems

Permission Resolution with Neo4j – Part 2

Let’s try tackling something a little bigger. In Part 1 we created a small graph to test our permission resolution graph algorithm, and it worked like a charm on our dozen or so nodes and edges. I don’t have fast hands, so instead of typing out a million-node graph, we’ll build a graph generator and use the batch importer to load it into Neo4j. What I want to create is a set of files to feed to the batch-importer.
Continue reading →

Tagged github, graph, graph algorithm, graph database, neo4j, ruby

Mar 18 2013

8 Comments

Java, Problems

Permission Resolution with Neo4j – Part 1

People produce a lot of content: messages, text files, spreadsheets, presentations, reports, financials, etc. The list goes on. Usually organizations want to have a repository of all this content centralized somewhere (just in case a laptop breaks, gets lost, or is stolen, for example). This leads to some kind of grouping and permission structure. You don’t want employees seeing each other’s HR records unless they work for HR, and the same goes for Payroll, unreleased quarterly numbers, etc. As this data grows, it is no longer easy to simply navigate, and a search engine is required to make sense of it all.

But what if your search engine returns 1000 results for a query and the user doing the search is only supposed to have access to 4 of them? How do you handle this? Check the user’s permissions on each file in real time? Slow. Pre-calculate all document permissions for a user on login? Slow, and what if new documents are created or permissions change between logins? Does the system scale at 1M documents, 10M documents, 100M documents?
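To make the "check each result in real time" option concrete, here is a minimal sketch of filtering search hits by walking a small permission graph. It is only an illustration under stated assumptions: the data, the `MEMBER_OF` and `CAN_READ` maps, and the function names are all hypothetical, and it uses plain in-memory Python dictionaries rather than Neo4j.

```python
from collections import deque

# Hypothetical toy data (not from the post): who belongs to which group,
# and which user or group may read which documents.
MEMBER_OF = {
    "alice": ["hr"],
    "bob": ["engineering"],
    "hr": ["staff"],
    "engineering": ["staff"],
}
CAN_READ = {
    "hr": {"hr_records.xls"},
    "staff": {"handbook.pdf"},
    "bob": {"design.doc"},
}


def readable_documents(user):
    """Collect every document reachable from the user via group membership."""
    seen = {user}
    queue = deque([user])
    docs = set()
    while queue:
        principal = queue.popleft()
        # Documents granted directly to this user or group.
        docs |= CAN_READ.get(principal, set())
        # Walk up to the groups this user or group belongs to.
        for group in MEMBER_OF.get(principal, []):
            if group not in seen:
                seen.add(group)
                queue.append(group)
    return docs


def filter_results(user, results):
    """Keep only the search hits the user may see, preserving result order."""
    allowed = readable_documents(user)
    return [doc for doc in results if doc in allowed]
```

Calling `filter_results("alice", hits)` drops every hit that Alice cannot reach through her groups. The cost of this approach grows with the size of the permission structure, which is exactly the scaling question raised above.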
Continue reading →

Tagged graph database, java, neo4j, nosql, permissions, relationship graph

Mar 15 2013

1 Comment

Java, Random

A Peek behind the Neo4j Lucene Index Curtain

Did you know you can write JavaScript in the Neo4j console to access the Neo4j API?
Try it. Open up your Neo4j Web Admin Console and type:

neo4j-sh (0)$ eval db
EmbeddedGraphDatabase [data/graph.db]

OMG! I know, Neo4j is crazy. There is so much to play with; I’ve been at it for a few years and I haven’t even dug into this area. What else can we do here?
Continue reading →

Tagged graph database, java, javascript, lucene, luke, neo4j

Feb 14 2013

6 Comments

Cypher, Deployment, Random

Neo4j and Gatling sitting in a tree, Performance T-E-S-T-ing


I was introduced to the open-source performance testing tool Gatling a few months ago by Dustin Barnes and fell in love with it. It has an easy-to-use DSL, and even though I don’t know a lick of Scala, I was able to figure out how to use it. It creates pretty awesome graphics and takes care of a lot of work for you behind the scenes. They have great documentation and a pretty active Google group where newbies and questions are welcomed.

It ships with Scala, so all you need to do is create your tests and use the command line to execute them. I’ll show you how to do a few basic things, like test that you have everything working, then we’ll create nodes and relationships, and then query those nodes.
Continue reading →

Tagged cypher, gatling, graph database, neo4j, nosql, performance, scala, testing

Dec 14 2012

5 Comments

Deployment

Setting up a Neo4j Cluster on Amazon

There are multiple ways to set up a Neo4j Cluster on Amazon Web Services (AWS), and I want to show you one way to do it.

Overview:

Create a VPC
Launch 1 Instance
Install Neo4j HA
Clone 2 Instances
Configure the Instances
Start the Coordinators
Start the Neo4j Cluster
Create 2 Load Balancers
Next Steps

We’ll start off by logging on to Amazon Web Services and creating a Virtual Private Cloud:

Continue reading →

Tagged amazon, aws, ec2, graph database, java, load balancers, neo4j, network

Nov 27 2012

1 Comment

Java

Pathfinding with Neo4j Unmanaged Extensions

In Extending Neo4j I showed you how to create an unmanaged extension to warm up the node and relationship caches. Let’s try doing something more interesting like exposing the A* (A Star) search algorithm through the REST API. The graph we created earlier looks like this:
Continue reading →

Tagged github, graph database, java, neo4j, nosql

Nov 26 2012

19 Comments

Cypher, Java

Extending Neo4j

One of the great things about Neo4j is how easy it is to extend it. You can extend Neo4j with Plugins and Unmanaged Extensions. Two great examples of plugins are the Gremlin Plugin (which lets you use the Gremlin library with Neo4j) and the Spatial Plugin (which lets you perform spatial operations like searching for data within specified regions or within a specified distance of a point of interest).

Plugins are meant to extend the capabilities of the database, nodes, or relationships. Unmanaged extensions are meant to let you do anything you want. This great power comes with great responsibility, so be careful what you do here. David Montag cooked up an unmanaged extension template for us to use on GitHub, so let’s give it a whirl. We are going to clone the project, compile it, download Neo4j, configure Neo4j to use the extension, test the extension, and tweak it a bit.
Continue reading →

Tagged cypher, extension template, github, graph database, java, mvn, neo4j, nosql, software, spatial operations, technology

Nov 14 2012

6 Comments

Cypher, Heroku, Neography, Visualization

CrunchBase on Neo4j

NeoTechnology was featured on TechCrunch after raising a Series B round, and it has an entry on CrunchBase. If you look at CrunchBase closely you’ll notice it’s a graph. Who invested in what, who co-invested, what are the common investment themes between investors, how are companies connected by board members, etc. These are questions we can ask of the graph and are well suited for graph databases.
Continue reading →

Tagged cypher, github, graph database, heroku, neo4j, network, ruby, visualization

Max De Marzi

Graphs, Graphs, and nothing but the Graphs

Tag Archives: neo4j
