java | Max De Marzi

Mar 23 2014

Caching Partial Traversals in Neo4j


Sometimes you’ll find yourself looking at a traversal and thinking… “I’m going to be doing this one thing over and over again.” That sounds kind of wasteful and years of recycling have taught us not to be wasteful. Let’s take a look at an example from our past. Look back at the Neo Love application, the one with the picture of Marilyn Monroe and Groucho Marx. Let’s see what a Neo4j 2.0 version of that query would look like:

Continue reading →

Tagged collaborative filtering, cypher, graph database, java, movies, neo4j, performance, recommendation engine, recommendations

Feb 27 2014

6 Comments

Deployment, Java, Problems

Neo4j at Ludicrous Speed

In the last blog post we saw how we could get about 1,250 requests per second (with a 10ms latency) using an Unmanaged Extension running inside the Neo4j server… but what if we wanted to go faster?

The easy answer is to Scale Up. However, trying to add more cores to my Apple laptop doesn’t sound like a good time. Another answer is running a Neo4j Cluster and (almost) linearly scaling our read requests as we add more servers. So a 3 server cluster would give us between 3,500 and 3,750 requests per second.

But can we go faster on a single server without new hardware? Well… yes.
Continue reading →

Tagged github, graph database, java, neo4j, performance

Feb 12 2014

8 Comments

Java, Problems, Random

Online Payment Risk Management with Neo4j

I really like this saying by Corey Lanum:

"Almost all fraud cases involve the fabrication of a relationship, so… model your data to highlight relationships" — @corey_lanum

— Max De Marzi is building RageDB (@maxdemarzi) November 21, 2013

Finding the relationships that should not be there is a great use case for Neo4j, and today I want to highlight an example of why. When you purchase something online, the merchant hands off your information to the payment gateway, which processes your actual payment. Before the gateway accepts the transaction, it runs it through a series of risk management tests to validate that it is a real transaction and to protect itself from fraud. One of the hardest things for SQL-based systems to do is cross-check the incoming payment information against existing data, looking for relationships that shouldn’t be there.
Continue reading →

Tagged credit cards, fraud, graph database, java, neo4j, network, payment, performance, risk

Dec 31 2013

3 Comments

Java, Problems, Random

The Power of Open Source Software


One of the benefits of Open Source Software is that if you want to change how something is done, you can. At Neo Technology, we have a small team of “Field Engineers” who don’t really work ON the product but rather WITH the product. We help our customers with issues of all kinds, answer questions, give suggestions, and do whatever we need to do to make people’s projects successful. A little while back I had a support ticket for a traversal that was taking longer than the customer hoped it would.

Think about a social network. One of the things you may want to do is tell the user how big their friends network is. But why stop there? How about their friends of friends or even friends of friends of friends network? These are the kinds of questions graph databases excel at compared to relational databases. Let’s take a look at what they were doing:
Continue reading →
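To make the friends-of-friends question above concrete, here is a minimal sketch of counting a user's network out to a given depth with a breadth-first walk. It is not the customer's traversal and it does not use the Neo4j API: the graph is a plain in-memory adjacency map, and the class name `FriendNetwork`, the method `countWithinDepth`, and the sample names are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class FriendNetwork {

    // Counts the distinct people reachable from `start` in at most `maxDepth`
    // friendship hops, not counting `start`. The adjacency map is an assumption
    // for this sketch; a real Neo4j traversal would walk relationships instead.
    public static int countWithinDepth(Map<String, List<String>> friends,
                                       String start, int maxDepth) {
        Set<String> visited = new HashSet<>();
        visited.add(start);
        Queue<String> frontier = new ArrayDeque<>();
        frontier.add(start);

        // Expand one hop per iteration so the depth limit is respected.
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Queue<String> next = new ArrayDeque<>();
            for (String person : frontier) {
                for (String friend : friends.getOrDefault(person, List.of())) {
                    // Set.add returns false for people we have already counted.
                    if (visited.add(friend)) {
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return visited.size() - 1;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: ann - bob - dan - eve, plus ann - cat.
        Map<String, List<String>> friends = Map.of(
            "ann", List.of("bob", "cat"),
            "bob", List.of("ann", "dan"),
            "cat", List.of("ann"),
            "dan", List.of("bob", "eve"),
            "eve", List.of("dan"));

        System.out.println(countWithinDepth(friends, "ann", 1)); // friends
        System.out.println(countWithinDepth(friends, "ann", 2)); // + friends of friends
        System.out.println(countWithinDepth(friends, "ann", 3)); // + one more hop
    }
}
```

The visited set is what keeps the count honest: in a real social graph the same person shows up along many paths, and each should be counted once.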

Tagged github, graph database, java, neo4j, network, performance, testing

Mar 18 2013

8 Comments

Java, Problems

Permission Resolution with Neo4j – Part 1

People produce a lot of content: messages, text files, spreadsheets, presentations, reports, financials, and so on. Usually organizations want a centralized repository of all this content (in case a laptop breaks, gets lost, or is stolen, for example). This leads to some kind of grouping and permission structure. You don’t want employees seeing each other’s HR records unless they work for HR, and the same goes for Payroll, unreleased quarterly numbers, etc. As this data grows, it is no longer easy to simply navigate, and a search engine is required to make sense of it all.

But what if your search engine returns 1000 results for a query and the user doing the search is supposed to have access to only 4 of them? How do you handle this? Check the user’s permissions on each file in real time? Slow. Pre-calculate all document permissions for a user on login? Slow, and what if new documents are created or permissions change between logins? Does the system scale to 1M documents, 10M documents, 100M documents?
Continue reading →

Tagged graph database, java, neo4j, nosql, permissions, relationship graph

Mar 15 2013

1 Comment

Java, Random

A Peek behind the Neo4j Lucene Index Curtain

Did you know you can write JavaScript in the Neo4j console to access the Neo4j API?
Try it. Open up your Neo4j Web Admin Console and type:

neo4j-sh (0)$ eval db
EmbeddedGraphDatabase [data/graph.db]

OMG! I know, Neo4j is crazy. So much to play with, I’ve been at it for a few years and I haven’t even dug into this area. What else can we do here?
Continue reading →

Tagged graph database, java, javascript, lucene, luke, neo4j

Dec 14 2012

5 Comments

Deployment

Setting up a Neo4j Cluster on Amazon

There are multiple ways to set up a Neo4j Cluster on Amazon Web Services (AWS), and I want to show you one way to do it.

Overview:

Create a VPC
Launch 1 Instance
Install Neo4j HA
Clone 2 Instances
Configure the Instances
Start the Coordinators
Start the Neo4j Cluster
Create 2 Load Balancers
Next Steps

We’ll start off by logging on to Amazon Web Services and creating a Virtual Private Cloud:

Continue reading →

Tagged amazon, aws, ec2, graph database, java, load balancers, neo4j, network

Nov 27 2012

1 Comment

Java

Pathfinding with Neo4j Unmanaged Extensions

In Extending Neo4j I showed you how to create an unmanaged extension to warm up the node and relationship caches. Let’s try doing something more interesting like exposing the A* (A Star) search algorithm through the REST API. The graph we created earlier looks like this:
Continue reading →

Tagged github, graph database, java, neo4j, nosql

Nov 26 2012

19 Comments

Cypher, Java

Extending Neo4j

One of the great things about Neo4j is how easy it is to extend it. You can extend Neo4j with Plugins and Unmanaged Extensions. Two great examples of plugins are the Gremlin Plugin (which lets you use the Gremlin library with Neo4j) and the Spatial Plugin (which lets you perform spatial operations like searching for data within specified regions or within a specified distance of a point of interest).

Plugins are meant to extend the capabilities of the database, nodes, or relationships. Unmanaged extensions are meant to let you do anything you want. This great power comes with great responsibility, so be careful what you do here. David Montag cooked up an unmanaged extension template for us to use on GitHub, so let’s give it a whirl. We are going to clone the project, compile it, download Neo4j, configure Neo4j to use the extension, test the extension, and tweak it a bit.
Continue reading →

Tagged cypher, extension template, github, graph database, java, mvn, neo4j, nosql, software, spatial operations, technology

Jul 02 2012

6 Comments

Java

Batch Importer – Part 3

At the end of February, we took a look at Michael Hunger’s Batch Importer. It is a great tool to load millions of nodes and relationships into Neo4j quickly. The only thing it was missing was Indexing… I say was, because I just submitted a pull request to add this feature. Let’s go through how it was done so you get an idea of what the Neo4j Batch Import API looks like, and in the next blog post I’ll show you how to generate data to take advantage of it.
Continue reading →

Tagged big data, github, graph database, java, neo4j
