Tag Archives: java

Caching Partial Traversals in Neo4j

cache_all_the_things

Sometimes you’ll find yourself looking at a traversal and thinking… “I’m going to be doing this one thing over and over again.” That sounds kind of wasteful and years of recycling have taught us not to be wasteful. Let’s take a look at an example from our past. Look back at the Neo Love application, the one with the picture of Marilyn Monroe and Groucho Marx. Let’s see what a Neo4j 2.0 version of that query would look like:

Continue reading

Tagged , , , , , , , ,

Neo4j at Ludicrous Speed

spaceballs_ludicrous_speed

In the last blog post we saw how we could get about 1,250 requests per second (with a 10ms latency) using an Unmanaged Extension running inside the Neo4j server… but what if we wanted to go faster?

The easy answer is to Scale Up. However, trying to add more cores to my Apple laptop doesn’t sound like a good time. Another answer is running a Neo4j Cluster and (almost) linearly scaling our read requests as we add more servers. So a 3 server cluster would give us between 3,500 and 3,750 requests per second.

But can we go faster on a single server without new hardware? Well… yes.
Continue reading

Tagged , , , ,

Online Payment Risk Management with Neo4j

credit_cards_512

I really like this saying by Corey Lanum:

Finding the relationships that should not be there is a great use case for Neo4j, and today I want to highlight an example of why. When you purchase something online, the merchant hands off your information to the payment gateway which processes your actual payment. Before they accept the transaction, they run it via series of risk management tests to validate that it is a real transaction and protect themselves from fraud. One of the hardest things for SQL based systems to do is cross check the incoming payment information against existing data looking for relationships that shouldn’t be there.
Continue reading

Tagged , , , , , , , ,

The Power of Open Source Software

opensource-400

One of the benefits of Open Source Software is that if you want to change how something is done, you can. At Neo Technology, we have a small team of “Field Engineers” who don’t really work ON the product but rather WITH the product. We help our customers with issues of all kinds, answer questions, give suggestions and whatever we need to do to make people’s project successful. A little while back I had a support ticket for a traversal that was taking longer than they hoped it would.

Think about a social network, one of the things you may want to do is tell the user how big their friends network is. But why stop there? How about their friends of friends or even friends of friends of friends network? These are the kind of questions graph databases excel at compared to relational databases. Let’s take a look at what they were doing:
Continue reading

Tagged , , , , , ,

Permission Resolution with Neo4j – Part 1

i_can_haz_permissions

People produce a lot of content. Messages, text files, spreadsheets, presentations, reports, financials, etc, the list goes on. Usually organizations want to have a repository of all this content centralized somewhere (just in case a laptop breaks, gets lost or stolen for example). This leads to some kind of grouping and permission structure. You don’t want employees seeing each other’s HR records, unless they work for HR, same for Payroll, or unreleased quarterly numbers, etc. As this data grows it no longer becomes easy to simply navigate and a search engine is required to make sense of it all.

But what if your search engine returns 1000 results for a query and the user doing the search is supposed to only have access to see 4 things? How do you handle this? Check the user permissions on each file realtime? Slow. Pre-calculate all document permissions for a user on login? Slow and what if new documents are created or permissions change between logins? Does the system scale at 1M documents, 10M documents, 100M documents?
Continue reading

Tagged , , , , ,

A Peek behind the Neo4j Lucene Index Curtain

wizard-of-oz

Did you know you can write Javascript in the Neo4j console to access the Neo4j API?
Try it. Open up your Neo4j Web Admin Console and type:

neo4j-sh (0)$ eval db
EmbeddedGraphDatabase [data/graph.db]

OMG! I know, Neo4j is crazy. So much to play with, I’ve been at it for a few years and I haven’t even dug into this area. What else can we do here?
Continue reading

Tagged , , , , ,

Setting up a Neo4j Cluster on Amazon

There are multiple ways to setup a Neo4j Cluster on Amazon Web Services (AWS) and I want to show you one way to do it.

Overview:

  1. Create a VPC
  2. Launch 1 Instance
  3. Install Neo4j HA
  4. Clone 2 Instances
  5. Configure the Instances
  6. Start the Coordinators
  7. Start the Neo4j Cluster
  8. Create 2 Load Balancers
  9. Next Steps

We’ll start off by logging on to Amazon Web Services and creating a Virtual Private Cloud:


Continue reading

Tagged , , , , , , ,

Pathfinding with Neo4j Unmanaged Extensions

In Extending Neo4j I showed you how to create an unmanaged extension to warm up the node and relationship caches. Let’s try doing something more interesting like exposing the A* (A Star) search algorithm through the REST API. The graph we created earlier looks like this:
Continue reading

Tagged , , , ,

Extending Neo4j

One of the great things about Neo4j is how easy it is to extend it. You can extend Neo4j with Plugins and Unmanaged Extensions. Two great examples of plugins are the Gremlin Plugin (which lets you use the Gremlin library with Neo4j) and the Spatial Plugin (which lets you perform spatial operations like searching for data within specified regions or within a specified distance of a point of interest).

Plugins are meant to extend the capabilities of the database, nodes, or relationships. Unmanaged extensions are meant to let you do anything you want. This great power comes with great responsibility, so be careful what you do here. David Montag cooked up an unmanaged extension template for us to use on github so lets give it a whirl. We are going to clone the project, compile it, download Neo4j, configure Neo4j to use the extension, test the extension and tweak it a bit.
Continue reading

Tagged , , , , , , , , , ,

Batch Importer – Part 3

At the end of February, we took a look at Michael Hunger’s Batch Importer. It is a great tool to load millions of nodes and relationships into Neo4j quickly. The only thing it was missing was Indexing… I say was, because I just submitted a pull request to add this feature. Let’s go through how it was done so you get an idea of what the Neo4j Batch Import API looks like, and in the next blog post I’ll show you how to generate data to take advantage of it.
Continue reading

Tagged , , , ,