Neo4j at Ludicrous Speed

In the last blog post we saw how we could get about 1,250 requests per second (with a 10ms latency) using an Unmanaged Extension running inside the Neo4j server… but what if we wanted to go faster?

The easy answer is to Scale Up. However, trying to add more cores to my Apple laptop doesn’t sound like a good time. Another answer is running a Neo4j Cluster and (almost) linearly scaling our read requests as we add more servers. So a 3 server cluster would give us between 3,500 and 3,750 requests per second.
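
As a back-of-the-envelope check on that range, here is a small sketch of the arithmetic. The class and method names are made up for illustration, and the 1,250 requests per second figure is simply the single-server number from the previous post; a real cluster will vary.

```java
/** Hypothetical helper for the cluster read-scaling estimate; not part of Neo4j. */
final class ReadScaling {

    /** Upper bound if reads scaled perfectly linearly with the number of servers. */
    static int linearCeiling(int servers, int perServerRps) {
        return servers * perServerRps;
    }

    /** Fraction of the linear ceiling that a given cluster throughput represents. */
    static double efficiency(int clusterRps, int servers, int perServerRps) {
        return (double) clusterRps / linearCeiling(servers, perServerRps);
    }

    public static void main(String[] args) {
        int ceiling = linearCeiling(3, 1250);
        System.out.println("Linear ceiling: " + ceiling + " req/s"); // Linear ceiling: 3750 req/s
        System.out.printf("3,500 req/s is %.0f%% of linear%n",
                100 * efficiency(3500, 3, 1250));
    }
}
```

So the low end of the range assumes we lose a few percent to cluster overhead, and the high end assumes we lose nothing.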

But can we go faster on a single server without new hardware? Well… yes.

Neo4j started its life as an embeddable Java library, and running embedded is still a valid way to deploy Neo4j. However, it’s mostly reserved for customers who use Java as their primary language. For us non-Java folks, let’s take a peek at one way to do this.

I was looking around for a really simple and performant Java web server and ran into Undertow.

Undertow looks really easy to use and scores high marks in the TechEmpower Web Framework Benchmarks.

We will re-use the graph database we created in the previous blog post, and just point to it. We’ll use the GraphDatabaseFactory to accomplish this:

    static GraphDatabaseService graphDb = new GraphDatabaseFactory()
            .newEmbeddedDatabaseBuilder( STOREDIR )
            .loadPropertiesFromFile( PATHTOCONFIG + "neo4j.properties" )
            .newGraphDatabase();

The first thing we’ll do is register a Shutdown Hook for our Neo4j database so it shuts down correctly when you stop the server. Next we’ll build our Undertow server, having it listen on the same port as Neo4j. We’ll have two paths here. The root path will simply say “Hello World”, which confirms everything is wired up and lets us test the baseline performance of Undertow on our system. The second path will match our Unmanaged Extension so we can reuse our performance test.

    public static void main(final String[] args) {
        registerShutdownHook(graphDb);
        Undertow server = Undertow.builder()
                .addListener(7474, "localhost")
                .setHandler(new PathHandler()
                        .addPath("/", new HelloWorldHandler())
                        .addPath("/example/service/crossreference", new CrossReferenceHandler(objectMapper, graphDb))
                ).build();

        server.start();
    }
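
The registerShutdownHook method isn’t shown above. A minimal sketch of the idea follows, using a plain Runnable as a stand-in for the database so it compiles without Neo4j on the classpath. The ShutdownHooks class and the hookFor helper are made up for this example.

```java
/** Hypothetical sketch: the Runnable stands in for graphDb.shutdown(). */
final class ShutdownHooks {

    /** Builds (but does not register) a thread that runs the shutdown action. */
    static Thread hookFor(final Runnable shutdownAction) {
        return new Thread(shutdownAction, "graphdb-shutdown");
    }

    /** Registers the action to run when the JVM exits normally or is interrupted (e.g. Ctrl-C). */
    static Thread registerShutdownHook(final Runnable shutdownAction) {
        Thread hook = hookFor(shutdownAction);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Assumption: in the real server this would shut down the graph database instead.
        registerShutdownHook(new Runnable() {
            @Override
            public void run() {
                System.out.println("Shutting down the database cleanly");
            }
        });
        System.out.println("Server running; exiting now triggers the hook");
    }
}
```

With the graphDb from earlier, the call would look something like `ShutdownHooks.registerShutdownHook(graphDb::shutdown)` on Java 8 or later.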

Let’s take a look at that HelloWorldHandler first. It handles the request by simply responding with the plain text Hello World.

public class HelloWorldHandler implements HttpHandler {
    @Override
    public void handleRequest(final HttpServerExchange exchange) throws Exception {
        exchange.getResponseHeaders().put(Headers.CONTENT_TYPE, "text/plain");
        exchange.getResponseSender().send("Hello World");
    }
}

We can test the performance of this request with a little Gatling:

  val base = scenario("Get Hello World")
    .during(30) {
      exec(
        http("Get Base Request")
          .get("/")
          .check(status.is(200))
      )
      .pause(0 milliseconds, 1 milliseconds)
  }

 setUp(
    base.users(16).protocolConfig(httpConf)
  )

…and our results are just under 20k requests per second:

[Screenshot: Gatling results for the Hello World request]

Now let’s take a look at our CrossReferenceHandler. You’ll notice we are passing in the graphDb we created earlier and the objectMapper. That isn’t strictly needed since we only have one endpoint, but if you had many you wouldn’t want to recreate those objects every time. In fact, Neo4j won’t let you, since it needs exclusive access to the graph.db directory.

public class CrossReferenceHandler implements HttpHandler {
    private static final RelationshipType RELATED = DynamicRelationshipType.withName("RELATED");
    ObjectMapper objectMapper;
    GraphDatabaseService graphDb;

    public CrossReferenceHandler(ObjectMapper objectMapper, GraphDatabaseService graphDb){
        this.objectMapper = objectMapper;
        this.graphDb = graphDb;
    }

When we handle the request, we first check that it is a POST, then get the InputStream of the request (a JSON blob) and convert it to a HashMap. The funny-looking startBlocking() call is required before we can read the InputStream.

    public void handleRequest(final HttpServerExchange exchange) throws Exception {
        try {
            if (exchange.getRequestMethod().equals(Methods.POST)) {
                exchange.startBlocking();
                final InputStream inputStream = exchange.getInputStream();
                final String body = new String(ByteStreams.toByteArray(inputStream), Charsets.UTF_8);
                HashMap input = objectMapper.readValue(body, HashMap.class);
                ...

From here our code follows exactly what our unmanaged extension did, and responds with the answer in JSON format:

exchange.getResponseHeaders().put(Headers.CONTENT_TYPE, "application/json; charset=utf-8");
exchange.getResponseSender().send(ByteBuffer.wrap(objectMapper.writeValueAsBytes(results)));       
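
If you want to poke at the two I/O edges of the handler without Undertow, Guava or Jackson on the classpath, here is a stdlib-only sketch. The RequestBodies class and its method names are hypothetical: readUtf8 plays the role of ByteStreams.toByteArray plus the UTF-8 decode, and toBuffer mirrors the ByteBuffer.wrap call used for the response. JSON parsing is left out on purpose, since the JDK has no built-in parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-ins for the request-reading and response-wrapping steps. */
final class RequestBodies {

    /** Drains the stream and decodes it as UTF-8 (what the handler does via Guava). */
    static String readUtf8(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Wraps an already-serialized JSON string as UTF-8 bytes ready to send. */
    static ByteBuffer toBuffer(String json) {
        return ByteBuffer.wrap(json.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Assumption: a made-up request body, not the payload from the performance test.
        byte[] raw = "{\"hello\":\"world\"}".getBytes(StandardCharsets.UTF_8);
        String body = readUtf8(new ByteArrayInputStream(raw));
        System.out.println(body + " -> " + toBuffer(body).remaining() + " bytes");
    }
}
```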

When we test this using the existing performance test we get:

[Screenshot: Gatling results for the cross-reference request]

Over 8,000 requests per second with a mean latency of just 1ms. That’s about 6.5x the number of requests we were able to do before… on my laptop. Now that’s fast! The complete source code is available on GitHub as always; try it out with your own Neo4j projects.

I’ll leave you with other options, including running Neo4j embedded with RatPack on Stefan Armbruster’s blog, and even more fun ideas from Nigel Small.

If you have a problem that is graphy in nature and your Relational Database just isn’t cutting it, you gotta try Neo4j.

4 thoughts on “Neo4j at Ludicrous Speed”

  1. Javad Karabi says:

    How would one scale horizontally, using this method? Essentially the neo4j store directory would have to be located in a network file system of some sort, correct?

    Also, what is the difference between using Undertow + embedded Neo4j and the web server that Neo4j comes with by default? Isn’t using an unmanaged extension effectively the same thing as having access to embedded Neo4j?
