Scaling Cypher Writes

[Image: salt-pepa-writes]

Let’s talk about writes, baby. Let’s talk about you and me. Let’s talk about all the good things. And the bad things that may be. Let’s talk about writes, and indexing, and batching, and transactions in Neo4j. Let’s start with my environment: a 3-year-old MacBook Pro (dying to get the new ones… once they finally come out) with a 4-core 2.3 GHz Intel Core i7 that hyper-threads and pretends to have 8 cores, and an Apple SM256E SSD that is about average as far as SSDs go. Definitely not a production-grade server, so bear that in mind.

We’ll use Gatling for our tests and simulate 8 concurrent app servers sending requests all at once for 30 seconds to Neo4j 2.3.2. Let’s hit the ground running by writing a ton of data per second into the database, using a trick Michael Hunger showed me.

FOREACH (r in range(1,100) | CREATE (_0 {type:'User'}), (_0)-[:REL { weight:1 }]->(_0))
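For the curious, here’s roughly how a query like that gets fired in a Gatling test. This is a minimal sketch, not the actual test code: the endpoint is the standard transactional Cypher endpoint, and the credentials and scenario name mirror the snippets further down.

  import io.gatling.core.Predef._
  import io.gatling.http.Predef._
  import scala.concurrent.duration._

  val foreachCypher = "FOREACH (r in range(1,100) | CREATE (_0 {type:'User'}), (_0)-[:REL { weight:1 }]->(_0))"

  // Hammer the transactional endpoint with the FOREACH trick for 30 seconds.
  val scn = scenario("FOREACH Trick")
    .during(30 seconds) {
      exec(
        http("write 100 nodes and relationships")
          .post("/db/data/transaction/commit")
          .basicAuth("neo4j", "swordfish")
          .body(StringBody(s"""{"statements" : [{"statement" : "$foreachCypher"}]}"""))
          .asJSON
          .check(status.is(200))
      )
    }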

[Image: 600k per second]

So this query writes 100 nodes, 100 node properties, 100 relationships, and 100 relationship properties per request, and we’re executing 1500 requests per second, for a total of 600k writes per second. That’s not too shabby; I think DJ Spinderella would be impressed. But let’s not kid ourselves, this is not a real-world scenario. How often do you create 100 identical nodes connected to themselves? Right, never. So we’ll try a more common scenario, where we create nodes with different user ids.

  val random = new util.Random
  val feeder = Iterator.continually(Map("user_id" -> random.nextInt()))

  val cypher = """CREATE ( me { user_id: {user_id} } )"""
  val statements = """{"statements" : [{"statement" : "%s", "parameters": {"user_id": %s}}]}"""
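Those two values plug into a scenario just like the extension test near the end of this post. For completeness, here’s a sketch of the wiring; the injection profile is my stand-in for the 8 simulated app servers, not the exact test code:

  val scn = scenario("Create Nodes")
    .during(30 seconds) {
      feed(feeder)                       // each virtual user pulls a fresh user_id
        .exec(
          http("create node")
            .post("/db/data/transaction/commit")
            .basicAuth("neo4j", "swordfish")
            .body(StringBody(statements.format(cypher, "${user_id}")))
            .asJSON
            .check(status.is(200))
        )
    }

  // 8 virtual users standing in for 8 concurrent app servers.
  setUp(scn.inject(atOnceUsers(8))).protocols(http.baseURL("http://localhost:7474"))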

[Image: create node]

Whoa, what happened? We’re down to around 8,000 nodes created per second. Each request is a full-blown transaction down to disk, complete with HTTP overhead (see Neo4j 3.0.0-M03 and Bolt, the new remoting protocol, for how we can speed things up). Still, that’s not too bad, but is that a realistic query? We’re creating unlabeled nodes, which is what we used to do in Neo4j < 2.0. Let’s add a label of “User” to see what happens.

val cypher = """CREATE ( me:User { user_id: {user_id} } )"""

[Image: create labeled node]

Oh no! We’re down to around 600 labeled nodes created per second. That’s a 13x drop from unlabeled nodes; what’s going on? Well, at the moment Neo4j handles those labels by adding the node id to a Lucene index (a compressed bitmap would probably be more efficient). But is this a realistic thing to do? Shouldn’t we be indexing those user_ids as well? Yeah, we should. We’ll have our Gatling test create a schema index just once, silently ignore that request in our result set, and try again.

  val setup = scenario("Create Index")
    .exec(http("Create Index")
      .post("/db/data/schema/index/User")
      .basicAuth("neo4j", "swordfish")
      .body(StringBody("""{"property_keys" : ["user_id"] } """))
      .asJSON
      .check(status.in(200, 409)) // 200 when created, 409 when it already exists
      .silent                     // keep this setup request out of the results
    )
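If you’d rather skip the REST call, the same schema index can be created once in Cypher (2.x syntax):

  CREATE INDEX ON :User(user_id)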

[Image: create indexed node]

OMGWTFBBQ! We’ve just cut our performance in half and are down to 300 indexed nodes created per second! What in the name of all that is transactional is going on around here? Well… Neo4j is now updating two indexes on every single request: the label index and the schema index on user_id.

[Image: refresh search manager]

Those Lucene refreshes are eating all of our performance. So what can we do about it? If you are a regular reader, you already know: put a queue in front of Neo4j, or put a queue inside Neo4j, in order to batch your writes. Let’s see what that would look like. What if we wrote 1000 nodes at a time?

  import scala.util.parsing.json.{JSONArray, JSONObject}

  val cypher = """CREATE ( me:User { user_id: {user_id} } )"""
  // Build one request body containing 1000 statements, each with its own user_id.
  val oneThousand = JSONArray.apply(
    List.fill(1000)(
      JSONObject.apply(
        Map("statement" -> cypher,
          "parameters" -> JSONObject.apply(
            Map("user_id" -> random.nextInt())))
      )
    )
  )
  val statements = JSONObject.apply(Map("statements" -> oneThousand))

[Image: create 1000 labeled indexed nodes]

Ok wow, we’re back in business. That’s 50 requests per second, but since each request is 1000 statements, that’s 50k indexed nodes created per second. A whopping 166x better than before. You can now see that refreshing two Lucene indexes 50 times a second with 1000 new entries each is much better than doing it one at a time. What happens if we remove the index?

[Image: create 1000 labeled nodes]

Trick question! We still have the index backing the labels, so not surprisingly our speed doubled to 100k labeled nodes created per second. Let’s just get rid of the label altogether and try it again.

[Image: create one thousand nodes]

Cool. Now we’re up to 165k nodes created per second. So it’s clear: we need to batch our writes if we want high throughput. Wouldn’t it be nice if there were some Unmanaged Extension that could do this for us? Introducing the Cypher batch writer.

[Image: cypher-batch-writer]

Let’s try the extension with an indexed node to measure its performance and then I’ll explain how it works.

  val scn = scenario("Create Indexed Nodes via Extension")
    .during(30 seconds) {
      feed(feeder)
        .exec(
          http("create indexed node via ext")
            .post("/v1/service/batch")
            .basicAuth("neo4j", "swordfish")
            .body(StringBody(statements.format(cypher, "${user_id}")))
            .asJSON
            .check(status.is(200))
        )
    }

[Image: create indexed nodes using extension]

Boo yeah! That’s 10,000 indexed nodes created per second, which is 33x faster than before, but not quite the 50k we had earlier. Two things are slowing us down. First, the extension has an array of background queues writing Cypher statements in batches once a second, but we can’t guarantee there will be an optimal number of statements in each batch (1000 per transaction would be good for my hardware). Second, we have HTTP overhead to contend with, so we can’t even get 50k requests per second to the Neo4j server to try. Maybe Bolt will help in this area. Stay tuned for a port of this Unmanaged Extension into a Cypher procedure.
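The real extension code is linked at the end of this post, but the core idea looks roughly like this. A sketch only: the queue count, batch size, and method names here are illustrative, not the extension’s actual API.

  import java.util.concurrent.{Executors, LinkedBlockingQueue, TimeUnit}

  // Writes land in bounded in-memory queues; a background thread drains each
  // queue once a second and writes whatever accumulated in one transaction.
  val queues = Array.fill(8)(new LinkedBlockingQueue[String](2000))
  val scheduler = Executors.newScheduledThreadPool(queues.length)

  queues.foreach { queue =>
    scheduler.scheduleAtFixedRate(new Runnable {
      def run(): Unit = {
        val batch = new java.util.ArrayList[String](1000)
        queue.drainTo(batch, 1000) // at most 1000 statements per transaction
        if (!batch.isEmpty) {
          // begin a transaction, execute each Cypher statement, commit
          // (the real extension does this through the embedded database API)
        }
      }
    }, 1, 1, TimeUnit.SECONDS)
  }

  // The HTTP endpoint just enqueues and returns; offer() fails fast
  // instead of blocking when a queue is full.
  def enqueue(statement: String): Boolean =
    queues(Math.floorMod(statement.hashCode, queues.length)).offer(statement)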

Bear in mind that 10k requests per second is just how many write requests we received; the queues still have to drain and write in batches. However, each queue is limited to 2000 items (see here), so at most we are 2000 × the number of queues behind in our writes, or about 16k on my laptop. Considering we wrote just under 300k nodes in that test, I think we’re fine. The caveat, of course, is that you can lose writes if someone pulls the plug while the queues are not empty. It may be worth switching out the LinkedBlockingQueues for Chronicle Queue to get persistence, and “Reading the Chronicle after a shutdown” for recovery.
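Something like this, assuming a recent Chronicle Queue API (the class and method names are my reading of its docs, not code from the extension):

  import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder

  // A persistent, memory-mapped queue in place of the LinkedBlockingQueue.
  val queue = SingleChronicleQueueBuilder.binary("cypher-queue").build()

  // Producers append Cypher statements; these survive a crash or restart.
  val appender = queue.acquireAppender()
  appender.writeText("CREATE ( me:User { user_id: 123 } )")

  // A named tailer remembers its position, so after a shutdown the drain
  // thread picks up exactly where it left off.
  val tailer = queue.createTailer("batch-writer")
  var statement = tailer.readText()
  while (statement != null) {
    // hand off to the batch writer...
    statement = tailer.readText()
  }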

Anyway, as always, the test code and extension code are on GitHub. Give it a shot and let me know what you think.


4 thoughts on “Scaling Cypher Writes”

  1. Dan Dutrow says:

    Sorry for what might be a newb question, but I would like to create an application that does what you have recommended in a few of your posts: one that reads off of a queue (like RabbitMQ/Kafka) and inserts writes in batches to Neo4j. I believe that creating an unmanaged extension is preferable to creating a client that runs an embedded Neo4j, in particular because I would also like to query Neo4j through REST from a variety of distributed clients, and I would like the installation and configuration of Neo4j to happen external to the application I write.

    Can the unmanaged extensions be set up to run without requiring a POST to trigger them? I really just want a headless application that consumes Cypher strings from the message queue in the background and writes them to Neo4j. The hope is that this will increase my insertion rate at least 10X, but hopefully 100X, over the transactional REST API.

    • maxdemarzi says:

      Sure, create a scheduled service that reads from the queue and set it up like the one linked in the source code on this post.

      • Dan Dutrow says:

        Thanks. It makes sense that the service starts up automatically and gets the context injected. Is there anything built into Neo4j Unmanaged Extensions for system property configuration (like setting RabbitMQ URL endpoints)? Can I use neo4j-server.properties? If not, where would be the best place to put config files, and what is the default root directory for unmanaged extensions?

      • Dan Dutrow says:

        A couple of things I have discovered:
        – The service constructor is not called at Neo4j start time. I’m still trying to figure out how to trigger it. Hopefully it doesn’t require a service call to kick it off, although that seems to be the default behavior. This is probably a JAX-RS question, not specific to Neo4j.
        – The service constructor is called every time the service is called, making the use of static instances imperative.
        – The conf directory is included in the classpath resources, so you can load properties with:

          InputStream inputStream = MyService.class.getClassLoader().getResourceAsStream(PROPERTIES_FILE);
          Properties props = new Properties();
          props.load(inputStream);
