Scheduling Meetings with Neo4j

One of the symptoms of any fast growing company is the lack of available meeting rooms. The average office worker gets immense satisfaction to their otherwise mundane workday when they get to kick someone else out of the meeting room they booked. Of course that joy can be cut short (along with their career) once realizing some unnoticed VIP was unceremoniously kicked out. It’s not a super exciting use case, but today I’m going to show you how to use Neo4j to perform some scheduling gymnastics.

Let’s start with what the data model looks like:

So we have a Person that sits in a Cubicle, that is located in a Floor that has meeting Rooms where Meetings are booked and these Meetings are attended by People. That’s a nice circle model right there. Let’s build an example with 3 people each in their own cubicle, 2 floors, 4 meeting rooms, 2 in each floor, and a bunch of meetings. We’ll also have one of the people booked in one of the existing meetings. We will use Longs for the times representing Unix Epoc Time in milliseconds. In Neo4j 3.4, we will have legitimate date and datetime data types so you will be able to create date times like: localdatetime({year:1984, month:10, day:11, hour:12, minute:31, second:14}) instead of this hot mess, but regardless here is the Cypher for this example:

 
CREATE (person1:Person {name: "Max"})
CREATE (person2:Person {name: "Alex"})
CREATE (person3:Person {name: "Andrew"})
CREATE (cube1A:Cubicle {name: "F1A"})
CREATE (cube1B:Cubicle {name: "F1B"})
CREATE (cube2A:Cubicle {name: "F2A"})
CREATE (floor1:Floor {name: "Floor 1"})
CREATE (floor2:Floor {name: "Floor 2"})
CREATE (person1)-[:SITS_IN]->(cube1A)
CREATE (person2)-[:SITS_IN]->(cube1B)
CREATE (person3)-[:SITS_IN]->(cube1C)
CREATE (cube1A)-[:LOCATED_IN]->(floor1)
CREATE (cube1B)-[:LOCATED_IN]->(floor1)
CREATE (cube1C)-[:LOCATED_IN]->(floor2)
CREATE (room1:Room {name:"Room 1"})
CREATE (room2:Room {name:"Room 2"})
CREATE (room3:Room {name:"Room 3"})
CREATE (room4:Room {name:"Room 4"})
CREATE (room1)-[:LOCATED_IN]->(floor1)
CREATE (room2)-[:LOCATED_IN]->(floor1)
CREATE (room3)-[:LOCATED_IN]->(floor2)
CREATE (room4)-[:LOCATED_IN]->(floor2)
CREATE (m1:Meeting {start_time: 1521534600000, end_time:1521538200000}) // 8:30-9:30am
CREATE (m2:Meeting {start_time: 1521543600000, end_time:1521550800000}) // 11-1pm
CREATE (m3:Meeting {start_time: 1521550800000, end_time:1521558000000}) // 1-3pm
CREATE (m4:Meeting {start_time: 1521534600000, end_time:1521543600000}) // 8:30-11am
CREATE (m5:Meeting {start_time: 1521550800000, end_time:1521554400000}) // 1-2pm
CREATE (m6:Meeting {start_time: 1521561600000, end_time:1521565200000}) // 4-5pm
CREATE (m7:Meeting {start_time: 1521558000000, end_time:1521561600000}) // 3-4pm
CREATE (room1)-[:IS_BOOKED_ON_2018_03_20]->(m1)
CREATE (room1)-[:IS_BOOKED_ON_2018_03_20]->(m2)
CREATE (room1)-[:IS_BOOKED_ON_2018_03_20]->(m3)
CREATE (room2)-[:IS_BOOKED_ON_2018_03_20]->(m4)
CREATE (room2)-[:IS_BOOKED_ON_2018_03_20]->(m5)
CREATE (room2)-[:IS_BOOKED_ON_2018_03_20]->(m6)
CREATE (room4)-[:IS_BOOKED_ON_2018_03_20]->(m7)
CREATE (person2)-[:HAS_MEETING_ON_2018_03_20]->(m7)

This cypher script creates this lovely set of data:

By looking at it we can see how it is all connected, but it is not immediately obvious what times in what rooms people are able to meet. That is going to be our question. Give a set of meeting attendees and a datetime range, find me the available meeting times in the rooms that are in the same floor as at least one of the attendees. Let’s try building this query together once piece at a time. So the first thing I want to do is find out what times ranges I have to eliminate because one of the attendees is already booked on another meeting.

 
MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
RETURN p, m
 
╒═════════════════╤═════════════════════════════════════════════════════╕
│"p"              │"m"                                                  │
╞═════════════════╪═════════════════════════════════════════════════════╡
│{"name":"Max"}   │null                                                 │
├─────────────────┼─────────────────────────────────────────────────────┤
│{"name":"Alex"}  │{"end_time":1521561600000,"start_time":1521558000000}│
├─────────────────┼─────────────────────────────────────────────────────┤
│{"name":"Andrew"}│null                                                 │
└─────────────────┴─────────────────────────────────────────────────────┘

So it looks like Alex is busy from 3-4pm. Next we need to figure out where everyone sits, what floor they are in, and what rooms we are able to meet in. So our query looks like this:

 
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
RETURN DISTINCT r
ORDER BY r.name

This gets us Rooms 1-4 as expected:

 
╒═════════════════╕
│"r"              │
╞═════════════════╡
│{"name":"Room 1"}│
├─────────────────┤
│{"name":"Room 2"}│
├─────────────────┤
│{"name":"Room 3"}│
├─────────────────┤
│{"name":"Room 4"}│
└─────────────────┘

Ok, so far so good. Now we need to know if those rooms have already been booked for other meetings today.

 
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, m ORDER BY m.start_time
RETURN r.name, COLLECT(DISTINCT m) AS meetings
ORDER BY r.name

This query tells us that Room 1 has 3 meetings scheduled, Room 2 has 3 as well, Room 3 is wide open, and Room 4 just has one. But it is really hard to see the actual times since they are shown as longs.

 
╒════════╤══════════════════════════════════════════════════════════════════════╕
│"r.name"│"meetings"                                                            │
╞════════╪══════════════════════════════════════════════════════════════════════╡
│"Room 1"│[{"end_time":1521538200000,"start_time":1521534600000},{"end_time":152│
│        │1550800000,"start_time":1521543600000},{"end_time":1521558000000,"star│
│        │t_time":1521550800000}]                                               │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 2"│[{"end_time":1521543600000,"start_time":1521534600000},{"end_time":152│
│        │1554400000,"start_time":1521550800000},{"end_time":1521565200000,"star│
│        │t_time":1521561600000}]                                               │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 3"│[]                                                                    │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 4"│[{"end_time":1521561600000,"start_time":1521558000000}]               │
└────────┴──────────────────────────────────────────────────────────────────────┘

If you are using the Neo4j APOC plugin we can use the “apoc.date.format” function to make them friendlier. In Neo4j 3.4 you will be able to use a built in function “datetime.FromEpochMillis” for the same thing.

 
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, m ORDER BY m.start_time
RETURN r.name, EXTRACT (x IN COLLECT(DISTINCT m) | apoc.date.format(x.start_time,'ms','HH:mm') 
                                            + ' to ' + apoc.date.format(x.end_time,'ms','HH:mm')) AS meetings
ORDER BY r.name

Here we go, now that is way more readable:

 
╒════════╤════════════════════════════════════════════════════╕
│"r.name"│"meetings"                                          │
╞════════╪════════════════════════════════════════════════════╡
│"Room 1"│["08:30 to 09:30","11:00 to 13:00","13:00 to 15:00"]│
├────────┼────────────────────────────────────────────────────┤
│"Room 2"│["08:30 to 11:00","13:00 to 14:00","16:00 to 17:00"]│
├────────┼────────────────────────────────────────────────────┤
│"Room 3"│[]                                                  │
├────────┼────────────────────────────────────────────────────┤
│"Room 4"│["15:00 to 16:00"]                                  │
└────────┴────────────────────────────────────────────────────┘

Alright, let’s combine the two two queries together and see what rooms we can meet in and what times we can’t meet in those rooms because they are either already booked, or one of our attendees is busy:

 
MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r, occupied
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, COLLECT(DISTINCT m) + occupied AS meetings
UNWIND meetings AS m
WITH r, m ORDER BY m.start_time
RETURN r.name, EXTRACT (x IN COLLECT(m) | apoc.date.format(x.start_time,'ms','HH:mm') 
                               + ' to ' + apoc.date.format(x.end_time,'ms','HH:mm')) AS meetings
ORDER BY r.name
 
╒════════╤═════════════════════════════════════════════════════════════════════╕
│"r.name"│"meetings"                                                           │
╞════════╪═════════════════════════════════════════════════════════════════════╡
│"Room 1"│["08:30 to 09:30","11:00 to 13:00","13:00 to 15:00","15:00 to 16:00"]│
├────────┼─────────────────────────────────────────────────────────────────────┤
│"Room 2"│["08:30 to 11:00","13:00 to 14:00","15:00 to 16:00","16:00 to 17:00"]│
├────────┼─────────────────────────────────────────────────────────────────────┤
│"Room 3"│["15:00 to 16:00"]                                                   │
├────────┼─────────────────────────────────────────────────────────────────────┤
│"Room 4"│["15:00 to 16:00","15:00 to 16:00"]                                  │
└────────┴─────────────────────────────────────────────────────────────────────┘

Now, we could stop here and let our Application mark those times as unavailable and call it a day. But what we really want is the opposite of that. We want the times that the rooms and attendees are available. So how do we figure that out? Well for each meeting, we want to find the next meeting start time for each room. The time slot between meetings is what we are after defined by the entry’s end time and the start time of the next event. To perform this, we are going to use a double-unwind which is basically “for each thing in the list, I want to pair it (get a cross product) with every other thing in the list”. Typically this is the last thing you want to do since making a cross product can be very expensive, but it makes perfect sense for this query. We only care about the times where one meeting start time is greater than or equal to the other end time, and from these we will grab our time slot as the query below shows:

 
MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r, occupied
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, [{start_time:1521565200000, end_time:1521534600000}] + COLLECT(m) + occupied AS meetings
UNWIND meetings AS m
WITH r, [min(m.start_time), max(m.end_time)] AS rslot, COLLECT(m) AS mm
WITH r, rslot, mm
UNWIND mm AS m1
UNWIND mm AS m2
WITH r, rslot, m1, m2 WHERE (m2.start_time >= m1.end_time)
WITH r, rslot, [m1.end_time, min(m2.start_time)] AS slot
ORDER BY slot[0]
RETURN r.name, EXTRACT (x IN COLLECT(slot) | apoc.date.format(x[0],'ms','HH:mm') 
                               + ' to ' + apoc.date.format(x[1],'ms','HH:mm')) AS available
ORDER BY r.name

Our output looks close, but it’s not quite there. Rooms 3 and 4 look correct, but for Room 1 and 2 we have start time and end times that are the same:

 
╒════════╤══════════════════════════════════════════════════════════════════════╕
│"r.name"│"available"                                                           │
╞════════╪══════════════════════════════════════════════════════════════════════╡
│"Room 1"│["08:30 to 08:30","09:30 to 11:00","13:00 to 13:00","15:00 to 15:00","│
│        │16:00 to 17:00"]                                                      │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 2"│["08:30 to 08:30","11:00 to 13:00","14:00 to 15:00","16:00 to 16:00","│
│        │17:00 to 17:00"]                                                      │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 3"│["08:30 to 15:00","16:00 to 17:00"]                                   │
├────────┼──────────────────────────────────────────────────────────────────────┤
│"Room 4"│["08:30 to 15:00","16:00 to 17:00"]                                   │
└────────┴──────────────────────────────────────────────────────────────────────┘

So let’s fix that by not allowing any slots that start and end at the same time:

 
MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)-[:SITS_IN]->(c:Cubicle)-[:LOCATED_IN]->(f:Floor)<-[:LOCATED_IN]-(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH r, occupied
OPTIONAL MATCH (r)-[:IS_BOOKED_ON_2018_03_20]->(m:Meeting)
WITH r, [{start_time:1521565200000, end_time:1521534600000}] + COLLECT(m) + occupied AS meetings
UNWIND meetings AS m
WITH r, [min(m.start_time), max(m.end_time)] AS rslot, COLLECT(m) AS mm
WITH r, rslot, mm
UNWIND mm AS m1
UNWIND mm AS m2
WITH r, rslot, m1, m2 WHERE (m2.start_time >= m1.end_time)
WITH r, rslot, [m1.end_time, min(m2.start_time)] AS slot
ORDER BY slot[0]
WITH r, [[1521534600000, rslot[0]]] + collect(slot) + [[rslot[1], 1521565200000]] AS open
WITH r, filter(x IN open WHERE x[0]<>x[1]) AS available
UNWIND available AS dups
WITH r, COLLECT(DISTINCT dups) AS tslots
RETURN r.name AS Room , EXTRACT (x IN tslots | apoc.date.format(x[0],'ms','HH:mm') 
                                    + ' to ' + apoc.date.format(x[1],'ms','HH:mm')) AS Available
ORDER BY r.name

…and there we go:

 
╒════════╤═══════════════════════════════════╕
│"Room"  │"Available"                        │
╞════════╪═══════════════════════════════════╡
│"Room 1"│["09:30 to 11:00","16:00 to 17:00"]│
├────────┼───────────────────────────────────┤
│"Room 2"│["11:00 to 13:00","14:00 to 15:00"]│
├────────┼───────────────────────────────────┤
│"Room 3"│["08:30 to 15:00","16:00 to 17:00"]│
├────────┼───────────────────────────────────┤
│"Room 4"│["08:30 to 15:00","16:00 to 17:00"]│
└────────┴───────────────────────────────────┘

Pretty neat right? To be totally honest, I didn’t come up with this query by myself. I had a ton of help from Alex Price and Andrew Bowman.

I asked Michael Hunger, and he had another idea ordering the meeting times and using lists and ranges instead of a double unwind to get the same answer. Here he is also using apoc.date.parse(‘2018-03-20 08:30:00’) instead of 1521534600000 to make the query more readable. Yes, these dates will be much nicer to work with in Neo4j 3.4… I can’t wait either.

 
MATCH (p:Person)
WHERE p.name IN ["Max", "Alex", "Andrew"]
OPTIONAL MATCH (p)-[:HAS_MEETING_ON_2018_03_20]->(m:Meeting)
WITH COLLECT(m) AS occupied
MATCH (p:Person)‐[:SITS_IN]‐>(c:Cubicle)‐[:LOCATED_IN]‐>(f:Floor)<‐[:LOCATED_IN]‐(r:Room)
WHERE p.name IN ["Max", "Alex", "Andrew"]
WITH DISTINCT r, occupied
OPTIONAL MATCH (r)‐[:IS_BOOKED_ON_2018_03_20]‐>(m:Meeting)
WITH r, occupied + COLLECT(m {.start_time, .end_time}) AS meetings
UNWIND meetings AS m
WITH r, m order by m.start_time
WITH r, COLLECT(m) as meetings
WITH r,meetings,
               {end_time:apoc.date.parse('2018-03-20 08:30:00')} + 
               meetings + 
               {start_time:apoc.date.parse('2018-03-20 17:00:00')} AS bookedSlots
WITH r, meetings,[idx in range(0,size(bookedSlots)-2) | {start_time:(bookedSlots[idx]).end_time,end_time:(bookedSlots[idx+1]).start_time}] as allSlots
WITH r, meetings,[slot IN allSlots WHERE slot.end_time - slot.start_time > 10*60*1000] as openSlots
WITH r, [slot IN openSlots WHERE 
              NONE(m IN meetings WHERE slot.start_time < m.start_time < slot.end_time 
                                    OR slot.start_time < m.end_time < slot.end_time)] as freeSlots
RETURN r, [slot IN freeSlots | apoc.date.format(slot.start_time,'ms','HH:mm')+" to "+apoc.date.format(slot.end_time,'ms','HH:mm')] as free
ORDER BY r.name;

If you want expert help with your Cypher queries (and anything else Neo4j) be sure to join our Neo4j Users Slack Group where over 7500 Neo4j users hang out.

Tagged , , , , , , , , ,

2 thoughts on “Scheduling Meetings with Neo4j

  1. […] infamous Max De Marzi demonstrates how to use Neo4j for a common meeting room scheduling task. Quite impressive Cypher queries in […]

  2. […] infamous Max De Marzi demonstrates how to use Neo4j for a common meeting room scheduling task. Quite impressive Cypher queries in […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s

%d bloggers like this: