Facebook Graph Search with Cypher and Neo4j

Update: Facebook has disabled this application

Your app is replicating core Facebook functionality.

Facebook Graph Search has given the Graph Database community a simpler way to explain what it is we do and why it matters. I wanted to drive the point home by building a proof of concept of how you could do this with Neo4j. However, I don’t have six months or much experience with NLP (natural language processing). What I do have is Cypher. Cypher is Neo4j’s graph language and it makes it easy to express what we are looking for in the graph. I needed a way to take “natural language” and create Cypher from it. This was going to be a problem.

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

It’s an old programmer joke, but that is what came to mind: some kind of fuzzy regular expressions. In the iPhone world, we usually hear people say “There’s an app for that”. In the Ruby world, we go with “There’s a gem for that”… so I asked Google for some help and came upon Semr.

Semr is the gateway drug framework to supporting natural language processing in your application. Its goal is to follow the 80/20 rule where 80% of what you want to express in a DSL is possible in a familiar way to how developers normally solve solutions. (Note: There are other more flexible solutions but they also come with a higher learning curve, e.g. Treetop)

Awesome, a ray of light to solve my problem… but the gem is 4 years old. I could not get it to install. Bummer… Wait, what was that about Treetop?

Treetop is a language for describing languages. Combining the elegance of Ruby with cutting-edge parsing expression grammars, it helps you analyze syntax with revolutionary ease.

Score! Now I had no idea how to write a proper language grammar, but that’s never stopped anyone before. Someone who has more than a couple hours of experience with Treetop is going to laugh at this but I’ll show you part of what I did:

rule friends
  "friends" <Friends>
end

rule likes
  "who like" <Likes>
end

rule likeand
  likes space thing space "and" space thing <LikeAnd>
end

rule thing
  [a-zA-Z0-9]+ <Thing>
end

I am creating rules for things, for the likes relationship, and for the idea of “likes this and that”.
The “natural language” input is run through these rules, and a syntax tree is generated from the rules that match. The nodes of that tree are then turned into hashes representing pieces of Cypher. Looking at the code above and below, you can see how “friends who like Neo4j” gets parsed into Friends, Likes, Thing.

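  # Each syntax node contributes one piece of the final Cypher query as a hash.

  # "friends": start at the current user and follow outgoing friends relationships.
  class Friends < Treetop::Runtime::SyntaxNode
    def to_cypher
      { :start  => "me = node({me})",
        :match  => "me -[:friends]-> people",
        :return => "people",
        :params => { "me" => nil } }
    end
  end

  # "who like": require a likes relationship from those people to a thing.
  class Likes < Treetop::Runtime::SyntaxNode
    def to_cypher
      { :match => "people -[:likes]-> thing" }
    end
  end

  # The liked thing: look it up in the "things" index using the matched text as its name.
  class Thing < Treetop::Runtime::SyntaxNode
    def to_cypher
      { :start  => "thing = node:things({thing})",
        :params => { "thing" => "name: " + text_value } }
    end
  end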
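
To make the mapping from sentence to fragments concrete without installing Treetop, here is a minimal plain-Ruby sketch. It is not the Treetop parser from this post: `FRAGMENT_RULES` and `fragments_for` are hypothetical names, and the rules are tried in order with a `StringScanner` from Ruby's standard library. The fragment hashes themselves are the ones returned by the classes above.

```ruby
require "strscan"

# Hypothetical stand-in for the grammar: each entry pairs a pattern with a
# lambda that builds the same hash the matching syntax node would return.
FRAGMENT_RULES = [
  [/friends\b/, ->(_text) {
    { :start  => "me = node({me})",
      :match  => "me -[:friends]-> people",
      :return => "people",
      :params => { "me" => nil } }
  }],
  [/who like\b/, ->(_text) { { :match => "people -[:likes]-> thing" } }],
  [/[a-zA-Z0-9]+/, ->(text) {
    { :start  => "thing = node:things({thing})",
      :params => { "thing" => "name: " + text } }
  }]
].freeze

# Walk the sentence left to right and collect one fragment per matched rule.
def fragments_for(sentence)
  scanner = StringScanner.new(sentence)
  fragments = []
  until scanner.eos?
    scanner.skip(/\s+/)
    break if scanner.eos?
    pattern, builder = FRAGMENT_RULES.find { |pat, _| scanner.match?(pat) }
    raise ArgumentError, "cannot parse: #{scanner.rest}" unless pattern
    fragments << builder.call(scanner.scan(pattern))
  end
  fragments
end

fragments_for("friends who like Neo4j").each { |fragment| p fragment }
```

Rule order matters in this sketch: the catch-all "thing" pattern comes last so that it does not swallow the word "friends". A real grammar handles that kind of ambiguity far more gracefully.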

Then these hashes are combined and turned into a proper Cypher string:

  class Expression < Treetop::Runtime::SyntaxNode
    def to_cypher
      cypher_hash =  self.elements[0].to_cypher
      cypher_string = ""
      cypher_string << "START "   + cypher_hash[:start].uniq.join(", ")
      cypher_string << " MATCH "  + cypher_hash[:match].uniq.join(", ") unless cypher_hash[:match].empty?
      cypher_string << " RETURN DISTINCT " + cypher_hash[:return].uniq.join(", ")
      params = cypher_hash[:params].empty? ? {} : cypher_hash[:params].uniq.inject {|a,h| a.merge(h)}
      return [cypher_string, params].compact
    end
  end
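
Note that Expression calls uniq and join on each value, so by this point the per-node hashes must already have been merged into one hash of arrays. That merging step is not shown in this post; the following is a minimal sketch of how it could work, with `merge_fragments` and `build_cypher` as hypothetical helper names and the string assembly copied from Expression.

```ruby
# Hypothetical helper: gather every fragment's value for a key into an array,
# which is the shape Expression#to_cypher expects.
def merge_fragments(fragments)
  merged = Hash.new { |hash, key| hash[key] = [] }
  fragments.each do |fragment|
    fragment.each { |key, value| merged[key] << value }
  end
  merged
end

# Same assembly logic as Expression#to_cypher, applied to the merged hash.
def build_cypher(merged)
  cypher = "START " + merged[:start].uniq.join(", ")
  cypher << " MATCH " + merged[:match].uniq.join(", ") unless merged[:match].empty?
  cypher << " RETURN DISTINCT " + merged[:return].uniq.join(", ")
  params = merged[:params].uniq.inject({}) { |all, h| all.merge(h) }
  [cypher, params]
end

# The three fragments for "friends who like Neo4j", as returned by
# Friends, Likes and Thing.
fragments = [
  { :start  => "me = node({me})",
    :match  => "me -[:friends]-> people",
    :return => "people",
    :params => { "me" => nil } },
  { :match => "people -[:likes]-> thing" },
  { :start  => "thing = node:things({thing})",
    :params => { "thing" => "name: Neo4j" } }
]

cypher, params = build_cypher(merge_fragments(fragments))
puts cypher
p params
```

Under these assumptions the query comes out as START me = node({me}), thing = node:things({thing}) MATCH me -[:friends]-> people, people -[:likes]-> thing RETURN DISTINCT people, with the "me" parameter still nil and waiting to be filled in.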

Finally, I built a Sinatra web application that imports your data from Facebook and provides a search page so you can try this out for yourself. As always, the code is available on GitHub, and the app is hosted on Heroku.

While reproducing a “kinda” Facebook Graph Search is interesting, what would be more interesting is seeing other people use this idea on their own data. If you would like to know more about this proof of concept, contact me or come to one of the Neo4j Meetups in Virginia (Feb 26th), Boston (Feb 28th), Chicago (TBD), or somewhere near you.

17 thoughts on “Facebook Graph Search with Cypher and Neo4j”

  1. Very cool, Max! Thanks for making the source open too!

    Hope to meet you at http://info.neotechnology.com/0226-dc-register.html

  2. Ben says:

    Is there an equivalent Java framework to Treetop that you would suggest?

  3. Anon Coward says:

    Hi. Playing around with the demo – are your background tasks running? No data seems to have been loaded 30m into it.

    • maxdemarzi says:

      Yeah… got a little too popular there and it had thousands of jobs queued up. I killed them all, so you should be able to try it now.

  6. Awesome stuff. I was trying to understand the technology behind the scene. I like reading about different tech-stacks. Thanks for sharing the code and the post.

  9. Is it possible to do this using Python’s Natural Language Toolkit?

  10. Daniel Corbett says:

    Hmmm… Login through facebook not working at the moment.

    • maxdemarzi says:

      Had a ton of visitors today. Try it again, or try the demo locally by cloning the GitHub repository.
