I done wrestled with an alligator, I done tussled with a whale, only last week I murdered a rock, injured a stone, hospitalized a brick. I’m so mean I make medicine sick.
He floats like a butterfly and stings like a bee, but is Muhammad Ali truly the greatest? Greater than the Beatles? Greater than Alexander? Greater than Sliced Bread? Let’s find out.
We begin by requiring neography and creating a function to help us create the greats.
require 'rubygems' require 'neography' def create_great(name) Neography::Node.create("name" => name) end
There are a ton of greats out there, but we’ll keep it simple and create just 8 of the greatest greats.
game = create_great('The 1958 NFL Championship Game') brando = create_great('Marlon Brando') alex = create_great('Alexander the Great') circus = create_great('The Ringling Bros. and Barnum and Bailey') beatles = create_great('The Beatles') ali = create_great('Muhammad Ali') bread = create_great('Sliced Bread') gatsby = create_great('The Great Gatsby') greats = [game,brando,alex,circus,beatles,ali,bread,gatsby]
That’s a pretty good list. We can’t say one great is greater than another, but given any two greats, we can say that they are “As Great” as each other.
def as_great(great, other_greats) other_greats.each do |og| great.outgoing(:as_great) << og end end
So we will use this to our advantage and let fate (randomness) decide. We will randomly create “As Great” relationships between each great and up to 5 other greats, excluding themselves.
greats.each do |g| ogs = greats.select{|v| v != g }.sample(1 + rand(5)) as_great(g, ogs) end
Now that we have our 8 greats linked to each other we can determine who the greatest is. We do this by traversing the graph from each node, following our outgoing relationships and keeping track of which nodes we visit along the way.
def the_greatest neo = Neography::Rest.new neo.execute_script("m = [:]; c = 0; g. V. out. groupCount(m). loop(2){c++ < 1000}.iterate(); m.sort{a,b -> b.value <=> a.value}.keySet().name[0];") end puts "The greatest is #{the_greatest}"
Let’s break this down. We create a map “m” and a counter “c”.
m = [:]; c = 0;
Then from g (which if you remember my post about Gremlin is our graph). We are going to start at each of our vertices (or nodes).
g. V.
and follow the outgoing relationships.
out.
the .out step emits a node, which we stuff into m, and every time we see that node again, we increment m with the groupCount method.
groupCount(m).
The loop(2) means we want to loop over the last two steps ( .out.groupCount(m) ) and the closure following it {c++ < 1000} means we will keep looping until our counter reaches 1000. Finally we iterate() until its done.
loop(2){c++ < 1000}.iterate();
At this point “m” is going to look like this:
==> v[5]=304 ==> v[8]=304 ==> v[2]=415 ==> v[6]=757 ==> v[7]=284 ==> v[1]=198 ==> v[4]=260 ==> v[3]=431
Which represents the vertices and how many times we’ve traversed over them. What we want is to get the node we see the most, so we order them in descending order, and get the name of the first element. Don’t let the keySet method bother you, it just turns the keys of the map ( the vertices) into a set from which we get the name properties and grab just the first one.
m.sort{a,b -> b.value <=> a.value}.keySet().name[0];")
So finally we run :
puts "The greatest is #{the_greatest}"
and the answer is… Muhammad Ali. Yup, he is the greatest, confirmed. If you try this at home, you might find Alexander or maybe the “Greatest Show on Earth” turned up to be the greatest. It is in the hands of rand.
Centrality
So that’s fantastic, but why did we want to do this? Well, have you ever wondered who among your Facebook friends is the biggest social butterfly? Who is the most influential of your twitter followers? If the nodes were roads or places, then maybe which are the most traveled or the most used?
These are measures of Centrality which tell you how important a node is in a graph. Now you have what you need to build your own version of Klout or answer important questions about the data in your projects.
What tool did you use to render the graph visually (a.k.a great_graph.png) ?
Neo4j web admin comes with visualization baked in: http://blog.neo4j.org/2011/10/announcing-neo4j-15-boden-bord.html
did you know about this? http://neo4j-challenge.herokuapp.com/ might want to enter!
Yup, already entered: http://gensen.herokuapp.com/show/18 is the Neoflix Application. Are you planning to enter too? I’ll be glad to help.
looks great but how can you sort/filter when you have more than 10k nodes in a tarevrsal (let’s say users involved in some context of your social network)
@Mandi
If you only want to pass through nodes with some property of a certain value you can use .filter, for example people over 190 cm:
.out.filter{it.getProperty(‘height’)> 190}
[…] may remember my previous post on Centrality which showed you how to use Gremlin to find out who the Greatest was? If we try to use the same […]