maxdemarzi

Oct 11 2012

Hubway Data Visualization Challenge with Neo4j

Michael Hunger imported the Hubway Challenge dataset into a Neo4j graph database and made it available for us to play with.

Tagged cypher, d3.js, graph database, heroku, javascript, neo4j, relationship graph, visualization

Oct 03 2012

1 Comment

Problems, Random

Hunting Trolls with Neo4j!

Allison Sparrow shared a link to Patentula, a company interested in finding better ways to explore patent data and hunt patent trolls. What caught my attention is this quote from the video below:

What we tried to do with it, is bypass any sort of keyword processing in order to find similar patents. The reason we’ve done this is to avoid the problems encountered by other systems that rely on natural language processing or semantic analysis simply because patents are built to avoid detection by similar keywords…we use network topology (specifically citation network topology) to mine the US patent database in order to predict similar documents.


Tagged graph database, network, network topology, patent, relationship graph

Sep 07 2012

1 Comment

Problems, Random

Networks, Crowds, and Markets

I’ve had “Networks, Crowds, and Markets: Reasoning About a Highly Connected World” by David Easley and Jon Kleinberg on my bookshelf for a few months now, and a conversation with a client reminded me that I hadn’t finished reading it (barely started, really). It is available from Cambridge University Press, but also on the web and in PDF format.

Tagged centrality, cluster, graph, graph database, max flow, network, pagerank, six degrees of kevin bacon

Aug 17 2012

23 Comments

Cypher, Heroku, Neography, Visualization

NeoSocial: Connecting to Facebook with Neo4j

Social applications and Graph Databases go together like peanut butter and jelly. I’m going to walk you through the steps of building an application that connects to Facebook, pulls your friends and likes data, and visualizes it. I plan on making a video of me coding it one line at a time, but for now let’s just focus on the main elements.

Tagged d3.js, github, graph database, heroku, javascript, neo4j, relationship graph, ruby, visualization

Aug 16 2012

2 Comments

Random

Getting a Big Neo4j Test Box for Cheap!

When embarking on a new Neo4j project, one of the things you have to figure out is where to run it. Most of the time the answer is just your laptop. Other times, using Heroku works great. However, if you are at the stage of your testing where you have billions of nodes and relationships, you need something a little bigger.

If you are not ready to commit to purchasing a 100k server for testing, then I suggest you borrow one for a short time. You can try to spin up an Amazon EC2 instance; the large high-memory ones go up to 60 gigs of RAM. But what if you need more? Lots more?

Tagged big data, graph database, neo4j

Aug 13 2012

6 Comments

Random

Neo4j Internals

An overview of Neo4j Internals

View more presentations from Tobias Lindaaker

It is interesting to see how node, relationship and property records are stored differently on disk and in the cache.

On disk, it is all linked lists of fixed-size records. Properties are stored as a linked list of property records, each holding a key and value and pointing to the next property. Each node and relationship references its first property record. Each node also references the first relationship in its relationship chain. Each relationship references its start and end node, as well as the previous and next relationship records for the start node and for the end node.
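To make the pointer layout concrete, here is a minimal in-memory sketch in Python. It is not Neo4j's actual store code: the class names, field names, and the use of dictionaries keyed by record id are all assumptions made for illustration, and the real records are fixed-size byte layouts on disk. It only mirrors the references described above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, Optional


@dataclass
class PropertyRecord:
    key: str
    value: Any
    next_prop: Optional[int] = None   # id of the next property in the chain


@dataclass
class NodeRecord:
    first_rel: Optional[int] = None   # id of the first relationship in the chain
    first_prop: Optional[int] = None  # id of the first property record


@dataclass
class RelationshipRecord:
    start_node: int
    end_node: int
    start_prev: Optional[int] = None  # previous/next in the start node's chain
    start_next: Optional[int] = None
    end_prev: Optional[int] = None    # previous/next in the end node's chain
    end_next: Optional[int] = None
    first_prop: Optional[int] = None


def relationships_of(node_id: int,
                     nodes: Dict[int, NodeRecord],
                     rels: Dict[int, RelationshipRecord]) -> Iterator[int]:
    """Walk a node's relationship chain by following the 'next' pointers."""
    rel_id = nodes[node_id].first_rel
    while rel_id is not None:
        yield rel_id
        rel = rels[rel_id]
        # Each relationship sits in two chains; follow the one for this node.
        rel_id = rel.start_next if rel.start_node == node_id else rel.end_next


def properties_of(first_prop: Optional[int],
                  props: Dict[int, PropertyRecord]) -> Dict[str, Any]:
    """Collect key/value pairs by walking the property linked list."""
    result: Dict[str, Any] = {}
    prop_id = first_prop
    while prop_id is not None:
        record = props[prop_id]
        result[record.key] = record.value
        prop_id = record.next_prop
    return result


# Hypothetical data: relationship 0 goes 0 -> 1, relationship 1 goes 2 -> 0.
props = {0: PropertyRecord("name", "Max", next_prop=1),
         1: PropertyRecord("since", 2012)}
nodes = {0: NodeRecord(first_rel=0, first_prop=0),
         1: NodeRecord(first_rel=0),
         2: NodeRecord(first_rel=1)}
rels = {0: RelationshipRecord(start_node=0, end_node=1, start_next=1),
        1: RelationshipRecord(start_node=2, end_node=0, end_prev=0)}

node0_rels = list(relationships_of(0, nodes, rels))
node0_props = properties_of(nodes[0].first_prop, props)
```

Finding all relationships of a node is then a walk along one chain, with no index lookup, which is the point of storing the previous and next pointers per endpoint.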

Tagged graph database, neo4j, nosql

Aug 10 2012

11 Comments

Problems, Random

Summarize Opinions with a Graph – Part 1

How does the saying go? Opinions are like bellybuttons, everybody’s got one? So let’s say you have an opinion that NOSQL is not for you. Maybe you read my blog and think this Graph Database stuff is great for recommendation engines and path finding and maybe some other stuff, but you got really hard problems and it can’t help you.

I am going to try to show you that a graph database can help you solve your really hard problems if you can frame your problem in terms of a graph. Did I say “you”? I meant anybody, especially Ph.D. students. One trick is to search for “graph based approach to” and your problem.

Tagged graph, nosql, word graph

Jul 18 2012

1 Comment

Cypher

HCIR 2012

A tweet from RiparianData caught my eye the other day:

https://twitter.com/RiparianData/status/222319315800698880

I built getvouched.com with this idea of “expert and expertise discovery” using skill-based vouching, adjusted by the distance from searcher to target, as a way to determine rank. So I dug in and found out that Human-Computer Information Retrieval (HCIR) combines research from the fields of human-computer interaction (HCI) and information retrieval (IR), placing an emphasis on human involvement in search activities.

The HCIR challenge for this year’s symposium includes “hiring,” “assembling a conference program,” and “finding people to deliver patent research or expert testimony” as summarized by Patrick Durusau.
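The distance-adjusted vouching idea mentioned above can be sketched roughly as follows. This is not how getvouched.com actually scores people: the function names, the sample data, and in particular the choice to divide the vouch count by the hop distance are assumptions made purely for illustration.

```python
from collections import deque
from typing import Dict, List, Tuple


def distances_from(source: str, friends: Dict[str, List[str]]) -> Dict[str, int]:
    """Breadth-first search: hop count from the searcher to everyone reachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for other in friends.get(person, ()):
            if other not in dist:
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist


def rank_experts(searcher: str,
                 skill: str,
                 friends: Dict[str, List[str]],
                 vouches: Dict[Tuple[str, str], int]) -> List[Tuple[str, float]]:
    """Rank people vouched for `skill`, discounting by distance (assumed 1/d)."""
    dist = distances_from(searcher, friends)
    scored = []
    for (person, vouched_skill), count in vouches.items():
        if vouched_skill != skill or person == searcher or person not in dist:
            continue  # wrong skill, the searcher themselves, or unreachable
        scored.append((person, count / dist[person]))
    # Highest score first; ties broken by name for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical social graph and vouch counts.
friends = {"me": ["ann", "bob"], "ann": ["me", "cat"],
           "bob": ["me"], "cat": ["ann"]}
vouches = {("ann", "neo4j"): 3, ("cat", "neo4j"): 4, ("bob", "ruby"): 5}

ranking = rank_experts("me", "neo4j", friends, vouches)
```

In this toy data, "cat" has more vouches than "ann" but sits two hops away, so the closer expert ranks first. Any other decay function could be swapped in for the division.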

Tagged cypher, github, graph database, hcir, neo4j, relationship graph

Jul 03 2012

1 Comment

Neography, Random

Graph Generator

Update: Code for this project is available on GitHub.

In the US Air Guitar Championships, competitors use their talents to fret on an “invisible” guitar to rock a live crowd and deliver a performance that transcends the imitation of a real guitar and becomes an art form in and of itself. The key factor that determines the winner is having the elusive quality of “Airness”. When considering using Neo4j in a project, one of the key considerations is having a domain model that lends itself to a graph representation. In other words, does your data have “Graphiness”? However, it didn’t dawn on me until recently that when starting a proof of concept, you probably don’t have that data (or enough of it), or maybe your security guys won’t let you within 100 miles of the company production data with this newfangled nosql thingamajig.

Tagged big data, graph database, neo4j, ruby

Jul 02 2012

6 Comments

Java

Batch Importer – Part 3

At the end of February, we took a look at Michael Hunger’s Batch Importer. It is a great tool to load millions of nodes and relationships into Neo4j quickly. The only thing it was missing was Indexing… I say was, because I just submitted a pull request to add this feature. Let’s go through how it was done so you get an idea of what the Neo4j Batch Import API looks like, and in the next blog post I’ll show you how to generate data to take advantage of it.

Tagged big data, github, graph database, java, neo4j

Max De Marzi

Graphs, Graphs, and nothing but the Graphs

Author Archives: maxdemarzi
