Dynamic Rule Based Decision Trees in Neo4j – Part 2

A couple of weeks ago I showed you how to build a dynamic rule based decision tree in Neo4j. It was pretty simple and used an Expression Evaluator to determine if a set of parameters in an expression was true or false. Based on that answer it decided where to go.

But what if we had more than just true or false? What if we wanted to make our Rule nodes have more than 2 options? Today I am going to show you how to do just that… but please make sure you have read part 1 already.

Many years ago, Chappelle’s Show did a skit on rapper “Lil Jon”, making fun that he was famous for only saying What? Yeah! and Okay! He was of course in on the joke and people still bring it up to him. On an episode of Ridiculousness, he mentioned that it will stay with him until his funeral. Today we are going to make a decision tree on a tweak of his remarks. We will ask 3 questions, each having 4 possible answers in this fashion:

Was Lil Jon a good man? option 1: Yeah -> incorrect option 2: What? -> next question option 3: Okay -> incorrect unknown: Unknown -> unknown I said, was he a good man? option 1: Yeah -> next question option 2: What -> incorrect option 3: Okay -> incorrect unknown: Unknown -> unknown May he rest in peace option 1: Yeah -> incorrect option 2: What? -> incorrect option 3: Okay -> correct unknown: Unknown -> unknown

In Neo4j, we will create a new Tree node and call it “funeral”, 3 new Answer nodes (correct, incorrect, and unknown), and 3 Rule nodes, one for each question. It will look like this:

If you recall, our Rule nodes had an expression statement that was evaluated into a boolean, and were created this way:

   
    CREATE (gender_rule:Rule { parameter_names: 'age,gender', 
                               parameter_types:'int,String', 
                               expression:'(age >= 18) && gender.equals(\"female\")' })

Instead of a boolean expression, we are going to evaluate a script that returns a String, so our Rule nodes now looks like this:

   
    CREATE (good_man_rule:Rule { parameter_names: 'answer_1', 
                                 parameter_types:'String', 
                                 script:'switch (answer_1) { 
                                         case \"yeah\": return \"OPTION_1\"; 
                                         case \"what\": return \"OPTION_2\"; 
                                         case \"okay\": return \"OPTION_3\"; 
                                         default: return \"UNKNOWN\"; }' })

The switch script above is taking an “answer_1” parameter, evaluating it, and returning a numbered option or UNKNOWN. To make this work, we need to add a second traversal expander with a few minor changes. Instead of an ExpressionEvaluator, we are going to use a ScriptEvaluator:

   
public class DecisionTreeExpanderTwo implements PathExpander {
    private Map<String, String> facts;
    ScriptEvaluator se = new ScriptEvaluator();

    public DecisionTreeExpanderTwo(Map<String, String> facts) {
        this.facts = facts;
        se.setReturnType(String.class);
    }

In our expand method, instead of expanding to IS_TRUE or IS_FALSE, we are going to expand to the relationship type returned by the choosePath method:

  
if (path.endNode().hasLabel(Labels.Rule)) {
     try {
        return path.endNode().getRelationships(Direction.OUTGOING, choosePath(path.endNode()));
     } catch (Exception e) {
        // Could not continue this way!
        return Collections.emptyList();
     }
}

The choosePath method, loads up the arguments passed in, sets them, cooks the script and returns a relationship type named after the result of the evaluation of the script:

 
    private RelationshipType choosePath(Node rule) throws Exception {
        // Get the properties of the rule stored in the node
        Map<String, Object> ruleProperties = rule.getAllProperties();
        String[] parameterNames = Magic.explode((String) ruleProperties.get("parameter_names"));
        Class<?>[] parameterTypes = Magic.stringToTypes((String) ruleProperties.get("parameter_types"));

        // Fill the arguments array with their corresponding values
        Object[] arguments = new Object[parameterNames.length];
        for (int j = 0; j < parameterNames.length; ++j) {
            arguments[j] = Magic.createObject(parameterTypes[j], facts.get(parameterNames[j]));
        }

        // Set our parameters with their matching types
        se.setParameters(parameterNames, parameterTypes);

        // And now we "cook" (scan, parse, compile and load) the script.
        se.cook((String)ruleProperties.get("script"));

        return RelationshipType.withName((String) se.evaluate(arguments));
    }

After we compile, replace the stored procedure, restart Neo4j and create our new example decision tree, we can run the following cypher to see if it works by passing in the right sequence of answers:

   
CALL com.maxdemarzi.traverse.decision_tree_two('funeral', {answer_1:'what', 
                                                           answer_2:'yeah', 
                                                           answer_3:'okay'})
yield path 
return path

… here is our answer:

But what if we didn’t know? What if we answered “what” to the first question and then were not sure how to answer the rest of the questions?

  
CALL com.maxdemarzi.traverse.decision_tree_two('funeral', {answer_1:'what', 
                                                           answer_2:'', 
                                                           answer_3:''}) 
yield path 
return path

Our graph would return “Unknown” but it would also return the path of how we got to Unknown and in that Path is the question we didn’t know the answer to, so we can ask our user again.

Pretty neat right? With this updated procedure at our disposal we can make really complicated decision trees with multiple branching options with ease. As always the source code is on github. Enjoy.

Update: Read the paper on Decision Streams and checkout the implementation.

Max De Marzi

Graphs, Graphs, and nothing but the Graphs