Dynamic Rule Based Decision Trees in Neo4j – Part 3

At Graph Connect this year I did a short lightning talk on building Decision Trees using Neo4j. The slides are up and down below, the video is up. After the talk, someone asked, “What if we don’t know all the facts ahead of time?”. They wanted to be able to step through the tree and ask for the facts as needed at each step. So today we’re going to see how to do that.

If you haven’t read part 1 and part 2, then maximize the slides below and go through those first.

Now if we go back to our original example, the bouncer at the bar needs to ask you a few questions before they let you in or send you away. The first one of these was your age. If you were over 21, you could be let in immediately. If not, they would ask for your gender and maybe let you in. But there was no need to ask for both questions right away, just on a need to know basis. How can we model this in a graph?

We can add a new type of node I’m calling “Parameter”. In this case we have two. One for the “age” and one for the “gender”. They are linked to the Rule node(s) that require them to be set, and also have validation criteria. When we ask to be let into the bar without any parameters set, we should arrive at the “Over 21?” rule, realize we can’t answer it, and therefore be sent via the REQUIRES relationship to the Parameter “age” and stop there. So let’s build a stored procedure to handle this. We will call this one “stepwise” and it takes the same arguments as before:

 
    @Procedure(name = "com.maxdemarzi.stepwise.decision_tree", mode = Mode.READ)
    @Description("CALL com.maxdemarzi.stepwise.decision_tree(tree, facts) - traverse stepwise decision tree")
    public Stream<PathResult> traverseStepWiseDecisionTree(@Name("tree") String id, @Name("facts") Map<String, String> facts) throws IOException {
        // Which Decision Tree are we interested in?
        Node tree = db.findNode(Labels.Tree, "id", id);
        if ( tree != null) {
            // Find the paths by traversing this graph and the facts given
            return decisionPath(tree, facts);
        }
        return null;
    }

All it does really is call decisionPath which is shown below. It creates a new Expander and Evaluator using the facts that were passed in.

 
    private Stream<PathResult> decisionPath(Node tree, Map<String, String> facts) {
        TraversalDescription myTraversal = db.traversalDescription()
                .depthFirst()
                .expand(new DecisionTreeExpander(facts))
                .evaluator(new DecisionTreeEvaluator(facts));

        return myTraversal.traverse(tree).stream().map(PathResult::new);
    }

Unlike before where we had one Evaluator for the entire procedure, now we need a new one for every query since it uses the facts to decide valid paths. If we reach an Answer node, our traversal is done. If we reach a Parameter node, we check to see if we have this fact. If we have it, we don’t include this path, but if we don’t have it then we are missing a parameter and need the user to answer it.

 
    @Override
    public Evaluation evaluate(Path path, BranchState branchState) {
        Node last = path.endNode();

        // If we get to an Answer then stop traversing, we found a valid path.
        if (last.hasLabel(Labels.Answer)) {
            return Evaluation.INCLUDE_AND_PRUNE;
        }

        if(last.hasLabel(Labels.Parameter)) {
            // If we get to a Parameter, check if its missing in our facts, if so we found a valid path.
            if (facts.containsKey(last.getProperty("name", "").toString())) {
                return Evaluation.EXCLUDE_AND_PRUNE;
            }

            return Evaluation.INCLUDE_AND_PRUNE;
        }

        // If not, continue down this path if there is anything else to find.
        return Evaluation.EXCLUDE_AND_CONTINUE;
    }

In our expander, we first check to see if we have rules to evaluate. If we get to a Rule, we have a fork in the road. In Part 1 of the series I showed you how to deal with a standard decision tree which has only two outcomes, True or False. In Part 2 of the series we looked at dealing with multiple choice outcomes as with a decision stream. This expander is dealing with both. If for some reason though it is not able to evaluate the expression or script (because they were missing or let’s say somebody forgot to validate the parameters before passing them in), it follows the REQUIRES relationship to the Parameter node.

 
    @Override
    public Iterable<Relationship> expand(Path path, BranchState branchState) {
        Node last = path.endNode();
        // If we have Rules to evaluate, go do that.
        if (last.hasRelationship(Direction.OUTGOING, RelationshipTypes.HAS)) {
            return last.getRelationships(Direction.OUTGOING, RelationshipTypes.HAS);
        }

        if (last.hasLabel(Labels.Rule)) {
            try {
                if (last.hasProperty("expression")) {
                    return last.getRelationships(Direction.OUTGOING, trueOrFalse(last, facts));
                } else {
                    return last.getRelationships(Direction.OUTGOING, choosePath(last, facts));
                }
            } catch (Exception ignored) { }
            return last.getRelationships(Direction.OUTGOING, RelationshipTypes.REQUIRES);
        }
        // Otherwise, not sure what to do really.
        return Collections.emptyList();
    }

Our trueOrFalse method returns a RelationshipType. We are looking for IS_TRUE or IS_FALSE. We perform the same magic as before and cook our expression then convert the boolean result to an uppercase string to build what we need. Yes, it’s a little hacky, but it’s nice.

 
    static RelationshipType trueOrFalse(Node node, Map<String, String> facts) throws Exception {
        ExpressionEvaluator ee = new ExpressionEvaluator();
        ee.setExpressionType(boolean.class);

        String[] parameterNames = Magic.explode((String) node.getProperty("parameter_names", node.getProperty("name", "")));
        Class<?>[] parameterTypes = Magic.stringToTypes((String) node.getProperty("parameter_types", node.getProperty("type", "")));

        // Fill the arguments array with their corresponding values
        Object[] arguments = new Object[parameterNames.length];
        for (int j = 0; j < parameterNames.length; ++j) {
            arguments[j] = Magic.createObject(parameterTypes[j], facts.get(parameterNames[j]));
        }

        // Set our parameters with their matching types
        ee.setParameters(parameterNames, parameterTypes);

        // And now we "cook" (scan, parse, compile and load) the expression.
        ee.cook((String) node.getProperty("expression"));

        return RelationshipType.withName("IS_" + ee.evaluate(arguments).toString().toUpperCase());
    }

The choosePath method does almost the same thing, but it evaluates a script instead of an expression:

 
ScriptEvaluator se = new ScriptEvaluator();
se.setReturnType(String.class);
...
// And now we "cook" (scan, parse, compile and load) the script.
se.cook((String) node.getProperty("script"));

return RelationshipType.withName((String) se.evaluate(arguments));

We will use the same idea when validating a parameter to see if it’s valid or not in the isValid method (not shown for brevity). Let’s give it a whirl by building the tree using Cypher. If your gender is not “male” or “female”, apologies, but you are not getting into the bar unless you are at least 21.

 
CREATE (tree:Tree { id: 'bar entrance' })
CREATE (over21_rule:Rule { name: 'Over 21?', parameter_names: 'age', 
                           parameter_types:'int', expression:'age >= 21' })
CREATE (gender_rule:Rule { name: 'Over 18 and female', parameter_names: 'age,gender', 
                           parameter_types:'int,String', 
                           expression:'(age >= 18) && gender.equals(\"female\")' })
CREATE (answer_yes:Answer { id: 'yes'})
CREATE (answer_no:Answer { id: 'no'})
CREATE (tree)-[:HAS]->(over21_rule)
CREATE (over21_rule)-[:IS_TRUE]->(answer_yes)
CREATE (over21_rule)-[:IS_FALSE]->(gender_rule)
CREATE (gender_rule)-[:IS_TRUE]->(answer_yes)
CREATE (gender_rule)-[:IS_FALSE]->(answer_no)    
CREATE (age:Parameter { name:'age', type:'int', prompt:'How old are you?', 
                        expression:'(age > 0) && (age < 150)'}) 
CREATE (gender:Parameter { name:'gender', type:'String', prompt:'What is your gender?', 
                           expression: '\"male\".equals(gender) || \"female\".equals(gender)'} ) 
CREATE (over21_rule)-[:REQUIRES]->(age)
CREATE (gender_rule)-[:REQUIRES]->(age)
CREATE (gender_rule)-[:REQUIRES]->(gender)

Let’s start with no facts:

 
CALL com.maxdemarzi.stepwise.decision_tree('bar entrance', {});

… we return the age parameter which prompts “How old are you?”:

Next, let’s say we are 20:

 
CALL com.maxdemarzi.stepwise.decision_tree('bar entrance', {age:'20'});

… we return the gender parameter which prompts “What is your gender?”:

Finally let’s say our gender is female:

 
CALL com.maxdemarzi.stepwise.decision_tree('bar entrance', {age:'20', gender:'female'});

…and we’re in the Bar! If you want to give it a whirl, the code is on github.

Tagged , , , , , , , , ,

One thought on “Dynamic Rule Based Decision Trees in Neo4j – Part 3

  1. inonixn says:

    For these kind of problems , isn’t it better to go with RETE network? Have you done anything similar to maintaining a RETE net in neo4j?

Leave a comment