Java Specialists' Java Training Europehome of the java specialists' newsletter

The Java Specialists' Newsletter
Issue 2052012-08-27 Category: Tips and Tricks Java version: Java 5,6,7

GitHub Subscribe Free RSS Feed

How to Make Your Own Rules

by Dr. Wolfgang Laun
Abstract:
Rule Based Programming, a declarative programming paradigm, is based on logical patterns to select data and associate it with processing instructions. This is a more indirect method than the sequential execution steps of an imperative programming language.

Welcome to the 205th issue of The Java(tm) Specialists' Newsletter. I would like to thank Dr Wolfgang Laun for educating us about rule-based programming in Java. Wolfgang has been using Java for many years and has found a nice way to marry our language with rules. Before we hand over to our friend, Dr Laun, a quick anecdote. I was presenting my masters course in Austria for his team last March. At the end of the training, the administrator asked him a question and addressed him as "Herr Laun". At the same time, Wolfgang asked me for my name tag, by saying "dogtag bitte". But I understood him as "Doktor bitte" (doctor please), so I thought he was telling the administrator to please address him with his title, not just the plain "Herr". This was quite out of character for him, as Wolfgang is a very humble person, so I just stood there, staring at him. He repeated his request of "dogtag bitte", whilst looking at me, obviously wondering why I was not responding. After some more requests, I eventually figured out that he was talking to me.

Over to Dogtag Laun, I hope you will enjoy his writing. At the end he extends an invitation to participate in a rules conference if you are interested in the subject.

How to Make Your Own Rules

There are several projects extending Java towards programming paradigms other than procedural and object-oriented. Scala, for instance, adds functional programming while retaining object-orientedness and full integration with Java. Others head for what is called rule-based programming, one of the several flavours of logic programming. As Heinz was kind enough to invite me to present a digression into this technique I'll guide you through a short tour of the hallmarks of one of these systems, highlighting principles rather than technical details.

Rule-based systems (RBS) have evolved from developments in the area of artificial intelligence, as the foundation for expert systems, but they have outgrown their heritage long since, and gained a very strong foothold in many lines of business and technical applications.

The sample rules presented below are written according to the RBS Drools, an open source project started more than ten years ago, and still going strong. Its architecture adheres to the typical composition: an Inference Engine (IE) is driven by the rules in the rule base (RB) supplied by the programmer, applying them to the data in a storage area known as Working Memory (WM). This triad lies on the bed of a Java application, which is responsible for loading the RB with the appropriate rules, starting a "session" with its WM, supplying Java objects to the session and kick-starting the IE into its operating cycle.

Selecting Facts

One thing the IE does in its operating cycle is the selection of objects from WM, which are called "facts" to distinguish them from all other objects. This selection is based on the condition part of a rule, as shown here:

rule "Hello Dolly"
when
    Person( firstName == "Dolly" )
then
    //...
end
  

As you might guess, the condition is the part between the keywords "when" and "then", selecting a fact of class Person with its field firstName containing "Dolly". Two remarks are indicated:

  1. Don't think that the operator == used in the comparison is a blunder: it's merely a convenience provided by the rule compiler, which converts it to a call to String.equals. (The compiler is clever enough to leave the operator in places where it is suitable.)
  2. Another simplification permits references to fields by their field name, even when the field's permission is private. It just depends on the fact's class declaration following the basic rules for Java Beans.

As you might expect, these Patterns may contain more than a simple comparison. Arbitrary constraints on a fact's fields may be combined in a list, implying a conjunction:

rule "Allo mon vieux"
when
    Person( gender == Gender.MALE, age >= 80, country == "France" )
then
    //...
end
  

Action!

Fact selection would be rather pointless without the capability of doing something with the retrieved objects. This is the domain of the consequence of a rule, written between "then" and "end". Here's the last rule, completed with a simple output statement:

rule "Allo mon vieux"
when
    Person( gender == Gender.MALE, age >= 80, country == "France" )
then
    System.out.println( "Allo mon vieux" );
end
  

A consequence is simply written in Java, with a few extensions, as we shall see in a minute. As you'd expect, this code is executed by the IE once a matching Person object is found.

Linking Data and Actions

We can access located facts in the consequence by adding binding variables to facts and fields. If the requirement is to add the person's name to the salutation and log this action, we can write:

rule "Allo mon vieux X"
when
    $person: Person( $name: name,
                     gender == Gender.MALE, 
                     age >= 80, country == "France" )
then
    System.out.println( "Allo mon vieux " + $name );
    SysLog.log( "Saluted " + $person );
end
    

A Rule is Not an If Statement

It would be too rash for you to be disappointed after concluding that a rule is merely a strange way of coding something that might be expressed by a simple Java if statement. Quite the contrary: a rule is something fundamentally different. For one thing, each rule applies to all matches that can be found in WM, i.e., the IE acts rather according to this Java snippet:

for (Object fact : workingMemory) {
  if (fact instanceof Person) {
    Person person = (Person)fact;
    if (person.getGender() == Gender.MALE &&
      person.getAge() >= 80 &&
      person.getCountry().equals("France")) {
      System.out.println("Allo mon vieux " + person.getName());
      SysLog.log("Saluted " + $person);
    }
  }
}
    

And there is an even more distinctive gap between rule and if statement, which is illustrated by the following rule, where "code" fields contain IATA airport codes:

rule "Flights between two cities"
when
    Airport( $depCode: code, city == "London" )
    Airport( $arrCode: code, city == "Paris" )
    $flight: Flight( departureCode == $depCode, arrivalCode == $arrCode )
then
    System.out.println( $flight );
end
  

This example is suggestive of the full matching power of an IE. Facts are located according to the constraint conditions, so that the first Airport pattern matches airports such as "LHR", "LGW" or "LCY", and the second one is bound to "CDG", "ORY", "LBG", and so on. Finally, the third pattern results in a selection of any flight from London to Paris, and then the consequence may execute for yet another triple of facts.

Comparable Java code might use three loops, and be rather inefficient if the full range of all objects were iterated or require some coding effort if maps were used for efficiency.

How Matching Rules Are Found

But isn't the IE required to perform much the same work to produce the outcome? A condition may have any number of patterns, with many matches per pattern and, consequently, even more combinations of two, three and more matching facts: the math is somewhat frightening. One solution to the many pattern with many object pattern match problem is known as the Rete algorithm, developed by Dr. Charles L. Forgy and first published in 1974. (Dr. Forgy adopted the term "Rete" - the latin word for "net" - because of its use in anatomy to describe a network of blood vessels and nerve fibers, resembling a diagram of the algorithm's fundamental data structures.)

A detailed discussion of the Rete algorithm is out of scope, but its basic concept is to sacrifice memory for speed. In a network compiled from the rules' patterns, references to facts are passed on to positions corresponding to patterns and the "joins" of patterns as long as conditions match or there is at least one accumulated tuple of facts to be joined with. A terminal node represents a rule. Most importantly, this approach permits factoring of alike patterns into a single network node, thereby reducing the evaluation effort. Further improvement is made by indexing, permitting fast access to facts whenever a field is tested for equality with a value.

A Fact's Life: Insert, Update, Retract

So far I haven't said anything about the way objects are made into facts, and what may happen to them in WM except being selected for their participation in the "firing" of a rule, as the execution of its consequence is picturesquely called.

The span of an object's existence as a fact begins when it is inserted, which can be done both in the embedding Java code and from a rule's consequence. The IE will immediately proceed to advance the new fact through the Rete network, which may cause any number of rules to become ready to fire. In the following example, the participating Request fact has fulfilled its mission, and therefore it is retracted in the rule's consequence:

rule "Cheapest flights between two cities"
when
    $request: RequestCheapestFlight( $fromCity: fromCity,
                                     $toCity: toCity )
    Airport( $depCode: code, city == $fromCity )
    Airport( $arrCode: code, city == $toCity )
    $flight: Flight( departureCode == $depCode,
                     arrivalCode == $arrCode,
                     $price: price )
    not Flight( departureCode == $depCode,
                arrivalCode == $arrCode,
                price < $price )
then
    System.out.println( $flight );
    retract( $request );
end
  

In addition to the retract method which is implicitly taken to refer to the WM where this happens, the rule also introduces another conditional element, i.e., the one initiated with "not". This is the equivalent of the negated existential quantifier from predicate logic, denoted by ∄. The result of the conditional element is a simple boolean value: true, if no fact matching the following pattern exists, and false otherwise. It does not contribute another fact to the accumulated tuple (consisting of a RequestCheapestFlight, two Airport facts and a Flight fact; it merely acts as a guard, inhibiting the firing of the rule except for the cheapest flight being the fourth fact in the assembled tuple.

Reactive to Change

A fact may also be destined to have some of its fields being modified when a rule fires. The next rule simply checks whether a Flight fact contains existing airport codes and sets the "checked" flag in the fact:

// Version 1
rule "Check flight data"
when
    $flight: Flight( $depCode: departureCode, 
                     $arrCode: arrivalCode )
    Airport( code == $depCode )
    Airport( code == $arrCode )
then
    modify( $flight ){
        setChecked( true )
    }
end
  

The modify statement is another extension available for coding consequences. It applies one or more method calls, enclosed in braces, to the object given in the parenthesis after "modify". But wouldn't the simple method call

        $flight.setChecked( true );
  

achieve the same thing? The answer to this question exhibits another fundamental principle for RBS that are reactive to fact changes. While the straightforward setter call changes the object representing the fact, the IE is not aware of the effected change. And without being informed about a change, the IE will not reassess the fact's position in the Rete network, which might duly cause other rules to fire.

The principle of being reactive to change comprises insertions, modifications and retractions. Ensuing reevaluation creates new activations, which are queued for firing. Once fired, the activation is gone, otherwise rules would fire infinitely, which would be witless. Note, however, that a pending activation might very well be deactivated if some change falsifies one of the corresponding rule's conditions. Although this sounds dangerous, it's rarely a problem in practice.

You may have wondered why I added the comment "Version 1" to the preceding rule, and perhaps you have guessed it by now. The principle of being reactive to change does not exclude the rule causing the change. Oops! Rule "Check flight data" announces a modification of one of the participating facts (the Flight) and so it is subject to reevaluation. All conditions are still fulfilled and so we promptly receive another activation, the rule fires again, and again, and again. A solid remedy is to provide a more exact definition for the firing of the rule:

// Version 2 - corrected
rule "Check flight data"
when
    $flight: Flight( $depCode: departureCode, 
                     $arrCode: arrivalCode,
                     checked == false )
    Airport( code == $depCode )
    Airport( code == $arrCode )
then
    modify( $flight ){
        setChecked( true )
    }
end
  

More Features

Several features had to be left out for brevity's sake. One of them is the (unnegated) existential quantifier. Its usefulness can be seen from the following rule, a variant of "Flights between two cities":

rule "Are two cities connected"
when
    Airport( $depCode: code, city == "Reykjavik" )
    Airport( $arrCode: code, city == "Vienna" )
    exists Flight( departureCode == $depCode, arrivalCode == $arrCode )
then
    System.out.println( "Direct flight from Reykjavik to Vienna available" );
end
  

Without "exists", the rule would fire for each flight. The quantifier constitutes a simple boolean guard, activating the rule once, and only once, independent from the actual number of flights.

Patterns may not only be written using class names of fact objects: it's equally possible to use interface and abstract class names. This significantly reduces the number of rules whenever it is possible to formulate constraints in terms of properties of the abstract class or the interface.

Conclusion

Regrettably, the rather deadpan examples do not let on about the potential fun that can be had while writing rules. Cracking the famous Zebra Puzzle or any of its variations, implementing heuristic rules for solving Sudoku or hopping from vertex to vertex as in Dijkstra's Algorithm - these are just a few of the pastimes to be had.

Finally, and to be fair, I ought to mention that, besides Drools, there are several other RBSes in Javaland, e.g., Jess, OpenRules, and SMARTS, to name some of them.

Tips and Tricks Articles Related Java Course

Java Master
Java Concurrency
Design Patterns
In-House Courses



© 2010-2014 Heinz Kabutz - All Rights Reserved Sitemap
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. JavaSpecialists.eu is not connected to Oracle, Inc. and is not sponsored by Oracle, Inc.
@CORE_THE_BAND #RBBJGR