New Declare Package in ProM

ProM has had support for Declare for quite some time.  The support has never been the best, because the Declare package simply embedded the entire Declare tool inside ProM, which is of course not desirable.  The old data structure also had some problems, mostly because it was written for a different purpose, which gave it dependencies that are not really necessary for most uses.

I also found some serious semantical problems in Declare, which was a good reason to redo the Declare implementation from scratch.  As some code still depends on the old code, the new implementation has been made in parallel, but it should be noted that the old implementation does not work: it has semantical problems, which I outline later, and will not give the right results.  Any code building on the old code is broken and should instead use the improved code.

3 Layers

The new data structure works on 3 layers.  At the most concrete layer, we have Declare models.  They are simply a collection of tasks and constraints.

Declare constraints are instantiations of constraint templates.  Different tools can handle different sets of constraint templates, and it is even possible to create entirely new templates.  Many tools do not really care about the specific constraint templates, but some do.  Others do not care about what the templates actually mean, but need to know exactly which ones are in use.  For this reason, we collect sets of constraint templates in Declare languages.

Declare also has several semantics, and different tools operate on different ones.  For example, Declare has a semantics in finite LTL, one in regular expressions, and one in the event calculus; it also has an informal written semantics.  For this reason, we make the semantics independent of the actual language.

Interface-Based

The new package is entirely interface based. This means that all concepts have an interface and (at least) one implementation. Here, I’ll talk mostly about the interfaces, as the implementations are less important for most users.

The important part is that everything takes interfaces as inputs if at all possible, and methods return concrete classes if possible. This means that the assumptions made are as weak as possible and the guarantees given as strong as possible. This philosophy should be adhered to, and no class should ever implicitly assume parameters can be cast to a concrete type.
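As a generic illustration of this convention (a hypothetical example, not code from the package), a method would accept the weakest useful interface and return a concrete class:
[java] import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// Hypothetical example, not part of the package: accept the weakest assumption
// (any Collection), return the strongest guarantee (a concrete ArrayList).
public final class ConventionExample {
    public static ArrayList<String> sortedCopy(final Collection<String> names) {
        final ArrayList<String> copy = new ArrayList<String>(names);
        Collections.sort(copy);
        return copy;
    }
}
[/java]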

All generic implementations have the postfix Impl after the interface name (and reside in an .impl subpackage of the interface package). Specialized implementations have a prefix describing the dependencies.

Extensibility

Rather than making a model that supports everything, we instead make each component arbitrarily extensible. Examples of extensions are textual descriptions of constraints, graphical information, data annotations, etc.

We build on an abstract model which makes Declare infinitely extensible. At the bottom level, each element implements the Annotatable interface:
[java] public interface Annotatable {
    <T> T get(Class<T> clazz);
    boolean has(Class<?> clazz);
}
[/java] This interface exposes two methods. The get method returns an annotation of type T. If no such annotation exists, it is created. It is stored with the object, and two calls to the method will return the same object. The has function returns whether such an annotation exists.
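As a small usage sketch (assuming any model element, here called element, that implements Annotatable; the Description annotation is introduced just below):
[java] // Sketch: element is any model element implementing Annotatable.
if (element.has(Description.class)) {
    // An annotation is already attached; get returns that same instance.
    System.out.println(element.get(Description.class).getShortDescription());
}
// get never returns null: if no Description exists yet, one is created and stored.
final Description description = element.get(Description.class);
[/java]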

The package supplies default annotations coded in a way which allows simple chaining. The simplest annotation is the Description, and the package also includes an annotation for graphical information and one for grouping elements in logical groups.
[java]public class Description {
    public Description() { … }
    public Description(final String shortDescription, final String longDescription) { … }
    public String getLongDescription() { … }
    public String getShortDescription() { … }
    public Description setLongDescription(final String longDescription) { … }
    public Description setShortDescription(final String shortDescription) { … }
}
[/java] The class illustrates everything necessary for annotations: they should have a default constructor (l. 2) and can contain any number and kind of fields. We make the setters return the annotation itself, which makes it possible to chain invocations, e.g., as:
[java] constraint.get(Description.class)
    .setShortDescription("Short Description of Constraint")
    .setLongDescription("Long Description of Constraint.");
[/java] Any class can be an annotation, and annotations need not be registered anywhere. This makes it easy to add information when needed, and also means one never has to worry about information that is not necessary for the task at hand (e.g., graphical information for a simple mining algorithm).

Any node can have at most one annotation of each type, so one should not rely on generic classes for annotations, such as a generic String or a generic Collection, but rather on custom subclasses.
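For example, rather than annotating a node with a bare String, one would introduce a small dedicated class (a hypothetical annotation, not one shipped with the package):
[java] // Hypothetical annotation: a dedicated class instead of a generic String,
// so it cannot clash with other string-valued annotations on the same node.
public class ReviewComment {
    private String text = "";

    public ReviewComment() { } // annotations need a default constructor

    public String getText() {
        return text;
    }

    public ReviewComment setText(final String text) {
        this.text = text;
        return this; // return this to allow chaining, like Description
    }
}
[/java]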

Semantical Domain

At the bottom level, we have the semantical domain. This will typically be some well-understood mathematical structure, like finite automata. A semantics can be translated to the semantical domain. Mostly, this is not necessary for anybody but those adding new semantics. The basic Semantics interface is (simplified) like this:
[java] public interface Semantics<S extends Semantics<S, Domain>, Domain> {
    Domain getDomain();
}
[/java] A Semantics is parametrized with S (just the class implementing Semantics, which is needed for technical reasons) and Domain, the semantical domain. Most of the code in the new Declare package uses the implementation RegexpSemantics, which uses regular expressions as its intermediate representation and finite automata as its domain. The skeleton of the RegexpSemantics looks like this:
[java] public class RegexpSemantics implements Semantics<RegexpSemantics, Automaton> {
    public RegexpSemantics(final RegExp<Task> semantics) { … }
}
[/java] The Automaton and RegExp classes come from the ltl2aut library from Declare.  The parameter is a regular expression over Tasks (which we get to later).
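To give an idea of what adding a new semantics amounts to (a hypothetical sketch against the simplified interface above, not code from the package), a trivial semantics could use a plain String as its domain:
[java] // Hypothetical: a trivial semantics whose domain is just a textual description.
// This is the kind of dummy semantics one can use when the actual semantics does not matter.
public class TextSemantics implements Semantics<TextSemantics, String> {
    private final String text;

    public TextSemantics(final String text) {
        this.text = text;
    }

    @Override
    public String getDomain() {
        return text;
    }
}
[/java]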

Constraint Templates and Languages

At this level, we have constraint templates and languages, which are collections of such templates.  A declarative language is just a collection of templates which can be annotated:
[java] public interface DeclarativeLanguage<S extends Semantics<S, Domain>, Domain>
        extends Collection<ConstraintTemplate<S, Domain>>, Annotatable {
}
[/java] We note that we explicitly carry around the semantics and the semantical domain. More interesting are the constraint templates:
[java] public interface ConstraintTemplate<S extends Semantics<S, Domain>, Domain>
        extends Iterable<Parameter>, Parametrized<Constraint<S, Domain>>, Annotatable {
    public String getName();
    public Parametrized<S> getSemantics();
}
[/java] Constraint templates are annotatable and have a name. For convenience, they also allow iteration over their parameters. The interesting part is that they can return a parametrized version of their semantics, that is, a version which has not been specialized with its actual parameters yet. Constraint templates are themselves parametrized versions of constraints. Anything parametrized can be instantiated, yielding the concrete version:
[java] public interface Parametrized<Domain> {
    Domain instantiate(Map<Parameter, ? extends Collection<Task>> instantiation);
}
[/java] Thus, we can get a parametrized semantics and instantiate that to a semantics, or we can instantiate the template to a concrete constraint. The only non-generic classes implementing the interfaces in the package are the implementations of constraint templates and parametrized semantics, as they need to depend on the actual semantical domain.
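As a sketch of how these pieces fit together (assuming a two-parameter template, here called response, obtained from a language, and two Tasks a and b from a model; standard java.util imports omitted):
[java] // Sketch: response is a ConstraintTemplate<RegexpSemantics, Automaton> from a language,
// and a and b are Tasks; none of these are created here.
final Map<Parameter, Collection<Task>> instantiation =
        new HashMap<Parameter, Collection<Task>>();
final Iterator<Parameter> parameters = response.iterator();
instantiation.put(parameters.next(), Collections.<Task>singleton(a));
instantiation.put(parameters.next(), Collections.<Task>singleton(b));

// A template is Parametrized<Constraint<S, Domain>>, so instantiating it yields a constraint.
final Constraint<RegexpSemantics, Automaton> constraint = response.instantiate(instantiation);
[/java]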

A parameter is just an encapsulated name (which can be annotated if needed).

The package provides a simple implementation of the standard Declare language with all constraints, and also support for deriving one language from another.

Declare Models

Declare models are fairly simple: collections of tasks and constraints.  Each is tied to its semantical domain and the language used.  For performance reasons, no checks are performed to ensure consistency: for example, all constraints of a model should come from the same language, and all tasks used as actual parameters should live in the model.  A simplified version of the DeclareModel interface is:
[java] public interface DeclareModel<S extends Semantics<S, Domain>, Domain> extends Annotatable {
    public DeclarativeLanguage<S, Domain> getLanguage();
    String getName();
    Collection<Task> tasks();
    Collection<Constraint<S, Domain>> constraints();
    Domain instantiate();
}
[/java] This should be fairly obvious by now. A model has a name and comes from a certain language. It has a collection of tasks and constraints. Each constraint is parametrized with a semantics and a semantical domain.

For convenience (and because it can be made in a generic way), it is possible to directly instantiate a model.

We notice that the interface does not allow modifying models. They should be constructed by either an importer, an editor or a miner and never directly modified (except for annotations). Model constructors need access to concrete implementations anyway (or a factory, but we have decided not to provide one, as that would just clutter the design at present).

A constraint is also fairly simple:
[java] public interface Constraint<S extends Semantics<S, Domain>, Domain> extends Annotatable {
    S getSemantics();
    Domain instantiate();
    ConstraintTemplate<S, Domain> getTemplate();
    Map<Parameter, Collection<Task>> getInstantiation();
}
[/java]

We can get the semantics and domain (for convenience) for each constraint. We can also get the underlying instantiation and template.
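As a usage sketch, a tool that only cares about the structure of a model could walk it like this (assuming a model produced by an importer or a miner and using the RegexpSemantics implementation; standard java.util imports omitted):
[java] // Sketch: model is a DeclareModel<RegexpSemantics, Automaton> from an importer or miner.
for (final Task task : model.tasks()) {
    System.out.println("Task: " + task);
}
for (final Constraint<RegexpSemantics, Automaton> constraint : model.constraints()) {
    System.out.println("Constraint: " + constraint.getTemplate().getName()
            + " over " + constraint.getInstantiation());
}
// For convenience, the whole model can be instantiated into its semantical domain.
final Automaton automaton = model.instantiate();
[/java]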

A Task just has a name.

That concludes the data structure. We just have a model which has tasks and constraints. All are parametrized with a semantics. Constraints are instantiations of constraint templates, which are collected in languages. Semantics are abstract objects which can be turned into a semantical domain for further processing.

More on Declare Semantics

Recently, we released CPN Tools 4 with Declare support. This prompted a major cleanup of the code base, which also led to a survey of available tools and implementations. This survey revealed that no semantics published for Declare so far is correct.

Declare has traditionally been founded on finite linear temporal logic (just LTL in the following). This seems like a nice idea, because there is a lot of theory about LTL already and it should be possible to leverage it. Also, LTL is a nice, high-level language and several implementations exist. The only problem is that this is not really true.

LTL in general talks about the future of a single execution. It is possible to say that if A is executed, then B has to be executed later. LTL normally talks about infinite such executions, which means one never has to worry about what happens at the end of the execution. Declare instead concerns itself with finite executions, so we should look at a finite variant of LTL.

The Problem of Finite LTL

Finite LTL has two problems: the ending and the beginning. As executions terminate, we need to be able to say what is acceptable at the end of an execution. As executions also have a beginning, we need to say what happens for the empty execution. The first problem can be ignored using adequate amounts of hand-waving and ignoring the next operator. The second problem can be ignored by saying that we have no semantics for empty traces.

Neither of these workarounds works for Declare. Many constraints need the next operator for their semantics ("if A happens, A cannot happen again until B has happened" and the like). We also need to talk about the empty trace when doing on-the-fly analysis.

We can solve the problem of next by splitting it up into a weak and a strong variant. The strong variant says “there is a next step and in that step something holds,” whereas the weak version only talks about what happens if there is a next step. This was never treated formally for Declare, but I made an implementation taking this into account.
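In standard finite-trace LTL terms (my own formulation here, not a quote from the Declare literature), the strong next X φ requires that a next state exists and that φ holds in it, while the weak next can simply be defined as its dual:

$$\overline{X}\,\varphi \;\equiv\; \neg X\,\neg\varphi,$$

which holds vacuously in the last state of a trace, exactly where the strong next always fails.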

The other problem is harder to deal with. It is actually not possible to give a single semantics for finite LTL which handles the empty trace; what you have to do is create a weak and a strong semantics and switch between them. This is solvable, but I did not feel like redoing the entire implementation. It also makes it doubtful that most theoretical work on runtime analysis is sound: it just generalizes infinite LTL results to the finite case without justification, and the two cases are substantially different.

The LTL semantics for Declare is incorrect when talking about the start and end of executions. It cannot be used for on-the-fly analysis, and the semantics is broken for several corner cases. No LTL semantics given so far is correct.

The Escape: Regular Expressions

While it is possible to make an LTL semantics for Declare that is sound, I decided to drop that and instead use regular expressions. They are well-understood, and I could make a translator in less than a day. I could make the translator target the same semantical domain class as before, so I could reuse all of my tools building on that. Tools relying on alternative implementations are broken, however.
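To give a flavor of the approach (an illustration of the idea only, not the exact expressions used in the package), the response constraint "every A is eventually followed by a B" can be checked against a trace written as a string of task letters:
[java] import java.util.regex.Pattern;

public final class ResponseCheck {
    // Illustrative only: response(a, b) over traces encoded as strings of task letters.
    // A trace matches iff every occurrence of 'a' is eventually followed by a 'b'.
    private static final Pattern RESPONSE = Pattern.compile("[^a]*(a.*b)*[^a]*");

    public static void main(final String[] args) {
        System.out.println(RESPONSE.matcher("cabcab").matches()); // true
        System.out.println(RESPONSE.matcher("caba").matches());   // false: last 'a' is never followed by 'b'
        System.out.println(RESPONSE.matcher("").matches());       // true: the empty trace satisfies response
    }
}
[/java]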

We were not the first to use regular expressions for the semantics of Declare, but we have found mistakes in all published regular-expression semantics for Declare.

Conclusion

These considerations are why we split out the semantics as something separate. If somebody wants to continue using LTL, they can. They can even use the same semantical domain, and tools depending on the semantical domain rather than on the actual semantics would still work. It is also possible to go with another semantics (even just a dummy semantics if the actual semantics does not matter).

The new Declare package is called Declare2 and is available in nightly builds of ProM. You can browse the source here. This package is used to implement the UnconstrainedMiner, which is also available in nightly builds of ProM.
