UnconstrainedMiner: A Fast Tool for Declarative Process Mining

The UnconstrainedMiner is a tool for mining Declare models.  The tool makes no assumptions about the output and can mine all constraints for a log.  The tool is not intended for end-users, but as an intermediate step for getting all constraints for further processing.

The tool is at the same time extremely dumb and extremely smart.  It applies standard techniques in a manner that makes it very efficient, despite not forcing the hand of the user.

All constraints, even those with 5 or more input parameters, are systematically checked, and off-the-shelf techniques such as symmetry reduction and parallelism are exploited, making mining possible even for very complex process logs.  In addition, we employ a super-scalar mining technique, allowing us to mine several constraints at once at no extra cost.

Performance

This means that the UnconstrainedMiner can mine all constraints for the BPI challenge 2012 log [1] in approximately 25 seconds, and can mine the 20 essential constraints in a little over one second.

About the Miner

The unconstrained miner is written in Java and looks like this:

[Screenshot: config]

The miner can compute whether a constraint is triggered, but does not rely on this.  It can do filtering, but performance is not affected by this.  Users can select which constraints to mine, but that was mostly useful for earlier versions of the tool, where comprehensive mining would take upwards of 10 minutes and some constraints were intractable.  The tool displays the LTL semantics of each constraint along with an automaton representation.

While running, the tool shows current progress, which can be computed from the complexity of constraints.  It automatically groups constraints in an optimal way and provides diagnostics during execution.  Progress is reported for each worker thread (here 2) and overall.

[Screenshot: running]

After processing, a report is given.  The report lists each identified constraint along with how often it is satisfied and how often it was triggered (for several ways of being triggered).  This can be imported into, e.g., Excel for further processing:

[Screenshot: excel]

Symmetry reduction automatically identifies whether a constraint allows swapping one parameter for another.  This is the case for, e.g., choice, where the exact order of parameters does not matter.  The actual events are then canonicalized, leading to an exponential reduction in complexity.
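The effect of symmetry reduction can be illustrated with a minimal Python sketch. This is not the miner's code: the function names `choice_holds` and `mine_choice` are hypothetical, and the log is assumed to be a plain list of traces, each a list of activity names. For a symmetric constraint such as choice, only one canonical (sorted) ordering of each parameter pair is checked instead of every ordering.

```python
from itertools import combinations, permutations


def choice_holds(trace, a, b):
    """choice(a, b): at least one of a or b occurs in the trace."""
    return a in trace or b in trace


def mine_choice(log, activities, use_symmetry=True):
    """Count, per parameter pair, the traces satisfying choice.

    With use_symmetry=True only canonical (sorted) pairs are checked,
    because choice(a, b) and choice(b, a) always agree.
    Returns the per-pair counts and the number of pairs checked.
    """
    ordered = sorted(activities)
    if use_symmetry:
        candidates = combinations(ordered, 2)   # unordered pairs
    else:
        candidates = permutations(ordered, 2)   # every ordering
    results = {}
    checked = 0
    for a, b in candidates:
        checked += 1
        results[(a, b)] = sum(choice_holds(t, a, b) for t in log)
    return results, checked


log = [["A", "B"], ["C"], ["A", "C"]]
reduced, reduced_checks = mine_choice(log, {"A", "B", "C"})
full, full_checks = mine_choice(log, {"A", "B", "C"}, use_symmetry=False)
```

With two interchangeable parameters the saving is a factor of two; with k interchangeable parameters each canonical tuple stands in for k! orderings.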

Parallelism

Parallelism splits up the log and allows any number of CPUs to process the log in parallel.  As the miner is very efficient, overhead is negligible, so we do not need to do anything smart, and just redo any computation in each thread with minor overhead.  This makes our parallel miner scale almost linearly.  As we do not try to be smart, we can easily combine results from any number of worker threads.
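The split-and-combine structure described above can be sketched in Python as follows. It rests on several assumptions: the log is a list of traces, `response_holds`, `mine_chunk` and `mine_parallel` are hypothetical names, and response is used only as an example constraint. Python threads stand in for the tool's worker threads to show the structure; they are not meant to reproduce its speed-up.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def response_holds(trace, a, b):
    """response(a, b): every a is eventually followed by a b."""
    pending = False
    for event in trace:
        if event == a:
            pending = True
        elif event == b:
            pending = False
    return not pending


def mine_chunk(chunk, candidates):
    """Count satisfying traces per candidate within one part of the log."""
    counts = Counter()
    for trace in chunk:
        for a, b in candidates:
            if response_holds(trace, a, b):
                counts[(a, b)] += 1
    return counts


def mine_parallel(log, candidates, workers=2):
    """Split the log, mine each part independently, then sum the counts."""
    chunks = [log[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: mine_chunk(c, candidates), chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


log = [["A", "B"], ["A"], ["B", "A"], ["C"]]
candidates = [("A", "B"), ("B", "A")]
merged = mine_parallel(log, candidates, workers=2)
```

Because each worker only produces per-constraint counts, combining results is a plain sum and does not depend on how the log was split.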

Super-scalar Mining

Super-scalar mining intelligently groups constraints.  It identifies constraints with a relationship (basically logical implication), and groups together constraints when doing so does not increase computation time.  This means that constraints are grouped if their symmetry groups are compatible.  The gist is that we never split a symmetry group or add another parameter, both of which would increase computation time.  For the full Declare language, we find 3 symmetry groups:

[Figures: dep1, dep2, dep3]

For each group, the computed most complex symmetry group is displayed.  For each constraint the individual symmetry group is shown.  Arcs mean implication.

Super-scalar mining is implemented using colored automata, where the acceptance condition for each constraint is maintained.  This means that using a single check, we can keep track of several constraints.  In theory this comes at an exponential cost in memory, but due to the intelligent grouping, we can still mine everything in 12 MB of memory.
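A minimal Python sketch of the idea, under stated assumptions: two small hand-written automata for response(a, b) and precedence(a, b) are combined into one product automaton, and each product state is colored with the set of constraints accepted in it. The names `RESPONSE`, `PRECEDENCE`, `build_colored_product` and `check_trace` are hypothetical and the encoding is illustrative, not the tool's actual data structure.

```python
# Events are abstracted to three symbols: "a", "b", or "o" (any other event).
RESPONSE = {  # every a is eventually followed by b
    "init": 0,
    "accept": {0},
    "delta": {(0, "a"): 1, (0, "b"): 0, (0, "o"): 0,
              (1, "a"): 1, (1, "b"): 0, (1, "o"): 1},
}
PRECEDENCE = {  # b may only occur after some a; state 2 is a violation sink
    "init": 0,
    "accept": {0, 1},
    "delta": {(0, "a"): 1, (0, "b"): 2, (0, "o"): 0,
              (1, "a"): 1, (1, "b"): 1, (1, "o"): 1,
              (2, "a"): 2, (2, "b"): 2, (2, "o"): 2},
}


def build_colored_product(automata):
    """Build the reachable product automaton; color = accepted constraints."""
    names = list(automata)
    init = tuple(automata[n]["init"] for n in names)
    delta, colors = {}, {}
    frontier = [init]
    while frontier:
        state = frontier.pop()
        if state in colors:
            continue
        colors[state] = frozenset(
            n for n, s in zip(names, state) if s in automata[n]["accept"])
        for sym in "abo":
            nxt = tuple(automata[n]["delta"][(s, sym)]
                        for n, s in zip(names, state))
            delta[(state, sym)] = nxt
            frontier.append(nxt)
    return init, delta, colors


def check_trace(trace, a, b, product):
    """Run the trace once; return the constraints satisfied at the end."""
    init, delta, colors = product
    state = init
    for event in trace:
        sym = "a" if event == a else "b" if event == b else "o"
        state = delta[(state, sym)]
    return colors[state]


PRODUCT = build_colored_product(
    {"response": RESPONSE, "precedence": PRECEDENCE})
satisfied = check_trace(["A", "C", "B"], "A", "B", PRODUCT)
```

The product can in principle have as many states as the product of the component state counts, which is where the exponential memory cost comes from; here only 5 of the 6 possible states are reachable.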

Download

Download the miner from here.  Simply launch the jar file with a recent version of Java. The tool opens any uncompressed XES log file (extension .xes).  The current version only computes limited support when using the super-scalar mining techniques, and results are not aggregated when using parallel mining.

Changelog

1.1
Support for computing whether constraints are triggered,
switch to a regular expression semantics,
custom constraints, and
speed improvements.
1.0
Initial release.

This video gives a quick demo of the functionality of UnconstrainedMiner:

  1. We use the 2012 log as the 2013 log is simple from the point of view of Declare mining.
