In All Fairness…

[bibshow sort=author order=asc file=conferences.bib,workshops.bib,other.bib]Yesterday Boudewijn and I were talking about an interesting issue in CPN Tools.  The problem was that performance analysis yielded different results for two different CPN models.  To understand why that is, let us discuss fairness in simulation of CPN models.

Fairness in simulation of CPN models is a difficult beast.  Consider this example:

Example 1

Very simple model, right?  In CPN Tools, we have a 1% chance of binding n to 1 and a 99% chance of binding it to 2 (the place initially holds 1`1++99`2).  Try building this model in CPN Tools and executing 10 steps.  I’m 90% sure you get a marking of 1`1++89`2 on the place.  This is because in CPN Tools, fairness is ((Roughly; the truth is more complicated.)) based on tokens.  For this example, this seems fair.
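To make this concrete, here is a small Standard ML sketch (an illustration only, not the actual CPN Tools code) of token-based fairness for Example 1: each step consumes one token, chosen uniformly among those still present.  The tiny random-number generator is only there to keep the sketch self-contained.

    (* One list element per token; the place starts with 1`1++99`2. *)
    val initial = List.tabulate (100, fn i => if i = 0 then 1 else 2)

    (* Toy linear congruential generator; CPN Tools uses a proper one. *)
    fun lcg s = (75 * s + 74) mod 65537

    (* Remove the i-th token from a marking represented as a list. *)
    fun removeNth (x :: xs, 0) = (x, xs)
      | removeNth (x :: xs, i) =
          let val (y, ys) = removeNth (xs, i - 1) in (y, x :: ys) end
      | removeNth ([], _) = raise Empty

    (* Run k steps; return true when every step bound n to 2. *)
    fun allTwos (0, _, _) = true
      | allTwos (k, seed, marking) =
          let val seed' = lcg seed
              val (tok, rest) = removeNth (marking, seed' mod length marking)
          in tok = 2 andalso allTwos (k - 1, seed', rest) end

    (* True with probability 99/100 * 98/99 * ... * 90/91 = 90/100,
       leaving the place as 1`1++89`2. *)
    val result = allTwos (10, 42, initial)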

Now, consider the following two examples:

Example 2

You would think that executing 10 steps of t1 randomly and doing the same for t2 would yield similar results (i.e., that A1 and A2 would have the same marking with high probability).  You’d be wrong.  In fact, t1 has a high probability (66%) of executing with n=1, whereas t2 only has a 1% chance of making that choice.  That’s because in the first case CPN Tools chooses a token from B1 and searches for a token on A1 to match, whereas t2 does it exactly the other way around.  Yet the two models have the same enabled bindings and the same state-space.  All of this is caused by a really minor modification of the model.
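The figures are not reproduced here, so the following sketch hard-codes an assumed setup: A1 = A2 = 1`1++99`2, B1 = B2 = 1`1++1`2++1`3, and a guard that forces n = 1 for b = 1 or b = 2 and n = 2 for b = 3 (matching the bindings listed later in this post).  Under those assumptions, the two search orders give exactly the probabilities quoted above:

    (* Assumed markings (see the lead-in above). *)
    val aTokens = List.tabulate (100, fn i => if i = 0 then 1 else 2)
    val bTokens = [1, 2, 3]

    (* Assumed binding relation between b and n. *)
    fun nFor b = if b = 3 then 2 else 1

    fun count p xs = length (List.filter p xs)

    (* t1 first draws b uniformly from B1 and derives n via the guard. *)
    val pT1 = real (count (fn b => nFor b = 1) bTokens)
              / real (length bTokens)                  (* 2/3, about 66% *)

    (* t2 first draws a uniformly from A2 and sets n = a. *)
    val pT2 = real (count (fn a => a = 1) aTokens)
              / real (length aTokens)                  (* 1/100 = 1% *)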

Adding more transitions to the game ((Which could be a token-game 😉 )) only complicates things further:  CPN Tools randomly selects an enabled transition and executes it; consider this example:

Example 3

If you execute this for 10 steps, I’m 99.90% sure you get a marking of 90`2; the reason is that the two transitions are chosen with equal probability, so the chance that t1 (which can fire only for the single token 1) never fires within 10 steps is just (1/2)^10 ≈ 0.1%.

All of this confusion stems from the fact that the simulator in CPN Tools is designed to be fast.  Using this strategy allows us to execute transitions at a sustained rate of around 22 million transitions a minute on a modern computer.  Roughly, the implementation chooses a random transition and searches for an enabled binding in a random order.  If one is found, it is executed and the process restarts.  Again, I’m simplifying here; check Torben and Tommy’s master’s thesis [bibcite key=simulatorthesis] for all the details.
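The loop itself is easy to sketch.  Below is an illustration of that strategy (again, not the actual simulator code), with each transition represented abstractly as a name, a seeded search for one enabled binding, and an action that executes a binding:

    fun lcg s = (75 * s + 74) mod 65537   (* toy generator, as before *)

    (* Pick a random transition; search for one enabled binding; if one
       is found, execute it and count a step, otherwise try again.  (A
       real simulator also detects that nothing is enabled at all.) *)
    fun simulate (ts, _, 0) = ()
      | simulate (ts, seed, stepsLeft) =
          let val seed' = lcg seed
              val (_, findBinding, execute) =
                  List.nth (ts, seed' mod length ts)
          in
            case findBinding seed' of
                SOME b => (execute b; simulate (ts, seed', stepsLeft - 1))
              | NONE   => simulate (ts, seed', stepsLeft)
          end

    (* Toy usage: a single transition that is always enabled. *)
    val fired = ref 0
    val t = ("t", fn _ => SOME 0, fn _ => fired := !fired + 1)
    val () = simulate ([t], 42, 10)   (* afterwards, !fired = 10 *)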

All of this is of course horrible if you are doing simulation-based analysis.  The problem Boudewijn and I were talking about was an example similar to Example 2 above.  Boudewijn had made one solution and got a certain throughput for a production plant; some students had made the other and got a statistically significantly different result (IIRC, Boudewijn got around 90 with a 95% confidence interval of ±5, while the students got 120±25).

We can get a more predictable simulator by instead simulating using fairness on binding elements.  This means that instead of checking tokens in a random order and executing the first enabled binding, we compute all enabled bindings and choose among those.  For Example 2, this would mean we would get the same marking of the A places with high probability (>99.99%), namely 90`2.  This is because, initially, the two transitions would both have 3 enabled bindings (n=a=1, b=1; n=a=1, b=2; and n=a=2, b=3), which would be chosen with equal probability; since two of the three bindings set n to 1, the lone token 1 is almost certainly consumed within the 10 steps.  While not necessarily more fair than either of the old results, this is at least consistent and predictable.
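Under the same assumed marking and binding relation as the earlier sketch, local fairness on binding elements first enumerates the distinct enabled bindings and only then chooses among them uniformly:

    (* Same assumptions: A holds 1`1++99`2, B holds one each of 1, 2, 3,
       and the guard ties b = 1, 2 to n = 1 and b = 3 to n = 2. *)
    val aTokens = List.tabulate (100, fn i => if i = 0 then 1 else 2)
    val bTokens = [1, 2, 3]
    fun nFor b = if b = 3 then 2 else 1

    fun member (x, xs) = List.exists (fn y => y = x) xs

    (* All distinct enabled bindings (n, b): one per token value, not
       per token, so 99 copies of the value 2 still give one binding. *)
    fun enabled (aTok, bTok) =
        List.mapPartial
          (fn b => if member (nFor b, aTok) then SOME (nFor b, b) else NONE)
          bTok

    val bs = enabled (aTokens, bTokens)   (* [(1,1), (1,2), (2,3)] *)

Choosing uniformly among bs gives P(n = 1) = 2/3 per step, so the probability that the token 1 survives all 10 steps is (1/3)^10, below 0.002%.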

Choosing fairly among enabled bindings makes simulation more predictable but also slower, as we have to compute all enabled bindings of a transition instead of just executing one at random.  Consider the example:

Example 4

In this example, the expression List.tabulate(1000, fn n=>n) produces the multi-set 1`0++1`1++1`2++…++1`999, so all of the places have 1000 tokens.  Thus, the transition has 1 billion unique enabled bindings.  Of course, computing these would be expensive (taking at least a couple of seconds) and would require quite a bit of memory to store (at least 12 GB).  Yet CPN Tools can execute the 1000 enabled steps in less than a second because it does not compute all enabled bindings before execution.  In this example, in the initial state, we would compute 999,999,999 bindings which are not used, and 2,997,000 of those will never be enabled again.
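The arithmetic behind these numbers is easy to check (the 12 bytes per binding assumes three 4-byte values per binding, and the snippet assumes a 64-bit SML so the products fit in the default int type):

    val bindings   = 1000 * 1000 * 1000   (* one binding per combination of
                                             one token from each place *)
    val gigabytes  = bindings * 12 div 1000000000   (* 12 GB at 12 B each *)
    val afterOne   = 999 * 999 * 999      (* combinations of the 999 tokens
                                             left on each place after step 1 *)
    val neverAgain = bindings - afterOne - 1   (* = 2,997,000 *)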

Aside from not always being the most efficient option, choosing fairly among all enabled bindings may also not coincide with what we perceive as fair.  For example, consider again Example 1 above.  Here, it is intuitively more fair to give higher probability to choosing n=2 than to choosing n=1, whereas a fair choice among enabled bindings would assign both the same probability.

In the model in Example 3 above, it may also not seem fair that t1 and t2 are chosen with the same probability, as t2 can be executed for 99 tokens but t1 only for one.  Thus, we may want to extend the fairness to a fair choice between all enabled binding elements of the model; i.e., instead of first choosing a transition and then choosing one of its enabled bindings, we compute all enabled bindings of all transitions and execute one of them at random.  This is of course even less efficient, as we may have to cope with situations like Example 4 even if we want to execute a completely different transition.
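Here is a sketch of that global variant on Example 3, assuming (as the token counts above suggest) that bindings are weighted by the tokens they can consume, so t1 contributes one entry to the pool while t2 contributes ninety-nine:

    (* Each transition is a name plus its list of enabled bindings. *)
    fun pool ts =
        List.concat (map (fn (name, bs) => map (fn b => (name, b)) bs) ts)

    (* Example 3 under the stated assumption. *)
    val all = pool [("t1", [1]),
                    ("t2", List.tabulate (99, fn _ => 2))]

    val size = length all   (* 100 binding elements in the pool *)

A uniform choice over this pool fires t2 with probability 99/100 instead of the 50/50 split that choosing the transition first gives.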

Now, instead of purely speculating about which was better, I decided to test it out.  Actually, the test was done completely independently of these discussions for the paper [bibcite key=priority], but it tests exactly the three approaches explained here: the fast mode with fairness on tokens, the method with local fairness, and the method with global fairness.  The results are summarized in the table below.  The first three models are small toy examples included with CPN Tools, and the remaining ones are real-life models created in various projects (ERDP in [bibcite key=kris:erdp], DYMO in [bibcite key=dymo], and the various OSes in [bibcite file=conferences.bib key=osprotocol]).  The models with a prime are identical to the corresponding non-primed versions, except the marking has been changed to include many more tokens.  The algorithm called All Bindings corresponds to computing all bindings of all enabled transitions and randomly picking one; Priority Sorted picks a transition at random, computes all its enabled bindings, and executes one at random; and Algorithm 4 is the “old” fast implementation in CPN Tools.

Experimental Results

We see that the speed does indeed degenerate with the more involved algorithms.  The result is in fact even worse than shown here, as Algorithm 4 does additional work during simulation.  The degeneration is worst for the toy examples, though, as they have a lot of tokens and enabled bindings, whereas realistic models typically have fewer tokens and thus lose less from the computation.  This is good news, as it indicates that it may be possible to make an implementation of the notions of fairness discussed above.

For the record, the experiments were done using the code displayed here.

[/bibshow]

6 thoughts on “In All Fairness…”

  1. Example 2 leaves the impression that the transition guard determines from which place a token is picked and where a matching token is then searched for. Is this correct? For Boudewijn’s assignment I have two copies of a seemingly identical model (apart from cosmetic differences). These copies consistently produce different simulation results. The model that produces the ‘correct’ results was obtained by removing a place from the ‘incorrect’ model and restoring it using the undo functionality.

    1. That’s footnote 1 😉 The explanation above is simplified a bit for clarity.

      Consider this altered version of Example 4:

      This model has 1,000 enabled bindings, but they are among 1,000,000,000 possible assignments of tokens. This means that if we just naively try out random tokens for each arc, computing enabledness is very difficult. A human can easily see that the best way is to consume a token from one of the places and check that it is available on the others as well, making binding much easier.
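      To see just how bad naive random choice is here, assume the guard simply requires the three tokens to be equal (one reading of the description above); then only 1,000 of the billion combinations match:

          val assignments = 1000 * 1000 * 1000   (* one token drawn per place *)
          val matching    = 1000                 (* assumed: all three equal *)
          val hitChance   = real matching / real assignments  (* one in a million *)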

      CPN Tools instead generates a binding strategy for finding an enabled binding; such a strategy is devised individually for each transition. The strategy can involve picking a token at random if none of the variables on the arc are currently bound, checking if a token is available on an arc if some or all variables used on the arc have been bound, testing a guard, and binding using the guard. Each of these operations has a cost; for example, randomly choosing a token from a place has cost 100 and checking if a token is available has cost 10. The cost reflects how likely an operation is to lead to a successful, executable binding. For a binding to be successful, we must find a value for each variable and validate each arc and the guard.

      Now, we finally get to your question 😉

      In Example 2 (left), the best strategy is to randomly select a token from B1 (cost 100), bind n to the value of a using the guard (cost 1), and check if n is available on A1 (cost 10), for a total of 111. In Example 2 (right), this strategy is not possible (we can no longer bind n from a, as we apply a function to the right side of the equation), and the one resembling it would be to randomly select a token from B2 (cost 100), randomly select a token from A2 (cost 100), and check if the guard holds (cost 2), for a total of 202. A more efficient way is to instead randomly select a token from A2 (cost 100), bind a to 1 * n in the guard (cost 1), and check if a token matching the partial binding is available on B2 (cost 20), for a total of 121. We can do something similar for t1 by picking a token from A1 (100), binding a to n using the guard (1), and checking for the token on B1 (20), again for a total of 121, but the strategy described first yields only 111. Thus, the actual form of the transition does influence the probabilities.
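      Using the costs above, the comparison can be spelled out as follows (the operation names and the cost table are illustrative, not CPN Tools’ internal representation):

          datatype oper = PickRandom     (* draw a random token from a place *)
                        | CheckToken     (* check a token is on an arc       *)
                        | CheckPartial   (* check token matching partial bind *)
                        | BindGuard      (* bind a variable via the guard    *)
                        | TestGuard      (* test a fully bound guard         *)

          fun cost PickRandom   = 100
            | cost CheckToken   = 10
            | cost CheckPartial = 20
            | cost BindGuard    = 1
            | cost TestGuard    = 2

          val total = foldl (fn (x, s) => s + cost x) 0

          (* t1: draw from B1, bind n via the guard, check n on A1 *)
          val t1Best  = total [PickRandom, BindGuard, CheckToken]    (* 111 *)
          (* t2, naive: draw from B2 and from A2, then test the guard *)
          val t2Naive = total [PickRandom, PickRandom, TestGuard]    (* 202 *)
          (* t2, better: draw from A2, bind a via the guard, check B2 *)
          val t2Best  = total [PickRandom, BindGuard, CheckPartial]  (* 121 *)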

      As this illustrates, we can easily have more than one binding strategy. CPN Tools finds the best one by constructing a Petri net modeling all binding strategies and using state-space exploration to find the cheapest. The details are in the master’s thesis [2] above and, more concisely, in the paper Efficient Data-Structures and Algorithms for a Coloured Petri Nets Simulator.

      When we can have more than one binding strategy, we can naturally also have more than one with the same cost. In the example just above, we immediately have 3 (formally 6) different strategies with the same value: randomly choose a token from one place and test that it is available on the others (in some order). Which one is chosen depends on the search of the state-space, and hence on the order in which the model was constructed. If the marking of the places is asymmetric, this may influence the probabilities of different bindings. Deleting a place and undoing the deletion may influence that as well.

      Two other possibilities are that you have encountered a situation where the simulator was confused, or that you have used an inscription like “2@5” where a “2@+5” is needed or vice versa. The first case is not that likely if your model simulates, and it is especially unlikely if you have done the old “save, close, load” trick, which should always be employed when something weird happens. The second case should also be caught by the trick, though it may give weird results even then.

  2. I am quite surprised to find out that you are interested in fairness… Are you becoming a communist? Or socialist?

    1. Uhm… Mathematical notion… Different things… Crap – you got me, I’m now a Søvndal fan. With his ingenious opinions and exquisite English skills, he’ll be the best foreign minister ever!
