RFC: New Features in CPN Tools

I already wrote about conveniences that may or may not go into CPN Tools 3.6.  In this post I’ll look into some features that would be nice to add to CPN Tools 4.0 (or later).  I do not promise that any of these will become reality as they all require significant effort.  If you are interested in paying my time so I can implement them, feel free to contact me, however 🙂

These features are mostly independent of each other (sort of, the ASAP integration makes no sense if the simulator is rewritten).  They would all require significant effort, but I’m sure you would agree they are all interesting.  Which features would you like to see?  Let me know in the comments!

ASAP Integration

As you probably know, CPN Tools contains a state space analysis tool.  It is old, not that fast, and not that powerful.  It is immensely useful, however, as it makes it very easy to get simple results, and quite easy to write custom queries.

ASAP, on the other hand is much newer.  While it doesn’t implement the most advanced of techniques (as some are simply not applicable as we try doing state space analysis of the full colored Petri nets formalism as implemented in CPN Tools), it is much faster, more efficient, and modern than the one already in CPN Tools.

ASAP previously existed as an autonomous tool, but that it no longer maintained.  The fundamental code of ASAP still lives on, however, in the tummy of the CPN Tools simulator.  You can play with it here.

It would be interesting to expose this code further.  This should happen at different levels:

  • Support modern CPN Tools features
  • Compatibility API
  • Generate standard report
  • Use for manual exploration
  • (Optional) graphical composition of functors

ASAP has suffered a bit from feature rot, as new features have been added to CPN Tools without being represented in ASAP as well.  This includes prioritized transitions and time (though not new, ASAP does not understand this).  This should be implemented to on par with the old tool.

The compatibility API would implement all the search functions of the old state space tool on top of the ASAP structures.  Using these would probably not be much faster than using the old tool in most cases, but they would also be available when using advanced techniques.

The standard report can almost be generated by the ASAP code as it stands today, but it would need some tweaks to handle everything.  This is a must-have feature, as the standard report is the most used feature of the CPN Tools state space tool.

CPN Tools supports manually drawing the state space.  ASAP would be very efficient at doing this and could easily replace the old tool here.  The only problem is that in order to show incoming arcs (as CPN Tools does), it would need to compute the entire state space, making it as fast (slow) as the old tool.  An option to ignore this information, could alleviate this.

ASAP supports composing your own state space tool from components.  This is possible in so many ways that we even created an entire language just for this.  It would be nice to have some language for doing this (semi-)automatically, as use otherwise requires a lot of knowledge of the internals.  This is a huge endeavor, however, and unlikely to happen.

Java Simulator

For kicks and giggles, I started making a simulator in Java a couple years ago.  It’s currently at the level of screen-shot ware.  I think it would be really neat to actually make this.  It would be useful not only because Java is better known than SML and has much better library support, but also because Java is more actively maintained.

CPN Tools is currently written in Beta (the GUI) and Standard ML (the simulator and state space tool).  Most people at best knows Standard ML and I’m probably the person maintaining and writing the most Beta code in the world.  Standard ML is a very nice language (and the future of Standard ML probably lies in Microsoft’s F#).  Unfortunately, it is not super-maintained anymore.  Most importantly, it does not have support for operating system threads (yes, I know CML; no it uses light-weight threads and does not take advantage of multi-processing) and 64-bit processing.  This mostly affects the state-space tool, but it is still important.  There are also a lot of cool things I would like to add that are just infeasible in the current simulator; for an example see my point about web-services below.  The relatively secret life of Standard ML also means there are fewer good programmers that know it and can contribute code.  Coupled with the fact that the current simulator is a code-generating nightmare makes it very inapproachable for anybody who is not or have been maintainer of the code.  Did I mention that the simulator does not compile with the stock compiler, but requires a patched version that I also maintain?  Fun.

It is therefor obviously™ a good idea to port the simulator to a more well-known language. As I know Java the best and like the fact that it works well on the three major platforms (OS X, Windows, and Linux), this is my language of choice. Also, that way I can make anybody using the language suffer Eclipse’s endless rewrites.

Much of the rewrite I could do fairly easily as I have a good idea of how the current simulator works. I am quite certain that the JSR 199 implementation in javax.tools (which just happens to be made in a large part by my friend Peter Ahe, who was also involved with implementing wildcards in Java 5) should provide a good enough API for the code generation needs.  To parse Java code we’d need to use some internal classes (forcing people to use Sun’s ((Yeah, Oracle, but I don’t care.)) compiler) or another parser.

The only difficult thing is that we would need to add some things to Java.  First, we’d need some syntactic sugar.  This would mostly to make simple models carry over directly, and includes syntax for multisets and list handling.  Second, we’d need some new language features, most importantly pattern matching and tuples.  These features have been coming to Java for ages, but maybe basing the simulator on one of the languages living within Java like Scala would speed this up.  I would prefer something extending Java instead of a completely new language like Scala, though.

I already have a decent structure which has multiple layers, one of which is a nice, modern Java API and a compatibility layer making it possible to use the CPN Tools GUI with the new code.  There is still a lot of design work to do as well as the actual implementation.

I think this would be a really fun project and not that time-consuming (for me).  The problem is that while I in one afternoon can write the 3646 lines of code to make the above screen-shot and in perhaps a month or so have a simulator running, it would take maybe half a year of my time to make a simulator that is as stable and feature rich as the current one.  On top of that comes implementing a new state-space tool.  Again (though it would be easier this time and would have some features we can only dream of currently, such as multi-threading and access to much more memory).

Other features that would be easily possible using a simulator written in Java (thanks to the external libraries) would be something like exporting logs in XES format (instead of CPN Tools custom log format and simulation replication output), import of PNML (though that is probably more a GUI feature), and many other similar things that are next to trivial to do in Java but annoying in Standard ML as we have to implement everything ourselves.

New Web-based GUI

I already mentioned that the GUI of CPN Tools is written in Beta.  Aside from an academic approach to extend Beta, nothing happens in Beta land these days, and I doubt anybody even understands the syntax, which mostly looks like something a Japanese schoolgirl would write in a text message.  This means that nobody can extend the user interface of CPN Tools.

Not only that, Beta was written many a night ago when computer resources were scarce.  Thus the compiler (like the TeX compiler) has some hard-coded upper limits for various data elements.  One such limit is the number of AST (abstract syntax tree) nodes in a single file.  Put another way, there is a maximum size of files in Beta programs.  Beta uses a novel module concept, which has a consequence that some data structures need to reside in the same file.  Guess what, that file in CPN Tools is so large it is just under the hard limit, which means that extreme care must be taken to extend certain parts of CPN Tools.

The GUI of CPN Tools is very nice as soon as you are used to it.  Before that, less so.  It has a quite steep learning curve, which is a liability when teaching as students first have to learn a completely new user-interface before they can start on the actual content: learning how to model.  It would be nice to have a more standard user-interface, preferably even one where a lot of the advanced features are switched off.

All of this leads to the fact that a new GUI would be nice.  Preferably written in a commodity language and supporting plug-ins.  It would be nice to have a web-based GUI to avoid all kinds of cross-platform shenanigans (…and replace them by cross0-browser shenanigans 😉 ).

Such a GUI could conceivably be integrated with some of the external tools written around CPN Tools, most importantly Grade/CPN.

The biggest challenge of writing a new GUI is that writing GUIs sucks.  Especiually editors.  Especially, especially graphical editors.  Luckily there are tons of graph editors for the web out there, though they will probably take a bunch of customizing to use.  Having tried to use something like GMF, I am very suspicious of anything claiming to make it easy to make editors, however, especially if the look and feel of the framework is only almost good enough.  I estimate that a new GUI would take me around one year to write.  Given a couple good student programmers, I could do it in a year but only spend half of my time doing it.

Web Services

It is very interesting to use CPN models to control other things or to use CPN models as parts of other things.  Access/CPN 2.0 facilitates this to some extent, but it would be much nicer to have this as a native feature so we would not need any external programs.

One very elegant way to build that into CPN Tools is to add external transitions and exported pages using web services.  Like substitution transitions have a double outline to distinguish them from regular ones, service invocations could have a triple outline.

If we rely on something like SOAP and WSDL, we can even check that the interfaces match the places around the module.  We can make invocation asynchronous if we want; much like how sub-pages are asynchronous in the sense that “invocation” of a substitution transition is not atomic and output tokens may be produced after steps happen elsewhere.  It would also be interesting to implement the protocol we described for cosimulation with SystemC, which allows asynchronous execution but ensures that all participants are synchronized before time is advanced.  This allows fast distributed execution of timed systems.

Similarly, “certain” pages could be exported as web-services to be executed from external applications.

All of this would be fairly easy (especially because I now have implemented something similar at least 3 times – depending on how you count), but only if the simulator is written in Java.  There is no chance I’ll ever (EVER!!!) try my sanity by implementing SOAP web-services in SML myself.  I looked into that once and came out a sadder person.

Real (Parametrized) Modules

Pages in CPN Tools are really useful.  They allow you to structure your model.  Unfortunately, it is not easy to reuse useful implementations.  It would be easy to make a lot of reusable functionality, but it is just not easy to re-use.  Part of it is because declarations are global and part of it is because pages cannot be parametrized.  Thomas Mailund wrote a nice paper on parametrized CPNs a million (approximately) years ago.  See it here.

The basic idea of parametrized modules is two-fold: We move declarations to the modules themselves and we allow some types to be unspecified at definition time and only defined at use time.  That way a module becomes self-contained (it already has the declarations it uses) and generic (as we can make, e.g., a generic queue of any data-type).

Combining this with a Java simulator would be super-neat (and, I’d say, a prerequisite), as transitions would simply be an interface, a module a particular (parametrized) subclass, and a web-service invocation another.  Other implementations could be possible as well (e.g., with support for communication channels, driving animations, etc.).

Declare Constraints

Declare supports declarative modeling by specifying constraints between tasks (e.g., that you can only execute “True Happiness” after executing “Britney Concert” at the top center in the example to the right).

This is a nice and (in some cases) natural way to describe control flow.  Declare sort of sucks in describing data at the moment, however.  On the other hand, CPNs are really nice for describing data flow.  It is also possible to describe control flow, but it has a tendency of becoming a bit artificial and mixed with the data flow.  There are ways to remedy that (see, e.g., process-partitioned colored Petri nets), but that requires discipline on behalf of the user (or tool support).

Instead, one could mix the two (which is sort-of, half-way) implemented in Declare.  Then the control-flow would be described using Declare constraints and data-flow would be described using regular CPN.  Of course, one can completely ignore either perspective and focus on the one, one is interested in, but it is also possible to use both.  That would be good for declarative modeling (getting to a broader audience), but also for CPNs as it would be possible to automatically extract implementations due to the clear separation of control-flow and data.

I would not endeavor this without a Java simulator, as reimplementing Declare seems a bit counter-productive.

Conclusion

In conclusion, there are many very interesting directions to take CPN Tools, alas most of them rely on rewriting either the GUI or the simulator (or both).  Such a rewrite would unfortunately not produce any (or at least not enough) research to make it financially viable.  Still, I would be willing to take a year out of my calendar to try it if the financing was possible; both because it would be useful and because it would be fun.

10 thoughts on “RFC: New Features in CPN Tools

  1. I’m a beginner of SML, but I like SML and CPNTools very much. I have some questions and a concern about the Java simulator.

    1) Does MLton support the 64-bit memory system? If it does, may be porting CPNTools simulator from SML/NJ to MLton would be nice. Then, state space analysis would get a great benefit from this by having very large memory to work on. I know that MLton does not support OS-level threads, but the 64-bit memory benefit is attractive enough.

    I also like ASAP, since it provides the SML interface which is easy to use to develop new techniques for state space analysis. But, I found some strange things. When I ran some CPN models which generate a large number of states (> 70,000 states) in ASAP, it ran out of memory, but the classical CPNTools did not.

    So, I think that the 64-bit memory system would provide a great benefit for ASAP.

    2) In the Java simulator, can we write SML functions in CPN models and SML queries for state space analysis? How about writing SML programs for state space computation?

    3) I also have a concern. I think many existing researchers in CPNTools are familiar with SML and I’m sure that they like SML very much. So, moving from SML to Java may be a big change and Java is not a declarative programming. I am worry that some researchers who do not like imperative programming may stay with the old SML simulator or, even worst, they may think about another tool.

    In summary, CPNTools may lose old users . I don’t want this to happen. I like CPNTools very much. But this is my concern and I would like to discuss about this with you and all others. Also, please don’t take me wrong. This is just my worry.

    Best regards,

    WilBur.

    1. Thanks for the feedback.

      1) MLton does support 64 bits and (simple) threads, and we have had a simulator running on MLton. MLton is completely compiled, however, which makes it impossible to use in CPN Tools. In CPN Tools we need to be able to generate and evaluate code (and many more details). This is not possible in MLton, and hence it will never be the main compiler. the infrastructure to set up MLton is too great at this time. We would need to ship the entire simulator source, keep track of generated code, combine all of this and build an executable, and then we can run analysis. Forgot to add a query or made a syntax error? Well, you have to recompile and as MLton is a whole-program compiler and optimizer, that’ll take a LONG time.

      ASAP can easily handle more than 70.000 states; I’ve generated state-spaces with more than 10.000.000 states (though using a reduction technique). You must do something that is not tail-recursive.

      2) No. Instead you can write Java functions and Java queries. Java will have syntactic sugar added for lists and multisets, so “1`1 ++ 1`2” is still legal as is “[1, 2, 3]”. This actually already works in my quick and dirty tech preview.

      3) Most users of CPN Tools are not that familiar with SML. They may grudgingly write a simple function, but I only see complex functions from very few people. SML is not declarative at all, this is all added by CPN Tools and will still be available. SML is functional, meaning you cannot store state. The only important feature SML has that Java lacks is (native) pattern matching.

      The idea is to make the new simulator so most simple models can be used directly and only few declarations need to be changed for complex models. Any model using undocumented features will of course break, but I warn against that all the time.

      I agree that some users may be lost, but I think that the benefits by far outweigh that. It will be much more accessible by anyone using Java, and further development will be possible. Currently that is not possible as we have to invent EVERYTHING whenever we want to try something out. This is why web-services are not integrated, though it would be very interesting and useful: we would have to reinvent web-services in SML and that is just not worth the effort. In Java we could include a single library and be done in less than a week.

  2. I can see the difficulty of using MLton in CPNTools, and I do respect your solution of Java simulator as a future development of CPNTools.

    1) I have a question about the following of your reply.

    > ASAP can easily handle more than 70.000 states; I’ve generated state-spaces with more than 10.000.000 states (though using a reduction technique). You must do something that is not tail-recursive.

    Do you mean that I should avoid using tail-recursion in my state-space computation program? (Now I am developing my own program and algorithm for a state space computation.)

    2) Right now, what is the best way to compute a state space analysis which requires a large amount of memory in CPNTools? Would it be the Remote simulation on another machine? I mean to run my own developed state space analysis program, not the built-in analysis.

    Thank you very much for your suggestion in advance.

    Best regards,

    WilBur.

  3. > we have had a simulator running on MLton.

    One more thing. By the above sentence, do you mean that you have already ported SML/NJ simulator to MLton simulator? I can see that the compilation process in MLton will take a long time, but someone who does a serious state space analysis may like to try this.

    1. It means we have ported one example to MLton. The CPN simulator comprises (at least) 3 layers: generic simulation code, model-specific simulation code, and code to generate the model-specific code. For one model, we have ported the generic code and the model-specific code. It requires changes to the generic code that are incompatible with the SML/NJ simulator.

      I agree it would be interesting, but the speedup is not really worth it (2-4 times) compared to the effort it would take making it user-friendly.

  4. I think that MLton-based CPNTools would provide the following advantages:

    1) It offers an analysis of an extremely huge state space with or without a reduction technique. This will certainly enhance CPNTools capability to compete with other tools in terms of the analysis of a huge state space. Also, new users will find this very attractive, since new users may not know well about any reduction technique.

    2) This would be a way to keep old users who like SML with CPNTools. Then, those old users can further develop their own methods or techniques for a state space analysis, or even contribute to MLton-based simulator or ASAP in the future. So, Java-based system would attract new users, while MLton-based system would keep old users.

    On one hand, Java-based CPNTools or SML/NJ-based CPNTools can be seen as a fully automated and graphical system which allows a development of models and an analysis at a preliminary stage. On the other hand, MLton-based CPNTools can be seen as a manual and text-based system which allows a serious state space analysis.

    So, initially developers create and test their models with Java or SML/NJ-based CPNTools. After they are sure about the correctness of the models, then they can use MLton-based CPNTools to do a serious analysis.

    Also, since you have already ported the generic simulation code to MLton and have some examples of other codes, to provide MLton-based CPNTools may not be too much burden to you. Of course, it would be nice if you could provide a brief guideline on how to port other codes (eg. model-specific simulation code, code to generate the model-specific code) to MLton. Also, a shell script on how to compile the whole system would be nice. I mean that a semi-automated & manual MLton-based CPNTools would be sufficient.

    In summary, this is a very simple solution which could not only attract new users (by the analysis of a huge state space) but also keep old users. With the ability to analyze an extremely huge state space (millions states), CPNTools will not be just an intuitive tool due to the graph models, but also would provide a serious state space analysis. Also, this solution may not require too much work from you, since you already did the main part (correct me if I’m wrong).

    Please consider this. Just a little more work might have a big impact for old users and for CPNTools.

    1. Despite the fact that the ported versions are on a computer which has probably been reformatted by now and is 800 km away from me, there are five issues with supporting MLton.

      First, MLton would not provide huge state-spaces; current versions already grow to ~2.3 GiB before running out. Due to the data-structures, this corresponds to approximately 4.6 GiB + 50% overhead on a 64-bit architecture or around 7 GiB. Thus, you’d only get an advantage when you have more than 8 GiB memory.

      Second, the changes are incompatible with the SML/NJ code. This means I would have to maintain two branches or add even more platform-dependent code. I already maintain 3 platforms (though OS X and Linux are quite similar). Adding MLton would double that to 6 (4) and officially supporting 64-bit architectures further to 12 (8). That’s a lot of testing and compilation…

      Third, MLton would make it impossible to do queries. You would have to write and debug your queries, ALL your queries, before compilation. If something goes wrong, you have to do the full analysis again. There is no interactive step where you issue queries and get feed-back. If you already need to test your queries in an interactive environment with a smaller state-space, why redo the analysis for a larger version? Is that going to give you better confidence?

      Fourth, I have developed a data-structure, which makes it possible to storing more than 40 million states using the current infra-structure. That’s before reduction using the sweep-line method or similar methods. It is described in my paper “Modeling and Verification of a Protocol for Operational Support Using Coloured Petri Nets”, see conference publications from 2011 at https://westergaard.eu/cv/publications/

      Fifth, the code we had running was using ASAP only. This is not compatible with the current state-space tool in CPN Tools. Either, you could not make the same queries, or you would have to port the state-space tool of CPN Tools or make a compatibility layer for ASAP.

      So, in conclusion, yes this is possible. Yes, this would be nice to have. The problem is that you underestimate how much time it would cost me, that I’d rather spend doing more interesting things (like implementing the new data-structure or making ASAP available in the GUI). If you are interested, you are very welcome to look at it yourself – I’d be happy to help you finding your way around in the code.

  5. I’m sorry that I assumed that you have the MLton-based simulator with you now. In this case, I don’t know if I’m capable enough to do the porting, even though I’m quite interested in it. I will think about it. And I have a few questions.

    – Suppose that I’m interested in only ASAP queries. What are the source codes that need to be ported? I can see that the simulator and ASAP in the Source section in the CPNTools web are required. What else?

    – I don’t quite understand the following of your sentence. What is the reason for the correspondence between 2.3 GB in the current CPNTools and 4.6 GB in 64-bit MLton?

    > ; current versions already grow to ~2.3 GiB before running out. Due to the data-structures, this corresponds to approximately 4.6 GiB

    – Referring to my earlier question about the slow performance in my computation in ASAP. Do you mean that I should do tail-recursion, if possible, in my model? This is because tail-recursion offers a better performance than a non-tail recursion. Do I get it right?

    1. To make ASAP work, you need the generic simulator code, including disabling all the code-generating code (which cannot be ported), the model-specific code (only minor changes IIRC) and also minor changes to aSAP, which mostly runs on MLton. Last time we more or less just compiled the whole shebang and removed parts unnecessary and changed a few parts that had to.

      The reason a 64-bit version would take up twice as much memory is that most of the data-structures are pointers. 64-bit pointers take up twice as much memory as 32-bit pointers.

      Porting ASAP will not fix your speed issues – it’s something more fundamental. I can compute hundreds of thousands states a minute. Tail recursion is “always” needed, especially when dealing with stuff that happens a lot. You can add to a list non-tail-recursively as long as the list has less than 10 elements (i.e., it often doesn’t matter for any model you can do state-space analysis of), it matters very much for the mail loop of a state-space analysis algorithm. Tail recursion allows the compiler to change it into iteration, making it possible not to save the stack. Without tail-recursion, you need to keep the stack (really closure) for each call, which is very expensive in space and hence time.

  6. Thank you for the information, and all your advice.

    I will come back to you later if I decide to do the porting.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.