User Scenarios Considered Harmful

Programming can be complex.  Modern systems include many inputs from users, each of which needs to be checked and processed before results are returned to the user.  Even a simple user-facing interface may support many usage scenarios and many error scenarios.  Here I’ll make a case for not trying to enumerate these scenarios exactly, but instead focusing on the steps needed to fulfil the requirements and leaving it to the computer to figure out the details.

I exemplify this using real-life examples.  Contrary to my time at university, where “real-life example” meant “bullshit I thought up that seems vaguely realistic and where this approach might have been usable had we not just done something simpler instead,” it here means actual examples from the real world where I have applied this technique.

I’ll also demonstrate how this can be, and to some degree is, supported by formal methods, which allow us to explore the scenarios if desired or to precisely specify the behavior.

Background

In 1968, quite possibly the most famous Eindhoven computer scientist, Edsger Dijkstra, published a paper called “Go To Statement Considered Harmful.”  It presented the position that rather than using conditional and unconditional jumps, programmers should embrace structured programming, where programs are split into blocks and control structures handle the flow between blocks.


This way of thinking led to structured programming, which vastly simplified programming.  The improvement arose because programmers no longer had to deal with the nitty-gritty details of jumps, such as the fact that any piece of code could in principle jump to any other piece of code at any point in time.

There was a lot of discussion of this around the time of publication, as proponents of the go to statement touted the added flexibility, argued that it was closer to the machine and hence more efficient, and claimed that there were genuine problems that could not adequately be solved using structured programming.  These problems were solved by adding more control structures as needed and by developing better optimizing compilers.  One of the last problems of structured programming was solved with the invention of continuations or exceptions for error handling.  Today only the most neck-bearded of nerds would even consider unstructured programming for anything but the most low-level optimizations.

As time progressed, the go to statement became the target of ridicule, to the point that INTERCAL was invented in 1972 as a parody, discarding the go to statement in favor of the “come from” statement.  Instead of specifying that the flow of the program should jump to another place, the statement indicated that during execution, the flow could be intercepted from anywhere.  It is wildly more unreadable; instead of having to take care that flow of control could come from anywhere, you had to take care that flow of control could be intercepted at any point.  Of course, no joke is so good that somebody won’t eventually implement it in earnest, as happened with aspect oriented programming and integration frameworks like Apache Camel.

In the end, structured programming won out over unstructured programming using the go to statement, and complexity was reduced by allowing computers to handle more of the underlying details of programming.

Programming paradigms

Programming has been through many paradigms.  Multiple paradigms exist at the same time, and some have yet to catch on (or may never do so).  I’m here using paradigm in the archaic meaning where it actually means paradigm, instead of where it means “vaporous dot com venture capital scam.”

Probably the oldest is the imperative paradigm, where the program more or less is a recipe telling the computer exactly what to do.  Pure imperative languages don’t really exist anymore, as most of them get objects tacked on.  Examples are BASIC, COBOL, C (without any ++, # or Objective) and Pascal (before any Turbo/Borland/Delphi editions).

The object oriented paradigm shifts focus to data, where programs are essentially composed of little machines with internal state and behavior, and the opacity of the shell of the machine varies from the transparent Scandinavian SIMULA school to the opaquely encapsulated American Smalltalk school.  As an aside, I once sat next to the inventor of encapsulation, David Parnas, during a conference dinner.  Nice guy, who was surprisingly receptive when I told him that even though everybody else at the table thought it was brilliant, I did not entirely believe in his idea to teach linear temporal logic to end-users.  Modern examples of object oriented languages are Java, C++, C#, Objective-C, Python, PHP, JavaScript and just about any language you can throw a stick at.

There are a couple of less common paradigms, such as logic-based programming, where programs are essentially logic processing engines which are given axioms and then start an infinite loop choking on the logic (or, when it works, elegantly work out logical conclusions themselves).  The only example anybody has ever conceived of as an example of logic programming is encoding their own genealogy using PROLOG, allowing them to figure out who and what a grand cousin actually is.

Another older paradigm is functional programming, which is similar to imperative programming except computation is separated into functions without mutable state.  That means a function needs everything required to perform its job passed as parameters, and the only result of the function is what it returns.  Functional programming is especially well-tailored for elegant implementations of the factorial function.  It is also an emerging trend, as the lack of shared state makes it extremely well-suited to parallel programming.  Examples of functional languages are Standard ML, F#, and Erlang.

Many things people think of as paradigms fall under the above 4 categories; for example, client-server architectures are just imperative programming with networks, the cloud is imperative programming with the internet, component-based architectures are object oriented programming renamed, agent-based architectures are object oriented programming with networks, and web-service architectures are object oriented programming with the internet.

Constraint programming

Another paradigm I first heard of in 2003 is concurrent constraint programming.  It was presented by Mogens Nielsen during a course I attended in Eichstätt, Germany.

Here’s an illustration of CCP:

[Figure: rules with guards and actions surrounding a shared information domain]

At the center we have a shared information domain or shared data space.  That’s just the data we are working on.  It may be the entire world, the entire hard drive in your computer, the file we are working on, or an even smaller set of data.  We have a number of rules interacting with this shared information domain; 8 in the figure.  Each of these rules has a guard and an action part.  If the guard is true, the action part can be executed.  That is, the rules are constrained by their guards, and all rules can be executed concurrently.  Hence the name concurrent constraint programming.
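To make the guard/action idea a bit more concrete, here is a minimal sketch of how such rules could be represented in code.  This is not from any particular framework; the Rule interface, the store parameter, and the naive scheduler are names I am inventing for illustration, and a real CCP implementation would schedule enabled rules concurrently rather than in a loop.

[java] // Hypothetical sketch of CCP-style rules: a guard over the shared data space and an
// action that updates it.  All names are made up for illustration.
interface Rule<S> {
    boolean guard(S store);  // may this rule fire in the current state?
    void action(S store);    // update the shared information domain
}

class RuleEngine<S> {
    private final java.util.List<Rule<S>> rules = new java.util.ArrayList<>();

    void register(Rule<S> rule) { rules.add(rule); }

    // Naive sequential scheduler: keep firing enabled rules until none are enabled.
    // A genuinely concurrent implementation could fire independent rules in parallel.
    void run(S store) {
        boolean fired = true;
        while (fired) {
            fired = false;
            for (Rule<S> rule : rules) {
                if (rule.guard(store)) {
                    rule.action(store);
                    fired = true;
                }
            }
        }
    }
}
[/java]

Each of the 8 rules in the figure would then be one such Rule instance, all sharing the same store.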

CCP has no ties at all to CCCP but a lot of ties to logic programming, which I’ve already made fun of (er, mentioned) above.  Where logic programming also deals with a shared information domain and rules interacting on it, it is based on a provably undecidable logic, first-order logic, which means computation often fails unless you are very careful.  CCP instead focuses on the individual steps of the computation in the form of rules.

In the real world, we work pretty much like CCP.  Instead of a car analogy, consider your daily life.  You have a rule for emptying the garbage bin when it is full.  There’s a guard: “when garbage bin is full” and an action: “empty garbage bin over fence to the neighbor.”  You have a rule for putting on music “when Britney isn’t playing, put on Britney record.”  A rule for cooking dinner: “when around 7 in the evening, cook dinner.” A rule for eating dinner: “when dinner is ready, eat dinner.”  A rule for cleaning plates after dinner: “when other people visit, clean kitchen.” And so on and so on.  You have many such rules; some run at all times (like the music and garbage bin rules) and some only when needed (for example you may have a rule set for making stewed cabbage or for doing groceries, which are only invoked when cooking or shopping).

An Example: Customer data retrieval

Instead of trying to explain the CCP paradigm in further detail abstractly, let’s look at concrete examples.  For a customer, I was to implement code to retrieve customer information.  This would be extremely simple, were it not for a couple of facts:

  1. data was to be retrieved using 6 different identifiers (such as name, e-mail, 3 kinds of customer id, etc)
  2. data was to be retrieved from 5 different data sources each searchable on different identifiers
  3. data was sometimes available from multiple sources in which case prioritization took place
  4. different subsets of all the available information would be needed at different times
  5. the data sources have different performance profiles so data should only be looked up in sources if necessary

The initial specification only supported 4 identifiers, 2 data sources and 5 fixed subsets of information, and it ignored requirements 3 and 5.  Yet it was still an error-ridden, overly large document because of the number of possibilities (in this case “only” 5 subsets * 4 identifiers = 20 combinations).  And it only contained the fair-weather scenarios, i.e., very little error handling was specified.

As time progressed, the number of subsets, identifiers and data sources increased and requirements 3 and 5 were added.  The number of scenarios was so large it was impossible to ensure consistency, correctness and comprehensibility.

Instead of trying to think of all these usage scenarios, I split the task up into two rule sets: one for looking up data from one source, and one for extracting data.

Extracting the data was the simplest; I assumed we had access to all data.  Then for each data value, I devised a look-up strategy which was just “when the value is available in the highest prioritized source, use that,” “when the value is not in the highest prioritized source but is in the second highest prioritized source, use that,” and so on.  This can be realized as a very simple and readable sequence of if-statements in any language.

Looking up data was also reasonably simple: “when we have one of the identifiers needed to look up all data in a source, do that” and “when we don’t have an identifier needed to look up the data for a source, look up such an identifier.”
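To tie this back to the guard/action sketch above, here is how the lookup rule for a single data source might look.  Again this is only an illustration under my own assumptions: the DataSource interface, the string keys, and the map used as the shared store are invented, not taken from the actual project.

[java] // Illustrative only: one lookup rule per data source, reusing the hypothetical Rule
// interface from the earlier sketch.  The shared store maps field names to values.
interface DataSource {
    java.util.Set<String> acceptedKeys();  // e.g. {"Id"} or {"Email", "Id"}
    java.util.Map<String, Object> lookup(String key, Object value);
}

class LookupRule implements Rule<java.util.Map<String, Object>> {
    private final DataSource source;
    private boolean done = false;  // each source is queried at most once

    LookupRule(DataSource source) { this.source = source; }

    // Guard: "we have one of the identifiers needed to look up data in this source."
    public boolean guard(java.util.Map<String, Object> store) {
        return !done && source.acceptedKeys().stream().anyMatch(store::containsKey);
    }

    // Action: perform the lookup and add whatever the source returned to the shared store.
    public void action(java.util.Map<String, Object> store) {
        String key = source.acceptedKeys().stream().filter(store::containsKey).findFirst().get();
        store.putAll(source.lookup(key, store.get(key)));
        done = true;
    }
}
[/java]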

Running these two rule sets together makes it possible to look up any property using any identifier.  The system even supports prioritization of sources and only consults the sources needed.  On top of that, the specification became much simpler, as it was separated into “strategy for looking up data” and “prioritization of data sources.”

Formal Modeling

This is almost easy to specify using my favorite modeling formalism, colored Petri nets.  Petri nets are bipartite directed graphs where some nodes are drawn as rectangles and called transitions, and others are drawn as ovals and called places.  Places represent data – together they constitute the shared information domain from CCP, but in a distributed setting.  Transitions together with the edges/arcs represent actions and conditions, i.e., rules from CCP.

You can get CPN Tools here and get a copy of the model for your own enjoyment here.

[Figure: overview of the CPN model]

The model has 4 layers: at the top, we have the user-initiated action: a query for information.  Then come all the possible keys, the lookups in the 3 modeled data sources, and finally the data values we consider as outputs (the keys are also outputs).  A query provides any of the key values, in this case an Id:

[Figure: the model state after a query providing an Id]

Provided with an Id (notice the green token count 1 to the right of the Id place), we can look up in data source C (notice the green halo around the blue subpage tag under the Lookup C transition).  This is modeled as a double arrow between the lookup and the Id – it requires (but does not consume) an Id to be executed.  It produces an Email (the unidirectional arrow indicates this) and a Phone number.  Performing this lookup therefore leads to this state:

[Figure: the model state after the lookup in data source C]

We now have both an Id and an Email key available, which allows us to continue the search with a lookup in datasource B.  This will produce a Name, a Fax number, and a Homepage URL:

[Figure: the model state after the lookup in data source B]

With the Name available, we can do the Lookup in source A, yielding the remaining values:

[Figure: the final model state after the lookup in data source A]

These are screenshots from the actual model, though some complexity has been hidden using the module concept of CPN Tools.  Revealing this complexity yields a model like this:

[Figure: the full model with all modules revealed]

The complexity comes from the fact that we actually provide two A lookups – one based on Name and one based on SSN – and a bit of book-keeping to prevent multiple lookups.  Even with all of this book-keeping, the individual modules are not terribly complex; here is the most complex, the implementation of Lookup A:

[Figure: the Lookup A module]

The module only deals with the fields pertaining to it and uses the same notation of a double arrow to indicate non-destructive reads and single arrows to indicate that we “produce” a value.  The little house-things above the transitions indicate that they may be executed at most once (it does not make sense to perform the same lookup more than once).  The stricken-out line with circles for heads between the two lookups ensures that we only look up in source A using one of the possible keys (Name or SSN).

This model allows us to experiment with any type of query.  We can see possible ways of executing the various lookups as responses to the various queries.

It is worth noticing that at no point in the development of the model did we consider user scenarios.  We specified that we could get 4 different kinds of keys, and we specified the different kinds of lookups we could do at each datasource, including what we would give as input and what we would get as output.  The tool resolved the scenarios for us, and we can either use the resolved scenarios in our implementation, or make the implementation resolve the scenarios for us, just like the model.  We could extend the model to handle requirements 4 and 5 (only looking up in the data sources necessary for the requested subset of values) without altering what we already have, simply by specifying which values to request.

The shared data space is the distribution of tokens on the places: which values we have already looked up.  The transitions are rules, and the guards are the data dependencies.

Implementation

The implementation is fairly simple.  You can get the full demo sources from Git here, but the gist of it is encompassed in this snippet:

[java] public class CombiningCustomer implements Customer {
    private FutureTask<Customer> ca, cb, cc; // one lazily-run lookup per data source

    public CombiningCustomer(DatasourceA a, DatasourceB b, DatasourceC c, IdType type, Object id) {
        if (type == null) throw new IllegalArgumentException();
        ca = new FutureTask<>(new Callable<Customer>() {
            @Override
            public Customer call() throws Exception {
                switch (type) {
                    case NAME:
                        return a.lookup(type, id);
                    case SSN:
                        return a.lookup(type, id);
                    default: // any other key: resolve a Name via datasource B first
                        return a.lookup(IdType.NAME, getB().getName());
                }
            }
        });
        // Similarly for B and C
    }

    protected Customer getB() throws InterruptedException, ExecutionException {
        if (!cb.isDone()) {
            cb.run(); // run the lookup lazily on first access
        }
        return cb.get();
    }

    @Override
    public String getAddress() {
        try {
            if (getA().getAddress() != null) return getA().getAddress();
        } catch (InterruptedException | ExecutionException e) {
        }
        try {
            if (getB().getAddress() != null) return getB().getAddress();
        } catch (InterruptedException | ExecutionException e) {
        }
        try {
            if (getC().getAddress() != null) return getC().getAddress();
        } catch (InterruptedException | ExecutionException e) {
        }
        return null;
    }
[/java]

The idea is that we create a future representing customer data from each of the 3 sources in the example (l. 2).  The lookup logic is set up in the constructor.  Datasource A can look up based on Name and SSN directly (ll. 10-13); for all other keys, we first look up the customer using Datasource B, extract the name, and use that for the lookup (l. 15).

We look up the customer using Datasource B explicitly (via getB()) instead of delegating to the prioritized getter, to avoid conflicts between the lookup strategy and the prioritization.

We access elements of the futures using getters as in lines 22-27 because of how futures work in Java: a FutureTask does not execute until it is run, so the getter runs it lazily on first access before retrieving the result with get().

Data extraction happens as in lines 29-44.  We simply go through the datasources according to their priority.  In this example, datasource A always has the highest priority and C the lowest for all data values.  We can of course ignore individual data sources for some values and prioritize the look-ups differently for each value.
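For instance, a getter for a value where we prefer datasource C could simply reorder the same pattern (assuming getters getA() and getC() analogous to getB() above); the ordering below is made up for illustration, not the project’s actual priority:

[java] // Hypothetical example: for phone numbers, prefer datasource C, then A, and skip B.
@Override
public String getPhone() {
    try {
        if (getC().getPhone() != null) return getC().getPhone();
    } catch (InterruptedException | ExecutionException e) {
    }
    try {
        if (getA().getPhone() != null) return getA().getPhone();
    } catch (InterruptedException | ExecutionException e) {
    }
    return null;
}
[/java]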

This approach completely hides the exact lookup strategy.  The strategy is not just dependent on the scenario input by the user, but also on the data; the code checks that each value is set, so if a data value is not set in one source, the next one is consulted.  Hence if one customer doesn’t have a phone number registered in datasource A but another does, the lookup strategy may turn out different for the two.

The most important part is that the code paths we are immediately interested in work, and so do a large number of unplanned strategies.  By removing the focus from the scenarios, we have moved the focus from how to solve the problem to what we are trying to solve.  We have described what to look up and a couple of dependencies, and the details are up to the computer.

We satisfy all requirements above, and it is easy to add new data sources and keys.  This has in fact happened during the project.

Evaluation

Instead of thinking in flows of control, we think in terms of data and dependencies.  We have specified how to produce various pieces of data from other pieces of data and allowed the computer to resolve the details by itself.

In the example, we relied on futures and made explicit decisions on priority (for example, if we have both a Name and an SSN available, we prefer using the Name).  It is also possible to realize many of the concepts using asynchronous or reactive programming, with a bunch of components corresponding to rules just waiting for inputs (i.e., their guards to be satisfied).  We can also go completely the other way and serialize all operations in the code.  This makes it easier to follow locally what happens in each step, but makes it harder to get a view of the overall behavior.
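As a rough sketch of what the asynchronous variant could look like in plain Java (the lookup methods and result types here are invented for illustration), each lookup can be chained onto the future producing the key it depends on, so the data dependencies rather than a scripted scenario drive the order of execution:

[java] // Hedged sketch of a reactive-style pipeline; lookupA/lookupB/lookupC and the result
// types are hypothetical.  Each stage waits only for the key it needs.
java.util.concurrent.CompletableFuture<CResult> c =
        java.util.concurrent.CompletableFuture.supplyAsync(() -> lookupC(id));
java.util.concurrent.CompletableFuture<BResult> b = c.thenApply(cr -> lookupB(cr.getEmail()));
java.util.concurrent.CompletableFuture<AResult> a = b.thenApply(br -> lookupA(br.getName()));

String fax = b.join().getFax();  // a consumer blocks only on the futures it actually needs
[/java]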

Going further with the formal specification

So far, we have just used CCP as a way of (re)organizing our thinking.  We can in fact take a model like the CPN model above even further.  This section is quite theoretical and does not provide much practical use, so if you just want stuff you can use, feel free to skip directly to the next section, our second example.

The customer data retrieval example was simplified from the original case.  In the original case, there was a rather large overlap between the data returned from the different data sources, and also between the keys used by the various services.  This complicates the lookup phase; in the example we actually always had exactly one available lookup method, whereas in the real case, we have multiple.  Which one should we use?  It might make a difference, as one strategy might make it possible to avoid looking up data at one source, whereas another forces us to make a lookup at the most time-consuming datasource.

Another issue is whether various lookup strategies yield the same result.  In the example above, we see that at the end of the execution, the Name place contains two tokens, indicating that we have values for this field from multiple sources.  This is acceptable for output values, as we have an explicit prioritization strategy in place, but for keys it might mean we have a value from a lower-prioritized source available before a higher-prioritized one during lookup.  We need to either check that these values are always the same (as a requirement on the datasources) or define a particular order to ensure consistent responses (preferably the first!).

We also didn’t deal with whether it was always possible to resolve a user (in this example it is) and whether it is possible to always obtain all output values (again, here it is).  As the system this example comes from only provides a best-effort lookup, we can ignore this, but if we use the strategy in a setting where we have to ensure all values are looked up if available, it would be nice to be able to check this.

The great thing is that we can!  By using the state space analysis technique, we can generate all possible executions of this system and check all of these questions and more.  The state space is a directed graph where each node corresponds to a state of the system and each arc corresponds to executing a transition/rule leading from one state to another.  The entire state space of our example is shown below:

[Figure: the full state space of the model]

I have highlighted the example execution we saw above in red; it corresponds to the “Id” path.  The two paths meeting in the middle correspond to the SSN and Name paths.  They meet after a lookup in datasource A, indicating that either of those keys leads to the same amount of knowledge.

The final states (those without outgoing arcs) indicate that our computation always finishes (that, and the fact that we have no loops).  The simple structure of the state space can serve as a blueprint for an implementation in this case: we just have to consider three different cases, one with a branch at the front.  We might have noticed that the SSN and Name paths were the same by thinking in scenarios, but we also might not have.

The reason that all paths don’t converge at the end is that we might get multiple values for three of the keys: Email, Name, and Id.  This can be observed by looking at the bounds on the number of tokens of all places in all configurations:

[raw] Best Integer Bounds
                           Upper   Lower
  Hierarchical'Address 1     1       0
  Hierarchical'Age 1         1       0
  Hierarchical'Email 1       2       0
  Hierarchical'Fax 1         1       0
  Hierarchical'Homepage 1    1       0
  Hierarchical'Id 1          2       0
  Hierarchical'Name 1        2       0
  Hierarchical'Phone 1       1       0
  Hierarchical'SSN 1         1       0
[/raw]

This can be computed automatically regardless of the size of the state space.  Each of the three distinct branches corresponds to a branch with an extra copy of one of the three keys.  We can check that at the end of each query, we indeed look up all values.

This might seem like overkill for such a simple example, but let’s not forget that this is a simplified version of the real world.  If we allow datasources to not contain all key values, we get a state space with 36 nodes instead of the 14 before.  Just a fragment of it looks like this:

[Figure: fragment of the state space when datasources may not contain all key values]

We see that we have a lot more terminal states, where the search cannot progress.  In fact, we have exactly 20 such states in this example.  That is understandable: if a datasource does not return a key necessary for the next datasource lookup, the search has to stop.

All in all, we can do a lot of analysis of such models if we want to, but the beauty is that we don’t have to.  The complexity we see in the state spaces is the complexity we would have to deal with in an implementation and specification if we think in user scenarios rather than rules.  As illustrated, this becomes particularly egregious when we need to deal with error states that can happen in each step.

Another Example: Document conversion

Another project has to do with document conversion.  We have two types of messages, Requests and Responses.  Each of these messages comes in two different versions, 1 and 2.  Version 1 messages can furthermore be sent as XML files or as text files while version 2 only comes in an XML variety.

Any version 1 Request message can be converted to a corresponding version 2 message, and version 2 Request messages can be converted to version 1 in both XML and text.  For Request messages, we in addition generate a Response message corresponding to the original Request message.  Response messages are converted from text to XML.

We have three calls: convertVersion, convertXML and generateResponse.  These calls can fail.  If the generateResponse call fails, it does not make sense to call convertVersion (the failure indicates the input file is already a Response message), and we’ll instead call convertXML for text files.  In other words, we have 3 format variants * 2 message types = 6 input scenarios.  For these, we have up to 5 failure scenarios: generateResponse or convertXML fails, convertVersion fails or returns successfully but produces warning messages, or none fail.  Some combinations do not make sense (convertVersion and convertXML would never be called in the same flow), but rather than dealing with at least 12 and up to 30 scenarios, we can just turn to CCP.

In this case, we do not need to rely on futures or such trickery; the entire flow of the application can be summarized as:

[java] // generateResponse failing means the input is already a Response message
if (generateResponse() != success) {
    if (isText()) {
        convertXML(); // a version 1 text file still needs conversion to XML
    }
} else {
    convertVersion(); // the input is a Request: convert between versions
}
[/java]

The complexity here stems from all the error handling.  All calls (except isText) can fail or yield warnings, and all calls produce various messages in XML and/or text format.

If you want, you can find a simple CPN model of the case here (the smaller black places do not represent data but just logical points of decision; the convertVersion ok transition is a little more complex as it can conditionally produce errors and text output depending on input).  The state space of the model is here:

[Figure: state space of the document conversion model]

Two things are obvious when we look at the state space: it is fairly small, but it has a lot of terminal nodes.  Even though we have 6 different types of input and an estimated 12-30 scenarios, the state space has only 11 nodes.  That’s fewer than our most generous (and wrong!) guess at the number of possible scenarios.  In fact, we only have 4 non-terminal nodes, i.e., nodes where we still have logic to do.  That’s fewer than the number of possible inputs!

We have 7 terminal states, or 7 different combinations of outputs.  That’s out of a theoretical maximum of 2⁴ = 16.  Care to guess which ones those are?  I didn’t bother.  Instead, we can just assume that any combination is possible and show each of the 4 possible outputs (converted XML, converted text, response XML, and errors) if they are present.  That cuts the output routine down to 4 if statements.
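A sketch of that output routine under these assumptions (the result holder and its getters are hypothetical names, not the project’s actual API) is about as plain as it sounds:

[java] // Sketch only: present whichever outputs the conversion happened to produce.
if (result.getConvertedXML() != null) show(result.getConvertedXML());
if (result.getConvertedText() != null) show(result.getConvertedText());
if (result.getResponseXML() != null) show(result.getResponseXML());
if (!result.getErrors().isEmpty()) show(result.getErrors());
[/java]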

In total, we have reduced 6 possible inputs with 12-30 possible scenarios and 7 out of 16 theoretically possible output combinations to 4 program decision points and 4 if statements for output, by ignoring user scenarios and thinking in data flows instead.

Conclusion

I have given two examples of thinking in data processing rules rather than user scenarios.  Both examples are slightly simplified versions of real-life problems I’ve solved in the past year.  In both cases, colleagues took the user scenario approach.  As I’ve worked with rule-based computation since 2000 – first using colored Petri nets and CPN Tools, then our own language JoSEL in ASAP, and then Declare – this approach was more natural to me.

In one case I literally got a specification with 18 scenarios.  Aside from copy-paste errors and a couple of oversights, it also only covered the fair-weather scenarios.  It did not handle the extensions, and it confused even people very familiar with the underlying datasources.  Only by discarding the scenario-based approach was it possible to get a stable implementation and cope with the subsequent addition of complexity in the form of new datasources and parameter subsets.

The choice between a rule-based approach and a scenario-based approach is very similar to the choice between a declarative and a traditional business process description language.  If your problem has a lot of branches at each point and fairly loose connections, thinking in rules is likely to benefit you.  Both properties contribute to a large degree of freedom/concurrency, which causes the number of scenarios to explode.  Rule-based approaches are good for best-effort systems where many individual steps may fail; both examples illustrate a sequence of calls that depend on outcomes of previous calls, which may fully or partially fail.  This is an extremely common situation in modern applications based on web services.  Even in the presence of partial failures, it is often worth providing a best-effort response back to the user, relying either on error recovery or on just providing as much information as has been inferred before the failure.

There are various frameworks available for reactive programming, but in many cases that is not even necessary; as illustrated by the document conversion example, much can be achieved by a sequence of if statements just checking the guards of each rule.  That takes some of the concurrency out of CCP, but I see CCP more as a way of thinking than a dogma that has to be followed rigorously.

At the end of the day, I think that CCP thinking can in many situations liberate us from thinking in user scenarios; we can focus on the services (= rules) we have and let the computer figure out the rest.  In a best-effort setting we can just check that our application behaves correctly in fair-weather scenarios, and when we need to provide guarantees, established academic techniques like state space analysis can provide us with an overview of the assumptions and failure modes of our application.
