This post has 2299 words. Reading it will take approximately 11 minutes.
I motivated the need for a user-friendly tool supporting model-based testing in practice in a previous post. The gist is that industry best practice relies on logical test-cases rather than testing everything, and that is bad for software quality. The reason, I theorize, is that model-based testing is still a bit too academic and requires a couple of leaps of confidence to replace regular testing.
In my previous post, I outlined an example of a testable system implementing a risk assessment strategy for an insurance company. The company would grant insurance to all women and to men above a certain age. The logical test-cases were “young man is denied,” “old man is approved,” and “woman is approved.”
The tool I describe is tentatively called the MBT Workbench because I have the imagination of somebody with very little imagination. MBT is here an abbreviation for model-based testing. The tool will leverage (buzzword so it’s more businessy) MBT to test web-services.
At the core, the tool allows users (developers/testers) to load a web-service description and select variables interesting for testing. Users will also supply or construct a model describing the correct outcomes. With this information, the tool executes tests for all combinations of input parameters, and compares outcomes with the known good values. Based on this, a report is generated.
So far, it’s all very simple and in line with my previous post, but the devil, as always, is in the details. And hell. From which we can conclude that the details are hell.
First, let’s talk about models. Not the ones on the catwalk. Models as in model-based testing. What we need is a model that can translate input parameters into output parameters. The model needs to be executable by a computer. Normally, when I think of a model like that, I am thinking of a (colored) Petri net, but I am probably in the minority when it comes to that. Many people in industry use flow diagrams or just textual descriptions.

Rather than finding one model to rule them all, let us consider models in the abstract. Instead of going with my knee-jerk reaction and just going with a model definition that is “anything with a transition system semantics,” we will go even more abstract: a model is anything that, given a set of input variables, can produce a set of output variables. The model might include hidden variables that are neither in the input nor the output, but that doesn’t concern us at this level. We can use relabeling and duplication to assume, without loss of generality, that the sets of input and output variables are disjoint. I will try giving more concrete examples as they become useful, but for now we’ll just notice that this notion of models includes state charts, Petri nets, flow charts, (hidden) Markov models, and many other standard formalisms.
Tying this model definition to the insurance example, the two input variables are the age and gender of the insured party, and the output variable is whether they are granted insurance.
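To make the abstract definition concrete, here is a minimal sketch (all names are illustrative, not part of any actual MBT Workbench API): a model is simply a function from an assignment of input variables to an assignment of output variables.

```python
from typing import Any, Callable, Mapping

# A model: anything that maps input-variable assignments to output-variable assignments.
Model = Callable[[Mapping[str, Any]], Mapping[str, Any]]

def insurance_model(inputs: Mapping[str, Any]) -> Mapping[str, Any]:
    """Toy model of the insurance rules: women and men above 25 are approved.
    (Whether exactly 25 counts as 'above' is ambiguous in the prose;
    here AT_25 is treated as not approved.)"""
    approved = inputs["gender"] == "WOMAN" or inputs["age"] == "OVER_25"
    return {"approved": approved}
```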
As testing will deal with all combinations of all values of all variables, we need the domains of all variables to be finite, and preferably small enough that the product of the domain sizes stays manageable. Booleans are fine. Enumerations are fine. Integers, dates, and strings, not so much. But if we require that a user specifies a finite set of interesting values for these types, we are good to go. We can in principle broaden the scope to handle other types, but – let’s be honest – this covers well over 90% of practical cases.
For the insurance example, the gender is a finite enumeration with two possible values (MAN and WOMAN); the age is an integer, but the user has specified that only the values < 25, 25, and > 25 are interesting. The outcome is a Boolean value (true or false) indicating whether insurance can be granted.
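Under these finiteness assumptions, enumerating every test scenario is a one-liner. A sketch, using the insurance example’s domains:

```python
from itertools import product

# Finite domains from the insurance example; age is reduced to three classes.
domains = {
    "gender": ["MAN", "WOMAN"],
    "age": ["UNDER_25", "AT_25", "OVER_25"],
}

# One scenario per combination of values: 2 * 3 = 6 in total.
scenarios = [dict(zip(domains, combo)) for combo in product(*domains.values())]
```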
We need a manner to convert model variables into a web-service request. At the outset, we just need to enter the values into the appropriate fields of the web-service request, but for more generality, we allow making user-defined value mappings (for example, our web-service might not accept the gender specified as MAN/WOMAN but instead as MALE/FEMALE or it might require that the age isn’t given as an age in years but instead as a birth date).
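Such a user-defined input mapping could look like the following sketch. The field names, value encodings, and representative birth years are invented for illustration; a real service would dictate its own:

```python
from datetime import date

def to_request(inputs):
    """Map abstract model input variables to concrete web-service fields.
    The request shape here is an assumed example, not a real API."""
    gender_map = {"MAN": "MALE", "WOMAN": "FEMALE"}
    # Represent each age class by a concrete birth year (an assumed convention).
    offsets = {"UNDER_25": 20, "AT_25": 25, "OVER_25": 30}
    birth_year = date.today().year - offsets[inputs["age"]]
    return {
        "gender": gender_map[inputs["gender"]],
        "birthDate": f"{birth_year}-01-01",
    }
```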
Similarly, we need a way to extract values from the web-service response and convert them into model output variables. We’ll see later why we want the translation to be in this direction.
In the insurance example, we might want to transform an insurance offer into the Boolean value true and an error message into the value false.
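A minimal sketch of that output mapping, assuming (purely for illustration) a response that contains either an "offer" element or an "error" message:

```python
def to_output(response):
    """Extract the model output variable from a web-service response.
    Assumes the invented convention that a response dict holds either
    an 'offer' element (insurance granted) or an 'error' message (denied)."""
    return {"granted": "offer" in response}
```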
So, this is all just a bunch of framework. If all is well, nobody but MBT Workbench developers will ever have to deal with the specifics. Now is where (I think) things get clever.
Now the idea is that users don’t have to supply a web-service definition, a set of input and output variables with possible assignments, mappings between web-services and variables, and a model. Instead, we will guide the user through defining most of these.
The web-service definition is simple: that’s done the old-fashioned way by supplying a WSDL or Swagger definition or something similar. If your operation is anything close to professional, you will have these handy well before starting testing anyway.
Using the web-service definition, the user is prompted to fill in fields of the request. Some fields will get fixed values, and some will be marked as input variables. The input variables can just be specified from a list with intelligent suggestions. For example, for a Boolean or enumeration, the tool will just suggest adding the value verbatim to the set of input variables. For integers, the tool will suggest creating an enumeration with values UNDER_xx, AT_xx, and OVER_xx (for example UNDER_25, AT_25, and OVER_25 in the insurance example). It will do the same for dates. Strings have to be explicitly enumerated. More advanced users can manually add more input variables with appropriate types, and use a simple scripting language to compute web-service parameters based on these. In this manner, the tool supports users from beginner to advanced in setting up the input variables and input mapping.
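The UNDER_xx/AT_xx/OVER_xx suggestion amounts to a three-way classification of concrete integers, which the tool would also need when translating values back and forth. A minimal sketch:

```python
def classify(value, pivot):
    """Place a concrete integer into the tool's suggested three-way enumeration,
    e.g. classify(20, 25) for the insurance example's age field."""
    if value < pivot:
        return f"UNDER_{pivot}"
    if value == pivot:
        return f"AT_{pivot}"
    return f"OVER_{pivot}"
```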
For setting up the model and output variables + mapping, the tool can provide several options. The simplest way would be to just call the service with the given parameters. That will provide responses from the web-service. The tool can then ask users how to interpret these values by selecting elements of the response and using them directly as output variables (with given values). Output variables do not have restrictions on their domains (we will automatically only ever get a finite number of possible values, as we only have a finite number of scenarios), so there is no issue with using the error message of a response verbatim as an output variable. Again, we are free to manually add variables and mappings.
To fit better with an agile development strategy (more buzzwords, get your marketing porn right here, 5 for the price of 3 and comes with a blockchain in the cloud), the output variables can also be extracted by examining the schema describing the result, and mapping values of the response to model output variables like for input variables. This would require more interaction from the user (as there would be no concrete values available), but would be possible to do before the service was actually implemented.
To supply the model, I see at least three possibilities: user-specified, user-assisted, and automatic generation. The first option is not that interesting: a user can specify a model using their preferred modeling language. Which consists mostly of blaring trivialities and a wish for world peace, am I right, guys? The tool can support multiple languages, and users can specify models in their favorite. The tool can execute the models and compare values. Pretty standard stuff.
Where it gets interesting is if the user isn’t forced to specify a model. In the insurance example, the rules are very simple to formulate: first, if the customer is a woman, grant the insurance; second, if the customer is above 25, grant the insurance; third, if none of these match, deny it. It is simple to allow the user to specify an input variable and provide an output if it has a particular value. This way of specifying results incidentally also matches perfectly with the project that is the original cause for these considerations. Funny how life works out. Doing this will specify a simple flowchart model which can be presented either graphically or textually. The tool can assist the user and check whether the model is fully specified, consistent, redundant, etc.
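This “if input variable has value X, produce output Y” style of specification can be sketched as an ordered rule list with a default. The encoding below is hypothetical, just to show how small such a model is:

```python
# First matching rule wins; if nothing matches, the default applies.
RULES = [
    ("gender", "WOMAN", True),   # women are granted insurance
    ("age", "OVER_25", True),    # men above 25 are granted insurance
]
DEFAULT = False                  # everyone else is denied

def evaluate(inputs):
    """Run the rule-list model on one assignment of input variables."""
    for variable, value, outcome in RULES:
        if inputs[variable] == value:
            return outcome
    return DEFAULT
```

Checking that such a rule list is fully specified and consistent is then a matter of evaluating it over all scenario combinations.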
Finally, it is also possible to generate such a model automatically. If we just execute the tests, we get output variables for all combinations of input variables automatically (this is the reason the mapping is from response to output variables and not the other way around). We can use standard partitioning algorithms to automatically construct a flowchart model that reproduces the values observed from the service. The model can then be inspected by the user to see if it matches the desired outcome. The model is not necessarily unique; in the insurance example, we might get a decision tree that first asks a person about their gender and then, if it is a man, about their age, or it might first ask about the age and only in the case of a young person ask about the gender.

Tests can be presented in a giant table with input variables and expected output variables, but they can also be aggregated into a set of logical test cases. This can be done in a manner that is – according to some metric – “complete.” It can be as simple as distinguishing based on the response variables (yielding two cases in the insurance example: a young man, and a not-young-man, randomly either an old man or a woman), or it can take the specified or generated model into account to ensure that each decision node is touched. These test cases can even be automatically described in the “old-fashioned” manner. All of this is fully automatic and guaranteed complete, so no time is wasted reviewing test cases for completeness or coming up with them in the first place.
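A greedy partitioning of the observed scenario/outcome pairs is enough to see the idea in code. This is a minimal sketch, not a production algorithm (a real tool would pick splits by an information metric, which is also what makes the resulting tree non-unique):

```python
def induce_tree(rows, variables):
    """Greedily build a decision tree that reproduces observed outcomes.
    rows: list of (inputs, outcome) pairs from executing all scenarios.
    A leaf is an outcome; an inner node is (variable, {value: subtree})."""
    outcomes = {outcome for _, outcome in rows}
    if len(outcomes) == 1:
        return outcomes.pop()  # all remaining rows agree: leaf
    # Split on the first variable that still distinguishes rows.
    var = next(v for v in variables if len({inp[v] for inp, _ in rows}) > 1)
    branches = {}
    for value in {inp[var] for inp, _ in rows}:
        subset = [(inp, out) for inp, out in rows if inp[var] == value]
        branches[value] = induce_tree(subset, [v for v in variables if v != var])
    return (var, branches)
```

Fed the six insurance scenarios, this yields a tree that asks about gender first and, for men, about age; listing the variables in the other order yields the other tree the text mentions.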
Already at this point, I think it would be possible to use MBT in practice, and that such a tool would be immensely powerful. But at the same time, I also think there’s more one can do. For example, an emerging test technology is fuzzing. Emerging here means it has been used for the past 10 years. Fuzzing just throws random values at the system and sees what happens. I can think of at least two ways fuzzing might be useful in a tool like the MBT Workbench.
First, the user is (optionally) specifying a set of fixed values. The tool could try to establish whether these values have to be set at those values, or can be set entirely freely. This can be decided by randomly changing a value and checking the test result. If most changes cause errors, the tool can conclude that the value is “magic” and needs to be set to a particular value. In the insurance example, the tool might conclude that a non-tested parameter indicating the insurance type has to be set to car insurance, as most other insurance types have different requirements causing most tests to fail. Or it might conclude that a fixed postcode parameter can be set freely, as the outcome doesn’t depend on the value. Or if there is no clear answer, it might suggest including the value in the set of input parameters. If one starts with a test without specifying any input variables, the tool might even be able to identify some of them itself. If this were an academic paper, I would claim that the tool could automatically identify the input variables, further automating the approach, but being a bit more pragmatic, I have serious doubts this would work sufficiently reliably in reality.
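The “is this fixed value magic?” heuristic can be sketched as follows. Here `call_with` is a hypothetical stand-in for running the tests with the fixed field replaced by a given value, and the 0.8 threshold is an arbitrary assumption:

```python
def is_magic(call_with, fixed_value, alternatives, threshold=0.8):
    """Heuristic sketch: a fixed field is 'magic' if replacing its value
    changes the outcome for most of the alternatives tried.
    call_with(value) stands in for running the test suite with that value."""
    baseline = call_with(fixed_value)
    changed = sum(call_with(alt) != baseline for alt in alternatives)
    return changed / len(alternatives) >= threshold
```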
Second, the tool can also use fuzzing to help test the user-specified data mapping. The user specifies that the important values are UNDER_25, AT_25, and OVER_25 and maps these to 20, 25, and 30. Try 24 and 26. Try 19, 21, 29, and 31. Try 10 and 50. Try -5000 and 5000. If some of these cause unexpected results, suggest further refining the test. If not, it further reinforces that the chosen equivalence classes are probably good.
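That boundary-probing strategy can be sketched directly. Here `call_service` is a hypothetical stand-in for invoking the web-service with a concrete age; the representatives and probe values are the ones from the example above:

```python
def probe_boundaries(call_service, pivot=25):
    """Fuzz concrete ages around the equivalence-class boundaries.
    Returns the ages whose outcome differs from their class representative;
    an empty list reinforces that the chosen classes are probably good."""
    def classify(age):
        return "UNDER" if age < pivot else "AT" if age == pivot else "OVER"
    # Outcomes for the representative values the user mapped (20, 25, 30).
    expected = {classify(rep): call_service(rep) for rep in (20, 25, 30)}
    probes = (24, 26, 19, 21, 29, 31, 10, 50, -5000, 5000)
    return [age for age in probes if call_service(age) != expected[classify(age)]]
```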
Now, if this were an academic paper, I would throw in a future work section, where I claim that if the input and output variable mappings were reversible, the tool could also be used to make model-based stub implementations and “with further research” perhaps lead to model-based implementations or model-based software engineering. People have been claiming that for 20 years with few tangible results, though, so let’s not get carried away.
Instead, let’s just conclude by stating that I’ll start looking at making a prototype of this. If things work out, it might even develop into a viable tool. I’ll be very interested in hearing opinions about this approach as described: what will work and what won’t. Do you have other suggestions that would be useful in such a tool? For my first version, I think I’ll just use simple decision trees for models. I was originally thinking of using full CPN models but I think that’s overkill. I’ll allow SOAP messages for now (because that’s what I need), and probably also allow database requests (because I need that too).
Time person of the year 2006, Nobel Peace Prize winner 2012.