Testing Simple Train Control System Using Its Formal Description

In our previous post, Decoding Formal Description of a Simple Train Control System, we examined how to interpret a formal model defined through a Kripke structure for a simple train control system—the Ceiling Speed Monitoring Controller (CSMC). Now, we’re taking the next step by using this model to build a powerful test suite to verify an AI-generated Python implementation of the controller.

With AI tools like ChatGPT increasingly involved in code generation, rigorously testing this code is essential. So, we’ll develop the test suite and see how good the AI-generated code really is—buckle up, and let’s start testing!

...

Decoding Formal Description of a Simple Train Control System

In scientific research on software testing, you often come across papers filled with complex mathematical descriptions. Formal mathematics is commonly used in academia because it offers a precise way to describe how a system should behave in various situations, ensuring all possible cases are accounted for. However, for many testers, the math in these papers can feel overwhelming and difficult to apply to everyday testing work.

Terms like “Kripke structures,” “atomic propositions,” and “equivalence classes” might sound abstract and unfamiliar. However, these concepts aren’t as intimidating as they seem. They’re simply tools that help us think more systematically about a system, allowing us to create test cases that cover all the important behaviors.

...

Using Atomic Propositions and Equivalence Classes (Part 1)

Atomic propositions and equivalence classes offer a practical solution for making testing manageable when dealing with large or infinite input domains. While these concepts are widely used in formal verification—such as in Exhaustive Model-Based Equivalence Class Testing by W. Huang and J. Peleska, 2013, from which we’ll borrow the formal definition of equivalence classes—they are extremely useful in everyday testing tasks.

In this two-part post, we explore how atomic propositions and equivalence classes help us select “representative values” when input spaces are large. These representative values group inputs that behave equivalently, enabling thorough system testing without needing to cover every possible input within each group. Given that atomic propositions and equivalence classes are commonly used in formal verification, we also delve into the formal mathematical notation that defines these concepts.

...

Using Atomic Propositions and Equivalence Classes (Part 2)

Building on the foundations from Part 1 🛸, where we introduced atomic propositions and equivalence classes as mathematically rigorous techniques, this section will dive into applying these concepts in practice. Here, we’ll explore how to account for internal states, carefully select specific input values, and utilize a combinatorial sketch to cover all equivalence classes effectively.

Ready to dive deeper? Let’s continue our journey into advanced testing strategies!

...

Combinatorial Testing: Writing Behavior Model

Combinatorial testing significantly increases the test coverage of software systems, achieving levels of thoroughness that are unattainable with traditional test suites focusing on individual scenarios or limited combinations of user actions. However, combinatorial testing introduces the notorious test oracle problem—how to determine or compute the expected behavior for every possible combination of user actions.

The survey “The Oracle Problem in Software Testing: A Survey” provides an excellent overview of this issue, highlighting that deriving or even knowing the correct outcomes for complex combinations is often not straightforward. A common approach to solve this is by developing a behavior model that can compute expected outcomes for given system behaviors. Yet, many testers have never encountered the oracle problem, as they primarily deal with single combination tests. Fewer still have created or even understand what a behavior model entails.

To demystify behavior models, we will explore the development of such a model for a simple stateful system: a memory function that allows users to write, read, or erase data. We’ll illustrate how to apply combinatorial testing to this function and develop a behavior model from scratch to verify expected results....

Sketching Combinations For Combinatorial Tests

Writing combinatorial tests usually requires that the author of the test plans for combinatorial testing upfront. This requires that all the combination variables and their values are identified and combined with the code that creates possible combinations and loops through each one of them. However, it would be nice to simplify the process and enable testers to write combinatorial tests in a manner that is as close as possible to a test that checks just one combination. This is where combinatorial Sketch‘es come in. A combinatorial sketch allows a tester to write combinatorial tests in an intuitive way without worrying about variable identification, calculation of combinations, and any explicit loops.

1
2
3
4
5
6
7
8
9
@TestSketch(Scenario)
@Flags(TE)
def test_add(self):
values = {0, 1, math.inf, math.nan, 1 / 3, 2**-200, 2**200}
sign = {1, -1}
check_add(
a=either(*values, i="a") * either(*sign, i="sign_of_a"),
b=either(*values, i="b") * either(*sign, i="sign_of_b")
)
...

Combinatorial Testing: The Introduction

According to the US National Institute of Standards and Technology’s (NIST) page on Combinatorial Testing, combinatorial testing can reduce costs for software testing, and its applications in software engineering can be significant. This is not surprising, as combinatorial testing is the cornerstone of any type of software testing. In its simplest form, one can think of it as finding the correct combination for a padlock. But instead of looking for a combination that opens the lock, the quality assurance team looks for combinations that lead to a violation of one or more requirements—the bugs. The technique is simple and universal, and in theory, it provides the most sought-out feature in testing, which is complete test coverage....

BDD or not to BDD, or TDD, or even ATDD. How does it affect your QA team?

How do software development processes such as behavior-driven development (BDD), test-driven development (TDD), or even acceptance-test-driven development (ATDD) affect quality assurance team? To answer this question, we need to first note that all these terms end with “development” and not with “quality assurance”. But could we define behavior-driven quality assurance (BDQA), test-driven quality assurance (TDQA), or even acceptance-test-driven quality assurance (ATDQA)? If so, what would be the difference between them? Well, it would depend on the definitions of a behavior, a test, and an acceptance test....

Get Your Software Covered Using Covering Arrays

For software systems with any non-trivial number of parameters, exhaustively testing all possible combinations of parameter values is not feasible. For example, if we have 10 parameters (k = 10) that each have 10 possible values (v = 10), the number of all possibilities is vk=1010=10billionv^k=10^{10} = {10}_{billion}, thus requiring 10 billion tests for complete coverage....

Working With Requirements Just Like With Code

“Requirements are the starting point of any design”, this is a quote from the presentation given by Nicholas D. Kefalas, who is a senior technical officer at Sikorsky Aircraft/Lockheed Martin. In this article, we’ll explore how you can work with requirements just like with code and look at a simple example of specifying requirements for a Google Calculator web application. While most people will agree with the quote above, somehow it is not applied or scarcely applied to the commercial software development process. We are developing too fast, and it seems like there is no time to think about the functionality that we are developing, much less write it down...

Making Your Tests Better or How to Break Your Tests Into Steps

Writing tests is not as easy task as it may seem at first glance. In many cases, writing a good test can be as hard as writing good application code. While readability and maintainability are both desirable features in application code, they are not always required. For tests, on the other hand, readability and maintainability are a must....