Testing Simple Train Control System Using Its Formal Description

In our previous post, Decoding Formal Description of a Simple Train Control System, we examined how to interpret a formal model defined through a Kripke structure for a simple train control system—the Ceiling Speed Monitoring Controller (CSMC). Now, we’re taking the next step by using this model to build a powerful test suite to verify an AI-generated Python implementation of the controller.

With AI tools like ChatGPT increasingly involved in code generation, rigorously testing this code is essential. So, we’ll develop the test suite and see how good the AI-generated code really is—buckle up, and let’s start testing!

Reference implementation

Before we can start testing, we need a concrete implementation of the system. Taking a modern approach, we’ll use an AI tool, ChatGPT, to provide us with a reference implementation in Python.

We’ll give the ChatGPT 4o model access to the original article, Decoding Formal Description of a Simple Train Control System, to provide context. In addition, we will restate the transition relation, RR , which defines the system’s behavior as follows:

R((Vest,VMRSP,,W,EB),(Vest,VMRSP,,W,EB))φ0:(=NSVestVMRSP=NSW=WEB=EB)φ1:(=NSVest>VMRSP=WSW=1EB=EB)φ2:(=NS(gebi1gebi2)=ISW=1EB=1)φ3:(=WSVest>VMRSP¬(gebi1gebi2)=WSW=1EB=EB)φ4:(=WSVestVMRSP=NSW=0EB=0)φ5:(=WS(gebi1gebi2)=ISW=WEB=1)φ6:(=ISVest>0=ISW=WEB=1)φ7:(=ISVest=0=NSW=0EB=0)R((V_{\text{est}}, V_{\text{MRSP}}, \ell, W, EB), (V'_{\text{est}}, V'_{\text{MRSP}}, \ell', W', EB')) \equiv \\ \quad \varphi_0: (\ell = \text{NS} \land V_{\text{est}} \leq V_{\text{MRSP}} \land \ell' = \text{NS} \land W' = W \land EB' = EB) \lor \\ \quad \varphi_1: (\ell = \text{NS} \land V_{\text{est}} > V_{\text{MRSP}} \land \ell' = \text{WS} \land W' = 1 \land EB' = EB) \lor \\ \quad \varphi_2: (\ell = \text{NS} \land (g_{\text{ebi1}} \lor g_{\text{ebi2}}) \land \ell' = \text{IS} \land W' = 1 \land EB' = 1) \lor \\ \quad \varphi_3: (\ell = \text{WS} \land V_{\text{est}} > V_{\text{MRSP}} \land \neg(g_{\text{ebi1}} \lor g_{\text{ebi2}}) \land \ell' = \text{WS} \land W' = 1 \land EB' = EB) \lor \\ \quad \varphi_4: (\ell = \text{WS} \land V_{\text{est}} \leq V_{\text{MRSP}} \land \ell' = \text{NS} \land W' = 0 \land EB' = 0) \lor \\ \quad \varphi_5: (\ell = \text{WS} \land (g_{\text{ebi1}} \lor g_{\text{ebi2}}) \land \ell' = \text{IS} \land W' = W \land EB' = 1) \lor \\ \quad \varphi_6: (\ell = \text{IS} \land V_{\text{est}} > 0 \land \ell' = \text{IS} \land W' = W \land EB' = 1) \lor \\ \quad \varphi_7: (\ell = \text{IS} \land V_{\text{est}} = 0 \land \ell' = \text{NS} \land W' = 0 \land EB' = 0)

Refer to the Transition relation RR section in the previous blog if the transition relation is unclear.

Next, we provided ChatGPT with the following prompt to generate a Python implementation:

"Can you write a Python program that will simulate the behavior of the system described by this transition relation? Make it a CLI utility that allows the user to set initial V_MRSP and guards, then prompts the user for V_est, shows the state of the system, and exits cleanly when CTRL-C is pressed."

Below is the Python program generated by ChatGPT, which we will now test.

train_control_system.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
import sys
import signal

class TrainControlSystem:
def __init__(self, V_MRSP, gebi1_threshold=15, gebi2_threshold=7.5):
self.V_MRSP = V_MRSP
self.g_ebi1_threshold = gebi1_threshold # Threshold for gebi1 condition
self.g_ebi2_threshold = gebi2_threshold # Threshold for gebi2 condition
self.V_est = 0
self.state = "NS" # Initial state: NS (Normal State)
self.W = 0 # Warning
self.EB = 0 # Emergency Brake

def transition(self):
# Determine emergency brake conditions
gebi1 = self.V_MRSP > 110 and self.V_est > self.V_MRSP + self.g_ebi1_threshold
gebi2 = self.V_MRSP <= 110 and self.V_est > self.V_MRSP + self.g_ebi2_threshold

# Apply transition rules based on the current state
if self.state == "NS":
if self.V_est <= self.V_MRSP:
self.state = "NS" # phi_0
elif self.V_est > self.V_MRSP:
self.state, self.W = "WS", 1 # phi_1
elif gebi1 or gebi2:
self.state, self.EB = "IS", 1 # phi_2

elif self.state == "WS":
if self.V_est > self.V_MRSP and not (gebi1 or gebi2):
self.state, self.W = "WS", 1 # phi_3
elif self.V_est <= self.V_MRSP:
self.state, self.W, self.EB = "NS", 0, 0 # phi_4
elif gebi1 or gebi2:
self.state, self.EB = "IS", 1 # phi_5

elif self.state == "IS":
if self.V_est > 0:
self.state, self.W, self.EB = "IS", 1, 1 # phi_6
elif self.V_est == 0:
self.state, self.W, self.EB = "NS", 0, 0 # phi_7

# Print the current state after the transition
print(f"State: {self.state}, V_est: {self.V_est}, V_MRSP: {self.V_MRSP}, W: {self.W}, EB: {self.EB}")

def set_V_est(self, V_est):
self.V_est = V_est
self.transition()

def signal_handler(sig, frame):
print("\nExiting program.")
sys.exit(0)

def main():
# Register the signal handler for clean exit on CTRL-C
signal.signal(signal.SIGINT, signal_handler)

# Prompt the user for initial setup values
V_MRSP = float(input("Enter the initial maximum restricted speed (V_MRSP): "))
gebi1_threshold = float(input("Enter the emergency brake guard threshold for V_MRSP > 110 (default is 15): ") or 15)
gebi2_threshold = float(input("Enter the emergency brake guard threshold for V_MRSP <= 110 (default is 7.5): ") or 7.5)

# Initialize the system with these parameters
system = TrainControlSystem(V_MRSP, gebi1_threshold, gebi2_threshold)

print("\nTrain Control System initialized.")
print("Press CTRL-C to exit.\n")

# Continuously prompt for V_est and update the system state
while True:
try:
V_est = float(input("Enter the estimated speed (V_est): "))
system.set_V_est(V_est)
except ValueError:
print("Invalid input. Please enter a numerical value for V_est.")

if __name__ == "__main__":
main()

When you run the program, it will prompt you to enter V_MRSP and two thresholds used to determine when emergency braking should be applied. Afterward, it will ask for the estimated current speed of the train, V_est, display the system’s current state, and then wait for the next V_est reading. You can stop the program at any time by pressing CTRL-C.

1
$ python train_control_system.py
1
2
3
4
5
6
7
8
9
10
Enter the initial maximum restricted speed (V_MRSP): 100
Enter the emergency brake guard threshold for V_MRSP > 110 (default is 15): 15
Enter the emergency brake guard threshold for V_MRSP <= 110 (default is 7.5): 7.5

Train Control System initialized.
Press CTRL-C to exit.

Enter the estimated speed (V_est): 80
State: NS, V_est: 80.0, V_MRSP: 100.0, W: 0, EB: 0
Enter the estimated speed (V_est):

Building test input sequences

With the reference implementation in hand, we’re ready to verify that it meets the system’s behavioral requirements. Fortunately, since we have a formal definition of the system, these requirements are as precise as they can be.

What should we use from the formal definition to build our test sequences? The answer is straightforward: we need the set of atomic propositions APAP to create input equivalence class partitions (IECP).

Below is our set of atomic propositions:

AP={Vest=0,Vest>VMRSP,VMRSP>110,Vest>VMRSP+7.5,Vest>VMRSP+15,=NS,=WS,=IS,W,EB}AP = \{ V_{\text{est}} = 0, V_{\text{est}} > V_{\text{MRSP}}, V_{\text{MRSP}} > 110, V_{\text{est}} > V_{\text{MRSP}} + 7.5, V_{\text{est}} > V_{\text{MRSP}} + 15, \\ \quad \ell = \text{NS}, \ell = \text{WS}, \ell = \text{IS}, W, EB \}

To construct input equivalence class partitions, we only need to focus on the propositions directly related to the inputs VestV_{\text{est}} and VMRSPV_{\text{MRSP}} . Therefore, our input set becomes:

API={Vest=0,Vest>VMRSP,VMRSP>110,Vest>VMRSP+7.5,Vest>VMRSP+15}AP_I = \{ V_{\text{est}} = 0, V_{\text{est}} > V_{\text{MRSP}}, V_{\text{MRSP}} > 110, V_{\text{est}} > V_{\text{MRSP}} + 7.5, V_{\text{est}} > V_{\text{MRSP}} + 15 \}

Let’s construct the initial table of input equivalence class partitions (IECP). This table includes every combination of each atomic proposition related to the inputs being either true or false, representing the Cartesian product of all possible input states.

For example,

IECP={Vest=0,¬(Vest=0)}×{Vest>VMRSP,¬(Vest>VMRSP)}×{VMRSP>110,¬(VMRSP>110)}×{Vest>VMRSP+7.5,¬(Vest>VMRSP+7.5)}×{Vest>VMRSP+15,¬(Vest>VMRSP+15)}IECP = \{ V_{\text{est}} = 0, \neg(V_{\text{est}} = 0) \} \times \\ \quad \{ V_{\text{est}} > V_{\text{MRSP}}, \neg(V_{\text{est}} > V_{\text{MRSP}}) \} \times \\ \quad \{ V_{\text{MRSP}} > 110, \neg(V_{\text{MRSP}} > 110) \} \times \\ \quad \{ V_{\text{est}} > V_{\text{MRSP}} + 7.5, \neg(V_{\text{est}} > V_{\text{MRSP}} + 7.5) \} \times \\ \quad \{ V_{\text{est}} > V_{\text{MRSP}} + 15, \neg(V_{\text{est}} > V_{\text{MRSP}} + 15) \}

Now, we can construct a table explicitly listing all 25=322^5 = 32 possible classes. From this set, we can then filter out combinations that have logical conflicts and select specific values for VestV_{\text{est}} and VMRSPV_{\text{MRSP}} that satisfy each class. Alternatively, we could visualize these combinations graphically, as shown below.



For VMRSP>110V_{\text{MRSP}} > 110 , we can use the following values:

  • VMRSP=120,Vest={0,50,123,132,140}V_{\text{MRSP}} = 120, V_{\text{est}} = \{0, 50, 123, 132, 140\}

For VMRSP110V_{\text{MRSP}} \leq 110 , we can use the following values:

  • VMRSP=100,Vest={0,50,105,113,120}V_{\text{MRSP}} = 100, V_{\text{est}} = \{0, 50, 105, 113, 120\}

These values are chosen to cover each region of interest. Since the estimated speed cannot simultaneously fall into multiple regions, most of the 25=322^5 = 32 combinations are invalid. Only 1010 classes remain as valid.

Another way to work this out

If we assign each atomic proposition a short pip_i name, we get the following definitions:

  • p0p_0 : VMRSP>110V_{\text{MRSP}} > 110 — The MRSP is greater than 110.
  • p1p_1 : Vest=0V_{\text{est}} = 0 — The estimated speed of the train is zero.
  • p2p_2 : Vest>VMRSPV_{\text{est}} > V_{\text{MRSP}} — The estimated speed exceeds the maximum restricted speed (MRSP).
  • p3p_3 : Vest>VMRSP+7.5V_{\text{est}} > V_{\text{MRSP}} + 7.5 — The estimated speed exceeds the MRSP by more than 7.5.
  • p4p_4 : Vest>VMRSP+15V_{\text{est}} > V_{\text{MRSP}} + 15 — The estimated speed exceeds the MRSP by more than 15.

From the table that includes all 3232 classes, only the following valid combinations remain:

Class p0p_0 p1p_1 p2p_2 p3p_3 p4p_4 Explanation
1 p0p_0 p1p_1 ¬p2\neg p_2 ¬p3\neg p_3 ¬p4\neg p_4 VMRSP>110V_{\text{MRSP}} > 110 , Vest=0V_{\text{est}} = 0
2 p0p_0 ¬p1\neg p_1 ¬p2\neg p_2 ¬p3\neg p_3 ¬p4\neg p_4 VMRSP>110V_{\text{MRSP}} > 110 , no thresholds exceeded
3 p0p_0 ¬p1\neg p_1 p2p_2 ¬p3\neg p_3 ¬p4\neg p_4 VMRSP>110V_{\text{MRSP}} > 110 , Vest>VMRSPV_{\text{est}} > V_{\text{MRSP}} without exceeding other thresholds
4 p0p_0 ¬p1\neg p_1 p2p_2 p3p_3 ¬p4\neg p_4 VMRSP>110V_{\text{MRSP}} > 110 , Vest>VMRSPV_{\text{est}} > V_{\text{MRSP}} exceeding 7.5 but not 15
5 p0p_0 ¬p1\neg p_1 p2p_2 p3p_3 p4p_4 VMRSP>110V_{\text{MRSP}} > 110 , exceeding all thresholds
6 ¬p0\neg p_0 p1p_1 ¬p2\neg p_2 ¬p3\neg p_3 ¬p4\neg p_4 VMRSP<=110V_{\text{MRSP}} <= 110 , Vest=0V_{\text{est}} = 0
7 ¬p0\neg p_0 ¬p1\neg p_1 ¬p2\neg p_2 ¬p3\neg p_3 ¬p4\neg p_4 VMRSP<=110V_{\text{MRSP}} <= 110 , no thresholds exceeded
8 ¬p0\neg p_0 ¬p1\neg p_1 p2p_2 ¬p3\neg p_3 ¬p4\neg p_4 VMRSP<=110V_{\text{MRSP}} <= 110 , Vest>VMRSPV_{\text{est}} > V_{\text{MRSP}} without exceeding other thresholds
9 ¬p0\neg p_0 ¬p1\neg p_1 p2p_2 p3p_3 ¬p4\neg p_4 VMRSP<=110V_{\text{MRSP}} <= 110 , Vest>VMRSPV_{\text{est}} > V_{\text{MRSP}} exceeding 7.5 but not 15
10 ¬p0\neg p_0 ¬p1\neg p_1 p2p_2 p3p_3 p4p_4 VMRSP<=110V_{\text{MRSP}} <= 110 , exceeding all thresholds

Then, you simply select values for each input that satisfy the conditions of each input equivalence class partition.

Coding up user actions

With our test input values defined, we can start writing the test code to automate all possible user actions, including:

  1. Starting and stopping the program
  2. Setting V_MRSP and threshold values
  3. Entering estimated speed values
  4. Verifying the normal status state
  5. Verifying the warning status state
  6. Verifying the emergency braking (intervention) status state

Since we’re working with a command-line interface (CLI) program, we’ll use the testflows.connect module to control it with the send and expect methods. Additionally, we’ll define a State class to encapsulate all elements of the controller’s state into a single object. This class will be crucial when we implement the controller’s behavior model, which will serve as a test oracle to provide the expected state for any combination of controller behavior.

Here is the implementation of all actions, including the State class, in Python:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
import re
from testflows.core import *
from testflows.combinatorics import *
from testflows.connect import Shell

class State:
def __init__(self, State, V_est, V_MRSP, W, EB):
self.State = State
self.V_est = float(V_est)
self.V_MRSP = float(V_MRSP)
self.W = int(W)
self.EB = int(EB)

def __str__(self):
return f"State={self.State},V_est={self.V_est},V_MRSP={self.V_MRSP},W={self.W},EB={self.EB}"


@TestStep(Given)
def start_program(self):
"""Start the train controm system program."""
with Shell() as bash:
with bash("python train_control_system.py", asynchronous=True) as controller:
self.context.app = controller.app
try:
yield
finally:
with By("stopping the program with CTRL-C"):
self.context.app.child.send("\03")
self.context.app.child.expect("Exiting program.")
self.context.app.child.expect("\n", escape=False)
bash.expect(bash.prompt)


@TestStep(When)
def enter_speed(self, v_est):
"""Enter the estimated speed and then
read and return current system state."""
self.context.app.child.send(str(v_est))
self.context.app.child.expect(str(v_est))
self.context.app.child.expect("Enter the estimated speed \(V_est\)\:")
# State: NS, V_est: 60.0, V_MRSP: 100.0, W: 0, EB: 0
pattern = re.compile(
r"State: (?P<State>\w+), "
r"V_est: (?P<V_est>[\d.]+), "
r"V_MRSP: (?P<V_MRSP>[\d.]+), "
r"W: (?P<W>\d+), "
r"EB: (?P<EB>\d+)"
)
return State(**pattern.match(self.context.app.child.before.strip()).groupdict())


@TestStep(Given)
def enter_vmrsp_and_thresholds(self, v_mrsp, g_ebi1_threshold=15, g_ebi2_threshold=7.5):
"""Enter V_MRSP and guard 1 and 2 thresholds."""
app = self.context.app

app.child.expect("Enter the initial maximum restricted speed \(V_MRSP\)\:")
app.child.send(str(v_mrsp))
app.child.expect("\n", escape=False)

app.child.expect(
"Enter the emergency brake guard threshold for V_MRSP > 110 \(default is 15\)"
)
app.child.send(str(g_ebi1_threshold))
app.child.expect("\n", escape=False)

app.child.expect(
"Enter the emergency brake guard threshold for V_MRSP <= 110 \(default is 7.5\)"
)
app.child.send(str(g_ebi2_threshold))
app.child.expect("\n", escape=False)

app.child.expect("Press CTRL-C to exit.")
app.child.expect("\n", escape=False)
app.child.expect("Enter the estimated speed \(V_est\)\:")


@TestStep(Then)
def expect_normal_status(self, state, W, EB):
"""Expect normal status."""
assert state == "NS", f"expected state NS but got {state}"
assert W == 0, f"expected W=0 but got {W}"
assert EB == 0, f"expected EB=0 but got {EB}"


@TestStep(Then)
def expect_warning_status(self, state, W, EB):
"""Expect warning status."""
assert state == "WS", f"expected state WS but got {state}"
assert W == 1, f"expected W=1 but got {W}"
assert EB == 0, f"expected EB=0 but got {EB}"


@TestStep(Then)
def expect_emergency_brake_status(self, state, W, EB):
"""Expect emergency brake status."""
assert state == "IS", f"expected state IS but got {state}"
assert W == 1, f"expected W=1 but got {W}"
assert EB == 1, f"expected EB=1 but got {EB}"

The behavior model

To check the numerous combinations of inputs—potentially hundreds—we need a behavior model that can accurately compute the expected state of the system for each case.

Refer to Combinatorial Testing: Writing Behavior Model for an introduction to writing a behavior model.

This behavior model is derived from the system’s transition relation RR . Here it is again for easy reference, allowing you to compare it directly with the code implementation below:

R((Vest,VMRSP,,W,EB),(Vest,VMRSP,,W,EB))φ0:(=NSVestVMRSP=NSW=WEB=EB)φ1:(=NSVest>VMRSP=WSW=1EB=EB)φ2:(=NS(gebi1gebi2)=ISW=1EB=1)φ3:(=WSVest>VMRSP¬(gebi1gebi2)=WSW=1EB=EB)φ4:(=WSVestVMRSP=NSW=0EB=0)φ5:(=WS(gebi1gebi2)=ISW=WEB=1)φ6:(=ISVest>0=ISW=WEB=1)φ7:(=ISVest=0=NSW=0EB=0)R((V_{\text{est}}, V_{\text{MRSP}}, \ell, W, EB), (V'_{\text{est}}, V'_{\text{MRSP}}, \ell', W', EB')) \equiv \\ \quad \varphi_0: (\ell = \text{NS} \land V_{\text{est}} \leq V_{\text{MRSP}} \land \ell' = \text{NS} \land W' = W \land EB' = EB) \lor \\ \quad \varphi_1: (\ell = \text{NS} \land V_{\text{est}} > V_{\text{MRSP}} \land \ell' = \text{WS} \land W' = 1 \land EB' = EB) \lor \\ \quad \varphi_2: (\ell = \text{NS} \land (g_{\text{ebi1}} \lor g_{\text{ebi2}}) \land \ell' = \text{IS} \land W' = 1 \land EB' = 1) \lor \\ \quad \varphi_3: (\ell = \text{WS} \land V_{\text{est}} > V_{\text{MRSP}} \land \neg(g_{\text{ebi1}} \lor g_{\text{ebi2}}) \land \ell' = \text{WS} \land W' = 1 \land EB' = EB) \lor \\ \quad \varphi_4: (\ell = \text{WS} \land V_{\text{est}} \leq V_{\text{MRSP}} \land \ell' = \text{NS} \land W' = 0 \land EB' = 0) \lor \\ \quad \varphi_5: (\ell = \text{WS} \land (g_{\text{ebi1}} \lor g_{\text{ebi2}}) \land \ell' = \text{IS} \land W' = W \land EB' = 1) \lor \\ \quad \varphi_6: (\ell = \text{IS} \land V_{\text{est}} > 0 \land \ell' = \text{IS} \land W' = W \land EB' = 1) \lor \\ \quad \varphi_7: (\ell = \text{IS} \land V_{\text{est}} = 0 \land \ell' = \text{NS} \land W' = 0 \land EB' = 0)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
class Model:
"""Ceiling Speed Monitoring Contoller (CSMC) behavior model."""

def __init__(self, V_MRSP_threshold=110, g_ebi1_threshold=15, g_ebi2_threshold=7.5):
self.V_MRSP_threshold = V_MRSP_threshold
self.g_ebi1_threshold = g_ebi1_threshold
self.g_ebi2_threshold = g_ebi2_threshold

def g_ebi1(self, current_state):
"""Guard 1 condition."""
return (
current_state.V_MRSP > self.V_MRSP_threshold
and current_state.V_est > current_state.V_MRSP + self.g_ebi1_threshold
)

def g_ebi2(self, current_state):
"""Guard 2 condition."""
return (
current_state.V_MRSP <= self.V_MRSP_threshold
and current_state.V_est > current_state.V_MRSP + self.g_ebi2_threshold
)

def expect_normal_status(self, behavior):
"""Expect normal status."""
current_state = behavior[-1]
prev_state = behavior[-2]
if prev_state.State == "NS":
if current_state.V_est <= current_state.V_MRSP:
return expect_normal_status
elif prev_state.State == "WS":
if current_state.V_est <= current_state.V_MRSP:
return expect_normal_status
elif prev_state.State == "IS":
if current_state.V_est == 0:
return expect_normal_status

def expect_warning_status(self, behavior):
"""Expect warning status."""
current_state = behavior[-1]
prev_state = behavior[-2]
if prev_state.State in ("NS", "WS"):
if current_state.V_est > current_state.V_MRSP:
if not (self.g_ebi1(current_state) or self.g_ebi2(current_state)):
return expect_warning_status

def expect_emergency_brake_status(self, behavior):
"""Expect emergency brake status."""
current_state = behavior[-1]
prev_state = behavior[-2]
if prev_state.State in "NS":
if self.g_ebi1(current_state) or self.g_ebi2(current_state):
return expect_emergency_brake_status
elif prev_state.State == "WS":
if self.g_ebi1(current_state) or self.g_ebi2(current_state):
return expect_emergency_brake_status
elif prev_state.State == "IS":
if current_state.V_est > 0:
return expect_emergency_brake_status

def expect(self, behavior):
"""Return expected behavior."""
return (
self.expect_normal_status(behavior)
or self.expect_warning_status(behavior)
or self.expect_emergency_brake_status(behavior)
)

To use this model, we’ll define a new check_result action that takes a specified behavior and uses the model to determine the expected state of the controller.

1
2
3
4
5
6
7
8
@TestStep(Then)
def check_result(self, behavior, model):
"""Check the result of the train control system."""
current_state = behavior[-1]
expect = model.expect(behavior)
debug(f"{current_state},expect={expect.__name__ if expect else None}")
expect(state=current_state.State, W=current_state.W, EB=current_state.EB)

Writing the test to check all equivalence classes

With the inputs selected to cover each equivalence class, we can now write the actual test. Since the system is stateful, we need to account for transitions between states as the system progresses. Given that the system has only three internal states, we can apply the Pigeonhole principle, just as we did in the Determining the minimum number of calls section when testing the memory function in the previous article.

The test itself will consist of two parts: the TestOutline, which defines the overall structure of the test, and individual checks for specific combinations of VMRSPV_{\text{MRSP}} , threshold values, and sequences of estimated speed readings.

Here is the outline:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
@TestOutline(Scenario)
def check_combination(self, v_mrsp, g_ebi1_threshold, g_ebi2_threshold, speed_sequence):
"""Check specific combination of the controller behavior."""

model = Model(g_ebi1_threshold=g_ebi1_threshold, g_ebi2_threshold=g_ebi2_threshold)
behavior = []

with Given("I start the train control system program"):
start_program()

with And("I set V_MRSP and guard thresholds"):
enter_vmrsp_and_thresholds(
v_mrsp=v_mrsp,
g_ebi1_threshold=g_ebi1_threshold,
g_ebi2_threshold=g_ebi2_threshold,
)
# add the initial state
behavior.append(State(State="NS", V_est=0, V_MRSP=v_mrsp, W=0, EB=0))

for i, v_est in enumerate(speed_sequence):
with When(f"step {i}: speed is {v_est}"):
state = enter_speed(v_est=v_est)
behavior.append(state)
check_result(behavior=behavior, model=model)

The outline above will be used in the check_all_combinations test, which will generate all the necessary combinations to test each value of VMRSPV_{\text{MRSP}} , threshold settings, and estimated speed values corresponding to each selected VMRSPV_{\text{MRSP}} .

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
@TestFeature
def check_all_combinations(self):
"""Test train control system."""
v_mrsps = {
100: (0,50,123,132,140),
120: (0,50,105,113,120)
}
thresholds = [(15, 7.5)]
i = 0

for v_mrsp, (g_ebi1_threshold, g_ebi2_threshold) in product(list(v_mrsps.keys()), thresholds):
speeds = v_mrsps[v_mrsp]
for speed_sequence in product(speeds, repeat=3):
Combination(
f"#{i}: V_MRSP={v_mrsp},g_ebi1={g_ebi1_threshold},g_ebi2={g_ebi2_threshold},speeds={speed_sequence}",
test=check_combination,
)(
v_mrsp=v_mrsp,
g_ebi1_threshold=g_ebi1_threshold,
g_ebi2_threshold=g_ebi2_threshold,
speed_sequence=speed_sequence,
)
i += 1

With all the pieces in place, we’re ready to assemble the full test program.

The full test program

Here is the complete test program, which we can use to verify the accuracy of the reference implementation produced by ChatGPT.

test_train_control_system.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
import re
from testflows.core import *
from testflows.combinatorics import *
from testflows.connect import Shell


class State:
def __init__(self, State, V_est, V_MRSP, W, EB):
self.State = State
self.V_est = float(V_est)
self.V_MRSP = float(V_MRSP)
self.W = int(W)
self.EB = int(EB)

def __str__(self):
return f"State={self.State},V_est={self.V_est},V_MRSP={self.V_MRSP},W={self.W},EB={self.EB}"


@TestStep(Given)
def start_program(self):
"""Start the train controm system program."""
with Shell() as bash:
with bash("python train_control_system.py", asynchronous=True) as controller:
self.context.app = controller.app
try:
yield
finally:
with By("stopping the program with CTRL-C"):
self.context.app.child.send("\03")
self.context.app.child.expect("Exiting program.")
self.context.app.child.expect("\n", escape=False)
bash.expect(bash.prompt)


@TestStep(When)
def enter_speed(self, v_est):
"""Enter the estimated speed and then
read and return current system state."""
self.context.app.child.send(str(v_est))
self.context.app.child.expect(str(v_est))
self.context.app.child.expect("Enter the estimated speed \(V_est\)\:")
# State: NS, V_est: 60.0, V_MRSP: 100.0, W: 0, EB: 0
pattern = re.compile(
r"State: (?P<State>\w+), "
r"V_est: (?P<V_est>[\d.]+), "
r"V_MRSP: (?P<V_MRSP>[\d.]+), "
r"W: (?P<W>\d+), "
r"EB: (?P<EB>\d+)"
)
return State(**pattern.match(self.context.app.child.before.strip()).groupdict())


@TestStep(Given)
def enter_vmrsp_and_thresholds(self, v_mrsp, g_ebi1_threshold=15, g_ebi2_threshold=7.5):
"""Enter V_MRSP and guard 1 and 2 thresholds."""
app = self.context.app

app.child.expect("Enter the initial maximum restricted speed \(V_MRSP\)\:")
app.child.send(str(v_mrsp))
app.child.expect("\n", escape=False)

app.child.expect(
"Enter the emergency brake guard threshold for V_MRSP > 110 \(default is 15\)"
)
app.child.send(str(g_ebi1_threshold))
app.child.expect("\n", escape=False)

app.child.expect(
"Enter the emergency brake guard threshold for V_MRSP <= 110 \(default is 7.5\)"
)
app.child.send(str(g_ebi2_threshold))
app.child.expect("\n", escape=False)

app.child.expect("Press CTRL-C to exit.")
app.child.expect("\n", escape=False)
app.child.expect("Enter the estimated speed \(V_est\)\:")


@TestStep(Then)
def expect_normal_status(self, state, W, EB):
"""Expect normal status."""
assert state == "NS", f"expected state NS but got {state}"
assert W == 0, f"expected W=0 but got {W}"
assert EB == 0, f"expected EB=0 but got {EB}"


@TestStep(Then)
def expect_warning_status(self, state, W, EB):
"""Expect warning status."""
assert state == "WS", f"expected state WS but got {state}"
assert W == 1, f"expected W=1 but got {W}"
assert EB == 0, f"expected EB=0 but got {EB}"


@TestStep(Then)
def expect_emergency_brake_status(self, state, W, EB):
"""Expect emergency brake status."""
assert state == "IS", f"expected state IS but got {state}"
assert W == 1, f"expected W=1 but got {W}"
assert EB == 1, f"expected EB=1 but got {EB}"


@TestStep(Then)
def check_result(self, behavior, model):
"""Check the result of the train control system."""
current_state = behavior[-1]
expect = model.expect(behavior)
debug(f"{current_state},expect={expect.__name__ if expect else None}")
expect(state=current_state.State, W=current_state.W, EB=current_state.EB)


class Model:
"""Ceiling Speed Monitoring Contoller (CSMC) behavior model."""

def __init__(self, V_MRSP_threshold=110, g_ebi1_threshold=15, g_ebi2_threshold=7.5):
self.V_MRSP_threshold = V_MRSP_threshold
self.g_ebi1_threshold = g_ebi1_threshold
self.g_ebi2_threshold = g_ebi2_threshold

def g_ebi1(self, current_state):
"""Guard 1 condition."""
return (
current_state.V_MRSP > self.V_MRSP_threshold
and current_state.V_est > current_state.V_MRSP + self.g_ebi1_threshold
)

def g_ebi2(self, current_state):
"""Guard 2 condition."""
return (
current_state.V_MRSP <= self.V_MRSP_threshold
and current_state.V_est > current_state.V_MRSP + self.g_ebi2_threshold
)

def expect_normal_status(self, behavior):
"""Expect normal status."""
current_state = behavior[-1]
prev_state = behavior[-2]
if prev_state.State == "NS":
if current_state.V_est <= current_state.V_MRSP:
return expect_normal_status
elif prev_state.State == "WS":
if current_state.V_est <= current_state.V_MRSP:
return expect_normal_status
elif prev_state.State == "IS":
if current_state.V_est == 0:
return expect_normal_status

def expect_warning_status(self, behavior):
"""Expect warning status."""
current_state = behavior[-1]
prev_state = behavior[-2]
if prev_state.State in ("NS", "WS"):
if current_state.V_est > current_state.V_MRSP:
if not (self.g_ebi1(current_state) or self.g_ebi2(current_state)):
return expect_warning_status

def expect_emergency_brake_status(self, behavior):
"""Expect emergency brake status."""
current_state = behavior[-1]
prev_state = behavior[-2]
if prev_state.State in "NS":
if self.g_ebi1(current_state) or self.g_ebi2(current_state):
return expect_emergency_brake_status
elif prev_state.State == "WS":
if self.g_ebi1(current_state) or self.g_ebi2(current_state):
return expect_emergency_brake_status
elif prev_state.State == "IS":
if current_state.V_est > 0:
return expect_emergency_brake_status

def expect(self, behavior):
"""Return expected behavior."""
return (
self.expect_normal_status(behavior)
or self.expect_warning_status(behavior)
or self.expect_emergency_brake_status(behavior)
)


@TestOutline(Scenario)
def check_combination(self, v_mrsp, g_ebi1_threshold, g_ebi2_threshold, speed_sequence):
"""Check specific combination of the controller behavior."""

model = Model(g_ebi1_threshold=g_ebi1_threshold, g_ebi2_threshold=g_ebi2_threshold)
behavior = []

with Given("I start the train control system program"):
start_program()

with And("I set V_MRSP and guard thresholds"):
enter_vmrsp_and_thresholds(
v_mrsp=v_mrsp,
g_ebi1_threshold=g_ebi1_threshold,
g_ebi2_threshold=g_ebi2_threshold,
)
# add the initial state
behavior.append(State(State="NS", V_est=0, V_MRSP=v_mrsp, W=0, EB=0))

for i, v_est in enumerate(speed_sequence):
with When(f"step {i}: speed is {v_est}"):
state = enter_speed(v_est=v_est)
behavior.append(state)
check_result(behavior=behavior, model=model)


@TestFeature
def check_all_combinations(self):
"""Test train control system."""
v_mrsps = {
100: (0,50,123,132,140),
120: (0,50,105,113,120)
}
thresholds = [(15, 7.5)]
i = 0

for v_mrsp, (g_ebi1_threshold, g_ebi2_threshold) in product(list(v_mrsps.keys()), thresholds):
speeds = v_mrsps[v_mrsp]
for speed_sequence in product(speeds, repeat=3):
Combination(
f"#{i}: V_MRSP={v_mrsp},g_ebi1={g_ebi1_threshold},g_ebi2={g_ebi2_threshold},speeds={speed_sequence}",
test=check_combination,
)(
v_mrsp=v_mrsp,
g_ebi1_threshold=g_ebi1_threshold,
g_ebi2_threshold=g_ebi2_threshold,
speed_sequence=speed_sequence,
)
i += 1


@TestModule
def regression(self):
"Exhaustively test train control system." ""
Feature(test=check_all_combinations)()


if main():
regression()

Running the test program

The moment of truth has arrived! Is ChatGPT’s implementation of the controller correct? Let’s run our comprehensive test program:

1
python3 test_train_control_system.py -o slick

First bug

And, either unfortunately or fortunately for our test program, we catch the first bug!

1
2
3
4
5
6
7
8
➤ Module regression
➤ Feature check all combinations
✔ Combination #0﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 0)
✔ Combination #1﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 50)
✘ Combination #2﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 123), AssertionError
✘ Feature check all combinations, AssertionError
✘ Module regression, AssertionError
AssertionError: expected state IS but got WS

The output indicates that we’ve found a case where we expected the controller to enter the emergency braking (intervention) status (IS) state, but instead, it was in the warning status (WS) state. Not good!

After reviewing the code, we identify the issue:

1
2
3
4
5
6
7
8
9
# Apply transition rules based on the current state
if self.state == "NS":
if self.V_est <= self.V_MRSP:
self.state = "NS" # phi_0
elif self.V_est > self.V_MRSP:
self.state, self.W = "WS", 1 # phi_1
# BUG: this needs to be checked before self.V_est > self.V_MRSP!
elif gebi1 or gebi2:
self.state, self.EB = "IS", 1 # phi_2

The code checks the gebi1 or gebi2 condition after self.V_est > self.V_MRSP, which is incorrect! It should first check for gebi1 or gebi2.

Second bug

Let’s apply the fix to train_control_system.py and try again.

But we encounter another issue!

1
2
3
4
5
6
7
8
➤ Module regression
➤ Feature check all combinations
✔ Combination #0﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 0)
✔ Combination #1﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 50)
✘ Combination #2﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 123), AssertionError
✘ Feature check all combinations, AssertionError
✘ Module regression, AssertionError
AssertionError: expected W=1 but got 0

This time, the test program reveals that the warning signal (W) was not set when expected. Upon further investigation, we find that the controller entered the emergency braking (intervention) status state (IS) without setting the warning signal (W) to alert the driver.

It turns out the same block of code is to blame again—it had two bugs!

1
2
3
4
5
6
7
8
9
10
# Apply transition rules based on the current state
if self.state == "NS":
if self.V_est <= self.V_MRSP:
self.state = "NS" # phi_0
elif self.V_est > self.V_MRSP:
self.state, self.W = "WS", 1 # phi_1
# BUG: this needs to be checked before self.V_est > self.V_MRSP!
elif gebi1 or gebi2:
# BUG: warning singal self.W is not set to 1 when moving to the IS state!
self.state, self.EB = "IS", 1 # phi_2

Fixing both bugs

After applying both fixes to the code, we finally get the following:

1
2
3
4
5
6
7
8
# Apply transition rules based on the current state
if self.state == "NS":
if self.V_est <= self.V_MRSP:
self.state = "NS" # phi_0
elif gebi1 or gebi2:
self.state, self.W, self.EB = "IS", 1, 1 # phi_2
elif self.V_est > self.V_MRSP:
self.state, self.W = "WS", 1 # phi_1

Running the test program again, we see that it no longer fails immediately. This time, it actually completes successfully!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
$ python3 test_train_control_system.py -o slick
➤ Module regression
➤ Feature check all combinations
✔ Combination #0﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 0)
✔ Combination #1﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 50)
✔ Combination #2﹕ V_MRSP=100,g_ebi1=15,g_ebi2=7․5,speeds=(0, 0, 123)
...
✔ Combination #247﹕ V_MRSP=120,g_ebi1=15,g_ebi2=7․5,speeds=(120, 120, 105)
✔ Combination #248﹕ V_MRSP=120,g_ebi1=15,g_ebi2=7․5,speeds=(120, 120, 113)
✔ Combination #249﹕ V_MRSP=120,g_ebi1=15,g_ebi2=7․5,speeds=(120, 120, 120)
✔ Feature check all combinations
✔ Module regression
...
1 module (1 ok)
1 feature (1 ok)
250 combinations (250 ok)

The test program successfully checked all 2×53=2502 \times 5^3 = 250 input combinations.

Conclusion

In this post, we created a comprehensive test based on the formal description of the Ceiling Speed Monitoring Controller (CSMC) discussed in the Decoding Formal Description of a Simple Train Control System article. By using the set of atomic propositions derived from the Kripke structure and creating input equivalence class partitions from it, our test successfully caught two critical bugs in the AI-generated reference implementation.

These bugs were subtle and would have been challenging to identify through visual code review alone. However, by employing rigorous testing techniques rooted in formal methods, we effectively identified both issues. That said, this test program does not guarantee the discovery of all possible bugs in any given implementation of the controller. The test program achieves thorough coverage only when the set of atomic propositions aligns perfectly with the model. Since we cannot always know the actual atomic propositions for a specific implementation, additional testing techniques are necessary to uncover any hidden atomic propositions that might exist in the code. This is what makes testing both challenging and rewarding!