User Guide¶
This guide explains how to write and run pseudotest regression tests in day-to-day workflows.
What pseudotest does¶
pseudotest runs your executable in an isolated temporary working directory for each input case, then compares generated outputs against expected references defined in YAML.
Each test run:
- Reads a YAML config file.
- For every listed input file, copies the input (and any extra files) into a fresh temporary directory.
- Executes the program inside that directory.
- Checks specified outputs against reference values.
- Reports pass/fail and exits with a status code.
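The steps above can be sketched in a few lines of Python. This only illustrates the isolate, run, inspect pattern and is not pseudotest's actual implementation: the function name `run_case` and its signature are hypothetical, and only the argument input method is shown.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_case(executable, input_file, extra_files=(), timeout=600):
    """Run one input in a fresh temporary directory (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as workdir:
        # Copy the input and any extra files into the isolated directory.
        for src in (input_file, *extra_files):
            shutil.copy(src, workdir)
        # Execute inside that directory, passing the input name as an argument.
        proc = subprocess.run(
            [str(executable), Path(input_file).name],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        # Record which files exist after the run so they can be checked later.
        produced = sorted(p.name for p in Path(workdir).iterdir())
    return proc.returncode, produced
```

Because every case gets its own directory, leftover files from one input cannot influence the next.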
Typical use cases:
- Regression testing of scientific codes after refactors or optimisation passes
- Verifying numeric outputs, log patterns, file sizes, and directory contents
- CI checks for numerical stability and expected failures
- Baseline capture with pseudotest-update after intentional changes
Install¶
Using pip¶
You can install the latest release directly from PyPI:
From source¶
From a local clone:
With optional test/developer extras:
Quick start¶
- Create a YAML config file (for example test.yaml).
- Ensure your target executable is in a directory (for example ./bin).
- Run:
The command returns:
- 0 when all executions and matches pass
- 1 when any execution or match fails
- 2 on a configuration/usage error
- 3 on a runtime error
Configuration model¶
Top-level keys¶
| Key | Required | Description |
|---|---|---|
| Name | Yes | Human-readable test suite name |
| Enabled | No | If false, test is skipped (default: true) |
| Executable | Yes | Executable filename looked up in -D/--directory |
| InputMethod | No | argument, stdin, or rename (default: argument) |
| RenameTo | Conditional | Required if InputMethod: rename |
| Inputs | Yes | Mapping of input filename to per-input config |
Per-input keys¶
| Key | Required | Description |
|---|---|---|
| ExtraFiles | No | List of additional files to copy into work dir |
| Processors | No | Number of MPI processes when MPIEXEC is set (default: 1) |
| ExpectedFailure | No | If true, execution failure is treated as pass |
| InputMethod | No | Overrides the top-level InputMethod for this input only |
| RenameTo | Conditional | Overrides the top-level RenameTo for this input only |
| Matches | No | Mapping of named checks |
Top-level keys (Executable, InputMethod, RenameTo) act as defaults and can be overridden per input.
ExtraFiles resolution¶
Paths in ExtraFiles are resolved relative to the directory containing the YAML test file. All files are copied flat into the temporary working directory before the executable runs.
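As a rough illustration of this resolution rule, the sketch below resolves each entry against the YAML file's directory and keeps only the final path component when copying. The helper name `copy_extra_files` is hypothetical and the code is an assumption about the behaviour described above, not pseudotest's source.

```python
import shutil
from pathlib import Path


def copy_extra_files(yaml_path, extra_files, workdir):
    """Copy each extra file flat into workdir (hypothetical helper)."""
    # Relative paths are anchored at the directory that holds the YAML file.
    base = Path(yaml_path).resolve().parent
    copied = []
    for name in extra_files:
        source = base / name
        # Only the file name is kept, so sub-directories are flattened away.
        target = Path(workdir) / source.name
        shutil.copy(source, target)
        copied.append(target.name)
    return copied
```

Under this reading, two extra files that share a file name would end up at the same destination, so distinct names are the safer choice.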
Avoiding repetition¶
Two scope mechanisms let you factor out shared settings so they don't have to be repeated for every input or every match.
Execution scope¶
Executable, InputMethod, and RenameTo placed at the top level act as defaults for every input. Any per-input block can override them individually:
# Without top-level defaults: InputMethod is repeated for every input
Name: Solver tests
Executable: solver.x
Inputs:
  case_01.in:
    InputMethod: stdin
    Matches: ...
  case_02.in:
    InputMethod: stdin
    Matches: ...
  case_03.in:
    InputMethod: stdin
    Matches: ...

# With a top-level default: InputMethod is written once
Name: Solver tests
Executable: solver.x
InputMethod: stdin # inherited by all inputs
Inputs:
  case_01.in:
    Matches: ...
  case_02.in:
    Matches: ...
  case_03.in:
    InputMethod: argument # override for this input only
    Matches: ...
Similarly for RenameTo when using InputMethod: rename, define it once at the top level and only override it in the rare input that needs a different name.
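The override rule for execution settings amounts to a dictionary merge in which per-input keys win. A minimal sketch, assuming the config has already been loaded into plain dictionaries; the function name `effective_settings` is hypothetical:

```python
# Keys that the guide describes as top-level defaults with per-input overrides.
INHERITABLE = ("Executable", "InputMethod", "RenameTo")


def effective_settings(config, input_name):
    """Return the execution settings that apply to one input."""
    defaults = {key: config[key] for key in INHERITABLE if key in config}
    per_input = config["Inputs"][input_name] or {}
    overrides = {key: per_input[key] for key in INHERITABLE if key in per_input}
    # Later entries win, so a per-input value replaces the top-level default.
    return {**defaults, **overrides}
```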
Match scope¶
A named match can act as a group by containing child matches alongside shared parameters. Any recognised parameter placed directly in the group, such as file:, grep:, tol:, or directory:, is automatically inherited by every child match, so it only needs to be written once:
# Without grouping: file: and tol: are repeated for every match
Matches:
  energy:
    file: results.txt
    grep: "Total energy:"
    field: 3
    value: -42.5000
    tol: 1e-4
  force_x:
    file: results.txt
    grep: "Force x:"
    field: 2
    value: -0.00123
    tol: 1e-4
  force_y:
    file: results.txt
    grep: "Force y:"
    field: 2
    value: 0.00045
    tol: 1e-4

# With a match group: file: and tol: are written once
Matches:
  results: # group: shared parameters for all children
    file: results.txt
    tol: 1e-4
    energy:
      grep: "Total energy:"
      field: 3
      value: -42.5000
    force_x:
      grep: "Force x:"
      field: 2
      value: -0.00123
    force_y:
      grep: "Force y:"
      field: 2
      value: 0.00045
A child match can override an inherited parameter by defining it locally; the child's value takes precedence:
Matches:
  results:
    file: results.txt
    tol: 1e-4
    energy:
      grep: "Total energy:"
      field: 3
      value: -42.5000
    checksum:
      file: checksums.txt # overrides the group's file:
      grep: "SHA256:"
      field: 2
      value: abc123def456
Groups can nest to any depth. A deeply nested group inherits from all its ancestors:
Matches:
  results:
    file: results.txt
    energies:
      tol: 1e-6 # tighter tolerance for the energies sub-group
      total:
        grep: "Total energy:"
        field: 3
        value: -42.5000
      exchange:
        grep: "Exchange:"
        field: 2
        value: -3.1416
    forces:
      tol: 1e-4 # looser tolerance for forces
      atom_1:
        grep: "Atom 1:"
        field: 2
        value: -0.00123
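Group inheritance can be pictured as a recursive walk that carries the accumulated parameters down to each leaf match. The sketch below is an illustration only: the `PARAMETERS` set is assembled from the keys used in this guide and may not match the list pseudotest actually recognises, and the slash-joined names are a hypothetical way of labelling the flattened matches.

```python
# Assumed parameter names (taken from this guide); any other key is a child.
PARAMETERS = {
    "file", "grep", "line", "field", "column", "field_re", "field_im",
    "value", "tol", "count", "size", "directory", "count_files",
    "file_is_present", "protected", "matches",
}


def flatten_matches(node, inherited=None, prefix=""):
    """Resolve nested match groups into leaf matches with merged parameters."""
    own = {k: v for k, v in node.items() if k in PARAMETERS}
    children = {k: v for k, v in node.items() if k not in PARAMETERS}
    # Parameters defined at this level override those of the ancestors.
    merged = {**(inherited or {}), **own}
    if not children:
        return {prefix: merged}
    flat = {}
    for name, child in children.items():
        path = f"{prefix}/{name}" if prefix else name
        flat.update(flatten_matches(child, merged, path))
    return flat
```

Calling `flatten_matches` on the mapping under `Matches` yields one entry per leaf, each carrying the parameters of all its ancestors.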
Complete example¶
This example covers the most common match types in a single config:
Name: Solver regression suite
Enabled: true
Executable: solver.x
InputMethod: argument
Inputs:
  case_01.in:
    ExtraFiles: [basis.dat, pseudo.UPF]
    Processors: 4
    Matches:
      # Extract a field by searching for a keyword
      Total Energy:
        file: results.txt
        grep: "Total energy:"
        field: 3
        value: -42.5000
        tol: 1e-4
      # Read from a specific line number (1-based)
      Convergence Flag:
        file: results.txt
        line: 5
        field: 2
        value: converged
      # Count occurrences of a pattern
      Warning count:
        file: run.log
        grep: WARNING
        count: 0
      # Fixed-width column extraction
      Band Gap:
        file: bands.txt
        grep: "Band gap"
        column: 20
        value: 1.0342
        tol: 1e-3
      # Complex number magnitude
      Wavefunction magnitude:
        file: wf.txt
        grep: "Value:"
        field_re: 2
        field_im: 3
        value: 3.1416
        tol: 1e-4
      # File size check
      Restart File:
        file: restart.bin
        size: 65536
      # Directory checks
      Output directory count:
        directory: output
        count_files: 3
      Output summary:
        directory: output
        file_is_present: summary.txt
  # A case expected to fail (used for negative testing)
  bad_input.in:
    ExpectedFailure: true

Convergence:
  file: output.txt
  line: 3
  field: 2
  value: converged
Negative line values count from the end of the file (line: -1 is the last line):
Offsetting from a grep match
When both grep and line are present, line is treated as an offset from the matched line (0 = same line, 1 = next line, etc.). This is useful when the value appears on the line after a header:
Force after header:
  file: results.txt
  grep: "Forces (Ha/Bohr):"
  line: 1
  field: 2
  value: -0.00123
  tol: 1e-5
Given:
grep finds "Forces (Ha/Bohr):" and line: 1 steps to the next line, then field: 2 extracts -0.00124.
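The interaction between grep and line can be sketched as follows. This is an illustration under two assumptions that the guide does not state: the pattern is treated as a plain substring, and the first matching line is used. The function name `select_line` is hypothetical.

```python
def select_line(lines, grep=None, line=None):
    """Pick the line a match would read from (hypothetical helper)."""
    if grep is not None:
        # Assumption: the first line containing the pattern is the anchor.
        start = next(i for i, text in enumerate(lines) if grep in text)
        # With grep present, line is an offset: 0 = same line, 1 = next line.
        return lines[start + (line or 0)]
    # Without grep, line is 1-based; negative values count from the end.
    return lines[line - 1] if line > 0 else lines[line]
```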
Extracting by field
field is 1-based and splits on whitespace or commas (with optional surrounding whitespace), equivalent to awk '{print $N}' for whitespace-separated output but also handling CSV-style output. Two consecutive commas produce an empty field between them:
Pressure:
  file: output.txt
  grep: "Pressure:"
  field: 2 # second field (whitespace- or comma-separated)
  value: 101.325
  tol: 0.01
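One way to reproduce the splitting rule described above is a regular expression that treats a comma with optional surrounding whitespace, or a run of whitespace, as a single separator. This is a sketch of the documented behaviour rather than pseudotest's own code; `extract_field` is a hypothetical name.

```python
import re

# A comma (with optional whitespace around it) or a run of whitespace.
_SEPARATOR = re.compile(r"\s*,\s*|\s+")


def extract_field(line, field):
    """Return the 1-based field of a line, split on whitespace or commas."""
    parts = _SEPARATOR.split(line.strip())
    return parts[field - 1]
```

With this pattern, `"a, b,,c"` splits into `a`, `b`, an empty field, and `c`, which matches the rule for two consecutive commas.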
Extracting by character column
column extracts from a fixed character position (1-based), then takes the first token (delimited by whitespace or commas). This is useful for fixed-width formatted output:
# Output line: "Band gap (eV)       1.0342  direct"
#               123456789012345678901234567890
#                                   ^ column 21
Band Gap:
  file: bands.txt
  grep: "Band gap"
  column: 21
  value: 1.0342
  tol: 1e-3
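In code, column extraction is a slice followed by a token split. The sketch below assumes that whitespace or commas sitting exactly at the starting position are skipped before the first token is taken; the guide does not spell this out, and `extract_column` is a hypothetical name.

```python
import re


def extract_column(line, column):
    """Return the first token starting at a 1-based character column."""
    tail = line[column - 1:]
    # Split on whitespace or commas and drop empty pieces.
    tokens = [token for token in re.split(r"[\s,]+", tail) if token]
    return tokens[0] if tokens else None
```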
Complex number magnitude
When output contains a complex number as two separate fields, field_re and field_im extract the real and imaginary parts and compare their magnitude (sqrt(re² + im²)) to value:
# Output line: "Wavefunction: 2.2214 2.2214"
Wavefunction magnitude:
  file: evals.txt
  grep: "Wavefunction:"
  field_re: 2 # field holding the real part
  field_im: 3 # field holding the imaginary part
  value: 3.1416
  tol: 1e-4
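The arithmetic for the example line can be checked with a short sketch. It splits on whitespace only for brevity, and `complex_magnitude` is a hypothetical name rather than part of pseudotest.

```python
import math


def complex_magnitude(line, field_re, field_im):
    """Compute sqrt(re^2 + im^2) from two 1-based fields of a line."""
    parts = line.split()
    real = float(parts[field_re - 1])
    imag = float(parts[field_im - 1])
    return math.hypot(real, imag)
```

For `"Wavefunction: 2.2214 2.2214"` the result is roughly 3.14153, which lies within 1e-4 of the reference 3.1416.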
Counting matching lines
When count is used instead of a value-extraction key, pseudotest counts all lines containing the grep pattern. The count check takes precedence over field/column/field_re/field_im if both are present.
No warnings:
  file: run.log
  grep: "WARNING"
  count: 0
Error count:
  file: run.log
  grep: "ERROR"
  count: 2
Numeric tolerance
tol is an absolute tolerance applied when both the extracted value and the reference are numeric:
energy:
  file: results.txt
  grep: "Energy:"
  field: 2
  value: -42.5000
  tol: 1e-4 # pass if |calculated - reference| <= 1e-4
Without tol, numeric values must match exactly (difference == 0). String values always require exact equality regardless of tol.
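The comparison rule can be summarised in a small function. This is a sketch of the behaviour described above, with the hypothetical name `values_match`; how pseudotest decides that a value is numeric is an assumption here (anything `float()` accepts).

```python
def values_match(calculated, reference, tol=None):
    """Compare an extracted value with its reference (hypothetical helper)."""
    try:
        difference = abs(float(calculated) - float(reference))
    except (TypeError, ValueError):
        # At least one side is not numeric: require exact string equality.
        return str(calculated) == str(reference)
    # Numeric comparison: absolute tolerance, or exact match when tol is unset.
    return difference <= (tol if tol is not None else 0.0)
```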
If the specified tol is smaller than the effective precision implied by the format of the extracted value (e.g. tol: 1e-8 for a value printed as 1.234), the match is treated as a failure. The detail block reports the offending tolerance and the effective precision, and suggests a minimum acceptable value. This catches configurations where the tolerance constraint is meaningless because the output cannot resolve differences that fine.
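One plausible way to derive the effective precision is to read it off the number of decimal places in the printed text. The sketch below uses Python's `decimal` module for that; it is an assumption about how such a check could work, not pseudotest's implementation, and both function names are hypothetical.

```python
from decimal import Decimal


def effective_precision(printed):
    """Smallest difference the printed text can resolve, e.g. '1.234' -> 0.001."""
    exponent = Decimal(printed).as_tuple().exponent
    return Decimal(1).scaleb(exponent)


def tolerance_is_resolvable(tol, printed):
    """True when tol is at least as large as the printed precision."""
    return Decimal(str(tol)) >= effective_precision(printed)
```

A value printed as `1.2300` resolves differences of 1e-4, so a tolerance of 1e-8 would be rejected while 1e-4 would be accepted.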
File metadata match¶
Compare a file's size in bytes:
Directory matches¶
File presence: assert that a specific file exists inside a directory:
File count: count files directly inside a directory (subdirectories are not counted):
If the directory does not exist, both directory matches fail.
Broadcasted matches (vector-style checks)¶
When any parameter value in a match is a list, pseudotest expands that match into one logical sub-match per list element. All list-valued parameters in the same match must have equal length; scalar parameters are reused for every element.
multi_energy:
  matches: [R1, R2, R3]
  file: [r1.txt, r2.txt, r3.txt]
  grep: "Energy:"
  field: 2
  value: [-10.0, -20.0, -30.0]
  tol: 1e-6 # scalar: same tolerance applied to all three
This is equivalent to writing three separate named matches, with names "R1", "R2", and "R3". Any element that fails is reported individually.
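The expansion rule can be sketched as follows: list-valued parameters are indexed element by element, and scalars are repeated. The function name `expand_broadcast` is hypothetical and the code only illustrates the rule stated above.

```python
def expand_broadcast(params):
    """Expand a match with list-valued parameters into per-element matches."""
    lengths = {len(v) for v in params.values() if isinstance(v, list)}
    if not lengths:
        return [dict(params)]
    if len(lengths) > 1:
        raise ValueError("all list-valued parameters must have equal length")
    (count,) = lengths
    return [
        # Lists contribute their i-th element; scalars are reused as-is.
        {k: (v[i] if isinstance(v, list) else v) for k, v in params.items()}
        for i in range(count)
    ]
```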
Broadcast works with any match type:
# Check the same field across multiple files
band_gaps:
  matches: [Case1, Case2]
  file: [case1/bands.txt, case2/bands.txt]
  grep: "Band gap"
  field: 3
  value: [1.1, 2.3]
  tol: [0.01, 0.01]
# Check file counts across multiple directories
Checkpoint directories:
  matches: [run1, run2]
  directory: [run1/output, run2/output]
  count_files: [4, 6]
Running tests¶
Options¶
| Flag | Description |
|---|---|
| -D, --directory DIR | Directory containing executables (default: .) |
| -p, --preserve | Keep temporary work directory after the run for debugging |
| -t, --timeout N | Per-input execution timeout in seconds (default: 600) |
| -r, --report FILE | Append a YAML execution report to FILE |
| -v / -vv | Increase logging verbosity (INFO / DEBUG) |
Inspecting failures¶
Add -p to retain the working directory after a failed run, then inspect generated files:
Use -vv to see the full stdout/stderr of the executable on failure, and to trace match evaluation in detail.
Updating failing configs¶
pseudotest-update re-runs the test suite and automatically patches the YAML config for failing matches. Two modes are available.
Update tolerances¶
For each failing numeric match, pseudotest-update computes |calculated - reference| × 1.1, rounds the result up to two significant figures, and writes that value as tol. Reference values are not changed.
Example: if the observed difference is 0.0034, the written tolerance is 0.0038.
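The arithmetic behind this example can be reproduced with Python's `decimal` module, which avoids binary floating-point noise when rounding. This is a sketch of the stated formula only; the function name `suggested_tolerance` is hypothetical.

```python
from decimal import ROUND_CEILING, Decimal


def suggested_tolerance(difference):
    """Return |difference| * 1.1, rounded up to two significant figures."""
    padded = Decimal(str(abs(difference))) * Decimal("1.1")
    if padded == 0:
        return Decimal(0)
    # adjusted() is the exponent of the leading digit; keep one more digit.
    quantum = Decimal(1).scaleb(padded.adjusted() - 1)
    return padded.quantize(quantum, rounding=ROUND_CEILING)
```

For a difference of 0.0034 the padded value is 0.00374, which rounds up to 0.0038.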
Update reference values¶
Replaces each failing reference value with the observed calculated value. Tolerances are not changed. The replacement is type-preserving: a ScalarFloat reference retains its original decimal precision.
Write to a different file¶
The original file is left untouched; changes are written to updated.yaml.
Protecting matches from updates¶
Add protected: true to any match that must never be modified automatically:
This is useful when the references are obtained by some other method (e.g., theoretical values).
Note that file_is_present checks are never updated automatically regardless of the protected flag.
Broadcast and tolerance updates¶
When a tolerance update applies to a broadcasted match, the scalar tol is automatically expanded to a list of the correct length, and only the failing elements are changed:
# Before update (two values, one failing)
multi:
  file: [r1.txt, r2.txt]
  value: [1.0, 2.0]
  tol: 1e-6

# After --tolerance update (only the second element was failing)
multi:
  file: [r1.txt, r2.txt]
  value: [1.0, 2.0]
  tol: [1e-6, 5.5e-4]
MPI execution¶
Set MPIEXEC to your MPI launcher to enable parallel execution:
The launcher is prepended automatically and the per-input Processors key controls the process count:
Supported launchers and their process-count flag:
| Launcher | Flag |
|---|---|
| mpiexec, mpirun, mpiexec.hydra, orterun | -np |
| srun (SLURM) | -n |
| aprun (Cray) | -n |
| any other | -np (default) |
Different inputs can use different process counts:
When MPIEXEC is not set, Processors has no effect and the executable is run directly.
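The launcher table can be read as a lookup with a default. The sketch below builds a command line from it; matching the launcher by its file name and the exact argument order are assumptions, and `build_command` is a hypothetical name rather than part of pseudotest.

```python
import os
from pathlib import Path

# Process-count flag per launcher, as listed in the table above.
PROCESS_FLAGS = {
    "mpiexec": "-np",
    "mpirun": "-np",
    "mpiexec.hydra": "-np",
    "orterun": "-np",
    "srun": "-n",
    "aprun": "-n",
}


def build_command(executable, processors=1, environ=None):
    """Prefix the executable with the MPI launcher when MPIEXEC is set."""
    environ = os.environ if environ is None else environ
    launcher = environ.get("MPIEXEC")
    if not launcher:
        # No launcher configured: Processors is ignored.
        return [executable]
    # Assumption: the launcher is identified by its file name; unknown
    # launchers fall back to -np.
    flag = PROCESS_FLAGS.get(Path(launcher).name, "-np")
    return [launcher, flag, str(processors), executable]
```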
YAML report output¶
Pass --report FILE to append a YAML document with per-run results:
If results.yaml already exists, the new document is appended with a --- separator (multi-document YAML).
Report structure:
test.yaml:
  Name: Solver regression suite
  Enabled: true
  Executable: solver.x
  Inputs:
    case_01.in:
      InputMethod: argument
      Processors: 4
      ExpectedFailure: false
      Execution: pass
      Elapsed time: 3.14
      Matches:
        total_energy:
          file: results.txt
          grep: "Total energy:"
          field: 3
          reference: -42.5000 # original reference
          value: -42.5001 # calculated value
This output can be useful as a CI artifact or for further processing.
Troubleshooting¶
Executable not found¶
- Verify that Executable in the YAML matches the actual filename.
- Check that the path given to -D contains the executable.
- Confirm the executable has the execute bit set (chmod +x).
Match extraction returns None / match fails with no detail¶
- Use -p to keep the work directory and open the target file directly.
- Check that grep matches a line that actually exists.
- Check that the field or column index is within range for that line.
- If the file is empty or missing, the match will always fail.
Tolerance too small¶
A failure message like "Tolerance 1e-8 is smaller than the effective precision 1e-4 of calculated value '1.2300'. Consider using tolerance >= 1.00e-04" means the printed value has fewer significant digits than the tolerance requires. The match is unconditionally failed in this case. Either relax the tolerance to at least the suggested value, or configure the executable to print more significant digits.
Timeout failures¶
- Increase --timeout.
- Add -vv to see if the executable starts at all.
- Verify MPI settings and that the launcher is available.
Unexpected update behavior¶
- Matches with protected: true are never modified.
- file_is_present checks are never reference-updated.
- Only failing matches are updated; passing ones are left alone.