Chapter - Testing and Unit Tests
Supplementary chapter prepared for the BWXT Data Science Workforce Training Pilot. This material is original to the program.
About this chapter
Code that runs is not the same as code that is correct. Tests are small programs that check your code does what you expect — automatically, every time. The competency map lists writing "standardized, automated tests to verify your code is working correctly" as a core skill, and Module 3 calls out unit testing explicitly. This chapter covers the mindset and the everyday tools.
Why test
When you change one function, how do you know you did not quietly break another? Without tests you re-run things by hand and hope. With tests, the computer re-checks everything in seconds. Tests:
- Catch mistakes early, before they reach production data.
- Let you change code with confidence (if the tests still pass, you did not break the checked behavior).
- Document what the code is supposed to do.
For a defect-detection pipeline, a silent bug in preprocessing could mislabel every image. A test on that function would catch it immediately.
The simplest test: assert
Python's assert checks that a condition is true and raises an error if it is not.
def normalize(x, lo, hi):
return (x - lo) / (hi - lo)
assert normalize(5, 0, 10) == 0.5
assert normalize(0, 0, 10) == 0.0If a result is wrong, the program stops at the failing assert and tells you. Asserts are the seed of every test.
Unit tests with pytest
A unit test checks one small piece (a "unit", usually a function) in isolation. pytest is the standard tool. You write functions whose names start with test_, each containing assertions:
# test_features.py
from features import normalize
def test_normalize_midpoint():
assert normalize(5, 0, 10) == 0.5
def test_normalize_bounds():
assert normalize(0, 0, 10) == 0.0
assert normalize(10, 0, 10) == 1.0Run pytest in the terminal and it discovers and runs every test_ function, then reports passes and failures.
The red–green cycle
Good practice is to write the test first (it fails — red), then write code until it passes (green), then improve the code while the test keeps it honest.
What to test
You cannot test everything; test what matters:
- Core logic — preprocessing, feature calculations, metric computations.
- Edge cases — empty input, zeros, the largest and smallest values, missing data.
- Bugs you fix — add a test that fails on the bug, then fix it, so it can never come back.
You do not usually unit-test the model's accuracy this way (that is evaluation), but you do test the deterministic code around it.
Practice Questions
Practice Questions
- Why are automated tests more reliable than re-checking code by hand?
- What does
assertdo when its condition is false? - What is a unit test, and how does pytest find your tests?
- Describe the red–green–refactor cycle in your own words.
- Name three things worth testing in a data-preprocessing function.
- When you fix a bug, why is it good practice to add a test for it first?