Chapter - Testing and Unit Tests

Supplementary chapter prepared for the BWXT Data Science Workforce Training Pilot. This material is original to the program.

About this chapter

Code that runs is not the same as code that is correct. Tests are small programs that check that your code does what you expect, automatically, every time they are run. The competency map lists writing "standardized, automated tests to verify your code is working correctly" as a core skill, and Module 3 calls out unit testing explicitly. This chapter covers the mindset and the everyday tools.

Why test

When you change one function, how do you know you did not quietly break another? Without tests you re-run things by hand and hope. With tests, the computer re-checks everything in seconds. Tests:

Catch mistakes early, before they reach production data.
Let you change code with confidence (if the tests still pass, you did not break the checked behavior).
Document what the code is supposed to do.

For a defect-detection pipeline, a silent bug in preprocessing could mislabel every image. A test that covers the affected behavior of that function would flag the bug the next time the tests run, before any images are processed.

The simplest test: assert

Python's assert statement checks that a condition is true and raises an AssertionError if it is not.

python

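def normalize(x, lo, hi):
    # Rescale x from the range [lo, hi] to [0, 1]; assumes hi != lo.
    return (x - lo) / (hi - lo)

# Each assert passes silently when its condition is true.
assert normalize(5, 0, 10) == 0.5
assert normalize(0, 0, 10) == 0.0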

If a result is wrong, the program stops at the failing assert with an AssertionError, and the traceback points to the line that failed. Asserts are the seed of every test.
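An assert can also carry a message, written after a comma, which appears in the error when the check fails. The sketch below reuses normalize from the example above. The helper describe_failure and the wording of the message are illustrative only; they are not part of Python or any library.

```python
def normalize(x, lo, hi):
    return (x - lo) / (hi - lo)

def describe_failure(x, lo, hi, expected):
    """Hypothetical helper: run one assert and report the outcome as text."""
    try:
        result = normalize(x, lo, hi)
        # The text after the comma becomes the AssertionError message.
        assert result == expected, (
            f"normalize({x}, {lo}, {hi}) gave {result}, expected {expected}"
        )
    except AssertionError as err:
        return f"FAILED: {err}"
    return "ok"

print(describe_failure(5, 0, 10, 0.5))   # the check passes
print(describe_failure(5, 0, 10, 0.25))  # fails on purpose to show the message
```

Note that Python skips assert statements entirely when it is started with the -O flag, so asserts suit tests and internal sanity checks rather than validation of user input.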

Unit tests with pytest

A unit test checks one small piece (a "unit", usually a function) in isolation. pytest is a widely used third-party test runner for Python, installed with pip install pytest. You write functions whose names start with test_, each containing assertions:

python

# test_features.py
from features import normalize

def test_normalize_midpoint():
    assert normalize(5, 0, 10) == 0.5

def test_normalize_bounds():
    assert normalize(0, 0, 10) == 0.0
    assert normalize(10, 0, 10) == 1.0

Run pytest in the terminal from the project folder. By default it collects files named test_*.py or *_test.py, runs every function in them whose name starts with test_, and then reports passes and failures.
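The exact == comparisons in these tests work because 0.5, 0.0 and 1.0 are exactly representable as floating-point numbers. For results that are not, comparing with a tolerance is safer. Here is a minimal sketch using math.isclose from the standard library (pytest also provides pytest.approx for the same purpose). The test name, the inputs, and the tolerance are illustrative choices.

```python
import math

def normalize(x, lo, hi):
    return (x - lo) / (hi - lo)

def test_normalize_with_tolerance():
    # 0.1 + 0.2 is not exactly 0.3 in binary floating point,
    # so an exact == comparison against 0.3 would fail here.
    value = normalize(0.1 + 0.2, 0.0, 1.0)
    assert value != 0.3
    # Comparing within a small relative tolerance succeeds.
    assert math.isclose(value, 0.3, rel_tol=1e-9)

test_normalize_with_tolerance()  # pytest would call this for you
```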

The red–green cycle

Good practice is to write the test first (it fails — red), then write code until it passes (green), then improve the code while the test keeps it honest.

Red, green, refactor: write a failing test, make it pass with the simplest code, then clean up while the tests guard the behavior. Repeat for the next small piece.
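One pass through the cycle might look like the sketch below. The function clamp, which limits a value to the range 0 to 1, is a hypothetical example chosen for illustration. All three steps are shown in one file so the example runs as written; in practice each step is a separate edit followed by a test run.

```python
# Step 1 (red): write the test first. Run before clamp exists,
# this test fails with a NameError.
def test_clamp():
    assert clamp(-0.2) == 0.0   # below the range
    assert clamp(0.4) == 0.4    # inside the range
    assert clamp(1.7) == 1.0    # above the range

# Step 2 (green): the simplest code that makes the test pass.
def clamp(value):
    if value < 0.0:
        return 0.0
    if value > 1.0:
        return 1.0
    return value

test_clamp()

# Step 3 (refactor): tidy the implementation. The same test guards it.
def clamp(value):
    return max(0.0, min(1.0, value))

test_clamp()
```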

What to test

You cannot test everything; test what matters:

Core logic — preprocessing, feature calculations, metric computations.
Edge cases — empty input, zeros, the largest and smallest values, missing data.
Bugs you fix: first add a test that fails because of the bug, then fix the code until the test passes, so the bug cannot return unnoticed.

You do not usually unit-test the model's accuracy this way (that is evaluation), but you do test the deterministic code around it.
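As an illustration of an edge-case test and a regression test, consider what normalize does when hi equals lo: the version shown earlier in this chapter divides by zero. The sketch below assumes the chosen fix is to raise a ValueError with a clear message. That fix and the test names are assumptions made for this example, not a prescribed design.

```python
def normalize(x, lo, hi):
    # Assumed fix for illustration: reject an empty range with a clear
    # error instead of failing with ZeroDivisionError.
    if hi == lo:
        raise ValueError("hi and lo must differ")
    return (x - lo) / (hi - lo)

def test_normalize_zero_input():
    # Edge case: zero as the value being scaled.
    assert normalize(0, -5, 5) == 0.5

def test_normalize_rejects_empty_range():
    # Regression test: it fails against the earlier normalize,
    # which raised ZeroDivisionError for this input.
    try:
        normalize(3, 7, 7)
    except ValueError:
        return
    raise AssertionError("expected ValueError for hi == lo")

test_normalize_zero_input()
test_normalize_rejects_empty_range()
```

Under pytest, the try/except in the second test is usually written as a with pytest.raises(ValueError): block. Plain Python is used here so the example runs without pytest installed.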

Practice Questions

Why are automated tests more reliable than re-checking code by hand?
What does assert do when its condition is false?
What is a unit test, and how does pytest find your tests?
Describe the red–green–refactor cycle in your own words.
Name three things worth testing in a data-preprocessing function.
When you fix a bug, why is it good practice to add a test for it first?
