Chapter - Model Deployment and Monitoring
Supplementary chapter prepared for the BWXT Data Science Workforce Training Pilot.
Outline in development. This chapter is scaffolded from the maturity-model objectives. The BWXT-specific parts — target infrastructure, serving platform, monitoring stack, and retraining policy — should be filled in with the program's subject-matter experts. The conceptual outline below is ready to teach from.
About this chapter
A model that runs only in a notebook delivers no value. Deployment is the work of moving a trained model to where it makes predictions on real data; monitoring is the work of confirming that it keeps performing after it gets there. These are the Tier 4 capabilities the maturity model calls Solutions Architecture and Algorithm & Pipeline Maintenance.
The lifecycle does not end at training
Production machine learning is a loop, not a finish line. A model is deployed, watched, and retrained as the world changes.
What this chapter will cover
Packaging a model
- Saving and versioning model weights and the exact preprocessing steps.
- Pinning dependencies so the model runs the same everywhere.
- (SME input: BWXT's model registry / artifact storage.)
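The packaging steps listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not BWXT's registry workflow: the function names `package_model` and `load_model`, the file naming scheme, and the toy model and preprocessing values are all assumptions made for the example. A real project would likely use its framework's own save format and would record pinned package versions in the manifest as well.

```python
import hashlib
import json
import pickle
import sys
import tempfile
from pathlib import Path


def package_model(model, preprocessing, version, out_dir):
    """Write the model and its preprocessing settings as one versioned artifact."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Bundle weights and preprocessing together so they cannot drift apart
    # between training and serving.
    payload = pickle.dumps({"model": model, "preprocessing": preprocessing})
    artifact_path = out_dir / f"model-{version}.pkl"
    artifact_path.write_bytes(payload)

    # The manifest records what was saved and a checksum to verify it later.
    manifest = {
        "version": version,
        "artifact": artifact_path.name,
        "sha256": hashlib.sha256(payload).hexdigest(),
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "preprocessing": preprocessing,
    }
    manifest_path = out_dir / f"model-{version}.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest


def load_model(version, out_dir):
    """Load an artifact, refusing to proceed if its checksum does not match."""
    out_dir = Path(out_dir)
    manifest = json.loads((out_dir / f"model-{version}.json").read_text())
    payload = (out_dir / manifest["artifact"]).read_bytes()
    if hashlib.sha256(payload).hexdigest() != manifest["sha256"]:
        raise ValueError("artifact checksum mismatch")
    # Only unpickle artifacts from a trusted source: pickle can execute code.
    return pickle.loads(payload)


# Hypothetical stand-ins for real weights and a real preprocessing config.
toy_model = {"weights": [0.2, -0.1, 0.7], "bias": 0.05}
toy_preprocessing = {"resize": [224, 224], "mean": 0.5, "std": 0.25}

with tempfile.TemporaryDirectory() as tmp:
    manifest = package_model(toy_model, toy_preprocessing, "1.0.0", tmp)
    restored = load_model("1.0.0", tmp)
```

The design point is that the preprocessing settings travel inside the same artifact as the weights, and the checksum lets the serving side detect a corrupted or swapped file before it makes predictions.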
Serving predictions
- Batch scoring versus a real-time inference service.
- A simple prediction API; where inference runs (edge vs. server, CPU vs. GPU).
- (SME input: BWXT's target serving platform and hardware.)
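The difference between batch scoring and a real-time prediction call can be shown with one shared `predict` function and two entry points. This is a sketch under stated assumptions: the linear scorer, its weights, the 0.5 threshold, and the JSON request shape are hypothetical placeholders, and a real service would sit behind whatever serving platform the program selects.

```python
import json

# Hypothetical stand-in for a trained model: a two-feature linear scorer.
WEIGHTS = [0.8, -0.5]
BIAS = 0.1
THRESHOLD = 0.5


def predict(features):
    """Score one feature vector and return a defect decision."""
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return {"score": round(score, 4), "defect": score >= THRESHOLD}


def score_batch(records):
    """Batch scoring: many stored records in, one result per record out."""
    return [{"id": r["id"], **predict(r["features"])} for r in records]


def handle_request(body):
    """Real-time inference: one JSON request in, one JSON response out."""
    try:
        request = json.loads(body)
        features = request["features"]
        if len(features) != len(WEIGHTS):
            raise ValueError("wrong number of features")
        return json.dumps({"status": "ok", **predict(features)})
    except (KeyError, ValueError, TypeError) as exc:
        # Malformed input returns an error response instead of crashing.
        return json.dumps({"status": "error", "detail": str(exc)})


batch_results = score_batch([
    {"id": "weld-001", "features": [1.0, 0.2]},
    {"id": "weld-002", "features": [0.1, 0.9]},
])
single_response = handle_request('{"features": [1.0, 0.2]}')
```

Both paths call the same `predict` function, so a batch job and a live endpoint cannot disagree about how a given input is scored. Only the input handling and error reporting differ.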
Monitoring in production
- Operational metrics (latency, errors, throughput).
- Model metrics on live data, including data drift (incoming images no longer resemble the training set).
- Alerting and a human-in-the-loop review for flagged predictions.
- (SME input: BWXT's monitoring/observability stack.)
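One common way to put a number on data drift is the Population Stability Index (PSI), which compares how a summary statistic is distributed in a reference sample and in live data. The sketch below is an illustration, not the chapter's prescribed method: it assumes each image has been reduced to a single value such as mean brightness, generates synthetic data in place of real images, and uses the 0.1 and 0.25 cutoffs that are widely quoted rules of thumb rather than BWXT thresholds.

```python
import math
import random


def psi(reference, live, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the end bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    live_p = proportions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_p, live_p))


# Synthetic per-image mean brightness values (hypothetical data).
rng = random.Random(0)
reference = [rng.gauss(0.50, 0.05) for _ in range(2000)]
same_setup = [rng.gauss(0.50, 0.05) for _ in range(2000)]
new_lighting = [rng.gauss(0.65, 0.05) for _ in range(2000)]

psi_same = psi(reference, same_setup)
psi_shifted = psi(reference, new_lighting)

# Rule-of-thumb reading: below 0.1 stable, above 0.25 investigate.
drift_alert = psi_shifted > 0.25
```

A single brightness statistic will miss many kinds of drift, so in practice several statistics, or the model's own output distribution, would be tracked the same way and routed to the alerting and review steps listed above.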
Retraining
- Deciding when to retrain (scheduled vs. drift-triggered).
- Keeping a labeled feedback loop from inspectors.
- Safe rollout: shadow testing and rollback.
- (SME input: BWXT's retraining cadence and approval process.)
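The retraining decisions above can be expressed as small, testable rules. The sketch below is hypothetical throughout: the 90-day age limit, the 0.25 drift threshold, the 0.02 minimum accuracy gain, and the toy threshold "models" are illustrative values, and the real cadence and approval gates are for the program's subject-matter experts to define.

```python
def should_retrain(days_since_training, drift_score,
                   max_age_days=90, drift_threshold=0.25):
    """Return the reasons to retrain; an empty list means no retraining yet."""
    reasons = []
    if days_since_training >= max_age_days:
        reasons.append("scheduled")
    if drift_score >= drift_threshold:
        reasons.append("drift")
    return reasons


def shadow_compare(labeled_samples, current_model, candidate_model):
    """Run both models on the same inspector-labeled samples.

    In shadow testing only the current model's predictions are acted on;
    the candidate's predictions are recorded for comparison.
    """
    n = len(labeled_samples)
    current_ok = sum(current_model(x) == y for x, y in labeled_samples)
    candidate_ok = sum(candidate_model(x) == y for x, y in labeled_samples)
    return {
        "current_accuracy": current_ok / n,
        "candidate_accuracy": candidate_ok / n,
    }


def choose_model(report, min_gain=0.02):
    """Promote the candidate only on a clear gain; otherwise keep the current model."""
    gain = report["candidate_accuracy"] - report["current_accuracy"]
    return "candidate" if gain >= min_gain else "current"


# Toy feedback loop: (defect score, inspector's label) pairs.
feedback = [(0.2, False), (0.4, False), (0.6, True), (0.8, True), (0.55, True)]


def current_model(score):
    return score >= 0.7  # older, stricter threshold


def candidate_model(score):
    return score >= 0.5  # retrained threshold


reasons = should_retrain(days_since_training=120, drift_score=0.4)
report = shadow_compare(feedback, current_model, candidate_model)
decision = choose_model(report)
```

Keeping the current model as the default outcome of `choose_model` is what makes rollback cheap: the candidate has to earn promotion on labeled data before it affects any inspection result.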
Why it matters
A weld-defect model trained on last year's images can quietly degrade when a new camera, lighting rig, or material is introduced. Without monitoring, you would not notice until defects slipped through. Deployment and monitoring are what turn a good model into a dependable part of the inspection line.
Practice Questions
- Why is production machine learning described as a loop rather than a one-time task?
- What is data drift, and how could it appear on a weld inspection line?
- Name two things you must package alongside the model weights for the model to run reliably.
- What is the difference between batch scoring and real-time inference?
- Give one signal that should trigger retraining a deployed model.