Testing and Deployment
Three Test Directories
| Directory | What It Tests | Requirements |
|---|---|---|
| tests/unit/ | Parsers, loaders, orchestration, validation, config | Fixtures only: no database, no network |
| tests/web/ | API endpoints, HTML pages, security headers, accessibility | In-memory fixtures via TestClient |
| tests/warehouse/ | Real data validation against warehouse.duckdb | Populated warehouse (auto-skipped if absent) |
Unit and web tests run in CI on every push to main and on every PR. Warehouse tests run only
locally, after jobclass-pipeline run-all has populated the warehouse.
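The auto-skip behaviour for warehouse tests could be built on a small availability check like the one below. This is a minimal sketch, not the project's actual code: the `WAREHOUSE_PATH` location and the `warehouse_available` helper are assumptions made for illustration.

```python
from pathlib import Path

# Assumed location of the warehouse file; the real project may keep it elsewhere.
WAREHOUSE_PATH = Path("warehouse.duckdb")

# Message shown in the test report when warehouse tests are skipped.
SKIP_REASON = "warehouse.duckdb not found; run `jobclass-pipeline run-all` first"


def warehouse_available(path: Path = WAREHOUSE_PATH) -> bool:
    """Return True only if the warehouse file exists and is non-empty."""
    return path.is_file() and path.stat().st_size > 0
```

In a pytest module under tests/warehouse/, the check would typically be wired in with a module-level marker such as `pytestmark = pytest.mark.skipif(not warehouse_available(), reason=SKIP_REASON)`, so that every test in the file is reported as skipped rather than failed when the warehouse has not been built.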
What the Tests Verify
- Schema contracts — Required columns exist on every table
- Grain uniqueness — No duplicate business keys in any dimension or fact
- Referential integrity — Every fact row's dimension keys point to existing dimension rows
- Idempotence — Re-running a load produces the same row count
- Validation framework — Structural, temporal, and drift checks all pass
- API correctness — Every endpoint returns expected status codes and response shapes
- Security — CSP headers, no PII exposure, CORS configuration
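Three of the checks above (grain uniqueness, referential integrity, idempotence) reduce to short SQL queries. The sketch below uses the standard-library `sqlite3` module as a stand-in for DuckDB so that it runs without extra dependencies. The table names, column names, sample rows, and helper functions are all hypothetical, not the project's schema.

```python
import sqlite3


def duplicate_keys(conn, table, key_columns):
    """Grain uniqueness: return business-key tuples that occur more than once."""
    cols = ", ".join(key_columns)
    sql = f"SELECT {cols} FROM {table} GROUP BY {cols} HAVING COUNT(*) > 1"
    return conn.execute(sql).fetchall()


def orphaned_fact_keys(conn, fact, fact_key, dim, dim_key):
    """Referential integrity: return fact keys with no matching dimension row.

    NULL fact keys never match a dimension row, so they are reported too.
    """
    sql = (
        f"SELECT DISTINCT f.{fact_key} FROM {fact} AS f "
        f"LEFT JOIN {dim} AS d ON f.{fact_key} = d.{dim_key} "
        f"WHERE d.{dim_key} IS NULL"
    )
    return [row[0] for row in conn.execute(sql)]


def load_occupations(conn, rows):
    """Hypothetical idempotent load: upsert rows by primary key."""
    conn.executemany(
        "INSERT OR REPLACE INTO dim_occupation (occupation_key, title) VALUES (?, ?)",
        rows,
    )


def row_count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Illustrative in-memory schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_occupation (occupation_key TEXT PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE fact_example (occupation_key TEXT, year INTEGER)")

rows = [("occ-1", "Example Occupation A"), ("occ-2", "Example Occupation B")]
load_occupations(conn, rows)
count_after_first_load = row_count(conn, "dim_occupation")
load_occupations(conn, rows)  # second run of the same load

conn.executemany(
    "INSERT INTO fact_example VALUES (?, ?)",
    [("occ-1", 2023), ("occ-2", 2023)],
)

# Idempotence: re-running the load leaves the row count unchanged.
assert row_count(conn, "dim_occupation") == count_after_first_load
# Grain uniqueness: no duplicate (occupation_key, year) pairs.
assert duplicate_keys(conn, "fact_example", ["occupation_key", "year"]) == []
# Referential integrity: every fact key resolves to a dimension row.
assert orphaned_fact_keys(
    conn, "fact_example", "occupation_key", "dim_occupation", "occupation_key"
) == []
```

Returning the offending keys, rather than a bare pass/fail flag, makes a failing test report show exactly which rows broke the contract.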
CI Configuration
GitHub Actions runs on every push to main and every PR:
lint:
  python-version: "3.14"
  steps: ruff check + ruff format --check
test:
  matrix: [3.12, 3.14]
  steps: pip install -e ".[dev]" → pytest --cov
Key lesson: Run ruff format --check src/ tests/ locally
before pushing. CI will reject unformatted code even if it's functionally correct.
Full Deployment Pipeline
1. ruff check src/ tests/ # Lint passes
2. ruff format --check src/ tests/ # Formatting matches
3. pytest tests/unit/ tests/web/ -q # All tests pass
4. git push # CI passes on GitHub
5. python scripts/build_static.py \
--base-path /jobclass # Rebuild static site
6. python scripts/deploy_pages.py # Deploy to GitHub Pages
Steps 1–4 ensure code quality. Step 5 regenerates every HTML page and JSON file
(takes several minutes for ~870 occupations). Step 6 force-pushes _site/
to the gh-pages branch.
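The local half of the pipeline (steps 1 to 3) can be wrapped in a small gate script that stops at the first failure. This is a sketch under stated assumptions: the `LOCAL_STEPS` list and `run_steps` helper are hypothetical names, and the commands are copied from the numbered steps. Steps 4 to 6 are left out on purpose, because a successful `git push` does not tell the script whether CI has passed.

```python
import subprocess

# Commands copied from steps 1 to 3 of the pipeline, in order.
LOCAL_STEPS = [
    ["ruff", "check", "src/", "tests/"],
    ["ruff", "format", "--check", "src/", "tests/"],
    ["pytest", "tests/unit/", "tests/web/", "-q"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop at the first failure.

    Returns 0 if every step succeeds, otherwise the 1-based number of the
    failing step. `runner` is injectable so the function can be exercised
    without invoking the real tools.
    """
    for number, command in enumerate(steps, start=1):
        print(f"[{number}/{len(steps)}] {' '.join(command)}")
        result = runner(command)
        if result.returncode != 0:
            print(f"Step {number} failed; fix it before pushing.")
            return number
    return 0


# Typical use from a script entry point (not executed here):
#     import sys
#     sys.exit(run_steps(LOCAL_STEPS))
```

Because the function returns the failing step number, it doubles as a process exit code: 0 means it is safe to push, and any other value identifies the step to fix.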