Pipeline testing is key to reliability. This article covers unit tests, integration tests, and automated quality checks in CI/CD.
Why Test Pipelines¶
Untested pipelines fail silently: nothing crashes, but bad data ends up in reports.
Test Pyramid¶
- Unit tests — individual transformations
- Integration tests — entire pipeline
- Data quality tests — output validation
- Contract tests — compliance with contracts
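```python
def test_removes_test_orders():
    # Two orders: one real, one flagged as a test order.
    input_data = [
        {"id": 1, "status": "confirmed"},
        {"id": 2, "status": "test"},
    ]
    # run_model is a project-specific helper (not shown here) that runs the
    # stg_orders model on the given rows and returns the output rows.
    result = run_model("stg_orders", input_data)
    # Only the confirmed order should survive.
    assert len(result) == 1
    assert all(r["status"] != "test" for r in result)
```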
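The upper layers of the pyramid, data quality tests and contract tests, can be sketched in the same style. The example below is a minimal illustration, not a definitive implementation: `STG_ORDERS_CONTRACT`, `ALLOWED_STATUSES`, `check_contract` and `check_quality` are hypothetical names, and the allowed status values are assumptions made for the example.

```python
# Hypothetical output contract for stg_orders: column name -> expected type.
STG_ORDERS_CONTRACT = {"id": int, "status": str}
# Assumed set of valid statuses; adjust to the real domain.
ALLOWED_STATUSES = {"confirmed", "shipped", "cancelled"}


def check_contract(rows, contract):
    """Return a list of contract violations (wrong columns or types)."""
    errors = []
    for i, row in enumerate(rows):
        if set(row) != set(contract):
            errors.append(f"row {i}: columns {sorted(row)} != {sorted(contract)}")
            continue
        for column, expected_type in contract.items():
            if not isinstance(row[column], expected_type):
                errors.append(f"row {i}: {column} is not {expected_type.__name__}")
    return errors


def check_quality(rows):
    """Return a list of data quality violations in the output rows."""
    errors = []
    ids = [row["id"] for row in rows]
    if len(ids) != len(set(ids)):
        errors.append("duplicate ids")
    for row in rows:
        if row["status"] not in ALLOWED_STATUSES:
            errors.append(f"id {row['id']}: unexpected status {row['status']!r}")
    return errors


def test_stg_orders_output():
    output = [
        {"id": 1, "status": "confirmed"},
        {"id": 2, "status": "shipped"},
    ]
    assert check_contract(output, STG_ORDERS_CONTRACT) == []
    assert check_quality(output) == []
```

Returning a list of violations instead of raising on the first one lets a single test run report every problem in the output.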
CI/CD¶
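```yaml
# .github/workflows/data-ci.yml
on: [pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumption: soda-core-duckdb supplies the soda CLI for the scan step.
      - run: pip install dbt-duckdb soda-core-duckdb
      - run: dbt test
      - run: soda scan -d test checks/
```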
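A CI step fails when its command exits with a non-zero code, which is how a failing check blocks a merge. Custom checks can plug into the same mechanism. The sketch below assumes a small hand-written gate script; `run_checks`, `main` and the two example checks are hypothetical and only illustrate the exit-code convention.

```python
def run_checks(checks):
    """Run each named check and collect the names of those that fail."""
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception as exc:  # a crashing check counts as a failure
            print(f"ERROR {name}: {exc}")
            ok = False
        if not ok:
            failed.append(name)
        print(f"{'PASS' if ok else 'FAIL'} {name}")
    return failed


def main(rows):
    """Return the process exit code for the given pipeline output rows."""
    # Hypothetical checks; replace with the project's real ones.
    checks = {
        "output_not_empty": lambda: len(rows) > 0,
        "no_test_orders": lambda: all(r["status"] != "test" for r in rows),
    }
    failed = run_checks(checks)
    return 1 if failed else 0  # non-zero fails the CI step
```

In a workflow, the script would end with `sys.exit(main(rows))` and run as one more `- run:` step after `dbt test`.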
Summary¶
Pipeline testing prevents silent failures. Unit tests, quality checks, and CI/CD are the foundation of reliable data.