Quality Gates
Bad code doesn't make it to production.
We design and implement quality gates in the CI/CD pipeline — code coverage thresholds, security scanning, dependency audits, deployment policies. Automatic quality protection on every commit.
Why quality gates¶
Code review catches logical errors. Quality gates catch everything else — automatically, consistently, on every commit. Without fatigue, without oversight, without “let’s skip it just this once”.
Without quality gates you rely on discipline. Discipline fails on Friday afternoon, under deadline pressure, when “just this small fix” goes through without tests, without review, straight to production. And on Monday you’re dealing with an incident.
Quality gates turn best practices into policy. We don’t say “you should write tests” — we say “without tests, code doesn’t reach production”. We don’t say “check your dependencies” — we say “a critical vulnerability = blocked merge”.
The result: consistent quality regardless of who commits, when they commit, and under what pressure.
Anatomy of a CI/CD pipeline with quality gates¶
A modern CI/CD pipeline is a series of gates — each checking a different aspect of quality. Failure at any gate = blocked.
Gate 1: Static analysis (30s-2min)¶
Linting: ESLint, Pylint, golangci-lint, SwiftLint. Consistent code style, caught anti-patterns, potential bugs. Autofix where possible — formatting is fixed automatically, the developer only deals with real issues.
Type checking: TypeScript tsc --noEmit, mypy, Go compiler. Type errors are the cheapest bugs to fix — caught in seconds, not hours in production.
Dead code detection: Unused imports, unreachable code, unused variables. A clean codebase is a maintainable codebase.
Gate 2: Unit tests (1-3min)¶
The full unit test suite. Fast feedback — the developer knows within minutes whether something is broken. Parallelization for large test suites.
Fail criteria: Any test failure = blocked. No “that’s a flaky test, we’ll ignore it”. A flaky test is either fixed or deleted. Flaky tests destroy confidence in the entire test suite.
Gate 3: Code coverage (part of unit tests)¶
A coverage report generated during the test run. Threshold enforcement:
Overall coverage: Minimum 80% line coverage. Merge blocked below the threshold.
Diff coverage: New code in the PR must have >90% coverage. Legacy debt doesn’t block new development, but new code must be tested.
Critical path coverage: Business logic, error handling, security-sensitive code must have 100% coverage. Identified via code owners or path patterns.
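The threshold logic above can be sketched as a small gate function. This is an illustration only: the function name and return shape are invented for this page, while the numbers mirror the policy above. In practice, tools like diff-cover or SonarQube quality gates enforce this.

```python
# Illustrative coverage gate: names and thresholds mirror the policy
# described above, not any real tool's API.

OVERALL_MIN = 80.0  # minimum overall line coverage (%)
DIFF_MIN = 90.0     # minimum coverage of lines changed in the PR (%)

def coverage_gate(overall: float, diff: float) -> tuple[bool, list[str]]:
    """Return (passed, reasons) for the coverage gate."""
    reasons = []
    if overall < OVERALL_MIN:
        reasons.append(f"overall coverage {overall:.1f}% < {OVERALL_MIN}%")
    if diff < DIFF_MIN:
        reasons.append(f"diff coverage {diff:.1f}% < {DIFF_MIN}%")
    return (not reasons, reasons)

# Legacy debt keeps overall coverage at 82%, but the new code is tested:
# the merge goes through.
passed, reasons = coverage_gate(overall=82.0, diff=95.0)
print(passed)  # True
```

Note the asymmetry: overall coverage gates the whole codebase loosely, diff coverage gates new code strictly.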
Gate 4: Integration tests (2-5min)¶
API contract tests, database integrations, service-to-service communication. Test environments spun up in CI (Docker Compose, testcontainers).
Gate 5: Security scanning (2-5min)¶
Multiple layers of security checks:
Dependency scanning (SCA): Snyk, Dependabot, Trivy. Checking for known vulnerabilities in dependencies. Critical/High vulnerability = blocked merge. Medium/Low = warning, fix within the sprint.
SAST (Static Application Security Testing): Semgrep, CodeQL, SonarQube. Source code analysis for security anti-patterns — SQL injection, XSS, hardcoded secrets, insecure deserialization.
Secret detection: GitLeaks, TruffleHog. Scanning commits for API keys, passwords, tokens. One committed secret = one compromised secret. Prevention is infinitely cheaper than rotation.
Container scanning: Trivy, Snyk Container. Vulnerabilities in base images, misconfigured Dockerfiles. An outdated base image with a known CVE = blocked build.
License compliance: FOSSA, Snyk. Checking dependency licenses. GPL in a commercial project? Flagged.
Gate 6: E2E tests (3-10min)¶
Critical user flows tested end-to-end. Playwright/Cypress against a staging environment. A smoke test subset — not the full E2E suite, but critical paths.
Gate 7: Performance check (optional, 2-5min)¶
k6 smoke test against staging. Performance thresholds — latency, throughput. A regression in response time = warning or block.
Gate 8: Deployment policy (runtime)¶
The last gate before production:
Canary deployment: New version on 5% of traffic. Monitoring error rate and latency. Automatic rollback on anomaly. Progressive rollout: 5% → 25% → 50% → 100%.
Feature flags: New functionality hidden behind a feature flag. Deployed to production but visible only to the internal team. Gradually enabled for user segments.
Deploy windows: Defined windows for production deploys. No deploys on Friday afternoon (unless it’s a hotfix). No deploys during peak hours without explicit approval.
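A deploy-window policy like this is easy to encode in the deployment pipeline. A sketch under the assumptions stated above (Friday afternoon blocked, hotfixes exempt); the peak-hours approval path is omitted for brevity.

```python
from datetime import datetime

def deploy_allowed(now: datetime, is_hotfix: bool = False) -> bool:
    """No production deploys on Friday afternoon, unless it's a hotfix."""
    if is_hotfix:
        return True
    # weekday(): Monday == 0 ... Friday == 4
    friday_afternoon = now.weekday() == 4 and now.hour >= 12
    return not friday_afternoon

print(deploy_allowed(datetime(2024, 6, 14, 15, 0)))                  # False (Friday 15:00)
print(deploy_allowed(datetime(2024, 6, 14, 15, 0), is_hotfix=True))  # True
```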
Required approvals: For critical services — manual approval after automated gates. Two-person rule for infrastructure changes.
Code coverage — done right¶
Coverage is a useful metric when used correctly. And useless when you gamify it.
What coverage measures¶
Line coverage: How many lines of code were executed during tests. Most common, simplest, but least informative.
Branch coverage: How many branches (if/else, switch cases) were tested. Better — reveals untested edge cases.
Function coverage: How many functions were called. Useful for a high-level overview.
Mutation coverage: The strictest. Mutates the code (changes > to <, removes a line) and checks whether tests fail. If the tests still pass after a mutation (the mutant survives), the tests are weak. Compute-intensive but reveals the true quality of tests.
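A toy illustration of the idea, with a hand-written mutant; real tools (mutmut, Stryker, PIT) generate mutants automatically and run the suite against each one.

```python
# A hand-made mutant: '>' changed to '>='. Mutation-testing tools
# generate such variants automatically.

def original(a, b):
    return a > b

def mutant(a, b):
    return a >= b  # the mutation

def weak_test(fn):
    # Only checks a case where > and >= agree: too weak to kill the mutant.
    return fn(2, 1) is True

def strong_test(fn):
    # Also checks the a == b boundary, where the mutant differs.
    return fn(2, 1) is True and fn(1, 1) is False

print(weak_test(mutant))    # True: the mutant survives, the tests are weak
print(strong_test(mutant))  # False: the mutant is killed
```

Both tests give 100% line coverage of `original`, yet only the strong one catches the mutation: exactly the gap mutation coverage exposes.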
Coverage anti-patterns¶
Testing for coverage, not correctness: Tests that pass through the code without meaningful assertions. The line executes, coverage goes up, but the test verifies nothing.
100% coverage obsession: Testing trivial code (constructors, getters) at the expense of complex business logic. 80% coverage with quality tests > 100% coverage with empty assertions.
Coverage ratchet without nuance: “Coverage must never drop” as an absolute. Sometimes refactoring removes tested code and coverage drops. That’s OK — diff coverage handles new code, overall coverage is managed strategically.
Security scanning in practice¶
Security scanning without a process generates noise. With a process, it catches vulnerabilities before they reach production.
Triage workflow¶
- The scanner reports a finding — a vulnerability, a secret, an anti-pattern
- Automatic classification by severity (Critical, High, Medium, Low)
- Critical/High: Blocked merge. Fix required before merge.
- Medium: Warning. Fix required by end of sprint. Tracked in the backlog.
- Low: Informational. Fix when convenient.
- False positive: Suppress with a comment (why). Regular review of suppressions.
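The workflow above boils down to a severity-to-action mapping. A sketch with illustrative action names; the severity labels match the triage steps.

```python
# Severity labels match the triage workflow above; action names are
# illustrative, not any scanner's API.
ACTIONS = {
    "Critical": "block_merge",
    "High": "block_merge",
    "Medium": "warn_fix_this_sprint",
    "Low": "informational",
}

def triage(severity: str, suppressed: bool = False) -> str:
    """Map a finding to a pipeline action."""
    if suppressed:
        # Documented false positive; suppressions get regular review.
        return "suppressed"
    # Unknown severity fails closed: block rather than let it through.
    return ACTIONS.get(severity, "block_merge")

print(triage("High"))    # block_merge
print(triage("Medium"))  # warn_fix_this_sprint
```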
Shift-left security¶
Security scanning doesn’t belong only in the CI pipeline. We integrate it into the developer workflow:
IDE plugins: Semgrep, Snyk IDE extension — real-time security feedback while writing code.
Pre-commit hooks: Secret detection, basic SAST. Errors caught before commit = zero CI time wasted.
PR review: A security bot comments on the PR with findings. The developer sees context directly in code review.
Deployment policies¶
Quality gates in CI protect the code. Deployment policies protect production.
Progressive delivery¶
Blue-green deployment: Two identical production environments. Deploy to the inactive one, switch traffic. Instant rollback = switch back.
Canary releases: New version on a small percentage of traffic. Automated metric analysis. Anomaly = automatic rollback. Progressive expansion.
Feature flags: Separating deploy from release. Code is in production but the feature is off. Enabled for specific users, percentages, segments. Kill switch for instant disable without redeploying.
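Percentage-based flag evaluation typically hashes a stable user identifier into a bucket, so the same user gets the same answer on every request. A sketch of the principle; LaunchDarkly and Unleash work similarly, with much richer targeting rules.

```python
import hashlib

def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Stable per-user rollout decision: hash into a 0-99 bucket."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent

# Deploy is separated from release: the code is live, the flag decides
# who sees it. rollout_percent=0 doubles as the kill switch.
print(is_enabled("new-checkout", "user-42", 0))    # False
print(is_enabled("new-checkout", "user-42", 100))  # True
```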
Rollback policy¶
Every deploy must have a rollback plan. Automatic rollback triggers on:
- Error rate > threshold (5× baseline)
- Latency > threshold (3× baseline)
- Health check failure
- Canary analysis failure
Rollback must be faster than fix forward. Typically < 5 minutes.
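The triggers above can be expressed as a single canary-analysis check. The multipliers mirror the thresholds stated here; the function itself is an illustration, not a real controller's API (Argo Rollouts and Flagger implement this against live metrics).

```python
def should_rollback(
    error_rate: float, baseline_error_rate: float,
    p95_latency_ms: float, baseline_p95_ms: float,
    healthy: bool,
) -> bool:
    """Automatic rollback decision, per the policy above."""
    return (
        not healthy                                  # health check failure
        or error_rate > 5 * baseline_error_rate     # 5x baseline error rate
        or p95_latency_ms > 3 * baseline_p95_ms     # 3x baseline latency
    )

# Canary error rate jumped from 0.2% to 1.5% (more than 5x): roll back.
print(should_rollback(1.5, 0.2, 300.0, 250.0, True))  # True
```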
Implementation plan¶
- Audit — we map the current CI/CD pipeline and identify gaps
- Quick wins — linting, type checking, secret detection (days 1-3)
- Test gates — unit tests, coverage thresholds (weeks 1-2)
- Security gates — SCA, SAST, container scanning (weeks 2-3)
- E2E gates — Playwright/Cypress in the pipeline (weeks 3-4)
- Deployment policies — canary, feature flags, rollback automation (weeks 4-6)
- Monitoring — gate metrics dashboard, trend analysis, optimization
Stack¶
CI/CD: GitHub Actions, GitLab CI, CircleCI, Azure DevOps.
Code quality: SonarQube, ESLint, Pylint, golangci-lint.
Security: Snyk, Semgrep, CodeQL, Trivy, GitLeaks, FOSSA.
Coverage: Istanbul/nyc, coverage.py, JaCoCo, lcov.
Deployment: Argo Rollouts, Flagger, LaunchDarkly, Unleash.
Monitoring: Grafana, Prometheus — gate pass rates, pipeline duration trends.
Frequently asked questions¶
What exactly are quality gates?
Automated checks in the CI/CD pipeline that code must pass before it reaches production. If a test fails, coverage drops below the threshold, or a security vulnerability is found — the merge/deploy is blocked. An automatic quality guardian.
Don't quality gates slow down development?
Paradoxically, they speed it up. Without quality gates the team spends time debugging production bugs, hotfixing security issues, and handling incidents. With quality gates you catch problems within minutes of a commit — fixing them costs a fraction of the effort compared to fixing them in production.
What coverage threshold should we aim for?
80% is a healthy goal for most projects. 100% is counterproductive — you end up testing getters and setters instead of business logic. What matters more than overall coverage is coverage of critical paths — business logic, error handling, edge cases. Measure coverage, but don't optimize just for the number.
How do you handle false positives from security scanning?
Every finding goes through triage. True positives → fix. False positives → suppress with documentation explaining why. Regular review of suppressions. The goal: zero alert fatigue — every security alert is actionable. Otherwise the team stops reading alerts.
Can we adopt quality gates gradually?
Yes, and we recommend it. Start: linting + type checking (day 1). Then unit tests. Then coverage thresholds. Then security scanning. Gradual rollout lets the team adapt while the pipeline stays fast.