
Performance Testing

We know where your system breaks.

Systematic performance testing — load tests, stress tests, performance budgets. We find bottlenecks before your users find them on Black Friday.

Response time p95: <500ms
Throughput: 10k rps
Bottleneck detection: 100%
Capacity plan: 6 months

Why performance testing

Your application works perfectly with 10 users. What happens with 10,000? With 50,000 during Black Friday? What if the marketing team launches a campaign and traffic spikes tenfold in an hour?

Performance issues are insidious. They don’t show up in the development environment. They don’t show up in code review. They don’t show up in unit tests. They show up in production, under real load, when it’s too late to fix anything.

Every extra second costs money. Amazon famously found that 100ms of added latency cut sales by 1%. Google's research shows that 53% of mobile visitors abandon a page that takes longer than 3 seconds to load. Netflix invests millions in performance testing, because the alternative is losing customers.

Performance testing isn’t a luxury. It’s insurance against the most expensive type of incident — a slow or unavailable application.

Types of performance tests

Each type answers a different question. A complete performance strategy combines all of them.

Load testing

Question: Can the system handle expected traffic?

A load test simulates real-world load — typical user count, typical operations, typical data volumes. We measure response time, throughput, error rate and resource utilization.

Example: An e-commerce platform expects 5,000 concurrent users during the daily peak. The load test simulates 5,000 virtual users performing a mix of operations: 60% browse, 25% search, 10% add to cart, 5% checkout. It runs for 30 minutes. Response time p95 under 500ms? ✅ Over 2s? 🔴 Time to find the bottleneck.
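The 60/25/10/5 mix above can be modeled with a weighted random pick. A minimal sketch (the weights mirror the example; the operation names are illustrative):

```javascript
// Weighted operation mix for a load-test scenario. Weights come from the
// example above; the operation names are illustrative placeholders.
const mix = [
  { op: 'browse',    weight: 60 },
  { op: 'search',    weight: 25 },
  { op: 'addToCart', weight: 10 },
  { op: 'checkout',  weight: 5 },
];

// Pick one operation according to its weight (weights need not sum to 100).
function pickOperation(mix, rand = Math.random()) {
  const total = mix.reduce((sum, m) => sum + m.weight, 0);
  let r = rand * total;
  for (const m of mix) {
    r -= m.weight;
    if (r < 0) return m.op;
  }
  return mix[mix.length - 1].op; // guard against floating-point edge cases
}
```

In a k6 script, each virtual-user iteration would call `pickOperation(mix)` and dispatch to the corresponding request.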

Stress testing

Question: Where is the limit and what happens when we exceed it?

A stress test gradually increases load beyond expected levels. 100% → 150% → 200% → 300% of normal traffic. We’re looking for the breaking point — where does the system start to degrade? Where does it fail? And crucially: how does it recover?

Graceful degradation is key. A system under extreme load should slow down, not crash. Rate limiting, queue backpressure, circuit breakers — the stress test verifies that these mechanisms work.
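One of the mechanisms mentioned, rate limiting, is commonly built on a token bucket. A minimal sketch (capacity and refill rate are illustrative):

```javascript
// Minimal token-bucket rate limiter: refills `ratePerSec` tokens per second
// up to `capacity`; a request is admitted only if a whole token is available.
class TokenBucket {
  constructor(capacity, ratePerSec, now = Date.now()) {
    this.capacity = capacity;
    this.ratePerSec = ratePerSec;
    this.tokens = capacity;
    this.last = now;
  }

  allow(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false; // shed the request instead of letting the system crash
  }
}
```

A stress test should show requests being rejected cleanly (HTTP 429, fast) rather than timing out once the bucket runs dry.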

Spike testing

Question: Can the system handle a sudden surge in traffic?

We simulate a sharp spike — from 500 to 5,000 users in 30 seconds. Typical scenarios: push notifications (everyone opens the app at once), viral content, flash sales, live sports results. Does auto-scaling react fast enough? Does the connection pool hold? Does the cache stay valid?

Soak testing (endurance)

Question: Can the system sustain constant load over a long period?

Runs for hours or days under normal load. Reveals memory leaks, connection pool exhaustion, disk space issues, log rotation problems, garbage collection degradation. Problems that don’t show up in 5 minutes but do in 5 hours.
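Soak-test results are usually judged by trend: under constant load, a heap that keeps climbing suggests a leak. A minimal sketch that fits a least-squares slope to evenly spaced memory samples (the growth threshold is illustrative):

```javascript
// Least-squares slope of evenly spaced samples (x = sample index).
// A persistently positive slope under constant load suggests a memory leak.
function slope(samples) {
  const n = samples.length;
  const xMean = (n - 1) / 2;
  const yMean = samples.reduce((s, y) => s + y, 0) / n;
  let num = 0;
  let den = 0;
  samples.forEach((y, x) => {
    num += (x - xMean) * (y - yMean);
    den += (x - xMean) ** 2;
  });
  return num / den;
}

// Flag a leak when memory grows faster than `maxGrowth` units per sample.
function looksLikeLeak(samples, maxGrowth = 0.5) {
  return slope(samples) > maxGrowth;
}
```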

Breakpoint testing

Question: What is the system’s maximum capacity?

We automatically increment load until SLAs are breached — response time exceeds the limit, error rate exceeds the threshold. The result: an exact capacity number. “The system handles 8,200 concurrent users at p95 < 500ms.” Capacity planning based on data, not guesswork.
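The increment-until-breach loop can be sketched as follows. `measureP95` is a hypothetical callback standing in for an actual short test run at a given load:

```javascript
// Breakpoint search: raise load step by step until the measured p95 exceeds
// the SLA, then report the last load level that still passed.
// `measureP95(users)` is a hypothetical callback that runs a short test at
// the given load and returns the observed p95 latency in milliseconds.
function findBreakpoint(measureP95, { start = 500, step = 500, slaMs = 500, maxUsers = 50000 } = {}) {
  let lastPassing = 0;
  for (let users = start; users <= maxUsers; users += step) {
    if (measureP95(users) > slaMs) return lastPassing;
    lastPassing = users;
  }
  return lastPassing; // SLA never breached within the tested range
}
```

With a mock whose p95 degrades with load, the loop produces exactly the "system handles N users at p95 < 500ms" kind of answer.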

k6 — our primary tool

k6 by Grafana Labs combines easy test writing in JavaScript with the performance of a Go runtime. Ideal for modern API-first architectures and CI/CD integration.

Why k6

JavaScript/TypeScript: You write tests in a language your team already knows. No DSL, no new language. Module imports, TypeScript types, IDE support.

Performance: The Go runtime handles thousands of virtual users on a single machine. 10,000 concurrent connections without issues. For higher loads, distributed mode via k6-operator on Kubernetes.

CI/CD native: A CLI tool whose exit code reflects test results. Declare thresholds in the script's options (e.g. http_req_duration: ['p(95)<500']) and k6 run exits non-zero when p95 exceeds 500ms, failing the build. Integrates with GitHub Actions, GitLab CI, any CI system.

Grafana integration: Results go straight to a Grafana dashboard. Real-time visualization during the test. Historical comparison — was this build faster or slower?

k6 test example

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 },   // ramp up
    { duration: '5m', target: 100 },   // steady state
    { duration: '2m', target: 0 },     // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, {
    'status 200': (r) => r.status === 200,
    'response < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}

Readable, version-controlled, runnable in CI. No GUI, no vendor lock-in.
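The p(95) aggregation that the threshold in the example relies on is just a percentile over per-request durations. A minimal sketch using the nearest-rank method (k6's own estimator may differ in detail):

```javascript
// Percentile by the nearest-rank method: sort durations ascending and take
// the value at rank ceil(p/100 * n). Tools like k6 may use interpolating
// estimators, so results can differ slightly on small samples.
function percentile(durations, p) {
  const sorted = [...durations].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}
```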

Gatling for enterprise scenarios

Gatling excels at complex scenarios with session handling, correlation and data-driven tests. Scala/Java/Kotlin DSL, strong in the enterprise Java ecosystem.

When Gatling: Complex workflows with authentication, CSRF tokens, session-based state. Enterprise systems with a Java backend where the team prefers the JVM ecosystem. Scenarios requiring sophisticated correlation and parameterization.

When k6: API testing, microservices, JavaScript/TypeScript teams, CI/CD integration, quick start. Most modern projects.

Performance budgets

A performance budget is a limit you must not exceed. Like a financial budget — but for speed.

Defining budgets

Response time budgets:
  • API endpoint: p95 < 200ms, p99 < 500ms
  • Page load (LCP): < 2.5s
  • Time to Interactive: < 3.5s
  • First Contentful Paint: < 1.5s

Resource budgets:
  • JavaScript bundle: < 200KB gzipped
  • Total page weight: < 1MB
  • Number of requests: < 50
  • Web font size: < 100KB

Infrastructure budgets:
  • CPU per pod: < 70% sustained
  • Memory per pod: < 80%
  • Database query time: < 50ms p95
  • Cache hit rate: > 90%
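Budget definitions like these can be encoded as data and checked mechanically. A minimal sketch (metric names and values are illustrative, not a real tool's schema):

```javascript
// Evaluate measured metrics against budgets. Each budget has a limit and a
// direction: 'max' (must stay below) or 'min' (must stay above, e.g. cache
// hit rate). Returns the names of violated budgets — empty array means pass.
const budgets = {
  api_p95_ms:     { limit: 200,  kind: 'max' },
  lcp_ms:         { limit: 2500, kind: 'max' },
  js_bundle_kb:   { limit: 200,  kind: 'max' },
  cache_hit_rate: { limit: 0.9,  kind: 'min' },
};

function checkBudgets(budgets, measured) {
  return Object.entries(budgets)
    .filter(([name, b]) => {
      const value = measured[name];
      return b.kind === 'max' ? value > b.limit : value < b.limit;
    })
    .map(([name]) => name);
}
```

A CI gate then simply fails the build whenever the returned list is non-empty.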

Enforcement

Performance budgets without enforcement are just wishes. We enforce them:

  • CI/CD gates: k6 thresholds, Lighthouse CI, webpack bundle analyzer — exceeding the budget = failed build
  • Monitoring alerts: Prometheus alerts when production budgets are exceeded
  • Review process: Performance impact assessment for architectural changes

Bottleneck analysis

A performance test reveals the symptom (slow response). Bottleneck analysis finds the cause.

Common bottlenecks

Database: N+1 queries, missing indexes, full table scans, lock contention, connection pool exhaustion. Solutions: query optimization, indexing, connection pooling, read replicas, caching.
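The N+1 pattern is easiest to see side by side. A minimal sketch against a mocked data layer, where a query counter stands in for real database round trips:

```javascript
// Mocked data layer that counts "round trips" to illustrate the N+1 problem.
const authorsById = { 1: 'Ada', 2: 'Linus' };
let queries = 0;
const db = {
  getAuthor:  (id)  => { queries += 1; return authorsById[id]; },
  getAuthors: (ids) => { queries += 1; return ids.map((id) => authorsById[id]); },
};

const posts = [{ authorId: 1 }, { authorId: 2 }, { authorId: 1 }];

// N+1: one query per post — N round trips for N posts.
function namesNPlusOne(posts) {
  return posts.map((p) => db.getAuthor(p.authorId));
}

// Batched: a single IN (...)-style query fetches all authors at once.
function namesBatched(posts) {
  return db.getAuthors(posts.map((p) => p.authorId));
}
```

Under load, the difference between N round trips and 1 per request is often exactly the bottleneck a load test surfaces.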

Memory: Memory leaks, GC pressure, oversized caches, unbounded buffers. Solutions: profiling, heap dumps, memory-efficient data structures.

Network: DNS resolution, TLS handshake, chatty microservices, large payloads. Solutions: connection pooling, gRPC, payload compression, CDN.

Concurrency: Thread pool exhaustion, async bottlenecks, lock contention, event loop blocking. Solutions: async/await, non-blocking I/O, lock-free structures.

How we work together

  1. Baseline — we measure current performance and identify critical flows
  2. Test scenarios — we design load profiles based on production data
  3. Execution — we run a battery of tests (load, stress, spike, soak)
  4. Analysis — we identify bottlenecks and propose optimizations
  5. Optimization — we implement fixes, re-test, compare
  6. CI/CD integration — performance tests as part of the pipeline, budgets as gates
  7. Capacity plan — how many users you can handle, when you need to scale

Stack

Load testing: k6, Gatling, Artillery, Locust, JMeter (legacy).

Frontend performance: Lighthouse, WebPageTest, Core Web Vitals.

Profiling: async-profiler (Java), py-spy (Python), pprof (Go), Chrome DevTools.

Monitoring: Prometheus + Grafana for real-time visualization during tests.

CI/CD: GitHub Actions, GitLab CI — performance gates in the pipeline.

Frequently asked questions

When do we need performance testing?

Whenever speed matters — e-commerce, fintech, SaaS, API platforms. Specifically: before major traffic spikes (campaigns, seasonal peaks), after architectural changes, before infrastructure migrations, when onboarding a large customer.

What is the difference between a load test and a stress test?

A load test verifies that the system can handle expected traffic (e.g. 1,000 concurrent users). A stress test finds where the system breaks — we gradually increase the load until it fails. A load test says "we can handle it". A stress test says "here's the limit and here's what happens when we exceed it".

How long does it take and what does it cost?

A one-off performance audit: 2-3 weeks. Continuous performance testing in CI/CD: setup in 1-2 weeks, then minimal maintenance. Cost depends on system complexity. ROI: widely cited industry studies put the cost of 1 second of extra load time at roughly 7% of conversions.

Should we use k6 or Gatling?

k6 for most projects — JavaScript/TypeScript, easy CI/CD integration, excellent for API testing. Gatling for Java/Scala ecosystems and scenarios requiring complex correlation and session handling. The choice depends on your stack and team.

Can we test against production?

Yes, with care. Smoke tests (low load) against production are common and safe. Full load tests run against a staging environment identical to production, or against production during off-peak hours with feature flags for isolation.

Do you have a project?

Let's talk about it.

Schedule a meeting