
Chaos Engineering — Advanced Techniques

04. 09. 2025 1 min read intermediate

DevOps Expert


Chaos Engineering, Litmus, Chaos Mesh, Resilience · 6 min read

Advanced chaos engineering experiments with Litmus and Chaos Mesh, built around a steady-state hypothesis and a controlled blast radius.

Principles

  1. Define the steady state: what does normal behavior look like?
  2. Formulate a hypothesis about how the system will behave under failure
  3. Inject a controlled failure
  4. Observe whether the hypothesis was confirmed or disproven
  5. Fix the weaknesses you found
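
The steps above can be sketched as a small experiment runner. This is a minimal illustration, not a Litmus or Chaos Mesh API: `SteadyState`, `run_experiment`, and the injected callables are hypothetical names, and the error-rate metric is an assumed example of what "normal" could mean.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SteadyState:
    """Hypothetical definition of 'normal': error rate stays under a threshold."""
    max_error_rate: float

    def holds(self, error_rate: float) -> bool:
        return error_rate <= self.max_error_rate


def run_experiment(
    steady_state: SteadyState,
    measure_error_rate: Callable[[], float],
    inject_failure: Callable[[], None],
    rollback: Callable[[], None],
) -> str:
    """Run one experiment and return 'skipped', 'confirmed', or 'disproven'."""
    # Steps 1-2: the hypothesis is that the steady state holds during the failure.
    if not steady_state.holds(measure_error_rate()):
        return "skipped"  # system is already unhealthy, do not add chaos

    inject_failure()  # step 3: controlled failure
    try:
        # Step 4: observe the same metric while the failure is active.
        confirmed = steady_state.holds(measure_error_rate())
    finally:
        rollback()  # always remove the fault, even if measuring raised
    # Step 5 (fixing weaknesses) happens outside the runner, driven by this verdict.
    return "confirmed" if confirmed else "disproven"
```

Checking the steady state before injecting anything is a deliberate choice in this sketch: an experiment started on an already degraded system tells you little and adds risk.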

Litmus Chaos

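apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: pod-kill-test
spec:
  # Target application: the workload the experiment is allowed to touch
  appinfo:
    appns: production
    applabel: app=api-server
    appkind: deployment
  engineState: active   # set to "stop" to abort a running experiment
  experiments:
    - name: pod-delete
      spec:
        components:
          env:
            # Total chaos window in seconds
            - name: TOTAL_CHAOS_DURATION
              value: "60"
            # Seconds between successive pod deletions
            - name: CHAOS_INTERVAL
              value: "10"
        # Steady-state check: the health endpoint must keep returning 200
        probe:
          - name: check-api-health
            type: httpProbe
            httpProbe/inputs:
              url: http://api-server.production/health
              method:
                get:
                  criteria: ==
                  responseCode: "200"
            mode: Continuous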
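
To make the manifest's numbers concrete: chaos runs for 60 seconds with 10 seconds between pod deletions, while the probe keeps checking the health endpoint. The sketch below models that in plain Python and does not call Litmus. `chaos_rounds` and `continuous_probe_passed` are hypothetical helpers, the integer division is an assumption about how the interval divides the duration, and the probe model ignores retries and timeouts.

```python
from typing import Sequence


def chaos_rounds(total_duration_s: int, interval_s: int) -> int:
    """Approximate number of pod-delete rounds that fit into the chaos window."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return total_duration_s // interval_s


def continuous_probe_passed(status_codes: Sequence[int], expected: int = 200) -> bool:
    """Simplified model: the probe passes only if every sampled response matched."""
    return bool(status_codes) and all(code == expected for code in status_codes)


print(chaos_rounds(60, 10))                          # 6
print(continuous_probe_passed([200, 200, 503, 200]))  # False
```

A single failed sample fails the whole run in this model, which is the point of a continuous probe: the hypothesis is that the API stays healthy for the entire chaos window, not only at the end.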

Chaos Mesh

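apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay
spec:
  action: delay
  mode: all            # affects every matching pod; "one" or "fixed-percent" gives a smaller blast radius
  selector:
    namespaces: [production]
    labelSelectors:
      app: order-service
  delay:
    latency: "200ms"   # base delay added to network packets
    jitter: "50ms"     # variation around the base delay
  duration: "5m"       # the fault is removed automatically after five minutes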
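
Before injecting a 200 ms delay with 50 ms of jitter, it helps to estimate whether callers will still fit inside their timeouts. The sketch below is a rough model: it assumes the delay is at most latency plus jitter per call and ignores TCP handshakes and retries. The baseline latency, number of sequential calls, and client timeout are made-up example values, and both function names are hypothetical.

```python
def worst_case_latency_ms(baseline_ms: float, delay_ms: float,
                          jitter_ms: float, sequential_calls: int = 1) -> float:
    """Upper bound on request latency when each call gets delay + jitter added."""
    return sequential_calls * (baseline_ms + delay_ms + jitter_ms)


def fits_timeout(baseline_ms: float, delay_ms: float, jitter_ms: float,
                 timeout_ms: float, sequential_calls: int = 1) -> bool:
    """True if even the worst-case request stays under the caller's timeout."""
    worst = worst_case_latency_ms(baseline_ms, delay_ms, jitter_ms, sequential_calls)
    return worst < timeout_ms


# Example values (assumed): 40 ms baseline, 3 sequential calls, 1 s client timeout.
print(worst_case_latency_ms(40, 200, 50, sequential_calls=3))          # 870
print(fits_timeout(40, 200, 50, timeout_ms=1000, sequential_calls=3))  # True
```

If the estimate already exceeds the timeout, the experiment outcome is predictable, and a smaller delay or a narrower selector is usually a better first step.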

Experiment Types

  • Pod failure — kill/delete pods
  • Network — latency, packet loss, DNS failure
  • Resource stress — CPU, memory, disk I/O
  • Node drain — pod eviction
  • AZ failure — availability zone outage simulation

Summary

Chaos engineering reveals weaknesses before they cause a production incident. Start with simple experiments, escalate gradually, and always define abort criteria.
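
One way to picture escalation with abort criteria is a staged plan that widens the blast radius only while a guard metric stays healthy. The following is a sketch under assumptions: the stage percentages, the `error_rate_at` callback, and the 5% abort threshold are illustrative values, not settings from Litmus or Chaos Mesh.

```python
from typing import Callable, Sequence


def escalate(
    stages: Sequence[int],
    error_rate_at: Callable[[int], float],
    abort_threshold: float,
) -> tuple:
    """Walk through blast-radius stages (percent of pods affected).

    Returns (last_safe_stage, aborted): the largest stage that stayed under
    the abort threshold, and whether the plan was stopped early.
    """
    last_safe = 0
    for percent in stages:
        if error_rate_at(percent) > abort_threshold:
            return last_safe, True  # abort criterion hit: stop escalating
        last_safe = percent
    return last_safe, False


# Hypothetical observations: error rate grows with the share of pods affected.
observed = {10: 0.01, 25: 0.03, 50: 0.08, 100: 0.30}
print(escalate([10, 25, 50, 100], observed.__getitem__, abort_threshold=0.05))
# (25, True)
```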


CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience in enterprise IT.