_CORE
AI & Agentic Systems Core Information Systems Cloud & Platform Engineering Data Platform & Integration Security & Compliance QA, Testing & Observability IoT, Automation & Robotics Mobile & Digital Banking & Finance Insurance Public Administration Defense & Security Healthcare Energy & Utilities Telco & Media Manufacturing Logistics & E-commerce Retail & Loyalty
References Technologies Blog Know-how Tools
About Collaboration Careers
CS EN
Let's talk

AI Testing — How to Test Non-Deterministic Software

02. 04. 2025 1 min read CORE SYSTEMSai
AI Testing — How to Test Non-Deterministic Software

assert response == expected — doesn’t work with LLMs. The answer is different every time. We need a new testing paradigm.

New Approaches

Property-based testing: Test properties, not exact output. Metamorphic testing: A small change in input must not change the facts. LLM-as-judge: GPT-4 evaluates based on a rubric.

Evaluation Pipeline

  • Golden dataset: 100+ pairs
  • Automatic run on every PR
  • Metrics: faithfulness, relevance, toxicity
  • Regression detection: alert on >5% drop

Red Teaming

Automated adversarial testing: prompt injection, jailbreak, PII leakage. In CI, not as a one-off.

AI Testing Is Software Testing 2.0

Property-based tests + LLM-as-judge + evaluation pipeline = production-ready.

ai testingqualitytestingautomation
Share:

CORE SYSTEMS

Stavíme core systémy a AI agenty, které drží provoz. 15 let zkušeností s enterprise IT.

Need help with implementation?

Our experts can help with design, implementation, and operations. From architecture to production.

Contact us