Our monolithic order system had an endpoint that synchronously called the inventory, payment, shipping, and notification services when creating an order. When one went down, everything went down. Event-driven architecture broke that chain.
Request-Response and Its Limits
Classic architecture: service A calls service B, waits for a response, then calls service C. It’s simple, straightforward, and works great until service B has 3-second latency or service C is completely unavailable.
In the request-response model, services are temporally coupled (both must be running simultaneously) and spatially coupled (the caller must know the callee’s address). The more services in the chain, the more fragile the system. One slow link slows the entire chain. One dead link stops it.
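Here is roughly what that chain looks like in code. This is a hypothetical sketch (invented service URLs, Python's requests library), but it makes the coupling visible: every call blocks the next one, and a single slow or dead dependency fails the entire order.

import requests

def create_order(order):
    # Each call blocks until the dependency answers; raise_for_status()
    # aborts the whole chain on the first failure.
    requests.post("http://inventory/reserve", json=order, timeout=2).raise_for_status()
    requests.post("http://payment/charge", json=order, timeout=2).raise_for_status()   # 3s latency here stalls everything
    requests.post("http://shipping/dispatch", json=order, timeout=2).raise_for_status()
    requests.post("http://notification/email", json=order, timeout=2).raise_for_status()  # notification down -> order fails too
    return {"status": "created"}  # succeeds only if all four services answered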
What Is Event-Driven Architecture
Instead of direct calls, services publish events — messages about what happened. “Order created.” “Payment received.” “Shipment dispatched.” Other services subscribe to events they care about and react to them asynchronously.
The order service publishes an OrderCreated event. The payment service catches it and processes the payment. The inventory service catches it and reserves stock. The notification service catches it and sends an email. Each independently, each at its own pace. When the notification service is down, the order still goes through — the email is sent once the service comes back up.
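On the publishing side this can be as small as the sketch below. It assumes a local Kafka broker and the confluent-kafka Python client, and it serializes the event as JSON to keep the example short; the Kafka section below defines the Avro schema you would actually want in production.

import json
import time
import uuid
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def publish_order_created(customer_id, items, total_amount):
    event = {
        "orderId": str(uuid.uuid4()),
        "customerId": customer_id,
        "items": items,
        "totalAmount": total_amount,
        "timestamp": int(time.time() * 1000),
    }
    # Key by orderId so every event for one order lands on the same partition
    # and is consumed in order.
    producer.produce("orders", key=event["orderId"], value=json.dumps(event).encode())
    producer.flush()

The order service's job ends at produce(); it never learns or cares which services consume the event.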
Apache Kafka as the Backbone
For event-driven architecture, you need a message broker — a place where events flow to and are consumed from. RabbitMQ is the traditional choice for message queuing. We chose Apache Kafka because it offers something more: a distributed, replicated log with retention.
# Kafka topic for orders
Topic: orders
Partitions: 12
Replication factor: 3
Retention: 7 days

# Event schema (Avro)
{
  "type": "record",
  "name": "OrderCreated",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "customerId", "type": "string"},
    {"name": "items", "type": {"type": "array", "items": "OrderItem"}},
    {"name": "totalAmount", "type": "double"},
    {"name": "timestamp", "type": "long"}
  ]
}
Kafka doesn’t delete messages after they’re read; it retains them for a configured period. A new consumer can read the history from the beginning. Need to add an analytics service that processes every order from the past week? Deploy it, point its offset at the beginning of the topic, and it catches up on its own. No custom backfill job, no database export.
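A sketch of such a consumer, again assuming the confluent-kafka Python client: a fresh consumer group with auto.offset.reset set to earliest starts at the oldest retained message, processes the history, and then keeps up with new events.

import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "order-analytics",    # new group -> no committed offsets yet
    "auto.offset.reset": "earliest",  # so start at the beginning of the log
})
consumer.subscribe(["orders"])

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        continue  # real code would log and handle broker errors
    order = json.loads(msg.value())
    print(order["orderId"], order["totalAmount"])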
Event Sourcing — The Truth Is in the Events
Traditional approach: you store the current state in a database. Event sourcing: you store the series of events that led to the current state. State is derived by replaying events. Like a ledger versus an account balance.
The advantages are significant. Complete audit trail — you know not just what the state is, but why. Temporal queries — “what was the order status yesterday at 3:00 PM?” Replay events up to that point. Debugging — reproduce a production bug by replaying the exact sequence of events.
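A toy illustration of that temporal query, with invented event types and timestamps: state is never stored, it is derived by folding the events up to the requested point in time.

# Toy event store: invented event types and timestamps.
EVENTS = [
    {"type": "OrderCreated",       "orderId": "o-1", "ts": 100},
    {"type": "PaymentReceived",    "orderId": "o-1", "ts": 200},
    {"type": "ShipmentDispatched", "orderId": "o-1", "ts": 300},
]

STATUS = {  # which status each event type moves the order into
    "OrderCreated": "created",
    "PaymentReceived": "paid",
    "ShipmentDispatched": "shipped",
}

def order_status(events, order_id, as_of):
    """Derive the order's status by replaying events up to a point in time."""
    status = None
    for event in events:
        if event["orderId"] == order_id and event["ts"] <= as_of:
            status = STATUS[event["type"]]
    return status

print(order_status(EVENTS, "o-1", as_of=250))  # -> paid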
The disadvantages are also significant. Complexity — it’s a different way of thinking, and many developers struggle with it. Event schema evolution — how do you change the event structure without breaking existing consumers? Eventual consistency — the read model may be temporarily inconsistent with the write model.
CQRS — Separate Reads and Writes
Command Query Responsibility Segregation — a separate model for writes (commands) and reads (queries). The write model accepts commands and generates events. The read model updates from events and is optimized for queries.
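A deliberately minimal in-memory sketch of the flow, with all names invented: a command handler appends an event to the log, and a projector folds events into a read model shaped for one specific query. In production the projection would run asynchronously, fed by the event stream.

event_log = []           # write side: append-only events
orders_by_customer = {}  # read side: denormalized for fast queries

def handle_create_order(command):
    # Write model: validate the command, then record what happened as an event.
    event = {
        "type": "OrderCreated",
        "orderId": command["orderId"],
        "customerId": command["customerId"],
        "totalAmount": command["totalAmount"],
    }
    event_log.append(event)
    project(event)  # here synchronous; in production async, e.g. via Kafka

def project(event):
    # Read model: update the query-optimized view from the event.
    if event["type"] == "OrderCreated":
        orders_by_customer.setdefault(event["customerId"], []).append(
            {"orderId": event["orderId"], "total": event["totalAmount"]})

handle_create_order({"orderId": "o-1", "customerId": "c-9", "totalAmount": 49.0})
print(orders_by_customer["c-9"])  # the query side reads the denormalized view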