
Telemetry & Streaming

Data from sensor to backend in under 100ms.

We build telemetry pipelines that deliver data reliably, quickly and in guaranteed order — from a single sensor to thousands of devices.

<100ms LAN
End-to-end latency
100K msg/s
Throughput
0%
Data loss
99.99%
Uptime

Telemetry as the Bloodstream of IoT Systems

Sensors without a reliable data pipeline are expensive hardware. They measure, but the data never arrives. Or it arrives late. Or it gets lost. Or it arrives out of order and anomaly detection generates false alarms.

We build telemetry chains where every message arrives where it should, when it should, and in the correct order. From sensor through MQTT broker to Kafka, through stream processing to a time-series database and dashboard. Every step monitored, scalable, resilient.

MQTT as Transport

Why MQTT

MQTT (originally MQ Telemetry Transport) is the de facto standard for IoT communication. Designed in 1999 for monitoring oil pipelines — reliable data transfer over unreliable satellite connections. Today it runs on billions of devices.

Key properties for IoT:

  • Lightweight: 2-byte minimum header. A message carrying a 100 B payload totals roughly 102 B on the wire (plus the topic name). An HTTP request for the same data: 500 B+.
  • Pub/sub model: Device publishes to a topic, backend subscribes. Decoupling — the device doesn’t know (and doesn’t need to know) who reads the data.
  • Persistent sessions: Broker remembers subscriptions and undelivered messages for offline devices. After reconnecting, it delivers everything that was missed.
  • Last Will and Testament: When connecting, a device defines its “last will” — a message the broker sends if the device unexpectedly disappears. Immediate offline device detection.
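The Last Will and Testament mechanism can be sketched in plain Python — this is a toy broker simulation to show the semantics, not a real MQTT API (class and method names are illustrative):

```python
# The will is registered at CONNECT time; a clean DISCONNECT discards it,
# an abnormal drop makes the broker publish it. Simulated, no real broker.

class Broker:
    def __init__(self):
        self.wills = {}          # client_id -> (topic, payload)
        self.published = []      # log of (topic, payload) the broker sent

    def connect(self, client_id, will_topic, will_payload):
        self.wills[client_id] = (will_topic, will_payload)

    def disconnect(self, client_id, graceful):
        will = self.wills.pop(client_id, None)
        if will and not graceful:
            self.published.append(will)

broker = Broker()
broker.connect("sensor-42", "devices/sensor-42/status", "offline")
broker.disconnect("sensor-42", graceful=False)   # connection lost
print(broker.published)  # [('devices/sensor-42/status', 'offline')]
```

Subscribers watching the status topic learn about the outage immediately, without polling.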

Quality of Service

Three levels of delivery guarantee:

  • QoS 0 (At most once): Fire-and-forget. No acknowledgement. For data where occasional loss is acceptable (high-frequency telemetry where a missing sample can be interpolated).
  • QoS 1 (At least once): Delivery confirmation. Possible duplicates. For most telemetry — the backend must be idempotent.
  • QoS 2 (Exactly once): Four-step handshake. No losses, no duplicates. For critical data (transactions, alarms, commands). Higher overhead — used only where necessary.
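QoS 1 means the backend may see the same message twice, so ingestion must be idempotent. A minimal sketch, deduplicating on a (device_id, message_id) key — field names are illustrative:

```python
# QoS 1 can redeliver; dedupe on a stable per-message key so a
# redelivered message is acknowledged but not stored twice.

seen = set()
stored = []

def handle(msg):
    key = (msg["device_id"], msg["message_id"])
    if key in seen:            # duplicate redelivery -> ignore
        return False
    seen.add(key)
    stored.append(msg)
    return True

m = {"device_id": "s1", "message_id": 7, "temp_c": 21.4}
handle(m)
handle(m)                      # broker redelivers the same message
print(len(stored))             # 1
```

In production the seen-set would live in a bounded store (e.g. keyed by device with a sliding window), not an unbounded in-memory set.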

MQTT 5.0 Features

  • Shared subscriptions: Load balancing across multiple consumers. Topic $share/group/sensors/+/temperature — messages distributed round-robin among subscribers.
  • Topic aliases: Shortening repeating topic names. Reduces bandwidth by 10-30%.
  • Flow control: Receiver can say “slow down” — protection against overwhelming a slow consumer.
  • Request/response: Native request/response pattern via correlation data and response topic.
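The shared-subscription load balancing above can be illustrated with a round-robin dispatch loop — a simulation of what the broker does for a $share group, not broker code:

```python
# MQTT 5 shared subscriptions: messages arriving on the shared topic are
# distributed round-robin across the group's subscribers.
from itertools import cycle

subscribers = {"worker-a": [], "worker-b": [], "worker-c": []}
rr = cycle(subscribers)        # round-robin over group members

for i in range(6):             # six incoming messages
    subscribers[next(rr)].append(f"reading-{i}")

print([len(q) for q in subscribers.values()])  # [2, 2, 2]
```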

MQTT Broker

EMQX: Distributed, cluster-capable. Millions of concurrent connections. Rule engine for routing and transformation. Bridge to Kafka, HTTP, databases.

Mosquitto: Lightweight, single-node. Ideal for edge and smaller deployments. C implementation, minimal resources.

Azure IoT Hub / AWS IoT Core: Managed MQTT broker with integrated device management. Vendor lock-in, but zero operations.

Kafka for Stream Processing

MQTT delivers data to the broker. Kafka is the central nervous system — event store, stream processor, integration hub.

Why Kafka After MQTT

An MQTT broker is transport — it delivers messages to subscribers and (typically) forgets. Kafka is a durable event log — messages stored on disk, retention from days to years. Replay at any time. A new consumer can process historical data from the beginning.

Architecture pattern:

Device → MQTT Broker → Kafka Connect/Bridge → Kafka → Consumers
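The difference between transport and durable log can be shown with a toy append-only log and per-consumer-group offsets — the idea behind Kafka, not its API:

```python
# The broker forgets; the log remembers. Each consumer group tracks its
# own offset, so a new consumer can replay history from offset 0.

log = []                                  # durable, ordered event log

def produce(event):
    log.append(event)

offsets = {}                              # consumer group -> next offset

def poll(group, max_records=10):
    start = offsets.get(group, 0)
    batch = log[start:start + max_records]
    offsets[group] = start + len(batch)
    return batch

for t in [20.1, 20.3, 20.2]:
    produce({"temp_c": t})

poll("alerting")                          # existing consumer reads all three
late = poll("analytics")                  # new consumer: full replay from 0
print(len(late))                          # 3
```

This is why a new analytics or ML consumer added months later can still process every historical event.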

Stream Processing

Kafka Streams or Apache Flink for real-time processing:

  • Aggregation: Average temperature over the last 5 minutes, maximum vibration over an hour
  • Windowing: Tumbling, hopping, sliding windows. Session windows for activity detection.
  • Anomaly detection: Z-score, moving average, isolation forest. Alert when value deviates from baseline.
  • Enrichment: Adding context — device metadata, location, customer. Join stream with reference data.
  • Filtering: Noise filtering, deduplication, format validation.
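Two of the operations above — tumbling-window aggregation and z-score anomaly detection — sketched without a stream framework (thresholds and sample data are illustrative):

```python
# Tumbling windows: fixed, non-overlapping time buckets.
# Z-score: flag a value that deviates too far from the window baseline.
from statistics import mean, pstdev

def tumbling_windows(samples, size):
    windows = {}
    for ts, value in samples:              # (timestamp_s, value) pairs
        windows.setdefault(ts // size, []).append(value)
    return windows

def is_anomaly(value, baseline, threshold=3.0):
    sd = pstdev(baseline)
    return sd > 0 and abs(value - mean(baseline)) / sd > threshold

samples = [(0, 20.0), (60, 20.2), (120, 19.9), (310, 20.1)]
w = tumbling_windows(samples, size=300)    # 5-minute windows
print(sorted(w))                           # [0, 1]
print(is_anomaly(35.0, w[0]))              # True
```

Kafka Streams and Flink provide the same windowing semantics with state stores, event-time handling, and fault tolerance built in.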

Consumer Groups

Parallel processing via consumer groups:

  • Alerting consumer: Real-time rule evaluation, push notification on threshold breach
  • Storage consumer: Writing to time-series DB for historical queries
  • Analytics consumer: Aggregation for dashboards and reports
  • ML consumer: Feature store for ML models, training data pipeline

Each consumer independently scalable. Adding a new consumer without impact on existing ones.

Data Pipeline Architecture

From Sensor to Dashboard

Sensor → [Local buffer] → MQTT (QoS 1) → MQTT Broker
  → Kafka Connect → Kafka Topic (raw)
    → Stream Processing (validation, enrichment)
      → Kafka Topic (processed)
        → InfluxDB/TimescaleDB (storage)
        → Grafana (visualization)
        → Alert Engine (rules)
        → ML Pipeline (features)

Every step is independently scalable, monitored and recoverable.

Dead Letter Queue

Messages that fail validation (wrong format, unknown device, out of range) are not lost. They go to a dead letter queue for analysis:

  • Automatic notification on DLQ growth
  • Dashboard with top error reasons
  • Manual review and reprocessing
  • Root cause analysis — bad firmware? Bug in device code?
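The DLQ routing step can be sketched as a validate-then-route function — device registry, field names, and range limits are illustrative:

```python
# Failed messages are not dropped: they land in the DLQ together with
# the failure reason, ready for the dashboards and review listed above.
KNOWN_DEVICES = {"s1", "s2"}

processed, dlq = [], []

def validate(msg):
    if msg.get("device_id") not in KNOWN_DEVICES:
        return "unknown_device"
    if not -40.0 <= msg.get("temp_c", float("nan")) <= 125.0:
        return "out_of_range"
    return None                            # valid

def ingest(msg):
    reason = validate(msg)
    if reason:
        dlq.append({"reason": reason, "message": msg})
    else:
        processed.append(msg)

ingest({"device_id": "s1", "temp_c": 21.0})
ingest({"device_id": "ghost", "temp_c": 21.0})
ingest({"device_id": "s2", "temp_c": 900.0})
print(len(processed), len(dlq))            # 1 2
```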

Compression and Efficiency

  • Protocol Buffers: Binary serialization, 3-10× smaller than JSON. Schema evolution with backward/forward compatibility. Ideal for high-throughput telemetry.
  • MessagePack: JSON-compatible binary format. Easier adoption than Protobuf, still 30-50% smaller than JSON.
  • Device-side batching: Local buffer, sending after N messages or T seconds. Reduces connection overhead.
  • Adaptive sampling: Normal operation: 1 sample/min. Anomaly detected: 10 samples/s. Dynamic granularity based on need.
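Device-side batching — flush after N messages or T seconds, whichever comes first — sketched with an injected clock so the timing logic is testable without waiting (class name and limits are illustrative):

```python
# Buffer locally, flush on count or on age. The clock is a callable so
# tests (and embedded ports) can control time.

class Batcher:
    def __init__(self, max_n, max_s, clock, send):
        self.max_n, self.max_s = max_n, max_s
        self.clock, self.send = clock, send
        self.buf, self.started = [], None

    def add(self, sample):
        if not self.buf:
            self.started = self.clock()    # first sample starts the timer
        self.buf.append(sample)
        if len(self.buf) >= self.max_n or self.clock() - self.started >= self.max_s:
            self.flush()

    def flush(self):
        if self.buf:
            self.send(self.buf)            # one connection, many samples
            self.buf = []

now = [0.0]
sent = []
b = Batcher(max_n=3, max_s=10.0, clock=lambda: now[0], send=sent.append)

b.add(1); b.add(2); b.add(3)               # flushes on count
b.add(4)
now[0] = 12.0
b.add(5)                                   # batch is older than 10 s: flushes on age
print(sent)                                # [[1, 2, 3], [4, 5]]
```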

Time-Series Storage

InfluxDB

Native time-series database. Optimized for write-heavy workloads:

  • Ingest: hundreds of thousands of points per second
  • Flux query language for transformations and analysis
  • Retention policies: automatic expiration of old data
  • Continuous queries for pre-aggregation

TimescaleDB

PostgreSQL extension for time-series:

  • Full SQL — JOINs with relational data (device metadata, customers)
  • Hypertables for automatic partitioning
  • Compression: 90-95% space savings for older data
  • Continuous aggregates for materialized views

Choice

InfluxDB for purely telemetric workloads (simpler operations, native time-series UX). TimescaleDB when you need SQL, JOINs, or already have a PostgreSQL stack.

Technology Stack

Transport: MQTT 5.0, AMQP, CoAP, HTTP/2.

Brokers: EMQX, Mosquitto, Azure IoT Hub, AWS IoT Core, HiveMQ.

Streaming: Apache Kafka, Kafka Streams, Apache Flink, Redpanda.

Storage: InfluxDB, TimescaleDB, QuestDB, Apache Parquet (archive).

Serialization: Protocol Buffers, MessagePack, Avro, JSON.

Visualization: Grafana, custom dashboards.

Frequently Asked Questions

Why MQTT and not HTTP?

MQTT is designed for unreliable networks and resource-constrained devices. Persistent sessions, QoS guarantees, minimal overhead (2-byte header vs. hundreds of bytes with HTTP). For IoT telemetry, MQTT is the standard.

What throughput can the pipeline handle?

It depends on the infrastructure. EMQX cluster: millions of concurrent connections, hundreds of thousands of messages per second. Kafka: millions of messages per second per cluster. Horizontally scalable.

What happens to data during a connectivity outage?

MQTT QoS 1/2 guarantees delivery. Devices store data locally during outages (store-and-forward). After connectivity is restored, data is sent in order. Dead letter queue for undeliverable messages.

Which time-series database should we choose?

InfluxDB for smaller deployments (simpler operations). TimescaleDB for larger ones (PostgreSQL compatibility, SQL queries, JOINs with relational data). QuestDB for ultra-low latency ingest.

Do you have a project?

Let's talk about it.

Schedule a meeting