# Edge Computing
Decisions in 10ms. Without the cloud.
We process data where it is generated — on the device, in the local network. The cloud gets results, not terabytes of raw data.
## Why edge computing
Not everything belongs in the cloud. When a sorting line needs a decision in 10ms, you cannot wait for a round-trip to Azure (50-100ms minimum). When a camera generates 25 fps at 4K resolution, you cannot send 15 GB/hour to the cloud. When a safety system must respond even during an internet outage, you cannot depend on cloud connectivity.
Edge computing processes data where it is generated. The cloud receives results — alarms, aggregated metrics, business events — not raw data. The result: lower latency, lower bandwidth costs, higher resilience, lower cloud bills.
## When edge, when cloud
| Criterion | Edge | Cloud |
|---|---|---|
| Latency < 50ms | ✅ | ❌ |
| High-bandwidth data (video, audio) | ✅ Processed locally | ❌ Too expensive to transfer |
| Offline resilience | ✅ | ❌ |
| Compliance (data locality) | ✅ Data stays on-site | ⚠️ Depends on region |
| Complex ML training | ❌ Limited compute | ✅ GPU clusters |
| Fleet-wide analytics | ❌ Isolated data | ✅ Centralised view |
The right answer is hybrid. Edge for real-time decision-making and pre-processing. Cloud for training, fleet analytics, long-term storage.
## Computer vision on the edge
### Hardware
NVIDIA Jetson line:
- Jetson Orin Nano (8GB): 40 TOPS, ideal for single-camera CV tasks. ~$200.
- Jetson Orin NX (16GB): 100 TOPS, multi-camera, more complex models.
- Jetson AGX Orin (64GB): 275 TOPS, enterprise workloads, multiple ML pipelines.
Alternatives:
- Google Coral TPU: USB or M.2 accelerator. 4 TOPS, ultra-low power. For simple classification/detection.
- Intel NCS2 / OpenVINO: x86 inference optimisation. Suitable when the edge node is Intel-based.
- Hailo-8: 26 TOPS, M.2 form factor. Competitive alternative to NVIDIA for dedicated inference.
### Model optimisation for edge
A cloud model (FP32, 200MB) cannot simply be moved to the edge. Optimisation pipeline:
- Quantisation: FP32 → FP16 → INT8. Up to 4× smaller model, 2-3× faster inference. Accuracy drop typically <1% with proper calibration.
- Pruning: Removal of neurons with low activation. 30-50% parameter reduction without measurable accuracy loss.
- Knowledge distillation: Large “teacher” model trains a small “student” model. Student achieves 90-95% of teacher accuracy with 10% of the parameters.
- TensorRT (NVIDIA): GPU-specific optimisations — layer fusion, kernel auto-tuning, dynamic tensor memory. 2-5× speedup compared to vanilla PyTorch.
- OpenVINO (Intel): Analogous optimisation for Intel CPU/GPU/VPU.
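The quantisation step above can be illustrated with a minimal numpy sketch of symmetric per-tensor INT8 quantisation. In practice this is done by TensorRT, OpenVINO or ONNX Runtime with a calibration dataset; the function names and sizes here are illustrative:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantisation: map FP32 values to [-127, 127]."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# FP32 = 4 bytes/param, INT8 = 1 byte/param -> 4x smaller model
w = np.random.randn(1000).astype(np.float32)
q, scale = quantize_int8(w)

# reconstruction error is bounded by half a quantisation step
err = np.abs(dequantize(q, scale) - w).max()
assert err <= scale / 2 + 1e-6
```

The accuracy impact comes from exactly this rounding error, which is why calibration (choosing scales per layer or per channel from representative data) matters.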
## Use cases
Quality inspection: Camera above the production line, model detects defects (scratches, deformations, missing parts). Inference under 30ms per frame = real-time QC at 30 fps. Reject signal to PLC in < 50ms.
OCR on labels: Reading serial numbers, lot codes, expiration dates. Structured data instead of manual transcription.
People counting / occupancy: Detection and tracking of persons in a space. Heatmaps, flow analysis, capacity management. GDPR-compliant — video is processed on the edge, only counts go to the cloud.
Safety zone monitoring: Detection of persons in dangerous zones around machinery. Alert to operator, signal to PLC to stop the machine. Latency matters — 50ms edge vs. 200ms+ cloud.
## Anomaly detection on the edge
Not every anomaly requires an ML model. Statistics often suffice — but they must run locally:
### Statistical methods
- Z-score: Current value vs. historical mean and standard deviation. Simple, interpretable. Alert when |z| > 3.
- Moving average + bands: Exponential moving average with Bollinger bands. Adaptive to seasonal patterns.
- Change point detection: CUSUM, PELT algorithms. Detection of permanent distribution shift (not a spike, but a shift).
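A rolling z-score detector of the kind described above fits in a few lines of plain numpy; the window size and threshold below are illustrative defaults, not tuned values:

```python
import numpy as np

def zscore_alerts(values, window=50, threshold=3.0):
    """Flag indices whose z-score against a rolling window exceeds the threshold."""
    alerts = []
    for i in range(window, len(values)):
        hist = values[i - window:i]
        mu, sigma = hist.mean(), hist.std()
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            alerts.append(i)
    return alerts

# steady sensor signal with one injected spike (~20 sigma)
rng = np.random.default_rng(0)
temps = rng.normal(20.0, 0.5, 200)
temps[150] += 10.0
assert 150 in zscore_alerts(temps)
```

Because the window statistics are recomputed locally, this runs comfortably on a Raspberry Pi-class device with no cloud dependency.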
### ML models on the edge
- Isolation Forest: Unsupervised, tree-based. Anomalies are “easily isolated” points. Fast, low memory.
- One-class SVM: Boundary around normal data. Everything outside the boundary = anomaly.
- Autoencoder: Neural network that compresses and reconstructs input. High reconstruction error = anomaly. Handles complex, multivariate patterns that statistics miss.
- Temporal models: LSTM/GRU for time-series anomalies. Predicts the next value, compares with actual. Large deviation = anomaly.
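As a sketch of the unsupervised approach, here is an Isolation Forest trained only on normal-operation readings using scikit-learn; the three "sensor channels" and the synthetic data are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# train on readings from normal operation only (3 sensor channels, synthetic)
rng = np.random.default_rng(1)
normal = rng.normal(0.0, 1.0, size=(500, 3))
model = IsolationForest(contamination=0.01, random_state=0).fit(normal)

# score new readings: predict() returns +1 for inliers, -1 for anomalies
new = np.array([[0.1, -0.2, 0.3],   # in-distribution
                [8.0, 8.0, 8.0]])   # far outside normal operation
labels = model.predict(new)
assert list(labels) == [1, -1]
```

The trained model is small enough to serialise and ship to an edge node, where `predict` runs in microseconds per reading.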
### Typical applications
- Vibration analysis: Accelerometer on a motor/bearing. FFT spectrum on the edge. Change in frequency profile = bearing degradation.
- Power consumption: Current and voltage. Anomalous consumption = mechanical problem, clogged filter, overload.
- Temperature drift: Slow temperature increase that would not be visible in raw data. Edge detects the trend days in advance.
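The vibration case can be sketched with a numpy FFT: a simulated bearing signal with a 50 Hz drive frequency and a weaker 120 Hz component standing in for an emerging fault frequency (all frequencies and amplitudes here are illustrative):

```python
import numpy as np

fs = 1000                                  # accelerometer sample rate (Hz)
t = np.arange(0, 1.0, 1 / fs)

# drive frequency at 50 Hz plus a weaker fault-frequency component at 120 Hz
signal = np.sin(2 * np.pi * 50 * t) + 0.4 * np.sin(2 * np.pi * 120 * t)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# dominant peak sits at the drive frequency
assert freqs[np.argmax(spectrum)] == 50.0

# a degradation check would track energy in a band around the fault frequency
fault_band = spectrum[(freqs >= 115) & (freqs <= 125)].sum()
assert fault_band > 0
```

An edge node would compute this spectrum periodically and alert when the fault-band energy trends upward relative to a baseline.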
## Edge orchestration
### K3s (Lightweight Kubernetes)
Kubernetes cluster on the edge — the same abstractions (pods, deployments, services) in a single binary with low resource requirements:
- Single node: K3s on one Jetson/RPi. Orchestrates multiple containers.
- Multi-node cluster: Multiple edge devices in one cluster. Service mesh for inter-node communication.
- GitOps deploy: Flux or ArgoCD. Git push → automatic rollout on edge cluster.
### Docker for simpler scenarios
Not every edge needs Kubernetes:
- Docker Compose for multi-container applications
- Watchtower for automatic image updates
- Portainer for GUI management
### Centralised management
- Azure IoT Edge: Runtime on edge devices, management from Azure. Module deployment, monitoring, remote troubleshooting.
- AWS Greengrass: Lambda functions on the edge, managed deployment, shadow sync.
- Custom: Ansible for configuration, Terraform for infrastructure, custom fleet dashboard.
## Offline resilience
The edge node must operate autonomously during connectivity outages:
- Local decision engine: All rules and models are local. Decisions do not depend on the cloud.
- Store-and-forward: Telemetry is stored locally (SQLite, RocksDB). After connectivity is restored, chronological dispatch.
- Local alerting: Alerts are delivered locally (siren, signal light, display) even without internet.
- Autonomous operation: Production line runs with edge QC even without cloud. Cloud adds fleet analytics but is not needed for operation.
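The store-and-forward pattern can be sketched with the Python standard library's `sqlite3`: buffer readings locally, then drain them in chronological order once connectivity returns, deleting a row only after it was sent successfully. The class and method names are illustrative:

```python
import json
import sqlite3
import time

class TelemetryBuffer:
    """Store-and-forward: buffer readings locally, drain in order when online."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS queue ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, ts REAL, payload TEXT)")

    def store(self, reading: dict):
        self.db.execute("INSERT INTO queue (ts, payload) VALUES (?, ?)",
                        (time.time(), json.dumps(reading)))
        self.db.commit()

    def drain(self, send) -> int:
        """Send buffered readings chronologically; delete each only on success."""
        rows = self.db.execute(
            "SELECT id, payload FROM queue ORDER BY id").fetchall()
        sent = 0
        for row_id, payload in rows:
            if not send(json.loads(payload)):
                break                   # still offline: keep the rest, retry later
            self.db.execute("DELETE FROM queue WHERE id = ?", (row_id,))
            sent += 1
        self.db.commit()
        return sent
```

Because failed sends leave rows in place, the node can lose connectivity for hours or days and still deliver a complete, ordered telemetry history afterwards.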
## Technology stack
Hardware: NVIDIA Jetson (Nano/Orin), Raspberry Pi 4/5, Google Coral, Intel NUC, industrial IPCs.
ML Runtime: TensorRT, OpenVINO, TensorFlow Lite, ONNX Runtime, PyTorch Mobile.
CV: OpenCV, GStreamer, DeepStream (NVIDIA), YOLO, MediaPipe.
Orchestration: K3s, Docker, Azure IoT Edge, AWS Greengrass, Balena.
Storage: SQLite, RocksDB, Redis, local InfluxDB.
Monitoring: Prometheus node exporter, Grafana Agent, custom health checks.
## Frequently asked questions
**What hardware do you use on the edge?** NVIDIA Jetson (Nano/Orin) for computer vision and ML inference. Raspberry Pi 4/5 for lighter workloads. Industrial ARM/x86 computers (Advantech, Kontron) for harsh environments. We choose based on performance, environment and certifications.
**How do you deploy updates to edge devices?** GitOps — same workflow as for cloud. Git push → CI/CD → container update on edge. K3s or Docker with Watchtower for orchestration. Rolling updates, health checks, automatic rollback.
**What happens when connectivity fails?** Nothing critical. The edge node operates fully autonomously. Local decision-making, store-and-forward for telemetry, local alerting. After connectivity is restored, sync with cloud. Critical decisions never depend on the cloud.
**How do you train and deploy ML models for the edge?** Training in the cloud on full data (GPU cluster). Optimisation for edge: quantisation (FP32→INT8), pruning, knowledge distillation. Export to TensorRT/OpenVINO/TFLite. Validation on edge hardware before deploy.