
Modern Data Stack — Tool Overview for Modern Data Platforms

11. 11. 2023 | 1 min read | intermediate

The Modern Data Stack (MDS) is an ecosystem of cloud-native tools for building data platforms. This overview covers the key components and their alternatives, from ingestion through transformation to visualization.

Modern Data Stack Layers

1. Ingestion (EL)

  • Fivetran: SaaS, 300+ connectors
  • Airbyte: open-source alternative
  • Stitch: simple managed service, part of Talend
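
As a rough illustration of the EL pattern (extract and load, with transformation deferred to a later layer), the sketch below copies records unchanged into a raw table. It uses Python's built-in sqlite3 module as a stand-in for a cloud warehouse. The `extract_orders` function, the `raw_orders` table, and the sample rows are hypothetical and do not reflect the API of any of the connectors listed above.

```python
import json
import sqlite3


def extract_orders():
    """Hypothetical source: stands in for an API or database connector."""
    return [
        {"id": 1, "customer": "alice", "amount": "19.90"},
        {"id": 2, "customer": "bob", "amount": "5.00"},
    ]


def load_raw(conn, table, records):
    """Load records as-is (one JSON document per row); no transformation."""
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (payload TEXT)")
    conn.executemany(
        f"INSERT INTO {table} (payload) VALUES (?)",
        [(json.dumps(record),) for record in records],
    )
    conn.commit()
    return len(records)


conn = sqlite3.connect(":memory:")  # assumption: stand-in for a real warehouse
loaded = load_raw(conn, "raw_orders", extract_orders())
```

Keeping the payload untouched at this stage means that cleaning and typing (for example, casting `amount` to a number) can be handled later in the transformation layer.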

2. Storage & Compute

  • Snowflake: cloud data warehouse with separate compute and storage
  • BigQuery: serverless data warehouse on Google Cloud
  • Databricks: lakehouse platform built on Apache Spark

3. Transformation

  • dbt: SQL transformations managed as code
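
To make "SQL transformations as code" more concrete: a dbt model is essentially a SELECT statement that the tool materializes as a table or view in the warehouse. The sketch below imitates that idea with Python's sqlite3 module. It is not dbt's API; the `MODELS` dictionary, the `materialize` helper, and the table names are assumptions made for illustration.

```python
import sqlite3

# Hypothetical model: a named SELECT statement that would live in version control.
MODELS = {
    "orders_by_customer": """
        SELECT customer, SUM(amount) AS total_amount
        FROM raw_orders
        GROUP BY customer
    """,
}


def materialize(conn, name, select_sql):
    """Rebuild a table from a SELECT, roughly like a table materialization."""
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")


conn = sqlite3.connect(":memory:")  # assumption: stand-in for a real warehouse
conn.execute("CREATE TABLE raw_orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [("alice", 10.0), ("alice", 5.0), ("bob", 7.5)],
)

for model_name, model_sql in MODELS.items():
    materialize(conn, model_name, model_sql)

rows = conn.execute(
    "SELECT customer, total_amount FROM orders_by_customer ORDER BY customer"
).fetchall()
```

Because each model is plain text, it can be reviewed, versioned, and rebuilt from scratch at any time, which is the main appeal of this layer.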

4. Orchestration

  • Airflow / Dagster / Prefect
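
At their core, orchestrators run tasks in dependency order. The minimal sketch below shows only that idea, using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). It does not use the Airflow, Dagster, or Prefect APIs, and the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "ingest": set(),
    "transform": {"ingest"},
    "quality_checks": {"transform"},
    "refresh_dashboards": {"quality_checks"},
}


def run_pipeline(dag, tasks):
    """Run each task once, only after all of its upstream dependencies."""
    executed = []
    for name in TopologicalSorter(dag).static_order():
        tasks[name]()
        executed.append(name)
    return executed


# Placeholder task bodies that just record their own name when called.
log = []
tasks = {name: (lambda n=name: log.append(n)) for name in PIPELINE}
order = run_pipeline(PIPELINE, tasks)
```

Real orchestrators add scheduling, retries, and monitoring on top of this ordering, which is where the listed tools differ from one another.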

5. Data Quality

  • Great Expectations / Soda / Elementary
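
Data quality tools typically evaluate declarative rules such as "this column is never null" or "this column is unique". The sketch below expresses two such rules in plain Python to show the idea. It deliberately avoids the APIs of the tools listed above; the function names and sample rows are assumptions.

```python
def check_not_null(rows, column):
    """Return the rows where the column is missing or None."""
    return [row for row in rows if row.get(column) is None]


def check_unique(rows, column):
    """Return the values that appear more than once in the column."""
    seen, duplicates = set(), set()
    for row in rows:
        value = row.get(column)
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


# Hypothetical sample data with one null email and one duplicated id.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
]

null_failures = check_not_null(rows, "email")
duplicate_ids = check_unique(rows, "id")
```

In a pipeline, a non-empty result from either check would usually fail the run or raise an alert before the data reaches the BI layer.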

6. BI & Visualization

  • Metabase / Superset / Looker

Typical MDS stack:

  • Data flow: Fivetran → Snowflake → dbt → Metabase
  • Orchestration: Airflow
  • Quality: Great Expectations
  • Catalog: DataHub

Open-source alternative:

  • Data flow: Airbyte → DuckDB/Postgres → dbt → Superset
  • Orchestration: Dagster
  • Quality: Soda
  • Catalog: OpenMetadata
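
The two example stacks can also be written down as data, which makes it easy to see the layers where they differ. This is only a sketch: the `Stack` dataclass, its field names, and the `differences` helper are hypothetical, and the tool names are taken from the two stacks above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Stack:
    ingestion: str
    storage: str
    transformation: str
    bi: str
    orchestration: str
    quality: str
    catalog: str

    def data_flow(self):
        """Describe the main path that data takes through the stack."""
        return " → ".join(
            [self.ingestion, self.storage, self.transformation, self.bi]
        )


TYPICAL = Stack(
    "Fivetran", "Snowflake", "dbt", "Metabase",
    "Airflow", "Great Expectations", "DataHub",
)
OPEN_SOURCE = Stack(
    "Airbyte", "DuckDB/Postgres", "dbt", "Superset",
    "Dagster", "Soda", "OpenMetadata",
)


def differences(a, b):
    """Return the layers where two stacks use different tools."""
    other = asdict(b)
    return {
        layer: (tool, other[layer])
        for layer, tool in asdict(a).items()
        if tool != other[layer]
    }


diff = differences(TYPICAL, OPEN_SOURCE)
```

In this comparison only the transformation layer is shared: both stacks use dbt.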

Trends

  • Consolidation: fewer tools, each covering more functions
  • Open source: growing adoption of OSS alternatives
  • Lakehouse: lakehouse architectures replacing the traditional warehouse

Summary

Modern Data Stack is modular and cloud-native. Choose tools according to team size, budget and technical requirements.


CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience in enterprise IT.