The Modern Data Stack (MDS) is an ecosystem of cloud-native tools for building data platforms. This overview walks through its key components and their alternatives, from ingestion through transformation to visualization.
Modern Data Stack Layers
1. Ingestion (EL)
- Fivetran: SaaS, 300+ connectors
- Airbyte: open-source alternative
- Stitch: simple SaaS, part of Talend
2. Storage & Compute
- Snowflake: cloud data warehouse with separate storage and compute
- BigQuery: serverless warehouse from Google
- Databricks: lakehouse platform built on Apache Spark
3. Transformation
- dbt: SQL transformations as code
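To make "SQL transformations as code" concrete, here is a minimal sketch that does not use dbt itself. It assumes an in-memory SQLite database as a stand-in for the warehouse, and the table `raw_orders`, the model `customer_revenue`, and the sample rows are all hypothetical. The point is only that the transformation is a plain SQL statement that can be versioned, reviewed, and rebuilt on demand.

```python
import sqlite3

# Hypothetical raw rows, as an EL tool might have loaded them.
RAW_ORDERS = [
    (1, "alice", 1250, "completed"),
    (2, "bob", 300, "cancelled"),
    (3, "alice", 700, "completed"),
]

# The transformation is plain SQL kept in source control (the "model").
CUSTOMER_REVENUE_SQL = """
CREATE TABLE customer_revenue AS
SELECT customer, SUM(amount_cents) / 100.0 AS revenue
FROM raw_orders
WHERE status = 'completed'
GROUP BY customer
"""


def build_customer_revenue(conn):
    """Rebuild the model inside the database and return its rows."""
    conn.execute("DROP TABLE IF EXISTS customer_revenue")
    conn.execute(CUSTOMER_REVENUE_SQL)
    query = "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
    return conn.execute(query).fetchall()


def demo():
    """Load the raw rows into an in-memory database and build the model."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_orders "
        "(id INTEGER, customer TEXT, amount_cents INTEGER, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", RAW_ORDERS)
    return build_customer_revenue(conn)


if __name__ == "__main__":
    print(demo())
```

Because the model is dropped and recreated on every run, rebuilding it is idempotent, which is what makes a transformation safe to rerun from a scheduler.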
4. Orchestration
- Airflow / Dagster / Prefect
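What these orchestrators share is the idea of running tasks in dependency order. The sketch below illustrates that idea with Python's standard-library `graphlib`; it is not the API of Airflow, Dagster, or Prefect, and the task names in `PIPELINE` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "ingest": set(),
    "transform": {"ingest"},
    "quality_checks": {"transform"},
    "refresh_dashboards": {"quality_checks"},
}


def run_pipeline(dag, tasks):
    """Run every task after its dependencies and return the execution order."""
    order = []
    for name in TopologicalSorter(dag).static_order():
        tasks[name]()
        order.append(name)
    return order


def demo():
    """Run the pipeline with stub tasks that only record that they ran."""
    log = []
    tasks = {name: (lambda n=name: log.append(f"ran {n}")) for name in PIPELINE}
    return run_pipeline(PIPELINE, tasks), log


if __name__ == "__main__":
    order, log = demo()
    print(order)
```

Real orchestrators add scheduling, retries, logging, and parallel execution on top of this ordering step.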
5. Data Quality
- Great Expectations / Soda / Elementary
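Tools in this layer typically express expectations about the data, such as "this column is never null" or "this key is unique", and report the rows that violate them. The following is a rough sketch of that idea in plain Python. It is not the API of Great Expectations, Soda, or Elementary, and the function names, check names, and sample rows are assumptions made for illustration.

```python
def check_not_null(rows, column):
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the sorted values of `column` that appear more than once."""
    seen, duplicates = set(), set()
    for row in rows:
        value = row.get(column)
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


def run_checks(rows):
    """Run both checks and return only the ones that failed."""
    failures = {}
    null_rows = check_not_null(rows, "email")
    if null_rows:
        failures["email_not_null"] = null_rows
    duplicate_ids = check_unique(rows, "id")
    if duplicate_ids:
        failures["id_unique"] = duplicate_ids
    return failures


# Hypothetical sample data with one null email and one duplicated id.
ROWS = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
]

if __name__ == "__main__":
    print(run_checks(ROWS))
```

In a pipeline, a non-empty result would usually fail the run or raise an alert before downstream dashboards refresh.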
6. BI & Visualization
- Metabase / Superset / Looker
# Typical MDS stack:
# Fivetran → Snowflake → dbt → Metabase
# Airflow (orchestration)
# Great Expectations (quality)
# DataHub (catalog)
# Open-source alternative:
# Airbyte → DuckDB/Postgres → dbt → Superset
# Dagster (orchestration)
# Soda (quality)
# OpenMetadata (catalog)
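One way to compare the two example stacks above is to write each as a mapping from layer to tool. The tool names come from the listing; the layer keys (`ingestion`, `storage`, and so on) and the helper functions are labels chosen for this sketch.

```python
# Layer -> tool, copied from the two example stacks above.
TYPICAL_STACK = {
    "ingestion": "Fivetran",
    "storage": "Snowflake",
    "transformation": "dbt",
    "bi": "Metabase",
    "orchestration": "Airflow",
    "quality": "Great Expectations",
    "catalog": "DataHub",
}

OPEN_SOURCE_STACK = {
    "ingestion": "Airbyte",
    "storage": "DuckDB/Postgres",
    "transformation": "dbt",
    "bi": "Superset",
    "orchestration": "Dagster",
    "quality": "Soda",
    "catalog": "OpenMetadata",
}


def shared_layers(a, b):
    """Return the layers where both stacks use the same tool."""
    return sorted(layer for layer in a if a[layer] == b.get(layer))


def differing_layers(a, b):
    """Return {layer: (tool_in_a, tool_in_b)} for layers that differ."""
    return {
        layer: (a[layer], b[layer])
        for layer in a
        if layer in b and a[layer] != b[layer]
    }


if __name__ == "__main__":
    print(shared_layers(TYPICAL_STACK, OPEN_SOURCE_STACK))
    print(differing_layers(TYPICAL_STACK, OPEN_SOURCE_STACK))
```

In these two examples only the transformation layer is shared: dbt appears in both stacks, while every other layer is swapped for a different tool.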
Trends
- Consolidation: fewer tools, each covering more functions
- Open source: growing adoption of OSS alternatives
- Lakehouse: replacing the warehouse with a lakehouse approach
Summary
The Modern Data Stack is modular and cloud-native. Choose tools according to team size, budget, and technical requirements.