Medallion architecture organizes data into three layers: Bronze (raw), Silver (cleaned), and Gold (business-ready). Each layer builds on the one before it, so data quality and business value increase from Bronze to Gold.
Three Data Layers
Bronze: Raw Data
- Append-only — never delete, never modify
- 1:1 copy of the source system
- Metadata: ingestion timestamp, source system
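The Bronze rules above can be sketched in a few lines of Python. This is a minimal illustration using the standard-library `sqlite3` module, not a production ingestion job; the table name `bronze_orders`, its columns, and the helper `ingest_bronze` are assumptions made for the example. Source values are stored as text exactly as delivered, and the only additions are the two metadata columns.

```python
import sqlite3
from datetime import datetime, timezone


def create_bronze(conn):
    # Hypothetical Bronze table: source columns kept as raw text,
    # plus the two metadata columns named in the list above.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS bronze_orders (
               order_id TEXT, customer_id TEXT, order_date TEXT, total TEXT,
               ingested_at TEXT NOT NULL, source_system TEXT NOT NULL)"""
    )


def ingest_bronze(conn, rows, source_system):
    """Append source rows unchanged, adding only ingestion metadata."""
    ingested_at = datetime.now(timezone.utc).isoformat()
    # INSERT only: existing rows are never updated or deleted.
    conn.executemany(
        "INSERT INTO bronze_orders VALUES (?, ?, ?, ?, ?, ?)",
        [(*row, ingested_at, source_system) for row in rows],
    )
    conn.commit()
    return ingested_at


conn = sqlite3.connect(":memory:")
create_bronze(conn)
order = ("1", "c1", "2024-01-01", "100.50")
ingest_bronze(conn, [order], "erp")
ingest_bronze(conn, [order], "erp")  # a re-delivery is appended, not merged
bronze_count = conn.execute("SELECT COUNT(*) FROM bronze_orders").fetchone()[0]
sources = {r[0] for r in conn.execute("SELECT source_system FROM bronze_orders")}
```

Keeping the duplicate delivery in Bronze is deliberate in this sketch: deduplication is a Silver concern, and the raw history stays available for reprocessing.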
Silver: Cleaned Data
- Deduplication and cleansing
- Type conversion and normalization
- Validation — quality checks
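As a rough sketch of the three Silver steps above, the following Python function validates, converts, and deduplicates a batch of raw order records. The function name `to_silver`, the dictionary field names, and the choice to keep the first occurrence of each `order_id` are assumptions for illustration; a real pipeline may prefer the most recent record or send rejected rows to a quarantine table.

```python
from decimal import Decimal, InvalidOperation


def to_silver(bronze_rows):
    """Return (clean_rows, rejected_rows) from a list of raw order dicts."""
    clean, rejected, seen = [], [], set()
    for row in bronze_rows:
        # Normalization: keys are assumed to arrive as strings, possibly padded.
        order_id = (row.get("order_id") or "").strip()
        # Validation: a row without a key cannot be deduplicated or joined.
        if not order_id:
            rejected.append(row)
            continue
        # Type conversion: raw totals are text; convert to a 2-decimal amount.
        try:
            total = Decimal(row["total"].strip()).quantize(Decimal("0.01"))
        except (InvalidOperation, AttributeError, KeyError):
            rejected.append(row)
            continue
        # Deduplication: keep the first occurrence of each order_id.
        if order_id in seen:
            continue
        seen.add(order_id)
        clean.append(
            {
                "order_id": order_id,
                "customer_id": row.get("customer_id"),
                "order_date": row.get("order_date"),
                "total_czk": total,
            }
        )
    return clean, rejected


bronze_rows = [
    {"order_id": "1", "customer_id": "c1", "order_date": "2024-01-01", "total": "100.5"},
    {"order_id": "1", "customer_id": "c1", "order_date": "2024-01-01", "total": "100.5"},
    {"order_id": None, "customer_id": "c2", "order_date": "2024-01-01", "total": "20"},
    {"order_id": "2", "customer_id": "c3", "order_date": "2024-01-02", "total": "n/a"},
]
clean, rejected = to_silver(bronze_rows)
```

In this sample, the second row is dropped as a duplicate, while the rows with a missing key and an unparseable amount are returned separately so they can be inspected rather than silently lost.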
Gold: Business Data
- Aggregation and business logic
- Dimensional models
- Consumption: BI, ML, API
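To make the aggregation step concrete, here is a small Python sketch that turns cleaned order rows into a daily revenue table, the kind of result a BI dashboard would read. The function name `daily_revenue` and the field names `order_date` and `total_czk` are assumptions chosen for the example.

```python
from collections import defaultdict
from decimal import Decimal


def daily_revenue(silver_rows):
    """Sum order totals per day; returns {order_date: revenue}, sorted by date."""
    totals = defaultdict(Decimal)  # Decimal() starts each day at 0
    for row in silver_rows:
        totals[row["order_date"]] += row["total_czk"]
    return dict(sorted(totals.items()))


silver_rows = [
    {"order_date": "2024-01-01", "total_czk": Decimal("100.50")},
    {"order_date": "2024-01-01", "total_czk": Decimal("49.50")},
    {"order_date": "2024-01-02", "total_czk": Decimal("20.00")},
]
revenue = daily_revenue(silver_rows)
```

Using `Decimal` rather than `float` keeps monetary sums exact, which matters once the figures reach reports.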
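-- dbt implementation
-- models/bronze/raw_orders.sql
-- Bronze: source rows as-is, plus an ingestion timestamp.
SELECT *, current_timestamp() AS ingested_at
FROM {{ source('raw', 'orders') }}

-- models/silver/stg_orders.sql
-- Silver: drop rows without a key, remove exact duplicates, cast the amount.
SELECT DISTINCT
    order_id,
    customer_id,
    order_date,  -- assumed to exist in the source; the Gold model groups by it
    CAST(total AS DECIMAL(12,2)) AS total_czk
FROM {{ ref('raw_orders') }}
WHERE order_id IS NOT NULL

-- models/gold/fct_daily_revenue.sql
-- Gold: one row per day with total revenue.
SELECT order_date, SUM(total_czk) AS revenue
FROM {{ ref('stg_orders') }}
GROUP BY order_date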
Summary
Medallion architecture is a widely used pattern for organizing data in a lakehouse. Bronze preserves raw data, Silver cleans and validates it, and Gold aggregates it for business use.