Lakehouse architecture: combining the data lake and the warehouse

25. 09. 2025 1 min read intermediate

A lakehouse unifies the data lake and the data warehouse in a single layer. It builds on open table formats, the medallion architecture, and unified access to data.

From warehouse and lake to lakehouse

Medallion architecture

  • Bronze: raw data, append-only
  • Silver: cleaned and validated data
  • Gold: business-level aggregations
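
from pyspark.sql import functions as F

# Bronze: ingestion from Kafka. Assumes `spark` is an active SparkSession and
# `bronze` is a streaming DataFrame already read from Kafka.
(bronze.writeStream.format("delta")
    .option("checkpointLocation", "/lakehouse/_checkpoints/bronze_orders")  # assumed path
    .start("/lakehouse/bronze/orders"))

# Silver: cleaning (deduplicate on order_id)
silver = (spark.read.format("delta")
    .load("/lakehouse/bronze/orders")
    .dropDuplicates(["order_id"]))
silver.write.format("delta").save("/lakehouse/silver/orders")

# Gold: aggregation (daily revenue)
gold = (spark.read.format("delta")
    .load("/lakehouse/silver/orders")
    .groupBy("order_date").agg(F.sum("total_czk").alias("revenue")))
gold.write.format("delta").save("/lakehouse/gold/revenue")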
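
To make the layer semantics concrete without a Spark cluster, here is a minimal plain-Python sketch of the same Bronze, Silver, Gold flow. The sample records, the helper names `to_silver` and `to_gold`, and the rule that a missing `total_czk` fails validation are all assumptions for illustration; they are not part of the pipeline above.

```python
from collections import defaultdict

# Hypothetical Bronze records: raw and append-only, so duplicates and bad rows are kept.
bronze_orders = [
    {"order_id": 1, "order_date": "2025-09-01", "total_czk": 1200.0},
    {"order_id": 1, "order_date": "2025-09-01", "total_czk": 1200.0},  # duplicate delivery
    {"order_id": 2, "order_date": "2025-09-01", "total_czk": 800.0},
    {"order_id": 3, "order_date": "2025-09-02", "total_czk": None},    # fails validation
    {"order_id": 4, "order_date": "2025-09-02", "total_czk": 450.0},
]


def to_silver(rows):
    """Silver: keep the first row per order_id and drop rows with a missing total."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["total_czk"] is None or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append(row)
    return cleaned


def to_gold(rows):
    """Gold: aggregate revenue per order_date."""
    revenue = defaultdict(float)
    for row in rows:
        revenue[row["order_date"]] += row["total_czk"]
    return dict(revenue)


silver_orders = to_silver(bronze_orders)
gold_revenue = to_gold(silver_orders)
print(gold_revenue)  # {'2025-09-01': 2000.0, '2025-09-02': 450.0}
```

Note that this sketch keeps the first row seen for each `order_id`, whereas Spark's `dropDuplicates` does not guarantee which of the duplicate rows survives. Bronze is never modified here, so Silver and Gold can always be rebuilt from it.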

Advantages

  • Single storage layer: no duplicate copies of data between a lake and a warehouse
  • Open formats: no vendor lock-in
  • Cost efficiency: inexpensive object storage

Summary

A lakehouse built on the medallion pattern is the preferred approach. The Bronze, Silver, and Gold layers raise data quality step by step.

Tags: lakehouse, architecture, data lake, warehouse

CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience in enterprise IT.