
Dagster: Modern Orchestration with an Asset-Based Approach

16. 03. 2024 1 min read intermediate

Dagster builds on the concept of software-defined assets. Instead of wiring tasks together, you declare the data assets you want to exist, and Dagster derives the pipeline from the upstream assets each definition names as its inputs.

Why Dagster

Dagster focuses on assets (what to create), not operations (what to do).
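To make the idea of a derived pipeline concrete, here is a toy sketch in plain Python. It is not how Dagster is implemented; it only illustrates the principle that naming an upstream asset as a function parameter is enough to work out the execution order. The helper names derive_graph and run_all, and the sample rows, are made up for this illustration.

```python
import inspect
from graphlib import TopologicalSorter


# Toy stand-ins for assets: plain functions, no Dagster involved.
def raw_orders():
    return [{"order_id": 1, "total_eur": 10.0}, {"order_id": 2, "total_eur": 4.0}]


def clean_orders(raw_orders):
    return [{**row, "total_czk": row["total_eur"] * 25.2} for row in raw_orders]


def total_revenue(clean_orders):
    return sum(row["total_czk"] for row in clean_orders)


def derive_graph(functions):
    """Map each function name to the set of parameter names it depends on."""
    return {fn.__name__: set(inspect.signature(fn).parameters) for fn in functions}


def run_all(functions):
    """Run every function once, upstream first, passing results in by name."""
    by_name = {fn.__name__: fn for fn in functions}
    order = list(TopologicalSorter(derive_graph(functions)).static_order())
    results = {}
    for name in order:
        fn = by_name[name]
        kwargs = {dep: results[dep] for dep in inspect.signature(fn).parameters}
        results[name] = fn(**kwargs)
    return order, results


# The functions are passed in the "wrong" order on purpose.
order, results = run_all([total_revenue, clean_orders, raw_orders])
print(order)  # ['raw_orders', 'clean_orders', 'total_revenue']
```

Nothing in the call to run_all states which step runs first; the order falls out of the parameter names, which is the same convention the asset definitions in this article rely on.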

Software-Defined Assets

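import sqlite3

import pandas as pd
from dagster import asset

# Placeholder connection: `conn` must point at the database that holds
# the orders table. The SQLite file name here is only an example.
conn = sqlite3.connect("orders.db")

# Fixed EUR to CZK conversion rate used in this example.
EUR_TO_CZK = 25.2

@asset(group_name="raw")
def raw_orders() -> pd.DataFrame:
    # Load the source table unchanged.
    return pd.read_sql("SELECT * FROM orders", conn)

@asset(group_name="staging")
def clean_orders(raw_orders: pd.DataFrame) -> pd.DataFrame:
    # The parameter name declares the dependency on raw_orders.
    df = raw_orders.copy()
    df['total_czk'] = df['total_eur'] * EUR_TO_CZK
    # Drop orders that cannot be attributed to a customer.
    return df.dropna(subset=['customer_id'])

@asset(group_name="marts")
def daily_revenue(clean_orders: pd.DataFrame) -> pd.DataFrame:
    # One row per day with total revenue and number of orders.
    return clean_orders.groupby('order_date').agg(
        revenue=('total_czk', 'sum'), orders=('order_id', 'count')
    ).reset_index()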

Asset Checks

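from dagster import asset_check, AssetCheckResult

# Attached to the clean_orders asset defined above; Dagster passes the
# asset's value in through the parameter of the same name.
@asset_check(asset=clean_orders)
def no_negative_amounts(clean_orders):
    # The check passes only when no order has a negative CZK total.
    neg = clean_orders[clean_orders['total_czk'] < 0]
    return AssetCheckResult(passed=len(neg) == 0)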

Summary

Dagster is well suited to asset-oriented data platforms. It offers software-defined assets, built-in testing through asset checks, and partitioning.


CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience in enterprise IT.