DataHub — Open Data Catalog for Modern Data Stack

23. 02. 2025 (updated 27. 03. 2026)

DataHub centralizes metadata from the entire data stack and adds automatic lineage, search, tagging, and governance on top of it.

DataHub — Central Hub for Metadata

DataHub addresses two recurring questions: where to find data, and how far to trust it.

Features

  • Automatic ingestion — 50+ connectors
  • Lineage — automatic dependency mapping
  • Search — full-text search
  • Ownership — assign owners
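
Example ingestion recipe (PostgreSQL source, DataHub REST sink)

# Credentials are read from environment variables; the variable names are placeholders.
source:
  type: postgres
  config:
    host_port: "warehouse:5432"
    database: analytics
    username: "${POSTGRES_USER}"
    password: "${POSTGRES_PASSWORD}"
    profiling:
      enabled: true  # collect table and column statistics during ingestion
sink:
  type: datahub-rest
  config:
    server: "http://datahub:8080"  # DataHub backend (GMS) endpoint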
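
When many similar sources need recipes, the structure shown above can be generated instead of written by hand. The following is a minimal sketch using only the Python standard library. The helper names `build_recipe` and `validate_recipe` are hypothetical, and the check is purely structural; it does not replace DataHub's own recipe validation.

```python
import json


def build_recipe(host_port: str, database: str, server: str, profiling: bool = True) -> dict:
    """Return a recipe dict that mirrors the YAML source/sink layout."""
    return {
        "source": {
            "type": "postgres",
            "config": {
                "host_port": host_port,
                "database": database,
                "profiling": {"enabled": profiling},
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": server},
        },
    }


def validate_recipe(recipe: dict) -> list[str]:
    """Illustrative structural check: each section needs a type and a config mapping."""
    problems = []
    for section in ("source", "sink"):
        block = recipe.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing section: {section}")
            continue
        if "type" not in block:
            problems.append(f"{section}.type is required")
        if not isinstance(block.get("config"), dict):
            problems.append(f"{section}.config must be a mapping")
    return problems


# Same values as the recipe in the article.
recipe = build_recipe("warehouse:5432", "analytics", "http://datahub:8080")
print(json.dumps(recipe, indent=2))
```

With the `acryl-datahub` package installed, a dict of this shape should be usable with `Pipeline.create(recipe)` from `datahub.ingestion.run.pipeline`, followed by `run()`. Treat that as a pointer to check against the DataHub documentation for your version rather than a tested integration.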

Practical Deployment

DataHub is typically deployed as a Docker Compose stack or on Kubernetes using a Helm chart. After startup, you configure ingestion recipes for individual data sources — PostgreSQL, Snowflake, Airflow, dbt, and dozens more. Ingestion runs periodically (cron) or as part of a CI/CD pipeline.
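
A scheduled or CI-driven ingestion run can be as simple as invoking the DataHub CLI (`datahub ingest -c <recipe>`) once per recipe and failing the job if any run fails. The sketch below assumes the CLI is installed and on `PATH`; the function names and the recipe file names in the usage note are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable


def build_ingest_command(recipe_path: Path) -> list[str]:
    """Command line for a single recipe: datahub ingest -c <recipe>."""
    return ["datahub", "ingest", "-c", str(recipe_path)]


def _run(cmd: list[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(cmd).returncode


def run_recipes(
    recipe_paths: Iterable[Path],
    runner: Callable[[list[str]], int] = _run,
) -> dict[str, int]:
    """Run every recipe in order and collect exit codes keyed by recipe path."""
    results = {}
    for path in recipe_paths:
        results[str(path)] = runner(build_ingest_command(path))
    return results


def failed(results: dict[str, int]) -> list[str]:
    """Recipes whose ingestion exited with a non-zero code."""
    return sorted(path for path, code in results.items() if code != 0)
```

In a cron job or pipeline step, calling `run_recipes([Path("postgres.yml"), Path("dbt.yml")])` and exiting non-zero when `failed(...)` is non-empty makes a broken source visible instead of silently leaving stale metadata in the catalog. The `runner` parameter exists so the loop can be tested without the CLI.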

DataHub’s greatest value lies in automatic column-level lineage: you can see where data comes from and where it flows, down to individual columns. This makes it much easier to debug data issues and to assess the impact of a schema change, because the affected downstream columns can be listed before the change is made. For teams managing dozens of databases and hundreds of tables, a data catalog is an essential tool for enforcing data governance and reducing the time spent searching for the right data.
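
To make the idea of impact analysis concrete, column-level lineage can be viewed as a directed graph in which each edge points from an upstream column to a column derived from it. The sketch below is a toy model, not DataHub's API: the edges and column names are invented for illustration, and the traversal is a plain breadth-first search.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (upstream column, downstream column).
EDGES = [
    ("raw.orders.amount", "staging.orders.amount_eur"),
    ("raw.orders.currency", "staging.orders.amount_eur"),
    ("staging.orders.amount_eur", "marts.revenue.daily_total"),
    ("raw.customers.id", "marts.revenue.customer_id"),
]


def build_graph(edges):
    """Map each upstream column to the set of columns derived directly from it."""
    graph = defaultdict(set)
    for upstream, downstream_col in edges:
        graph[upstream].add(downstream_col)
    return graph


def downstream(graph, column):
    """All columns transitively derived from `column`, in sorted order."""
    seen = set()
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


graph = build_graph(EDGES)
# Columns that could break if raw.orders.currency is renamed or dropped.
print(downstream(graph, "raw.orders.currency"))
```

The same traversal run in the opposite direction (edges reversed) answers the debugging question: which upstream columns could have caused a wrong value in a given report column.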

Summary

DataHub is a leading open-source data catalog with automatic lineage and a broad set of integrations.

Tags: datahub, data catalog, metadata, lineage

CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience with enterprise IT.