Google BigQuery: Serverless Data Warehouse
BigQuery architecture, partitioning, clustering, ML, and cost control.
Architecture
Storage (Colossus) and compute (Dremel) are separated, so each scales independently of the other. Pricing is either on-demand, billed at $5 per TB of data scanned, or flat-rate slots.
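Because on-demand pricing is driven by bytes scanned, a query's cost can be estimated with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, the $5/TB rate is the figure quoted above rather than a verified current list price, and the decimal terabyte (10^12 bytes) is an assumption.

```python
# Minimal sketch: estimate on-demand query cost from bytes scanned.
# Assumptions: the $5/TB rate quoted in the article, and a decimal
# terabyte (10**12 bytes). Adjust both to match your actual billing terms.

PRICE_PER_TB_USD = 5.0
BYTES_PER_TB = 10**12


def on_demand_cost_usd(bytes_scanned: int,
                       price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Return the estimated on-demand cost in USD for a single query."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb


if __name__ == "__main__":
    # A query that scans 2 TB under the assumed rate.
    print(f"{on_demand_cost_usd(2 * 10**12):.2f} USD")
```

Keeping the rate as a parameter makes it easy to rerun the estimate if the price changes.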
Partitioning and Clustering
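-- Build a date-partitioned, clustered copy of the raw events table.
CREATE TABLE dataset.events
PARTITION BY DATE(event_timestamp)  -- one partition per calendar day
CLUSTER BY user_id, event_type      -- sort rows by these columns within each partition
AS SELECT * FROM dataset.raw_events;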
Queries that filter on the partitioning column, and ideally on the clustered columns too, scan dramatically less data, which lowers cost.
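To give a feel for the size of the saving, here is a rough back-of-the-envelope sketch. It assumes a hypothetical 3.65 TB table split into 365 equally sized daily partitions, the $5/TB rate quoted in this article, and a decimal terabyte. The function names are illustrative, and the extra reduction from clustering is not modelled.

```python
# Minimal sketch: how partition pruning shrinks the bytes scanned.
# Assumptions: equally sized partitions, $5 per TB scanned (the article's
# figure), decimal terabytes. Clustering savings are not modelled here.

PRICE_PER_TB_USD = 5.0
BYTES_PER_TB = 10**12


def pruned_scan_bytes(table_bytes: int, total_partitions: int,
                      partitions_hit: int) -> int:
    """Bytes scanned when a filter touches only `partitions_hit` partitions."""
    if total_partitions <= 0:
        raise ValueError("total_partitions must be positive")
    if not 0 <= partitions_hit <= total_partitions:
        raise ValueError("partitions_hit must be between 0 and total_partitions")
    return table_bytes * partitions_hit // total_partitions


def cost_usd(bytes_scanned: int) -> float:
    """Estimated on-demand cost in USD under the assumed rate."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    table_bytes = 3_650 * 10**9  # hypothetical table: one year of daily partitions
    full_scan = pruned_scan_bytes(table_bytes, 365, 365)
    one_week = pruned_scan_bytes(table_bytes, 365, 7)
    print(f"full scan: {full_scan / 10**9:.0f} GB, {cost_usd(full_scan):.2f} USD")
    print(f"one week:  {one_week / 10**9:.0f} GB, {cost_usd(one_week):.2f} USD")
```

Real partitions are rarely uniform in size, so treat the result as an order-of-magnitude estimate rather than a billing forecast.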
BigQuery ML
-- Train a logistic regression classifier; 'churned' is the label column.
CREATE OR REPLACE MODEL dataset.churn_model
OPTIONS(model_type='LOGISTIC_REG', input_label_cols=['churned'])
AS SELECT days_since_last_login, total_purchases, churned
FROM dataset.user_features;
Summary
BigQuery is the fastest path to analytics over petabytes of data. Partitioning and clustering are the key to keeping costs under control.