
Batch Processing

21. 08. 2025 · Updated: 24. 03. 2026 · 1 min read · intermediate

Loading millions of records into memory at once means running out of it. Chunking, streaming, and parallelism are the standard fixes.

Chunking

Python — process in batches of 1000

```python
def process_in_chunks(query, chunk_size=1000):
    offset = 0
    while True:
        chunk = db.execute(query.limit(chunk_size).offset(offset)).fetchall()
        if not chunk:
            break
        for row in chunk:
            process(row)
        db.commit()
        offset += chunk_size
```
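Chunking is not limited to database rows. A minimal sketch of the same pattern over any iterable, using only the standard library (the `chunked` helper name is our own):

```python
from itertools import islice

def chunked(iterable, chunk_size=1000):
    """Yield successive lists of up to chunk_size items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        yield chunk

# Five items in chunks of two -> three chunks, the last one short
print(list(chunked(range(5), 2)))  # [[0, 1], [2, 3], [4]]
```

Because `islice` pulls lazily from the underlying iterator, this never materializes more than one chunk at a time.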

Server-Side Cursor (PostgreSQL)

SQLAlchemy — server-side cursor

```python
from sqlalchemy import text

with engine.connect().execution_options(stream_results=True) as conn:
    result = conn.execute(text("SELECT * FROM big_table"))
    for chunk in result.partitions(1000):
        for row in chunk:
            process(row)
```
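The batched-fetch shape of this loop can be demonstrated with the standard library alone. A minimal sketch using `sqlite3` and `Cursor.fetchmany` (the in-memory table and the 2,500-row count are ours for illustration; SQLite buffers differently from PostgreSQL, so this shows the API shape rather than true server-side streaming):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO big_table VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(2500)],
)

cursor = conn.execute("SELECT * FROM big_table")
total = 0
while True:
    rows = cursor.fetchmany(1000)  # at most 1000 rows per round trip
    if not rows:
        break
    total += len(rows)  # process(row) would go here

print(total)  # 2500
```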

Parallelism

```python
from concurrent.futures import ProcessPoolExecutor

with ProcessPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(process_chunk, chunk) for chunk in chunks]
    results = [f.result() for f in futures]
```
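To make the pattern self-contained, a sketch with an assumed CPU-bound worker (the `process_chunk` body and the sample chunks are ours for illustration):

```python
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk):
    # Stand-in for CPU-bound work: sum of squares over one chunk
    return sum(x * x for x in chunk)

def run_parallel(chunks, workers=4):
    with ProcessPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(process_chunk, c) for c in chunks]
        # Collecting f.result() in submission order keeps results aligned with chunks
        return [f.result() for f in futures]

if __name__ == "__main__":
    chunks = [list(range(i, i + 1000)) for i in range(0, 4000, 1000)]
    print(run_parallel(chunks))
```

Note that the worker must be a top-level function so it can be pickled, and the pool is created under `if __name__ == "__main__":` to avoid re-spawning workers on platforms that start processes via `spawn`.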

Key Takeaway

Use chunking for memory efficiency, server-side cursors for streaming large result sets, and ProcessPoolExecutor for CPU-bound work.

Tags: batch processing, performance, python

CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience with enterprise IT.