
NLP in Practice — BERT, GPT, and Processing Czech Texts

14. 06. 2021 · 1 min read · CORE SYSTEMS · AI

Transformer models have revolutionized NLP. But how do they perform on Czech — a language with seven grammatical cases and rich inflection?

Czech BERT — Czert

A BERT pretrained only on English can't handle Czech morphology: its vocabulary shatters inflected Czech word forms into subword pieces it never saw in context. Czert from ÚFAL MFF UK is pretrained directly on Czech text, while multilingual XLM-RoBERTa is a good compromise when Czech is only one of the languages you need.
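A minimal sketch of how the two options compare in practice, using the Hugging Face `transformers` library. The Czert Hub identifier below is an assumption; check the Hub for the exact published checkpoint name.

```python
from transformers import AutoTokenizer, AutoModel

# Assumed Hub identifier for Czert; "xlm-roberta-base" is the standard multilingual checkpoint.
CZERT = "UWB-AIR/Czert-B-base-cased"
XLMR = "xlm-roberta-base"

tokenizer = AutoTokenizer.from_pretrained(CZERT)
model = AutoModel.from_pretrained(CZERT)

# A Czech sentence with rich inflection; a Czech-trained tokenizer splits it into
# far fewer (and more meaningful) subword pieces than an English-only vocabulary would.
text = "Pojišťovna zamítla klientovu žádost o plnění."
print(tokenizer.tokenize(text))
```

The same `AutoTokenizer`/`AutoModel` calls work for XLM-RoBERTa, so swapping models to compare tokenization and downstream accuracy is a one-line change.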

Insurance Email Classification

We fine-tuned Czert on 15,000 labeled emails split across 8 categories. The result: 94% accuracy. Low-confidence predictions are routed to manual review; see the sketch below.
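A sketch of the inference side of such a pipeline. The checkpoint name, category labels, and the 0.85 confidence threshold are illustrative assumptions, not the production values.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical fine-tuned checkpoint and label set for illustration only.
MODEL_DIR = "czert-email-classifier"
CATEGORIES = ["claim", "contract_change", "complaint", "payment",
              "new_policy", "cancellation", "documents", "other"]
CONFIDENCE_THRESHOLD = 0.85  # illustrative cut-off for routing to an operator

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR, num_labels=8)
model.eval()

def classify_email(text: str) -> str:
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    confidence, label_id = probs.max(dim=-1)
    if confidence.item() < CONFIDENCE_THRESHOLD:
        return "manual_review"  # low-confidence emails go to a human
    return CATEGORIES[label_id.item()]
```

The softmax confidence of the top class is a crude but workable proxy for certainty: everything below the threshold lands in an operator's queue instead of being auto-routed.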

GPT-2 for Generation

We fine-tuned GPT-2 on our customer-support responses. The output is fluent, but the model hallucinates. As an assistant for operators, suggesting a draft reply that a human edits before sending, it makes sense. GPT-3 promises a dramatic improvement, but it is available only via API.
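A minimal sketch of the suggestion step, assuming a hypothetical GPT-2 checkpoint fine-tuned on our support replies (the checkpoint name, prompt format, and sampling parameters are assumptions).

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical checkpoint fine-tuned on Czech customer-support responses.
MODEL_DIR = "gpt2-support-cs"

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR)

prompt = "Dotaz klienta: Jak mohu nahlásit pojistnou událost?\nNávrh odpovědi:"
inputs = tokenizer(prompt, return_tensors="pt")

# Sampling keeps the suggestion fluent; the operator always reviews and edits it
# before sending, because the model can hallucinate facts.
output = model.generate(
    **inputs,
    max_new_tokens=120,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The generated text is only a starting point shown in the operator's UI, never sent to the customer automatically.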

NLP for Czech Is Real

For classification, the results are excellent. For generation, we’re waiting for better models.

Tags: nlp, bert, gpt, transformers, czech nlp
