Transformer models have revolutionized NLP. But how do they perform on Czech — a language with seven grammatical cases and rich inflection?
Czech BERT — Czert
English BERT struggles with Czech morphology: its vocabulary and pretraining data are English. Czert from ÚFAL MFF UK is pretrained directly on Czech text, while multilingual XLM-RoBERTa is a good compromise when you need to cover more than one language.
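One reason a Czech-pretrained model helps is subword tokenization: a vocabulary learned on Czech splits inflected forms into a stem plus a case ending instead of falling back to `[UNK]` or many tiny pieces. The toy greedy longest-match tokenizer below is a WordPiece-style sketch with an invented vocabulary, not Czert's actual tokenizer or vocab.

```python
# Toy greedy longest-match subword tokenizer (WordPiece-style).
# The vocabulary is an illustrative assumption, not Czert's real vocab.
def subword_tokenize(word, vocab):
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, shrinking until a
        # vocabulary entry matches; continuation pieces are prefixed "##".
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no subword covers this position
        tokens.append(piece)
        start = end
    return tokens

# "hradech" (locative plural of "hrad", castle) splits into stem + ending:
vocab = {"hrad", "##ech", "##u", "##y"}
print(subword_tokenize("hradech", vocab))  # ['hrad', '##ech']
```

With a Czech vocabulary, the seven cases of one noun share a stem token, so the model sees them as related forms rather than unrelated strings.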
Insurance Email Classification
We fine-tuned Czert on 15,000 labeled emails across 8 categories and reached 94% accuracy. Low-confidence predictions are routed to manual review.
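The routing step can be sketched as a simple threshold on the top softmax probability: if the model is not confident enough, the email goes to a human. The category names and the 0.85 threshold below are illustrative assumptions, not the production values.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Assumed example categories; the real 8 labels are not given in the text.
CATEGORIES = ["claim", "contract", "complaint", "payment",
              "cancellation", "address_change", "quote", "other"]

def route(logits, threshold=0.85):
    """Return (decision, label, confidence) for one email's logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    decision = "auto" if probs[best] >= threshold else "manual_review"
    return decision, CATEGORIES[best], probs[best]

# A peaked distribution is handled automatically:
print(route([8.0, 0, 0, 0, 0, 0, 0, 0]))   # ('auto', 'claim', ~0.998)
# A flat one is escalated:
print(route([1.0, 0.9, 0.8, 0, 0, 0, 0, 0])[0])  # manual_review
```

The threshold trades automation rate against error rate; in practice it would be tuned on a held-out validation set.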
GPT-2 for Generation
We fine-tuned GPT-2 on customer support responses. The output is fluent, but the model hallucinates facts, so it cannot answer customers directly. As an assistant for operators, suggesting a draft response to edit, it makes sense. GPT-3 promises a dramatic improvement, but is available only via API.
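The human-in-the-loop flow described above can be sketched as follows: the model output is never sent directly, it is wrapped as an editable draft that an operator must approve. `generate_draft` here is a canned stand-in for a fine-tuned GPT-2 call, and the whole structure is an assumed illustration, not the production system.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    draft: str            # model-generated text, possibly wrong
    approved: bool = False
    final_text: str = ""  # only set after human review

def generate_draft(email_text: str) -> str:
    # Placeholder for model.generate(...) on a fine-tuned GPT-2;
    # returns canned text in this sketch.
    return "Dobrý den, děkujeme za Vaši zprávu."

def suggest_reply(email_text: str) -> Suggestion:
    # Drafts start unapproved: hallucinations must not reach customers.
    return Suggestion(draft=generate_draft(email_text))

def operator_approve(s: Suggestion, edited: str) -> Suggestion:
    # The operator's edited version becomes the final, human-approved text.
    return Suggestion(draft=s.draft, approved=True, final_text=edited)
```

The key design choice is that `approved` defaults to `False`: fluency is a convenience for the operator, while correctness remains a human responsibility.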
NLP for Czech Is Real
For classification, the results are excellent. For generation, we’re waiting for better models.