DiaBizEN

DiaBizEN is a representative sample of the DiaBiz transcriptions localized into English. It comprises 583 high-quality translations of conversations spanning all 9 business domains covered by DiaBiz, amounting to approximately 336,000 words, or around 10% of the entire DiaBiz corpus. The translations were carried out by experienced translators and proofread by a native English speaker.

| Domain | Dialogs | Word count (DiaBizEN) | Word count (DiaBiz) | Percentage of DiaBiz |
|---|---|---|---|---|
| Banking | 127 | 69,184 | 773,858 | 9% |
| Telecommunications | 117 | 64,805 | 416,333 | 16% |
| Tourism | 71 | 58,626 | 674,066 | 9% |
| Insurance | 57 | 31,009 | 307,760 | 10% |
| Energy services | 55 | 29,740 | 248,295 | 12% |
| Retail | 46 | 25,316 | 133,702 | 19% |
| Medical care | 45 | 22,044 | 236,057 | 9% |
| Debt collection | 34 | 17,776 | 245,031 | 7% |
| Car rental | 31 | 17,199 | 189,741 | 9% |
| Total | 583 | 335,699 | 3,224,843 | 10% |
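
The percentage column appears to be the DiaBizEN word count divided by the DiaBiz word count for the same domain, rounded to the nearest whole percent. The short Python sketch below recomputes the totals and shares from the figures in the table under that assumption; the names `DOMAINS` and `share_percent` are illustrative only and are not part of any DiaBiz tooling.

```python
# Figures copied from the table: (dialogs, DiaBizEN words, DiaBiz words).
DOMAINS = {
    "Banking": (127, 69_184, 773_858),
    "Telecommunications": (117, 64_805, 416_333),
    "Tourism": (71, 58_626, 674_066),
    "Insurance": (57, 31_009, 307_760),
    "Energy services": (55, 29_740, 248_295),
    "Retail": (46, 25_316, 133_702),
    "Medical care": (45, 22_044, 236_057),
    "Debt collection": (34, 17_776, 245_031),
    "Car rental": (31, 17_199, 189_741),
}


def share_percent(sample_words: int, corpus_words: int) -> float:
    """Return the sample size as a percentage of the full corpus size."""
    return 100 * sample_words / corpus_words


# Column totals, matching the "Total" row of the table.
total_dialogs = sum(dialogs for dialogs, _, _ in DOMAINS.values())
total_sample = sum(sample for _, sample, _ in DOMAINS.values())
total_corpus = sum(corpus for _, _, corpus in DOMAINS.values())

# Per-domain share of DiaBiz covered by DiaBizEN, rounded as in the table.
shares = {
    name: round(share_percent(sample, corpus))
    for name, (_, sample, corpus) in DOMAINS.items()
}

if __name__ == "__main__":
    for name, (dialogs, sample, corpus) in DOMAINS.items():
        print(f"{name:<20}{dialogs:>5}{sample:>9}{corpus:>10}{shares[name]:>5}%")
    overall = round(share_percent(total_sample, total_corpus))
    print(f"{'Total':<20}{total_dialogs:>5}{total_sample:>9}{total_corpus:>10}{overall:>5}%")
```

Retail has the highest relative coverage and Debt collection the lowest, so per-domain statistics drawn from DiaBizEN may need reweighting if they are meant to reflect the proportions of the full corpus.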

A sample of the localized corpus can be downloaded here.

diabiz_en.txt · Last modified: 2025/01/24 12:55 by pezik