diabiz

Differences

This shows you the differences between two versions of the page.

diabiz [2022/04/20 13:58] – [The domains covered:] madamczyk
diabiz [2023/09/27 09:49] (current) pezik
Line 4: Line 4:
 **DiaBiz corpus** is a dialog corpus comprising **recordings** and annotated **transcriptions** of **phone-based customer-agent interactions** in several key business domains. **DiaBiz corpus** is a dialog corpus comprising **recordings** and annotated **transcriptions** of **phone-based customer-agent interactions** in several key business domains.
  
 +A general overview of the corpus can be found in this paper: 
 +
 +  * Pęzik, Piotr, Gosia Krawentek, Sylwia Karasińska, Paweł Wilk, Paulina Rybińska, Anna Cichosz, Angelika Peljak-Łapińska, Mikołaj Deckert, and Michał Adamczyk. ‘DiaBiz – an Annotated Corpus of Polish Call Center Dialogs’. In Proceedings of the Language Resources and Evaluation Conference, 723–26. Marseille, France: European Language Resources Association, 2022. [[http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.76.pdf]]
 +
 +
 +
 +
 +Also see the accompanying poster here:
 +  * [[https://drive.google.com/file/d/1f1PNXa98TdjnzVqaml16VCp5Z3myxt0i/view?usp=sharing]]
  
 === The corpus comprises: === === The corpus comprises: ===
    
-  * 4,036 conversations amounting to nearly 410 hours and over 3 million words +  * 4,036 conversations amounting to nearly 410 hours and over 3.2 million words 
-  * dialogues between 5 professional call-center agents and 191 participants as customers+  * dialogues between 5 call-center agents and 191 participants as customers
   * data from 9 business domains with high commercial demand for conversational analytics and automation solutions    * data from 9 business domains with high commercial demand for conversational analytics and automation solutions 
   * dialogues based on 251 real-life interaction scenarios   * dialogues based on 251 real-life interaction scenarios
Line 29: Line 38:
  
  
-The data was **manually transcribed**, **time-aligned** and **annotated**. +The data was automatically **transcribed** and **time-aligned** and subsequently manually **corrected** and **annotated**. 
  
  
-{{:screenshot_2022-01-19_at_11.11.31.png?nolink&600|}}+{{:screenshot_2022-01-19_at_11.11.31_2.png?nolink&600|}}
  
  
Line 53: Line 62:
 =====Availability===== =====Availability=====
  
-Click [[https://uniwersytetlodzki-my.sharepoint.com/:u:/g/personal/michal_adamczyk_filologia_uni_lodz_pl/Eac40u7JvtJGpKOk58ZOw3EBZLXA2ZylYf1mNLZSWpsF0Q?e=MUOlWY|HERE]] to download sample recordings.+All the samples and supplementary materials available on this webpage are copyrighted. They are only included to illustrate the content of the DiaBiz database and should not be used for any other purposes without explicit permission from the University of Lodz representatives.
  
-The current version of the recording catalog is available [[https://uniwersytetlodzki-my.sharepoint.com/:x:/g/personal/michal_adamczyk_filologia_uni_lodz_pl/Ef3jgOahblpKldis9822wRcBtLV42z4eLl485QTBBJD3gw?e=IEzdct|HERE]].+Click [[https://uniwersytetlodzki-my.sharepoint.com/:u:/g/personal/michal_adamczyk_filologia_uni_lodz_pl/EXaiEOFQfv5CnO2UdJ-hPhsBYksrVb0fqh1P6CmwLvwYqA?e=JEvsvi|HERE]] to download sample recordings. 
 + 
 +The current version of the recording catalog is available [[https://docs.google.com/spreadsheets/d/1krwtcjsSUgUzbaTODsTQzqSzjfBWSbRA/edit?usp=sharing&ouid=108677668160278919979&rtpof=true&sd=true|HERE]]. 
 + 
 +For more information about the DiaBiz license for both commercial and scientific use, please contact piotr.pezik@uni.lodz.pl.
  
-For more information, please contact piotr.pezik@uni.lodz.pl . 
 =====Project Team==== =====Project Team====
   * Piotr Pęzik   * Piotr Pęzik
Line 80: Line 92:
   * Zuzanna Deckert   * Zuzanna Deckert
   * Piotr Górniak   * Piotr Górniak
 +  * Konrad Kaczyński
 +  * Łukasz Jałowiecki
 +
 +
 +=====DiaBiz EN=====
 +
 +[[https://docs.google.com/spreadsheets/d/1YTVvyRRb77wfCFOHYgSoRVaGwUfyM8k0kkCogwPGJCI/edit?usp=sharing|Sample]]
 +
  
 =====Acknowledgments==== =====Acknowledgments====
diabiz.1650455918.txt.gz · Last modified: 2022/04/20 13:58 by madamczyk