diabiz
Differences
This shows you the differences between two versions of the page.
diabiz [2022/04/20 13:54] – [The domains covered:] madamczyk | diabiz [2023/09/27 09:49] (current) – pezik | ||
---|---|---|---|
Line 4: | Line 4: | ||
**DiaBiz corpus** is a dialog corpus comprising **recordings** and annotated **transcriptions** of **phone-based customer-agent interactions** in several key business domains. | **DiaBiz corpus** is a dialog corpus comprising **recordings** and annotated **transcriptions** of **phone-based customer-agent interactions** in several key business domains. | ||
+ | A general overview of the corpus can be found in this paper: | ||
+ | |||
+ | * Pęzik, Piotr, Gosia Krawentek, Sylwia Karasińska, | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | Also see the accompanying poster here: | ||
+ | * [[https:// | ||
=== The corpus comprises: === | === The corpus comprises: === | ||
- | * 4,036 conversations amounting to nearly 410 hours and over 3 million words | + | * 4,036 conversations amounting to nearly 410 hours and over 3.2 million words |
- | * dialogues between 5 professional | + | * dialogues between 5 call-center agents and 191 participants as customers |
* data from 9 business domains with high commercial demand for conversational analytics and automation solutions | * data from 9 business domains with high commercial demand for conversational analytics and automation solutions | ||
* dialogues based on 251 real-life interaction scenarios | * dialogues based on 251 real-life interaction scenarios | ||
Line 16: | Line 25: | ||
==== The domains covered: ==== | ==== The domains covered: ==== | ||
- | ^ Domain ^ | + | ^ Domain ^ |
| Banking | 907 | 773, | | Banking | 907 | 773, | ||
| Car rental | 246 | 189, | | Car rental | 246 | 189, | ||
Line 25: | Line 34: | ||
| Telecommunications | 700 | 416, | | Telecommunications | 700 | 416, | ||
| Tourism | 451 | 674, | | Tourism | 451 | 674, | ||
- | | Retail | 270 | 133,702 | 24:24:00 | | + | | Retail | |
- | | **Total** | **3,766** | **3,091,141** | **385:33:32** | | + | | **Total** | **4,036** | **3,224,843** | **409:57:32** | |
- | The data was **manually | + | The data was automatically |
- | {{: | + | {{: |
Line 53: | Line 62: | ||
=====Availability===== | =====Availability===== | ||
- | Click [[https:// | + | All the samples and supplementary materials available on this webpage are copyrighted. They are only included |
- | The current version of the recording catalog is available | + | Click [[https:// |
+ | |||
+ | The current version of the recording catalog is available [[https:// | ||
+ | |||
+ | For more information about the DiaBiz license for both commercial and scientific use, please contact piotr.pezik@uni.lodz.pl. | ||
- | For more information, | ||
=====Project Team===== | =====Project Team===== | ||
* Piotr Pęzik | * Piotr Pęzik | ||
Line 80: | Line 92: | ||
* Zuzanna Deckert | * Zuzanna Deckert | ||
* Piotr Górniak | * Piotr Górniak | ||
+ | * Konrad Kaczyński | ||
+ | * Łukasz Jałowiecki | ||
+ | |||
+ | |||
+ | =====DiaBiz EN===== | ||
+ | |||
+ | [[https:// | ||
+ | |||
=====Acknowledgments===== |
diabiz.1650455694.txt.gz · Last modified: 2022/04/20 13:54 by madamczyk