pllumic

Differences

This shows you the differences between two versions of the page.

pllumic [2025/04/24 13:46] kkaczynski
pllumic [2025/04/25 08:25] (current) kkaczynski
Line 4:
 ====Description====
  
-We release the first representative subset of the PLLuM instruction corpus (PLLuMIC), which we believe to be useful in guiding and planning the development of similar datasets for other LLMs. It is a hand-crafted set of LLM fine-tuning instructions in Polish language, curated according to structured typology and thematic categorisation. It is an integral part of the upcoming scientific article "The PLLuM Instruction Corpus". The research was funded by the Polish Ministry of Digital Affairs in 2024, grant num. 1/WI/DBiI/2023. We plan to continue with the research and extend the dataset in future releases.
+We release the first representative subset of the PLLuM Instruction Corpus (PLLuMIC), which we believe to be useful in guiding and planning the development of similar LLM datasets. PLLuMIC is a hand-crafted set of LLM fine-tuning Polish language instructions, developed in line with the annotation guidelines and covering a functional typology. The corpus is described in more detail in a forthcoming paper titled //The PLLuM Instruction Corpus// (Pęzik et al. 2025). We plan regular updates and significant extensions of the corpus.
  
 ----
pllumic.txt · Last modified: 2025/04/25 08:25 by kkaczynski