Home ➤ Research Talks ➤ 036 10 11 2022

Synthesis and Evaluation of a Domain-specific Large Data Set for Dungeons & Dragons (WiNLP 2022 Poster Presentation)

Akila Peiris
Slides Video

This paper introduces the Forgotten Realms Wiki (FRW) data set and uses it for domain-specific natural language generation. The data set was extracted from the Forgotten Realms Fandom wiki, which is dedicated to the eponymous Dungeons & Dragons (D&D) game setting. It comprises 11 sub-data sets at varying degrees of pre-processing, including plain text, annotated text, linked graphs, an info-box dictionary, and several embedding models. This is the first data set of this size for the D&D domain. We then present a pairwise similarity comparison benchmark, perform D&D domain-specific natural language generation using the corpus, and evaluate the named entity classification with respect to the lore of Forgotten Realms.
