Data Sets and Benchmark for Interactive Collaborative Storytelling Text Generation

Grant Amount:
Rs 300,000

Grant Source:
University of Moratuwa

Grant Program:
Senate Research Committee (Short Term)

Grant Code:
SRC/ST/2022/01

Grant Duration:
2022/10 - 2023/09

Principal Investigator: Nisansa de Silva

Keywords: Natural Language Processing | Artificial Intelligence | Text Generation | Interactive Collaborative Storytelling | TTRPG | Dungeons and Dragons


Tabletop Roleplaying Games (TTRPGs) are a billion-dollar industry that has been dominated by Dungeons and Dragons (D&D) since the game's inception in 1974. Consequently, both academia and industry have made many attempts to automate certain aspects of the game, ranging from dynamic map generation to research on discourse analysis. One aspect of TTRPGs, however, has so far been hard to automate: adventure generation. It has long been considered a fruit of creativity and human touch that a computer cannot replicate. With the advent of deep learning techniques, these long-held beliefs can be called into question. Natural language text generation has become not only possible, but almost indistinguishable from human output.

The long-term goal of this project is to solve the automatic adventure generation problem. However, this task requires a considerable amount of data on which deep learning models can be trained. This short-term grant is expected to cover the stipends of two Technical Assistants for a period of 6 months to collect, clean, process, annotate, and curate the required data. The data will be collected from publicly available sources using web crawling. It will then be cleaned through manual inspection, for which the two Technical Assistants to be hired are expected to have the necessary domain knowledge. The cleaned data will be put into rich formats that researchers can easily use and will then be deposited in publicly accessible international data repositories. The released data will be organized as a single data set that houses multiple sub-data sets, covering various perspectives and utilities of the collected data as well as the anticipated needs of the text generation research task. Benchmarks for using and evaluating the data sets will be designed and released along with the data.


Objectives:

  • Create annotated data sets to establish a benchmark for interactive collaborative storytelling text generation, specifically in the TTRPG Dungeons & Dragons.