Nisansa de Silva

Nisansa Dilushan de Silva

I am a lecturer at the Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka. I obtained my Ph.D. (2020, advisor: Dejing Dou) and MS (2016) degrees in Computer and Information Science from the University of Oregon, USA, and my BSc (Hons) degree in Computer Science & Engineering (2011) from the University of Moratuwa. I joined the University of Moratuwa as a lecturer in 2011. From 2013 to 2014, I worked as a researcher at LIRNEasia. In 2018, I worked as a Givens Associate at Argonne National Laboratory, USA. I have over 80 peer-reviewed publications in computer science, mostly in the subfields of Natural Language Processing and Artificial Intelligence, which have earned more than 1400 citations. I am also an associate member of the IESL.

Nisansa de Silva (Dr. Nisansa Dilushan de Silva, Ph.D.), NisansaDdS


Computer Scientist


Professional Experience

2020 to Present

Senior Lecturer

Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka

2020 to 2021

Research Fellow

LIRNEasia, Sri Lanka

2014 to 2020

Graduate Research/Teaching Fellow

Department of Computer and Information Science, University of Oregon, USA

2018

Givens Associate

Argonne National Laboratory, USA

2011 to 2020

Lecturer

Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka

2013 to 2014

Researcher

LIRNEasia, Sri Lanka

2013 to 2014

Visiting Lecturer

Northshore College of Business and Technology, Sri Lanka

Education

2020

Ph.D. in Computer & Information Science

University of Oregon, USA

2016

MS in Computer & Information Science

University of Oregon, USA

2011

B.Sc. Engineering (Hons) in Computer Science & Engineering

University of Moratuwa, Sri Lanka

Featured Research

SiDiaC: Sinhala Diachronic Corpus

N. Jayatilleke and N. de Silva

Proceedings of the 39th Pacific Asia Conference on Language, Information and Computation, 2025, pp. 511-527

SiDiaC, the first comprehensive Sinhala Diachronic Corpus, covers a historical span from the 5th to the 20th century CE. SiDiaC comprises 58k words across 46 literary works, annotated carefully based on the written date, after filtering based on availability, authorship, copyright compliance, and data attribution. Texts from the National Library of Sri Lanka were digitised using Google Document AI OCR, followed by post-processing to correct formatting and modernise the orthography. The construction of SiDiaC was informed by practices from other corpora, such as FarPaHC, particularly in syntactic annotation and text normalisation strategies, due to the shared characteristics of low-resourced language status. This corpus is categorised based on genres into two layers: primary and secondary. Primary categorisation is binary, classifying each book into Non-Fiction or Fiction, while the secondary categorisation is more specific, grouping texts under Religious, History, Poetry, Language, and Medical genres. Despite challenges including limited access to rare texts and reliance on secondary date sources, SiDiaC serves as a foundational resource for Sinhala NLP, significantly extending the resources available for Sinhala, enabling diachronic studies in lexical change, neologism tracking, historical syntax, and corpus-based lexicography.
