I am a senior lecturer at the Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka. I obtained my Ph.D. (2020, advisor: Dejing Dou) and MS (2016) degrees in Computer and Information Science from the University of Oregon, USA, and my BSc (Hons) degree in Computer Science & Engineering (2011) from the University of Moratuwa. I joined the University of Moratuwa as a lecturer in 2011.
From 2013 to 2014, I worked as a researcher at LIRNEasia. In 2018, I worked as a Givens Associate at Argonne National Laboratory, USA.
I have over 60 peer-reviewed publications in computer science, mostly in the subfields of Natural Language Processing and Artificial Intelligence, which have earned more than 1000 citations.
I am also an associate member of the IESL.
Sinhala Text Classification: Observations from the Perspective of a Resource Poor Language
N. de Silva, 2015, [pdf] [bib]
Sinhala, despite its several-millennia-long history, remains a resource-poor language. The objective of this study was to explore the possibility of enhancing the text classification process of a resource-poor language by means of data and tools from a resource-rich language. However, it was discovered that if the feature space is based on an n-gram model, Sinhala, being a highly inflected language, naturally performs better than English, which is a weakly inflected language. This result held true even when Sinhala was utilizing only basic lexical-level models and English was utilizing advanced semantic-level models.