
Text Summarization

Principal Investigator: Nisansa de Silva

We propose to advance automatic summarization and generation techniques, with a focus on aspect-aware methods that capture fine-grained information and sentiment within concise textual representations.

Automatic summarization and text generation are crucial in natural language processing as they reduce long or complex texts into coherent, concise outputs that preserve the most salient information. This project aims to extend these methods by incorporating aspect-based sentiment analysis principles into summarization and generation tasks.
We explore deep hybrid models and transfer learning approaches to generate accurate, aspect-aware summaries, abstracts, and headlines. Special emphasis is placed on multi-document summarization, where information needs to be aggregated from multiple sources without losing coherence or sentiment orientation. Comparative evaluations are conducted to measure the effectiveness of different modeling strategies, ranging from neural architectures to adapter-based fine-tuning of large pre-trained models.
In addition, we aim to create and share multilingual and low-resource datasets that enable experimentation in summarization and sentiment-aware text generation. By combining innovations in modeling, evaluation, and data curation, this work contributes toward the development of summarization systems that can generate more nuanced and aspect-sensitive textual representations, supporting broader advances in natural language understanding.
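The multi-document setting described above can be illustrated with a deliberately simple baseline. The sketch below is a frequency-based extractive summarizer over several input documents, not the project's neural or adapter-based models: sentences from all sources are pooled, scored by average word frequency, and the top-scoring ones are emitted in their original order to preserve coherence. All function and variable names here are illustrative assumptions.

```python
# Minimal frequency-based extractive multi-document summarization sketch.
# Illustrative baseline only; the project explores neural and hybrid models.
import re
from collections import Counter

def summarize(documents, num_sentences=2):
    """Pool sentences from all documents, score each by average word
    frequency across the whole collection, and return the top-scoring
    sentences joined in their original order."""
    sentences = []
    for doc in documents:
        sentences.extend(
            s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()
        )
    # Corpus-wide word frequencies serve as a crude salience signal.
    words = [w.lower() for s in sentences for w in re.findall(r"\w+", s)]
    freq = Counter(words)

    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(sentences, key=score, reverse=True)
    chosen = set(ranked[:num_sentences])
    # Emit selected sentences in source order, not score order.
    return " ".join(s for s in sentences if s in chosen)
```

Even this crude baseline makes the core multi-document difficulty visible: salience must be estimated across sources, while the output order must remain coherent.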

Objectives:

  • Investigate neural and hybrid approaches for generating structured summaries and condensed representations of long or complex texts.
  • Develop multi-document summarization methods capable of integrating information from diverse sources into coherent outputs.
  • Explore transfer learning and adapter-based fine-tuning techniques to adapt large pre-trained models for summarization tasks.
  • Evaluate and benchmark summarization models across different methods (rule-based, neural, and hybrid) to identify performance trade-offs.
  • Build multilingual and low-resource datasets to support experimentation in summarization and text generation.
  • Advance automatic headline and abstract generation models that can capture both content and sentiment-linked aspects effectively.
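The adapter-based fine-tuning mentioned in the objectives can be sketched numerically. One common adapter form is a LoRA-style low-rank update: the pre-trained weight W stays frozen, and only a small factorized update B·A (rank r much smaller than the layer width) is trained, so the effective layer computes x·(W + B·A)ᵀ. The toy numbers below are assumptions chosen for readability, not the project's actual configuration.

```python
# Toy numeric sketch of a LoRA-style low-rank adapter (pure Python).
# Only B and A would be trained during fine-tuning; W stays frozen.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add(a, b):
    """Element-wise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Frozen pre-trained weight (2x2 identity, purely illustrative).
W = [[1.0, 0.0],
     [0.0, 1.0]]
# Rank-1 adapter factors: B is 2x1, A is 1x2.
B = [[0.5],
     [0.0]]
A = [[0.0, 1.0]]

delta = matmul(B, A)        # trainable low-rank update B @ A
W_adapted = add(W, delta)   # effective weight W + B @ A
x = [[1.0, 2.0]]            # one input row vector
# Forward pass through the adapted layer: x @ (W + B @ A)^T
h = matmul(x, [list(col) for col in zip(*W_adapted)])
```

The design point the sketch conveys is parameter efficiency: for a d×d weight, the adapter trains only 2·d·r values, which is what makes fine-tuning large pre-trained summarization models tractable.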


Keywords: Natural Language Processing | Machine Learning / Deep Learning | Sinhala | Multi-document Summarization | Pre-trained Models




Publications

Theses: MSc Minor Component Research

Conference Papers

Extended Abstracts

Team

External Collaborators: C D Athuraliya


Faculty

Nisansa de Silva

Senior Lecturer
University of Moratuwa

Alumni-MSc Students

Dushan Kumarasinghe

Senior Software Engineer
Sysco LABS Sri Lanka

Kushan Hewapathirana

Machine Learning Engineer
ConscientAI

Pubudu Cooray

Lead Software Engineer
Insighture