Slicing the Bug: Context-aware Multi-line Bug Fixing with LLMs
Automated Program Repair (APR) fixes bugs in computer programs without human intervention, addressing a critical need in software development. Over the past decades, APR has evolved through search-based, template-based, constraint-based, and learning-based techniques. Learning-based approaches have made significant progress in generating single-line patches, but they still struggle to produce multi-line patches. This research proposal introduces an approach to this problem, focusing on Java applications. The proposed methodology uses program slicing to extract the code context relevant to a bug and fine-tunes a code language model on that context to generate multi-line patches. Combining backward slicing, which captures the statements a buggy line depends on, with forward slicing, which captures the statements it affects, gives the model a view of both the causes and the effects of a bug. The proposal also explores alternative code representations to improve the model's bug-fixing capability. To address limitations of existing APR tools, the approach extends the input sequence length so that the model can draw on broader code context, and it captures semantic dependencies among generated patches. Parameter-efficient fine-tuning keeps the training of large models on custom datasets feasible. The methodology is evaluated on the Defects4J benchmark to assess its effectiveness and generalizability across diverse bug scenarios. By automating the repair of complex, multi-line bugs, this research aims to improve the efficiency and reliability of software development and to advance APR techniques.
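
As an illustration of the slicing step only, the following minimal sketch shows how backward and forward slices around a buggy line could be combined into a reduced context for a model. It is not the proposal's implementation: the toy Java method, the hand-written dependency map `DEPENDS_ON`, and the helper names `reachable`, `invert`, and `slice_context` are all assumptions made for this example. A real pipeline would derive the dependencies from a program dependence graph built by a static analysis tool.

```python
from collections import deque

# Toy Java method, keyed by line number (hypothetical example, not from the proposal).
JAVA_LINES = {
    1: "int sum = 0;",
    2: "int count = 0;",
    3: "for (int x : values) {",
    4: "    sum += x;",
    5: "    count++;",
    6: "}",
    7: "int avg = sum / count;  // buggy: fails when values is empty",
    8: 'String label = "n=" + count;',
    9: "return avg;",
}

# Hand-written data/control dependencies: line -> lines it depends on.
# Assumption: in practice these edges come from a program dependence graph.
DEPENDS_ON = {
    4: {1, 3},
    5: {2, 3},
    7: {1, 2, 4, 5},
    8: {2, 5},
    9: {7},
}

def reachable(start, edges):
    """Return every node reachable from start by following edges (start included)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def invert(edges):
    """Reverse the dependency map: line -> lines that depend on it."""
    inverted = {}
    for src, dsts in edges.items():
        for dst in dsts:
            inverted.setdefault(dst, set()).add(src)
    return inverted

def slice_context(source_lines, depends_on, buggy_line):
    """Compute backward and forward slices and the merged, ordered code context."""
    backward = reachable(buggy_line, depends_on)          # what the bug depends on
    forward = reachable(buggy_line, invert(depends_on))   # what the bug affects
    keep = sorted(backward | forward)
    return backward, forward, [source_lines[n] for n in keep]

backward, forward, context = slice_context(JAVA_LINES, DEPENDS_ON, buggy_line=7)
print("\n".join(context))
```

In this toy case the merged slice drops line 8, which neither influences nor is influenced by the buggy division, so the context passed to the model contains only the statements related to the bug.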