Context-Aware Models for
Automatic Source Code Summarization

10:30 am
Wednesday, March 13, 2024
Room 3107
Patrick F. Taylor Hall




Source code summarization is a program comprehension task that consists of writing natural language descriptions of source code. The state of the art in automatic source code summarization is neural networks developed for machine translation. These are usually designed to accept a snippet of source code as a sequence of tokens and generate a description, patterned on sequence-to-sequence learning. However, some of the information required to summarize a subroutine descriptively is often not inside the subroutine itself. The necessary information lives in the "context" around the code, such as other subroutines, files, and build files, as well as pre-learned human knowledge. In this talk, I will present my research on context-aware neural models for better automatic source code summarization. I will discuss the intuition behind each type of context we encode, and describe the techniques and results. Source code summarization is one of my application areas for foundational concepts in automatic program comprehension. I will also discuss future ideas to extend this foundational work to other applications.

Aakash Bansal
University of Notre Dame

Aakash Bansal is a Ph.D. candidate at the University of Notre Dame, advised by Collin McMillan. His research focuses on the development and application of foundational AI techniques for software engineering tasks. His long-term research objective is to bridge the gap between human program comprehension and automatic program comprehension. His short-term goal is the advancement of generative models for software engineering, specifically code summarization. His work has led to thirteen publications at premier SE and cross-disciplinary venues such as TSE, ICPC, ASE, ICSE, ICSME, FSE, ETRA, PACM-HCI, and SANER.