LSU CSE Professor Awarded $300,000 NSF Grant
June 04, 2025
LSU Division of Computer Science and Engineering Professor Rahul Shah was recently
awarded a $300,315 National Science Foundation grant for his research involving parameterized
pattern matching and compressed data structures. Shah’s research will affect how text
data is stored and will balance compression with fast query performance.
“Computing has ushered us into a whole new era of possibilities and discoveries as more and more data and computing resources are available,” Shah said. “While throwing computational resources at a problem has been a successful initial approach, eventually, the efficiency of resources becomes important. Theoretical computer science rigorously studies the difficulties of problems based on resource limitations like time, energy, space, compressibility, input/output, etc. For the text data, space resources and compressibility become important factors.”
Shah says that even though there is a huge amount of data generated, there is a lot of redundancy in data.
“This makes data highly compressible,” he said. “In the field of compressed data structures, the goal is to develop data structures which take space nearly optimal with respect to the best compression while at the same time can answer the queries as fast as the best uncompressed data structure can answer. This project considers parameterized pattern matching problem and constructing compressed index for it. The problem has direct applicability in software plagiarism detection, cloning and versioning systems. Many of the results and components considered in this project are likely good inclusion for course projects and homework problems.”
In parameterized pattern matching, the text consists of parameterized characters and static characters. Two strings are parameterized match if there is a bijective mapping between parameterized alphabets of the two strings, which when applied to the characters of one of the strings, it becomes equal to the other. For example, two strings consisting of parameterized alphabets “abcdaab” and “wxyzwwx” are considered the same because a maps to w, b maps to x, c maps to y, and d maps to z, while no such consistent mapping will exist if the strings were “aaba” and “yyzw.” In the context of comparing two computer programs, these parameterized symbols represent the variable names.
Shah, along with his previous graduate students, introduced succinct and compact indexes based on what they call parameterized Burrows-Wheeler Transform (pBWT). In this project, Shah aims to define entropy-based compression measures for parameterized text and achieve an index with such space occupancies; explore the construction aspects of parameterized index, including implementation issues like memory bottleneck and external memory construction; and obtain best space-time tradeoff results. Apart from these, the project will also focus on various related problems including compressed indexing for two-dimensional pattern matching.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
“We are incredibly proud of Dr. Rahul Shah for receiving this prestigious NSF grant,” LSU Division of Computer Science and Engineering Chair Ibrahim “Abe” Bagilli said. “His work in parameterized pattern matching is advancing the boundaries of what’s possible in data processing and algorithms. At the LSU Division of Computer Science and Engineering, we’re committed to supporting impactful research like this that drives innovation and strengthens LSU’s role as a leader in computing.”