Seminar: Building Scalable Big Data Pipelines: Graph Processing and Genome Assembly

Kisung Lee

Assistant Professor, LSU Division of Computer Science and Engineering

Friday, September 4, 2020

3:00 pm

Location: online-only


The volume of real-world data in many domains is growing at an unprecedented rate. To address such big data challenges, I have been working on several research projects to design and build scalable techniques and frameworks. This presentation will focus on two distributed frameworks: one for scalable assembly of third-generation genome sequences and the other for scalable graph data processing using a NoSQL store.

I will first present a distributed genome assembly framework that can assemble large-scale third-generation sequence datasets using thousands of cores, resulting in faster assembly. The framework is built on the MapReduce computation model. I will then describe a distributed graph processing framework for iterative algorithms. The framework uses a disk-based NoSQL system to process big graph data in a scalable manner while improving overall performance through several optimization techniques.


Dr. Kisung Lee is an assistant professor in the Division of Computer Science and Engineering at Louisiana State University. He received his doctoral degree in computer science from the Georgia Institute of Technology in 2015. During his doctoral study, he spent three summers at IBM Research T.J. Watson as a research intern. His research interests lie at the intersection of big data and distributed data-intensive systems. He is also working on research problems in spatial data management, social network analytics, and bioinformatics. He received the Tiger Athletic Foundation Undergraduate Teaching Award in 2020 and served as a Program Committee Vice-Chair for IEEE BigData 2018.