All are invited to attend the Dissertation Defense in the Department of Computer Science.

Student: Sanjeeva Dodlapati
Date and Time: Jul 16, 2025 at 01:00 PM
Location: ECSB 3316

Committee Chair: Jiangwen Sun
Committee Members:
Dr. Jiangwen Sun (Chair) (Computer Science, ODU)
Dr. Jing He (Computer Science, ODU)
Dr. Desh Ranjan (Computer Science, ODU)
Dr. Yet Nguyen (Mathematics & Statistics, ODU)

Title: Learning Regulatory DNA-Sequence Code of Epigenetic Events using Deep Neural Networks

Abstract: Epigenetic events, such as DNA methylation and histone modifications, arise from a complex interplay among genomic sequence, chromatin-remodeling factors, and environmental cues. These regulatory mechanisms can induce changes in gene expression without altering the underlying DNA sequence, playing critical roles in development, disease, and cellular differentiation. Among these events, DNA methylation is frequently profiled using bisulfite sequencing (e.g., whole-genome bisulfite sequencing [WGBS], reduced representation bisulfite sequencing [RRBS]). However, predictive modeling of epigenetic states (including methylation patterns and regulatory variant effects) remains challenging due to data sparsity, label noise, and limited uncertainty estimation in current deep learning approaches. This dissertation addresses these issues by introducing a suite of data-centric and uncertainty-aware deep learning frameworks. First, we propose a KL-divergence–based transfer learning method to impute sparse methylation profiles from WGBS and RRBS data. Second, we develop a Monte Carlo dropout–based pipeline that generates predictive confidence scores for non-coding variant effect predictions, aiding in the prioritization of potentially regulatory variants in tissue-specific contexts. Third, we conduct a systematic evaluation to quantify how data quality, sample size, and label noise affect model performance, thereby informing best practices for robust, generalizable training. Collectively, these contributions advance scalable, interpretable, and reliable computational approaches for epigenomic data analysis, paving the way for improved understanding and practical utilization of epigenetic events in developmental and disease settings.