The 1st Dependable Data-Driven Discovery (D4) Annual Research Symposium was held at Iowa State University on August 14, 2024, at 1 pm in Hach Hall. The event showcased the interdisciplinary research of the D4 National Research Trainees of 2024.
Our talented trainees from diverse disciplines, including Mathematics, Statistics, Neurobiology, Bioinformatics and Computational Biology (BCB), Biochemistry, Chemical and Biological Engineering (CBE), and Computer Science, each delivered an engaging five-minute lightning talk and presented a poster detailing research in their domain and how it addresses the critical objectives of the Dependable Data-Driven Discovery (D4) program:
A.1: Identify risks to and measures of dependability in data science lifecycles.
A.2: Develop new methods and tools for mitigating risks in data science lifecycles.
D4 Presenters
Detection of Brain Midline Shift using Convolutional Neural Networks - Laura Zinnel (MATH)
Traumatic brain injury (TBI) is a prevalent neurological disorder that can have life-long impacts, and a quick diagnosis of TBI can play an essential role in the effectiveness of treatment. Typically, radiologists use computed tomography (CT) scans to check for signs of TBI in a patient's brain, but this process is time-consuming. My work focuses on developing a method to automatically detect signs of TBI from CT scans using deep learning models. This would speed up the diagnosis of patients with TBI and lead to faster treatment.

Nonlinear Dynamic Bayesian Modeling for Disease Outbreak Forecasting - Spencer Wadsworth (STAT)
To better inform public decision-makers and the healthcare system, the US Centers for Disease Control and Prevention (CDC) hosts an annual collaborative flu forecasting initiative involving dozens of research teams who build weekly forecasts of flu hospitalizations. To maintain uniformity, all forecasts are submitted as a series of predictive quantiles. I explore a new statistical model for recovering the continuous probability distributions from which the predictive quantiles are estimated, thus allowing the various forecasts to be compared and aggregated into an ensemble under well-established scoring and aggregation methods.
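As a rough illustration of what recovering a continuous distribution from submitted quantiles can look like, the sketch below fits a normal distribution to a handful of quantile pairs by least squares. The normal family, the fitting method, the function name `fit_normal_to_quantiles`, and the example numbers are all assumptions made for illustration; the abstract does not specify the model actually used.

```python
from statistics import NormalDist, mean


def fit_normal_to_quantiles(levels, values):
    """Least-squares fit of a normal distribution to (level, value) quantile pairs.

    For a normal distribution, value = mu + sigma * z, where z is the
    standard-normal quantile at the given level. A straight-line fit of the
    values against z therefore recovers mu (intercept) and sigma (slope).
    """
    z = [NormalDist().inv_cdf(p) for p in levels]
    z_bar, v_bar = mean(z), mean(values)
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (vi - v_bar) for zi, vi in zip(z, values))
    sigma = sxy / sxx
    mu = v_bar - sigma * z_bar
    return NormalDist(mu, sigma)


# Hypothetical forecast: quantiles of weekly hospitalizations (made-up numbers).
levels = [0.025, 0.25, 0.5, 0.75, 0.975]
true_dist = NormalDist(mu=1200.0, sigma=150.0)
values = [true_dist.inv_cdf(p) for p in levels]

fitted = fit_normal_to_quantiles(levels, values)
```

Once every forecast is expressed as a full distribution like `fitted`, density-based scores and ensemble combinations can be computed on a common footing, which is the motivation the abstract gives.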

Understanding Associations Between Pathogenicity and Transmission Dynamics in Avian Influenza - Sigournie Brock (BCB)
Avian influenza virus (AIV), otherwise known as “bird flu,” poses a significant threat to both avian populations and the wider public because of its potential to reassort within avian hosts, and recent global outbreaks have severely affected domestic poultry and wild bird populations. The goal of my work is to elucidate the evolutionary transition from low pathogenicity AIV (LPAIV) to high pathogenicity AIV (HPAIV), where pathogenicity is the ability of the virus to cause disease in poultry and high pathogenicity is associated with high morbidity and mortality rates. This presentation will outline a comprehensive Bayesian phylogenetic framework that implements state-dependent speciation and extinction (SSE) models to analyze how pathogenicity influences speciation (i.e., transmission), extinction (i.e., becoming non-infectious), and the rate of evolution between high and low pathogenicity.

Optimization of Retinal Progenitor Cell Proliferation on Coated Microcircuit Interfaces - Austin Sympson (CBE)
Vision loss and blindness affect an estimated 7 million people in America, roughly twice the population of Los Angeles. Retinal degenerative disease is the leading contributor to vision loss and blindness worldwide. Age-related macular degeneration, diabetic retinopathy, and genetic and neural-based diseases all result in apoptosis of varying cell layers, causing vision loss or blindness. The irreversibility of retinal degeneration has made it a critical target for stem cell-based therapies. However, the interdependence of the retinal neural layers and the biologically complex in vivo growth conditions have halted reliable in vitro cell production and successful clinical application. There is a need to investigate robust methods of directing the differentiation of cost-effective retinal progenitor cell lines to address the morphology, functionality, and cell specificity required for transplantation treatment. In this work, we propose using an interdigitated capacitor to electrically stimulate retinal progenitor cells, induce differentiation into various retinal phenotypes, evaluate the morphological and chemical outcomes through immunocytochemistry, and model the conditions that result in the differentiated state. We aim to uncover the factors and conditions necessary to direct the differentiation of retinal progenitor cells with a cost-effective and robust electrical stimulation protocol.

The Role of GAUT Proteins in Root Development - Allison Triebe (IGG)
Cell wall dynamics are regulated during root development through the activity of cell wall-modifying enzymes. My lab has found that the pectin-modifying enzyme GALACTURONOSYLTRANSFERASE 10 (GAUT10) is involved in both primary root elongation and cell division. Through data mining and bimolecular fluorescence complementation (BiFC) assays, we have found that GAUT10 interacts with three other GAUTs: GAUT3, GAUT8, and GAUT11. To test the overlapping function of these GAUT proteins during root development, mutant combinations of these four GAUT genes have been made. Preliminary phenotyping has shown that these genes have non-redundant and epistatic interactions.

Seq-Dock: Language Models for Prediction of Protein-Protein Docking Interactions From Sequence Only - Xiang Ma (ComS)
In our study, we introduce Seq-Dock, a computational protocol designed to utilize natural language encodings from protein pairs jointly trained on normalized binding strength to identify essential amino acids driving binding. By iteratively substituting amino acids with alanines along polypeptide backbones, we identify critical residues for binding. Validation on 15 known protein-protein complexes confirms Seq-Dock's accuracy. Our analysis highlights the challenge of predicting interacting residues in mutated protein sequences, demonstrating the importance of assessing predictive model robustness.
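The alanine-substitution step described above can be sketched in a few lines. This is a minimal illustration of in silico alanine scanning in general, not Seq-Dock itself: the function `alanine_scan`, the scorer `toy_score`, and the example sequences are hypothetical, and `toy_score` is a trivial stand-in for a trained binding-strength model.

```python
def alanine_scan(seq_a, seq_b, score_fn):
    """Score every single alanine substitution in seq_a against partner seq_b.

    Returns (position, original_residue, score_drop) tuples, sorted so that
    the substitution causing the largest drop in predicted binding is first.
    """
    baseline = score_fn(seq_a, seq_b)
    results = []
    for i, residue in enumerate(seq_a):
        if residue == "A":
            continue  # already alanine, so the substitution changes nothing
        mutant = seq_a[:i] + "A" + seq_a[i + 1:]
        results.append((i, residue, baseline - score_fn(mutant, seq_b)))
    return sorted(results, key=lambda r: r[2], reverse=True)


def toy_score(seq_a, seq_b):
    """Hypothetical stand-in for a trained model: counts aromatic residues in seq_a."""
    return float(sum(seq_a.count(r) for r in "WYF"))


# Made-up sequences, for illustration only.
ranking = alanine_scan("MKWLAYG", "GSTPE", toy_score)
```

Residues whose replacement produces the largest score drop are the candidates for being critical to binding; with a real model, `score_fn` would be the learned predictor of normalized binding strength.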

Physical-Process-Based Cell-Average Neural Network (CANN) - Cyna Nguyen (MATH)
In the last few years, machine learning with neural networks has been explored as a way to solve partial differential equations (PDEs) directly. We consider combining the learning mechanisms of neural networks with PDE theory to develop fast neural network solvers for time-dependent PDEs. In particular, we propose assigning different feed-forward networks to approximate different physical terms of the PDEs. The new method shows a better order of convergence than the original cell-average neural network (CANN). The linear convection-diffusion equation is used as the model equation.
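For readers unfamiliar with the setting, the model equation and the cell-average form it leads to can be written out as follows. The symbols (convection speed a, diffusion coefficient ν, cell I_j of width Δx) are notation assumed here, not taken from the abstract. The final identity shows one natural place where separate networks could each approximate a distinct physical term, though the abstract does not state the exact decomposition used.

```latex
% Model equation (a = convection speed, \nu \ge 0 = diffusion coefficient; symbols assumed)
u_t + a\,u_x = \nu\,u_{xx}

% Cell average over I_j = [x_{j-1/2}, x_{j+1/2}] with width \Delta x
\bar{u}_j(t) = \frac{1}{\Delta x}\int_{x_{j-1/2}}^{x_{j+1/2}} u(x,t)\,dx

% Integrating the model equation over I_j \times [t_n, t_{n+1}]
\bar{u}_j^{\,n+1} = \bar{u}_j^{\,n}
  - \frac{a}{\Delta x}\int_{t_n}^{t_{n+1}} \bigl[u(x_{j+1/2},t) - u(x_{j-1/2},t)\bigr]\,dt
  + \frac{\nu}{\Delta x}\int_{t_n}^{t_{n+1}} \bigl[u_x(x_{j+1/2},t) - u_x(x_{j-1/2},t)\bigr]\,dt
```

The first integral carries the convection contribution and the second the diffusion contribution, so the update of each cell average splits cleanly into two physically distinct pieces.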

Individual Fairness in Graphs Using Local and Global Structural Information - Yonas Sium (ComS)
Graph neural networks are powerful graph representation learners in which node representations are highly influenced by features of neighboring nodes. Prior work on individual fairness in graphs has focused only on node features rather than structural issues. However, from the perspective of fairness in high-stakes applications, structural fairness is also important, and the learned representations may be systematically and undesirably biased against unprivileged individuals due to a lack of structural awareness in the learning process. In this work, we propose a pre-processing bias mitigation approach for individual fairness that gives importance to local and global structural features. We mitigate the local structure discrepancy of the graph embedding via a locally fair PageRank method. We address the global structure disproportion between pairs of nodes by introducing truncated singular value decomposition-based pairwise node similarities. Empirically, the proposed pre-processed fair structural features have superior performance in individual fairness metrics compared to the state-of-the-art methods while maintaining prediction performance.
