Achievements

In 2025, the Dependable Data-Driven Discovery (D4) NRT marked a significant milestone with the completion of its inaugural cohort traineeship. We celebrate our trainees for successfully integrating D4 principles into their research and look forward to even greater achievements in the years ahead. Below is a curated list of published, accepted, and submitted works, along with recent D4 talks given by our leadership, graduate trainees, and undergraduate research assistants.

Published Work

Y. Sium, Q. Li, and K. R. Varshney, "Individual fairness in graphs using local and global structural information," in Proc. Seventh AAAI/ACM Conf. on AI, Ethics, and Society (AIES 2024), San Jose, CA, October 2024, pp. 1379-1389.

Seok Hwan Song and W. Tavanapong. How Much Do Prompting Methods Help LLMs on Quantitative Reasoning with Irrelevant Information? In Proc. of the ACM International Conference on Information and Knowledge Management (CIKM 2024), Boise, Idaho, USA, October 2024.

David OBrien, Sumon Biswas, Sayem Mohammad Imtiaz, Rabe Abdalkareem, Emad Shihab, and Hridesh Rajan, "Are Prompt Engineering and TODO Comments Friends or Foes? An Evaluation on GitHub Copilot," ICSE’2024: The 46th International Conference on Software Engineering, April, 2024. 

David OBrien, Robert Dyer, Tien Nguyen, and Hridesh Rajan, "Data-Driven Evidence-Based Syntactic Sugar Design," ICSE’2024: The 46th International Conference on Software Engineering, April, 2024. 

Shibbir Ahmed, Hongyang Gao, and Hridesh Rajan, "Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment," ICSE’2024: The 46th International Conference on Software Engineering, April, 2024. 

Shibbir Ahmed, Sayem Mohammad Imtiaz, Samantha Syeda Khairunnesa, Breno Dantas Cruz, and Hridesh Rajan, "Design by Contract for Deep Learning APIs," ESEC/FSE’2023: The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, December, 2023. 

Giang Nguyen, Sumon Biswas, and Hridesh Rajan, "Fix Fairness, Don’t Ruin Accuracy: Performance Aware Fairness Repair using AutoML," ESEC/FSE’2023: The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, December, 2023.

Ali Ghanbari, Deepak-George Thomas, Muhammad Arbab Arshad, and Hridesh Rajan, "Mutation-based Fault Localization of Deep Neural Networks," ASE’2023: 38th IEEE/ACM International Conference on Automated Software Engineering, September, 2023. 

Mohammed, A.A., Fonder, C., Sakaguchi, D., Tavanapong, W., Mallapragada, S.K., and Idris, A., “IDCIA: Immunocytochemistry Dataset for Cellular Image Analysis”, Proc. of the 14th Conf. on ACM Multimedia Systems, 451-457 (2023).

A. Triebe, "Roles Of Galacturonosyltransferases In Arabidopsis Development," Thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of Master of Science, 2024.

Accepted Work

Wei Y. LLM-Augmenter: Improving Clinical Information Extraction Through Structured LLM Augmentation. Accepted by World Congress on Medical and Health Informatics, Taipei, Taiwan, August 2025. 

Sium, Yonas, and Qi Li. "ComFairGNN: Community Fair Graph Neural Network." In Proceedings of the 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2025), Sydney, Australia, 2025. 

Good S, Charbonnet JA. Molecular Interactions between PFOA and Heat Shock Proteins: Implications for Toxicokinetics in a Warming World. In: Cellular and Molecular Mechanisms of Toxicity Gordon Research Conference; August 10-15, 2025; Proctor Academy, Andover, New Hampshire, United States.

Sloan E, Rozier KY. Understanding Time in Space: Improving Timeline Understandability for Uncrewed Space Systems. In: SpaceCHI 4.0, Cologne, Germany (2025).

Song SH, Chakraborty M, Li Q, Tavanapong W. Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked? Accepted to appear in Findings of the Annual Meeting of the Association for Computational Linguistics (ACL), Vienna, Austria, July 27-Aug. 1, 2025. 

Mohammed A, Tavanapong W, Fonder C, and Sakaguchi DS, “CountXplain: Interpretable Cell Counting with Prototype-Based Density Map Estimation,” To appear in Proceedings of Machine Learning Research, Salt Lake City, Utah, USA, Jul. 2025, pp. 1–15.

Beulah, S. E. R. (2024). Description of Bovine ATAC-seq dataset using the Data Cards Playbook. Manuscript in preparation.

Baker AL Jr, Arruda B, Palmer MV, Nguyen T, et al. Dairy cows inoculated with highly pathogenic avian influenza virus H5N1. Nature. 2025;637:491-497. doi:10.1038/s41586-024-08166-6.  

Nguyen T-Q, et al. Emergence and interstate spread of highly pathogenic avian influenza A(H5N1) in dairy cattle in the United States. Science. 2025;388:eadq0900. doi:10.1126/science.adq0900.  

Fanijo S, Jannesari A, Dickerson J. IDCC-SAM: A Zero-Shot Approach for Cell Counting in Immunocytochemistry Dataset Using the Segment Anything Model. Bioengineering. 2025;12(2):184. 

Submitted Work

Song SH, Efat AA, Tavanapong W. Assessing Y-Axis Influence: Bias in Multimodal Language Models on Chart-to-Table Translation. Submitted to ACL Rolling Review, May 2025. 

Mohammed AA, Fonder C, Wei Y, Tavanapong W, Sakaguchi DS, Mallapragada SK, and Li Q. CellFMCount: A Fluorescence Microscopy Dataset, Benchmark, and Methods for Cell Counting. Submitted to the IEEE Conference on Data Mining, June 6, 2025.

S. Wadsworth and J. Niemi, “Forecasting influenza hospitalizations using a Bayesian hierarchical nonlinear model with discrepancy,” submitted for publication, 2024. 

S. Wadsworth and J. Niemi, "Bayesian Stacking Via Proper Scoring Rule Optimization Using A Gibbs Posterior," submitted for publication, 2024. 

Zinnel, L., & Bentil, S. A. (2024). Comparing windowing methods in 2D and 3D convolutional neural networks to classify brain midline shift. Manuscript under review.

Liu, H., & Zinnel, L. (2024). A primal-dual level set method for computing geodesic distances. Manuscript under review.

D4 Talks

Leadership Team

Mallapragada, S. K. (2024), Invited talk at the University of the District of Columbia (HBCU), Washington, DC

Nettleton, D. (2024), New England Statistics Symposium, University of Connecticut, CT

Friedberg, I. (2024), Yes, Database (OUP)

Sukul, A. (2023 and 2024), Ames National Laboratory Data Science Workshop for Energy Solutions (5 weeks, in person)

Tavanapong, W., Li, Q., Bahng, E.J., Wongpiromsarn, T., Weng, B. (2024), AI Jumpstart for Teachers (K-12)

Li, Q., Tavanapong, W. (2023), Data science workshop for Sacred Heart Middle School, Spencer, IA

Tavanapong, W. (2023), ACM Celebration of Women in Computing, MINK WIC 2023

Li, Q. (2024, April), Training for High School Teachers at Iowa State University, Ames, IA

Tavanapong, W. (2023), Panelist: “The Data Science Organizational Structures (Institutes vs. Departments and Schools),” 2023 Data Science Leadership Summit, Boston University, MA

Mallapragada, S. K. (2023), Rock Stars of Regenerative Engineering Conference, AIChE, San Diego, CA

Graduate Trainees

Allison Triebe, ASPB Midwest 2024 Meeting, American Society of Plant Biologists, Purdue University, West Lafayette, IN

Laura Zinnel & Geng Ding, “Responsible Automated Stem Cell Analysis with Convolutional Neural Networks,” NSF Annual Meeting 2023, Tempe, AZ

Laura Zinnel & Allison Triebe, Biological Sciences Symposium at ISU, April 2024

Spencer Wadsworth, International Symposium on Forecasting, Dijon, 2024

Sigournie Brock, Third Joint Congress on Evolutionary Biology, Montreal, 2024

Austin Sympson, 2024 AIChE® Annual Meeting, San Diego, 2024

Yonas Sium, 2024 AIES Conference, San Jose, CA

All trainees, D4 Annual Research Symposium, Ames, IA, 2024

Undergraduate Trainees

Shah, A. (2024, February), Iowa State Conference on Race and Ethnicity (ISCORE), Ames, IA

Abbajabal, R. (2023), National Conference on Undergraduate Research, University of Wisconsin–Eau Claire, WI

Podlich, E. (2024), National Conference on Undergraduate Research, Long Beach, CA


This work is partially supported by the National Science Foundation under Grant No. 2152117. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.