In 2026, the Dependable Data-Driven Discovery (D4) NRT marked a significant milestone with the completion of its second cohort's traineeship. We celebrate our trainees for successfully integrating D4 principles into their research and look forward to even greater achievements in the years ahead. Below is a curated list of published and submitted works, and updates on the latest D4 talks given by our leadership, graduate trainees, and undergraduate research assistants.
Published / Accepted Work
2026
- Singh, P., Good, S., Summers, K., Sparks, A., Keating, A. F., Charbonnet, J.A. Quantification of Per- and Polyfluoroalkyl Substances (PFAS) in Plasma and Follicular Fluid of Patients Undergoing In Vitro Fertilization (IVF) in Iowa: A Pilot Study. F&S Reports. (2026). DOI: https://doi.org/10.1016/j.xfre.2025.11.009
- Eric Weber, Heather Gallivan, Lydia Butters & Stephen Nathan Mercil (2026) Leveraging Mathematical Knowledge to Prepare Future Math Teachers to Teach Data Science, Scatterplot, 3:1, 2644686, DOI: 10.1080/29932955.2026.2644686
- Spencer Wadsworth and Jarad Niemi. “Quantile Forecast Matching with a Bayesian Quantile Gaussian Process Model.” Statistics and Computing, 36(3), Apr. 2026. https://doi.org/10.1007/s11222-026-10867-z
- Ma, X., Joshi, P., Friedberg, I., and Li, Q. “How Not to Be Seen: Predicting Unseen Enzyme Functions Using Contrastive Learning.” ISMB 2026. https://doi.org/10.64898/2026.02.23.707489
- Abdurahman Ali Mohammed and Wallapak Tavanapong. “Proto4DME: Interpretable Cell Counting via Additive Prototype Density Decomposition and Optimal-Transport Coverage.” 7th Annual Conference on Health, Inference, and Learning (CHIL 2026).
- Azher Ahmed Efat, Seok Hwan Song, and Wallapak Tavanapong. “Beyond Single Plots: A Benchmark for Question Answering on Multi-Charts.” To appear in Proc. of the Annual Meeting of the Association for Computational Linguistics (ACL 2026) Findings. San Diego, CA, USA, July 2-7, 2026.
- Seok Hwan Song, Azher Ahmed Efat, and Wallapak Tavanapong. “Assessing Y-Axis Influence: Bias in Multimodal Language Models on Chart-to-Table Translation.” Proc. of the Annual Meeting of the Association for Computational Linguistics (ACL 2026) Findings. San Diego, CA, USA, July 2-7, 2026.
2025
- H. Liu and L. Zinnel. “A Primal-Dual Level Set Method for Computing Geodesic Distances.” Accepted by SIAM Journal on Numerical Analysis, Oct. 2025. https://doi.org/10.1137/24M1721086
- L. Zinnel and S. A. Bentil. “Comparing Windowing Methods in 2D and 3D Convolutional Neural Networks to Classify Brain Midline Shift.” Biomedical Signal Processing and Control, Sept. 2024.
- Wei, Y., Li, Q., and Pillai, J. “Structured LLM Augmentation for Clinical Information Extraction.” In Proceedings of the 19th World Congress on Medical and Health Informatics, Aug. 10, 2025. DOI 10.3233/SHTI250984
- Dehghanmanshadi, Mohammad, and Wallapak Tavanapong. “Reducing Domain Gap with Diffusion-Based Domain Adaptation for Cell Counting.” In Proceedings of IEEE Int’l Conf. on Machine Learning and Applications (ICMLA). FL, USA, Dec. 3-5, 2025, pp. 1452-1459, doi: 10.1109/ICMLA66185.2025.00221.
- A. A. Mohammed, W. Tavanapong, C. Fonder, and D. S. Sakaguchi. “CountXplain: Interpretable Cell Counting with Prototype-Based Density Map Estimation.” Medical Imaging with Deep Learning (MIDL), 2025. In Proceedings of Machine Learning Research (PMLR).
- Mohammed, Abdurahman Ali, Catherine Fonder, Ying Wei, Wallapak Tavanapong, Donald S. Sakaguchi, Qi Li, and Surya K. Mallapragada. “CellFMCount: A Fluorescence Microscopy Dataset, Benchmark, and Methods for Cell Counting.” In Proceedings of IEEE Int’l Conf. on Data Mining, Washington DC, USA Nov. 12-15, 2025, pp. 613-622, doi: 10.1109/ICDM65498.2025.00069.
- Elizabeth Sloan and Kristin Yvonne Rozier. “Understanding Time in Space: Improving Timeline Understandability for Uncrewed Space Systems.” In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025), OASIcs, Vol. 130, pp. 24:1–24:12, 2025. https://doi.org/10.4230/OASIcs.SpaceCHI.2025.24
- Gallivan, H., and Weber, E. “Scaffolding Data Science Concepts for Future Mathematics Teachers.” In Proceedings of the 47th Annual Meeting of the North American Chapter of the International Group for the Psychology of Mathematics Education, pp. 1739–1743, 2025. https://doi.org/10.51272/pmena.47.2025
- Moore, A., Chu, L., and Zhu, Z. “Adaptive Block-Based Change-Point Detection for Sparse Spatially Clustered Data with Applications in Remote Sensing Imaging.” To be published in Annals of Applied Statistics. https://doi.org/10.48550/arXiv.2505.21814
- Seok Hwan Song, Mohna Chakraborty, Qi Li, and Wallapak Tavanapong. “Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked?” In Findings of the Association for Computational Linguistics: ACL 2025, pp. 22066–22081, Vienna, Austria, 2025. https://aclanthology.org/2025.findings-acl.1138.pdf
- Spencer Wadsworth and Jarad Niemi. “Forecasting Influenza Hospitalizations Using a Bayesian Hierarchical Nonlinear Model with Discrepancy.” Bayesian Analysis, Advance Publication, pp. 1–29, 2025. https://doi.org/10.1214/25-BA1578
- Sium, Yonas, and Qi Li. “ComFairGNN: Community Fair Graph Neural Network.” In Proceedings of the 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2025), Sydney, Australia, 2025. https://doi.org/10.1007/978-981-96-8173-0_2
- Baker, A. L. Jr., Arruda, B., Palmer, M. V., Nguyen, T., et al. “Dairy Cows Inoculated with Highly Pathogenic Avian Influenza Virus H5N1.” Nature, 637, pp. 491–497, 2025. https://doi.org/10.1038/s41586-024-08166-6
- Nguyen, T.-Q., et al. “Emergence and Interstate Spread of Highly Pathogenic Avian Influenza A(H5N1) in Dairy Cattle in the United States.” Science, 388, eadq0900, 2025. https://doi.org/10.1126/science.adq0900
- Fanijo, S., Jannesari, A., and Dickerson, J. “IDCC-SAM: A Zero-Shot Approach for Cell Counting in Immunocytochemistry Dataset Using the Segment Anything Model.” Bioengineering, 12(2), 184, 2025. https://doi.org/10.3390/bioengineering12020184
2024
- Seok Hwan Song and W. Tavanapong. “How Much Do Prompting Methods Help LLMs on Quantitative Reasoning with Irrelevant Information?” To appear in Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM), Boise, Idaho, Oct. 2024. https://dl.acm.org/doi/10.1145/3627673.3679840
- Y. Sium, Q. Li, and K. R. Varshney. “Individual Fairness in Graphs Using Local and Global Structural Information.” In Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES), pp. 1379–1389, 2024. https://doi.org/10.1609/aies.v7i1.31731
- David OBrien, Sumon Biswas, Sayem Mohammad Imtiaz, Rabe Abdalkareem, Emad Shihab, and Hridesh Rajan. “Are Prompt Engineering and TODO Comments Friends or Foes? An Evaluation on GitHub Copilot.” ICSE 2024: The 46th International Conference on Software Engineering, Apr. 2024. https://doi.org/10.1145/3597503.363917
- David OBrien, Robert Dyer, Tien Nguyen, and Hridesh Rajan. “Data-Driven Evidence-Based Syntactic Sugar Design.” ICSE 2024: The 46th International Conference on Software Engineering, Apr. 2024. https://doi.org/10.1145/3597503.3639580
- Shibbir Ahmed, Hongyang Gao, and Hridesh Rajan. “Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment.” ICSE 2024: The 46th International Conference on Software Engineering, Apr. 2024. https://doi.org/10.1145/3597503.362333
2023
- Shibbir Ahmed, Sayem Mohammad Imtiaz, Samantha Syeda Khairunnesa, Breno Dantas Cruz, and Hridesh Rajan. “Design by Contract for Deep Learning APIs.” ESEC/FSE 2023: The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Dec. 2023. https://doi.org/10.1145/3611643.3616247
- Giang Nguyen, Sumon Biswas, and Hridesh Rajan. “Fix Fairness, Don’t Ruin Accuracy: Performance Aware Fairness Repair Using AutoML.” ESEC/FSE 2023: The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Dec. 2023. https://doi.org/10.1145/3611643.3616257
- Khairunnesa, S. S.; Ahmed, S.; Imtiaz, S. M.; Rajan, H.; Leavens, G. T. “What kinds of contracts do ML APIs need?” Empirical Software Engineering 28, no. 6, Article 142, 2023. https://doi.org/10.1007/s10664-023-10320-z
- Ali Ghanbari, Deepak-George Thomas, Muhammad Arbab Arshad, and Hridesh Rajan. “Mutation-Based Fault Localization of Deep Neural Networks.” ASE 2023: 38th IEEE/ACM International Conference on Automated Software Engineering, Sept. 2023.
- Mohammed, A. A., Fonder, C., Sakaguchi, D., Tavanapong, W., Mallapragada, S. K., and Idris, A. “IDCIA: Immunocytochemistry Dataset for Cellular Image Analysis.” In Proceedings of the 14th Conference on ACM Multimedia Systems, pp. 451–457, 2023.
- Biswas, S.; Rajan, H. “Fairify: Fairness Verification of Neural Networks.” In Proceedings of the 45th International Conference on Software Engineering (ICSE ’23), pp. 1546–1558, 2023. DOI: 10.1109/ICSE48619.2023.00134.
- Gohar, U., Biswas, S., & Rajan, H. (2023). Towards understanding fairness and its composition in ensemble machine learning. ICSE ’23: The 45th International Conference on Software Engineering, 1533–1545. https://doi.org/10.1109/ICSE48619.2023.00133
- Imtiaz, S. M., Batole, F., Singh, A., Pan, R., Cruz, B. D., & Rajan, H. (2023). Decomposing a recurrent neural network into modules for enabling reusability and replacement. ICSE ’23: The 45th International Conference on Software Engineering, 1020–1032. https://doi.org/10.1109/ICSE48619.2023.00093
Theses and Dissertations
- Zinnel, L. (2026). Deep Learning and Level Set Methods for 3-Dimensional Brain Image Analysis to Advance Traumatic Brain Injury Detection and Outcome Prediction [Dissertation, Iowa State University].
- Nguyen, C. (2025). Bound-Preserving Neural Network Method for Linear Hyperbolic and Parabolic Equations [Master’s thesis, Iowa State University]. Iowa State University Digital Repository. https://doi.org/10.31274/td-20251215-154
- Triebe, A. R. (2024). Roles of GALACTURONOSYLTRANSFERASES in Arabidopsis Development [Thesis, Iowa State University]. Iowa State University Digital Repository. https://doi.org/10.31274/td-20250502-108
Submitted Work
- Seok Hwan Song, Jiwon Choi, Minoo Hong, Qi Li, and Wallapak Tavanapong. “Are Vision Language Models Faithful to Multimodal Context?” Submitted to ACL 2026.
- S. Wadsworth and J. Niemi. “Bayesian Stacking Via Proper Scoring Rule Optimization Using a Gibbs Posterior.” Submitted for publication, 2024.
- Beulah, S. E. R. (2024). Description of Bovine ATAC-seq Dataset Using the Data Cards Playbook [Technical report]. Submitted work. https://dr.lib.iastate.edu/entities/publication/1d51fd6e-cd1e-48d9-959a-e80aa667e5e5
- Fleming, S. G., Garner, N. E., Lee, D. J., Maertz, P. M., Barnes, E. M., Lundgren, S. L., Demczak, M. G., Spinelli, C. W., Bane, R., Kuivinen, H. S., Harlan, M. J., Schott, K. A., Ji, M. B., Lambert, Z., Zinnel, L., Schmidt, M. J., Hof, P. R., Sacco, J., Limon, K., Keeney, J., Sherwood, C. C., Manger, P. R., and Spocter, M. A. “Comparative Volumetric Analysis of the Striatum and Sequence Analysis of the slc398a Gene in the Euungulata: An Arms-Race Between Predator-Prey Neural Components and the Case for Possible Horizontal Gene Transfer.” Manuscript under review at Journal of Comparative Neurology.
Other D4 Products, Presentations, Theses, and Outreach Outputs
Software, Datasets, Code, and Technical Resources
- Development and Deployment of CyCounter Software in the Sakaguchi Lab. https://cycounter.streamlit.app/
- IDCIA: Immunocytochemistry Dataset for Cellular Image Analysis
The IDCIA dataset is a public microscopy dataset designed for automated cell counting and detection on immunocytochemistry-stained images. It contains 262 microscopic images of electrically stimulated neural progenitor cells, with ground-truth cell counts and point annotations identifying individual cell locations. The images span multiple antibody conditions, making the dataset useful for benchmarking automated cell analysis models under varied staining conditions.
Dataset: https://figshare.com/articles/dataset/Dataset/21970604
Code: https://github.com/ISU-NRT-D4/cell-analysis/tree/main/IDCIA
- CellFMCount Dataset and Codebase
CellFMCount is a large-scale fluorescence microscopy dataset designed to support robust and generalizable automated cell counting. It contains 3,023 images from immunocytochemistry experiments involving neural progenitor cells, with more than 430,000 manually annotated cell locations. Each cell is annotated with a dot marking its approximate center.
Dataset: https://zenodo.org/records/17088532
Code: https://github.com/NRT-D4/CellFMCount
- Creation of 9 Kaggle Competitions for the D4 Summer Bootcamp
Graduate trainees created nine Kaggle competitions that were used as part of the D4 Summer Bootcamp curriculum.
- K–12 Large Language Model Teaching Resources
Teaching resources on bias in large language models applied to resume screening were created and used to train approximately 40 K–12 teachers.
- Graduate–Undergraduate Mentoring Research Projects
A program was created to support mentoring and academic development between graduate and undergraduate trainees.
Faculty Talks, Workshops, and Outreach Presentations
- Mohammed, A. A., Fonder, C., Wei, Y., Tavanapong, W. (Presenter), Sakaguchi, D. S., Li, Q., & Mallapragada, S. K. (2025, November 12–15). CellFMCount: A fluorescence microscopy dataset, benchmark, and methods for cell counting [Conference presentation]. 2025 IEEE International Conference on Data Mining (ICDM), Washington, DC, United States.
- Robert Jernigan. “Bioengineering Using Geometry to Validate Function Predictions.” Presented at Annual Research Day, Chicago, IL, 2025.
- Wang, Y., Wu, H., & Nettleton, D. (Presenter) (2025). A random forest prediction interval with coverage guarantees [Conference presentation]. International Conference on Statistics and Data Science, Seville, Spain.
- Mallapragada, S. K. (2024). Invited talk at the University of the District of Columbia, Washington, DC.
- Nettleton, D. (2024). New England Statistics Symposium, University of Connecticut, CT.
- Friedberg, I. (2024). Yes, Database. Oxford University Press.
- Sukul, A. (2023 and 2024). Ames National Laboratory Data Science Workshop for Energy Solutions, a 5-week in-person workshop.
- Tavanapong, W., Li, Q., Bahng, E. J., Wongpiromsarn, T., and Weng, B. (2024). AI Jumpstart for Teachers, K–12 outreach/training.
- Li, Q., Tavanapong, W., and Huai, M. (2023). Data science workshop for Sacred Heart Middle School, Spencer, Iowa.
- Tavanapong, W. (2023). ACM Celebration of Women in Computing, MINC WIC 2023.
- Li, Q. (2024, April). Training for High School Teachers at Iowa State University, Ames, IA.
- Tavanapong, W. (2023). Panelist, “The Data Science Organizational Structures: Institutes vs. Departments and Schools,” 2023 Data Science Leadership Summit, Boston University, MA.
- Mallapragada, S. K. (2023). Rock Stars of Regenerative Engineering Conference, AIChE, San Diego, CA.
Graduate Trainee Presentations
- Andrew Tan, Maharram Jabrayilov, Jeremy Essner, Abhijit Bera, and Matthew G. Panthani. “Silicon Nanosheet Memristors for Neuromorphic Computing.” Materials Research Society (MRS) Spring 2026 Conference, Honolulu, HI, 2026.
- Alan Moore, Lynna Chu, and Zhengyuan Zhu. “Adaptive Block-Based Change-Point Detection for Sparse Spatially Clustered Data with Applications in Remote Sensing Imaging.” ISU-NISS Conference on AI and Statistics, Ames, IA, 2025.
- Gretta Buttelmann and Matthew Hufford. “Comparative Genomic Analysis of Maize and Its Wild Relatives to Identify Loci Underlying Cold Tolerance and Nitrogen Recycling.” 2025 Maize Genetics Meeting, St. Louis, MO, 2025.
- Good, S., and Charbonnet, J. A. “Molecular Interactions between PFOA and Heat Shock Proteins: Implications for Toxicokinetics in a Warming World.” In Cellular and Molecular Mechanisms of Toxicity Gordon Research Conference, Proctor Academy, Andover, New Hampshire, Aug. 10–15, 2025.
- Gretta Buttelmann, Taylor AuBuchon-Elder, Mitra Menon, Sowmya Mambakkam, Robert Bukowski, M. Cinta Romay, Edward S. Buckler, Elizabeth A. Kellogg, Jeffrey Ross-Ibarra, and Matthew B. Hufford. “Identification of Locally Adapted Loci and Convergent Evolution within Andropogoneae Species.” 2026 Maize Genetics Meeting, Cologne, Germany, 2026.
- Matthew, W., Hattery, T., Chen, K., Granados-Nava, K., Schroeder, E., Myers, B., Moore, R., Loneman, D., Claussen, R., Gilbert, A., Garfin, J., Bjerklie, D., Lauter, N., Hirsch, C., and Yandeau-Nelson, M. D. (2026). “Parsing Genotype-Interaction Effects to Understand the Genetic Architecture of Maize Cuticular Wax Accumulation Across Environments.” Poster presentation, 68th Annual Maize Genetics Meeting, Cologne, Germany.
- Nine graduate trainees presented at the Responsible AI Student Research Symposium, Ames, IA, 2025.
- Triebe, A. R. (2024). ASPB Midwest 2024 Meeting, American Society of Plant Biologists, Purdue University, West Lafayette, IN.
- Zinnel, L., and Ding, G. (2023). “Responsible Automated Stem Cell Analysis with Convolutional Neural Networks.” NSF Annual Meeting, Tempe, AZ.
- Zinnel, L., and Triebe, A. R. (2024, April). Biological Sciences Symposium at Iowa State University.
- Wadsworth, S. (2024). International Symposium on Forecasting, Dijon, France.
- Brock, S. (2024). Third Joint Congress on Evolutionary Biology, Montreal, Canada.
- Sympson, A. (2024). AIChE Annual Meeting, San Diego, CA.
- Sium, Y. (2024). AIES Conference, San Jose, CA.
- Eight graduate trainees presented at the D4 Annual Research Symposium, Ames, IA, 2024.
Undergraduate Trainee Presentations
- Kashyap, R., Ravivenkatesh, V., Sharma, N., Tavanapong, W., and Huai, M. (2026). “An Empirical Study of Predictive Uncertainty Under the Right To Be Forgotten: Machine Unlearning.” Conference presentation, National Conference on Undergraduate Research, Richmond, VA, April 13–15, 2026.
- Harty, O., Separdson, N., Lee, V., & Song, S. H. (2025, April). Imperial vs. metric: Evaluating the impact of unit choice on large language models (LLMs) [Poster presentation]. National Conference on Undergraduate Research (NCUR) 2025, Pittsburgh, PA, United States.
- Ravindran, T., Feid, A., Hahn, B., Loo, M., Darling, D., Nguyen, C., & Song, S. H. (2025, April). What are the capabilities and pitfalls of LLMs on integral calculus problems with zero-shot reasoning? [Poster presentation]. National Conference on Undergraduate Research (NCUR) 2025, Pittsburgh, PA, United States.
- Shah, A. (2024, February). Iowa State Conference on Race and Ethnicity, Ames, IA.
- Abbajabal, R. (2023). National Conference on Undergraduate Research, University of Wisconsin–Eau Claire, WI.
- Podlich, E. (2024). National Conference on Undergraduate Research, Long Beach, CA.
- Undergraduate trainees contributed 10 talks and poster presentations at the Responsible AI Student Research Symposium at Iowa State University.
This work is partially supported by the National Science Foundation under Grant No. 2152117. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.