Spring 2026

CDSS Data Discovery Spring 2026 Symposium

About Data Discovery

The Data Discovery Program connects teams of Berkeley students with mentors across campus and beyond, using data to understand real-world problems in academia, industry, government and nonprofits.

The Symposium is the culmination of the semester, when student researchers share their work through poster presentations. Please join us in celebrating their contributions to the data research community.

You can visit our webpage at https://cdss.berkeley.edu/discovery or email us at cdss-datadiscovery@berkeley.edu.

Schedule

All parts of the Symposium take place in the lobby of the Hearst Memorial Mining Building at UC Berkeley on Thursday, May 7, 2026.

Spring Symposium 2026 Awardees

Research and Storytelling Awards

Agents in the Loop for Small Molecule Drug Design
Akansha Jain, Annie Gao, Ava Joshi, Gautam Ramkumar, Shreyash Goli
Advised by Alan Cheng, Alec Glisman, Claire Suen, David He and Xiaodong Zhu, Merck & Co.

Predicting Total Clay Content from Well Logs
Joseph Wong, Lisa Hao, Theo Dela Cruz
Advised by Kimberly Baldwin & Kevin Smith, U.S. Department of the Interior

Governing the Sea: Automating Detection of Deep Sea Polymetallic Nodules
Brian Hwang, Kailash Ramesh, Thang Nguyen
Advised by Kimberly Baldwin, Erick Huchzermeyer, Kevin Smith, Obediah Racicot, U.S. Department of the Interior

Arctic Vegetation Classification Using Machine Learning & Remote Sensing
Jose de Jesus Ochoa Tolento & Thomas Teixeira Scarioli
Advised by Colette Brown, UC Berkeley

Training and Benchmarking AI/ML Models Using Space Biology Transcriptomic Data
Abraham Guan, Brian Zhou, Karen Meng, Ishanth Hombaiah, Martin Li
Advised by Walter Alvarado, NASA

Honorable Mentions

Retinal Laser Scans for Disease Progression Prediction in MS Patients
Adrian Nguyen, Diana Tao, Jenny Shang, Naman Rudrakshi
Advised by Yifan Zhang and Emily Chen, C. Light Technologies

Identifying Risk Factors of Student Attendance
Edan Wong, Jeremy Li, Olivia Lee
Advised by Ana Costa and Yusuf Ajiboye, Oakland Natives Give Back

Can Analysts Predict Markets? Using Equity Research Reports to Forecast Gilead Sciences Stock Price
Amy Tran, Holden Carrillo, Qamil Mirza, Tara Timm, Zhen Liu
Advised by Ethan Yen, Gilead Sciences


Poster Categories

Environment

Are California School Districts Ready for Climate Impacts?
Karina Parikh, Patrick Chi, Stesha Simon, Zach Makari
Mentored by Devin Ngo & Sarah Whiting, Ten Strands

Analyzing the Landscape of Funding for Environmental & Climate Action
Jason Chang, Natasha Gill, Jeremy Kemmerer, Kelly Le
Mentored by Sarah Whiting, Ten Strands

Operational Decisions and a Changing Climate: Shaping California’s Water Future
Aryan Achuthan, Isabelle Wang, Rain Zou, Isabelle Goebel, Sabi Can Ruso
Mentored by Dino Bellugi and James Gilbert, UC Berkeley

Analysis of Surface Albedo and Climate Variability Relationships at Donner Pass
Allen Liu, Christopher Chang, Angelo Payavala, Josh Tsang
Mentored by Gabe Lewis, Central Sierra Snow Laboratory

Arctic Vegetation Classification Via Machine Learning and Remote Sensing
Jose de Jesus Ochoa Tolento, Thomás Teixeira Scarioli
Mentored by Colette Reifsnyder-Brown, Energy and Resources Group, UC Berkeley

Satellite-based Power Plant CO2 Emissions Quantification Using Deep Learning
Ria Voodi, Rohin Juneja, Deleena Ghosh
Mentored by Hikari Murayama and Ronald Cohen, UC Berkeley

Quantifying Power Plant NOx Emissions Using Deep Learning
Pranav Walimbe, Ryuto Tsuruoka
Mentored by Hikari Murayama and Ronald Cohen, UC Berkeley

Meeting the NASEM Grand Challenges: A Data-Driven NLP Temporal Analysis of AEESP Research Publications/Themes
Abhiram Bhavaraju, Ryan Ngo, Riya Sehgal
Mentored by Kara Nelson, Association of Environmental Engineering & Science Professors

Building Data Processing and Analysis Pipeline for Electrochemical Impedance Spectroscopy (EIS)
Anqi Tao, Medha Rakesh, Ayush Guha
Mentored by Raluca O. Scarlat, Matei Ignuta-Ciuncanu, Momoka Imoto, The SALT Research Group

Real World Analysis of Electric Bus Operation in Anaheim Phase II
Annie Liu, Jocelyn Wu, Krithika Muralidhar, Lavin Yau
Mentored by Timothy Lipman, UC Berkeley

Interactive Dashboard Design for Microbial Data Exploration
Martin Sim, Ju Ho Kim, Zouxuan Wu, Jeffrey Gao
Mentored by Jennifer Kuehl & Shekhar Mishra, LBL

How Much Clay? Predicting Mineral Composition from Wireline Logs
Joseph Wong, Theo Dela Cruz, Lisa Hao
Mentored by Kimberly Baldwin, U.S. Department of the Interior

Governing the Sea: Automating Detection of Deep Sea Polymetallic Nodules
Brian Hwang, Thang Nguyen, Kailash Ramesh
Mentored by Kimberly Baldwin, Erick Huchzermeyer, Kevin Smith, Obediah Racicot, U.S. Department of the Interior

AI-Powered Quality Assurance and Quality Control for Water Data
Ciann Amalan, Tanush Obili, Aneya Sobalkar, Joshua Min
Mentored by Dan Wang, CA Water Board

Data Science for a NASA Air Quality Sensor Database
Holden Fees, Ngan Nguyen, Zakaria Al-Alie
Mentored by Kristen Okorn, NASA

2D Spatial Detection and Tracking of Emerging Solar Active Regions Using Acoustic Power Suppression Maps
Umar Ghani, Tanay Pant, Tanish Kher
Mentored by Irina Kitiashvili, NASA

Business

Optimizing Cross-Platform AI Token Allocation Through Predictive Usage Modeling
Andrew Taylor, Angela Wu, Taewoo Kim, Yirina Wang
Mentored by Nick Cawthon & Ashish Singh, COFAIR

Can Analysts Predict Markets? Using Equity Research Reports to Forecast Gilead Sciences Stock Price
Qamil Mirza bin Abdullah, Holden Carrillo, Zhen Liu, Tara Timm, Amy Tran
Mentored by Ethan Yen, Gilead Sciences

Rivian Product Development AI Assistant
Canhui Huang, Ethan Ngo
Mentored by Zhangli Hu, Rivian

BART: Legacy Codebase Documentation Agent
Rianna Zhu, Haylie Yee, Nikhil Kotta, Carlos Ganoza
Mentored by Yu Shen, BART

LLM-as-a-Judge: Using Large Language Models to Evaluate Machine Translation Quality
Elaine Zhang, Syed Rayyan Ali
Mentored by Goran Muric, InferLink Corporation

Optimal Risk Assessment through Likelihood x Consequence
Carter Chen, Nico Cruz, Reyansh Pallikonda
Mentored by Rodney Martin, NASA Ames Research Center

Education

Identifying Risk Factors of Student Attendance
Edan Wong, Jeremy Li, Olivia Lee
Mentored by Ana Costa & Yusuf Ajiboye, Oakland Natives Give Back

CourseWise Education Meta-Analytics II: Developing an Internal Metrics Dashboard
Cynthia Wen, Rena Shrestha, Chunmin Zheng
Mentored by Angikaar Singh Chana, Equivalence Systems LLC

Evaluating Small Language Models for AI Education
Ria Jain, Sanjana Sathishkumar, Yiran Hu, SzuLun Huang, Seoha Choi, Lance Santana
Mentored by Eric Van Dusen, UC Berkeley

Does Human Capital Matter?: Analysis Of The Impact Of Dedicated Staff On Environmental And Climate Action In Schools
Pratham Rangwala, Huy Nguyen, Uichan Lee
Mentored by Judy He and Sarah Whiting, Ten Strands

Text Simplification for Children: Evaluating Large Language Models vis-à-vis the human baseline
Cole Fees, Katie Kee, Isole Kim, Tanvi Munjeti, Matthew Takagi
Mentored by Anastasia Smirnova, Experimental and Computational Linguistics Ensemble Lab (ECOLE), San Francisco State University

EmpowerHer Too: Cal Women’s Network Mentorship Program
Cadence Loh, Maria Amabella Ava, Rebecca Kong
Mentored by Jolie Lam, CITRIS and the Banatao Institute

Leveraging AI to Assist with Transfer Course Articulation
Erin Kim, Dave Datugan, Shanaya Wickremesinghe, Elizabeth Varghese, Ryan Suh
Mentored by Angikaar Singh Chana, Equivalence Systems LLC

ConflictQuery: An AI-Powered Compliance Classification System for Academics
Amber Gupta, Jack Hu, Manya Sriram
Mentored by Lawrence Ebringer, ConflictQuery

Medicine & Public Health

Retinal Laser Scans for Disease Progression Prediction in Multiple Sclerosis Patients
Naman Rudrakshi, Adrian Nguyen, Diana Tao, Jenny Shang
Mentored by Yifan Zhang & Emily Chen, C. Light Technologies

Agents in the Loop for Small Molecule Drug Design
Akansha Jain, Ava Joshi, Annie Liao, Gautam Ramkumar, Shreyash Goli
Mentored by Alan Cheng, Alec Glisman, Claire Suen, David He, Xiaodong Zhu, Merck & Co.

LLM-Orchestrated Mixture-of-Experts for Solubility Prediction
Samantha Alonso, Surya Appana, Ameya Kiwalkar, Hilary Wang
Mentored by Alan Cheng & Claire Suen, Merck & Co.

Optimizing DNA Sequencing Pipelines
Anjana Niranjanan, Anthony Ho, Sahana Gopalan, Siya Patel
Mentored by Scott Geller, UC Berkeley DNA Sequencing Facility

The Future of Rapid Research: AI-Powered Preprint Review
Khushi Kolte, Lillian Yao, Noah Fond, Ted Nguyen
Mentored by Angel Paul, Stef Bertozzi, Hildy Fong Baker, RRID

CITRIS Health Newsletter and Chatbot Hub
Namira Khanum, Michelle Bao, David Veksler
Mentored by Jolie Lam, CITRIS Health

Training and Benchmarking AI Models for Space Biology
Abraham Westley Guan, Brian Zhou, Ishanth Hombaiah, Karen Meng, Martin Li
Mentored by Walter Alvarado, NASA

Public Policy & Social Science

NLP for Cuneiform Languages
Zoya Brahimzadeh, Mahi Nagananda, Ojas Sathaye
Mentored by Adam Anderson, Fact Grid

GenAI in the Classroom: Instructor Policy Variation and Shifting Grading Components
Christopher Mach, Lynn Chien, Smrithi Senthilnathan, Sophie McKenna
Mentored by Igor Chirikov, UC Berkeley

HERO2: Incentivizing Sustainable Transportation Behavior through Predictive Analytics & Behavioral Nudging
Aayan Agarwal, Sanjana Alluri, Caden Luu, Aahan Bagga, Carl Djapardi
Mentored by Aurora Garrity, Ryan Xu, Anferney Walther, Alexander Stepanov, HERO2

TokenWorks - OCR for Cuneiform Sources: Evaluating & Refining a Pipeline
Vedashnii Raghu, Tiantong Lu, Rithik Anumula, Jerry Cheng
Mentored by Adam Anderson, Token Works

Other Presentations Elsewhere

TreeAgent: A Multi Agent AI Framework for Tree Labeling and Uncertainty Quantification
Collin Hargreaves, Nicholas Saban, Shiyi Chen
Mentored by Huiqi Wang, UC Berkeley

CANS Scores, Incident Risks, and Formulating Youth Treatment Plans
Brooke Uyeda, Chunmin Zheng, Daniel Lancet
Mentored by Alex Borja, Aspiranet

Analyzing Declining Response Rates in Client Satisfaction Surveys
Matthew Yang, Stephen Tan, Wendy Liu
Mentored by Alex Borja, Aspiranet

Physics-guided AI for Streamflow Forecasting, Flood Control, and Water Supply Prediction in California
Alexander Zhai
Mentored by Dino Bellugi & Sabi Can Russo, UC Berkeley

Exploring Inequitable Access to the SF Bay Shoreline
Ishita Bhadra, Jeffrey Ding
Mentored by Sasha Harris-Lovett, SF Estuary Partnership

Meta-meetings
Amir Rafiei, Peyton Li
Mentored by Dhushan Thevarajah, Coeuraj

Full-Stack RAG Chatbot for Policy Database Search
Bo Wang
Mentored by Karen Chapple & Chenghao Li, Urban Displacement Project, UC Berkeley

Poisson Disk Sampling for 3D Forensic Scene Point Cloud Compression
Arya Prince
Mentored by Jackie Sun, Yu Xuan Bu, Simon Su, NIST

Automated Detection of Accessibility Compliance in Architectural Floorplans Using Computer Vision
Amogh Janganure, Anna Buckeye, Dylan Cha, Noor Rauf
Mentored by Mike Hoppe, Geopogo

Acknowledgments

We would like to thank everyone who has contributed to the College of Computing, Data Science, and Society’s Data Discovery program and helped bring the Symposium to life. Data Discovery would not be possible without the leadership of Professor Deb Nolan. Associate Dean of Students Professor Narges Norouzi and Robbie Powers have provided invaluable guidance and support. Jessica McDaniels arranged crucial details of the event, and Camille Simsuangco Roxas helped execute it. CDSS Development contributed funds to the Symposium. Claire Chen and Rohan Bhagat have helped keep Data Discovery running smoothly this semester. Ryan Lovett helped arrange access to computing resources and provided technical support, and we acknowledge Research IT and the National Science Foundation’s National Artificial Intelligence Research Resource Pilot for providing computational resources (award number NAIRR240387). Finally, we’d like to thank the Data Discovery mentors, whose projects and mentorship make the program possible.