My Projects
Explore my portfolio of data science and machine learning projects

Grandma vs. Data Scientist Student: Information-Theoretic Wordle Solver
This project is a Wordle solver that uses information theory and optimization algorithms to play the New York Times Wordle game in as few guesses as possible. It models each guess as an information-gathering step, selecting the word that maximizes expected information gain over the remaining candidate answers. I originally built it to compete playfully with my grandmother, a retired English professor and lifelong word-game enthusiast. It has since become a fun way for us to connect, compare strategies, and talk about language from two very different perspectives: hers as a human expert in words, and mine as a data science student building algorithms.
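
As a rough illustration of the expected-information idea, here is a minimal sketch. It is not the project's actual code: the function names are hypothetical, and it assumes every remaining candidate answer is equally likely. Each guess splits the candidates into groups by feedback pattern, and the Shannon entropy of that split is the expected number of bits the guess reveals.

```python
import math
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Wordle-style feedback: 'g' = green, 'y' = yellow, '-' = grey."""
    result = ["-"] * len(guess)
    unmatched = Counter()  # answer letters not consumed by a green
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            unmatched[a] += 1
    # Second pass: hand out yellows while unmatched copies of the letter remain.
    for i, g in enumerate(guess):
        if result[i] == "-" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def expected_information(guess: str, candidates: list[str]) -> float:
    """Entropy (in bits) of the feedback pattern, assuming a uniform prior."""
    pattern_counts = Counter(feedback(guess, answer) for answer in candidates)
    total = len(candidates)
    return -sum((n / total) * math.log2(n / total) for n in pattern_counts.values())


def best_guess(allowed_guesses: list[str], candidates: list[str]) -> str:
    """Pick the allowed guess with the highest expected information."""
    return max(allowed_guesses, key=lambda g: expected_information(g, candidates))
```

Calling best_guess(word_list, word_list) on a full word list scores every word against every other, so a practical solver would likely cache or precompute the feedback patterns.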

Handwritten Digit Recognition with Neural Networks
This project implements a neural network from scratch for handwritten digit recognition. Built without high-level deep learning frameworks, it demonstrates the core concepts of feedforward neural networks, backpropagation, and gradient descent. An interactive graphical user interface lets users draw digits on a canvas and receive real-time predictions from the trained model, making the project both an educational tool and a practical demonstration of what a neural network can do.
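
To show what feedforward, backpropagation, and gradient descent look like without a framework, here is a minimal pure-Python sketch. It is an illustration under assumptions, not the project's code: the single hidden layer, sigmoid activations, squared-error loss, and all function names are hypothetical choices, and a real digit recognizer would use a much larger input (for example one value per pixel) and vectorized arithmetic.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, n_out, seed=0):
    """Small random weights, zero biases (one hidden layer)."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)],
        "b2": [0.0] * n_out,
    }


def forward(net, x):
    """Feedforward pass: returns hidden and output activations."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h)) + b)
         for row, b in zip(net["W2"], net["b2"])]
    return h, y


def loss(net, x, target):
    """Squared-error loss for one example."""
    _, y = forward(net, x)
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def backward(net, x, target):
    """Backpropagation: gradients of the loss w.r.t. every weight and bias."""
    h, y = forward(net, x)
    # Output-layer error signal: dL/dy times the sigmoid derivative y(1 - y).
    delta2 = [(yi - ti) * yi * (1 - yi) for yi, ti in zip(y, target)]
    # Hidden-layer error signal: push delta2 back through W2.
    delta1 = [hj * (1 - hj) * sum(d * net["W2"][k][j] for k, d in enumerate(delta2))
              for j, hj in enumerate(h)]
    return {
        "W1": [[d * xi for xi in x] for d in delta1],
        "b1": delta1,
        "W2": [[d * hj for hj in h] for d in delta2],
        "b2": delta2,
    }


def sgd_step(net, grads, lr=0.5):
    """One gradient-descent update, applied in place."""
    for name in ("W1", "W2"):
        for row, grad_row in zip(net[name], grads[name]):
            for j, g in enumerate(grad_row):
                row[j] -= lr * g
    for name in ("b1", "b2"):
        for j, g in enumerate(grads[name]):
            net[name][j] -= lr * g
```

Training amounts to repeating backward and sgd_step over many labeled examples. Comparing one backpropagated gradient against a finite-difference estimate of loss is a quick sanity check that the derivative code is right.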

Programming for Data Science Coursework: MCMC Algorithms & Flight Data Analysis (University of London)
A comprehensive statistical computing project completed for ST2195 (Programming for Data Science) at the University of London, consisting of two parts: (1) implementation and analysis of the Metropolis-Hastings MCMC algorithm for simulating random numbers from a Laplace distribution, and (2) analysis of commercial flight data from the 2009 ASA Statistical Computing and Graphics Data Expo. The project demonstrates proficiency in both R and Python, covering topics from Bayesian statistics and convergence diagnostics to logistic regression modeling and large-scale data analysis.
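
For the first part, a Metropolis-Hastings sampler for the standard Laplace density f(x) = 0.5 * exp(-|x|) can be sketched as below. This is a minimal illustration rather than the coursework solution: the random-walk normal proposal, the default step size, the starting point, and the function names are all assumptions.

```python
import math
import random


def laplace_log_density(x):
    """Log of the standard Laplace density f(x) = 0.5 * exp(-|x|)."""
    return math.log(0.5) - abs(x)


def random_walk_metropolis(n_samples, step_sd=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis chain targeting the standard Laplace distribution."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Symmetric proposal, so the Hastings correction cancels.
        proposal = rng.gauss(x, step_sd)
        log_ratio = laplace_log_density(proposal) - laplace_log_density(x)
        # Accept with probability min(1, f(proposal) / f(x)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current state
    return samples
```

The standard Laplace distribution has mean 0 and variance 2, so the sample mean and variance of a long chain give a quick check before moving on to formal convergence diagnostics.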

In Progress: Machine Learning Analysis of Diabetes-Related Health Outcomes (University of London Coursework)
An ongoing machine learning coursework project for ST3189 (Machine Learning) at the University of London, analyzing diabetes-related health outcomes using the CDC Behavioral Risk Factor Surveillance System (BRFSS) dataset. The project requires implementing three core machine learning tasks: unsupervised learning for population segmentation, regression for continuous target variables, and classification for categorical outcomes. The analysis will compare multiple techniques for regression and classification tasks, with a focus on presenting results in an accessible format for audiences with quantitative backgrounds but no prior machine learning knowledge.

Project NoCap: AI-Powered Fact-Checking for Instagram
An AI-powered fact-checking assistant for Instagram that helps users quickly assess the credibility of posts and reels. By forwarding content to the @project_nocap account, users receive an automated analysis that flags potential misinformation and bias and links to more reliable sources, making it easier to navigate the information overload on social media.

Interested in Collaborating?
I'm always open to discussing new projects, creative ideas, or opportunities to be part of your vision.
Get In Touch