Final Project for Applying Machine Learning – Spring 2025
Columbia University, New York City, USA
Credit ratings influence billions in capital flows — but do raters behave differently in times of crisis?
This project investigates whether human credit rating agencies add value beyond machine learning-based predictions, or merely diverge from them, during periods of financial instability. Using XGBoost and real-world corporate ratings data, we explore:
- 📉 Prediction accuracy during crisis vs. normal periods
- 🤖 Model-human disagreements and their underlying drivers
- 📊 Feature analysis of override patterns
- 🔍 Behavioral insights into rating decisions across time
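
As a rough illustration of the crisis vs. normal comparison, the sketch below flags each observation as falling inside or outside a crisis window and then computes how often the model and the agency agree in each period. Everything specific here is an assumption made for the example: the column names (`date`, `agency_rating`, `model_rating`), the crisis date ranges, and the toy data. The notebook's actual definitions may differ. When the agency rating is treated as the label, the agreement rate is the model's accuracy.

```python
import pandas as pd

# Hypothetical crisis windows, chosen only for illustration.
CRISIS_WINDOWS = [
    ("2007-12-01", "2009-06-30"),
    ("2020-02-01", "2020-06-30"),
]


def flag_crisis(dates: pd.Series, windows=CRISIS_WINDOWS) -> pd.Series:
    """Return a boolean Series that is True for dates inside any crisis window."""
    in_crisis = pd.Series(False, index=dates.index)
    for start, end in windows:
        in_crisis |= dates.between(pd.Timestamp(start), pd.Timestamp(end))
    return in_crisis


def summarize_by_period(df: pd.DataFrame) -> pd.DataFrame:
    """Count observations and model-agency agreement per period."""
    out = df.copy()
    out["period"] = flag_crisis(out["date"]).map({True: "crisis", False: "normal"})
    out["agree"] = out["agency_rating"] == out["model_rating"]
    summary = out.groupby("period")["agree"].agg(n="size", agreement_rate="mean")
    summary["disagreement_rate"] = 1 - summary["agreement_rate"]
    return summary


# Toy data (made up) with hypothetical column names.
toy = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2006-03-01", "2008-10-15", "2009-01-20",
             "2015-07-01", "2020-03-15", "2020-04-01"]
        ),
        "agency_rating": ["A", "BBB", "BB", "A", "BBB", "BB"],
        "model_rating": ["A", "BB", "BB", "A", "BBB", "B"],
    }
)

summary = summarize_by_period(toy)
print(summary)
```

On this toy frame, four of the six rows fall inside a crisis window and the model agrees with the agency on half of them, while both normal-period rows agree. Real data would need a larger sample per period before reading anything into such a gap.
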
- `ml_final_project.ipynb`: the full analysis notebook
- `corporate_ratings.csv`: cleaned dataset used in modeling
- XGBoost classifier with hyperparameter tuning
- Cross-validation with custom metrics
- Confusion matrix + risk-adjusted scoring
- Precision-Recall and ROC analysis
- Time series resampling and crisis segmentation
- Visualization with matplotlib & seaborn
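
To make the tuning and custom-metric items above more concrete, here is a minimal sketch of grid search with stratified cross-validation scored by a cost-weighted metric. It is not the notebook's code. The binary target, the synthetic data, the cost weights (`fn_cost`, `fp_cost`), and the parameter grid are all assumptions for illustration. scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the sketch runs with scikit-learn alone; `xgboost.XGBClassifier` follows the same estimator interface and could, in principle, be swapped in.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, make_scorer
from sklearn.model_selection import GridSearchCV, StratifiedKFold


def risk_adjusted_score(y_true, y_pred, fn_cost=5.0, fp_cost=1.0):
    """Negative average misclassification cost, so higher is better.

    The weights are hypothetical: a missed positive (false negative)
    is assumed to cost five times as much as a false alarm.
    """
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return -(fn_cost * fn + fp_cost * fp) / len(y_true)


# Synthetic, imbalanced binary data standing in for the ratings features.
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.8, 0.2], random_state=0
)

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost
    param_grid={"max_depth": [2, 3], "n_estimators": [50, 100]},
    scoring=make_scorer(risk_adjusted_score),
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

Scoring by negative cost lets `GridSearchCV` maximize as usual while penalizing the costlier error type more heavily, which is one plausible reading of "risk-adjusted scoring".
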
- Michelle Ren
- Louis Sellier
- Columbia, Professor Björkegren, and TAs for guidance
- Original dataset and prior academic literature on credit risk modeling
- XGBoost and scikit-learn libraries