This project performs exploratory data analysis (EDA) on a consumer credit dataset to identify factors associated with repayment risk and lending outcomes.
The goal is to help lenders balance two competing risks:
- Rejecting reliable applicants (opportunity loss)
- Approving risky applicants (credit loss)
We analyze structural, demographic, and financial variables to identify meaningful risk signals.
Repository contents:
- EDA.ipynb: analysis notebook
- EDA_Report.pdf: final report
Key findings:
- Stability indicators (employment tenure, education) are stronger risk signals than raw income or loan size.
- Loan amounts track income among reliable borrowers, but this alignment weakens among defaulters.
- Absolute loan size alone is not predictive of risk; relative measures are more informative.
- Socioeconomic attributes are more strongly associated with financing patterns than demographic factors.
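
The contrast between absolute and relative loan measures can be illustrated with a minimal sketch. It assumes pandas and uses a small made-up table, not the project's dataset. The column names (`income`, `loan_amount`, `defaulted`) and the loan-to-income ratio are hypothetical choices for illustration; the notebook may define its relative measures differently.

```python
import pandas as pd

# Toy data for illustration only; NOT the project's dataset.
# Column names (income, loan_amount, defaulted) are hypothetical.
df = pd.DataFrame({
    "income":      [40_000, 60_000, 80_000, 100_000, 30_000, 50_000, 70_000, 90_000],
    "loan_amount": [8_000, 12_000, 16_000, 20_000, 18_000, 9_000, 21_000, 10_000],
    "defaulted":   [0, 0, 0, 0, 1, 1, 1, 1],
})

# Relative measure (an assumption): loan size as a fraction of income.
df["loan_to_income"] = df["loan_amount"] / df["income"]


def loan_income_corr(group: pd.DataFrame) -> float:
    """Pearson correlation between income and loan amount within one group."""
    return group["income"].corr(group["loan_amount"])


# Loan-income alignment computed separately for each repayment outcome.
corr_by_outcome = {
    outcome: loan_income_corr(group) for outcome, group in df.groupby("defaulted")
}

# Average loan-to-income ratio for each repayment outcome.
mean_ratio = df.groupby("defaulted")["loan_to_income"].mean()

print(corr_by_outcome)
print(mean_ratio)
```

Splitting by outcome before computing the correlation is what makes a difference in alignment visible; a single correlation over the pooled data would blend the two groups together.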
The analysis covers:
- Distribution analysis
- Univariate analysis
- Bivariate analysis
- Correlation analysis
- Outlier diagnostics
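
As a rough sketch of how the correlation and outlier steps might look in pandas: the example below uses made-up values and hypothetical column names, and it assumes the common 1.5 × IQR rule for flagging outliers. This README does not state which outlier rule the notebook applies, so treat the rule and the helper `iqr_outlier_mask` as illustrative assumptions.

```python
import pandas as pd


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Toy data for illustration only; NOT the project's dataset.
df = pd.DataFrame({
    "income":      [35_000, 42_000, 48_000, 51_000, 55_000, 60_000, 400_000],
    "loan_amount": [7_000, 9_000, 10_000, 11_000, 12_000, 13_000, 15_000],
})

# Univariate summary: count, mean, std, and quartiles for each column.
summary = df.describe()

# Correlation analysis: pairwise Pearson correlations between numeric columns.
corr_matrix = df.corr()

# Outlier diagnostics: rows whose income falls outside the IQR fences.
outliers = df[iqr_outlier_mask(df["income"])]

print(summary)
print(corr_matrix)
print(outliers)
```

An IQR-based rule is used here because it depends only on quartiles, so a single extreme income does not shift the thresholds the way a mean-and-standard-deviation rule would.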
Data Bootcamp
Group 12
Skye Xi
Jonathan Cain
Sab Rajesh Krishnan