I AM YUEYIN JI
Predictive Analysis Case
Objective
A credit card company asked me to build a model that predicts whether a consumer is approved for a credit card.
The company gave me a dataset that contains the characteristics of applicants for a major credit card. The key dependent variable is ‘card’, which indicates whether a consumer was approved for a credit card. The remaining variables contain other relevant information about each consumer.
Exploratory Data Analysis
I used corrplot to visualize the correlations between variables.

I used regression analysis and ANOVA to identify which variables matter for credit card approval and which do not.
I used regsubsets from the leaps package to search for simpler candidate models.
The purpose of EDA is to determine which factors should be included in the predictive model.
Data Preprocessing
I divided the data into a training set and a test set.
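The split described above can be sketched as follows. This is an illustration in Python using only the standard library, not the code used in the project (which relied on R packages). The function name, the 80/20 ratio, the seed, and the sample records are all assumptions made for the example.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical applicant records: (income, approved as 0 or 1).
applicants = [(i * 0.5, i % 2) for i in range(10)]
train, test = train_test_split(applicants, test_fraction=0.2)
```

Shuffling before splitting guards against any ordering in the original file (for example, approved applicants listed first) leaking into the evaluation.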

Data Modeling
In this part, I ran 13 different model specifications to find the best predictors for credit card approval.
These models include linear models with and without interaction terms, and non-linear models using the earth package, which implements multivariate adaptive regression splines (MARS).
I also implemented K-fold cross-validation to obtain a more reliable estimate of each model's error.
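A minimal sketch of K-fold cross-validation is shown below. It is written in Python with the standard library only and is not the project's own code. A one-variable least-squares line stands in for the 13 model specifications, and the names fit_line and kfold_rmse, the fold count, and the data are assumptions for illustration.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return mean_y - b * mean_x, b

def kfold_rmse(xs, ys, k=5, seed=0):
    """Cross-validated RMSE: each point is predicted by a model
    that was fitted without the fold containing that point."""
    n = len(xs)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        a, b = fit_line([xs[i] for i in train_idx],
                        [ys[i] for i in train_idx])
        squared_errors.extend(
            (ys[i] - (a + b * xs[i])) ** 2 for i in held_out
        )
    # Pool the out-of-fold squared errors, then take the root.
    return math.sqrt(sum(squared_errors) / n)

# Hypothetical data: an exactly linear relationship.
xs = [float(i) for i in range(20)]
ys = [2 + 3 * x for x in xs]
cv_rmse = kfold_rmse(xs, ys, k=5)
```

Because every observation is held out exactly once, the resulting error depends less on one particular train/test split than a single hold-out estimate does.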
Result Validation
Highlights:
I created the “getRMSE” (Get Root Mean Square Error) and “getDataKFoldRMSE” functions to automate calculating the RMSE of a given model on validation data.
These functions significantly streamline model evaluation. Instead of manually computing the RMSE every time a new model is built, I can simply pass the model to one of these functions.
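The idea behind such a helper can be sketched as follows. The original “getRMSE” function and its signature are not shown here, so this Python version is a hypothetical stand-in: the name get_rmse, its arguments, and the toy "always approve" model are assumptions for illustration. It computes RMSE as the square root of the mean squared difference between actual and predicted values.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def get_rmse(predict, features, actual):
    """RMSE of a fitted model, passed in as a callable,
    evaluated on validation data."""
    return rmse(actual, [predict(x) for x in features])

# Hypothetical validation data: one feature and the 0/1 outcome.
validation_x = [1.0, 2.0, 3.0, 4.0]
validation_y = [1, 0, 1, 1]

def always_approve(x):
    """Toy model that predicts approval for every applicant."""
    return 1.0

score = get_rmse(always_approve, validation_x, validation_y)  # 0.5
```

Accepting the model as a callable means any fitted model can be scored with the same one-line call, which is what makes comparing many specifications quick.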


Decision Support
The final selected model can be used to support decisions regarding credit card approval, based on its predictive accuracy (RMSE).
The project focuses on building a predictive model for credit card approval.
The analysis combines correlation analysis, regression, ANOVA, subset selection, and MARS.
The best model is the one with the lowest RMSE when predicting credit card approvals.
The RMSE functions make model assessment faster and more consistent, allowing quicker iteration on the models being developed.
This approach to data analysis is comprehensive and methodical, showcasing skills in data handling, statistical analysis, machine learning, and model evaluation, all of which are valuable in a data analytics or business intelligence role.