The Kaggle dataset was loaded in Python using pandas. Null values were handled and categorical data was encoded. Features such as Amount, Category, State, Merchant, and Location were used for model training.
Target: is_fraud.
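The preprocessing steps above might look roughly like the sketch below. It is only an illustration: the tiny DataFrame is made up and stands in for the Kaggle CSV, the column names (amt, category, state, merchant) are assumptions, and median filling plus integer category codes are one possible choice, not necessarily what the project used.

```python
import pandas as pd

# Made-up rows standing in for the Kaggle file; column names are assumed.
df = pd.DataFrame({
    "amt": [12.5, None, 830.0, 45.2],
    "category": ["grocery_pos", "shopping_net", "misc_net", None],
    "state": ["NY", "CA", "TX", "NY"],
    "merchant": ["m1", "m2", "m3", "m1"],
    "is_fraud": [0, 0, 1, 0],
})

NUMERIC_COLS = ["amt"]
CATEGORICAL_COLS = ["category", "state", "merchant"]

def preprocess(frame):
    """Fill nulls and encode categorical columns as integer codes."""
    out = frame.copy()
    for col in NUMERIC_COLS:
        # Assumed strategy: replace missing numbers with the column median.
        out[col] = out[col].fillna(out[col].median())
    for col in CATEGORICAL_COLS:
        # Assumed strategy: label missing values, then map each label to an integer.
        out[col] = out[col].fillna("unknown").astype("category").cat.codes
    return out

clean = preprocess(df)
X = clean.drop(columns="is_fraud")  # feature matrix
y = clean["is_fraud"]               # target
```

Integer codes keep the example short; one-hot encoding is a common alternative for models that should not treat category codes as ordered.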
Algorithms used:
Logistic Regression, KNN, Naive Bayes, Decision Tree, Random Forest, SVC.
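As a rough sketch, the six algorithms can be trained and compared with scikit-learn as shown below. The data here is synthetic (generated with make_classification), the hyperparameters are defaults or arbitrary choices, and GaussianNB is an assumed Naive Bayes variant; none of this is taken from the project itself.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the encoded transaction features.
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),  # assumed variant
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVC": SVC(),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```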
The trained models were stored as pickle files, and predictions were made from user inputs.
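A minimal sketch of the save, load, and predict cycle follows. The training rows, the file name random_forest.pkl, and the two-feature input layout (amount, encoded category) are all hypothetical and chosen only to keep the example small.

```python
import os
import pickle
import tempfile

from sklearn.ensemble import RandomForestClassifier

# Made-up training rows: [amount, encoded category].
X_train = [[10.0, 0], [25.0, 1], [900.0, 2], [1200.0, 2], [15.0, 0], [1000.0, 1]]
y_train = [0, 0, 1, 1, 0, 1]
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_train, y_train)

# Store the fitted model as a pickle file (hypothetical file name).
model_path = os.path.join(tempfile.mkdtemp(), "random_forest.pkl")
with open(model_path, "wb") as fh:
    pickle.dump(model, fh)

# Later, e.g. in the prediction script: load the model and score a user input.
with open(model_path, "rb") as fh:
    loaded = pickle.load(fh)

user_input = [[950.0, 2]]  # values a user might enter, already encoded
prediction = int(loaded.predict(user_input)[0])
print("is_fraud prediction:", prediction)
```

User inputs need the same encoding that was applied at training time, so the encoders are usually saved alongside the model. Pickle files should only be loaded from trusted sources.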
The end of December shows a spike in total transaction volume, to roughly 4 lakh (400,000) or more.
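A volume figure like this can be derived by grouping transactions by period. The sketch below uses made-up rows and assumed column names (trans_time, amt); it reports both the transaction count and the total amount, since "volume" could refer to either.

```python
import pandas as pd

# Made-up transactions; column names are assumptions.
df = pd.DataFrame({
    "trans_time": pd.to_datetime([
        "2020-11-30 10:00", "2020-12-01 09:30", "2020-12-24 18:45",
        "2020-12-30 12:00", "2020-12-30 20:15",
    ]),
    "amt": [20.0, 35.0, 120.0, 60.0, 80.0],
})

# Group by calendar month and summarize count and total amount.
monthly = df.groupby(df["trans_time"].dt.to_period("M")).agg(
    transactions=("amt", "count"),
    total_amt=("amt", "sum"),
)
print(monthly)
```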
Random Forest achieved 90%+ accuracy. Categories such as shopping network payment (shopping_net), unclassified sites (misc_net), and grocery point of sale (grocery_pos or supermarket) showed some fraud cases. Overall, most categories were clean.
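A per-category breakdown of this kind can be computed with a pandas groupby. The rows below are invented for illustration (including the "travel" category), so the resulting rates do not reflect the project's actual data.

```python
import pandas as pd

# Invented rows; only the category names shopping_net, misc_net and
# grocery_pos come from the text above.
df = pd.DataFrame({
    "category": [
        "shopping_net", "shopping_net", "misc_net",
        "grocery_pos", "grocery_pos", "travel", "travel",
    ],
    "is_fraud": [1, 0, 1, 0, 1, 0, 0],
})

# Count transactions and frauds per category, then rank by fraud rate.
summary = (
    df.groupby("category")["is_fraud"]
    .agg(transactions="count", frauds="sum", fraud_rate="mean")
    .sort_values("fraud_rate", ascending=False)
)
print(summary)
```

Looking at the fraud rate next to the raw count helps separate categories that are risky from those that are simply large.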
The project successfully implemented a credit card fraud detection system using various machine learning algorithms. The Random Forest model achieved an accuracy of over 90%. The analysis of transaction categories revealed fraud patterns that can inform more robust fraud detection systems.