This study evaluates machine learning algorithms for credit card fraud detection and compares them across several performance metrics. Seven supervised classification algorithms were used: Logistic Regression, Decision Trees, Random Forest, XGBoost, Naive Bayes, K-Nearest Neighbors, and Support Vector Machine. Their performance was measured with Accuracy, Precision, Recall, F-Score, AUC, and AUPRC, supplemented by ROC curves and confusion matrices. Data preparation is critical in this study because fraudulent and non-fraudulent transactions are unequally distributed, and this class imbalance must be addressed for model training to succeed and for the results to be reliable. Several techniques, including Scaling and Distribution, Random Under-Sampling, Dimensionality Reduction, and Clustering, were employed so that model performance and the ability to generalize could be evaluated accurately. The Random Forest and K-Nearest Neighbors algorithms achieved the highest performance in this research, both reaching 97% accuracy. The study contributes to the ongoing fight against financial fraud and provides guidance for future research.
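To make the imbalance problem and the Random Under-Sampling step concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the function names (`random_under_sample`, `classification_metrics`), the label convention (1 = fraud, 0 = legitimate), and the toy data are all assumptions introduced for illustration.

```python
import random


def random_under_sample(rows, labels, seed=0):
    """Keep every minority-class row plus an equally sized random
    subset of the majority class (hypothetical helper, labels are 0/1)."""
    rng = random.Random(seed)
    ones = [i for i, y in enumerate(labels) if y == 1]
    zeros = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (ones, zeros) if len(ones) <= len(zeros) else (zeros, ones)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)  # avoid leaving the classes in two contiguous blocks
    return [rows[i] for i in kept], [labels[i] for i in kept]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-score from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Synthetic toy data (assumption): 10 fraudulent rows among 1000 transactions.
rows = list(range(1000))
labels = [1] * 10 + [0] * 990

# A trivial baseline that never flags fraud.
always_legit = [0] * len(labels)
baseline = classification_metrics(labels, always_legit)

# Balanced training subset produced by random under-sampling.
rows_bal, labels_bal = random_under_sample(rows, labels, seed=42)
```

On this toy data the never-flag baseline scores 99% accuracy with zero recall, which illustrates why accuracy alone can mislead on imbalanced data and why metrics such as Recall and AUPRC are reported alongside it. The under-sampled subset contains 20 rows, half of them fraudulent.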
Keywords: Credit card fraud, Fraud detection, Data mining, Machine learning, Imbalanced datasets
Primary Language | English |
---|---|
Subjects | Communications Engineering (Other) |
Section | Articles |
Authors | |
Early Pub Date | April 7, 2024 |
Publication Date | April 30, 2024 |
Submission Date | November 4, 2023 |
Acceptance Date | December 3, 2023 |
Published in Issue | Year 2024 Volume: 8 Issue: 2 |