Credit Card Transaction Analysis

Fall 2018

With the rise of the internet and online banking, card transactions have become everyday occurrences for people and companies around the world. As commonplace as these transactions are, fraudulent ones can wreak havoc on the financial system and on the lives of those affected. Identifying fraudulent transactions is therefore an important area of study for banks and financial companies seeking to protect themselves and their users.

Using a dataset from Kaggle (https://www.kaggle.com/mlg-ulb/creditcardfraud/home) of credit card transactions made by European cardholders in 2013, my teammates and I analyzed various approaches to detecting fraudulent transactions. We investigated machine learning algorithms for building a classification system that can distinguish between fraudulent and non-fraudulent transactions.

Our solution was to build and compare two Support Vector Machine (SVM) classifiers: a linear SVM solved in its primal form and a non-linear SVM solved in its dual form. Each is a binary classifier that labels a transaction as fraudulent or non-fraudulent. In the process, we trimmed the data to a computationally manageable size, trained the classifiers, and tuned their hyperparameters (values we can adjust to produce a better result).
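As a rough illustration of the train-and-tune loop described above, here is a minimal sketch of a linear SVM trained on its primal objective by stochastic subgradient descent. It is not the project's actual code: the synthetic two-feature data, the function names, the step size, and the candidate values for the regularization strength lam are all assumptions made for the example.

```python
import random

def make_toy_data(n_legit=180, n_fraud=20, seed=0):
    """Synthetic stand-in for the transaction data (hypothetical, not the Kaggle set).
    Label -1 = non-fraudulent, +1 = fraudulent."""
    rng = random.Random(seed)
    data = [([rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)], -1) for _ in range(n_legit)]
    data += [([rng.gauss(2, 0.5), rng.gauss(2, 0.5)], 1) for _ in range(n_fraud)]
    rng.shuffle(data)
    return data

def train_linear_svm(data, lam=0.01, eta=0.01, epochs=50, seed=0):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by stochastic subgradient descent."""
    rng = random.Random(seed)
    w, b = [0.0] * len(data[0][0]), 0.0
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The regulariser's gradient shrinks w on every step.
            w = [wj * (1 - eta * lam) for wj in w]
            if margin < 1:  # hinge loss is active: step toward y * x
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b

def predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def error_rate(model, data):
    return sum(predict(model, x) != y for x, y in data) / len(data)

data = make_toy_data()
train, valid = data[:150], data[150:]

# Tune the regularisation strength (a hyperparameter) on the held-out split.
results = {lam: error_rate(train_linear_svm(train, lam=lam), valid)
           for lam in (0.001, 0.01, 0.1)}
best_lam = min(results, key=results.get)
print(best_lam, results[best_lam])
```

In practice a library implementation would likely replace the hand-written training loop. The sketch is only meant to show where the hyperparameter enters the objective and how a held-out split can be used to choose it.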

From our results, we determined that a classification algorithm can successfully separate fraudulent from non-fraudulent transactions. The error for the Linear/Primal SVM was smaller than that of the Non-Linear/Dual SVM, and the former took significantly less time to run. We therefore concluded that a linear classifier was a better fit for this use case.

In the real world, False Negatives (in which a transaction is incorrectly believed to be legitimate when it was actually fraudulent) are a significant concern, as people may be scammed out of thousands or millions of dollars. Our classifier produced a very small False Negative error ratio, so we concluded that it is a beneficial and rather safe classifier.
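To make the False Negative measure concrete, the sketch below counts the four confusion-matrix outcomes and computes the fraction of fraudulent transactions that were missed. It assumes the ratio is defined as FN / (FN + TP), which may differ from the definition used in the report. The labels are made-up illustration values, not project results, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the fraudulent class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1  # fraud that was predicted as non-fraudulent
        elif p == positive:
            fp += 1  # non-fraudulent transaction flagged as fraud
        else:
            tn += 1
    return tp, fp, tn, fn

def false_negative_rate(y_true, y_pred, positive=1):
    """Fraction of truly fraudulent transactions the classifier missed."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    return fn / (fn + tp) if (fn + tp) else 0.0

# Made-up labels for illustration: 1 = fraudulent, 0 = non-fraudulent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
fnr = false_negative_rate(y_true, y_pred)
print(fnr)  # 0.25: one of the four fraudulent transactions was missed
```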

Team members on this project were Utkarsh Agarwal, Dimple Dhawan, Kalpan Jasani, and Jonathan Poholarz. The full text of the report, along with visual aids, can be found in a link above.