In this paper, we present the performance of machine learning-based methods for detection of phishing sites. We employ 9 machine
learning techniques including AdaBoost, Bagging, Support Vector Machines, Classification and Regression Trees, Logistic Regression,
Random Forests, Neural Networks, Naive Bayes, and Bayesian Additive Regression Trees. We use each of these techniques to
combine existing heuristics into a detection method that distinguishes phishing sites from legitimate ones. We analyze
our dataset, which is composed of 1,500 phishing sites and 1,500 legitimate sites, classify them using the machine learning-based
detection methods, and measure the performance. In our evaluation, we use the f1 measure, error rate, and Area Under the
ROC Curve (AUC) as performance metrics along with our requirements for detection
methods. The highest f1 measure is 0.8581, the lowest error rate is 14.15%, and the highest AUC is 0.9342, all of which are observed in the case
of AdaBoost. We also observe that 7 out of 9 machine learning-based detection methods outperform the traditional detection
method.
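
For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how they are commonly computed for a binary classifier. It is not the evaluation code used in this paper. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions chosen for illustration only, with label 1 standing for a phishing site and 0 for a legitimate one.

```python
def f1_and_error_rate(y_true, y_pred):
    """Return (f1 measure, error rate) for binary labels, 1 = phishing."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # f1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Error rate is the fraction of sites that are misclassified.
    error_rate = (fp + fn) / len(y_true)
    return f1, error_rate


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen phishing site
    receives a higher score than a randomly chosen legitimate site
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the paper's dataset).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
# Assumed decision threshold of 0.5 for turning scores into labels.
predictions = [1 if s >= 0.5 else 0 for s in scores]

f1, err = f1_and_error_rate(labels, predictions)
area = auc(labels, scores)
print(f"f1={f1:.4f} error_rate={err:.4f} auc={area:.4f}")
```

The f1 measure and error rate depend on the chosen threshold, whereas AUC summarizes ranking quality over all thresholds, which is one reason to report all three.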