Serial Number: 40
Year: 2016
Kind of Traffic: Real
Publicly Available: Yes
Count of Records: 114400 URLs
Features Count: 79
No. of citations: 233
Attack Type: URL attacks
Download Links: https://www.kaggle.com/datasets/teseract/urldataset
Abstract: Mamun et al. proposed a URL dataset that includes five types of URLs: (i) phishing, (ii) spam, (iii) benign, (iv) defacement, and (v) malware. The benign class contains 35300 URLs collected from Alexa's top websites. The spam class contains 12000 URLs taken from the WEBSPAM-UK2007 dataset. The phishing class contains 10000 URLs taken from OpenPhish, a repository of active phishing sites. The malware class contains 11500 URLs taken from DNS_BH, a maintained list of malware sites. The defacement class contains 45450 URLs taken from Alexa-ranked trusted websites that host hidden or fraudulent links.
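Class Balance (illustrative): The per-class counts in the abstract can be tallied to check the record count and see how the five classes are distributed. The sketch below is only a sanity check on the figures quoted in this entry; the dictionary name and keys are chosen for illustration and are not taken from the dataset files. The per-class figures sum to 114250, slightly below the 114400 listed under Count of Records, which may reflect rounding in the reported class sizes.

```python
# Per-class URL counts as quoted in the abstract of this entry.
# The names below are illustrative labels, not column names from the dataset.
class_counts = {
    "benign": 35300,
    "spam": 12000,
    "phishing": 10000,
    "malware": 11500,
    "defacement": 45450,
}

# Total number of URLs implied by the per-class figures.
total = sum(class_counts.values())

# Fraction of the dataset that each class represents.
shares = {name: count / total for name, count in class_counts.items()}

print(f"total URLs: {total}")
for name, count in sorted(class_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<11}{count:>7}  {shares[name]:.1%}")
```

Defacement and benign URLs together make up most of the records, so a classifier trained on the raw data may need class weighting or resampling for the smaller phishing, malware, and spam classes.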