Serial Number: 110
Year: 2023
Kind of Traffic: Simulated + Real
Publicly Available: Yes
Count of Records: 1,000,000 samples
Features Count: 21
No. of citations: 19
Attack Type: DDoS, Brute Force, etc.
Download Links: Not Available
Abstract: The IDSAI dataset is a recently developed, well-balanced dataset designed for assessing how effectively supervised machine learning methods identify intrusions by analysing network traffic captures in IoT communications. Built in a real-world attack setting, the dataset totals 1,000,000 samples. It initially had twenty-four (24) attributes; during pre-processing, features that attackers can manipulate, such as ports and IP addresses, were removed, and the dataset was refined to nineteen (19) variables and two label columns, giving a 1,000,000 × 21 matrix. The dataset is evenly split between non-intrusion (500,000 samples) and intrusion (500,000 samples) categories. The intrusion class contains ten distinct types, each represented by 50,000 samples: SYN/ACK and RST Flooding, ICMP echo request Flood/Ping Flood, SYN/ACK Flooding, ARP spoofing, SYN Flooding faster, DDoS MAC Flood, IP Fragmentation, Brute Force SSH, TCP Null, and UDP port scan. The dataset has been made publicly available for research purposes.
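Illustrative check: the figures stated in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not part of the dataset or its tooling; the constant and function names are hypothetical, and the class label "non-intrusion" is an assumed placeholder rather than the label actually used in the files.

```python
# Composition of the dataset as stated in the abstract.
# All identifiers here are illustrative; they are not taken from the dataset files.
INTRUSION_TYPES = [
    "SYN/ACK and RST Flooding",
    "ICMP echo request Flood/Ping Flood",
    "SYN/ACK Flooding",
    "ARP spoofing",
    "SYN Flooding faster",
    "DDoS MAC Flood",
    "IP Fragmentation",
    "Brute Force SSH",
    "TCP Null",
    "UDP port scan",
]
SAMPLES_PER_INTRUSION_TYPE = 50_000
NON_INTRUSION_SAMPLES = 500_000
FEATURE_COLUMNS = 19  # variables kept after pre-processing
LABEL_COLUMNS = 2     # label columns


def expected_shape():
    """Return the (rows, columns) implied by the stated class and column counts."""
    intrusion_samples = len(INTRUSION_TYPES) * SAMPLES_PER_INTRUSION_TYPE
    rows = NON_INTRUSION_SAMPLES + intrusion_samples
    cols = FEATURE_COLUMNS + LABEL_COLUMNS
    return rows, cols


def expected_class_counts():
    """Return the per-class sample counts implied by the abstract."""
    counts = {name: SAMPLES_PER_INTRUSION_TYPE for name in INTRUSION_TYPES}
    counts["non-intrusion"] = NON_INTRUSION_SAMPLES  # assumed label name
    return counts


print(expected_shape())  # (1000000, 21)
```

Comparing these expected counts against the per-class counts of a downloaded copy would be one way to confirm that the copy matches the description above.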