Serial Number: 73
Year: 2020
Kind of Traffic: Real
Publicly Available: Yes
Count of Records: 484056 flows
Features Count: 48
No. of citations: 37
Attack Type: N/A
Download Links: https://github.com/Vibek/Anomaly_based_IDS
Abstract: The NetML dataset was released in 2020 for the “Open Challenge Network Traffic Analytics Using Machine Learning” workshop sponsored by Intel Corporation. It was created from 30 traffic captures obtained from Stratosphere IPS. Flow features are extracted in JSON format and written to the output file one flow per line. Metadata features are extracted for every flow; DNS, HTTP, and TLS features are extracted only if the flow sample contains packets for the corresponding protocol. Each flow is assigned a unique number that identifies it, and the label information taken from the raw traffic packet capture file is appended to the output JSON file. The dataset contains 484,056 flows and 48 feature attributes. Source and destination IP addresses are replaced with a masked string to hide the real addresses. The dataset is divided into training, test-std, and test-challenge sets.
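
Since the abstract describes the output as one JSON flow record per line, a reader could load it roughly as sketched below. This is a minimal illustration, not the dataset's official loader: the sample records, the key names (`id`, `label`, `num_pkts_in`, `sa`, `da`, `tls_cnt`), the `dns_`/`http_`/`tls_` prefix convention, and the helper names `load_flows` and `protocols_present` are all assumptions made for the example and may not match the real files.

```python
import json
from collections import Counter

# Hypothetical sample records; the real NetML field names may differ.
SAMPLE_LINES = [
    '{"id": 1, "label": "benign", "num_pkts_in": 12, "sa": "IP_masked", "da": "IP_masked"}',
    '{"id": 2, "label": "malware", "num_pkts_in": 4, "tls_cnt": 1, "sa": "IP_masked", "da": "IP_masked"}',
    '',
]


def load_flows(lines):
    """Parse one JSON object per non-empty line and return a list of dicts."""
    flows = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            flows.append(json.loads(line))
    return flows


def protocols_present(flow, prefixes=("dns_", "http_", "tls_")):
    """Return the assumed protocol groups for which the flow has at least one key."""
    return [p.rstrip("_") for p in prefixes if any(k.startswith(p) for k in flow)]


flows = load_flows(SAMPLE_LINES)
label_counts = Counter(f["label"] for f in flows)
```

Because `load_flows` accepts any iterable of strings, an open file handle can be passed in place of `SAMPLE_LINES` to read an actual flow file line by line.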