Dataset Details: CUPID

Dataset Information

Serial Number: 68

Year: 2019

Kind of Traffic: Real

Publicly Available: Yes

Count of Records: 50 GB (total dataset size)

Features Count: 84

No. of citations: 1003

Attack Type: DoS, R2L, U2R, Probe

Download Links: Not Available

Abstract: CUPID is a publicly available dataset whose distinguishing feature is the inclusion of human-guided traffic. It was developed with the help of ten ethical penetration testers (pentesters). The ten pentesters were observed performing the same activities as the scripted users for one hour, after which malicious traffic was captured for a subsequent hour, or until the pentester stopped operating (generally, once the timer had expired and the server had been successfully exploited). Traffic was labelled on the basis of the Kali instance's IP address: benign traffic with '0' and malicious traffic with '1'. One of the 24-hour baseline data samples was collected on the first sampling day in April 2019. Its raw 042219 1000.pcapng data file contains 4,346,077 packets and takes up about 3.3 GB of storage; the whole CUPID dataset is around 50 GB. The sample contains 179 distinct hosts outside the local address space. It consists primarily of TCP packets (75%) using protocols such as Distributed Computing Environment (DCE) / Remote Procedure Call (RPC), Internet Control Message Protocol (ICMP), Hypertext Transfer Protocol (HTTP), Kerberos, Simple Mail Transfer Protocol (SMTP), Network Basic Input/Output System (NetBIOS) / Server Message Block (SMB), and Lightweight Directory Access Protocol (LDAP). The pcap also includes UDP-based data (7.5%), including Domain Name System (DNS) and Network Time Protocol (NTP) traffic. The remaining traffic is made up of addressing protocols such as ARP and 802.1Q Virtual LAN. CUPID contains a large variety of protocols because of enterprise-specific services such as DNS lookups, email, and Active Directory access. SSL/TLS traffic accounts for 25,341 packets (0.6% of the total).
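The IP-based labelling rule described in the abstract can be illustrated with a minimal Python sketch. It is not the dataset authors' tooling. The address in `KALI_IP`, the sample packet tuples, and the choice to match on either endpoint (source or destination) are assumptions made for illustration, since this entry only says that labels were assigned on the basis of the Kali instance's IP address. The last lines re-derive the SSL/TLS share from the packet counts quoted above.

```python
from ipaddress import ip_address

# Hypothetical attacker address (documentation range); the real Kali
# instance IP is not given in this entry.
KALI_IP = ip_address("192.0.2.10")

def label_packet(src: str, dst: str, kali_ip=KALI_IP) -> int:
    """Return 1 (malicious) if either endpoint is the Kali instance, else 0 (benign).

    Matching on both source and destination is an assumption of this sketch.
    """
    return int(ip_address(src) == kali_ip or ip_address(dst) == kali_ip)

# Made-up (source, destination) pairs, for illustration only.
packets = [
    ("192.0.2.10", "198.51.100.5"),    # sent by the Kali host
    ("198.51.100.7", "198.51.100.5"),  # ordinary enterprise traffic
    ("198.51.100.5", "192.0.2.10"),    # reply to the Kali host
]
labels = [label_packet(src, dst) for src, dst in packets]

# Share of SSL/TLS packets in the baseline sample, from the counts in the abstract.
tls_share_percent = round(25_341 / 4_346_077 * 100, 1)
```

With these sample pairs the first and third packets are labelled 1 and the second 0, and the computed SSL/TLS share rounds to the 0.6% quoted in the abstract.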