Dataset Details: Malicia

Dataset Information

Serial Number: 21

Year: 2014

Kind of Traffic: Real

Publicly Available: Yes

Count of Records: 11,363 samples

Features Count: 7

No. of citations: 120

Attack Type: N/A

Download Links: Not Available

Abstract: The Malicia dataset is publicly available and contains data collected from 7 March 2012 to 25 March 2013. It consists of five files. The main file is a MySQL database that holds all the information about the collected malware: collection time, source of the malware, malware classification, and details about the exploit servers. The other four files are a figure that captures the database schema, a tarball with the malware binaries, another tarball with icons extracted from those binaries, and a signature file for the Snort IDS produced by the FIRMA tool. The database consists of eight tables. The MILK table is the most important one, as it contains a row for each instance in which malware was collected from an exploit server. Each row includes the timestamp of the collection, the landing URL, and identifiers that link to other tables in the database. The FILES and LANDING_IP tables are also significant. The FILES table has a row for every unique malware binary, identified by its SHA1 hash, along with classification information: its network traffic, icon, and screenshot labels, and the final family label. The LANDING_IP table contains a row for each exploit server, identified by its landing IP, including details such as the installed exploit kit, the server's autonomous system number, and its country code. The malware tarball contains 11,363 samples in the form of .exe and .dll files. The icons tarball contains 5,777 icons extracted from the executable files. These icons are provided for convenience, since they can also be extracted from the malware binaries themselves.
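The relationship between the three tables described above can be illustrated with a minimal sketch. It is not the actual Malicia schema: the column names (file_id, ip_id, collected_at, family, exploit_kit, and so on), the helper function names, and all sample rows are assumptions made up for illustration, and an in-memory SQLite database stands in for the MySQL database so the example runs without a server. The real column names should be taken from the schema figure shipped with the dataset.

```python
import hashlib
import sqlite3


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 hex digest, the identifier FILES uses per binary."""
    return hashlib.sha1(data).hexdigest()


def build_demo_db() -> sqlite3.Connection:
    """Create a toy database with hypothetical MILK, FILES, LANDING_IP tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE files (
            file_id INTEGER PRIMARY KEY,
            sha1    TEXT UNIQUE,   -- one row per unique binary
            family  TEXT           -- final family label
        );
        CREATE TABLE landing_ip (
            ip_id       INTEGER PRIMARY KEY,
            ip          TEXT,      -- landing IP of the exploit server
            exploit_kit TEXT,
            asn         INTEGER,
            country     TEXT
        );
        CREATE TABLE milk (
            milk_id      INTEGER PRIMARY KEY,
            collected_at TEXT,     -- timestamp of the collection event
            landing_url  TEXT,
            file_id      INTEGER REFERENCES files(file_id),
            ip_id        INTEGER REFERENCES landing_ip(ip_id)
        );
        """
    )
    # Placeholder rows only; none of these values come from the dataset.
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?)",
        [
            (1, sha1_hex(b"placeholder-binary-a"), "family_a"),
            (2, sha1_hex(b"placeholder-binary-b"), "family_b"),
        ],
    )
    conn.executemany(
        "INSERT INTO landing_ip VALUES (?, ?, ?, ?, ?)",
        [
            (1, "192.0.2.1", "kit_x", 64496, "ZZ"),
            (2, "192.0.2.2", "kit_y", 64497, "ZZ"),
        ],
    )
    conn.executemany(
        "INSERT INTO milk VALUES (?, ?, ?, ?, ?)",
        [
            (1, "2012-03-07 10:00:00", "http://example.test/a", 1, 1),
            (2, "2012-03-08 11:00:00", "http://example.test/a", 1, 1),
            (3, "2012-03-09 12:00:00", "http://example.test/b", 2, 2),
        ],
    )
    return conn


def collections_per_kit_and_family(conn: sqlite3.Connection) -> list:
    """Count collection events per (exploit kit, family) by joining from MILK."""
    query = """
        SELECT l.exploit_kit, f.family, COUNT(*) AS n
        FROM milk AS m
        JOIN files AS f ON f.file_id = m.file_id
        JOIN landing_ip AS l ON l.ip_id = m.ip_id
        GROUP BY l.exploit_kit, f.family
        ORDER BY l.exploit_kit, f.family
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for kit, family, n in collections_per_kit_and_family(build_demo_db()):
        print(kit, family, n)
```

Because each MILK row is one collection event, the same binary can appear in several MILK rows while appearing only once in FILES. Joining from MILK outward therefore counts events rather than unique samples.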