Over the past year, Holmes-Processing has acquired a large dataset of more than 50k labeled malware samples, which can be used for deep-learning-based malware relationship mining and should substantially help with malware relationship detection. In addition, as a result of GSoC’17, we have an efficient data model for malware relationships.
The goals of this project are therefore to:
- implement a learning model that predicts the label of each malware sample with reasonable accuracy
- discover relationships between different malware samples
- visualize these relationships in the frontend
- and build an analytics pipeline that integrates the implemented services.