Contributor: Hinna Zeejah

ML-based web attack classification project for TANNER

Mentors: Evgeniia Tokarchuk
Organization: The Honeynet Project
Technologies: Python, machine learning, Git, TensorFlow, Docker, NumPy, PyTorch, scikit-learn, pandas, FastAPI, XGBoost/LightGBM, OWASP Dependency-Check, Bandit
Topics: machine learning, cybersecurity, data preprocessing, data annotation, imbalanced data, model evaluation, real-time analysis, web attack detection, AI in security, system integration

Increasingly sophisticated web attacks challenge current detection mechanisms, which often rely on traditional regex-based approaches that struggle with accuracy and latency. This project proposes integrating a Machine Learning (ML) classifier into TANNER, a tool for detecting web-based attacks. Using ML models for text classification, including Convolutional Neural Networks (CNNs) and Transformer-based models such as BERT, the project aims to improve TANNER's efficiency in real-time attack analysis. The deliverables are a detailed project plan, a data preprocessing tool, baseline and enhanced ML models, integration prototypes, comprehensive documentation, and a final presentation. The work is organized around milestones that run from initial dataset analysis and model development through final integration and testing. Beyond advancing TANNER's capabilities, the project aims to give the broader cybersecurity community a scalable, effective approach to web attack detection.