DNS Anomaly Detection Using Machine Learning - Team Presentation
DNS Anomaly Detection Using Machine Learning | Data Science for Cybersecurity Project

In this presentation, our team (Durga, Surabhi & Guru) walks through a complete machine learning pipeline to detect malicious domain names generated by Domain Generation Algorithms (DGA), a technique used by malware to secretly communicate with attacker-controlled servers.

What we cover:
• What is a DGA attack and why traditional blacklists fail
• Dataset: 20,000 real DGA domains + 50,000 Alexa legitimate domains = 70,000 records
• Exploratory Data Analysis (EDA): Shannon entropy as the key finding
• Feature extraction from raw domain name strings (8 features)
• Preprocessing: dropna, VarianceThreshold, StandardScaler, LabelEncoder
• PCA dimensionality reduction and variance analysis
• Model training: Decision Tree, Random Forest, SVM, ANN/MLP
• Evaluation: F1-score, Precision, Recall, Confusion Matrices
• Best result: ANN with an F1-score of 0.8999 (about 90%) on an 80:20 train/test split

Tools used: Python, Anaconda, Jupyter Notebook, Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn

Course: CIS6378 - Data Science for Cybersecurity, University of Houston (MS in Cybersecurity)
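As a rough illustration of the feature-extraction step described above, here is a minimal sketch of computing Shannon entropy and a few other string features from a domain name. The description does not list the project's 8 features, so the feature names below (`length`, `digit_ratio`, `vowel_ratio`, `unique_chars`), the helper `extract_features`, and the choice to analyse only the leftmost label are assumptions for illustration, not the team's actual code.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def extract_features(domain: str) -> dict:
    """Illustrative features only; the presentation's exact 8 features are not given."""
    # Assumption: analyse the leftmost label, e.g. "google" in "google.com".
    label = domain.lower().split(".")[0]
    n = len(label)
    letters = [ch for ch in label if ch.isalpha()]
    return {
        "length": n,
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / n if n else 0.0,
        "vowel_ratio": (
            sum(ch in VOWELS for ch in letters) / len(letters) if letters else 0.0
        ),
        "unique_chars": len(set(label)),
    }


if __name__ == "__main__":
    # "xkqjz7v2hqpd.com" is a made-up, random-looking name, not a real DGA sample.
    for name in ("google.com", "xkqjz7v2hqpd.com"):
        print(name, extract_features(name))
```

Random-looking names tend to use more distinct characters with flatter frequencies than dictionary-based names, which is why entropy is a natural feature to inspect during EDA. Feature dictionaries like these could be collected into a Pandas DataFrame before scaling and model training.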