Computer Science (3rd Year) | Data Science & Machine Learning | Cybersecurity Analytics
I build practical systems that treat real‑world activity (user behavior, system logs, and web data) as data, then apply Python and machine learning to analyze it, detect patterns, and surface anomalies. My work sits at the intersection of data analysis, ML models, and security monitoring, with an emphasis on deployable solutions rather than notebook‑only experiments.
- LinkedIn: https://www.linkedin.com/in/anand-raj-airafar
- Email: anandraj18110@gmail.com
- GitHub: https://github.com/AstroAirafar
- Programming & Data: Python, C++, Pandas, NumPy, scikit‑learn, EDA, Feature Engineering, Model Evaluation
- Machine Learning: Supervised & Unsupervised Learning, Deep Learning basics, NLP, Computer Vision, LLM concepts
- Security Analytics: Threat Intelligence, Threat Hunting, Incident Response concepts, Log Analysis, OSINT
- Tools: AWS, Git/GitHub, Jupyter, Google Colab, Kaggle, Linux (basic)
- Web: HTML, CSS, JavaScript, Web Scraping (BeautifulSoup, Selenium)
A production‑style unsupervised anomaly detection system that learns normal user activity patterns and flags suspicious events without labeled attack data.
Highlights
- Isolation Forest–based anomaly scoring
- Behavioral feature engineering (session duration, event frequency, time‑of‑day deviation)
- Percentile‑based thresholding
- FastAPI inference service
- Dockerized deployment with Nginx HTTPS reverse proxy
Stack: Python, scikit‑learn, FastAPI, Docker, Linux, Nginx
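The feature engineering and percentile thresholding steps listed above can be sketched roughly as follows. This is a minimal, standard-library illustration and not the project's actual code: the function names, the `typical_hour` baseline, and the 95th-percentile default are all assumptions. The Isolation Forest scoring step is left out, so the scores passed in are stand-ins where a higher value means more anomalous.

```python
from datetime import datetime
from statistics import quantiles


def session_features(timestamps, typical_hour=14.0):
    """Turn one session's event timestamps into three behavioral features.

    typical_hour is a hypothetical per-user baseline (hour of day, 0-24).
    """
    ordered = sorted(timestamps)
    duration_s = (ordered[-1] - ordered[0]).total_seconds()
    # Events per minute; a one-event session has zero duration, so use a
    # floor of one second to avoid dividing by zero.
    minutes = max(duration_s / 60.0, 1.0 / 60.0)
    events_per_min = len(ordered) / minutes
    start_hour = ordered[0].hour + ordered[0].minute / 60.0
    # Circular distance on a 24-hour clock, so 23:00 vs 01:00 is 2 hours.
    diff = abs(start_hour - typical_hour) % 24.0
    hour_deviation = min(diff, 24.0 - diff)
    return {
        "duration_s": duration_s,
        "events_per_min": events_per_min,
        "hour_deviation": hour_deviation,
    }


def percentile_threshold(scores, pct=95):
    """Return the score at the given percentile (1-99) of historical scores."""
    cuts = quantiles(scores, n=100, method="inclusive")  # 99 cut points
    return cuts[pct - 1]


def flag_anomalies(scores, threshold):
    """Flag every score strictly above the threshold (higher = more anomalous)."""
    return [s > threshold for s in scores]
```

With scikit‑learn, `IsolationForest.score_samples` returns lower values for more abnormal points, so those scores would need to be negated (or the comparison flipped) before applying a threshold like this one.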
An encoder‑decoder model that generates English descriptions for images, combining Computer Vision and NLP.
Key Work
- Transfer learning using InceptionV3 (2048‑dim feature vectors)
- LSTM decoder for sequence generation
- Custom preprocessing & tokenization pipeline
- Greedy Search and Beam Search inference
- Evaluation with BLEU scores
Dataset: Flickr8k (~8K images, 40K+ captions)
Stack: TensorFlow/Keras, NLTK, NumPy, Pandas, Matplotlib, Google Colab
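To illustrate the difference between the two inference strategies listed above, here is a small sketch of greedy and beam search over a toy next-token table. The `TOY_MODEL` table and its probabilities are invented for illustration; in the real project the distribution would come from the LSTM decoder conditioned on InceptionV3 image features.

```python
import math

# Hypothetical toy "decoder": maps the tokens generated so far to a
# probability distribution over the next token.
TOY_MODEL = {
    (): {"a": 0.6, "the": 0.4},
    ("a",): {"dog": 0.55, "cat": 0.45},
    ("the",): {"dog": 0.9, "cat": 0.1},
    ("a", "dog"): {"<end>": 1.0},
    ("a", "cat"): {"<end>": 1.0},
    ("the", "dog"): {"<end>": 1.0},
    ("the", "cat"): {"<end>": 1.0},
}


def next_probs(prefix):
    """Look up the next-token distribution for a token prefix."""
    return TOY_MODEL[tuple(prefix)]


def greedy_search(max_len=10):
    """Pick the single most probable token at every step."""
    seq = []
    while len(seq) < max_len:
        probs = next_probs(seq)
        token = max(probs, key=probs.get)
        if token == "<end>":
            break
        seq.append(token)
    return seq


def beam_search(beam_width=2, max_len=10):
    """Keep the beam_width best partial captions by total log-probability."""
    beams = [([], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, logp in beams:
            for token, p in next_probs(tokens).items():
                score = logp + math.log(p)
                if token == "<end>":
                    finished.append((tokens, score))
                else:
                    candidates.append((tokens + [token], score))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    if not finished:
        finished = beams  # nothing reached <end>; fall back to partial beams
    return max(finished, key=lambda c: c[1])[0]
```

In this toy table, greedy search commits to "a" because it is the likeliest first word, while a beam of width 2 also keeps "the" alive and ends up with the higher-probability caption overall. A beam width of 1 reduces to greedy search.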
End‑to‑end data pipeline: scraping → cleaning → analysis → visualization for Hyundai listings.
Work Done
- Selenium scraping of dynamic pages and pagination
- Regex‑based data extraction and normalization
- Cleaning mileage and price formats (k, lakh, crore → numeric INR)
- Statistical analysis and correlation studies
- 9 visualizations (price distribution, fuel trends, depreciation, correlation heatmap)
Stack: Python, Selenium, Pandas, NumPy, Matplotlib, Seaborn
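The price and mileage cleaning step above (k, lakh, crore → numeric) might look something like the sketch below. The regex, the `to_number` helper, and the sample input strings are assumptions for illustration; the actual listing formats in the scraped data may differ.

```python
import re

# Multipliers for the unit suffixes mentioned above (Indian numbering system).
UNIT_MULTIPLIERS = {
    "k": 1_000,
    "lakh": 100_000,
    "crore": 10_000_000,
}

# A number (optional commas and decimals) followed by an optional unit word.
_AMOUNT_RE = re.compile(
    r"(\d[\d,]*(?:\.\d+)?)\s*(k|lakhs?|crores?)?\b",
    re.IGNORECASE,
)


def to_number(text):
    """Parse strings like '₹ 5.5 Lakh' or '45k km' into a float, or None."""
    match = _AMOUNT_RE.search(text)
    if match is None:
        return None
    value = float(match.group(1).replace(",", ""))
    # Normalize the unit: lowercase and drop a plural "s" ("lakhs" -> "lakh").
    unit = (match.group(2) or "").lower().rstrip("s")
    return value * UNIT_MULTIPLIERS.get(unit, 1)
```

For example, `to_number("₹ 5.5 Lakh")` gives `550000.0` and `to_number("1,20,000 km")` gives `120000.0`. Returning `None` for unparseable strings keeps bad rows visible, so they can be counted or dropped explicitly in Pandas rather than silently becoming zeros.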
Interactive case‑study assessment platform with real‑time grading and analytics dashboard.
Features
- 10‑question quiz engine with server‑side grading
- UUID‑based result tracking
- Admin analytics dashboard (scores, accuracy, trends)
- Previously deployed with HTTPS and process management
Stack: Node.js, Express, SQLite, HTML/CSS/JS, Nginx, PM2
- Anomaly detection techniques
- ML model deployment and monitoring
- Security telemetry analysis (logs as datasets)
- System design fundamentals
- Exploring machine behavior and multimodal monitoring concepts
- Build deployable ML systems, not only notebooks
- Work with real datasets and operational constraints
- Combine Data Science with Security Analytics
- Start as a Junior Data/ML Engineer or Security Analyst and grow into system‑level engineering roles
This GitHub profile documents my learning journey and applied experiments. Some repositories are exploratory by design: the goal is to understand systems end‑to‑end (data → model → service → deployment), not just to train models in isolation.
