A real-time NGINX anomaly detection and alert system.
Demo video
noginx_demovid.mp4
-
Make sure you have NGINX installed
-
It will automatically capture logs from "/var/log/nginx/access.log", this is where Unix stores its NGINX logs.
-
Make sure you are in the root directory ie noginx
-
Create a python virtual env using
python3 -m venv venv -
Download all the dependencies using
pip install -r requirements.txt -
Make a .env file, store it at root, it will include these fields:-
PREDICT_ONE_URL = "http://127.0.0.1:8000/predict_one" NOGINX_MAIL_ID = "***@gmail.com" (add yours) NOGINX_MAIL_PASSWORD = "**** **** **** ****" (Find out the "App password" for your gmail account after enabling 2fa and paste it here) RECEIVER_MAIL_ID = "***@gmail.com" (mail will be sent to this ID)
-
Make sure you are in the root directory
-
Start the server
-
cd fastapi_backend uvicorn api_server:app --reload
-
On another terminal, Start the watcher
-
cd watcher python3 watcher.py
- Model is trained on the NGINX logs dataset publicly available at nginx logs
- Model is trained using Isolation Forest algorithm.
- Model is saved at location noginx/data/model.pkl
- Raw logs is processed using parse_logs.py and CSV file is stored at location noginx/data/nginx_logs.csv
- Features are extracted from the csv file using process_csv.py
- Mapping of raw features(string) to integer is done and stored in encoder_mappings.json at noginx/data/encoder_mappings.json
-
FastAPI is used, and the server can be run via the file api_server.py at noginx/fastapi_backend/api_server.py
-
Gives two endpoints
-
-
/predictThis returns the result if anomaly true or not for multiple logs at once.
Request JSON format:
{ "features": [{ "status": 304, "size": 0, "method": 1, "path": 0, "user_agent": 54, "hour_of_day": 8 }] }
-
-
-
/predict_oneThis returns the result if anomaly true or not for just one log.
Request JSON format:
{ "feature": { "status": 304, "size": 0, "method": 1, "path": 0, "user_agent": 54, "hour_of_day": 8 } }
-
-
Response format
[ { "anomaly": false } ]
- It alerts the system by sending it a notification
- Also send a mail to the user containing all the necessary info of the anomaly.
- For company specific usecases, you will need to train the model on the company specific data.
- For that either contact me at Abhinav Jha else if you think you know the stuff then here are the steps
- Store all your NGINX logs at nginx.log at noginx/data/nginx.log
- RUN
python3 parse_logs.py - RUN
python3 process_csv.py - To train the model,
- RUN
python3 train_model.py - You can customise the model via the feature engineering or tweaking with the Isolation Forest.