This project simulates the data analysis and database management work I carried out during my internship. The goal is to forecast retail sales with several time series models (ARIMA, VAR, SARIMA) and to manage the data efficiently in PostgreSQL.
- Overview
- Project Structure
- Data Simulation
- Time Series Modeling
- Database Management with PostgreSQL
- Usage
- Dependencies
- License
    Retail-Sales-Forecasting/
    │
    ├── data/              # Folder containing simulated datasets
    ├── notebooks/         # Jupyter Notebooks with detailed analysis and modeling
    ├── scripts/           # Python scripts for data generation and model automation
    ├── sql/               # SQL scripts for database management (PostgreSQL)
    ├── .venv/             # Virtual environment (added to .gitignore, not included in repo)
    ├── requirements.txt   # Python dependencies
    ├── README.md          # Project documentation
    └── LICENSE            # License file
The dataset simulates weekly retail sales (e.g., chicken sales) over several years, with additional variables such as temperature and promotional activity. The goal is to recreate a realistic scenario where external factors (like weather or promotions) influence sales trends.
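A minimal sketch of how such a dataset could be generated is shown here. The function name `simulate_weekly_sales`, the column names, and every coefficient (baseline sales, temperature effect, promotion uplift, noise levels) are assumptions chosen for illustration; they are not the values used in the project's scripts.

```python
import numpy as np
import pandas as pd


def simulate_weekly_sales(n_weeks=208, seed=42):
    """Return a hypothetical weekly dataset of sales, temperature, and promotions."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2019-01-06", periods=n_weeks, freq="W-SUN")
    week = np.arange(n_weeks)

    # Assumed annual temperature cycle (degrees Celsius) with random noise.
    temperature = 15 + 10 * np.sin(2 * np.pi * (week - 13) / 52)
    temperature = temperature + rng.normal(0, 2, n_weeks)

    # Assumed promotion flag: roughly one week in five has a promotion.
    promotion = rng.binomial(1, 0.2, n_weeks)

    # Sales = baseline + slow trend + temperature effect + promotion uplift + noise.
    sales = (
        500
        + 0.5 * week
        + 4 * temperature
        + 80 * promotion
        + rng.normal(0, 20, n_weeks)
    )

    frame = pd.DataFrame(
        {"sales": sales, "temperature": temperature, "promotion": promotion},
        index=dates,
    )
    return frame.rename_axis("week")


if __name__ == "__main__":
    print(simulate_weekly_sales().head())
```

Fixing the seed keeps the simulated data reproducible, so that model comparisons are made on the same series each time.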
- The ARIMA model is used for univariate analysis on sales data to understand past sales behavior and make predictions based solely on historical values.
- The VAR model includes multiple variables such as temperature and promotions to forecast sales. This approach is effective when multiple time series influence each other.
- The SARIMA model incorporates seasonality, capturing recurring patterns (e.g., annual cycles) in the sales data that a non-seasonal ARIMA model does not model explicitly.
The project includes database management tasks, where a PostgreSQL database is designed and deployed to store and manage sales and log data efficiently. The schema design is structured to allow for easy integration of daily logs and data analysis.
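As an illustration of what such a schema might look like, the sketch below defines two tables with SQLAlchemy Core (version 1.4 or later is assumed). The table names `weekly_sales` and `daily_logs`, their columns, and the helper `build_database` are hypothetical; the project's real schema lives in the SQL scripts. An in-memory SQLite URL is used as a stand-in so the example runs without a server; a PostgreSQL URL of the form `postgresql://user:password@host/dbname` would be passed instead in practice.

```python
from datetime import date, datetime

from sqlalchemy import (
    Boolean,
    Column,
    Date,
    DateTime,
    Float,
    Integer,
    MetaData,
    String,
    Table,
    create_engine,
    func,
    insert,
    select,
)

metadata = MetaData()

# Hypothetical table holding one row per week of sales data.
weekly_sales = Table(
    "weekly_sales",
    metadata,
    Column("week_start", Date, primary_key=True),
    Column("sales", Float, nullable=False),
    Column("temperature", Float),
    Column("promotion", Boolean, nullable=False, default=False),
)

# Hypothetical table for daily log entries.
daily_logs = Table(
    "daily_logs",
    metadata,
    Column("id", Integer, primary_key=True, autoincrement=True),
    Column("logged_at", DateTime, nullable=False),
    Column("message", String(255), nullable=False),
)


def build_database(url="sqlite:///:memory:"):
    """Create the tables on the given database URL and return the engine."""
    engine = create_engine(url)
    metadata.create_all(engine)
    return engine


if __name__ == "__main__":
    engine = build_database()
    with engine.begin() as conn:
        conn.execute(
            insert(weekly_sales),
            [
                {"week_start": date(2020, 1, 5), "sales": 510.0,
                 "temperature": 4.5, "promotion": False},
                {"week_start": date(2020, 1, 12), "sales": 620.0,
                 "temperature": 6.0, "promotion": True},
            ],
        )
        conn.execute(
            insert(daily_logs),
            [{"logged_at": datetime(2020, 1, 6, 9, 0), "message": "weekly load done"}],
        )
        average = conn.execute(select(func.avg(weekly_sales.c.sales))).scalar()
        print(average)
```

Keeping logs in their own table, separate from the sales measurements, is one way to let daily log entries accumulate without touching the data used for modeling.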
- Clone the repository.
- Install the dependencies listed in `requirements.txt` (for example, with `pip install -r requirements.txt`).
- Run the Jupyter Notebooks in the `/notebooks` folder to see the detailed analysis and modeling.
- Simulate Data: Use the scripts in the `/scripts` folder to generate and manipulate datasets.
- Database Setup: SQL scripts in the `/sql` folder help set up and manage the PostgreSQL database.
- Python 3.8+
- pandas
- numpy
- matplotlib
- statsmodels
- SQLAlchemy (for database interaction)
- PostgreSQL
This project is licensed under the MIT License. See the LICENSE file for details.