A Streamlit-based machine learning application that covers the ML workflow from data analysis to model deployment. Data exploration, preprocessing, model training, and prediction are all available through a single web interface.
- Interactive data upload and preview
- Automated data type detection and quality analysis
- Missing value visualization and handling
- Feature correlation analysis
- Automated data preprocessing pipeline
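The type detection and missing-value analysis listed above can be illustrated with a small, dependency-free sketch. This is not the application's own code, which is not shown in this README. The function name `summarize_columns` and the set of missing-value markers are assumptions made for the example.

```python
import csv
import io

# Assumed markers for a missing cell; the app may use a different set.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none"}


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize_columns(csv_text):
    """Report an inferred type and a missing-value count for every column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    names = reader.fieldnames or []
    summary = {name: {"missing": 0, "type": "empty"} for name in names}
    values = {name: [] for name in names}
    for row in reader:
        for name in names:
            cell = (row[name] or "").strip()
            if cell.lower() in MISSING_TOKENS:
                summary[name]["missing"] += 1
            else:
                values[name].append(cell)
    # A column is numeric only if every non-missing cell parses as a number.
    for name, cells in values.items():
        if cells:
            numeric = all(is_number(cell) for cell in cells)
            summary[name]["type"] = "numeric" if numeric else "categorical"
    return summary


sample = """date,feature1,feature2,target
2024-01-01,23.5,high,100
2024-01-02,,low,95
"""
report = summarize_columns(sample)
for column, info in report.items():
    print(column, info)
```

In this sketch a date column is reported as categorical because no date parsing is attempted; a fuller implementation would likely add that check.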
- Traditional Models
  - Ridge Regression
  - Lasso Regression
  - Random Forest
  - Gradient Boosting
  - SVR (Support Vector Regression)
- Advanced Models
  - XGBoost
  - LightGBM
  - Prophet (Time Series)
  - SARIMA (Time Series)
- Features
  - Automated feature preprocessing
  - Model performance metrics
  - Cross-validation support
  - Feature importance analysis
  - Interactive parameter tuning
- Feature distribution plots
- Correlation heatmaps
- Model performance comparisons
- Prediction vs Actual plots
- Time series forecasting plots
- Step-by-step guidance
- Best practices
- Troubleshooting tips
- Real-world examples
- Advanced topics
- Python 3.8 or higher
- pip package manager
- Clone the repository:
git clone https://github.com/yourusername/ml-swiss-army-knife.git
cd ml-swiss-army-knife
- Create and activate a virtual environment:
# Windows
python -m venv venv
venv\Scripts\activate
# macOS/Linux
python -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Launch the application:
streamlit run app.py
- Data Upload
# Example CSV structure
date,feature1,feature2,target
2024-01-01,23.5,high,100
2024-01-02,24.1,low,95
- Data Preprocessing
- Select features for encoding
- Handle missing values
- Scale numerical features
- Model Training
# Example model configuration
model_params = {
"n_estimators": 100,
"learning_rate": 0.1,
"max_depth": 5
}
- Evaluation & Prediction
- View performance metrics
- Analyze feature importance
- Make predictions on new data
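The preprocessing step in the list above (encoding, missing-value handling, scaling) can be sketched in plain Python to show what each operation does to a column. The helper names `impute_and_scale` and `one_hot` are hypothetical, and mean imputation with standard scaling is an assumed choice; the application may use different strategies or library transformers.

```python
from statistics import mean, pstdev


def impute_and_scale(values):
    """Fill missing entries (None) with the column mean, then standardize.

    Raises statistics.StatisticsError if every entry is missing.
    """
    observed = [v for v in values if v is not None]
    center = mean(observed)
    filled = [center if v is None else v for v in values]
    spread = pstdev(filled)
    if spread == 0:
        # A constant column carries no information after centering.
        return [0.0 for _ in filled]
    return [(v - center) / spread for v in filled]


def one_hot(values):
    """Encode a categorical column as one indicator per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


scaled = impute_and_scale([23.5, None, 24.1])
encoded = one_hot(["high", "low", "high"])
print(scaled)
print(encoded)
```

Imputing with the mean leaves the column mean unchanged, so the same `center` value can be reused for standardization.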
# Example Prophet configuration
prophet_params = {
"yearly_seasonality": True,
"weekly_seasonality": True,
"daily_seasonality": False
}
- Parameter tuning
- Cross-validation
- Feature selection
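As a rough illustration of the cross-validation item above, the sketch below generates shuffled k-fold train/test index splits using only the standard library. The function name `k_fold_indices` and its defaults are assumptions for this example and do not describe how the application splits data internally.

```python
import random


def k_fold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold CV."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


folds = list(k_fold_indices(10, n_splits=3))
for train, test in folds:
    print(len(train), len(test))
```

Shuffled folds suit tabular regression data. For time series models, splits that keep training data strictly earlier than test data are usually the better fit, since shuffling lets the model see future observations.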
- RAM: 8GB minimum (16GB recommended)
- Storage: 1GB free space
- Processor: Multi-core processor recommended
# Optional configuration
STREAMLIT_SERVER_PORT=8501
STREAMLIT_SERVER_ADDRESS=localhost
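Streamlit reads these environment variables on its own, so no application code is required to use them. If a helper script needs the same values, for example to print the URL it expects the app to be served on, a minimal sketch could look like the following. The function `server_settings` is hypothetical and is not part of this project or of Streamlit.

```python
import os


def server_settings(env=None):
    """Read the optional server settings, falling back to the values shown above."""
    env = os.environ if env is None else env
    port = int(env.get("STREAMLIT_SERVER_PORT", "8501"))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    address = env.get("STREAMLIT_SERVER_ADDRESS", "localhost")
    return {"port": port, "address": address}


settings = server_settings({"STREAMLIT_SERVER_PORT": "9000"})
print(f"http://{settings['address']}:{settings['port']}")
```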
ml-swiss-army-knife/
├── .streamlit/            # Streamlit configuration
│   └── config.toml        # Theme and settings
├── app.py                 # Main application
├── requirements.txt       # Dependencies
├── README.md              # Documentation
├── styles.py              # Custom styling
└── modules/               # Application modules
    ├── data_analysis.py   # Data analysis functionality
    ├── time_series.py     # Time series analysis
    ├── model_training.py  # Model training functionality
    ├── predictions.py     # Prediction functionality
    └── tutorial.py        # Tutorial content
- Start the application
- Upload your data (CSV format)
- Explore data insights automatically
- Train models with guided selection
- Make predictions with trained models
Located in .streamlit/config.toml:
[theme]
primaryColor = "#7C3AED"
backgroundColor = "#FFFFFF"
secondaryBackgroundColor = "#F3F4F6"
textColor = "#111827"
font = "sans serif"
# Sample data structure
sales_data = {
'date': ['2024-01-01', '2024-01-02'],
'sales': [1000, 1200],
'promotion': ['yes', 'no']
}
# Sample categorical features
category_data = {
'feature1': ['A', 'B', 'C'],
'feature2': [1, 2, 3],
'target': ['cat1', 'cat2', 'cat1']
}
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch:
git checkout -b feature/AmazingFeature
- Commit changes:
git commit -m 'Add AmazingFeature'
- Push to branch:
git push origin feature/AmazingFeature
- Open a Pull Request
- Installation Problems
# If you encounter SSL errors
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org -r requirements.txt
- Memory Issues
# Reduce memory usage by loading only part of the file
import pandas as pd
df = pd.read_csv('large_file.csv', nrows=1000)  # Load the first 1,000 rows for testing
- Large Datasets
- Use chunked processing
- Implement memory optimization
- Consider data sampling
- Model Training
- Start with simple models
- Use cross-validation
- Monitor resource usage
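The chunked-processing tip above can be sketched with the standard library alone: the file is read in fixed-size batches and only a running total is kept, so memory use stays bounded by the chunk size. The helper names `iter_chunks` and `chunked_mean` are assumptions for this example. With pandas, the `chunksize` argument of `pd.read_csv` offers a similar batch-by-batch pattern.

```python
import csv
import io
from itertools import islice


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from a csv reader."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def chunked_mean(handle, column, chunk_size=1000):
    """Compute a column mean while holding only one chunk in memory."""
    reader = csv.DictReader(handle)
    total, count = 0.0, 0
    for chunk in iter_chunks(reader, chunk_size):
        # Empty cells are treated as missing and skipped.
        values = [float(row[column]) for row in chunk if row[column]]
        total += sum(values)
        count += len(values)
    return total / count if count else None


# An in-memory file stands in for a large CSV on disk.
data = io.StringIO("date,sales\n2024-01-01,1000\n2024-01-02,1200\n2024-01-03,\n")
result = chunked_mean(data, "sales", chunk_size=2)
print(result)
```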
- v1.0.0 (2024-01-01)
  - Initial release
  - Basic model support
  - Data preprocessing
- v1.1.0 (2024-02-01)
  - Added time series support
  - Improved visualization
  - Bug fixes
- Create an issue on GitHub
- Email: [email protected]
- Documentation: [Wiki Link]
This project is licensed under the MIT License - see the LICENSE file for details.
- Streamlit team
- Scikit-learn community
- All contributors
Made with ❤️ by [Mlawali]