CryptoMLOps is a fully automated MLOps pipeline for cryptocurrency time series forecasting. The project combines:
- Data ingestion from CSV or API
- Data preprocessing & feature engineering
- Anomaly detection and volatility regime detection
- ARIMA + GARCH model training
- Backtesting & drift detection
- Forecasting with uncertainty intervals
- Model monitoring & dashboard generation
- MLflow logging & experiment tracking
- Automated notification & cleanup
The pipeline is designed for reproducibility, monitoring, and continuous learning. Built with Python, Airflow, MLflow, and Docker, it is ready for production deployment.
- Fetch, preprocess, and validate cryptocurrency datasets
- Detect anomalies and regime changes in market behavior
- Train ARIMA + GARCH models and log metrics & artifacts to MLflow
- Backtest models using walk-forward validation
- Detect dataset drift using PSI and KS statistics
- Generate forecasts with uncertainty intervals
- Blend multiple models for ensemble predictions
- Explain predictions using SHAP for tree models
- Monitor model performance and alert on issues
- Automated cleanup of old runs and artifacts
- Fully orchestrated via Apache Airflow DAGs
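Several of these steps lend themselves to short sketches. For example, PSI-based drift detection compares the binned distributions of a reference sample and a current sample (a minimal illustration only — the helper name and bin count here are assumptions; the project's actual logic lives in src/drift_detector.py):

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample and a current sample.

    Bin edges come from the reference sample's quantiles, so each
    reference bin holds roughly the same number of points.
    """
    edges = np.quantile(expected, np.linspace(0.0, 1.0, bins + 1))
    # Clip the current sample into the reference range so every point lands in a bin.
    actual = np.clip(actual, edges[0], edges[-1])
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the fractions so the log term stays finite for empty bins.
    e_frac = np.clip(e_frac, 1e-6, None)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift.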
┌─────────────────────────┐
│       Fetch Data        │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│      Preprocessing      │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│    Anomaly Detection    │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│    Regime Detection     │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│    Train ARIMA+GARCH    │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│       Backtesting       │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│     Drift Detection     │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│  Forecast & Dashboard   │
└────────────┬────────────┘
             │
┌────────────▼────────────┐
│ Notifications & Cleanup │
└─────────────────────────┘
- Python 3.11
- Apache Airflow 2.8.1 (DAG orchestration)
- MLflow (experiment tracking & model registry)
- Docker & Docker Compose (containerization)
- Pandas / NumPy / Scikit-learn (data processing & ML)
- Statsmodels / Arch (ARIMA + GARCH)
- Joblib (model serialization)
- Matplotlib / Seaborn (visualization)
- SHAP (feature importance)
- Optional: SMTP for notifications
- Clone the repository:
  git clone https://github.com/gamzeakkurt/cryptomlops.git
  cd cryptomlops
- Build and start containers:
  docker-compose build
  docker-compose up
- Access services:
  - Airflow: http://localhost:8081
  - MLflow: http://localhost:5050
  - FastAPI app (if deployed): http://localhost:8001
.
├── dags/ # Airflow DAGs
│ └── ml_pipeline_full.py
├── src/ # Python modules for each task
│ ├── data_pipeline.py
│ ├── train.py
│ ├── backtest.py
│ ├── drift_detector.py
│ ├── detect_anomalies.py
│ ├── detect_regime.py
│ ├── forecast_with_uncertainty.py
│ ├── explainability.py
│ ├── should_retrain.py
│ ├── register_model.py
│ ├── generate_dashboard.py
│ ├── notify_team.py
│ └── cleanup_old_runs.py
├── data/ # Raw and processed data
├── models/ # Saved models
├── mlruns/ # MLflow experiment logs
├── dockerfile.airflow # Airflow Dockerfile
├── Dockerfile # API / main app Dockerfile
├── docker-compose.yml
├── requirements.txt
└── README.md
- Place your historical crypto data in ./data/ (CSV format)
- Start Docker Compose as above
- The Airflow DAG mlops_crypto_full_pipeline will automatically execute: fetch → preprocess → anomaly → regime → train → backtest → drift → forecast → dashboard → notify
- Check the MLflow UI for metrics and artifacts
- Check the dashboard in ./data/dashboard.png
- Models are saved in ./models
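Backtesting uses walk-forward validation; a generic, simplified sketch of the split logic (the real implementation is src/backtest.py, and the window sizes here are arbitrary):

```python
import numpy as np

def walk_forward_splits(n_obs, initial_train, test_size):
    """Yield expanding-window (train_idx, test_idx) pairs.

    Each fold trains on everything seen so far and tests on the next
    test_size observations, so no fold ever looks into the future.
    """
    start = initial_train
    while start + test_size <= n_obs:
        yield np.arange(start), np.arange(start, start + test_size)
        start += test_size

splits = list(walk_forward_splits(n_obs=100, initial_train=60, test_size=10))
```

Each fold's model is refit on the training indices and scored on the test indices; the per-fold metrics then feed the drift and retraining checks.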
- Change ARIMA/GARCH parameters directly in train.py
- MLflow allows logging multiple experiments & parameters
- Fork the repository
- Create a feature branch: git checkout -b feature/my-feature
- Commit changes: git commit -am 'Add new feature'
- Push to the branch: git push origin feature/my-feature
- Open a pull request
MIT License – See LICENSE