An AI-powered personal study and productivity backend — built to learn, built to grow.
Atlas Project is a REST API that helps me track, analyse, and predict my own study habits. It started as a Pomodoro timer. It became a machine learning project. The system logs every study session to a SQLite database, exposes a FastAPI backend for CRUD operations and analytics, and serves a trained classification model that predicts — given a subject, duration, and time of day — whether a session will be completed or interrupted. I built this to learn Python engineering properly: no vibe-coding, no AI-generated functions. Every line is mine.
| Endpoint | Method | Description |
|---|---|---|
| `/sessions` | GET | List all study sessions |
| `/sessions` | POST | Log a new session |
| `/sessions/{id}` | GET | Retrieve one session |
| `/sessions/{id}/complete` | PATCH | Mark a session as completed |
| `/sessions/{id}` | DELETE | Remove a session |
| `/predict` | POST | ML prediction: will this session be completed? |
| `/analytics` | GET | Aggregated stats: time by subject, completion rate, trends |
| `/docs` | GET | Auto-generated Swagger UI (FastAPI built-in) |
| Layer | Technology | Why |
|---|---|---|
| Language | Python 3.11 | Primary language for AI/DS engineering roles |
| API framework | FastAPI + Uvicorn | Async, typed, auto-documentation, industry standard |
| Database | SQLite + SQLAlchemy | Simple persistence with a real ORM |
| Data analysis | Pandas + Matplotlib | Aggregation and visualisation of session data |
| Machine Learning | scikit-learn | Classification model (RandomForestClassifier) |
| Model serialisation | joblib | Save and reload trained model between restarts |
| Validation | Pydantic v2 | Request/response schemas, type safety at the boundary |
```
atlas-dev-os/
├── backend/
│   ├── api/
│   │   ├── routes.py                # all FastAPI endpoints
│   │   └── schemas.py               # Pydantic request/response models
│   ├── core/
│   │   ├── database.py              # SQLAlchemy engine, session, Base
│   │   ├── models.py                # ORM table definitions
│   │   ├── crud.py                  # create / read / update / delete
│   │   └── ml.py                    # model loading, feature engineering, predict()
│   └── main.py                      # app factory, router registration
├── ml/
│   ├── train.py                     # training script — run once to produce model
│   ├── evaluate.py                  # accuracy, classification report, feature importance
│   └── completion_model.pkl         # serialised RandomForestClassifier
├── notebooks/
│   ├── 01_eda.ipynb                 # exploratory data analysis on session CSV
│   └── 02_model_selection.ipynb     # comparing Logistic Regression vs Random Forest
├── scripts/
│   └── study_timer.py               # CLI Pomodoro timer — the data source
├── tests/
│   ├── test_crud.py
│   ├── test_routes.py
│   └── test_ml.py
├── .env.example                     # copy to .env and fill in your keys
├── requirements.txt
└── README.md
```
Requirements: Python 3.11+, Git
```bash
# 1. Clone
git clone https://github.com/xdmanflow/atlas-project.git
cd atlas-project

# 2. Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment variables
cp .env.example .env
# Edit .env and add your keys if needed

# 5. Run the API
uvicorn backend.main:app --reload

# 6. Open the interactive docs
# http://localhost:8000/docs
```

Train the ML model (required before using `/predict`):

```bash
# First generate some sessions with the timer
python scripts/study_timer.py --sessions 4

# Then train the model on your data
python ml/train.py
# Outputs: ml/completion_model.pkl
```

Problem: binary classification — will a study session be completed?
Features used:
- `duration_minutes` — longer sessions correlate with lower completion
- `start_hour` — time of day affects focus (encoded from `start_time`)
- `subject_encoded` — some subjects are harder to stay focused on (`LabelEncoder`)
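A minimal sketch of that feature engineering on toy rows (the column names and raw schema here are assumptions, not necessarily the project's actual CSV/database layout):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy rows standing in for logged sessions (schema is illustrative)
df = pd.DataFrame([
    {"subject": "Python",  "start_time": "2025-01-06T09:00:00", "duration_minutes": 25, "completed": 1},
    {"subject": "FastAPI", "start_time": "2025-01-06T21:30:00", "duration_minutes": 50, "completed": 0},
    {"subject": "Python",  "start_time": "2025-01-07T10:15:00", "duration_minutes": 25, "completed": 1},
])

# start_hour: extract the hour of day from the timestamp
df["start_hour"] = pd.to_datetime(df["start_time"]).dt.hour

# subject_encoded: map subject names to integer codes (sorted alphabetically)
encoder = LabelEncoder()
df["subject_encoded"] = encoder.fit_transform(df["subject"])

features = df[["duration_minutes", "start_hour", "subject_encoded"]]
print(features)
```

Note that `LabelEncoder` must be fitted once and reused at prediction time, otherwise the integer codes shift when new subjects appear.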
Training:

```
Dataset size    : 200+ logged sessions
Train/test split: 80% / 20% (random_state=42)
Best model      : RandomForestClassifier(n_estimators=100)
Test accuracy   : ~78% (improves with more logged data)
```
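The training recipe can be sketched roughly as follows, using synthetic data in place of the real session log (`ml/train.py` is the authoritative version; the label rule below is invented purely so the example runs):

```python
import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for ~200 logged sessions:
# columns = [duration_minutes, start_hour, subject_encoded]
rng = np.random.default_rng(42)
X = np.column_stack([
    rng.integers(10, 90, 200),   # duration_minutes
    rng.integers(7, 23, 200),    # start_hour
    rng.integers(0, 5, 200),     # subject_encoded
])
# Fabricated label: shorter sessions are more likely to be completed
y = (X[:, 0] < 40).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print(f"test accuracy: {accuracy_score(y_test, model.predict(X_test)):.2f}")

# Serialise so the API can reload the model after a restart
joblib.dump(model, "completion_model.pkl")
```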
Example prediction request:

```bash
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"subject": "FastAPI", "duration_minutes": 25, "start_hour": 9}'
```

Response:

```json
{
  "will_complete": true,
  "confidence": 0.84
}
```

Example `/analytics` response:

```json
{
  "total_sessions": 47,
  "completed_sessions": 38,
  "completion_rate": 0.81,
  "total_minutes": 1025,
  "avg_duration_minutes": 26.4,
  "top_subjects": [
    { "subject": "Python", "total_minutes": 325 },
    { "subject": "FastAPI", "total_minutes": 200 },
    { "subject": "ML basics", "total_minutes": 175 }
  ]
}
```

This project was a deliberate practice run across the full Python engineering stack.
Starting from a simple CLI script (`study_timer.py`), I progressively added:

- OOP layer — refactored the timer into `StudySession`, `DeepWorkSession`, and `BreakSession` classes with proper inheritance, `__str__`/`__repr__`, and `to_csv_row()` serialisation.
- HTTP layer — built a `requests`-based API client before writing my own server, so I understood what "an endpoint" actually is from the client side first.
- Functional layer — rewrote data processing pipelines using `map`, `filter`, `functools.reduce`, and `itertools.groupby`, understanding lazy evaluation in the process.
- API layer — built a FastAPI backend with Pydantic v2 validation, proper HTTP status codes, and path/query parameter handling.
- Persistence layer — replaced in-memory storage with SQLAlchemy + SQLite, implementing full CRUD with proper session management.
- Analytics layer — loaded the database into Pandas DataFrames to compute aggregates and generate charts.
- ML layer — engineered features from raw session data, compared two classifiers, evaluated with `classification_report`, and serialised the best model with `joblib`.
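As an illustration of the kind of Pandas aggregation behind `/analytics` (toy rows here; the real implementation reads the SQLite database, and the field names mirror the example response above):

```python
import pandas as pd

# Toy session log standing in for the database contents
df = pd.DataFrame([
    {"subject": "Python",  "duration_minutes": 25, "completed": True},
    {"subject": "Python",  "duration_minutes": 50, "completed": True},
    {"subject": "FastAPI", "duration_minutes": 25, "completed": False},
    {"subject": "FastAPI", "duration_minutes": 30, "completed": True},
])

stats = {
    "total_sessions": len(df),
    "completed_sessions": int(df["completed"].sum()),
    "completion_rate": round(df["completed"].mean(), 2),
    "total_minutes": int(df["duration_minutes"].sum()),
    "avg_duration_minutes": round(df["duration_minutes"].mean(), 1),
    "top_subjects": (
        df.groupby("subject")["duration_minutes"].sum()
          .sort_values(ascending=False)
          .reset_index()
          .rename(columns={"duration_minutes": "total_minutes"})
          .to_dict(orient="records")
    ),
}
print(stats)
```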
The hardest part was Day 7: understanding why `SessionLocal` is a factory, not a session, and why you always call `.close()` in a `finally` block or use a context manager.
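That distinction is what the usual FastAPI dependency pattern encodes: `sessionmaker` returns a factory, and each call to the factory produces a fresh `Session`. A sketch (the project's actual `backend/core/database.py` may differ):

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite:///:memory:")

# SessionLocal is a *factory*: calling it creates a new Session each time
SessionLocal = sessionmaker(bind=engine)


def get_db():
    """FastAPI dependency: yield a session, always close it afterwards."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()  # runs even if the request handler raises


# Two calls to the factory give two distinct sessions
s1, s2 = SessionLocal(), SessionLocal()
print(s1 is s2)  # False
s1.close()
s2.close()
```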
- Add JWT authentication (FastAPI + OAuth2 + password hashing)
- Docker + docker-compose for one-command local setup
- GitHub Actions CI pipeline (pytest on every push)
- Deploy to Railway (free tier)
- React frontend — study dashboard with real-time analytics charts
- Improve ML model — add `day_of_week` and `previous_session_completed` as features
Manil Doudou — Computer Engineering student, CESI Engineering School, Toulouse, France
Specialising in AI and Data Science · Looking for a 3–4 month internship from September 2026
This is Atlas Project. Mind the gap between the train and the platform. The next station is Destiny.