# Atlas Project

An AI-powered personal study and productivity backend — built to learn, built to grow.



## What this is

Atlas Project is a REST API that helps me track, analyse, and predict my own study habits. It started as a Pomodoro timer. It became a machine learning project. The system logs every study session to a SQLite database, exposes a FastAPI backend for CRUD operations and analytics, and serves a trained classification model that predicts — given a subject, duration, and time of day — whether a session will be completed or interrupted. I built this to learn Python engineering properly: no vibe-coding, no AI-generated functions. Every line is mine.


## Live features

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/sessions` | GET | List all study sessions |
| `/sessions` | POST | Log a new session |
| `/sessions/{id}` | GET | Retrieve one session |
| `/sessions/{id}/complete` | PATCH | Mark a session as completed |
| `/sessions/{id}` | DELETE | Remove a session |
| `/predict` | POST | ML prediction: will this session be completed? |
| `/analytics` | GET | Aggregated stats: time by subject, completion rate, trends |
| `/docs` | GET | Auto-generated Swagger UI (FastAPI built-in) |

## Stack

| Layer | Technology | Why |
|-------|------------|-----|
| Language | Python 3.11 | Primary language for AI/DS engineering roles |
| API framework | FastAPI + Uvicorn | Async, typed, auto-documented, industry standard |
| Database | SQLite + SQLAlchemy | Simple persistence with a real ORM |
| Data analysis | Pandas + Matplotlib | Aggregation and visualisation of session data |
| Machine learning | scikit-learn | Classification model (RandomForestClassifier) |
| Model serialisation | joblib | Save and reload the trained model between restarts |
| Validation | Pydantic v2 | Request/response schemas, type safety at the boundary |

## Project structure

```text
atlas-dev-os/
├── backend/
│   ├── api/
│   │   ├── routes.py          # all FastAPI endpoints
│   │   └── schemas.py         # Pydantic request/response models
│   ├── core/
│   │   ├── database.py        # SQLAlchemy engine, session, Base
│   │   ├── models.py          # ORM table definitions
│   │   ├── crud.py            # create / read / update / delete
│   │   └── ml.py              # model loading, feature engineering, predict()
│   └── main.py                # app factory, router registration
├── ml/
│   ├── train.py               # training script — run once to produce model
│   ├── evaluate.py            # accuracy, classification report, feature importance
│   └── completion_model.pkl   # serialised RandomForestClassifier
├── notebooks/
│   ├── 01_eda.ipynb           # exploratory data analysis on session CSV
│   └── 02_model_selection.ipynb  # comparing Logistic Regression vs Random Forest
├── scripts/
│   └── study_timer.py         # CLI Pomodoro timer — the data source
├── tests/
│   ├── test_crud.py
│   ├── test_routes.py
│   └── test_ml.py
├── .env.example               # copy to .env and fill in your keys
├── requirements.txt
└── README.md
```

## Getting started

Requirements: Python 3.11+, Git

```bash
# 1. Clone
git clone https://github.com/xdmanflow/atlas-project.git
cd atlas-project

# 2. Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment variables
cp .env.example .env
# Edit .env and add your keys if needed

# 5. Run the API
uvicorn backend.main:app --reload

# 6. Open the interactive docs
# http://localhost:8000/docs
```

Train the ML model (required before using `/predict`):

```bash
# First generate some sessions with the timer
python scripts/study_timer.py --sessions 4

# Then train the model on your data
python ml/train.py
# Outputs: ml/completion_model.pkl
```

## The ML model

Problem: binary classification — will a study session be completed?

Features used:

- `duration_minutes` — longer sessions correlate with lower completion
- `start_hour` — time of day affects focus (encoded from `start_time`)
- `subject_encoded` — some subjects are harder to stay focused on (`LabelEncoder`)
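A sketch of how these three features could be derived with Pandas and scikit-learn. The raw column names (`subject`, `start_time`, `duration_minutes`, `completed`) are assumptions, not necessarily the repo's exact schema:

```python
# Derive model features from raw session rows (assumed column names).
import pandas as pd
from sklearn.preprocessing import LabelEncoder

raw = pd.DataFrame({
    "subject": ["Python", "FastAPI", "Python"],
    "start_time": ["2025-01-10 09:05", "2025-01-10 14:30", "2025-01-11 21:00"],
    "duration_minutes": [25, 50, 90],
    "completed": [1, 1, 0],
})

features = pd.DataFrame()
features["duration_minutes"] = raw["duration_minutes"]
# start_hour: extract the hour component from the timestamp
features["start_hour"] = pd.to_datetime(raw["start_time"]).dt.hour
# subject_encoded: LabelEncoder maps each subject name to an integer
encoder = LabelEncoder()
features["subject_encoded"] = encoder.fit_transform(raw["subject"])
```

Note that `LabelEncoder` assigns integers in alphabetical order of the class names, so the mapping must be saved alongside the model to encode prediction requests consistently.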

Training:

```text
Dataset size     : 200+ logged sessions
Train/test split : 80% / 20% (random_state=42)
Best model       : RandomForestClassifier(n_estimators=100)
Test accuracy    : ~78% (improves with more logged data)
```
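The core of the training step can be sketched as follows. The synthetic data and the toy labelling rule are stand-ins for the real session log, not the actual `ml/train.py`:

```python
# Sketch of a train-and-serialise pipeline on synthetic data.
import numpy as np
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.integers(0, 60, size=(200, 3))   # duration, start_hour, subject_encoded
y = (X[:, 0] < 40).astype(int)           # toy rule: shorter sessions complete

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")

# Persist the fitted model so the API can reload it after a restart
joblib.dump(model, "completion_model.pkl")
```

The `random_state` arguments make the split and the forest reproducible, which is what lets the reported accuracy be compared run to run.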

Example prediction request:

```bash
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"subject": "FastAPI", "duration_minutes": 25, "start_hour": 9}'
```

Response:

```json
{
  "will_complete": true,
  "confidence": 0.84
}
```

## Example analytics response

```json
{
  "total_sessions": 47,
  "completed_sessions": 38,
  "completion_rate": 0.81,
  "total_minutes": 1025,
  "avg_duration_minutes": 26.4,
  "top_subjects": [
    { "subject": "Python", "total_minutes": 325 },
    { "subject": "FastAPI", "total_minutes": 200 },
    { "subject": "ML basics", "total_minutes": 175 }
  ]
}
```
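Aggregates like these are straightforward to compute once the sessions table is loaded into a DataFrame. A sketch with assumed column names:

```python
# Compute the analytics aggregates with Pandas (illustrative data).
import pandas as pd

df = pd.DataFrame({
    "subject": ["Python", "Python", "FastAPI"],
    "duration_minutes": [25, 50, 25],
    "completed": [True, False, True],
})

analytics = {
    "total_sessions": len(df),
    "completed_sessions": int(df["completed"].sum()),
    "completion_rate": round(df["completed"].mean(), 2),
    "total_minutes": int(df["duration_minutes"].sum()),
    "avg_duration_minutes": round(df["duration_minutes"].mean(), 1),
    # total minutes per subject, largest first
    "top_subjects": (
        df.groupby("subject")["duration_minutes"].sum()
          .sort_values(ascending=False)
          .rename("total_minutes")
          .reset_index()
          .to_dict(orient="records")
    ),
}
```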

## What I learned building this

This project was a deliberate practice run across the full Python engineering stack.

Starting from a simple CLI script (study_timer.py), I progressively added:

1. **OOP layer** — refactored the timer into `StudySession`, `DeepWorkSession`, and `BreakSession` classes with proper inheritance, `__str__`/`__repr__`, and `to_csv_row()` serialisation.
2. **HTTP layer** — built a `requests`-based API client before writing my own server, so I understood what "an endpoint" actually is from the client side first.
3. **Functional layer** — rewrote data-processing pipelines using `map`, `filter`, `functools.reduce`, and `itertools.groupby`, learning lazy evaluation in the process.
4. **API layer** — built a FastAPI backend with Pydantic v2 validation, proper HTTP status codes, and path/query parameter handling.
5. **Persistence layer** — replaced in-memory storage with SQLAlchemy + SQLite, implementing full CRUD with proper session management.
6. **Analytics layer** — loaded the database into Pandas DataFrames to compute aggregates and generate charts.
7. **ML layer** — engineered features from raw session data, compared two classifiers, evaluated with `classification_report`, and serialised the best model with joblib.
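Step 3 in miniature: grouping session rows by subject and reducing each group to a total, using only `itertools` and `functools` (the sample data is illustrative):

```python
# Functional-style aggregation: sort, group, then reduce each group.
from functools import reduce
from itertools import groupby
from operator import itemgetter

rows = [
    {"subject": "Python", "minutes": 25},
    {"subject": "FastAPI", "minutes": 50},
    {"subject": "Python", "minutes": 30},
]

# groupby only merges adjacent items, so the input must be sorted
# by the same key first
by_subject = groupby(sorted(rows, key=itemgetter("subject")),
                     key=itemgetter("subject"))

totals = {
    subject: reduce(lambda acc, row: acc + row["minutes"], group, 0)
    for subject, group in by_subject
}
```

The groups yielded by `groupby` are lazy iterators: each one is valid only until the next group is requested, which is exactly the kind of lazy-evaluation gotcha step 3 refers to.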

The hardest part was Day 7: understanding why `SessionLocal` is a factory, not a session, and why you always call `.close()` in a `finally` block or use a context manager.
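That distinction in one self-contained sketch (the in-memory SQLite URL is just for illustration):

```python
# sessionmaker returns a factory; calling the factory creates a Session.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite:///:memory:")
SessionLocal = sessionmaker(bind=engine)   # a factory, not a session

def get_db():
    db = SessionLocal()    # each call produces a fresh, independent session
    try:
        yield db
    finally:
        db.close()         # always runs, even if the request handler raises
```

Handing `get_db` to FastAPI's `Depends` gives every request its own session and guarantees the `finally` block releases the connection afterwards.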


## Roadmap

- Add JWT authentication (FastAPI + OAuth2 + password hashing)
- Docker + docker-compose for one-command local setup
- GitHub Actions CI pipeline (pytest on every push)
- Deploy to Railway (free tier)
- React frontend — study dashboard with real-time analytics charts
- Improve the ML model — add `day_of_week` and `previous_session_completed` as features

## Author

Manil Doudou — Computer Engineering student, CESI Engineering School, Toulouse, France

Specialising in AI and Data Science · Looking for a 3–4 month internship from September 2026

This is Atlas Project. Mind the gap between the train and the platform. The next station is Destiny.
