DoomDust7/Machine-Learning

Titanic Survival Predictor

A full-stack machine learning web app that predicts Titanic passenger survival using 8 classifiers simultaneously, built from the original exploratory notebook.

[Screenshot: Predict tab] [Badges: FastAPI, scikit-learn, Tailwind]


What's New

The original ML-Titanic.ipynb notebook has been rebuilt as a modern full-stack web application:

| Before | After |
| --- | --- |
| Jupyter Notebook | FastAPI backend + React frontend |
| Loads local CSV files | Bundled dataset, served via REST API |
| Static cell output | Live interactive UI with real-time predictions |
| Manual model comparison | Model leaderboard with accuracy bars |
| Matplotlib/Seaborn charts | Interactive Recharts bar charts |

Features

  • Survival Predictor — enter passenger details (class, sex, age, fare, port) and get an instant prediction
  • 8-model consensus — all classifiers vote; result shown with probability bar and per-model breakdown
  • Model Leaderboard — ranked accuracy table for all 8 classifiers
  • EDA Dashboard — survival rates by sex, passenger class, age group, and title (interactive charts)
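The 8-model consensus can be pictured as a simple majority vote. The sketch below is only an illustration under stated assumptions: the function name `consensus`, the shape of the returned dictionary, and the idea that the "probability" is the share of models voting for survival are all hypothetical, and the app may compute its probability bar differently (for example by averaging per-model probabilities).

```python
from typing import Dict


def consensus(votes: Dict[str, int]) -> dict:
    """Combine per-model 0/1 votes into a majority decision.

    `votes` maps a model name to its prediction (1 = survived, 0 = did not).
    The "probability" is simply the share of models voting 1, which is an
    assumption made for illustration only.
    """
    if not votes:
        raise ValueError("at least one model vote is required")
    share = sum(votes.values()) / len(votes)
    return {
        "survived": share >= 0.5,      # ties count as "survived" in this sketch
        "probability": round(share, 3),
        "per_model": dict(votes),      # per-model breakdown for display
    }


# Hypothetical votes, not real model output.
example = consensus({
    "Random Forest": 1, "Decision Tree": 1, "K-Nearest Neighbors": 1,
    "Support Vector Machine": 1, "Logistic Regression": 1,
    "Linear SVC": 0, "Naive Bayes": 1, "Perceptron": 0,
})
print(example["survived"], example["probability"])
```

With six of eight hypothetical models voting for survival, the sketch reports a positive prediction with a vote share of 0.75.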

Tech Stack

Backend

  • Python 3.9+
  • FastAPI — REST API
  • scikit-learn — ML models
  • pandas + NumPy — data processing

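Frontend

  • React + TypeScript (UI components)
  • Vite (dev server, proxies /api to the backend)
  • Recharts (interactive charts)
  • Tailwind (styling)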

ML Models & Accuracy

| Rank | Model | Accuracy |
| --- | --- | --- |
| 🥇 | Random Forest | 93.6% |
| 🥈 | Decision Tree | 93.6% |
| 🥉 | K-Nearest Neighbors | 87.1% |
| 4 | Support Vector Machine | 83.5% |
| 5 | Logistic Regression | 80.0% |
| 6 | Linear SVC | 79.7% |
| 7 | Naive Bayes | 77.0% |
| 8 | Perceptron | 66.6% |
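A leaderboard like this can be produced by fitting each classifier and sorting by score. The following is a minimal sketch, not the project's `pipeline.py`: it uses synthetic data from `make_classification` in place of the Titanic features, the helper name `leaderboard` and all hyperparameters are assumptions, and it scores on a held-out split. The README does not say which data the accuracies above were measured on, so this sketch will not reproduce those numbers.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the real app trains on engineered Titanic features.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The same eight model families as the table; hyperparameters are assumptions.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=3),
    "Support Vector Machine": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Linear SVC": LinearSVC(max_iter=10000),
    "Naive Bayes": GaussianNB(),
    "Perceptron": Perceptron(random_state=0),
}


def leaderboard(models, X_train, y_train, X_test, y_test):
    """Fit every model and return (name, accuracy) pairs, best first."""
    scores = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores.append((name, model.score(X_test, y_test)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


ranked = leaderboard(models, X_train, y_train, X_test, y_test)
for rank, (name, accuracy) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {accuracy:.1%}")
```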

Feature Engineering

Replicates the original notebook's pipeline:

  • Title extraction from passenger names (Mr / Mrs / Miss / Master / Rare)
  • Age imputation via median by Sex × Pclass grid
  • Age banding into 5 bins (0–16, 16–32, 32–48, 48–64, 64+)
  • Categorical encoding for Sex and Embarked
  • Models trained on: Pclass, Sex, Age, Fare, Embarked, Title
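The steps above can be sketched with pandas roughly as follows. This is a hedged illustration rather than the contents of `pipeline.py`: the function name `engineer_features`, the integer codes chosen for each category, the folding of Mlle/Ms/Mme into Miss/Mrs, and filling a missing port with "S" are all assumptions, and the rows in `toy` are made up.

```python
import pandas as pd

# Integer codes are assumptions for illustration.
TITLE_CODES = {"Mr": 1, "Miss": 2, "Mrs": 3, "Master": 4, "Rare": 5}
COMMON_TITLES = {"Mr", "Miss", "Mrs", "Master"}


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()

    # Title: the word directly before a period, e.g. "Doe, Mr. John" -> "Mr".
    title = out["Name"].str.extract(r" ([A-Za-z]+)\.", expand=False)
    title = title.replace({"Mlle": "Miss", "Ms": "Miss", "Mme": "Mrs"})
    # Anything outside the four common titles (or missing) becomes "Rare".
    out["Title"] = title.where(title.isin(COMMON_TITLES), "Rare").map(TITLE_CODES)

    # Age imputation: median age of passengers with the same Sex and Pclass.
    group_median = out.groupby(["Sex", "Pclass"])["Age"].transform("median")
    out["Age"] = out["Age"].fillna(group_median)

    # Age banding into 5 bins: 0-16, 16-32, 32-48, 48-64, 64+ (codes 0..4).
    out["Age"] = pd.cut(
        out["Age"],
        bins=[0, 16, 32, 48, 64, float("inf")],
        labels=False,
        include_lowest=True,
    )

    # Categorical encoding; missing ports default to "S" (an assumption).
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    out["Embarked"] = out["Embarked"].fillna("S").map({"S": 0, "C": 1, "Q": 2})

    return out[["Pclass", "Sex", "Age", "Fare", "Embarked", "Title"]]


# Made-up rows, not taken from the real dataset.
toy = pd.DataFrame({
    "Name": ["Doe, Mr. John", "Roe, Mrs. Jane", "Poe, Miss. Ann",
             "Moe, Dr. Sam", "Loe, Mr. Tom"],
    "Sex": ["male", "female", "female", "male", "male"],
    "Pclass": [3, 1, 3, 1, 3],
    "Age": [22.0, 38.0, 26.0, 50.0, None],
    "Fare": [7.25, 71.0, 8.0, 30.0, 8.05],
    "Embarked": ["S", "C", "S", "Q", None],
})
features = engineer_features(toy)
print(features)
```

In the toy frame, the last passenger has no age, so he receives the median age of the other third-class male (22) before banding.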

Project Structure

├── ML-Titanic.ipynb          # Original exploratory notebook
├── backend/
│   ├── main.py               # FastAPI app (3 endpoints)
│   ├── pipeline.py           # Feature engineering + model training
│   ├── data/
│   │   └── titanic.csv       # Training dataset (891 rows)
│   └── requirements.txt
└── frontend/
    ├── src/
    │   ├── App.tsx            # Tab layout (Predict / Models / Stats)
    │   ├── components/
    │   │   ├── PredictionForm.tsx
    │   │   ├── ResultCard.tsx
    │   │   ├── ModelLeaderboard.tsx
    │   │   └── StatsDashboard.tsx
    │   └── types.ts
    ├── vite.config.ts         # Proxies /api → localhost:8000
    └── package.json

Setup & Run

1. Backend

cd backend
pip install -r requirements.txt
python3 -m uvicorn main:app --port 8000

The models train automatically on startup (about 2 seconds), after which the API is available at http://localhost:8000.

2. Frontend

cd frontend
npm install
npm run dev

Open http://localhost:5173 in a browser.


API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/predict | Predict survival for a passenger |
| GET | /api/models | Accuracy scores for all 8 models |
| GET | /api/stats | EDA stats (survival rates by group) |

Predict request body:

{
  "pclass": 1,
  "sex": "female",
  "age": 30,
  "sibsp": 0,
  "parch": 0,
  "fare": 100,
  "embarked": "S"
}
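For calling the endpoint from a script, here is a small standard-library sketch. It assumes the backend is running on port 8000 as described in Setup & Run. The helper names `build_predict_request` and `send` are hypothetical, and because the README does not document the response format, the sketch only parses the reply as JSON without assuming any fields.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/predict"


def build_predict_request(passenger: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the passenger as a JSON body."""
    body = json.dumps(passenger).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request and decode the JSON reply (needs a running backend)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# The same passenger as in the request body above.
passenger = {
    "pclass": 1,
    "sex": "female",
    "age": 30,
    "sibsp": 0,
    "parch": 0,
    "fare": 100,
    "embarked": "S",
}
request = build_predict_request(passenger)

# With the backend running, uncomment to send the request:
# print(send(request))
```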

Dataset

The Titanic training dataset: 891 passengers, 38.4% survival rate. Columns: PassengerId, Survived (the prediction target), Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked.
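Summary figures such as the overall survival rate, or the per-group rates shown in the EDA dashboard, come down to taking the mean of the Survived column. The sketch below shows the idea on a few made-up rows. The helper name `survival_rate_by` is hypothetical, and the toy numbers are unrelated to the real dataset.

```python
import pandas as pd


def survival_rate_by(df: pd.DataFrame, column: str) -> dict:
    """Percentage of survivors within each value of `column`."""
    rates = df.groupby(column)["Survived"].mean() * 100
    return rates.round(1).to_dict()


# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Sex": ["male", "male", "female", "female", "male"],
    "Pclass": [3, 1, 1, 3, 3],
    "Survived": [0, 1, 1, 1, 0],
})

overall = round(toy["Survived"].mean() * 100, 1)
by_sex = survival_rate_by(toy, "Sex")
by_class = survival_rate_by(toy, "Pclass")
print(overall, by_sex, by_class)
```

Run against backend/data/titanic.csv instead of the toy frame, the same mean over Survived should give the 38.4% quoted above.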

About

Data Analysis on the Titanic dataset on Kaggle
