An end-to-end data science project that identifies the ideal flagship smartphones (2020–2022) in both Gaming and Non-Gaming categories, using ML price prediction, user survey analysis, and interactive Tableau dashboards.
🔗 View Live Tableau Dashboard 📋 View Original Survey Form
No single smartphone offers every ideal feature at a reasonable price. This project aims to:
- Predict the ideal price of a flagship phone based on user-preferred specs
- Identify real existing devices that come closest to those ideal specs
- Compare ideal vs real devices using interactive Tableau dashboards
Data Collection → Feature Engineering → ML Model → Streamlit Web App
↓ ↓
Survey (Google Form) → Ideal Spec Analysis → Tableau Dashboard
├── FLAGSHIP 20-22 Data.xlsx # Raw collected dataset (69 devices)
├── Ideal_Product_Detection_Feature_Engineering_ # Feature engineering notebook
│ and_ML_Model.ipynb
├── Ideal_spec_finding_analysis(from_survey_data).ipynb # Survey analysis notebook
├── code_made_by_sreehari.ipynb # Custom countplot utility function
├── flaghship pycharm.txt # Streamlit web app source code
├── Best_device_Tblx_dashboradlink.txt # Tableau dashboard link
├── Ideal Product detection-Report.pdf # Full project report
└── Project_Summary.txt # Project overview
- Source: Gadgets360, 91mobiles, GSMArena
- Scope: 69 flagship smartphones from 2020–Jan 2022
- Brands: Apple, Samsung, OnePlus, Xiaomi, Poco, Realme, Vivo, Asus, IQOO, Lenovo, Motorola, Oppo
- Criteria: Snapdragon 855+, Dimensity 1000+, or Apple A-series chipset; 5G enabled
- Fields: 21 raw features → expanded to 30+ after feature engineering
Notebook: Ideal_Product_Detection_Feature_Engineering_and_ML_Model.ipynb
- Removed unit strings (`MP`, `GB`, `Kg`) from columns for numeric conversion
- Split multi-value columns (e.g. `48+16+8` camera → 3 separate columns)
- Engineered PPI (Pixels Per Inch) from resolution + screen size
- Verified splits by checking unique values; corrected misaligned columns
- Final cleaned data exported as `Pythonsplit FLAGSHIP 20-22 Data.csv`
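The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`rear_camera`, `ram`, `resolution`, `screen_size_in`), the sample rows, and the helper functions are all hypothetical. PPI is computed with the standard formula, the diagonal pixel count divided by the diagonal screen size in inches.

```python
import math

import pandas as pd

# Hypothetical sample rows; column names are assumptions, not the dataset's real headers.
raw = pd.DataFrame({
    "rear_camera": ["48+16+8MP", "108+12MP"],
    "ram": ["8GB", "12GB"],
    "resolution": ["1080x2400", "1440x3200"],
    "screen_size_in": [6.55, 6.8],
})


def strip_unit(series: pd.Series, unit: str) -> pd.Series:
    """Remove a unit string such as 'GB' and convert the remainder to numbers."""
    return pd.to_numeric(series.str.replace(unit, "", regex=False).str.strip())


def split_cameras(series: pd.Series, prefix: str) -> pd.DataFrame:
    """Split '48+16+8MP' into one numeric column per lens; absent lenses become NaN."""
    parts = series.str.replace("MP", "", regex=False).str.split("+", expand=True)
    parts = parts.apply(pd.to_numeric)
    parts.columns = [f"{prefix}_{i + 1}" for i in range(parts.shape[1])]
    return parts


def ppi(resolution: str, diagonal_in: float) -> float:
    """Pixels per inch: diagonal pixel count divided by the diagonal in inches."""
    width, height = (int(v) for v in resolution.lower().split("x"))
    return math.hypot(width, height) / diagonal_in


clean = pd.DataFrame({"ram_gb": strip_unit(raw["ram"], "GB")})
clean = clean.join(split_cameras(raw["rear_camera"], "rear_cam"))
clean["ppi"] = [ppi(r, d) for r, d in zip(raw["resolution"], raw["screen_size_in"])]

# Checking unique values per split column helps spot misaligned entries.
print(clean["rear_cam_1"].unique())
```

Devices with fewer lenses end up with `NaN` in the trailing camera columns, which is one place where the "verify by checking unique values" step is useful.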
Model: Voting Regressor (RandomForest + GradientBoosting + ExtraTrees)
| Metric | Score |
|---|---|
| R² Score | 0.7747 |
| MAE | 0.1628 (log scale) |
- Correlation analysis to select best features
- Label encoding + OneHotEncoder via `ColumnTransformer`
- 80/20 train-test split
- Model exported using `pickle` → `pipe.pkl`, `vd.pkl`
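For orientation, here is a hedged sketch of how such a pipeline might be assembled with scikit-learn. It runs on synthetic stand-in data, since the real feature set and hyperparameters live in the notebook. The feature names, the natural-log target, and the estimator settings are assumptions, and only one-hot encoding is shown. What `vd.pkl` contains is not described here, so only the pipeline is pickled.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (assumption): the real dataset has 69 devices and 30+ features.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "brand": rng.choice(["A", "B", "C"], size=n),
    "ram_gb": rng.choice([8, 12, 16], size=n),
    "ppi": rng.uniform(380, 520, size=n),
})
price = 30000 + 2500 * df["ram_gb"] + 40 * df["ppi"] + rng.normal(0, 2000, size=n)
X, y = df, np.log(price)  # assumed natural-log target, so errors are on a log scale

# 80/20 train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# One-hot encode the categorical column; numeric columns pass through unchanged.
encode = ColumnTransformer(
    [("brand", OneHotEncoder(handle_unknown="ignore"), ["brand"])],
    remainder="passthrough",
)
ensemble = VotingRegressor([
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingRegressor(random_state=0)),
    ("et", ExtraTreesRegressor(n_estimators=100, random_state=0)),
])
pipe = Pipeline([("encode", encode), ("model", ensemble)])
pipe.fit(X_train, y_train)

pred = pipe.predict(X_test)
r2 = r2_score(y_test, pred)
mae_log = mean_absolute_error(y_test, pred)  # MAE in log units
print(f"R2={r2:.4f}  MAE(log)={mae_log:.4f}")

# Round-trip through pickle in memory; writing the same bytes to pipe.pkl works identically.
restored = pickle.loads(pickle.dumps(pipe))
```

If the target is a natural log, an MAE of `m` in log units corresponds to a typical multiplicative error of about `exp(m)` on the price itself.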
A web-based price predictor built with Streamlit in PyCharm.
- Select brand, chipset, display, cameras, battery, and more
- Click "Predict Product Price" → get predicted INR price instantly
- Used to predict ideal device prices for each brand with survey-identified specs
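A minimal sketch of how an app like this could be wired up is shown below. It is not the project's actual source: the function names, the widget options, and the two input fields are hypothetical, and it assumes the pickled pipeline predicts the natural log of the price (consistent with the log-scale MAE reported above).

```python
import numpy as np
import pandas as pd


def predict_price_inr(pipe, specs: dict) -> float:
    """Build a one-row frame from the selected specs and undo the assumed log transform."""
    row = pd.DataFrame([specs])
    log_price = pipe.predict(row)[0]
    return float(np.exp(log_price))


def run_app(pipe) -> None:
    """Render a tiny form; call this from the script passed to `streamlit run`."""
    import streamlit as st  # imported lazily so the helper above works without Streamlit

    st.title("Flagship Price Predictor")
    specs = {
        # Hypothetical inputs; the real app exposes many more fields.
        "brand": st.selectbox("Brand", ["Apple", "Samsung", "OnePlus"]),
        "ram_gb": st.number_input("RAM (GB)", min_value=4, max_value=24, value=8),
    }
    if st.button("Predict Product Price"):
        st.success(f"Predicted price: ₹{predict_price_inr(pipe, specs):,.0f}")
```

Keeping the prediction logic in a plain function separate from the UI makes it possible to check the price conversion without launching Streamlit.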
streamlit run ipd.py
Notebook: Ideal_spec_finding_analysis(from_survey_data).ipynb
- Survey created via Google Form for both gamers and non-gamers
- Responses analyzed using custom `cp()` countplot function
- Most-selected features identified separately for Gaming and Normal flagship categories
- Example finding: most users preferred a 480 Hz touch sampling rate for gaming devices and a Samsung camera sensor
- 15 feature sheets created for gaming + normal flagship comparison
- Devices with 8+ out of 15 matching ideal features selected as best real products
- Detected: 5 best Gaming Flagships (ASUS ROG 5, Lenovo Legion Duel 2, OnePlus 9RT, Realme GT 2, GT Neo 2) and 11 best Normal Flagships
- Interactive comparison: click any brand → see how it compares to the ideal predicted device
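The selection logic described above (take the most-selected survey option per feature as the ideal spec, then keep devices that match enough of those features) can be sketched in pandas. The data, column names, and function names below are hypothetical, and the example uses 3 features with a threshold of 2 for brevity, where the project uses 15 features with a threshold of 8.

```python
import pandas as pd

# Hypothetical survey responses: one row per respondent, one column per question.
responses = pd.DataFrame({
    "touch_sample_rate": ["480Hz", "480Hz", "360Hz", "480Hz"],
    "camera_sensor": ["Samsung", "Sony", "Samsung", "Samsung"],
    "refresh_rate": ["144Hz", "120Hz", "144Hz", "144Hz"],
})

# The most-selected option for each question forms the "ideal" spec.
ideal = responses.mode().iloc[0]

# Hypothetical real devices described with the same features.
devices = pd.DataFrame(
    {
        "touch_sample_rate": ["480Hz", "240Hz", "480Hz"],
        "camera_sensor": ["Sony", "Samsung", "Samsung"],
        "refresh_rate": ["144Hz", "120Hz", "144Hz"],
    },
    index=["Device A", "Device B", "Device C"],
)


def count_matches(devices: pd.DataFrame, ideal: pd.Series) -> pd.Series:
    """Count, per device, how many features equal the ideal value."""
    return devices[ideal.index].eq(ideal, axis="columns").sum(axis=1)


def best_devices(devices: pd.DataFrame, ideal: pd.Series, min_matches: int) -> pd.Series:
    """Keep devices meeting the threshold, best match first."""
    matches = count_matches(devices, ideal)
    return matches[matches >= min_matches].sort_values(ascending=False)


print(best_devices(devices, ideal, min_matches=2))
```

With the project's own numbers, the call would use `min_matches=8` against a 15-column feature table.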
| Tool | Purpose |
|---|---|
| Python | Core language |
| Pandas, NumPy | Data manipulation |
| Matplotlib, Seaborn | Visualization |
| Scikit-learn | ML models, encoding, evaluation |
| Streamlit | Web app |
| Pickle | Model export |
| Tableau Public | Interactive dashboards |
| Google Forms | Survey |
| Google Colab | Notebook environment |
# Clone the repo
git clone https://github.com/Sreeharipavithran/PG-Diploma-Project-2022.git
# Install dependencies
pip install streamlit scikit-learn pandas numpy matplotlib seaborn openpyxl
# Run the web app (requires pipe.pkl and vd.pkl)
streamlit run "flaghship pycharm.txt"
⚠️ Note: The pre-trained model files (pipe.pkl, vd.pkl) are not included in this repo as they were built on a local system. To reproduce them, run the Feature Engineering and ML Model notebook end-to-end first, which will generate the pickle files needed for the Streamlit app.