Compute is expensive, and biology is slow. Pre-screening protein variants before wet-lab synthesis usually requires hours of compute on enterprise HPC infrastructure.
Protein Explorer Atlas democratizes structural biology by bringing sub-second, interactive in-silico mutagenesis to edge devices. By leveraging Meta's highly efficient ESM-2 (35M parameters) foundation model and exposing its Masked Language Modeling (MLM) head, we achieve zero-shot, biologically accurate Variant Effect Prediction (VEP) and Conservation scoring without relying on custom downstream training or A100 GPU clusters.
This empowers biotech startups, EdTech undergrad programs, and global researchers to mutate, analyze, and visualize structural constraints in real time, directly in the browser.
- In-Silico Mutagenesis Lab: Edit a sequence and instantly compute a masked-marginal log-likelihood ratio (LLR) as the variant effect score.
- Real-Time Conservation Profiling: Identifies functional hotspots via per-position 20-AA Shannon Entropy.
- Universal Structural Coverage: Dynamic fallback between the RCSB PDB (experimental) and AlphaFold EBI (predicted) APIs provides 3D visualization whenever either source has a structure for the protein.
- Sub-Second Edge Inference: Optimized 1D tensor batching keeps inference latency under a second.
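The two scores named in the list above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: it assumes you already have the model's logits at one masked position, restricted to the 20 standard amino acids in the order given by `AMINO_ACIDS`. The function names, the residue ordering, and the choice of bits as the entropy unit are all assumptions made for the example.

```python
import math

# Assumed ordering of the 20 standard residues; the real model vocabulary
# has its own ordering and extra special tokens.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Turn raw scores over the 20 residues into log-probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def masked_marginal_llr(logits, wild_type, mutant):
    """LLR = log p(mutant) - log p(wild type) at one masked position.

    Negative values mean the model finds the mutant less likely
    than the wild-type residue.
    """
    log_probs = log_softmax(logits)
    return log_probs[AMINO_ACIDS.index(mutant)] - log_probs[AMINO_ACIDS.index(wild_type)]


def shannon_entropy(logits):
    """Entropy in bits of the 20-residue distribution at one position.

    Low entropy suggests a conserved position; the maximum is log2(20).
    """
    log_probs = log_softmax(logits)
    return -sum(math.exp(lp) * lp for lp in log_probs) / math.log(2)


# Toy logits: the model strongly prefers alanine at this position.
peaked = [0.0] * 20
peaked[AMINO_ACIDS.index("A")] = 10.0
print(masked_marginal_llr(peaked, wild_type="A", mutant="W"))
print(shannon_entropy(peaked))
```

With these toy logits, the A-to-W substitution gets a negative LLR and the position has low entropy, which is the pattern one would expect at a conserved site.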
You can run the engine directly using Python or via a containerized Docker build.
Requires Python 3.10+ and a standard CPU (CUDA optional but recommended).
git clone https://github.com/yourusername/Protein-Explorer-Atlas.git
cd Protein-Explorer-Atlas
# Create a virtual environment and install dependencies
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
pip install -r requirements.txt
# Boot the engine
python run.py
The web app will be available at http://localhost:5000
For an isolated, production-ready environment:
docker compose up --build
You can quickly test this by loading the .fasta provided in the repository or by uploading your own sequence.
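If you want to sanity-check your own sequence file before uploading it, a small parser like the one below may help. It is a sketch under assumptions: the README does not say how the app handles multi-record files or non-standard residue codes, so `parse_fasta` and `invalid_residues` are hypothetical helpers, not part of the project.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_residues(sequence):
    """Return the sorted characters that are not standard amino acids."""
    return sorted(set(sequence) - VALID_RESIDUES)


sample = ">demo peptide\nMKTAY\niakq\n"
for name, seq in parse_fasta(sample):
    print(name, seq, invalid_residues(seq))
```

An empty list from `invalid_residues` means the sequence uses only the 20 standard one-letter codes.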
- 4GB System RAM minimum
- Optional: CUDA-compatible GPU (auto-detected and utilized for accelerated tensor math).
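GPU auto-detection of the kind described above is commonly done as follows. This sketch assumes the engine runs on PyTorch, which the README does not confirm; `pick_device` is a hypothetical helper name, and it falls back to the CPU when PyTorch or a CUDA device is unavailable.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumed dependency; not confirmed by this README
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```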
- Architecture Details - System design and the Zero-Shot Pipeline.
- Scientific Methodology - Algorithmic explanations for LLR and Shannon Entropy.