SinkFix

SinkFix is a deployed full-stack app for inspecting attention sinks in BERT-style transformer models.

The project is best described as an attention interpretability tool. It helps answer:

Which tokens receive unusually concentrated attention, and do those tokens look structural, useful, or suspicious?

SinkFix takes a Hugging Face model name and input text, runs the model with attention and hidden-state outputs enabled, and returns a token-level diagnostic report.

Live App

SinkFix is deployed at:

Current Status

SinkFix currently works as a deployed attention diagnostics app:

Backend API built with FastAPI
Frontend built with Next.js, React, TypeScript, and Tailwind CSS
Default model input set to google-bert/bert-base-uncased
Analysis designed around BERT-style encoder internals
Results shown as a token-level diagnostic table in the frontend
Averaged attention heatmap shown on the results page
JSON and CSV export available from the displayed analysis result

The app focuses on internal attention behavior. It does not claim to fully explain why a model produced a specific final prediction.

What The App Shows

For each input, SinkFix returns:

model tokens from the tokenizer
normalized attention received by each token
normalized value-vector norm for each token
a token classification: beneficial, neutral, or detrimental
summary counts by classification
the strongest attention receiver
the top attention sinks
a full table of token-level diagnostics

Special tokens such as [CLS] and [SEP] are included in the report. That is intentional in the current version because structural-token behavior is part of what the project is inspecting.

Backend Method

The current backend pipeline is:

Load the requested model and tokenizer with Hugging Face Transformers.
Tokenize the input text.
Run the model with attention and hidden-state outputs enabled.
Average attention across layers and heads.
Compute normalized attention received by each token.
Compute normalized value-vector norms from BERT layer 0.
Detect tokens above the attention threshold.
Classify detected sink candidates.
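
The averaging and attention-received steps above can be sketched in plain Python. This is a minimal illustration, not the project's code: nested lists stand in for the attention tensors returned by Transformers, the `attn[layer][head][query][key]` layout is an assumption, and scaling by the maximum is only one possible reading of "normalized".

```python
# Sketch only: nested lists stand in for attention tensors.
# The [layer][head][query][key] layout and max-scaling are assumptions.

def average_attention(attn):
    """Average attn[layer][head][query][key] over layers and heads.

    Returns a seq_len x seq_len matrix indexed as [query][key].
    """
    n_layers = len(attn)
    n_heads = len(attn[0])
    seq_len = len(attn[0][0])
    avg = [[0.0] * seq_len for _ in range(seq_len)]
    for layer in attn:
        for head in layer:
            for q in range(seq_len):
                for k in range(seq_len):
                    avg[q][k] += head[q][k] / (n_layers * n_heads)
    return avg


def attention_received(avg):
    """Sum each key column over all queries, then scale so the maximum is 1.0."""
    seq_len = len(avg)
    received = [sum(avg[q][k] for q in range(seq_len)) for k in range(seq_len)]
    peak = max(received)
    return [r / peak for r in received] if peak > 0 else received


# Toy input: 1 layer, 2 heads, 3 tokens; every query row sums to 1.
toy = [[
    [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1], [0.7, 0.1, 0.2]],
    [[0.6, 0.2, 0.2], [0.8, 0.1, 0.1], [0.5, 0.3, 0.2]],
]]
scores = attention_received(average_attention(toy))
```

In the toy input, token 0 receives most of the attention from every query, so it ends up with the highest score, which is the pattern an attention sink would show.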

The classification rule is intentionally simple:

token index 0, usually [CLS], is classified as beneficial when evaluated at early layer depth
a token with high attention received but a low value norm is classified as detrimental
everything else is classified as neutral
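
The rule above might look roughly like the following sketch. The threshold values, the early-layer cutoff, the `layer_depth` parameter, and the order of the checks are all hypothetical placeholders chosen for illustration; the actual values live in the backend and are not stated here.

```python
# Sketch of the three-way rule. All constants below are assumed values,
# not the thresholds SinkFix actually uses.
ATTENTION_THRESHOLD = 0.5   # assumed cutoff for "high attention received"
VALUE_NORM_THRESHOLD = 0.5  # assumed cutoff for "low value norm"
EARLY_LAYER_CUTOFF = 2      # assumed meaning of "early layer depth"


def classify_token(index, att_received, value_norm, layer_depth=0):
    """Return 'beneficial', 'detrimental', or 'neutral' for one token."""
    # Token index 0 (usually [CLS]) is treated as beneficial at early depth.
    if index == 0 and layer_depth <= EARLY_LAYER_CUTOFF:
        return "beneficial"
    # High attention combined with a low value norm is flagged.
    if att_received > ATTENTION_THRESHOLD and value_norm < VALUE_NORM_THRESHOLD:
        return "detrimental"
    return "neutral"


labels = [
    classify_token(0, 0.9, 0.1),  # first token at early depth
    classify_token(3, 0.9, 0.1),  # high attention, low value norm
    classify_token(3, 0.9, 0.9),  # high attention, high value norm
]
```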

The frontend displays the token-level diagnostics as summary cards, top sinks, and a full results table.

What This Is Not

not a training or fine-tuning pipeline
not a model repair system
not a claim that attention diagnostics fully explain model decisions
not an ML monitoring system
not currently designed for autoregressive language models

Tech Stack

Backend:

Python
FastAPI
Hugging Face Transformers
PyTorch

Frontend:

Next.js
React
TypeScript
Tailwind CSS

Project Structure

backend/
  api/
    main.py       FastAPI app and CORS setup
    routes.py     analysis endpoint
    schemas.py    request and response models
  ml/
    utils.py          model loading and attention extraction
    sink_detector.py  attention sink detection
    classifier.py     sink classification rule

frontend/
  app/                 Next.js routes
  src/features/analysis/
    api/               frontend API request helper
    components/        analysis form and results UI
    types/             TypeScript response types

Run Locally

Install backend dependencies:

pip install -r requirements.txt

If PyTorch is not already installed in your environment, install the build appropriate for your machine from the official PyTorch instructions.

Start the backend:

uvicorn backend.api.main:app --reload

Install frontend dependencies:

cd frontend
npm install

Start the frontend:

npm run dev

Open the frontend at:

http://localhost:3000

The frontend expects the backend at:

http://localhost:8000

Allowed frontend origins can be configured with:

FRONTEND_ORIGINS=https://www.sinkfix.xyz,https://sinkfix.xyz,http://localhost:3000
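
A comma-separated value like this is typically split into a list before being handed to the CORS configuration. The sketch below shows one way to do that; the function name and the fallback default are assumptions, not the contents of backend/api/main.py.

```python
import os

# Assumed fallback when the variable is unset; the real default may differ.
DEFAULT_ORIGINS = "http://localhost:3000"


def parse_frontend_origins(environ=os.environ):
    """Split FRONTEND_ORIGINS on commas, dropping blanks and stray spaces."""
    raw = environ.get("FRONTEND_ORIGINS", DEFAULT_ORIGINS)
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


# A plain dict stands in for the process environment in this example.
origins = parse_frontend_origins({
    "FRONTEND_ORIGINS": "https://www.sinkfix.xyz,https://sinkfix.xyz,http://localhost:3000"
})
```

The resulting list has the shape that FastAPI's CORSMiddleware accepts for its allow_origins argument.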

API Usage

Call the analysis endpoint:

curl -X POST http://127.0.0.1:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"model_name":"google-bert/bert-base-uncased","text":"The wheels on the bus go round and round."}'

Request body:

{
  "model_name": "google-bert/bert-base-uncased",
  "text": "The wheels on the bus go round and round."
}

Response fields:

token_list: model tokens
classifications: one label per token
att_received_scores: normalized attention received by each token
value_norms: normalized value-vector norm per token
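
Assuming the four response fields are parallel lists aligned by token index, a client can zip them into one record per token. The helper name, the output keys, and the example values below are made up for illustration and are not real model output.

```python
def to_token_rows(response):
    """Combine the parallel response lists into one dict per token."""
    fields = ("token_list", "classifications", "att_received_scores", "value_norms")
    lengths = {len(response[name]) for name in fields}
    if len(lengths) != 1:
        raise ValueError("response lists must all have the same length")
    return [
        {"token": t, "classification": c, "att_received": a, "value_norm": v}
        for t, c, a, v in zip(*(response[name] for name in fields))
    ]


# Illustrative values only.
example = {
    "token_list": ["[CLS]", "the", "bus", "[SEP]"],
    "classifications": ["beneficial", "neutral", "neutral", "detrimental"],
    "att_received_scores": [1.0, 0.2, 0.3, 0.8],
    "value_norms": [0.4, 0.9, 1.0, 0.1],
}
rows = to_token_rows(example)
```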

Frontend Flow

The input page submits a model name and text to the backend. On success, the frontend stores the latest response in sessionStorage and navigates to /results.

The results page reads that stored response and renders:

total token count
classification counts
strongest attention receiver
top five attention sinks
averaged attention heatmap
full token table
JSON and CSV export actions

Refreshing or opening /results without a stored response shows an empty-state message.
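
The summary values on the results page can be derived from the response fields alone. The sketch below is written in Python for consistency with the backend, although the frontend itself is TypeScript; ranking the top sinks by attention received is an assumption, and the function name and example values are hypothetical.

```python
from collections import Counter


def summarize(tokens, classifications, att_received, top_n=5):
    """Build counts, the strongest receiver, and the top-N tokens by attention."""
    # Token indices sorted from most to least attention received.
    ranked = sorted(range(len(tokens)), key=lambda i: att_received[i], reverse=True)
    return {
        "total_tokens": len(tokens),
        "classification_counts": dict(Counter(classifications)),
        "strongest_receiver": tokens[ranked[0]] if ranked else None,
        "top_sinks": [(tokens[i], att_received[i]) for i in ranked[:top_n]],
    }


# Illustrative values only, not real model output.
summary = summarize(
    ["[CLS]", "the", "bus", "[SEP]"],
    ["beneficial", "neutral", "neutral", "detrimental"],
    [1.0, 0.2, 0.3, 0.8],
)
```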

Checks

Backend syntax check:

python -m compileall backend

Frontend lint:

cd frontend
npm run lint

Frontend production build:

cd frontend
npm run build

Current Limitations

The value-vector extraction assumes BERT internals at model.encoder.layer[...].
The model is loaded on every request, which adds latency and repeats the same work for each analysis.
Classification thresholds are heuristic and may need validation for broader model coverage.
The frontend only keeps the latest result in browser sessionStorage.
Autoregressive language models are not supported.
The project currently inspects attention behavior; it does not provide full causal explanations of model predictions.
