HiringBase is an AI-powered recruitment assistant that streamlines hiring through ticket-based applications, async AI screening, document validation, semantic skill matching, and deterministic weighted scoring.
git clone https://github.com/boyblanco/HireBase_Server.git
cd HireBase_Server
python -m venv venv
source venv/bin/activate # Linux/macOS
pip install -r requirements.txt
cp .env.example .env
# Configure NEON_DATABASE_URL, R2_BUCKET, and GROQ_API_KEY
alembic upgrade head
uvicorn app.main:app --reload
HiringBase follows a Domain-Driven Design (DDD) approach with a clean, layered architecture:
app/
├── ai/ # AI Engine (OCR, Matcher, Scoring, LLM)
├── core/ # System Core (Config, Auth, Database, Exceptions)
├── shared/ # Global Resources (Enums, Schemas, Constants)
├── features/ # Feature-based Modules (Auth, Jobs, Applications, etc.)
│   └── <domain>/
│       ├── routers/ # API Endpoints
│       ├── services/ # Business Logic
│       ├── repositories/ # Data Access
│       ├── schemas/ # Pydantic Models
│       └── models/ # SQLAlchemy ORM Models
├── workers/ # Async Background Tasks
└── main.py # Entry Point
We use a 3-layer intelligence strategy to ensure both accuracy and explainability.
| Layer | Technology | Purpose |
|---|---|---|
| Layer 1: Deterministic | Rule Engine | Knockout rules, scoring formula, ranking. |
| Layer 2: Semantic | Hugging Face Inference API + keyword logic | Semantic skill matching, synonym detection, soft skill baseline. |
| Layer 3: Reasoning | Groq + Mistral OCR | Document validation, red-flag analysis, HR explanation generation. |
Important
Scoring Philosophy: The LLM is never used to compute the final score. Scores are calculated using a deterministic, weighted formula based on verified form data.
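A minimal sketch of what a deterministic weighted score can look like. The component names, the weights, and the linear 1-5 to 0-100 mapping are illustrative assumptions; this README does not list the actual values used by HiringBase.

```python
# Hypothetical component names and weights; the real ones are not given in this README.
WEIGHTS = {"skills": 0.40, "experience": 0.30, "education": 0.20, "soft_skills": 0.10}


def anchor_to_score(rating: int) -> float:
    """Map an anchored 1-5 rating onto 0-100 (assumed linear: 1 -> 0, 5 -> 100)."""
    if not 1 <= rating <= 5:
        raise ValueError(f"rating must be between 1 and 5, got {rating}")
    return (rating - 1) * 25.0


def weighted_score(ratings: dict[str, int], weights: dict[str, float] = WEIGHTS) -> float:
    """Combine component ratings into one 0-100 score using fixed weights.

    No LLM is involved: the same ratings always produce the same score.
    """
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same components")
    total_weight = sum(weights.values())
    score = sum(weights[name] * anchor_to_score(r) for name, r in ratings.items())
    # Normalise by the total weight so the result stays on a 0-100 scale.
    return round(score / total_weight, 2)


if __name__ == "__main__":
    print(weighted_score({"skills": 5, "experience": 4, "education": 3, "soft_skills": 2}))
```

Because the formula is a plain weighted average, each component's contribution can be shown to HR alongside the final number.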
Public apply
-> save applicant + answers + documents + ticket
-> status = APPLIED
-> no direct AI call
Manual trigger or hourly batch
-> Redis dedupe + quota guard
-> Taskiq worker
-> DOC_CHECK
-> knockout validation
-> AI_PROCESSING
-> OCR (Mistral)
-> semantic doc validation (Groq)
-> semantic skill match (HF Inference API)
-> anchored component scoring
-> deterministic weighted scoring
-> red flag detection
-> explanation generation
-> CandidateScore saved
-> AI_PASSED / UNDER_REVIEW / REJECTED
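The status changes in the flow above can be sketched as a small transition table. The status names come from this README; the table itself, including the assumption that a failed knockout check moves an application from DOC_CHECK straight to REJECTED, is inferred and may not match the actual service.

```python
from enum import Enum


class ApplicationStatus(str, Enum):
    APPLIED = "APPLIED"
    DOC_CHECK = "DOC_CHECK"
    AI_PROCESSING = "AI_PROCESSING"
    AI_PASSED = "AI_PASSED"
    UNDER_REVIEW = "UNDER_REVIEW"
    REJECTED = "REJECTED"


S = ApplicationStatus

# Assumed transition table, inferred from the flow; the real rules may differ.
ALLOWED_TRANSITIONS = {
    S.APPLIED: {S.DOC_CHECK},
    S.DOC_CHECK: {S.AI_PROCESSING, S.REJECTED},
    S.AI_PROCESSING: {S.AI_PASSED, S.UNDER_REVIEW, S.REJECTED},
}


def advance(current: ApplicationStatus, target: ApplicationStatus) -> ApplicationStatus:
    """Return the new status, or raise if the pipeline does not allow the move."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


if __name__ == "__main__":
    status = S.APPLIED
    for step in (S.DOC_CHECK, S.AI_PROCESSING, S.AI_PASSED):
        status = advance(status, step)
    print(status.value)
```

Keeping the table in one place makes it harder for a worker retry to skip a stage, since every status write goes through the same check.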
- Ticket-Based Public Flow: No applicant login required; track status via TKT-YYYY-NNNNN.
- Custom Form Builder: Create job-specific forms with varied field types and knockout logic.
- Async Screening Pipeline: Taskiq worker + scheduler with Redis quota guard, retry, and stale recovery flow.
- Semantic Document Validation: Mistral OCR + Groq verification for KTP, Ijazah, and SKCK-class admin docs.
- Anchored Scoring Governance: Component ratings of 1-5 are mapped deterministically to a 0-100 scale, with a confidence-based review gate.
- Advanced Auth Security: Stateful JWT with rotation, reuse detection, and global kill-switch.
- Structured Logging: JSON-based logging via structlog for easy monitoring.
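As an illustration of the TKT-YYYY-NNNNN ticket format, the sketch below formats and parses ticket IDs. The function names and the idea of a per-year sequence number are assumptions; how HiringBase actually allocates ticket numbers is not described here.

```python
import re
from datetime import date
from typing import Optional, Tuple

# TKT-YYYY-NNNNN: four-digit year, five-digit zero-padded sequence.
TICKET_PATTERN = re.compile(r"TKT-([0-9]{4})-([0-9]{5})")


def format_ticket(sequence: int, year: Optional[int] = None) -> str:
    """Build a ticket ID from a sequence number (hypothetical helper)."""
    if not 1 <= sequence <= 99999:
        raise ValueError("sequence must fit in five digits")
    if year is None:
        year = date.today().year
    return f"TKT-{year:04d}-{sequence:05d}"


def parse_ticket(ticket: str) -> Tuple[int, int]:
    """Return (year, sequence) or raise ValueError for a malformed ticket."""
    match = TICKET_PATTERN.fullmatch(ticket.strip().upper())
    if match is None:
        raise ValueError(f"not a valid ticket ID: {ticket!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    ticket = format_ticket(42, year=2025)
    print(ticket, parse_ticket(ticket))
```

Validating the format before any database lookup lets the public tracking endpoint reject malformed input cheaply.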
- Core: FastAPI (Async), Pydantic v2
- Persistence: PostgreSQL, SQLAlchemy 2.0 (Async), Alembic
- AI/ML: Mistral OCR API, Hugging Face Inference API, Groq Cloud
- Queue/Cache: Upstash Redis, Taskiq
- Infrastructure: Cloudflare R2 (S3-compatible storage)
- Security: python-jose, passlib (bcrypt)
- V2: WhatsApp/Email auto-notifications for candidate status updates.
- V2: Computer Vision for advanced fake document detection.
- V2: Integrated Interview Scheduler with Google/Outlook Calendar.
- V2: Multi-tenant Dashboard for Super Admins.
pytest app/tests -v
Currently, there are 107+ unit and integration tests covering:
- Auth (JWT, password hashing, refresh token rotation)
- AI Scoring (semantic matcher, soft skill scorer, red flag detector)
- Knockout Rules (all rule types with mocks)
- Screening Pipeline (DOC_FAILED, KNOCKOUT, AI_PASSED, UNDER_REVIEW)
- Document Validation (OCR + LLM pipeline)
- HR Workflows (vacancy lifecycle, screening, tenant isolation)
- Public Application Flow (ticket tracking, duplicate detection)
- Password Hashing: Bcrypt via Passlib
- JWT Security: Stateful tokens with rotation, reuse detection, and kill-switch
- Rate Limiting: 60 requests per minute per IP
- File Validation: Type and size checks for uploads
- SQL Injection Protection: Parameterized queries via the SQLAlchemy ORM
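To make the 60-requests-per-minute-per-IP limit concrete, here is a minimal in-memory sliding-window sketch. The class name and the in-process storage are assumptions for illustration; the README does not say how the limiter is implemented, and a multi-worker deployment would more likely keep the counters in a shared store such as Redis.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window_seconds`."""

    def __init__(
        self,
        limit: int = 60,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        now = self._clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    results = [limiter.allow("203.0.113.7") for _ in range(61)]
    print(results.count(True), results.count(False))
```

In a FastAPI app, a check like `limiter.allow(request.client.host)` could run in middleware or a dependency and return HTTP 429 when it fails.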
- FastAPI: 0.115.0
- Pydantic: v2
- SQLAlchemy: 2.0 (Async)
- Alembic: Latest
- Groq: For document validation, explanation, soft skill enhancement, and red-flag analysis
- Mistral OCR: For OCR extraction from PDF/image documents
- Hugging Face Inference API: For semantic skill matching
MIT License. See LICENSE for more information.