kosha-discovery — कोश

AI Model & Provider Discovery Registry


v0.8.0 — adds tokenizer-family inference, tool-dialect & structured-output mode heuristics, multimodal pricing (image / audio / video / character / long-context tiers), deprecation metadata, and Llama 4 support. ModelCard gains toolDialect, structuredOutputModes, supportsParallelToolCalls, status, deprecationDate, and replacedBy. New ProviderDescriptor.minCachePrefixTokens exposes the provider's prompt-cache floor. Hono bumped to 4.12.14 (closes 7 security advisories).

Kosha (कोश — treasury/repository) automatically discovers AI models across providers, resolves credentials from CLI tools and environment variables, enriches models with pricing data, and exposes the catalog via library, CLI, and HTTP API.

Why

AI applications hardcode model IDs, pricing, and provider configs. When providers add models or change pricing, every app breaks. Kosha solves this:

  • Dynamic discovery — fetches real model lists from provider APIs
  • Offline direct catalogs — bundled OpenAI, Anthropic, and Google catalogs keep model coverage available even without API keys
  • Smart credentials — finds API keys from env vars, CLI tools (Claude, Copilot, Gemini CLI), and config files
  • Pricing enrichment — fills in input/output/reasoning/cache/batch costs and context windows from litellm's community-maintained dataset
  • Proxy vs origin pricing — preserves route pricing and exposes origin-provider reference pricing for proxy-served models
  • Persistent cache + portable manifest — 24h on-disk cache at ~/.kosha/cache, plus a stable v1 JSON manifest at ~/.kosha/registry.json that any language or tool can read directly
  • Model aliases — sonnet → claude-sonnet-4-20250514, updated as models evolve
  • Role matrix — query provider → model → roles (chat, embedding, image_generation, etc.)
  • Cheapest routing — rank cheapest eligible models for tasks like embeddings or image generation
  • Local LLM scanning — detects Ollama models alongside cloud providers
  • Three access patterns — use as a library, CLI tool, or HTTP API
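
The "cheapest routing" idea above can be sketched in a few lines. This is an illustrative model of the ranking, not Kosha's implementation: the field names (modelId, roles, pricing.inputPerMillion/outputPerMillion) follow the README's own examples, but the blended input+output cost metric is an assumption.

```python
# Hypothetical sample entries shaped like Kosha's model records.
models = [
    {"modelId": "text-embed-a", "roles": ["embedding"],
     "pricing": {"inputPerMillion": 0.02, "outputPerMillion": 0.0}},
    {"modelId": "chat-large", "roles": ["chat"],
     "pricing": {"inputPerMillion": 3.0, "outputPerMillion": 15.0}},
    {"modelId": "text-embed-b", "roles": ["embedding"],
     "pricing": {"inputPerMillion": 0.13, "outputPerMillion": 0.0}},
]

def cheapest(models, role, limit=3):
    """Rank role-eligible models by a simple blended per-million cost."""
    eligible = [m for m in models if role in m["roles"]]
    cost = lambda m: m["pricing"]["inputPerMillion"] + m["pricing"]["outputPerMillion"]
    return [m["modelId"] for m in sorted(eligible, key=cost)[:limit]]

print(cheapest(models, "embedding"))  # → ['text-embed-a', 'text-embed-b']
```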

Install

npm install kosha-discovery      # library or HTTP server
npm install -g kosha-discovery   # global `kosha` CLI

Or with pnpm:

pnpm add kosha-discovery

Getting Started (CLI)

# 1. First run — discovers all reachable providers and writes the cache + manifest
kosha discover
# → Anthropic: 3 models, OpenAI: 7 models, ...
# → Cached to ~/.kosha/cache  ·  Manifest: ~/.kosha/registry.json

# 2. Subsequent commands read instantly from the 24h on-disk cache
kosha list
# → Loaded 380 models from cache (9h ago). Run "kosha update" to refresh.

# 3. Force a fresh pull from all provider APIs
kosha update     # alias for `kosha refresh`

After any discovery, a stable, third-party-readable manifest is written to ~/.kosha/registry.json. It holds the full v1 snapshot — providers, models, pricing, capabilities, and health — in a documented schema. Any tool that can read JSON can consume it:

jq '.models[] | select(.pricing.inputPerMillion < 0.1) | .modelId' ~/.kosha/registry.json

import json, pathlib
data = json.loads(pathlib.Path("~/.kosha/registry.json").expanduser().read_text())
print(len(data["models"]), "models from", len(data["providers"]), "providers")
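
The same jq filter translates directly to Python. This sketch runs against an inline sample using the manifest's field names (models[].pricing.inputPerMillion); point it at ~/.kosha/registry.json for real data.

```python
import json

# Inline stand-in for ~/.kosha/registry.json with the documented v1 shape.
manifest = json.loads("""{
  "version": 1,
  "providers": [{"id": "example"}],
  "models": [
    {"modelId": "mini-model", "pricing": {"inputPerMillion": 0.05}},
    {"modelId": "big-model",  "pricing": {"inputPerMillion": 3.0}}
  ]
}""")

# Keep models cheaper than $0.10 per million input tokens.
cheap = [m["modelId"] for m in manifest["models"]
         if m["pricing"]["inputPerMillion"] < 0.1]
print(cheap)  # → ['mini-model']
```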

Development (pnpm)

pnpm install
pnpm run build
pnpm run test

Quick Start

Library

import { createKosha } from "kosha-discovery";

const kosha = await createKosha();

const models = kosha.models();                           // all models
const embeddings = kosha.models({ mode: "embedding" });  // filter by mode
const model = kosha.model("sonnet");                     // resolve alias
const cheapest = kosha.cheapestModels({ role: "image", limit: 3 });

console.log(model.pricing); // { inputPerMillion: 3, outputPerMillion: 15, ... }

CLI

kosha discover                          # discover all providers (writes cache + manifest)
kosha list                              # list models (instant from cache)
kosha list --provider anthropic         # filter by provider
kosha search gemini                     # fuzzy search
kosha model sonnet                      # model details
kosha cheapest --role embeddings        # cheapest for a task
kosha routes gpt-4o                     # all provider routes
kosha providers                         # provider status
kosha update                            # force re-discover (alias: refresh)
kosha latest                            # force-fetch latest provider/model details
kosha latest --provider openai          # latest for one provider
kosha serve --port 3000                 # start HTTP API

Results live at ~/.kosha/cache (24h TTL) and ~/.kosha/registry.json (stable v1 manifest). See docs/cli.md for the full reference.

Auto-Fetch JSON Snapshot

# one-shot latest snapshot
pnpm run autofetch:once

# custom output/provider
pnpm run autofetch:once -- --provider openai --output ./data/openai-latest.json

# continuous loop (default 3600s)
pnpm run autofetch -- --interval-seconds 900

By default this writes JSON to ./data/kosha-latest.json.

CI / Smoke Checks

Workflow files:

  • .github/workflows/update-kosha-snapshot.yml
  • .github/workflows/provider-smoke.yml

Snapshot workflow:

  • Scheduled: weekly (Monday 06:00 UTC)
  • Manual: GitHub UI -> Actions -> Update Kosha Snapshot -> Run workflow
  • Optional manual inputs:
    • provider (empty = all providers)
    • output (default data/kosha-latest.json)
  • Disable scheduled runs without removing the workflow by setting KOSHA_SNAPSHOT_SCHEDULE_ENABLED=false
  • Manual Run workflow still works even when schedule is disabled

Provider smoke workflow:

  • Scheduled: nightly (03:00 UTC)
  • Manual: GitHub UI -> Actions -> Provider Smoke Checks -> Run workflow
  • Runs only when repository variable KOSHA_PROVIDER_SMOKE_ENABLED=true
  • Manual dispatch can override that gate with force=true
  • Installs with pnpm, builds, then runs a node inline smoke script against real provider endpoints
  • Providers without the required secrets are skipped instead of failing the job
  • Always uploads artifacts/provider-smoke-report.json
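
The "skipped instead of failing" behavior amounts to an env-var gate per provider. A minimal sketch, assuming a provider runs when any one of its accepted secret variables is set (variable names mirror the secrets list; the gate logic itself is an assumption):

```python
# Map each provider to the env vars that can unlock its smoke check.
PROVIDER_SECRETS = {
    "openai": ["OPENAI_API_KEY"],
    "google": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
    "moonshot": ["MOONSHOT_API_KEY", "KIMI_API_KEY"],
}

def runnable_providers(env):
    """Providers with at least one non-empty secret; the rest are skipped."""
    return [p for p, keys in PROVIDER_SECRETS.items()
            if any(env.get(k) for k in keys)]

print(runnable_providers({"GEMINI_API_KEY": "..."}))  # → ['google']
```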

Provider smoke secrets:

  • OpenAI: OPENAI_API_KEY
  • Google/Gemini: GOOGLE_API_KEY or GEMINI_API_KEY
  • Mistral: MISTRAL_API_KEY
  • DeepSeek: DEEPSEEK_API_KEY
  • Moonshot: MOONSHOT_API_KEY or KIMI_API_KEY
  • GLM: GLM_API_KEY or ZHIPUAI_API_KEY
  • Z.AI: ZAI_API_KEY
  • MiniMax: MINIMAX_API_KEY
  • OpenRouter: optional OPENROUTER_API_KEY
  • Bedrock: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION
  • Vertex AI: GOOGLE_APPLICATION_CREDENTIALS_JSON, GOOGLE_CLOUD_PROJECT

Security controls in these workflows:

  • Snapshot workflow commits only the configured snapshot file path plus its checksum file, never broad git add -A
  • Snapshot workflow validates the generated snapshot against a local JSON schema before commit
  • Snapshot workflow runs a high-signal secret-pattern scan on snapshot output before commit
  • Snapshot workflow writes data/kosha-latest.sha256 alongside the snapshot
  • Snapshot workflow uploads an always-on artifact with run metadata, provider summaries, and failure details
  • Provider smoke workflow never echoes secret values and records a machine-readable JSON report
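
A "high-signal secret-pattern scan" typically means matching a handful of unambiguous key formats rather than guessing at entropy. A hedged sketch — these three patterns are illustrative, not Kosha's actual rule set:

```python
import re

# Narrow, low-false-positive patterns for well-known credential formats.
PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),               # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                  # AWS access key IDs
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private keys
]

def find_secrets(text):
    """Return the patterns that match, so the workflow can fail before commit."""
    return [p.pattern for p in PATTERNS if p.search(text)]

print(find_secrets('{"modelId": "gpt-4o"}'))  # → []
```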

Branch Protection Audit

Workflow file: .github/workflows/branch-protection-check.yml

  • Runs on main and via workflow_dispatch
  • Reads the main branch protection rule through the GitHub API
  • Expects the required status checks list to include Branch Protection Check / audit
  • Uploads a warning artifact instead of failing when the token cannot read branch protection settings

Keep that check name stable when you rename the workflow or job, and update the protection rule whenever you add more required checks.

Where Data Is Stored

  • Git repo: model/provider discovery data is not committed by default.
  • Runtime cache: ~/.kosha/cache/*.json (machine-local, TTL-based).
  • Exported snapshot: only if you run autofetch/autofetch:once with an output file and commit it yourself.
  • Stable manifest: ~/.kosha/registry.json is rewritten after every discovery and holds the full v1 snapshot for third-party consumers.
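
A TTL-based cache check can be as simple as comparing file mtime against the 24-hour window. This is a sketch of the idea, not Kosha's actual freshness logic:

```python
import pathlib
import tempfile
import time

TTL_SECONDS = 24 * 3600  # the README's 24h cache window

def is_fresh(path: pathlib.Path, now=None) -> bool:
    """True when the cache file exists and its mtime is inside the TTL."""
    now = time.time() if now is None else now
    return path.exists() and (now - path.stat().st_mtime) < TTL_SECONDS

# Demonstrate with a throwaway file instead of ~/.kosha/cache.
with tempfile.NamedTemporaryFile(suffix=".json") as f:
    p = pathlib.Path(f.name)
    print(is_fresh(p))                                      # → True
    print(is_fresh(p, now=time.time() + 2 * TTL_SECONDS))   # → False
```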

HTTP API

kosha serve --port 3000
GET /api/models                    — All models (filterable)
GET /api/models/cheapest           — Cheapest ranked models
GET /api/models/:idOrAlias         — Single model
GET /api/models/:idOrAlias/routes  — All provider routes
GET /api/roles                     — Provider → model → roles matrix
GET /api/providers                 — All providers
POST /api/refresh                  — Re-discover
GET /health                        — Health check
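
A minimal client against a locally running kosha serve --port 3000 only needs the standard library. The query-parameter names (role, limit) are assumptions inferred from the CLI flags, not documented API parameters:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def cheapest_url(base: str, role: str, limit: int = 3) -> str:
    """Build the /api/models/cheapest URL with assumed query parameters."""
    return f"{base}/api/models/cheapest?{urlencode({'role': role, 'limit': limit})}"

url = cheapest_url("http://localhost:3000", "embeddings")
print(url)  # → http://localhost:3000/api/models/cheapest?role=embeddings&limit=3

# data = json.load(urlopen(url))  # uncomment with the server running
```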

Supported Providers

| Provider        | Discovery            | Credential Sources                                   |
|-----------------|----------------------|------------------------------------------------------|
| Anthropic       | API (/v1/models)     | ANTHROPIC_API_KEY, Claude CLI, Codex CLI             |
| OpenAI          | API (/v1/models)     | OPENAI_API_KEY, GitHub Copilot tokens                |
| Google          | API (/v1beta/models) | GOOGLE_API_KEY, GEMINI_API_KEY, Gemini CLI, gcloud   |
| AWS Bedrock     | SDK → CLI → static   | AWS_ACCESS_KEY_ID, ~/.aws/credentials, SSO, IAM      |
| Vertex AI       | API + gcloud         | GOOGLE_APPLICATION_CREDENTIALS, gcloud ADC           |
| Ollama          | Local API            | None needed (local)                                  |
| OpenRouter      | API                  | OPENROUTER_API_KEY (optional)                        |
| NVIDIA          | API                  | NVIDIA_API_KEY                                       |
| Together AI     | API                  | TOGETHER_API_KEY                                     |
| Fireworks AI    | API                  | FIREWORKS_API_KEY                                    |
| Groq            | API                  | GROQ_API_KEY                                         |
| Mistral AI      | API                  | MISTRAL_API_KEY                                      |
| DeepInfra       | API                  | DEEPINFRA_API_KEY                                    |
| Cohere          | API                  | CO_API_KEY                                           |
| Cerebras        | API                  | CEREBRAS_API_KEY                                     |
| Perplexity      | API                  | PERPLEXITY_API_KEY                                   |
| DeepSeek        | API                  | DEEPSEEK_API_KEY                                     |
| Moonshot (Kimi) | API                  | MOONSHOT_API_KEY / KIMI_API_KEY                      |
| GLM (Zhipu)     | API                  | GLM_API_KEY / ZHIPUAI_API_KEY                        |
| Z.AI            | API                  | ZAI_API_KEY                                          |
| MiniMax         | API                  | MINIMAX_API_KEY                                      |

Security

All external data (API responses, CLI output, cache reads) is scanned for nine threat types before use: credential leaks, base64 payloads, script injection, shell injection, data URIs, null bytes, prototype pollution, hex blobs, and oversized strings. A pre-commit hook blocks secrets at commit time.
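
To make one of these checks concrete, here is a hedged sketch of a prototype-pollution guard over parsed JSON — the shape of such a check, not Kosha's scanner:

```python
# Keys that let attacker-controlled JSON mutate Object.prototype in JS consumers.
DANGEROUS_KEYS = {"__proto__", "constructor", "prototype"}

def has_pollution(obj) -> bool:
    """Recursively flag dicts containing prototype-pollution key names."""
    if isinstance(obj, dict):
        return any(k in DANGEROUS_KEYS or has_pollution(v)
                   for k, v in obj.items())
    if isinstance(obj, list):
        return any(has_pollution(v) for v in obj)
    return False

print(has_pollution({"model": {"__proto__": {"admin": True}}}))  # → True
print(has_pollution({"model": "gpt-4o"}))                        # → False
```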

See docs/security.md for the full threat catalogue and architecture.

Architecture

Kosha Architecture

┌─────────────────────────────────────────────────────┐
│                  Your Application                    │
│        import { createKosha } from "kosha"          │
└───────────────────────┬─────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────┐
│                  ModelRegistry                        │
│  models() · providerRoles() · cheapestModels()       │
└──┬──────────┬──────────────┬───────────────┬────────┘
   │          │              │               │
┌──▼───┐ ┌───▼────────┐ ┌───▼──────────┐ ┌──▼─────────┐
│Alias │ │ Discovery   │ │ Enrichment   │ │ Resilience  │
│System│ │ Layer       │ │ Layer        │ │ Layer       │
└──────┘ └───┬────────┘ └──────┬───────┘ └────────────┘
             │                 │          CircuitBreaker
    ┌────────┼────────┐        │          HealthTracker
    ▼        ▼        ▼        ▼          StaleCachePolicy
 Direct   OpenAI-   Cloud      litellm
  API    Compatible  Proxies    JSON

Documentation

| Doc               | What's in it                                                 |
|-------------------|--------------------------------------------------------------|
| Credentials       | Setup for all 21 providers (env vars, CLI tools, config files) |
| CLI Reference     | All commands, flags, and example output                      |
| HTTP API          | All endpoints, parameters, and response schemas              |
| Configuration     | Aliases, routing, pricing enrichment, programmatic config    |
| Architecture      | Discovery flow, module map, data pipeline, adding providers  |
| Resilience        | Circuit breakers, stale cache fallback, health monitoring    |
| Security          | Threat catalogue, runtime scanning, pre-commit hook          |
| Discovery Plane v1 | Stable daemon contract (deltas, SSE watch, binding hints)   |

Release & Tagging

Package: @sriinnu/kosha-discovery

This repo uses a human-in-the-loop release flow:

  1. Update version in package.json and lockfiles locally.
  2. Create a signed tag and push it:
    git tag -s v0.6.0 -m "v0.6.0"
    git push origin v0.6.0
  3. In GitHub Actions, run Manual Release (Tag + npm) and provide tag=v0.6.0.
  4. Workflow verifies tag/version match, builds/tests, then publishes to npm (if enabled) and creates a GitHub Release.
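
The tag/version gate in step 4 reduces to one comparison: the pushed tag must equal "v" plus the package.json version. A sketch of that check (the workflow's real implementation may differ):

```python
import json

def tag_matches(tag: str, package_json: str) -> bool:
    """True when the git tag equals 'v' + the package.json version."""
    version = json.loads(package_json)["version"]
    return tag == f"v{version}"

pkg = '{"name": "@sriinnu/kosha-discovery", "version": "0.6.0"}'
print(tag_matches("v0.6.0", pkg))  # → True
print(tag_matches("v0.7.0", pkg))  # → False
```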

Required secret for publish:

  • NPM_TOKEN (publish rights for @sriinnu scope)

Credits

  • litellm — Community-maintained model pricing database
  • openrouter — Model aggregation API
  • ollama — Local LLM runtime
  • chitragupta — Autonomous AI Agent Platform whose registry patterns inspired kosha
  • takumi — AI coding agent TUI whose routing needs drove kosha's creation

What "Kosha" Means

Kosha comes from Sanskrit — a container, treasury, or layered sheath of knowledge. Fittingly, it is a standalone model-discovery utility for any AI system.

License

MIT
