Build classification layer with deterministic baseline and optional LLM provider #6

Open
Spbd1 wants to merge 1 commit into codex/build-deterministic-analysis-core from codex/build-classification-layer-for-offline-use

Spbd1 (Owner) commented May 18, 2026

Motivation

  • Provide a taxonomy-grounded classification layer that runs deterministically offline but can optionally call a configured LLM provider.
  • Ensure classifications are conservative, evidence-grounded, and do not invent taxonomy labels or overreach on intent/truth judgments.

Description

  • Added an orchestration classifier with RiskAssessment, ClassificationResult, and ArgumentRiskClassifier in engine/argument_risk_engine/classification/classifier.py that supports deterministic_baseline and llm modes.
  • Implemented a conservative deterministic baseline in engine/argument_risk_engine/classification/deterministic.py that uses retrieval candidates, exact evidence-span matching, exclusion checks, healthy-suppressor logic, confidence thresholds, and severity guarding, and caps the output at 3 risks for short claims.
  • Implemented strict LLM integration in engine/argument_risk_engine/classification/llm_client.py and prompt construction in engine/argument_risk_engine/classification/prompts.py. The prompts supply only candidate taxonomy entries, require JSON output, and forbid inventing labels or classifying without exact textual evidence.
  • Added validation and safe fallback behavior: LLM outputs are checked against the supplied candidates and exact evidence spans, malformed or failing LLM responses produce warnings, and, when fallback is enabled, the deterministic baseline is used instead. Tests were added in tests/test_classifier.py.
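
For readers who want a concrete picture of the deterministic baseline described above, here is a minimal sketch. It is not the code in deterministic.py: the names Candidate, Risk, and classify_baseline, the field layout, the 0.6 threshold, the 25-word definition of a "short" claim, and the taxonomy labels in the usage example are all assumptions made for illustration. Only the cap of 3 risks for short claims comes from the description. Healthy-suppressor logic and severity guarding are left out because the description does not say how they work.

```python
from dataclasses import dataclass, field

# Hypothetical values: the PR does not state the real threshold
# or what counts as a "short" claim.
MIN_CONFIDENCE = 0.6
SHORT_CLAIM_WORDS = 25
MAX_RISKS_SHORT = 3  # the cap of 3 comes from the PR description


@dataclass
class Candidate:
    """A retrieved taxonomy entry (hypothetical shape)."""
    label: str
    cues: list[str]                                   # phrases that count as evidence
    exclusions: list[str] = field(default_factory=list)
    score: float = 0.0                                # retrieval confidence


@dataclass
class Risk:
    label: str
    evidence: str       # exact span copied from the input text
    confidence: float


def classify_baseline(text: str, candidates: list[Candidate]) -> list[Risk]:
    """Keep only candidates backed by an exact span in the input text."""
    risks: list[Risk] = []
    for cand in candidates:
        if cand.score < MIN_CONFIDENCE:
            continue  # confidence threshold
        if any(phrase in text for phrase in cand.exclusions):
            continue  # exclusion check
        for cue in cand.cues:
            start = text.find(cue)  # exact, case-sensitive match
            if start != -1:
                span = text[start:start + len(cue)]
                risks.append(Risk(cand.label, span, cand.score))
                break
    risks.sort(key=lambda r: r.confidence, reverse=True)
    if len(text.split()) <= SHORT_CLAIM_WORDS:
        risks = risks[:MAX_RISKS_SHORT]
    return risks


if __name__ == "__main__":
    claim = "Everyone knows this is true, so only a fool would disagree."
    found = classify_baseline(claim, [
        Candidate("appeal_to_popularity", ["Everyone knows"], score=0.8),
        Candidate("ad_hominem", ["only a fool"], score=0.7),
    ])
    for risk in found:
        print(risk.label, repr(risk.evidence), risk.confidence)
```

Because every reported risk carries a span copied out of the input, a label with no verbatim evidence can never be emitted, which is the conservative behavior the Motivation section asks for.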

Testing

  • Ran linter checks with python -m ruff check engine/argument_risk_engine/classification tests/test_classifier.py and applied the fixes; the check then completed successfully.
  • Ran the full test suite with pytest -q; all tests passed (31 passed, 3 warnings), including the new classifier tests that exercise offline deterministic behavior, LLM candidate restrictions, evidence-span validation, provider-failure fallback, and malformed-output handling.
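
The validation and fallback paths those tests exercise could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of classifier.py or llm_client.py: the function names, the call_llm and baseline callables, and the JSON shape (a "risks" list of objects with "label" and "evidence") are hypothetical. The mode names deterministic_baseline and llm are taken from the description. The sketch also assumes that a well-formed response with some rejected items stays in llm mode and only records warnings.

```python
import json


def parse_and_validate(raw, text, allowed_labels):
    """Return (accepted_risks, warnings); raise ValueError on malformed output."""
    payload = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(payload, dict) or not isinstance(payload.get("risks"), list):
        raise ValueError("expected a JSON object with a 'risks' list")
    accepted, warnings = [], []
    for item in payload["risks"]:
        if not isinstance(item, dict):
            warnings.append("skipped a risk entry that is not an object")
            continue
        label, evidence = item.get("label"), item.get("evidence")
        if not isinstance(label, str) or label not in allowed_labels:
            warnings.append(f"rejected label not in supplied candidates: {label!r}")
        elif not isinstance(evidence, str) or not evidence or evidence not in text:
            warnings.append(f"rejected {label!r}: evidence is not an exact span")
        else:
            accepted.append({"label": label, "evidence": evidence})
    return accepted, warnings


def classify_with_fallback(text, allowed_labels, call_llm, baseline,
                           fallback_enabled=True):
    """Try the LLM path; on provider failure or malformed output, fall back."""
    try:
        raw = call_llm(text, sorted(allowed_labels))  # hypothetical provider hook
        risks, warnings = parse_and_validate(raw, text, allowed_labels)
        return {"mode": "llm", "risks": risks, "warnings": warnings}
    except Exception as exc:  # provider error or malformed response
        warnings = [f"LLM path failed: {exc}"]
    if fallback_enabled:
        return {"mode": "deterministic_baseline",
                "risks": baseline(text), "warnings": warnings}
    return {"mode": "llm", "risks": [], "warnings": warnings}


if __name__ == "__main__":
    def broken_provider(text, labels):
        raise RuntimeError("provider unavailable")

    result = classify_with_fallback(
        "Everyone knows this is true.",
        {"appeal_to_popularity"},
        broken_provider,
        baseline=lambda text: [],
    )
    print(result["mode"], result["warnings"])
```

Passing the provider and the baseline in as plain callables keeps the sketch testable offline: a test can hand in a stub that raises or returns malformed text and assert on the resulting mode and warnings.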

Codex Task
