Skip to content

Commit 2dcda8b

Browse files
authored
test: add unit tests for granite formatters (#812) (#818)
* test: add 199 unit tests for granite formatters (#812) Unit tests for Granite 3.2 and 3.3 input/output processors, shared utilities, IntrinsicsResultProcessor canned data regression, and OpenAI SDK compatibility. No GPU, network, or model downloads required. - test_granite3_shared.py: find_substring_in_text, create_dict, parse_hallucinations_text, hallucination/citation span helpers - test_granite32_output.py: citation parsing, model output splitting, validation, transform pipeline - test_granite33_output.py: JSON-delimited citations/hallucinations, think/response extraction, controls cleanup - test_granite32_input.py: system message matrix, sanitize, transform - test_granite33_input.py: available_tools role, per-document roles - test_intrinsics_canned_output.py: canned model outputs through IntrinsicsResultProcessor, Pydantic schema validation of fixtures - test_openai_compat.py: ChatCompletion round-trip through OpenAI SDK Closes #812 * test: remove test_empty_substring (tests stdlib not our code) * fix: nltk_check() now catches LookupError for missing punkt_tab data nltk_check() only caught ImportError (package not installed) but missed LookupError (package installed, data not downloaded). Users with nltk present via transitive deps (rouge_score) got an unhelpful ValueError instead of install instructions when punkt_tab was missing. Also adds punkt_tab download to CI workflow — same pattern as Ollama model pulls. * fix: declare nltk as explicit dependency nltk is required by granite citation/hallucination parsing (nltk.sent_tokenize) but was only present as a transitive dependency of rouge_score. Pin >=3.9 for punkt_tab support (security fix over pickle-based punkt). * fix: code review cleanups for granite formatter tests - Fix granite_io → mellea in nltk_check error message (copy-paste from upstream) - Consolidate duplicate nltk_check into optional.py (single source of truth) - Remove unused imports in test_granite32_output.py - Fix assertion logic in test_balanced_tags_no_warnings (or → and) * test: harden granite test robustness - Fix invalid TCP port 98765 → 127.0.0.1:1 in OpenAI compat test - Add assertions that testdata directories are non-empty (fail loudly instead of silently collecting zero test cases) - Add unit tests for nltk_check LookupError handling * fix: address PR #818 review comments - Replace stale granite_io reference with mellea in optional.py docstring - Use record-level caplog assertions in granite32 output tests for consistency with granite33 tests * fix: address code review findings - Fix stale granite_io reference in util.py import_optional docstring - Tighten test_missing_colon_skipped assertion to verify empty result
1 parent 25745b7 commit 2dcda8b

14 files changed

Lines changed: 1794 additions & 36 deletions

File tree

.github/workflows/quality.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,8 @@ jobs:
4141
- name: Send failure message pre-commit
4242
if: failure() # This step will only run if a previous step failed
4343
run: echo "The quality verification failed. Please run precommit "
44+
- name: Download NLTK data
45+
run: uv run python -m nltk.downloader punkt_tab
4446
- name: Install Ollama
4547
run: curl -fsSL https://ollama.com/install.sh | sh
4648
- name: Start serving ollama

mellea/formatters/granite/base/optional.py

Lines changed: 11 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ def import_optional(extra_name: str):
2727
2828
Args:
2929
extra_name: Package extra to suggest in the install hint
30-
(e.g. ``pip install granite_io[extra_name]``).
30+
(e.g. ``pip install mellea[extra_name]``).
3131
"""
3232
try:
3333
yield
@@ -49,14 +49,21 @@ def nltk_check(feature_name: str):
4949
feature_name: Name of the feature that requires NLTK, used in the error message.
5050
5151
Raises:
52-
ImportError: If the ``nltk`` package is not installed, re-raised with
53-
a descriptive message and installation instructions.
52+
ImportError: If the ``nltk`` package is not installed or required
53+
NLTK data (e.g. ``punkt_tab``) has not been downloaded,
54+
re-raised with a descriptive message and installation
55+
instructions.
5456
"""
5557
try:
5658
yield
5759
except ImportError as err:
5860
raise ImportError(
5961
f"'nltk' package not installed. This package is required for "
60-
f"{feature_name} in the 'granite_io' library."
62+
f"{feature_name} in the 'mellea' library."
63+
f"{_NLTK_INSTALL_INSTRUCTIONS}"
64+
) from err
65+
except LookupError as err:
66+
raise ImportError(
67+
f"NLTK data required for {feature_name} is not installed."
6168
f"{_NLTK_INSTALL_INSTRUCTIONS}"
6269
) from err

mellea/formatters/granite/base/util.py

Lines changed: 1 addition & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -23,22 +23,14 @@
2323
# First Party
2424
from .types import ChatCompletionResponse, ChatCompletionResponseChoice
2525

26-
NLTK_INSTALL_INSTRUCTIONS = """
27-
Please install nltk with:
28-
pip install nltk
29-
In some environments you may also need to manually download model weights with:
30-
python -m nltk.downloader punkt_tab
31-
See https://www.nltk.org/install.html#installing-nltk-data for more detailed
32-
instructions."""
33-
3426

3527
@contextlib.contextmanager
3628
def import_optional(extra_name: str):
3729
"""Handle optional imports.
3830
3931
Args:
4032
extra_name: Package extra to suggest in the install hint
41-
(e.g. ``pip install granite_io[extra_name]``).
33+
(e.g. ``pip install mellea[extra_name]``).
4234
"""
4335
try:
4436
yield
@@ -52,27 +44,6 @@ def import_optional(extra_name: str):
5244
raise
5345

5446

55-
@contextlib.contextmanager
56-
def nltk_check(feature_name: str):
57-
"""Variation on import_optional for nltk.
58-
59-
Args:
60-
feature_name: Name of the feature that requires NLTK, used in the error message.
61-
62-
Raises:
63-
ImportError: If the ``nltk`` package is not installed, re-raised with
64-
a descriptive message and installation instructions.
65-
"""
66-
try:
67-
yield
68-
except ImportError as err:
69-
raise ImportError(
70-
f"'nltk' package not installed. This package is required for "
71-
f"{feature_name} in the 'granite_io' library."
72-
f"{NLTK_INSTALL_INSTRUCTIONS}"
73-
) from err
74-
75-
7647
def find_substring_in_text(substring: str, text: str) -> list[dict]:
7748
"""Find all substring matches in text.
7849

mellea/formatters/granite/granite3/granite32/output.py

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -40,8 +40,9 @@
4040

4141
# First Party
4242
from ...base.io import OutputProcessor
43+
from ...base.optional import nltk_check
4344
from ...base.types import AssistantMessage, ChatCompletion, Document, ToolCall
44-
from ...base.util import find_substring_in_text, nltk_check, random_uuid
45+
from ...base.util import find_substring_in_text, random_uuid
4546
from ...granite3.output import (
4647
add_citation_context_spans,
4748
add_hallucination_response_spans,

mellea/formatters/granite/granite3/granite33/output.py

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,8 +25,9 @@
2525

2626
# First Party
2727
from ...base.io import OutputProcessor
28+
from ...base.optional import nltk_check
2829
from ...base.types import AssistantMessage, ChatCompletion, ToolCall
29-
from ...base.util import nltk_check, random_uuid
30+
from ...base.util import random_uuid
3031
from ...granite3.output import (
3132
add_citation_context_spans,
3233
add_hallucination_response_spans,

pyproject.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@ dependencies = [
2828
"mistletoe>=1.4.0",
2929
"pillow", # Needed for Intrinsics (HF and OpenAI Backends).
3030
"math_verify", # Needed for Majority Voting Sampling Strategies.
31+
"nltk>=3.9", # Needed for sentence tokenization in granite citation parsing.
3132
"rouge_score", # Needed for Majority Voting Sampling Strategies.
3233
"PyYAML", # Needed for backends/adapters and granite formatters.
3334
"packaging", # Needed for server type helpers.

0 commit comments

Comments
 (0)