feat: Add example rows to ValidationError for all rule failures #286

Open
Copilot wants to merge 11 commits into streamin from copilot/enhance-validation-error-message
Contributor

Copilot AI commented Feb 28, 2026

Motivation

Closes #282.

Changes

This PR adds a new option max_failure_examples to the config object. Users can set, for example,

dy.Config.set_max_failure_examples(3)

to show up to three example rows for each failing rule whenever validation is run. Examples are shown for both lazy and eager validation.


Side note for reviewers: copilot was rather unhelpful here 😅

…and examples

Co-authored-by: borchero <22455425+borchero@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Improve validation error messaging for primary key failures" to "feat: Add duplicate key details to primary_key ValidationError" Feb 28, 2026
Member

  • Adjust the format_rule_failures method to optionally include examples. Possibly add a helper method to FailureInfo if that is helpful.
  • Do not only provide examples for primary key violations but for any type of validation
  • Adjust the Rust plugin so that the same information is generated during lazy execution

…paths

Co-authored-by: borchero <22455425+borchero@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 28, 2026

  • Adjust the format_rule_failures method to optionally include examples. Possibly add a helper method to FailureInfo if that is helpful.
  • Do not only provide examples for primary key violations b...

Addressed in commit 462f8ce:

  1. format_rule_failures extended: Now accepts an optional examples: dict[str, list[str]] | None parameter. Each value is a list of pre-formatted example row strings. The examples are shown for both schema-level and column-level rules.

  2. Examples for all rule types: Added FailureInfo.examples() helper method (the suggested helper on FailureInfo) that computes distinct example rows from all data columns for every failing rule. This is used in both schema.py and collection/collection.py eager validation paths.

  3. Lazy execution (Rust plugin): Updated all_rules_required to accept data columns as additional inputs after the rule boolean columns (num_rule_columns kwarg tells it where the split is). For each failing rule, the plugin computes up to 5 distinct example rows using AnyValue::Display. The schema.py lazy path now passes cls.column_names() as data columns.
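To make the three points above concrete, here is a minimal pure-Python sketch of the idea: collect a capped number of distinct failing rows per rule, then pass them as pre-formatted strings to a formatter that accepts an optional examples mapping. The names `distinct_examples` and `format_failures_sketch` and their signatures are hypothetical and chosen for illustration only. This is not the code from commit 462f8ce, and it operates on plain dicts rather than data frames.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence
from typing import Any


def distinct_examples(
    rows: Sequence[Mapping[str, Any]],
    valid: Sequence[bool],
    max_examples: int = 5,
) -> list[str]:
    """Return up to `max_examples` distinct failing rows as strings (hypothetical helper)."""
    seen: dict[str, None] = {}  # a dict keeps insertion order and drops duplicates
    for row, ok in zip(rows, valid):
        if len(seen) >= max_examples:
            break
        if not ok:
            seen.setdefault(str(dict(row)), None)
    return list(seen)


def format_failures_sketch(
    failures: Mapping[str, int],
    examples: Mapping[str, list[str]] | None = None,
) -> str:
    """Format failure counts per rule, optionally appending example rows (hypothetical)."""
    lines = [f"{len(failures)} rules failed validation:"]
    for rule, count in failures.items():
        line = f" - '{rule}' failed for {count} rows"
        if examples is not None and rule in examples:
            shown = examples[rule]
            line += f" with {len(shown)} distinct examples: [{', '.join(shown)}]"
        lines.append(line)
    return "\n".join(lines)


# Toy data: the first two rows share the key a=1, so the rule marks them invalid.
rows = [
    {"a": 1, "b": "x", "c": "1"},
    {"a": 1, "b": "y", "c": "2"},
    {"a": 2, "b": "z", "c": "3"},
]
valid = [False, False, True]

examples = {"primary_key": distinct_examples(rows, valid)}
print(format_failures_sketch({"primary_key": 2}, examples))
```

Passing `examples=None` keeps the message in its count-only form, which is why the parameter is optional in this sketch.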

Example output:

# Eager
1 rules failed validation:
 - 'primary_key' failed for 2 rows with 2 distinct examples: [{'a': 1, 'b': 'x', 'c': '1'}, {'a': 1, 'b': 'y', 'c': '2'}]

# Lazy
1 rules failed validation for schema 'MySchema':
 - 'primary_key' failed for 2 rows with 2 distinct examples: [{'a': 1, 'b': "x", 'c': "1"}, {'a': 1, 'b': "y", 'c': "2"}]

Note: string quotes differ slightly between eager (Python str(dict) → single quotes) and lazy (Rust AnyValue::Display → double quotes), but both are readable.
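The quoting difference on the Python side can be reproduced in isolation. The sketch below shows that `str()` on a dict renders string values with single quotes, and uses a hypothetical helper, `render_lazy_style`, to imitate the shape of the lazy output shown above. The helper only mimics that output for illustration; it makes no claim about how the Rust plugin formats values internally.

```python
import json

row = {"a": 1, "b": "x", "c": "1"}

# str() on a dict uses repr() for keys and values, so strings get single quotes.
eager_style = str(row)


def render_lazy_style(row: dict) -> str:
    """Hypothetical: single-quoted keys, double-quoted string values."""
    parts = []
    for key, value in row.items():
        # json.dumps wraps a string in double quotes; other values use str().
        rendered = json.dumps(value) if isinstance(value, str) else str(value)
        parts.append(f"'{key}': {rendered}")
    return "{" + ", ".join(parts) + "}"


print(eager_style)
print(render_lazy_style(row))
```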

Copilot AI changed the title from "feat: Add duplicate key details to primary_key ValidationError" to "feat: Add example rows to ValidationError for all rule failures" Feb 28, 2026
@borchero Oliver Borchert (borchero) changed the base branch from main to streamin May 24, 2026 16:46
@github-actions github-actions Bot added the enhancement label May 24, 2026

codecov Bot commented May 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (c79baae) to head (3923110).

Additional details and impacted files
@@            Coverage Diff             @@
##           streamin      #286   +/-   ##
==========================================
  Coverage    100.00%   100.00%           
==========================================
  Files            56        56           
  Lines          3427      3419    -8     
==========================================
- Hits           3427      3419    -8     

