feat: Proposed SIMBAUQ Sampling Strategy by radum2275 · Pull Request #785 · generative-computing/mellea

radum2275 · 2026-04-03T16:39:57Z

Sampling Strategy PR

Use this template when adding or modifying sampling strategies in mellea/stdlib/sampling/.

Description

Link to Issue: Fixes Proposal: Integrating Similarity‑Based Aggregation for Uncertainty Quantification into Mellea #718
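For readers new to the idea named in the linked proposal, here is a minimal sketch of similarity-based aggregation: sample several generations, build a pairwise similarity matrix, and score each sample by its mean similarity to the others. This is not the code in this PR. The token-level Jaccard metric and the function names are assumptions chosen so the sketch runs with the standard library alone (the thread mentions an sbert metric and other aggregations).

```python
from __future__ import annotations

from statistics import fmean


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (a stand-in metric for this sketch)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def similarity_matrix(samples: list[str]) -> list[list[float]]:
    """Pairwise similarities between all sampled generations."""
    return [[jaccard(a, b) for b in samples] for a in samples]


def confidences(samples: list[str]) -> list[float]:
    """Score each sample by its mean similarity to the other samples."""
    sim = similarity_matrix(samples)
    n = len(samples)
    if n < 2:
        # Arbitrary choice for this sketch: a lone sample has nothing to agree with.
        return [1.0] * n
    return [fmean([sim[i][j] for j in range(n) if j != i]) for i in range(n)]


samples = ["Paris", "Paris", "paris", "Lyon"]
scores = confidences(samples)
# Pick the sample that agrees most with the rest (first one wins on ties).
best = samples[max(range(len(samples)), key=scores.__getitem__)]
```

Swapping `fmean` for another aggregator changes how strongly a single dissenting sample lowers the score.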

Implementation Checklist

Base Class

Extends appropriate base class:
- BaseSamplingStrategy if your changes are mostly modifying the repair and/or select_from_failure functions
- SamplingStrategy if your changes involve a new sample method
- Other defined sampling strategies if your implementation is similar to existing implementations

Return Value

Returns a properly typed SamplingResult. Specifically, this means:
- ModelOutputThunks in sample_generations are properly typed from the Component and the parsed_repr is the expected type.

Integration

Strategy exported in mellea/stdlib/sampling/__init__.py

Testing

Tests added to tests/sampling/
New code has 100% coverage
Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation once the rest of the PR is populated)

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

github-actions · 2026-04-03T16:40:10Z

The PR description has been updated. Please fill out the template for your PR to be reviewed.

planetf1

Also noticed we don't export SOFAISamplingStrategy in __all__. Not an issue from this PR, just an observation.
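To illustrate why a missing export matters, here is a small self-contained sketch of how `__all__` controls `from module import *`. The module and class names are hypothetical stand-ins built in memory, not mellea's actual package.

```python
import sys
import types

# Build a throwaway module in memory; every name here is hypothetical.
mod = types.ModuleType("fake_sampling")
exec(
    "class ExportedStrategy: ...\n"
    "class ForgottenStrategy: ...\n"
    "__all__ = ['ExportedStrategy']\n",
    mod.__dict__,
)
sys.modules["fake_sampling"] = mod

# A star import only pulls in the names listed in __all__.
namespace: dict = {}
exec("from fake_sampling import *", namespace)

star_sees_forgotten = "ForgottenStrategy" in namespace
# The class still exists on the module and can be imported explicitly.
explicit_import_works = hasattr(mod, "ForgottenStrategy")
```

A class left out of `__all__` remains importable by its full path, but star imports skip it, and tools that treat `__all__` as the public surface may do the same.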

test/stdlib/sampling/test_simbauq.py

docs/examples/simbauq/simbauq_example.py

.gitignore

docs/examples/simbauq/README.md

mellea/stdlib/sampling/simbauq.py

test/stdlib/sampling/test_simbauq.py

mellea/stdlib/sampling/simbauq.py

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

radum2275 · 2026-04-09T09:35:01Z

@planetf1 @jakelorocco

I made all the required changes. I also replaced the RITS backend in my example with the ollama one.

However, I need some guidance on the following: the "classifier" confidence estimation method we developed requires a probabilistic (scikit-learn) classifier, which we either receive from the user or train on the fly from the training examples the user provides. I'd like to pre-train one on the datasets we already collected for our paper and offer it as the default option, but it needs to live somewhere in the package as a serialised object (e.g., pickle file). What would be the best way to do that without disturbing the package structure too much? Thanks.
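One common way to ship a small binary asset inside a Python package is to store it as package data and read it with `importlib.resources`. The sketch below builds a throwaway package on disk so it can run end to end. The package name `demo_pkg`, the `data/` folder, and the file name are all hypothetical, and a real project's build configuration would also need to include the file in the wheel.

```python
import importlib
import sys
import tempfile
from importlib import resources
from pathlib import Path

# Create a temporary package with one data file (stand-in for a serialised model).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
(pkg / "data").mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "data" / "default_classifier.bin").write_bytes(b"\x00model-bytes")
sys.path.insert(0, str(root))
importlib.invalidate_caches()


def load_default_model_bytes(package: str = "demo_pkg") -> bytes:
    """Read the packaged asset without assuming where the package is installed."""
    resource = resources.files(package) / "data" / "default_classifier.bin"
    return resource.read_bytes()


model_bytes = load_default_model_bytes()
```

Reading the asset as bytes first keeps loading separate from deserialisation, so integrity checks can run before anything is unpickled.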

docs/examples/simbauq/README.md

psschwei · 2026-04-09T11:00:39Z

> it needs to live somewhere in the package as a serialised object (e.g., pickle file)

Do you have an estimate on how large this file would be? If it's tens of MBs that's probably not a problem, but if we're looking at hundreds of MBs or a GB+ then could be a different story.

radum2275 · 2026-04-09T11:11:39Z

@psschwei it's actually not that big. the one we trained for our paper was about 250KB. it's a basic sklearn RandomForestClassifier.

psschwei · 2026-04-09T11:15:50Z

> about 250KB

cool, I don't think that will be a problem

psschwei · 2026-04-09T11:26:53Z

I'm assuming then we would need to add sklearn as another dependency?

If we're pickling in a file, we probably should consider how to best do so safely.

radum2275 · 2026-04-09T11:51:01Z

@psschwei yes, added scikit-learn as a required dependency in pyproject.toml

radum2275 · 2026-04-09T11:53:48Z

> I'm assuming then we would need to add sklearn as another dependency?
>
> If we're pickling in a file, we probably should consider how to best do so safely.

sure, any suggestion is welcome :). i usually do this:

import pickle
with open(filename, 'wb') as f:
    pickle.dump(model, f)
# some time later...
with open(filename, 'rb') as f:
    loaded_model = pickle.load(f)
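On the question of unpickling safely: two standard-library measures are to verify a known SHA-256 digest before loading and to restrict which globals the unpickler may resolve, following the "restricting globals" pattern in the Python `pickle` documentation. The sketch below is an illustration only. The allow-list is hypothetical, and a real scikit-learn model would need a much longer one or a different persistence format.

```python
import hashlib
import hmac
import io
import pickle
from collections import OrderedDict

# Hypothetical allow-list: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def safe_loads(payload: bytes, expected_sha256: str):
    """Check the payload digest first, then unpickle with restricted globals."""
    actual = hashlib.sha256(payload).hexdigest()
    if not hmac.compare_digest(actual, expected_sha256):
        raise ValueError("digest mismatch; refusing to unpickle")
    return RestrictedUnpickler(io.BytesIO(payload)).load()


# Stand-in for a shipped model file and the digest recorded at release time.
payload = pickle.dumps(OrderedDict(a=1))
digest = hashlib.sha256(payload).hexdigest()
restored = safe_loads(payload, digest)
```

The digest check guards against a swapped or corrupted file, while the allow-list limits what a malicious payload could import. Neither makes pickle safe for untrusted input in general.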

test/stdlib/sampling/test_simbauq.py

mellea/stdlib/sampling/simbauq.py

planetf1 · 2026-04-10T15:57:00Z

mellea/stdlib/sampling/simbauq.py

+            except ImportError:
+                msg = (
+                    "scipy is required for harmonic mean aggregation. "
+                    "Please install with `pip install scipy`."


Correct command, but if this is part of the base library, should scipy be a core dependency? And what's the relationship to granite-retriever?

radum2275 · Apr 10, 2026

I added scipy to the core dependencies, next to numpy and scikit-learn, which this sampling strategy requires. However, as far as I know scikit-learn already depends on scipy, so we probably only need to list numpy and scikit-learn. Please advise.

The granite_retriever dependency group defined in pyproject.toml already contains sentence-transformers, which the sbert similarity metric requires. Should we move sentence-transformers to the core dependencies?
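The guarded import in the diff above follows a common pattern for optional dependencies: attempt the import at the point of use and raise an `ImportError` with an install hint. Here is a minimal sketch of that pattern, using a hypothetical helper named `require`. It also shows that a harmonic mean is available in the standard library via `statistics.harmonic_mean`, which may or may not fit this strategy's needs.

```python
import importlib
from statistics import harmonic_mean


def require(module_name: str, install_hint: str):
    """Import an optional dependency or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature. "
            f"Please install with `{install_hint}`."
        ) from exc


def aggregate_harmonic(similarities: list) -> float:
    """Harmonic-mean aggregation using only the standard library."""
    return harmonic_mean(similarities)


# An installed module is returned as-is; a missing one raises with the hint.
json_module = require("json", "pip install json")
score = aggregate_harmonic([0.5, 1.0])
```

If a dependency moves into the core list, the guard becomes unnecessary. If it stays optional, the guard keeps the base import of the package cheap.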

planetf1 · 2026-04-10T15:58:39Z

> I'm assuming then we would need to add sklearn as another dependency?
>
> If we're pickling in a file, we probably should consider how to best do so safely.

Apologies, I noticed this only after doing a per-line review.
There's no sklearn package to install any more; it's scikit-learn.

Also, I'm not 100% sure of the intent: if this is stdlib, is it core functionality? If so, shouldn't all dependencies be in our core dependencies?

psschwei · 2026-04-10T16:58:52Z

> Also, I'm not 100% sure of the intent: if this is stdlib, is it core functionality? If so, shouldn't all dependencies be in our core dependencies?

Agree, deps should be in the core deps if this is going into core.
I could also see a case for putting this in contribs rather than core stdlib (numpy/sklearn (old habits die hard)/scipy aren't huge like pytorch but they do add some size)

planetf1 · 2026-04-13T10:33:37Z

> Also, I'm not 100% sure of the intent: if this is stdlib, is it core functionality? If so, shouldn't all dependencies be in our core dependencies?
>
> Agree, deps should be in the core deps if this is going into core. I could also see a case for putting this in contribs rather than core stdlib (numpy/sklearn (old habits die hard)/scipy aren't huge like pytorch but they do add some size)

So the key decision here is where this code belongs. What do you think @jakelorocco: is this core stdlib, or an optional, additional strategy?

I can see three options:
a) core
b) shipped with mellea, but not core and in an optional package (we may not have precedent for this, but it's an option)
c) contrib

Once we've agreed on this we can nail down the actual dependency config needed.

Radu Marinescu added 4 commits April 2, 2026 14:12

feat: initial commit for the SIMBAUQSamplingStrategy

c5236f0

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

chore: added a separate filed to mot.meta for the similarity matrix

ea51043

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

chore: added a second aggregation by classification CE algorithm

5c23a58

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

refactor: revised and moved the SIMBAUQSamplingStrategy in docs/examples

d7f3b6a

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

radum2275 requested a review from a team as a code owner April 3, 2026 16:39

jakelorocco changed the title ~~Proposed SIMBAUQ Sampling Strategy~~ feat: Proposed SIMBAUQ Sampling Strategy Apr 3, 2026

github-actions bot added the enhancement New feature or request label Apr 3, 2026

planetf1 requested changes Apr 7, 2026

Update test/stdlib/sampling/test_simbauq.py

908258c

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

radum2275 requested a review from a team as a code owner April 7, 2026 18:50

radum2275 requested review from HendrikStrobelt and nrfulton April 7, 2026 18:50

radum2275 and others added 10 commits April 7, 2026 19:50

Update docs/examples/simbauq/simbauq_example.py

8b8c336

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update .gitignore

865e85f

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update docs/examples/simbauq/README.md

a6b356a

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update docs/examples/simbauq/README.md

cbae30c

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update mellea/stdlib/sampling/simbauq.py

a3c51a8

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update mellea/stdlib/sampling/simbauq.py

e9b05f1

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update mellea/stdlib/sampling/simbauq.py

372046a

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

refactor: refactored the simbauq sampling strategy

af55899

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

fix: added the ollama backend in simbauq example

da1440d

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

chore: set aggregation by mean in simbauq example

11b180f

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

psschwei reviewed Apr 9, 2026

docs/examples/simbauq/README.md (outdated, resolved)

chore: fixed a typo in the simbauq README.md file

6c6c099

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

chore: added scikit-learn as required dependency for simbauq strategy

78fe6c7

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

planetf1 requested changes Apr 10, 2026

radum2275 and others added 5 commits April 10, 2026 17:31

Update test/stdlib/sampling/test_simbauq.py

65a1268

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update test/stdlib/sampling/test_simbauq.py

41728a5

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update mellea/stdlib/sampling/simbauq.py

f90a466

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update mellea/stdlib/sampling/simbauq.py

1cd588c

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

Update mellea/stdlib/sampling/simbauq.py

c8bd228

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>

chore: revised the dependencies for simbauq strategy

e0b5952

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

3 participants