feat: Proposed SIMBAUQ Sampling Strategy #785

Open
radum2275 wants to merge 23 commits into generative-computing:main from radum2275:feat/simba_uq

Conversation

@radum2275

@radum2275 radum2275 commented Apr 3, 2026

Sampling Strategy PR

Use this template when adding or modifying sampling strategies in mellea/stdlib/sampling/.

Description

Implementation Checklist

Base Class

  • Extends appropriate base class:
    • BaseSamplingStrategy if your changes are mostly modifying the repair and/or select_from_failure functions
    • SamplingStrategy if your changes involve a new sample method
    • Other defined sampling strategies if your implementation is similar to existing implementations

Return Value

  • Returns a properly typed SamplingResult. Specifically, this means:
    • ModelOutputThunks in sample_generations are properly typed from the Component and the parsed_repr is the expected type.

Integration

  • Strategy exported in mellea/stdlib/sampling/__init__.py

Testing

  • Tests added to tests/sampling/
  • New code has 100% coverage
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation once the rest of the PR is populated)

Radu Marinescu added 4 commits April 2, 2026 14:12
Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
@radum2275 radum2275 requested a review from a team as a code owner April 3, 2026 16:39
@github-actions
Contributor

github-actions bot commented Apr 3, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@jakelorocco jakelorocco changed the title from "Proposed SIMBAUQ Sampling Strategy" to "feat: Proposed SIMBAUQ Sampling Strategy" Apr 3, 2026
@github-actions github-actions bot added the enhancement New feature or request label Apr 3, 2026
Contributor

@planetf1 planetf1 left a comment

Also noticed we don't export SOFAISamplingStrategy in `__all__`. Not an issue from this PR, just an observation.

Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
@radum2275 radum2275 requested a review from a team as a code owner April 7, 2026 18:50
radum2275 and others added 10 commits April 7, 2026 19:50
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
@radum2275
Author

@planetf1 @jakelorocco

I made all the required changes. I also replaced the RITS backend in my example with the ollama one.

However, I need some guidance on the following: the "classifier" confidence estimation method we developed requires a probabilistic (scikit-learn) classifier, which we either receive from the user or train on the fly from the training examples the user provides. I'd like to pre-train one on the datasets we already collected for our paper and offer it as the default option, but it needs to live somewhere in the package as a serialised object (e.g., a pickle file). What would be the best way to do that without disturbing the package structure too much? Thanks.
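
For reference, one common way to ship a small binary artifact with a Python package is to store it as package data and read it through `importlib.resources`, which resolves the file relative to the installed package rather than the current working directory. The following is only a sketch: `read_packaged_bytes` and the package and resource names passed to it are hypothetical placeholders, not names defined in this PR.

```python
from importlib import resources


def read_packaged_bytes(package: str, resource_name: str) -> bytes:
    """Return the raw bytes of a data file shipped inside `package`.

    Both arguments are illustrative placeholders (assumptions), e.g. a
    sampling subpackage and a serialized default classifier.
    """
    # files() returns a Traversable rooted at the package's directory.
    return resources.files(package).joinpath(resource_name).read_bytes()
```

The data file would also have to be declared as package data in the build configuration so that it is included in the built distribution.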

@psschwei
Member

psschwei commented Apr 9, 2026

it needs to live somewhere in the package as a serialised object (e.g., pickle file)

Do you have an estimate of how large this file would be? If it's tens of MBs that's probably not a problem, but if we're looking at hundreds of MBs or a GB+ then it could be a different story.

@radum2275
Author

@psschwei it's actually not that big. the one we trained for our paper was about 250KB. it's a basic sklearn RandomForestClassifier.

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
@psschwei
Member

psschwei commented Apr 9, 2026

about 250KB

cool, I don't think that will be a problem

@psschwei
Member

psschwei commented Apr 9, 2026

I'm assuming then we would need to add sklearn as another dependency?

If we're pickling in a file, we probably should consider how to best do so safely.

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>
@radum2275
Author

@psschwei yes, added scikit-learn as a required dependency in pyproject.toml

@radum2275
Author

I'm assuming then we would need to add sklearn as another dependency?

If we're pickling in a file, we probably should consider how to best do so safely.

sure, any suggestion is welcome :). i usually do this:

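import pickle
with open(filename, "wb") as f:  # context manager closes the file handle
    pickle.dump(model, f)
# some time later... (only unpickle files from a trusted source)
with open(filename, "rb") as f:
    loaded_model = pickle.load(f)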
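
On the safety question: unpickling can execute arbitrary code, so one mitigation is to unpickle only bytes whose digest matches a value pinned in the source tree. The sketch below assumes the expected SHA-256 digest is recorded when the model file is produced. `sha256_of` and `load_verified_pickle` are hypothetical helper names, not part of this PR or of scikit-learn. The check guards against a swapped or corrupted file; it does not make pickle safe for untrusted input.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_verified_pickle(path: Path, expected_sha256: str) -> Any:
    """Unpickle `path` only if its digest matches `expected_sha256`."""
    data = path.read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"digest mismatch for {path.name}: got {actual}")
    # Reached only for bytes whose digest was pinned ahead of time.
    return pickle.loads(data)
```

A typical flow would compute the digest once with `sha256_of` when the file is generated, commit that string next to the loader, and pass it as `expected_sha256` at load time.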

except ImportError:
    msg = (
        "scipy is required for harmonic mean aggregation. "
        "Please install with `pip install scipy`."
Contributor

Correct command, but if this is in the base library, should scipy be a core dependency? And what's the relationship to granite-retriever?

Author

I added scipy to the core dependencies next to numpy and scikit-learn required by this sampling strategy. However, as far as I know scikit-learn requires scipy, so probably we only need numpy and scikit-learn. Please advise.

The granite_retriever dependency group defined in pyproject.toml already contains the sentence-transformers required by the sbert similarity metric. Should we move sentence-transformers to the core dependencies?
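
If some of these packages end up optional rather than core, the guarded import in the reviewed hunk could be factored into one helper so every optional feature reports a missing package the same way. This is a sketch under that assumption: `require_optional` is a hypothetical name, and the message format simply mirrors the one in the hunk.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, feature: str, install_hint: str) -> ModuleType:
    """Import `module_name`, or raise an ImportError that says how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        msg = (
            f"{module_name} is required for {feature}. "
            f"Please install with `{install_hint}`."
        )
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(msg) from exc
```

For example, `require_optional("scipy", "harmonic mean aggregation", "pip install scipy")` returns the imported module when scipy is installed and raises the descriptive error otherwise.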

@planetf1
Contributor

I'm assuming then we would need to add sklearn as another dependency?

If we're pickling in a file, we probably should consider how to best do so safely.

Apologies, I only noticed this after doing a per-line review.
There's no sklearn package to depend on any more -- the distribution is named scikit-learn (the import name is still sklearn).

Also I'm not 100% sure of the intent -- if this is stdlib is it core function? If so shouldn't all dependencies be in our core dependencies.

radum2275 and others added 5 commits April 10, 2026 17:31
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
Co-authored-by: Nigel Jones <nigel.l.jones+git@gmail.com>
@psschwei
Member

Also I'm not 100% sure of the intent -- if this is stdlib is it core function? If so shouldn't all dependencies be in our core dependencies.

Agree, deps should be in the core deps if this is going into core.
I could also see a case for putting this in contribs rather than core stdlib (numpy/sklearn (old habits die hard)/scipy aren't huge like pytorch but they do add some size)

@planetf1
Contributor

Also I'm not 100% sure of the intent -- if this is stdlib is it core function? If so shouldn't all dependencies be in our core dependencies.

Agree, deps should be in the core deps if this is going into core. I could also see a case for putting this in contribs rather than core stdlib (numpy/sklearn (old habits die hard)/scipy aren't huge like pytorch but they do add some size)

So the key decision here is where this code belongs. What do you think @jakelorocco -- is this core stdlib, or an optional, additional strategy?

I can see there might be three options
a) core
b) shipped with mellea, but not core & in an optional package (we may not have precedent for this, but it's an option)
c) contrib

once we've agreed on this we can nail the actual dependency config needed.

Signed-off-by: Radu Marinescu <radu.marinescu@ie.ibm.com>

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: Integrating Similarity‑Based Aggregation for Uncertainty Quantification into Mellea

3 participants