
Global fitting of separate signals by models that may share parameters#1018

Open
erik-mansson wants to merge 1 commit into lmfit:master from erik-mansson:global-fit-proposal

Conversation

@erik-mansson

@erik-mansson erik-mansson commented Aug 26, 2025

Proposing tools to facilitate "global fitting" of separate signals by models that may share parameters.

Description

In principle, global fitting is already possible if one writes a custom function, as in https://lmfit.github.io/lmfit-py/examples/example_fit_multi_datasets.html, to concatenate all residuals into a single 1D array. But that example uses an objective function passed to minimize(), rather than the perspective of curve model functions fitting signals. It also seems undesirable that every user should have to manage boilerplate code like that, rather than work at the higher abstraction level of "I have these signals with associated models; now I just want to fit them with some parameters shared, instead of as independent fits".
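To make the residual-concatenation idea concrete, here is a minimal sketch of that kind of boilerplate without lmfit, using scipy.optimize.least_squares (the model and parameter names here are purely illustrative, not part of this PR):

```python
import numpy as np
from scipy.optimize import least_squares

# Two noisy exponential decays that share the decay rate but have
# independent amplitudes: params = [amp0, amp1, shared_rate].
x = np.linspace(0.0, 5.0, 50)
rng = np.random.default_rng(0)
data = [3.0 * np.exp(-1.2 * x) + 0.01 * rng.normal(size=x.size),
        1.5 * np.exp(-1.2 * x) + 0.01 * rng.normal(size=x.size)]

def residuals(params):
    amp0, amp1, rate = params
    # Concatenate the per-signal residuals into one 1D array --
    # this is exactly what a "global fit" objective function does.
    return np.concatenate([data[0] - amp0 * np.exp(-rate * x),
                           data[1] - amp1 * np.exp(-rate * x)])

result = least_squares(residuals, x0=[1.0, 1.0, 1.0])
amp0, amp1, rate = result.x  # the rate is fitted jointly from both signals
```

The point of the PR is that users should not need to hand-write such objective functions and parameter unpacking for every combination of signals and models.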

My previous object-oriented version that allowed multi-model fitting was uploaded in https://github.com/orgs/lmfit/discussions/1012, but it was not ready for inclusion in lmfit and included some questionable __mro__-modification for a PrefixedWrapper class (able to prefix an already existing Model instance), which in hindsight is a separate topic and not very important for global fitting. This new version is therefore somewhat reduced in scope and is implemented without any new classes, closer to a functional style. The main functionality happens in a call to multi_fit(), and there are supporting functions that may be used to facilitate the preparation of models, parameters and constraints. As the data structure for the signal data arrays, the associated models and any additional information (weights, keyword arguments such as independent variables), this implementation supports both dicts with string keys (more informative) and lists with automatic numeric indices (shorter to write, and suitable when the signal data comes as rows of a 2D array).

TODO-markers in the code point out things that could be improved, e.g. perhaps a class other than ModelResult to better represent such a multi-signal & multi-model fit. If people want it, I'm not completely against retrying an object-oriented approach for something like a multi-model wrapper, but it was slightly improper to have it as a subclass of Model when some methods need to take different arguments or return different types of objects.

tests/test_global_fit.py has eight fairly long test cases that can be read as extended usage examples, with plotting and some assertions at the end. They give a test coverage of 72%, which can of course be increased in case this PR has a chance of being accepted. Three concise examples intended as user documentation are in the module-level docstring of lmfit/global_fit.py. I have not familiarized myself with the production of HTML documentation, or whether I used enough markup in the docstrings, but I suppose an HTML page could be useful to create, for showing something like the concise examples nicely formatted on the web.

Type of Changes
  • Bug fix
  • New feature
  • Refactoring / maintenance
  • [kind of] Documentation / examples
Tested on

Python: 3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:08:06) [GCC 11.3.0]
lmfit: 1.2.2, scipy: 1.10.1, numpy: 1.24.4, asteval: 0.9.29, uncertainties: 3.1.7

Python: 3.13.3 | packaged by conda-forge | (main, Apr 14 2025, 20:44:03) [GCC 13.3.0]
lmfit: 1.3.3, scipy: 1.15.2, numpy: 2.2.5, asteval: 1.0.6, uncertainties: 3.2.3
lmfit: latest from git 26 August 2025, scipy: 1.15.2, numpy: 2.2.5, asteval: 1.0.6, uncertainties: 3.2.3

Verification
  • included docstrings that follow PEP 257?
  • referenced existing ?
  • verified that existing tests pass locally?
  • verified that the documentation builds locally?
  • squashed/minimized your commits and written descriptive commit messages?
  • [72%] added or updated existing tests to cover the changes?
  • [no, the PR isn't ready for that yet] updated the documentation and/or added an entry to the release notes (doc/whatsnew.rst)?
  • [not in doc/examples, but there is material meant as examples in the python files] added an example?

… models that may share parameters.

A previous object-oriented version, including some questionable __mro__-modification for a PrefixedWrapper, was uploaded in https://github.com/orgs/lmfit/discussions/1012
but this new version is somewhat reduced in scope and is implemented without any new classes.

tests/test_global_fit.py has 8 fairly long test cases that can be read as extended usage examples, with plotting and some assertions at the end.
Three concise examples intended as user documentation are in the module-level docstring of lmfit/global_fit.py.
@newville
Member

@erik-mansson Thanks. But as in #1012, and started with #1015, I am pretty sure we want to use a Dataset container.

In the code here, models, data, weights, and separate_kwargs must be dicts with keys that are uniform for these 4 inputs. Lists are described as possible, but that will not work - these inputs must have a key (a string, in fact) to distinguish the different parameters for the different sets of (data, model, etc). The order of the list is not important, but the names are.

FWIW, independent variables really must be separately specified for each data array to be fit. Nothing about the shape or type of the independent variables can be assumed to be consistent for different sets of data. x might be a np.linspace(-10, 10, 201) for one dataset and a list of dictionaries for another.

Encapsulating data, weights, independent_vars, and model into a named Dataset solves all of those problems, or at least puts the challenges (basically of associating the Parameter names with the correct dataset and model) into our code instead of user code. I think it will also make it cleaner to have a workflow that builds and fits Datasets separately and then together.
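A minimal sketch of what such a named container might look like (the class and field names below are assumptions for illustration, not the actual design from #1015):

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical sketch of a named Dataset container as discussed above;
# the field names are assumptions, not the actual #1015 design.
@dataclass
class Dataset:
    name: str                # key used to associate Parameters with this set
    data: Any                # signal array to be fit
    model: Any               # model (or callable) for this signal
    weights: Any = None
    independent_vars: dict = field(default_factory=dict)

# Each dataset carries its own independent variables, so nothing about
# their shape or type needs to be consistent between datasets:
ds1 = Dataset("decay", data=[1.0, 2.0], model=None,
              independent_vars={"x": [0.0, 0.5]})
ds2 = Dataset("spectrum", data=[3.0, 4.0], model=None,
              independent_vars={"x": [{"a": 1}, {"a": 2}]})
```

Grouping everything under one name per dataset is what lets the library, rather than the user, keep parameters, data, weights and independent variables associated correctly.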

Like nearly every other Python project, documentation uses RST and Sphinx. Docstrings are fine, but separate docs beyond the API documentation of a docstring are also valuable.

So, thanks for the PR but I think we want a better representation.

@erik-mansson
Author

Thanks for taking a look.

Lists are described as possible, but that will not work - these inputs must have a key (a string, in fact) to distinguish the different parameters for the different sets of (data, model, etc). The order of the list is not important, but the names are.

Well, in this PR, prefixes are not necessarily the same as the dict keys or list indices. That is, the user has to either use some string keys or accept that list order matters for associating data and models, but multi_fit() then fully relies on the Model instances having been configured with prefixes for their parameters (or happening to have non-conflicting names, if mixing completely dissimilar models). So unless one uses repeat_model(), there is no automatic link between keys/indices and prefixes. I'm not saying that's ideal, but this time I was striving for a fairly basic interface that didn't enforce any parameter naming, and that accepted that a user might already have constructed a Model instance, so that it would be too late to change its prefix in the global fitting code.
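The naming rule described above can be sketched in a few lines of plain Python (the helper name is hypothetical, not part of the PR): combined models rely on per-model prefixes, or otherwise non-conflicting parameter names, independently of dict keys or list order.

```python
# Hypothetical sketch (not part of the PR) of the prefix-based naming
# rule: parameter names from several models must not collide once
# each model's prefix is applied.
def combined_param_names(models):
    """Merge parameter names from several models, where each model is
    given as (prefix, [bare_param_names]); raise on any collision."""
    seen = set()
    for prefix, names in models:
        for name in names:
            full = prefix + name
            if full in seen:
                raise ValueError(f"conflicting parameter name: {full!r}")
            seen.add(full)
    return sorted(seen)

# Distinct prefixes keep two structurally identical models apart:
names = combined_param_names([("g1_", ["amplitude", "center"]),
                              ("g2_", ["amplitude", "center"])])
# → ['g1_amplitude', 'g1_center', 'g2_amplitude', 'g2_center']
```

With empty (or equal) prefixes and identical models, the same function would raise, which is the "non-conflicting names" requirement in practice.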

FWIW, independent variables really must be separately specified for each data array to be fit.

Yes. While possible with my separate_kwargs (see e.g. test_two_unequal_indpendent_vars() -- hm, a typo in that name, I see now), it will certainly be more pleasant to use Dataset classes than dicts of dicts (or lists of dicts, though I admit I probably didn't test that combination). I didn't want to introduce any coupling between the two PRs at this early stage. I'll post a comment on #1015 now with a thought I had about that.

