Data fit
DataFit / ArrayDataFit schemas for parameter estimation from data.
Mirrors ionworkspipeline.data_fits.
Schemas for data_fit.
- class ionworks_schema.data_fit.ArrayDataFit(objectives, source='', parameters=None, cost=None, initial_guesses=None, optimizer=None, cost_logger=None, multistarts=None, num_workers=None, parallel=None, max_batch_size=None, initial_guess_sampler=None, priors=None, options=None)
Bases: DataFit
Fit the same model separately at each value of an independent variable.
Use this when you have one experiment repeated at different conditions (typically temperatures, C-rates, or pulse indices) and you want one fitted parameter set per condition rather than one global fit.
objectives is keyed by the independent variable value ({298.15: ..., 313.15: ...}); each entry is fitted independently, and the results can be post-processed to extract how parameters depend on the variable.
All other fields behave the same as DataFit.
Extends: ionworks_schema.data_fit.data_fit.DataFit
See also: ionworkspipeline.ArrayDataFit (runtime implementation).
- objectives: Annotated[dict[Any, Annotated[dict[str, Any] | BaseObjective | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])]] | Annotated[dict[str, Any] | BaseObjective | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])], FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])]
- model_config = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'populate_by_name': True, 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
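To make the per-condition idea behind ArrayDataFit concrete, here is a minimal, library-free sketch. It does not call ionworks_schema or ionworkspipeline: the helpers fit_slope and fit_line, the synthetic data, and the Arrhenius post-processing step are all illustrative assumptions. Each key of the dict is a temperature in kelvin, each entry is fitted on its own, and the fitted values are then regressed against 1/T.

```python
import math


def fit_slope(xs, ys):
    """Closed-form least-squares slope for the model y = k * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_mean - b * x_mean, b


# Synthetic, noise-free "measurements": y = k(T) * x with an Arrhenius k(T).
R = 8.314           # gas constant [J/(mol.K)]
E_TRUE = 30_000.0   # hypothetical activation energy [J/mol]
A_TRUE = 1.0e3      # hypothetical pre-exponential factor
xs = [1.0, 2.0, 3.0, 4.0]


def k_true(T):
    return A_TRUE * math.exp(-E_TRUE / (R * T))


# One data set per value of the independent variable (temperature in K),
# mirroring how ArrayDataFit.objectives is keyed.
data_by_temperature = {
    T: (xs, [k_true(T) * x for x in xs]) for T in (298.15, 313.15, 328.15)
}

# Fit each condition independently: one fitted value per key.
fitted_k = {T: fit_slope(x, y) for T, (x, y) in data_by_temperature.items()}

# Post-process: ln k = ln A - E / (R * T), so regress ln k on 1 / T.
inv_T = [1.0 / T for T in fitted_k]
ln_k = [math.log(k) for k in fitted_k.values()]
intercept, slope = fit_line(inv_T, ln_k)
activation_energy = -slope * R
```

Because the synthetic data are noise-free, activation_energy recovers E_TRUE up to floating-point error. With real measurements the per-condition fits would scatter, and the regression step would average over that scatter.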
- class ionworks_schema.data_fit.DataFit(objectives, source='', parameters=None, cost=None, initial_guesses=None, optimizer=None, cost_logger=None, multistarts=None, num_workers=None, parallel=None, max_batch_size=None, initial_guess_sampler=None, priors=None, options=None)
Bases: BaseSchema
Fit a model’s parameters to measured experimental data.
A DataFit step says: “run these experiments through the model, compare the result to the measurements I supply, and adjust these parameters until the agreement is as good as possible”. One or more objectives describe what experiments to compare against and which measured curves to match. The parameters dict lists which parameters are free to move during the fit, and the optional priors express what you already believe about their plausible values.
The remaining fields (cost, optimizer, initial_guesses, multistarts, …) tune how the fit runs. The defaults are sensible; you only need to set them if you want finer control over the optimisation algorithm, parallelism, or runtime budget.
Parameters
- objectives : FittingObjective or DesignObjective or dict[str, FittingObjective | DesignObjective | dict]
What to fit against. Either a single objective (a CurrentDriven, MSMRHalfCell, … from iws.objectives) or a dict of named objectives if the fit spans multiple experiments.
- source : str, optional
Free-text label for the data source (paper, dataset name, instrument). Shown in reports and provenance records.
- parameters : dict[str, Parameter | pybamm.Symbol | callable] | None, optional
Which parameters are being fitted, and (optionally) how they relate to each other through pybamm expressions. At least one of parameters or priors must be set. Each value can be:
  - an iwp.Parameter object, e.g. iwp.Parameter("x")
  - a pybamm expression, in which case other referenced parameters must also be supplied as iwp.Parameter objects via pybamm.Parameter wrapping. For example:

        {
            "param": 2 * pybamm.Parameter("half-param"),
            "half-param": iwp.Parameter("half-param"),
        }

    works, but

        {"param": 2 * iwp.Parameter("half-param")}

    does not.
  - a function that constructs a pybamm expression referencing other parameters, which must again be explicitly supplied as iwp.Parameter objects:

        {
            "main parameter": lambda x: (
                pybamm.Parameter("other parameter") * x**2
            ),
            "other parameter": iwp.Parameter("other parameter"),
        }

The dict key does not need to match the underlying pybamm parameter name: DataFit figures out which variable to fit from the iwp.Parameter reference.
- cost : ObjectiveFunction or str or dict or None, optional
How disagreement between model and data is summed up into a single number (e.g. sum-of-squares, log-likelihood). Leave unset for a sensible default.
- initial_guesses : dict[str, float] or list[dict[str, float]] or None, optional
Starting point(s) for the optimiser. One dict applies to every restart; a list of dicts provides one starting point per restart.
- optimizer : ParameterEstimator or dict or None, optional
Which optimisation algorithm to use (e.g. CMAES, PSO, ScipyMinimize). Leave unset for the default.
- cost_logger : BaseSchema or dict or None, optional
Optional logger that records the cost trajectory and parameter values across the fit, for later inspection.
- multistarts : int | None, optional
Number of independent restarts from different initial guesses. More restarts make the fit more robust but take longer.
- num_workers : int | None, optional
Worker processes for running restarts in parallel. None uses all CPU cores; 1 disables parallelism. Not supported on Windows.
- parallel : bool | None, optional
Whether to also parallelise within a single restart (for population-based optimisers). Auto-detected when None.
- max_batch_size : int | None, optional
Cap on how many restarts run together in one batch. Leave unset for an auto-chosen value.
- initial_guess_sampler : DistributionSampler or dict or None, optional
How to spread the multistart guesses across the parameter space (LatinHypercube by default).
- priors : Prior or list[Prior] or dict or None, optional
What you already believe about the parameter values. Acts as a regulariser on the fit. May be supplied alone (the prior names become the fit parameters) or alongside parameters (priors regularise the listed fit parameters).
- options : dict[str, Any] | None, optional
Advanced dict of runtime options: seed for reproducibility, maxiters/maxtime for budgets, and low_memory to trim the log. Defaults are:

    options = {
        # Random seed for reproducibility. Defaults to a seed
        # generated from the current time.
        "seed": iwutil.random.generate_seed(),
        # Reduce log size: only append entries if the cost
        # improves the best-so-far by at least 0.1%. Defaults
        # to True for deterministic optimizers.
        "low_memory": True,
        # Maximum iterations per optimization job.
        "maxiters": None,
        # Maximum wall time (seconds) per job. With multistarts
        # the total may exceed this since many jobs run.
        "maxtime": None,
    }

Note: maxiters and maxtime only take effect when model.convert_to_format == 'casadi'.
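The interplay between multistarts, initial_guess_sampler, and the seed option can be illustrated with a generic Latin hypercube sampler. This is a textbook sketch in plain Python, not the ionworkspipeline LatinHypercube implementation. The function name latin_hypercube_guesses, the bounds dict, and the parameter names are hypothetical.

```python
import random


def latin_hypercube_guesses(bounds, multistarts, seed):
    """Draw `multistarts` starting points, one per stratum in each dimension.

    bounds: dict mapping parameter name -> (lower, upper).
    Returns a list of dicts, one initial guess per restart.
    """
    rng = random.Random(seed)  # fixed seed -> reproducible guesses
    columns = {}
    for name, (lower, upper) in bounds.items():
        width = (upper - lower) / multistarts
        # One uniform draw inside each of the `multistarts` equal strata.
        samples = [lower + (i + rng.random()) * width for i in range(multistarts)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(samples)
        columns[name] = samples
    return [
        {name: columns[name][i] for name in bounds} for i in range(multistarts)
    ]


# Hypothetical parameter names and bounds, for illustration only.
bounds = {"Q_pe": (2.0, 4.0), "R0": (0.001, 0.1)}
guesses = latin_hypercube_guesses(bounds, multistarts=5, seed=42)
```

Each parameter range is split into as many equal strata as there are restarts, and every stratum is used exactly once. The starting points therefore cover the range more evenly than independent uniform draws would. Reusing the same seed reproduces the same list of guesses.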
Examples
    >>> import ionworks_schema as iws
    >>> # build the schema with the fields you care about
    >>> obj = iws.objectives.OCPHalfCell(
    ...     electrode="positive",
    ...     data_input="path/to/ocp.csv",
    ... )
    >>> fit = iws.DataFit(
    ...     objectives={"ocp": obj},
    ...     parameters={"Q_pe": iws.Parameter(
    ...         "Positive electrode capacity [A.h]", initial_value=3.0, bounds=(2.0, 4.0),
    ...     )},
    ...     priors={"Q_pe": iws.priors.Prior("Q_pe", iws.stats.Normal(3.0, 0.2))},
    ... )
    >>> config = iws.Pipeline({"ocp fit": fit}).to_config()
    >>> # then submit `config` via ionworks-api
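To illustrate what it means for a prior such as Normal(3.0, 0.2) to act as a regulariser, the following library-free sketch adds a Gaussian penalty to a sum-of-squares cost for a single constant parameter. The measurement values, the noise level, the function name regularised_estimate, and the way the two terms are combined are assumptions for illustration. The runtime may weight or combine them differently.

```python
def regularised_estimate(measurements, noise_std, prior_mean, prior_std):
    """Minimise sum(((y - q) / noise_std)**2) + ((q - prior_mean) / prior_std)**2.

    Both terms are quadratic in q, so the minimiser is a precision-weighted
    average of the data mean and the prior mean.
    """
    data_precision = len(measurements) / noise_std**2
    prior_precision = 1.0 / prior_std**2
    data_mean = sum(measurements) / len(measurements)
    return (data_precision * data_mean + prior_precision * prior_mean) / (
        data_precision + prior_precision
    )


measurements = [3.4, 3.5, 3.6]  # hypothetical capacity measurements [A.h]

# A broad prior barely moves the estimate away from the data mean (3.5).
weak = regularised_estimate(
    measurements, noise_std=0.2, prior_mean=3.0, prior_std=10.0
)
# A prior as narrow as Normal(3.0, 0.2) pulls the estimate toward 3.0.
strong = regularised_estimate(
    measurements, noise_std=0.2, prior_mean=3.0, prior_std=0.2
)
```

With three measurements and equal noise and prior widths, the data carry three times the weight of the prior, so strong lands at 3.375, a quarter of the way from the data mean to the prior mean.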
Extends: ionworks_schema.base.BaseSchema
See also: ionworkspipeline.DataFit (runtime implementation).
- objectives: Annotated[dict[str, Annotated[dict[str, Any] | BaseObjective | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])]] | Annotated[dict[str, Any] | BaseObjective | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])], FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])]
- cost: Annotated[dict[str, Any] | BaseSchema | str | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])] | None
- optimizer: Annotated[dict[str, Any] | ParameterEstimator | BaseSchema | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])] | None
- cost_logger: Annotated[dict[str, Any] | BaseSchema | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])] | None
- initial_guess_sampler: Annotated[dict[str, Any] | DistributionSampler | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])] | None
- priors: Annotated[dict[str, Any] | Prior | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])] | list[Annotated[dict[str, Any] | Prior | Any, FieldInfo(annotation=NoneType, required=True, metadata=[_PydanticGeneralMetadata(union_mode='left_to_right')])]] | None
- wrap_bare_objective()
Wrap a bare objective in a dict, matching ionworkspipeline behavior.
Only applies to DataFit, not ArrayDataFit (which requires a dict keyed by independent variable values).
- validate_parameters_or_priors()
At least one of parameters or priors must be supplied.
The runtime accepts both together (priors then act as regularizers on the listed fit parameters), so we mirror the runtime here rather than enforce a stricter mutual exclusion at the schema boundary.
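The rule enforced by validate_parameters_or_priors can be pictured with a small stand-alone sketch. It uses a plain function rather than a Pydantic validator, and the name check_parameters_or_priors and the error message are hypothetical, not taken from ionworks_schema. It also assumes that "not supplied" means None, which the schema may define differently.

```python
def check_parameters_or_priors(parameters=None, priors=None):
    """Reject the case where neither argument is supplied.

    Either argument alone is accepted, and so are both together; there is
    deliberately no mutual-exclusion check.
    """
    if parameters is None and priors is None:
        raise ValueError("At least one of `parameters` or `priors` must be set.")
    return parameters, priors


# Placeholder values stand in for real Parameter and Prior objects.
only_parameters = check_parameters_or_priors(parameters={"x": "parameter"})
only_priors = check_parameters_or_priors(priors={"x": "prior"})
both = check_parameters_or_priors(
    parameters={"x": "parameter"}, priors={"x": "prior"}
)

try:
    check_parameters_or_priors()
    neither_rejected = False
except ValueError:
    neither_rejected = True
```

Only the call with neither argument raises; the other three combinations pass through unchanged.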
- model_config = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'populate_by_name': True, 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.