Data fit

DataFit / ArrayDataFit schemas for parameter estimation from data. Mirrors ionworkspipeline.data_fits.


class ionworks_schema.data_fit.ArrayDataFit(objectives, source='', parameters=None, cost=None, initial_guesses=None, optimizer=None, cost_logger=None, multistarts=None, num_workers=None, parallel=None, max_batch_size=None, initial_guess_sampler=None, priors=None, options=None)

Bases: DataFit

Fit the same model separately at each value of an independent variable.

Use this when you have one experiment repeated at different conditions — typically temperatures, C-rates, or pulse indices — and you want one fitted parameter set per condition rather than one global fit. objectives is keyed by the independent variable value ({298.15: ..., 313.15: ...}); each entry is fitted independently and the results can be post-processed to extract how parameters depend on the variable.

All other fields behave the same as DataFit.
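The per-condition idea can be illustrated without the library. The sketch below is plain Python and does not call the ionworks API: the `fit_slope` helper and the data are hypothetical, and a closed-form least-squares slope stands in for a real model fit.

```python
# Hypothetical measurements keyed by the independent variable,
# here temperature in kelvin: {T: [(x, y), ...]}.
data_by_temperature = {
    298.15: [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    313.15: [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)],
}


def fit_slope(points):
    """Least-squares slope k of y = k * x (line through the origin)."""
    numerator = sum(x * y for x, y in points)
    denominator = sum(x * x for x, _ in points)
    return numerator / denominator


# One independent fit per key: one fitted parameter set per condition.
fitted = {
    temperature: {"k": fit_slope(points)}
    for temperature, points in data_by_temperature.items()
}

# Post-processing step: inspect how the fitted value varies with temperature.
for temperature in sorted(fitted):
    print(temperature, fitted[temperature]["k"])
```

Each key is fitted in isolation, so a poor fit at one condition cannot bias the others; any trend across conditions is recovered afterwards from the per-key results.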

Extends: ionworks_schema.data_fit.data_fit.DataFit

See also: ionworkspipeline.ArrayDataFit (runtime implementation).

objectives: dict[Any, dict[str, Any] | BaseObjective | Any] | dict[str, Any] | BaseObjective | Any (required; unions use union_mode='left_to_right')
model_config = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'populate_by_name': True, 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:

  self: The BaseModel instance.
  context: The context.

class ionworks_schema.data_fit.DataFit(objectives, source='', parameters=None, cost=None, initial_guesses=None, optimizer=None, cost_logger=None, multistarts=None, num_workers=None, parallel=None, max_batch_size=None, initial_guess_sampler=None, priors=None, options=None)

Bases: BaseSchema

Fit a model’s parameters to measured experimental data.

A DataFit step says: “run these experiments through the model, compare the result to the measurements I supply, and adjust these parameters until the agreement is as good as possible”. One or more objectives describe what experiments to compare against and which measured curves to match. The parameters dict lists which parameters are free to move during the fit, and the optional priors express what you already believe about their plausible values.

The remaining fields (cost, optimizer, initial_guesses, multistarts, …) tune how the fit runs. The defaults are sensible — you only need to set them if you want finer control over the optimisation algorithm, parallelism, or runtime budget.

Parameters

objectives : FittingObjective or DesignObjective or dict[str, FittingObjective | DesignObjective | dict]

What to fit against. Either a single objective (a CurrentDriven, MSMRHalfCell, … from iws.objectives) or a dict of named objectives if the fit spans multiple experiments.

source : str, optional

Free-text label for the data source (paper, dataset name, instrument). Shown in reports and provenance records.

parameters : dict[str, Parameter | pybamm.Symbol | callable] | None, optional

Which parameters are being fitted, and (optionally) how they relate to each other through pybamm expressions. At least one of parameters or priors must be set. Each value can be:

  • an iwp.Parameter object, e.g. iwp.Parameter("x")

  • a pybamm expression. Any other parameter the expression references must appear inside it as a pybamm.Parameter and must also be supplied in the dict as its own iwp.Parameter entry. For example:

    {
        "param": 2 * pybamm.Parameter("half-param"),
        "half-param": iwp.Parameter("half-param"),
    }
    

    works, but {"param": 2 * iwp.Parameter("half-param")} does not.

  • a function that constructs a pybamm expression referencing other parameters, which must again be explicitly supplied as iwp.Parameter objects:

    {
        "main parameter": lambda x: (
            pybamm.Parameter("other parameter") * x**2
        ),
        "other parameter": iwp.Parameter("other parameter"),
    }
    

The dict key does not need to match the underlying pybamm parameter name — DataFit figures out which variable to fit from the iwp.Parameter reference.
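The relationship between free and derived entries can be pictured with a plain-Python stand-in. This sketch does not use pybamm or iwp; the `resolve` helper is hypothetical and only mimics the idea that derived values are recomputed from whatever free values the optimiser proposes.

```python
# Free parameter values as an optimiser might propose them (hypothetical).
free_values = {"half-param": 1.5}

# Derived entries are functions of the free values. This mimics
# {"param": 2 * pybamm.Parameter("half-param")} without using pybamm.
derived = {"param": lambda values: 2 * values["half-param"]}


def resolve(free_values, derived):
    """Return the full parameter set: free values plus evaluated derived ones."""
    resolved = dict(free_values)
    for name, expression in derived.items():
        resolved[name] = expression(free_values)
    return resolved


full = resolve(free_values, derived)
print(full)
```

Only "half-param" is searched over in this picture; "param" follows it deterministically, which is why the referenced parameter has to be listed as a fit parameter in its own right.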

cost : ObjectiveFunction or str or dict or None, optional

How disagreement between model and data is summed up into a single number (e.g. sum-of-squares, log-likelihood). Leave unset for a sensible default.
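As a concrete picture of a cost, here is a minimal sum-of-squares sketch in plain Python. The function name and the sample values are hypothetical; the library's own cost classes are not shown here and may weight or normalise residuals differently.

```python
def sum_of_squares(model_values, data_values):
    """Sum of squared residuals between model output and measurements."""
    if len(model_values) != len(data_values):
        raise ValueError("model and data must have the same length")
    return sum((m - d) ** 2 for m, d in zip(model_values, data_values))


# Residuals are 0.0, -0.5 and 1.0, so the cost is 0.0 + 0.25 + 1.0.
cost = sum_of_squares([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
print(cost)
```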

initial_guesses : dict[str, float] or list[dict[str, float]] or None, optional

Starting point(s) for the optimiser. One dict applies to every restart; a list of dicts provides one starting point per restart.
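The two accepted shapes can be pictured with a small normalising helper. This is a sketch under assumptions: `starting_points` is a hypothetical name, and the requirement that a list has exactly one entry per restart is an assumption of the sketch, not a documented rule.

```python
def starting_points(initial_guesses, multistarts):
    """Expand initial_guesses into one starting dict per restart."""
    if isinstance(initial_guesses, dict):
        # A single dict is reused for every restart.
        return [dict(initial_guesses) for _ in range(multistarts)]
    # Assumption of this sketch: a list must supply one dict per restart.
    if len(initial_guesses) != multistarts:
        raise ValueError("expected one initial guess per restart")
    return [dict(guess) for guess in initial_guesses]


shared = starting_points({"Q_pe": 3.0}, multistarts=3)
per_restart = starting_points([{"Q_pe": 2.5}, {"Q_pe": 3.5}], multistarts=2)
print(shared)
print(per_restart)
```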

optimizer : ParameterEstimator or dict or None, optional

Which optimisation algorithm to use (e.g. CMAES, PSO, ScipyMinimize). Leave unset for the default.

cost_logger : BaseSchema or dict or None, optional

Optional logger that records the cost trajectory and parameter values across the fit, for later inspection.

multistarts : int | None, optional

Number of independent restarts from different initial guesses. More restarts make the fit more robust but take longer.

num_workers : int | None, optional

Number of worker processes used to run restarts in parallel. None uses all CPU cores; 1 disables parallelism. Not supported on Windows.

parallel : bool | None, optional

Whether to also parallelise within a single restart (for population-based optimisers). Auto-detected when None.

max_batch_size : int | None, optional

Cap on how many restarts run together in one batch. Leave unset for an auto-chosen value.

initial_guess_sampler : DistributionSampler or dict or None, optional

How to spread the multistart guesses across the parameter space (LatinHypercube by default).
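For readers unfamiliar with Latin hypercube sampling, the following is a minimal standard-library sketch of the technique. It is not the library's sampler: the `latin_hypercube` function, the bounds, and the "R0" parameter name are all hypothetical.

```python
import random


def latin_hypercube(bounds, n_samples, seed=0):
    """Draw n_samples points so that, for each parameter, its range is
    split into n_samples equal strata and each stratum is used once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        width = (high - low) / n_samples
        # One uniformly drawn point inside each stratum.
        points = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so strata are paired at random across parameters.
        rng.shuffle(points)
        columns[name] = points
    return [
        {name: columns[name][i] for name in bounds} for i in range(n_samples)
    ]


guesses = latin_hypercube({"Q_pe": (2.0, 4.0), "R0": (0.0, 1.0)}, n_samples=4)
for guess in guesses:
    print(guess)
```

Compared with independent uniform draws, every parameter's range is covered evenly even for a small number of restarts, which is the usual reason to prefer this scheme for multistart guesses.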

priors : Prior or list[Prior] or dict or None, optional

What you already believe about the parameter values. Acts as a regulariser on the fit. May be supplied alone (the prior names become the fit parameters) or alongside parameters (priors regularise the listed fit parameters).
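One common way a prior acts as a regulariser is by adding its negative log-density to the data cost. The sketch below shows that formulation for Normal priors in plain Python. How the library actually combines priors with the cost is not stated here, so treat this as an assumed illustration; both function names are hypothetical.

```python
import math


def normal_neg_log_pdf(x, mean, std):
    """Negative log-density of a Normal(mean, std) distribution at x."""
    return 0.5 * ((x - mean) / std) ** 2 + math.log(std * math.sqrt(2 * math.pi))


def regularised_cost(data_cost, values, priors):
    """Data cost plus a penalty for straying from the prior means.

    priors maps parameter name -> (mean, std) of an assumed Normal prior.
    """
    penalty = sum(
        normal_neg_log_pdf(values[name], mean, std)
        for name, (mean, std) in priors.items()
    )
    return data_cost + penalty


priors = {"Q_pe": (3.0, 0.2)}
at_mean = regularised_cost(1.0, {"Q_pe": 3.0}, priors)
two_std_away = regularised_cost(1.0, {"Q_pe": 3.4}, priors)
print(at_mean, two_std_away)
```

With the same data cost, a value two standard deviations from the prior mean is penalised by an extra 0.5 * 2**2 = 2.0, so the optimiser only moves that far when the data improvement outweighs the penalty.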

options : dict[str, Any] | None, optional

Advanced dict of runtime options: seed for reproducibility, maxiters/maxtime for budgets, and low_memory to trim the log. Defaults are:

options = {
    # Random seed for reproducibility. Defaults to a seed
    # generated from the current time.
    "seed": iwutil.random.generate_seed(),
    # Reduce log size: only append entries if the cost
    # improves the best-so-far by at least 0.1%. Defaults
    # to True for deterministic optimizers.
    "low_memory": True,
    # Maximum iterations per optimization job.
    "maxiters": None,
    # Maximum wall time (seconds) per job. With multistarts
    # the total may exceed this since many jobs run.
    "maxtime": None,
}

Note: maxiters and maxtime only take effect when model.convert_to_format == 'casadi'.
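Overriding a subset of the defaults can be pictured as a dictionary merge. This is a plain-Python sketch: `resolve_options` is a hypothetical helper, and `int(time.time())` is only a stand-in for the time-based seed described above, not the library's seed generator.

```python
import time


def resolve_options(user_options=None):
    """Merge user-supplied options over the documented defaults."""
    defaults = {
        "seed": int(time.time()),  # stand-in for a time-based seed
        "low_memory": True,
        "maxiters": None,
        "maxtime": None,
    }
    # User values win; anything not supplied keeps its default.
    return {**defaults, **(user_options or {})}


# Fix the seed for reproducibility and cap each job at 10 minutes.
options = resolve_options({"seed": 42, "maxtime": 600})
print(options)
```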

Examples

>>> import ionworks_schema as iws
>>> # build the schema with the fields you care about
>>> obj = iws.objectives.OCPHalfCell(
...     electrode="positive",
...     data_input="path/to/ocp.csv",
... )
>>> fit = iws.DataFit(
...     objectives={"ocp": obj},
...     parameters={"Q_pe": iws.Parameter(
...         "Positive electrode capacity [A.h]",
...         initial_value=3.0,
...         bounds=(2.0, 4.0),
...     )},
...     priors={"Q_pe": iws.priors.Prior("Q_pe", iws.stats.Normal(3.0, 0.2))},
... )
>>> config = iws.Pipeline({"ocp fit": fit}).to_config()
>>> # then submit `config` via ionworks-api

Extends: ionworks_schema.base.BaseSchema

See also: ionworkspipeline.DataFit (runtime implementation).

objectives: dict[str, dict[str, Any] | BaseObjective | Any] | dict[str, Any] | BaseObjective | Any
source: str
parameters: dict[str, Any] | None
cost: dict[str, Any] | BaseSchema | str | Any | None
initial_guesses: dict[str, int | float] | list[dict[str, int | float]] | None
optimizer: dict[str, Any] | ParameterEstimator | BaseSchema | Any | None
cost_logger: dict[str, Any] | BaseSchema | Any | None
multistarts: int | None
num_workers: int | None
parallel: bool | None
max_batch_size: int | None
initial_guess_sampler: dict[str, Any] | DistributionSampler | Any | None
priors: dict[str, Any] | Prior | Any | list[dict[str, Any] | Prior | Any] | None
options: dict[str, Any] | None

Union-typed fields above use union_mode='left_to_right'.

wrap_bare_objective()

Wrap a bare objective in a dict, matching ionworkspipeline behavior.

Only applies to DataFit, not ArrayDataFit (which requires a dict keyed by independent variable values).

validate_parameters_or_priors()

At least one of parameters or priors must be supplied.

The runtime accepts both together — priors then act as regularizers on the listed fit parameters — so we mirror the runtime here rather than enforce a stricter mutual exclusion at the schema boundary.
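The rule amounts to an "at least one of" check rather than an "exactly one of" check. A minimal plain-Python sketch follows; the function name is hypothetical, and treating only None as "not supplied" is an assumption of the sketch.

```python
def check_parameters_or_priors(parameters=None, priors=None):
    """Raise if neither parameters nor priors is supplied.

    Supplying both is allowed. Assumption: only None counts as missing.
    """
    if parameters is None and priors is None:
        raise ValueError("At least one of parameters or priors must be set")
    return True


# All three of these combinations pass the check.
only_parameters = check_parameters_or_priors(parameters={"Q_pe": "..."})
only_priors = check_parameters_or_priors(priors=["..."])
both = check_parameters_or_priors(parameters={"Q_pe": "..."}, priors=["..."])

# Supplying neither is the only rejected case.
try:
    check_parameters_or_priors()
    rejected = False
except ValueError:
    rejected = True
print(only_parameters, only_priors, both, rejected)
```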

model_config = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'populate_by_name': True, 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Args:

  self: The BaseModel instance.
  context: The context.