Validators

Reusable validator functions and composable inbound and outbound pipelines for value normalization (e.g., converting between pandas DataFrames and dictionaries).

ionworks.validators.set_dataframe_backend(backend)

Set the default DataFrame backend for data fetching.

This overrides the IONWORKS_DATAFRAME_BACKEND environment variable.

Parameters:

backend (str) – DataFrame backend to use: “polars” or “pandas”.

Raises:

ValueError – If backend is not “polars” or “pandas”.

Return type:

None

ionworks.validators.get_dataframe_backend()

Get the current DataFrame backend setting.

Returns:

Current backend: “polars” or “pandas”.

Return type:

str
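The precedence described above (an explicit setting overrides the environment variable) can be sketched as follows. This is a minimal standalone illustration, not the library's implementation: the names `set_backend` and `get_backend` are hypothetical, and the "polars" fallback used when nothing is set is an assumption.

```python
import os

_VALID_BACKENDS = ("polars", "pandas")
_backend_override = None  # explicit choice; None means "not set"


def set_backend(backend: str) -> None:
    """Store an explicit backend choice, rejecting unknown names."""
    global _backend_override
    if backend not in _VALID_BACKENDS:
        raise ValueError(
            f"backend must be one of {_VALID_BACKENDS}, got {backend!r}"
        )
    _backend_override = backend


def get_backend() -> str:
    """Return the explicit setting if present, else the environment value."""
    if _backend_override is not None:
        return _backend_override
    # Assumption: the "polars" fallback is illustrative only.
    return os.environ.get("IONWORKS_DATAFRAME_BACKEND", "polars")


set_backend("pandas")
print(get_backend())
```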

exception ionworks.validators.MeasurementValidationError(message, errors=None)

Bases: IonworksError

Exception raised when measurement data validation fails.

Return type:

None

__init__(message, errors=None)

Initialize the IonworksError.

Parameters:
  • message (str | dict[str, Any]) – Error message string or dict containing error details. Supports both the legacy {"detail": ...} format and the new standardized {"error_code": ..., "message": ..., "detail": ...} format.

  • status_code (int | None) – Optional HTTP status code.

  • errors (list[str] | None) – Optional list of the individual validation errors found.

Return type:

None

ionworks.validators.positive_current_is_charge(t, current, voltage)

Determine whether positive current corresponds to charging.

Fits voltage = intercept + slope * capacity using weighted least squares (weights = dt). If the slope is non-negative (voltage rises or stays flat as cumulative charge increases) then positive current is charging; otherwise positive current is discharging.

Parameters:
  • t (np.ndarray) – Time values [s].

  • current (np.ndarray) – Current values [A].

  • voltage (np.ndarray) – Voltage values [V].

Returns:

  • is_charge (bool) – True if positive current is charging (slope >= 0). False if positive current is discharging (slope < 0). Returns False (assume discharge) when there is insufficient data.

  • p_value (float) – Two-sided p-value for the slope being nonzero. A lower value indicates greater confidence in the inferred sign convention. Returns 1.0 when there is insufficient data for a t-test.

Return type:

tuple[bool, float]
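The slope test described above can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the library's code: the function name `slope_sign_is_charge` is hypothetical, cumulative charge is integrated with a simple rectangle rule (in A.s rather than A.h, which does not affect the sign), the three-sample minimum is an assumption, and the p-value is omitted.

```python
def slope_sign_is_charge(t, current, voltage):
    """Return True if voltage rises (or stays flat) with cumulative charge."""
    if len(t) < 3:
        return False  # insufficient data: assume discharge

    # Time increments act as integration steps and as regression weights.
    dt = [t[i] - t[i - 1] for i in range(1, len(t))]

    # Cumulative charge at each sample after the first (rectangle rule).
    q, total = [], 0.0
    for i, step in enumerate(dt, start=1):
        total += current[i] * step
        q.append(total)
    v = voltage[1:]

    w_sum = sum(dt)
    if w_sum <= 0:
        return False

    # Weighted means, then weighted covariance / variance gives the slope.
    q_mean = sum(w * x for w, x in zip(dt, q)) / w_sum
    v_mean = sum(w * y for w, y in zip(dt, v)) / w_sum
    s_qq = sum(w * (x - q_mean) ** 2 for w, x in zip(dt, q))
    s_qv = sum(w * (x - q_mean) * (y - v_mean) for w, x, y in zip(dt, q, v))
    if s_qq == 0:
        return False  # no variation in charge, slope undefined

    return s_qv / s_qq >= 0


t = [0, 1, 2, 3, 4]
rising = slope_sign_is_charge(t, [1.0] * 5, [3.0, 3.1, 3.2, 3.3, 3.4])
falling = slope_sign_is_charge(t, [1.0] * 5, [3.4, 3.3, 3.2, 3.1, 3.0])
print(rising, falling)
```

With a constant positive current, a rising voltage yields a positive slope (positive current is charging) and a falling voltage yields a negative slope (positive current is discharging).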

ionworks.validators.validate_positive_current_is_discharge(df, current_col='Current [A]', voltage_col='Voltage [V]', time_col='Time [s]', step_col=None, rest_tol=0.001)

Validate that positive current corresponds to discharge.

Discharge should cause voltage to decrease. This function analyzes the relationship between current direction and voltage change to verify the sign convention is correct.

Uses weighted least squares (V vs cumulative Q) per step, then a confidence-weighted vote across steps to decide the overall convention. This matches the algorithm in ionworksdata.transform.

Parameters:
  • df (DataFrame) – Time series data with current and voltage columns (pandas or polars).

  • current_col (str) – Name of the current column.

  • voltage_col (str) – Name of the voltage column.

  • time_col (str) – Name of the time column.

  • step_col (str, optional) – Name of the step column. If provided, analyzes per-step. Otherwise, infers steps from current sign changes.

  • rest_tol (float) – Tolerance for considering current as zero (rest).

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]

ionworks.validators.validate_cumulative_values_reset_per_step(df, step_col='Step count', cumulative_cols=None, tolerance=1e-06)

Validate cumulative values reset to ~0 at each step and only increase.

Parameters:
  • df (DataFrame) – Time series data (pandas or polars).

  • step_col (str) – Name of the column containing step numbers.

  • cumulative_cols (list[str], optional) – List of cumulative column names to validate. If None, checks for common capacity and energy columns.

  • tolerance (float) – Tolerance for considering a value as “zero” at step start.

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]
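As a rough illustration of the rule being checked, the sketch below works on plain lists instead of a DataFrame. The name `check_cumulative_resets` is hypothetical, rows are assumed to be ordered so that each step is contiguous, and applying the same tolerance to the "only increase" check is an assumption.

```python
def check_cumulative_resets(steps, values, tolerance=1e-6):
    """Return error messages for one cumulative column."""
    errors = []
    previous_step = None
    previous_value = None
    for step, value in zip(steps, values):
        if step != previous_step:
            # First row of a new step: the value should be approximately 0.
            if abs(value) > tolerance:
                errors.append(f"step {step}: starts at {value}, expected ~0")
        elif value < previous_value - tolerance:
            # Same step: the value must not decrease.
            errors.append(
                f"step {step}: decreases from {previous_value} to {value}"
            )
        previous_step, previous_value = step, value
    return errors


steps = [0, 0, 0, 1, 1]
print(check_cumulative_resets(steps, [0.0, 0.5, 1.0, 0.0, 0.2]))
print(check_cumulative_resets(steps, [0.0, 0.5, 1.0, 1.0, 1.2]))
```

The first call returns an empty list. The second reports one error, because the value carries over into step 1 instead of resetting.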

ionworks.validators.validate_minimum_points_per_step(df, step_col='Step count', min_points=2)

Validate that each step has at least a minimum number of data points.

Parameters:
  • df (DataFrame) – Time series data (pandas or polars).

  • step_col (str) – Name of the column containing step numbers.

  • min_points (int) – Minimum number of points required per step.

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]
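A minimal sketch of this check on a plain list of step numbers, assuming nothing about the library's internals. The name `check_min_points` and the message wording are hypothetical.

```python
from collections import Counter


def check_min_points(steps, min_points=2):
    """Return one error message per step with too few rows."""
    counts = Counter(steps)
    return [
        f"step {step} has {n} point(s), expected at least {min_points}"
        for step, n in sorted(counts.items())
        if n < min_points
    ]


print(check_min_points([0, 0, 1, 2, 2]))
```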

ionworks.validators.validate_time_starts_at_zero(df, tolerance=1e-06)

Validate that ‘Time [s]’ starts at 0.

Parameters:
  • df (DataFrame) – Time series data (pandas or polars).

  • tolerance (float) – Tolerance for considering the start value as zero.

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]

ionworks.validators.validate_time_monotonic(df, time_col='Time [s]', tolerance=1e-12)

Validate that the time column is monotonically non-decreasing.

Parameters:
  • df (DataFrame) – Time series data (pandas or polars).

  • time_col (str) – Name of the time column.

  • tolerance (float) – Numerical tolerance; time[i] must be >= time[i-1] - tolerance.

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]
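The two time checks (start at zero, and time[i] >= time[i-1] - tolerance) can be sketched together on a plain list. The function name `check_time`, its parameter names, and the message wording are hypothetical; only the two conditions come from the descriptions above.

```python
def check_time(time, start_tol=1e-6, mono_tol=1e-12):
    """Return error messages for the start-at-zero and monotonic checks."""
    errors = []
    if not time:
        return errors
    if abs(time[0]) > start_tol:
        errors.append(f"time starts at {time[0]}, expected 0")
    for i in range(1, len(time)):
        # Repeated timestamps are allowed; only a real decrease is an error.
        if time[i] < time[i - 1] - mono_tol:
            errors.append(
                f"time decreases at row {i}: {time[i - 1]} -> {time[i]}"
            )
    return errors


print(check_time([0.0, 1.0, 1.0, 2.0]))
print(check_time([0.5, 1.0, 0.9]))
```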

ionworks.validators.validate_step_count_sequential(df)

Validate that ‘Step count’ exists, starts at 0, and increases by 1.

Parameters:

df (DataFrame) – Time series data (pandas or polars).

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]
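One way to read "starts at 0, and increases by 1" is that consecutive rows either stay in the same step or move to the next one. The sketch below encodes that reading on a plain list; the name `check_step_count` is hypothetical and the interpretation is an assumption.

```python
def check_step_count(steps):
    """Return error messages if steps do not run 0, 1, 2, ... without gaps."""
    errors = []
    if not steps:
        return errors
    if steps[0] != 0:
        errors.append(f"step count starts at {steps[0]}, expected 0")
    for i in range(1, len(steps)):
        diff = steps[i] - steps[i - 1]
        # Allowed: same step (0) or the next step (+1).
        if diff not in (0, 1):
            errors.append(
                f"row {i}: step count goes from {steps[i - 1]} to {steps[i]}"
            )
    return errors


print(check_step_count([0, 0, 1, 1, 2]))
print(check_step_count([1, 1, 3]))
```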

ionworks.validators.validate_cycle_constant_within_step(df, step_col='Step count', cycle_col=None)

Validate that cycle number does not change within a step.

Parameters:
  • df (DataFrame) – Time series data (pandas or polars).

  • step_col (str) – Name of the column containing step numbers.

  • cycle_col (str, optional) – Name of the column containing cycle numbers. If None, tries common names.

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]
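The check amounts to grouping rows by step and confirming each group holds a single cycle number. A minimal sketch on plain lists, with the hypothetical name `check_cycle_constant`:

```python
from collections import defaultdict


def check_cycle_constant(steps, cycles):
    """Return one error message per step that spans more than one cycle."""
    cycles_by_step = defaultdict(set)
    for step, cycle in zip(steps, cycles):
        cycles_by_step[step].add(cycle)
    return [
        f"step {step} spans multiple cycles: {sorted(found)}"
        for step, found in sorted(cycles_by_step.items())
        if len(found) > 1
    ]


print(check_cycle_constant([0, 0, 1, 1], [0, 0, 0, 1]))
```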

ionworks.validators.validate_ocp_columns(df)

Validate that OCP data has required columns.

Checks that the DataFrame contains:

  1. A ‘Voltage [V]’ column

  2. At least one x-axis column: ‘Capacity [A.h]’, ‘Stoichiometry’, or ‘SOC’

Parameters:

df (DataFrame) – Time series data (pandas or polars).

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]
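The column requirements above can be expressed as a small check over a collection of column names. This is a sketch only; the name `check_ocp_columns` and the message wording are hypothetical.

```python
X_AXIS_COLUMNS = ("Capacity [A.h]", "Stoichiometry", "SOC")


def check_ocp_columns(columns):
    """Return error messages for missing OCP columns."""
    errors = []
    if "Voltage [V]" not in columns:
        errors.append("missing required column 'Voltage [V]'")
    if not any(name in columns for name in X_AXIS_COLUMNS):
        errors.append(f"need at least one of {X_AXIS_COLUMNS}")
    return errors


print(check_ocp_columns(["Voltage [V]", "SOC"]))
print(check_ocp_columns(["Time [s]"]))
```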

ionworks.validators.validate_time_series_row_count(df, max_rows=1000)

Validate that the time series does not exceed the maximum row count.

Datasets larger than max_rows should be uploaded via the standard upload flow and then referenced with "db:<measurement_id>" or iwdata.DataLoader.from_db(MEASUREMENT_ID) in pipeline configurations.

Parameters:
  • df (DataFrame) – Time series data (pandas or polars).

  • max_rows (int) – Maximum allowed number of rows.

Returns:

List of validation error messages. Empty if validation passes.

Return type:

list[str]

ionworks.validators.validate_measurement_data(df, strict=False, data_type=None)

Validate measurement time series data before upload.

For standard cycler data (data_type=None), performs:

  1. Positive current should correspond to discharge (voltage decreases)

  2. Time starts at 0

  3. Time is monotonically non-decreasing

  4. ‘Step count’ column exists, starts at 0, and increases by 1

  5. Cumulative values (capacity, energy) should reset at each step start and only increase within steps

  6. Each step has at least 2 data points (strict mode only)

  7. Cycle number does not change within a step (strict mode only)

For OCP data (data_type="ocp"), only validates:

  1. ‘Voltage [V]’ column exists

  2. ‘Step count’ column exists and is sequential

Parameters:
  • df (DataFrame) – Time series data to validate (pandas or polars DataFrame).

  • strict (bool) – If False (default), skip strict checks. If True, run additional checks: minimum 2 points per step and cycle number constant within each step.

  • data_type (str | None) – The type of data being validated. Use "ocp" for open-circuit potential data, which relaxes validation to skip current, time, capacity, and energy checks. Default is None (standard cycler data).

Raises:

MeasurementValidationError – If any validation checks fail. The exception contains a list of all errors found.

Return type:

None
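The overall pattern (run every check, collect all messages, raise once with the full list) can be sketched as below. `ValidationFailed` is a hypothetical stand-in for MeasurementValidationError, and `run_checks` and the two toy checks are illustrative names, not library functions.

```python
class ValidationFailed(Exception):
    """Hypothetical stand-in that carries the list of collected errors."""

    def __init__(self, message, errors=None):
        super().__init__(message)
        self.errors = errors or []


def run_checks(data, checks):
    """Run every check, then raise once if any of them reported errors."""
    errors = []
    for check in checks:
        errors.extend(check(data))
    if errors:
        raise ValidationFailed(
            f"{len(errors)} validation error(s) found", errors=errors
        )


def time_starts_at_zero(data):
    return [] if data["Time [s]"][0] == 0 else ["'Time [s]' does not start at 0"]


def has_step_count(data):
    return [] if "Step count" in data else ["missing 'Step count' column"]


try:
    run_checks({"Time [s]": [5.0, 6.0]}, [time_starts_at_zero, has_step_count])
except ValidationFailed as exc:
    collected = exc.errors
    print(collected)
```

Collecting all messages before raising lets a caller fix every problem in one pass rather than discovering them one at a time.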

ionworks.validators.df_to_dict_validator(v)

Convert DataFrame to dict with orient='list' for serialization.

Parameters:

v (Any)

Return type:

Any

ionworks.validators.dict_to_df_validator(v, return_type=None)

Convert dict to DataFrame for data processing.

Parameters:
  • v (Any) – Value to convert. If dict, converts to DataFrame.

  • return_type (str | None) – Type of DataFrame to return: “polars” or “pandas”. If None, uses the global setting from set_dataframe_backend().

Returns:

DataFrame if input was dict, otherwise unchanged.

Return type:

Any

ionworks.validators.parameter_validator(v)

Convert pybamm.Symbol values to JSON-serializable form.

Parameters:

v (Any)

Return type:

Any

ionworks.validators.float_sanitizer(v)

Sanitize float values to JSON-compatible forms.

Converts inf, -inf, and NaN to None since these are not JSON-compliant.

Parameters:

v (Any)

Return type:

Any
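The conversion described above can be sketched with the standard library alone. The name `sanitize_float` is hypothetical; the sketch also shows why the conversion is needed, since strict JSON encoding rejects non-finite floats.

```python
import json
import math


def sanitize_float(v):
    """Map inf, -inf, and NaN to None; return everything else unchanged."""
    if isinstance(v, float) and (math.isinf(v) or math.isnan(v)):
        return None
    return v


# Strict JSON encoding refuses non-finite floats.
try:
    json.dumps(float("nan"), allow_nan=False)
    strict_encoding_failed = False
except ValueError:
    strict_encoding_failed = True

print(json.dumps(sanitize_float(float("nan"))))
```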

ionworks.validators.bounds_tuple_validator(v)

Convert bounds 2-tuple to list for JSON serialization.

Parameters:

v (Any) – Value to validate. If it’s a tuple with 2 elements, converts to list.

Returns:

List if input was a 2-tuple, otherwise unchanged.

Return type:

Any

ionworks.validators.file_scheme_validator(v)

Convert file:// and folder:// scheme paths to serialized dicts.

Handles file: prefixed paths (loads CSV as dict) and folder: prefixed paths (loads time_series.csv and steps.csv as dict). All other values are returned unchanged.

Parameters:

v (Any)

Raises:

FileNotFoundError – If the file or folder path doesn’t exist.

Return type:

Any

ionworks.validators.run_validators_outbound(v)

Recursively apply outbound validators to values and nested containers.

Parameters:

v (Any)

Return type:

Any

ionworks.validators.run_validators_inbound(v)

Recursively apply inbound validators to values and nested containers.

Parameters:

v (Any)

Return type:

Any
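Recursive application over nested containers can be sketched as follows. This is an illustration of the general pattern, not the library's pipeline: the names `apply_outbound`, `sanitize_float`, and `tuple_to_list` are hypothetical, and the choice and order of validators in the list are assumptions.

```python
import math


def sanitize_float(v):
    """Map non-finite floats to None."""
    if isinstance(v, float) and (math.isinf(v) or math.isnan(v)):
        return None
    return v


def tuple_to_list(v):
    """Convert a 2-tuple to a list; leave other values unchanged."""
    return list(v) if isinstance(v, tuple) and len(v) == 2 else v


# Assumption: the pipeline is an ordered list of single-value validators.
OUTBOUND = [tuple_to_list, sanitize_float]


def apply_outbound(v):
    """Apply each validator to v, then recurse into dicts and lists."""
    for validator in OUTBOUND:
        v = validator(v)
    if isinstance(v, dict):
        return {key: apply_outbound(item) for key, item in v.items()}
    if isinstance(v, list):
        return [apply_outbound(item) for item in v]
    return v


payload = {"bounds": (0.0, float("inf")), "nested": [{"x": float("nan")}, 1.0]}
result = apply_outbound(payload)
print(result)
```

Running the validators before recursing means a tuple converted to a list is itself traversed, so its elements are sanitized as well.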