Validators
Reusable validator functions and composable inbound/outbound pipelines for value normalization (e.g., converting between pandas DataFrames and dictionaries).
- ionworks.validators.set_dataframe_backend(backend)
Set the default DataFrame backend for data fetching.
This overrides the IONWORKS_DATAFRAME_BACKEND environment variable.
- Parameters:
backend (str) – DataFrame backend to use: “polars” or “pandas”.
- Raises:
ValueError – If backend is not “polars” or “pandas”.
- Return type:
None
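The precedence described above (an explicit setting overrides the IONWORKS_DATAFRAME_BACKEND environment variable) can be sketched in plain Python. This is an illustration only, not the library's implementation: the function names set_backend_sketch and get_backend_sketch are hypothetical, and the fallback value is an assumption because this page does not state the default.

```python
import os

_VALID_BACKENDS = ("polars", "pandas")
_backend_override = None  # None means no explicit backend has been set


def set_backend_sketch(backend):
    """Store an explicit backend, rejecting anything but the two valid names."""
    global _backend_override
    if backend not in _VALID_BACKENDS:
        raise ValueError(f"backend must be one of {_VALID_BACKENDS}, got {backend!r}")
    _backend_override = backend


def get_backend_sketch(fallback="pandas"):
    """Explicit setting first, then the environment variable, then the fallback.

    The fallback is an assumption; the real default is not documented here.
    """
    if _backend_override is not None:
        return _backend_override
    return os.environ.get("IONWORKS_DATAFRAME_BACKEND", fallback)
```

With this ordering, a call to the setter wins over whatever the environment variable holds, which is the behavior the entry above describes.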
- ionworks.validators.get_dataframe_backend()
Get the current DataFrame backend setting.
- Returns:
Current backend: “polars” or “pandas”.
- Return type:
str
- exception ionworks.validators.MeasurementValidationError(message, errors=None)
Bases: IonworksError
Exception raised when measurement data validation fails.
- ionworks.validators.positive_current_is_charge(t, current, voltage)
Determine whether positive current corresponds to charging.
Fits voltage = intercept + slope * capacity using weighted least squares (weights = dt). If the slope is non-negative (voltage rises or stays flat as cumulative charge increases), then positive current is charging; otherwise positive current is discharging.
- Parameters:
t (np.ndarray) – Time values [s].
current (np.ndarray) – Current values [A].
voltage (np.ndarray) – Voltage values [V].
- Returns:
is_charge (bool) – True if positive current is charging (slope >= 0). False if positive current is discharging (slope < 0). Returns False (assume discharge) when there is insufficient data.
p_value (float) – Two-sided p-value for the slope being nonzero. Lower means more confident. Returns 1.0 when there is insufficient data for a t-test.
- Return type:
tuple[bool, float]
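The slope test can be illustrated with a small pure-Python weighted least-squares fit. This is a sketch under stated assumptions, not the library's code: the function name is hypothetical, cumulative charge is accumulated with a rectangle rule, each interval dt is paired with the sample at its end, and the p-value is omitted.

```python
def positive_current_is_charge_sketch(t, current, voltage):
    """Return True if voltage rises (or stays flat) with cumulative charge."""
    if len(t) < 3:
        return False  # too few samples to fit a line; assume discharge

    weights, capacity, volts = [], [], []
    q = 0.0
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        q += current[i] * dt  # rectangle-rule cumulative charge [A.s]
        weights.append(dt)    # assumption: weight each sample by its dt
        capacity.append(q)
        volts.append(voltage[i])

    w_sum = sum(weights)
    if w_sum <= 0.0:
        return False
    q_mean = sum(w * x for w, x in zip(weights, capacity)) / w_sum
    v_mean = sum(w * y for w, y in zip(weights, volts)) / w_sum

    # Weighted covariance and variance give the least-squares slope.
    s_qq = sum(w * (x - q_mean) ** 2 for w, x in zip(weights, capacity))
    if s_qq == 0.0:
        return False  # charge never changes, so the slope is undefined
    s_qv = sum(
        w * (x - q_mean) * (y - v_mean)
        for w, x, y in zip(weights, capacity, volts)
    )
    return s_qv / s_qq >= 0.0
```

For a constant positive current with rising voltage the fitted slope is positive, so the sketch reports charging; with falling voltage it reports discharging.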
- ionworks.validators.validate_positive_current_is_discharge(df, current_col='Current [A]', voltage_col='Voltage [V]', time_col='Time [s]', step_col=None, rest_tol=0.001)
Validate that positive current corresponds to discharge.
Discharge should cause voltage to decrease. This function analyzes the relationship between current direction and voltage change to verify the sign convention is correct.
Uses weighted least squares (V vs. cumulative Q) per step, then a confidence-weighted vote across steps to decide the overall convention. This matches the algorithm in ionworksdata.transform.
- Parameters:
df (DataFrame) – Time series data with current and voltage columns (pandas or polars).
current_col (str) – Name of the current column.
voltage_col (str) – Name of the voltage column.
time_col (str) – Name of the time column.
step_col (str, optional) – Name of the step column. If provided, analyzes per-step. Otherwise, infers steps from current sign changes.
rest_tol (float) – Tolerance for considering current as zero (rest).
- Returns:
List of validation error messages. Empty if validation passes.
- Return type:
list[str]
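One way to picture the confidence-weighted vote across steps is shown below. The exact weighting used by the library is not described on this page, so this sketch assumes each step votes with weight 1 - p_value; the function name and the input format are hypothetical.

```python
def weighted_vote_sketch(step_results):
    """Combine per-step results into one decision.

    step_results is a list of (is_charge, p_value) pairs, one per step.
    Returns True if the vote says positive current is charging.
    """
    charge_weight = 0.0
    discharge_weight = 0.0
    for is_charge, p_value in step_results:
        confidence = 1.0 - p_value  # assumption: low p-value = strong vote
        if is_charge:
            charge_weight += confidence
        else:
            discharge_weight += confidence
    # Ties (including no steps at all) fall back to "discharge".
    return charge_weight > discharge_weight
```

A single noisy step with a high p-value contributes almost nothing, so a few confident steps dominate the outcome.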
- ionworks.validators.validate_cumulative_values_reset_per_step(df, step_col='Step count', cumulative_cols=None, tolerance=1e-06)
Validate cumulative values reset to ~0 at each step and only increase.
- Parameters:
df (DataFrame) – Time series data (pandas or polars).
step_col (str) – Name of the column containing step numbers.
cumulative_cols (list[str], optional) – List of cumulative column names to validate. If None, checks for common capacity and energy columns.
tolerance (float) – Tolerance for considering a value as “zero” at step start.
- Returns:
List of validation error messages. Empty if validation passes.
- Return type:
list[str]
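The two conditions (reset to about 0 at each step start, non-decreasing within a step) can be sketched on plain lists. This is an illustrative sketch, not the library's implementation: the function name and message wording are hypothetical, rows of a step are assumed to be contiguous, and the tolerance is applied only to the step-start check.

```python
def cumulative_reset_errors_sketch(steps, values, tolerance=1e-6):
    """Return error messages for one cumulative column.

    steps and values are equal-length lists, one entry per row.
    """
    errors = []
    previous_step = None
    previous_value = None
    for row, (step, value) in enumerate(zip(steps, values)):
        if step != previous_step:
            # First row of a new step: the value should be close to zero.
            if abs(value) > tolerance:
                errors.append(f"step {step}: starts at {value}, expected ~0")
        elif value < previous_value:
            # Same step as the previous row: the value must not go down.
            errors.append(f"step {step}: value decreases at row {row}")
        previous_step, previous_value = step, value
    return errors
```

An empty list means the column passed both checks, mirroring the return convention documented above.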
- ionworks.validators.validate_minimum_points_per_step(df, step_col='Step count', min_points=2)
Validate that each step has at least a minimum number of data points.
- ionworks.validators.validate_time_starts_at_zero(df, tolerance=1e-06)
Validate that ‘Time [s]’ starts at 0.
- ionworks.validators.validate_time_monotonic(df, time_col='Time [s]', tolerance=1e-12)
Validate that the time column is monotonically non-decreasing.
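The two time checks (starts at 0, monotonically non-decreasing) amount to the following, shown on a plain list of time values. This is a sketch only: the function name and messages are hypothetical, and how the library applies its tolerance to the monotonicity test is an assumption.

```python
def time_errors_sketch(times, start_tolerance=1e-6, step_tolerance=1e-12):
    """Return error messages for a list of time values in seconds."""
    if not times:
        return ["time column is empty"]
    errors = []
    if abs(times[0]) > start_tolerance:
        errors.append(f"time starts at {times[0]}, expected 0")
    for i in range(1, len(times)):
        # Equal timestamps are allowed; only a real decrease is an error.
        if times[i] < times[i - 1] - step_tolerance:
            errors.append(f"time decreases at row {i}")
    return errors
```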
- ionworks.validators.validate_step_count_sequential(df)
Validate that ‘Step count’ exists, starts at 0, and increases by 1.
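The "starts at 0 and increases by 1" rule can be read as: the first row is step 0, and each later row either repeats the previous step or moves to the next one. A minimal sketch of that reading, with a hypothetical function name and message wording:

```python
def step_count_errors_sketch(step_counts):
    """Return error messages for a list of per-row step numbers."""
    if not step_counts:
        return ["'Step count' is empty"]
    errors = []
    if step_counts[0] != 0:
        errors.append(f"'Step count' starts at {step_counts[0]}, expected 0")
    for i in range(1, len(step_counts)):
        change = step_counts[i] - step_counts[i - 1]
        # Allowed changes: 0 (same step) or 1 (next step).
        if change not in (0, 1):
            errors.append(f"'Step count' changes by {change} at row {i}")
    return errors
```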
- ionworks.validators.validate_cycle_constant_within_step(df, step_col='Step count', cycle_col=None)
Validate that cycle number does not change within a step.
- Parameters:
- Returns:
List of validation error messages. Empty if validation passes.
- Return type:
list[str]
- ionworks.validators.validate_ocp_columns(df)
Validate that OCP data has required columns.
Checks that the DataFrame contains:
1. A ‘Voltage [V]’ column
2. At least one x-axis column: ‘Capacity [A.h]’, ‘Stoichiometry’, or ‘SOC’
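The column requirement can be expressed as a membership test on the column names. The sketch below works on any collection of names; the function name, the message wording, and the choice to return a list of messages are assumptions, since this entry does not document a return value.

```python
X_AXIS_COLUMNS = ("Capacity [A.h]", "Stoichiometry", "SOC")


def ocp_column_errors_sketch(columns):
    """Return error messages for a collection of column names."""
    errors = []
    if "Voltage [V]" not in columns:
        errors.append("missing required column 'Voltage [V]'")
    # Any one of the x-axis columns is enough.
    if not any(name in columns for name in X_AXIS_COLUMNS):
        errors.append(f"need at least one of {X_AXIS_COLUMNS}")
    return errors
```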
- ionworks.validators.validate_time_series_row_count(df, max_rows=1000)
Validate that the time series does not exceed the maximum row count.
Datasets larger than max_rows should be uploaded via the standard upload flow and then referenced with "db:<measurement_id>" or iwdata.DataLoader.from_db(MEASUREMENT_ID) in pipeline configurations.
- ionworks.validators.validate_measurement_data(df, strict=False, data_type=None)
Validate measurement time series data before upload.
For standard cycler data (data_type=None), performs:
1. Positive current should correspond to discharge (voltage decreases)
2. Time starts at 0
3. Time is monotonically non-decreasing
4. ‘Step count’ column exists, starts at 0, and increases by 1
5. Cumulative values (capacity, energy) should reset at each step start and only increase within steps
6. Each step has at least 2 data points (strict mode only)
7. Cycle number does not change within a step (strict mode only)
For OCP data (data_type="ocp"), only validates:
1. ‘Voltage [V]’ column exists
2. ‘Step count’ column exists and is sequential
- Parameters:
df (DataFrame) – Time series data to validate (pandas or polars DataFrame).
strict (bool) – If False (default), skip strict checks. If True, run additional checks: minimum 2 points per step and cycle number constant within each step.
data_type (str | None) – The type of data being validated. Use "ocp" for open-circuit potential data, which relaxes validation to skip current, time, capacity, and energy checks. Default is None (standard cycler data).
- Raises:
MeasurementValidationError – If any validation checks fail. The exception contains a list of all errors found.
- Return type:
None
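The collect-then-raise pattern described above (run every check, gather all messages, raise once with the full list) can be sketched as follows. Everything here is hypothetical scaffolding: ValidationErrorSketch stands in for MeasurementValidationError and only mirrors its documented (message, errors=None) signature, and storing the list on an errors attribute is an assumption.

```python
class ValidationErrorSketch(Exception):
    """Stand-in exception carrying every error message found."""

    def __init__(self, message, errors=None):
        super().__init__(message)
        self.errors = errors or []  # assumption: errors kept as an attribute


def run_checks_sketch(data, checks, strict_checks=(), strict=False):
    """Run each check on data; raise once if any check reported errors.

    Each check is a callable returning a list of error messages.
    """
    errors = []
    for check in checks:
        errors.extend(check(data))
    if strict:
        # Extra checks only run in strict mode.
        for check in strict_checks:
            errors.extend(check(data))
    if errors:
        raise ValidationErrorSketch(
            f"{len(errors)} validation error(s)", errors=errors
        )
```

Raising once at the end lets a caller see every problem in a dataset in a single pass instead of fixing them one at a time.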
- ionworks.validators.df_to_dict_validator(v)
Convert DataFrame to dict with orient=’list’ for serialization.
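With orient='list', the resulting dict maps each column name to a list of that column's values. The sketch below builds the same layout from row records in plain Python, only to show the shape of the output; the function name is hypothetical and no DataFrame library is involved.

```python
def rows_to_column_lists(rows):
    """Turn a list of row dicts into {column name: [values...]}."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


rows = [
    {"Time [s]": 0.0, "Voltage [V]": 4.2},
    {"Time [s]": 1.0, "Voltage [V]": 4.1},
]
column_lists = rows_to_column_lists(rows)
```

Because every value is a plain list keyed by a string, this layout serializes directly to JSON.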
- ionworks.validators.dict_to_df_validator(v, return_type=None)
Convert dict to DataFrame for data processing.
- Parameters:
v (Any) – Value to convert. If dict, converts to DataFrame.
return_type (str | None) – Type of DataFrame to return: “polars” or “pandas”. If None, uses the global setting from set_dataframe_backend().
- Returns:
DataFrame if input was dict, otherwise unchanged.
- Return type:
Any
- ionworks.validators.parameter_validator(v)
Convert pybamm.Symbol values to JSON-serializable form.
- ionworks.validators.float_sanitizer(v)
Sanitize float values to JSON-compatible forms.
Converts inf, -inf, and NaN to None since these are not JSON-compliant.
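The conversion can be sketched with math.isfinite. This is an illustration of the described behavior, not the library's code; the function name is hypothetical, and leaving non-float values untouched is an assumption.

```python
import json
import math


def float_sanitizer_sketch(v):
    """Map inf, -inf, and NaN to None; return anything else unchanged."""
    if isinstance(v, float) and not math.isfinite(v):
        return None
    return v


values = [1.0, float("nan"), float("inf"), float("-inf")]
encoded = json.dumps([float_sanitizer_sketch(x) for x in values])
```

Without the sanitizer, Python's json module would emit NaN and Infinity tokens by default, which strict JSON parsers reject.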
- ionworks.validators.bounds_tuple_validator(v)
Convert bounds 2-tuple to list for JSON serialization.
- Parameters:
v (Any) – Value to validate. If it’s a tuple with 2 elements, converts to list.
- Returns:
List if input was a 2-tuple, otherwise unchanged.
- Return type:
Any
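The rule above is small enough to sketch directly: a tuple of exactly two elements becomes a list, and everything else passes through. The function name is hypothetical and this is not the library's implementation.

```python
def bounds_tuple_sketch(v):
    """Convert a 2-tuple to a list; return any other value unchanged."""
    if isinstance(v, tuple) and len(v) == 2:
        return list(v)
    return v
```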
- ionworks.validators.file_scheme_validator(v)
Convert file:// and folder:// scheme paths to serialized dicts.
Handles file: prefixed paths (loads CSV as dict) and folder: prefixed paths (loads time_series.csv and steps.csv as dict). All other values are returned unchanged.
- Raises:
FileNotFoundError – If the file or folder path doesn’t exist.
- Parameters:
v (Any)
- Return type:
Any
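The file: case can be sketched with the standard csv module. This is an illustration under several assumptions, not the library's implementation: the function name is hypothetical, the prefix is taken to be the literal string "file:", every column is assumed to be numeric, and the folder: case is left out.

```python
import csv
from pathlib import Path


def file_scheme_sketch(v):
    """Load a 'file:'-prefixed CSV path into {column: [floats]}.

    Any value that is not a 'file:'-prefixed string is returned unchanged.
    """
    if not (isinstance(v, str) and v.startswith("file:")):
        return v
    path = Path(v[len("file:"):])
    if not path.exists():
        raise FileNotFoundError(path)
    columns = {}
    with path.open(newline="") as handle:
        for row in csv.DictReader(handle):
            for name, value in row.items():
                # Assumption: all cells parse as floats.
                columns.setdefault(name, []).append(float(value))
    return columns
```

Non-string values and strings without the prefix pass straight through, which matches the "returned unchanged" behavior described above.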