check_data

Warning

🚧 Sprout is still in active development and evolving quickly, so the documentation and functionality may not work as described and could undergo substantial changes 🚧

check_data(data: pl.DataFrame, resource_properties: ResourceProperties)

Check that the DataFrame matches the requirements in the resource properties.

Runs a few checks that compare the data against the properties on the following items:

Data                         Properties
Column names                 field.name
Column types                 field.types
Column values’ types         field.types
Column values’ constraints   field.constraints
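
To make the first row of the table concrete, here is a minimal sketch of a column-name comparison. It is not Sprout's implementation: `compare_column_names` is a hypothetical helper, and plain lists stand in for the DataFrame's columns and the field names in the resource properties.

```python
# Hypothetical stand-ins: plain lists instead of a Polars DataFrame
# and a ResourceProperties object.
def compare_column_names(data_columns, field_names):
    """Return (missing_from_data, unexpected_in_data), preserving input order."""
    missing = [name for name in field_names if name not in data_columns]
    unexpected = [name for name in data_columns if name not in field_names]
    return missing, unexpected


missing, unexpected = compare_column_names(
    data_columns=["id", "name", "amount"],
    field_names=["id", "name", "value"],
)
print(missing)     # ['value']
print(unexpected)  # ['amount']
```

Reporting both directions of the mismatch mirrors the "in the properties" versus "in the data" split used in the error messages.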

Error messages generally follow this format:

# {data item}:

There is a mismatch found:

- In the properties: {mismatch}
- In the data: {mismatch}
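
As an illustration of the template above, the sketch below fills it in for one mismatch. `format_mismatch` is a hypothetical helper written for this example and is not part of Sprout; the values passed to it are made up.

```python
def format_mismatch(data_item: str, in_properties: str, in_data: str) -> str:
    """Fill in the message template for a single mismatch (illustrative only)."""
    return (
        f"# {data_item}:\n\n"
        "There is a mismatch found:\n\n"
        f"- In the properties: {in_properties}\n"
        f"- In the data: {in_data}"
    )


message = format_mismatch(
    data_item="Column names",
    in_properties="id, name, value",
    in_data="id, name, amount",
)
print(message)
```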

Parameters

data : pl.DataFrame

A Polars DataFrame.

resource_properties : ResourceProperties

The specific ResourceProperties for the data.

Returns

pl.DataFrame

The data, if all checks pass.

Raises

ExceptionGroup[CheckError]

If the resource properties are incorrect.

ValueError

If column names in the data are incorrect.

ExceptionGroup[ValueError]

If data types in the data are incorrect.

Examples

import seedcase_sprout as sp

sp.check_data(
    data=sp.example_data(),
    resource_properties=sp.example_resource_properties()
)
shape: (3, 3)
id    name        value
i64   str         f64
34    "Helly R"   123.123
99    "Mark S"    9988.0
100   "Ms Casey"  -76.0009