Quick start
Xarray-schema provides a simple class-based API for defining schemas and validating Xarray objects (and their components).
All schema objects have .validate() and .to_json() methods.
[1]:
import numpy as np
import xarray as xr
from xarray_schema import DataArraySchema
We’ll start with a simple example that uses DataArraySchema to validate the following DataArray:
[2]:
da = xr.DataArray(np.ones((4, 10), dtype='i4'), dims=['x', 't'], name='foo')
We can create a schema for this DataArray that includes the datatype, name, and shape. Note that in the shape schema we’ve used None as a wildcard, so the second dimension may have any length.
[3]:
schema = DataArraySchema(dtype=np.integer, name='foo', shape=(4, None))
With our schema created, we can now validate our DataArray:
[4]:
schema.validate(da)
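To make the wildcard behaviour concrete, here is a minimal plain-Python sketch of shape matching in which None accepts any length. The helper name shape_matches is hypothetical and is not part of xarray-schema; it only illustrates the idea and is not the library's implementation.

```python
from typing import Optional, Sequence


def shape_matches(expected: Sequence[Optional[int]], actual: Sequence[int]) -> bool:
    """Hypothetical helper: True if `actual` fits `expected`, with None as a wildcard."""
    # The number of dimensions must agree before comparing sizes.
    if len(expected) != len(actual):
        return False
    # A None entry accepts any size; an integer entry must match exactly.
    return all(e is None or e == a for e, a in zip(expected, actual))


print(shape_matches((4, None), (4, 10)))  # True: second dimension is a wildcard
print(shape_matches((4, None), (4, 25)))  # True: any length is accepted along t
print(shape_matches((4, None), (5, 10)))  # False: first dimension must be 4
print(shape_matches((4, None), (4,)))     # False: wrong number of dimensions
```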
When we validate an object that doesn’t conform to our schema, we get a SchemaError. Here da2 holds float32 data, which the integer dtype schema rejects:
[5]:
da2 = xr.DataArray(np.ones((4, 10), dtype='f4'), dims=['x', 't'], name='foo')
schema.validate(da2)
---------------------------------------------------------------------------
SchemaError                               Traceback (most recent call last)
<ipython-input-5-75422557b423> in <module>
      1 da2 = xr.DataArray(np.ones((4, 10), dtype='f4'), dims=['x', 't'], name='foo')
----> 2 schema.validate(da2)

~/Dropbox (Personal)/src/xarray-schema/xarray_schema/dataarray.py in validate(self, da)
    196
    197         if self.dtype is not None:
--> 198             self.dtype.validate(da.dtype)
    199
    200         if self.name is not None:

~/Dropbox (Personal)/src/xarray-schema/xarray_schema/components.py in validate(self, dtype)
     38         '''
     39         if not np.issubdtype(dtype, self.dtype):
---> 40             raise SchemaError(f'dtype {dtype} != {self.dtype}')
     41
     42     @property

SchemaError: dtype float32 != <class 'numpy.integer'>
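The traceback shows that the dtype check relies on np.issubdtype, which is why np.integer in the schema accepts any integer dtype but rejects floats. The short example below uses only NumPy to show that behaviour on a few dtypes; the particular dtypes are chosen for illustration.

```python
import numpy as np

# np.integer is an abstract parent of all signed and unsigned integer dtypes.
print(np.issubdtype(np.dtype('i4'), np.integer))  # True: int32, as in da
print(np.issubdtype(np.dtype('i8'), np.integer))  # True: int64 also passes
print(np.issubdtype(np.dtype('u2'), np.integer))  # True: unsigned integers pass too
print(np.issubdtype(np.dtype('f4'), np.integer))  # False: float32, as in da2
```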
Schemas can also be exported to JSON. The json property returns a JSON-compatible dictionary:
[6]:
schema.json
[6]:
{'dtype': 'integer', 'shape': [4, None], 'name': 'foo'}
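Because the exported value is a plain dictionary, it can be serialized with the standard library. The sketch below copies the dictionary from the output above by hand, so it runs without xarray-schema, and shows that the None wildcard becomes null in JSON text and comes back as None when parsed.

```python
import json

# Dictionary copied from the output above (typed in by hand for this sketch).
schema_dict = {'dtype': 'integer', 'shape': [4, None], 'name': 'foo'}

# Serialize to a JSON string; Python's None is written as null.
text = json.dumps(schema_dict)
print(text)  # {"dtype": "integer", "shape": [4, null], "name": "foo"}

# Parsing the string restores the original dictionary, wildcard included.
restored = json.loads(text)
print(restored == schema_dict)  # True
```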
Components
Each component of the Xarray data model is implemented as a standalone class:
[7]:
from xarray_schema.components import (
    DTypeSchema,
    DimsSchema,
    ShapeSchema,
    NameSchema,
    ChunksSchema,
    ArrayTypeSchema,
    AttrSchema,
    AttrsSchema,
)
# Example constructions
dtype_schema = DTypeSchema('i4')
dims_schema = DimsSchema(('x', 'y', None)) # None is used as a wildcard
shape_schema = ShapeSchema((5, 10, None)) # None is used as a wildcard
name_schema = NameSchema('foo')
chunk_schema = ChunksSchema({'x': None, 'y': -1}) # None is used as a wildcard, -1 means one chunk spanning the full dimension
array_type_schema = ArrayTypeSchema(np.ndarray) # use a new name so the class is not shadowed
# Example usage
dtype_schema.validate(da.dtype)
# Each object schema can be exported to JSON format
chunk_json = chunk_schema.to_json() # keep chunk_schema bound to the schema object
print(chunk_json)
{"x": null, "y": -1}
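As a rough illustration of what a chunk specification like {"x": null, "y": -1} could mean, the sketch below checks per-dimension block sizes in plain Python. It assumes the dask convention that -1 stands for a single chunk covering the whole dimension and that the last block may be shorter than the rest. The helper chunks_match and its arguments are hypothetical, not the library's API or its actual validation logic.

```python
from typing import Dict, Optional, Tuple


def chunks_match(
    expected: Dict[str, Optional[int]],
    chunks: Dict[str, Tuple[int, ...]],
    sizes: Dict[str, int],
) -> bool:
    """Hypothetical helper: compare block sizes per dimension with a chunk spec."""
    for dim, want in expected.items():
        if want is None:
            continue  # wildcard: any chunking is accepted along this dimension
        if want == -1:
            want = sizes[dim]  # assumption: -1 means the full dimension length
        blocks = chunks[dim]
        # All blocks but the last must equal the expected size;
        # the last block may be shorter (dask convention).
        if any(b != want for b in blocks[:-1]) or blocks[-1] > want:
            return False
    return True


spec = {'x': None, 'y': -1}
sizes = {'x': 4, 'y': 10}
print(chunks_match(spec, {'x': (2, 2), 'y': (10,)}, sizes))  # True
print(chunks_match(spec, {'x': (2, 2), 'y': (5, 5)}, sizes))  # False
```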