Quick start

Xarray-schema provides a simple class-based API for defining schemas and validating Xarray objects (and their components).

All schema objects have validate() and to_json() methods.

[1]:
import numpy as np
import xarray as xr

from xarray_schema import DataArraySchema

We’ll start with a simple example that uses the DataArraySchema to validate the following DataArray:

[2]:
da = xr.DataArray(np.ones((4, 10), dtype='i4'), dims=['x', 't'], name='foo')

We can create a schema for this DataArray that specifies the dtype, name, and shape. In the shape, None acts as a wildcard, so (4, None) accepts any length along the second dimension.

[3]:
schema = DataArraySchema(dtype=np.integer, name='foo', shape=(4, None))
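
The wildcard behaviour can be pictured with a few lines of plain Python. This is only a sketch of the matching rule as described here, not the library's implementation; shape_matches is a hypothetical helper, and the rule that the number of dimensions must agree is an assumption.

```python
from typing import Optional, Sequence


def shape_matches(expected: Sequence[Optional[int]], actual: Sequence[int]) -> bool:
    """Return True if `actual` matches `expected`, treating None as a wildcard.

    Hypothetical helper for illustration; not part of xarray-schema.
    """
    # Assumption: the number of dimensions must agree before sizes are compared.
    if len(expected) != len(actual):
        return False
    # A None entry accepts any size; an integer entry must match exactly.
    return all(e is None or e == a for e, a in zip(expected, actual))


print(shape_matches((4, None), (4, 10)))     # True: second axis is a wildcard
print(shape_matches((4, None), (5, 10)))     # False: first axis must be 4
print(shape_matches((4, None), (4, 10, 2)))  # False: wrong number of dimensions
```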

With our schema created, we can now validate our DataArray:

[4]:
schema.validate(da)

When we validate an object that doesn’t conform to our schema, we get a SchemaError:

[5]:
da2 = xr.DataArray(np.ones((4, 10), dtype='f4'), dims=['x', 't'], name='foo')
schema.validate(da2)
---------------------------------------------------------------------------
SchemaError                               Traceback (most recent call last)
<ipython-input-5-75422557b423> in <module>
      1 da2 = xr.DataArray(np.ones((4, 10), dtype='f4'), dims=['x', 't'], name='foo')
----> 2 schema.validate(da2)

~/Dropbox (Personal)/src/xarray-schema/xarray_schema/dataarray.py in validate(self, da)
    196
    197         if self.dtype is not None:
--> 198             self.dtype.validate(da.dtype)
    199
    200         if self.name is not None:

~/Dropbox (Personal)/src/xarray-schema/xarray_schema/components.py in validate(self, dtype)
     38         '''
     39         if not np.issubdtype(dtype, self.dtype):
---> 40             raise SchemaError(f'dtype {dtype} != {self.dtype}')
     41
     42     @property

SchemaError: dtype float32 != <class 'numpy.integer'>
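
The traceback shows that the dtype check is built on np.issubdtype, which explains why a schema declared with the abstract np.integer accepts the int32 array but rejects the float32 one. The short NumPy-only sketch below reproduces that comparison outside of xarray-schema; the variable names are illustrative.

```python
import numpy as np

# Integer dtypes of any width are sub-dtypes of the abstract np.integer.
int32_ok = np.issubdtype(np.dtype('i4'), np.integer)
int64_ok = np.issubdtype(np.dtype('i8'), np.integer)

# float32 is not an integer dtype, which is the comparison that fails for da2.
float32_ok = np.issubdtype(np.dtype('f4'), np.integer)

print(int32_ok, int64_ok, float32_ok)  # True True False
```

Using an abstract type such as np.integer keeps the schema independent of the integer width; a concrete dtype such as 'i4' would only match 32-bit integers.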

Schemas can also be exported to JSON:

[6]:
schema.json
[6]:
{'dtype': 'integer', 'shape': [4, None], 'name': 'foo'}
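
The output above is a plain Python dict, so it can be serialized with the standard library. The sketch below copies that dict by hand (rather than calling the library) to show how the None wildcard becomes null in the JSON text and survives a round trip.

```python
import json

# Dict copied from the output of the cell above.
schema_dict = {'dtype': 'integer', 'shape': [4, None], 'name': 'foo'}

# json.dumps converts Python None to JSON null.
schema_text = json.dumps(schema_dict)
print(schema_text)  # {"dtype": "integer", "shape": [4, null], "name": "foo"}

# Loading the text back restores the original dict, wildcard included.
restored = json.loads(schema_text)
print(restored == schema_dict)  # True
```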

Components

Each component of the Xarray data model is implemented as a standalone class:

[7]:
from xarray_schema.components import (
    DTypeSchema,
    DimsSchema,
    ShapeSchema,
    NameSchema,
    ChunksSchema,
    ArrayTypeSchema,
    AttrSchema,
    AttrsSchema
)

# example constructions
dtype_schema = DTypeSchema('i4')
dims_schema = DimsSchema(('x', 'y', None))  # None is used as a wildcard
shape_schema = ShapeSchema((5, 10, None))  # None is used as a wildcard
name_schema = NameSchema('foo')
chunk_schema = ChunksSchema({'x': None, 'y': -1})  # None is used as a wildcard, -1 is used as
array_type_schema = ArrayTypeSchema(np.ndarray)  # use a new name so the class is not shadowed

# Example usage
dtype_schema.validate(da.dtype)

# Each schema object can be exported to JSON
chunk_json = chunk_schema.to_json()  # keep chunk_schema bound to the schema object
print(chunk_json)
{"x": null, "y": -1}