XarrayWriter#
- class XarrayWriter(params=None, **kwargs)#
Bases:
WriterAbstract[str,Dataset]Write Xarray dataset.
- Parameters:
params (t.Any | None)
- add_metadata(ds, add_dataset_parameters=True, add_commit=True)#
Set some dataset attributes with information on how it was created.
Wrapper around
get_metadata(). Attributes already present will not be overwritten.- Parameters:
ds (Dataset) – Dataset to add global attributes to. This is not done in-place.
add_dataset_params – Add the parent dataset parameters values to serialization if True (default) and if
parametersis not a string. The parent parameters won’t overwrite the values ofparameters.add_commit (bool) – If True (default), try to find the current commit hash of the directory containing the script called.
add_dataset_parameters (bool)
- Return type:
- send_calls_together(calls, client, chop=None, format=None, **kwargs)#
Send multiple calls together.
If Dask is correctly configured, the writing calls will be executed in parallel.
For all calls within a group, a list of delayed writing calls is constructed. It is then computed all at once using
client.compute(delayed), however to avoid lingering results (because we only care about the side-effect of writing to file, not the computed result), we get rid of ‘futures’ as soon as completed. This avoid a blow up in memory:for future in distributed.as_completed(client.compute(delayed)): log.debug("future completed: %s", future)
- Parameters:
client (Client) – Dask
Clientinstance.chop (int | None) – If None (default), all calls are sent together. If chop is an integer, groups of calls of size
chop(at most) will be sent one after the other, calls within each group being run in parallel.kwargs – Passed to writing function. Overwrites the defaults from
TO_NETCDF_KWARGS, whatever the value of function is.calls (abc.Sequence[CallXr])
format (t.Literal['nc', 'zarr', None])
- send_single_call(call: CallXr, format: Literal[None] = None, compute: Literal[True] = True, **kwargs) None | BaseStore#
- send_single_call(call: CallXr, format: Literal['nc'], compute: Literal[True], **kwargs) None
- send_single_call(call: CallXr, format: Literal['zarr'], compute: Literal[True], **kwargs) BaseStore
- send_single_call(call: CallXr, *, compute: Literal[False], **kwargs) Delayed
Execute a single call.
- Parameters:
kwargs – Passed to the writing function.
call (CallXr)
format (t.Literal['nc', 'zarr', None])
compute (bool)
- Return type:
Delayed | None | BaseStore
- write(data, target=None, client=None, **kwargs)#
Write datasets to multiple targets.
Each dataset is written to its corresponding target (filename or store location). Directories will automatically be created if necessary. Metadata is added to the dataset.
- Parameters:
data (xr.Dataset | abc.Sequence[xr.Dataset]) – Dataset or Sequence of datasets to write.
target (str | abc.Sequence[str] | None) – If None (default), target location(s) are automatically obtained via
DataManagerBase.get_source().client (Client | None) – Dask
distributed.Clientinstance. If present multiple write calls will be send in parallel. Seesend_calls_together()for details. If left to None, the write calls will be sent serially.kwargs – Passed to the function that writes to disk (
xarray.Dataset.to_netcdf()orxarray.Dataset.to_zarr()).
- Return type:
t.Any