Dataset#

class Dataset(**kwargs)#

Bases: Generic[T_Params, T_Source, T_Data], Section

DataManager base object.

Registers modules for parameters management, source management, data loading, and data writing.

The parameters (stored in params) are treated as global across the instance, and those are the values that will be used when calling the various methods. A few methods allow completing them, and fewer still allow overwriting them temporarily. Parameters should be changed using set_params(), which may reset the cache that some plugins use. save_excursion() can be used to change parameters temporarily inside a with block.
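The idea of instance-wide parameters can be sketched with a toy class. ToyManager and its filename pattern are hypothetical stand-ins and not part of this library; only the names params, set_params() and get_source() are taken from this page.

```python
class ToyManager:
    """Hypothetical stand-in illustrating instance-wide parameters."""

    def __init__(self, **params):
        # Parameters are stored once and shared by every method.
        self.params = dict(params)

    def set_params(self, **kwargs):
        # Update some parameters and keep the others.
        self.params.update(kwargs)

    def get_source(self):
        # Methods read the current parameters instead of taking them as arguments.
        return "data_{Y}_{m:02d}.nc".format(**self.params)


dm = ToyManager(Y=2020, m=1)
first = dm.get_source()
dm.set_params(m=2)
second = dm.get_source()
```

Here the second call to get_source() reflects the updated value of m, without the caller passing any parameter explicitly.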

Parameters:

params (t.Any | None)

ID: str | None = None#

Long name that uniquely identifies this data-manager class.

Loader#

alias of LoaderAbstract

Params#

alias of ParamsManagerAbstract

SHORTNAME: str | None = None#

Short name to refer to this data-manager class.

Source#

alias of SourceAbstract

Writer#

alias of WriterAbstract

get_data(*args, **kwargs)#

Return data object.

Wraps around loader.get_data().

Return type:

T_Data
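The wrapping can be pictured as plain delegation to a module instance. This is a minimal sketch under assumptions: ToyLoader, ToyManager, the scale argument, and the way the loader is instantiated are all hypothetical, and the real loader derives from LoaderAbstract.

```python
class ToyLoader:
    """Hypothetical loader module holding a reference to its manager."""

    def __init__(self, manager):
        self.manager = manager

    def get_data(self, scale=1):
        # Pretend to load something that depends on the manager parameters.
        return [self.manager.params["Y"] * scale]


class ToyManager:
    Loader = ToyLoader  # class attribute naming the module class to use

    def __init__(self, **params):
        self.params = dict(params)
        # Assumption: one module instance is created per manager.
        self.loader = self.Loader(self)

    def get_data(self, *args, **kwargs):
        # Thin wrapper: forward every argument to the loader module.
        return self.loader.get_data(*args, **kwargs)


data = ToyManager(Y=2020).get_data(scale=2)
```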

get_data_sets(params_maps=None, params_sets=None, **kwargs)#

Return data for specific sets of parameters.

Each set of parameters specifies one filename. Parameters that do not change from one set to the next do not need to be specified if they are fixed (by setting them in the DataManager). The sets can be specified with either params_maps or params_sets.

Parameters:
  • params_maps (Sequence[Mapping[str, Any]] | None) –

    Each set is specified by a mapping of parameter names to values:

    [{'Y': 2020, 'm': 1, 'd': 15},
     {'Y': 2021, 'm': 2, 'd': 24},
     {'Y': 2022, 'm': 6, 'd': 2}]
    

    This gives 3 filenames for 3 different dates. Note that the parameters do not need to be the same for all sets: for example, a fourth set could be {'Y': 2023, 'm': 1, 'd': 10, 'depth': 50} to override the value of ‘depth’ set in the DataManager parameters.

  • params_sets (Sequence[Sequence] | None) –

    Here each set is specified by a sequence of parameter values. The first row gives the order of the parameters. The same input as before can be written as:

    [['Y', 'm', 'd'],
     [2020, 1, 15],
     [2021, 2, 24],
     [2022, 6, 2]]
    

    Here the changing parameters must remain the same for the whole sequence.

  • kwargs – Arguments passed to get_data().

Returns:

data – List of data objects corresponding to each set of parameters. Subclasses can override this method to specify how to combine them into one if needed.

Return type:

T_Data | list[T_Data]
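The relation between the two input layouts can be sketched as a small conversion. The helper sets_to_maps and the fixed value of depth are hypothetical, for illustration only; how the library converts its inputs internally is not specified here.

```python
from collections.abc import Sequence
from typing import Any


def sets_to_maps(params_sets: Sequence[Sequence[Any]]) -> list[dict[str, Any]]:
    """Convert the params_sets layout into the params_maps layout."""
    names, *rows = params_sets  # the first row gives the parameter names
    maps = []
    for row in rows:
        if len(row) != len(names):
            raise ValueError(f"expected {len(names)} values, got {len(row)}")
        maps.append(dict(zip(names, row)))
    return maps


params_sets = [
    ["Y", "m", "d"],
    [2020, 1, 15],
    [2021, 2, 24],
    [2022, 6, 2],
]
params_maps = sets_to_maps(params_sets)

# Fixed parameters (here a hypothetical 'depth') are completed by each set;
# a set that defines 'depth' itself would take precedence.
fixed = {"depth": 10}
merged = [{**fixed, **m} for m in params_maps]
```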

get_source(*args, **kwargs)#

Return source for the data.

Can be filenames, URL, store object, etc.

Wraps around source.get_source().

Return type:

T_Source

property params: T_Params#

Parameter values for this instance.

reset(callbacks=True, **kwargs)#

Call all registered callbacks when parameters are reset/changed.

Plugins should register callbacks in the dictionary _RESET_CALLBACKS during _init_plugin(). Callbacks should be functions that take the data manager as first argument, then any number of keyword arguments.

Parameters:

callbacks (bool | list[str]) – If True all callbacks are run (default), if False none are run. Can also be a list of specific callback names to run (keys in the dictionary _RESET_CALLBACKS).
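The three accepted forms of callbacks can be illustrated with a toy manager. This is a sketch of the selection logic only: ToyManager, the log attribute, and the callback names void_cache and close_files are hypothetical, and the dictionary is filled by hand here rather than in _init_plugin().

```python
class ToyManager:
    """Hypothetical stand-in; only the callback selection is of interest."""

    def __init__(self):
        # Maps a callback name to a function(manager, **kwargs).
        self._RESET_CALLBACKS = {}
        self.log = []

    def reset(self, callbacks=True, **kwargs):
        if callbacks is False:
            return  # run nothing
        if callbacks is True:
            names = list(self._RESET_CALLBACKS)  # run everything
        else:
            names = list(callbacks)  # run only the named callbacks
        for name in names:
            self._RESET_CALLBACKS[name](self, **kwargs)


def void_cache(dm, **kwargs):
    dm.log.append("void_cache")


def close_files(dm, **kwargs):
    dm.log.append("close_files")


dm = ToyManager()
dm._RESET_CALLBACKS["void_cache"] = void_cache
dm._RESET_CALLBACKS["close_files"] = close_files

dm.reset()                           # both callbacks run
dm.reset(callbacks=False)            # nothing runs
dm.reset(callbacks=["close_files"])  # only the named callback runs
```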

reset_params(params=None, reset=True, **kwargs)#

Set parameter values.

Old parameter values are discarded.

Parameters:
  • reset (bool | list[str]) – Passed to reset().

  • kwargs – Other parameter values in the form name=value. Each parameter is taken from the first of these that provides it: kwargs, params, PARAMS_DEFAULTS.

  • params (Any | None)
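The lookup order (kwargs, then params, then PARAMS_DEFAULTS) behaves like a chained mapping. A minimal sketch, assuming plain dictionaries: resolve_params and the default values below are hypothetical, and the library's own parameter container may differ.

```python
from collections import ChainMap

# Hypothetical defaults, standing in for PARAMS_DEFAULTS.
PARAMS_DEFAULTS = {"Y": 2000, "m": 1, "depth": 0}


def resolve_params(params=None, **kwargs):
    """Return a fresh parameter dict; earlier mappings win over later ones."""
    return dict(ChainMap(kwargs, params or {}, PARAMS_DEFAULTS))


# 'm' comes from kwargs, 'Y' from params, 'depth' from the defaults.
resolved = resolve_params({"Y": 2020, "m": 6}, m=12)
```

Because a new dictionary is built on every call, nothing from a previous set of parameters survives, which matches the statement that old values are discarded.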

save_excursion(save_cache=False)#

Save and restore current parameters after a with block.

For instance:

# we have some parameters, self.params["p"] = 0
with self.save_excursion():
    # we change them
    self.set_params(p=2)
    self.get_data()

# we are back to self.params["p"] = 0

Any exception happening in the with block will be raised.

Parameters:

save_cache (bool) – If true, save and restore the cache. The context resets the parameters of the data manager using set_params() and then restores any saved key in the cache, without overwriting. This may lead to unexpected behavior and is disabled by default.

Returns:

context – Context object containing the original parameters.

Return type:

_ParamsContext
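The save-and-restore pattern can be sketched with contextlib. This is not the library's _ParamsContext: ToyManager is a hypothetical stand-in, cache handling is left out, and restoring the parameters when the block raises is an assumption of this sketch.

```python
from contextlib import contextmanager


class ToyManager:
    """Hypothetical stand-in showing the save/restore pattern only."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **kwargs):
        self.params.update(kwargs)

    @contextmanager
    def save_excursion(self):
        saved = dict(self.params)  # copy, so later updates do not leak in
        try:
            yield saved
        finally:
            # Assumption: parameters are restored even if the block raises;
            # the exception itself still propagates to the caller.
            self.params = saved


dm = ToyManager(p=0)
with dm.save_excursion():
    dm.set_params(p=2)
    inside = dm.params["p"]
after = dm.params["p"]

raised = False
try:
    with dm.save_excursion():
        dm.set_params(p=5)
        raise RuntimeError("boom")
except RuntimeError:
    raised = True
after_error = dm.params["p"]
```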

set_params(params=None, reset=True, **kwargs)#

Update one or more parameter values.

Other parameters are kept.

Parameters:
  • reset (bool | list[str]) – Passed to reset().

  • kwargs – Other parameter values in the form name=value.

  • params (Any | None)

write(*args, **kwargs)#

Write data to target.

Wraps around writer.write().

Return type:

Any