logml.data.manager
Classes
DatasetsManager: Utility for generating and handling datasets.
- class logml.data.manager.DatasetsManager(cfg: logml.GlobalConfig, global_params: dict, logger=None, sequential_naming=False, random_state=None, debug=False)
Bases: object
Utility for generating and handling datasets.
- validate_ds_type(ds_type: str) → str
- generate_dataset(shuffle: bool = False, sequence_no=0, ds_type: Optional[str] = None) → str
Reads a dataframe and generates a dataset on top of it.
- list_datasets() → List[str]
Returns all available datasets of a given type.
- get_dataset(dataset_hash: str) → logml.data.datasets.base.BaseDataset
Returns a requested dataset.
- dump_dataset(dataset: logml.data.datasets.base.BaseDataset, sequence_no=0, transform_log=None, corr_groups: Optional[pandas.core.frame.DataFrame] = None) → None
Dumps a dataset and its accompanying artifacts.
- generate_datasets(ds_ids: Optional[List[int]] = None) → List[str]
Generates a number of datasets of a given type (based on the config).
Datasets are pickled into the DatasetsOutputStructure.datasets_path folder, using the dataset hash as the filename.
- Returns: List of generated dataset hashes.
- dump_debug_data(name, steps)
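
The generate_datasets description above says that datasets are pickled into a folder with the hash as the filename, and that list_datasets and get_dataset work with those hashes. The sketch below illustrates that storage pattern using only the Python standard library. It is not the logml implementation: the helper names (dump_by_hash, list_hashes, load_by_hash), the ".pkl" extension, and the choice of a truncated SHA-256 over the pickled bytes are all assumptions made for illustration, and a plain dict stands in for a BaseDataset.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Any, List


def dump_by_hash(obj: Any, folder: Path) -> str:
    """Pickle obj into folder and return the hash used as its filename.

    Hypothetical helper: the hash here is a truncated SHA-256 of the
    pickled bytes, which may differ from how logml derives its hashes.
    """
    payload = pickle.dumps(obj)
    digest = hashlib.sha256(payload).hexdigest()[:16]
    folder.mkdir(parents=True, exist_ok=True)
    (folder / f"{digest}.pkl").write_bytes(payload)
    return digest


def list_hashes(folder: Path) -> List[str]:
    """Return the hashes of all pickled objects found in folder."""
    return sorted(path.stem for path in folder.glob("*.pkl"))


def load_by_hash(dataset_hash: str, folder: Path) -> Any:
    """Load the object that was stored under dataset_hash."""
    return pickle.loads((folder / f"{dataset_hash}.pkl").read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for DatasetsOutputStructure.datasets_path (assumed layout).
    datasets_path = Path(tmp) / "datasets"

    # Store three toy "datasets" and keep the returned hashes.
    hashes = [dump_by_hash({"rows": [i, i + 1]}, datasets_path) for i in range(3)]

    # The folder listing and the returned hashes describe the same files.
    listed = list_hashes(datasets_path)

    # A hash is enough to get the original object back.
    restored = load_by_hash(hashes[0], datasets_path)
```

Naming files by hash means callers only need to pass a short string around to identify a dataset, which matches the List[str] and dataset_hash: str types in the signatures above.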