logml.model_search.selection
Model training and selection
Functions
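compare_model_with_baseline | Compare the loss of a model with that of a baseline model.
loss_ks_test | Kolmogorov-Smirnov test of whether the sample distribution is less than the baseline distribution.
loss_u_test | Mann-Whitney U test of whether the sample distribution is less than the baseline distribution.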
Classes
ModelSelection | Trains all models and performs selection.
- logml.model_search.selection.loss_u_test(sample: numpy.typing.ArrayLike, baseline: numpy.typing.ArrayLike) -> float
Mann-Whitney U test of whether the sample distribution is less than the baseline distribution.
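A one-sided Mann-Whitney U test of this kind can be sketched with SciPy. This is a minimal illustration, not logml's implementation: the helper name `u_test_less` is hypothetical, and it is an assumption that the returned float is the p-value.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def u_test_less(sample, baseline) -> float:
    """Hypothetical stand-in for loss_u_test (assumed semantics).

    Returns the p-value of a one-sided Mann-Whitney U test whose
    alternative is that `sample` is stochastically less than `baseline`.
    """
    sample = np.asarray(sample, dtype=float).ravel()
    baseline = np.asarray(baseline, dtype=float).ravel()
    # alternative="less": the distribution of `sample` is shifted below `baseline`.
    result = mannwhitneyu(sample, baseline, alternative="less")
    return float(result.pvalue)


# Illustrative (made-up) per-fold losses.
model_losses = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.12, 0.11, 0.14, 0.08]
baseline_losses = [0.30, 0.28, 0.33, 0.31, 0.29, 0.35, 0.27, 0.32, 0.34, 0.26]
p_lower = u_test_less(model_losses, baseline_losses)
```

A small p-value here supports the claim that the sample losses are lower than the baseline losses; swapping the two arguments should give a p-value close to 1.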
- logml.model_search.selection.loss_ks_test(sample: numpy.typing.ArrayLike, baseline: numpy.typing.ArrayLike) -> float
Kolmogorov-Smirnov test of whether the sample distribution is less than the baseline distribution.
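The one-sided two-sample Kolmogorov-Smirnov test can be sketched the same way with SciPy. The helper name `ks_test_less` is hypothetical, and returning the p-value is an assumption; how logml maps "less" onto the test direction is not stated in this page. Note that SciPy's `alternative` refers to the empirical CDFs, so "sample values tend to be smaller" corresponds to `alternative="greater"` (the sample CDF lies above the baseline CDF).

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_test_less(sample, baseline) -> float:
    """Hypothetical stand-in for loss_ks_test (assumed semantics).

    Returns the p-value of a one-sided two-sample KS test whose
    alternative is that `sample` values tend to be less than `baseline`.
    """
    sample = np.asarray(sample, dtype=float).ravel()
    baseline = np.asarray(baseline, dtype=float).ravel()
    # "greater" is stated in terms of CDFs: F_sample(x) > F_baseline(x)
    # for some x, which is what happens when sample values are smaller.
    result = ks_2samp(sample, baseline, alternative="greater")
    return float(result.pvalue)


# Illustrative (made-up) per-fold losses.
ks_model_losses = [0.10, 0.12, 0.11, 0.09, 0.13, 0.15, 0.16, 0.17, 0.14, 0.08]
ks_baseline_losses = [0.30, 0.28, 0.33, 0.31, 0.29, 0.35, 0.27, 0.32, 0.34, 0.26]
ks_p_lower = ks_test_less(ks_model_losses, ks_baseline_losses)
```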
- logml.model_search.selection.compare_model_with_baseline(model: logml.model_search.common.ModelEvaluationData, baseline: logml.model_search.common.ModelEvaluationData, min_test_size_limit=7, pvalue_threshold=0.01) -> dict
Compare the loss of a model with that of a baseline model.
If the number of raw losses is less than min_test_size_limit, the median values are compared; otherwise a one-sided U-test checks whether the model's loss distribution is statistically lower than the baseline's.
- Returns
dict with the fields: test name, select (bool), pvalue.
- Return type
dict
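The decision rule described above can be sketched on plain loss arrays. This is an illustration under stated assumptions, not logml's code: the function name `compare_losses`, the dictionary keys, the `None` p-value on the median path, and applying the size limit to the shorter of the two arrays are all hypothetical choices. The real function takes ModelEvaluationData objects rather than raw arrays.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def compare_losses(model_losses, baseline_losses,
                   min_test_size_limit=7, pvalue_threshold=0.01) -> dict:
    """Hypothetical sketch of the median / U-test selection rule."""
    model_losses = np.asarray(model_losses, dtype=float).ravel()
    baseline_losses = np.asarray(baseline_losses, dtype=float).ravel()

    # Too few losses for a meaningful test: fall back to comparing medians.
    if min(len(model_losses), len(baseline_losses)) < min_test_size_limit:
        select = bool(np.median(model_losses) < np.median(baseline_losses))
        return {"test": "median", "select": select, "pvalue": None}

    # Enough losses: one-sided U-test that model losses are lower.
    pvalue = float(
        mannwhitneyu(model_losses, baseline_losses, alternative="less").pvalue
    )
    return {"test": "u_test", "select": pvalue < pvalue_threshold, "pvalue": pvalue}


# Three losses each: below the default limit of 7, so medians are compared.
few = compare_losses([0.10, 0.20, 0.15], [0.30, 0.40, 0.35])

# Ten losses each: the U-test path is taken.
many = compare_losses(
    [0.10, 0.12, 0.11, 0.09, 0.13, 0.15, 0.16, 0.17, 0.14, 0.08],
    [0.30, 0.28, 0.33, 0.31, 0.29, 0.35, 0.27, 0.32, 0.34, 0.26],
)
```

With the default pvalue_threshold of 0.01, the model is selected on the U-test path only when the evidence that its losses are lower is strong; the median fallback carries no such guarantee.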
- class logml.model_search.selection.ModelSelection(config: Optional[logml.configuration.modeling.ModelSearchSection] = None, objective_config: Optional[logml.configuration.modeling.ModelingTaskSpec] = None, hpo_config: Optional[logml.configuration.modeling.HPOSection] = None, model_provider: Optional[logml.model_search.provider.ModelProvider] = None, logger=None, dump_hpo_data=True, show_progressbar=True, min_test_size_limit=7)
Bases: object
Trains all models and performs selection.
- run(dataset: logml.data.datasets.cv_dataset.ModelingDataset)
Run model selection.
- get_model_config(model_name: str) -> Optional[logml.configuration.modeling.ModelSelectionConfig]
Returns ModelSelectionConfig for the model.
- train_and_evaluate(model_config: logml.configuration.modeling.ModelSelectionConfig, dataset: logml.data.datasets.cv_dataset.ModelingDataset, ds_name=None, dump_result=True) -> logml.model_search.common.ModelEvaluationData
Runs hyperparameter optimization (HPO), then trains and evaluates a model.