logml.configuration.cross_validation
Functions
- make_cv_params: Prepare the CV class type and its params.
Classes
- CVSplitType: Type of CV splits: k-fold or shuffle
- CrossValidationSection: Configure CV application for the dataset.
- class logml.configuration.cross_validation.CVSplitType(value)
Bases: logml.common.StrEnum
Type of CV splits: k-fold or shuffle
- KFOLD = 'kfold'
- SHUFFLE = 'shuffle'
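As a rough illustration of how a string-valued enum like this is typically used, the sketch below defines a stand-in with the same two members. It assumes that logml.common.StrEnum behaves like a standard `str`-mixed `Enum`; the stand-in class is not the logml implementation.

```python
from enum import Enum


class CVSplitType(str, Enum):
    """Stand-in for logml's CVSplitType (assumption: StrEnum is a str-mixed Enum)."""

    KFOLD = "kfold"
    SHUFFLE = "shuffle"


# Look a member up by its string value, as a config loader typically would.
split = CVSplitType("shuffle")
is_shuffle = split is CVSplitType.SHUFFLE

# With a str mixin, members also compare equal to plain strings.
equals_plain_string = CVSplitType.KFOLD == "kfold"

# Values outside the enum are rejected with ValueError.
try:
    CVSplitType("leave_one_out")
    rejected = False
except ValueError:
    rejected = True
```

Under that assumption, a configuration file can spell the split type as a plain string ('kfold' or 'shuffle') and still get validation against the allowed values.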
- class logml.configuration.cross_validation.CrossValidationSection
Bases: pydantic.main.BaseModel
Configure CV application for the dataset.
JSON schema:
{
  "title": "CrossValidationSection",
  "description": "Configure CV application for the dataset.",
  "type": "object",
  "properties": {
    "random_state": {
      "title": "Random State",
      "description": "State to initialize random numbers generation.",
      "type": "integer"
    },
    "split_type": {
      "description": "Configures coverage of splits. 'kfold' covers dataset completely, 'shuffle' - does not guarantee it due to sampling.",
      "default": "kfold",
      "allOf": [
        { "$ref": "#/definitions/CVSplitType" }
      ]
    },
    "n_folds": {
      "title": "N Folds",
      "description": "How many CV folds should be produced.",
      "default": 20,
      "type": "integer"
    },
    "test_size": {
      "title": "Test Size",
      "description": "Which portion of the dataset to leave for evaluation of the fold.",
      "default": 0.2,
      "type": "number"
    },
    "type": {
      "title": "Type",
      "description": "To be set automatically. Cross Validation strategy alias to use (\"kfold\", \"stratifiedkfold\", etc.). Reference: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.model_selection",
      "default": "",
      "type": "string"
    },
    "params": {
      "title": "Params",
      "description": "To be set automatically.Parameters that will be passed to corresponding Scikit-learn classes. Please refer to the official Scikit-learn documentation for details.",
      "default": {},
      "type": "object"
    }
  },
  "definitions": {
    "CVSplitType": {
      "title": "CVSplitType",
      "description": "Type of CV splits: k-fold or shuffle",
      "enum": [ "kfold", "shuffle" ],
      "type": "string"
    }
  }
}
- Fields
- field random_state: Optional[int] = None
Seed used to initialize random number generation.
- field split_type: logml.configuration.cross_validation.CVSplitType = CVSplitType.KFOLD
Configures coverage of splits. ‘kfold’ covers the dataset completely; ‘shuffle’ does not guarantee full coverage because its splits are sampled.
- field n_folds: int = 20
How many CV folds should be produced.
- field test_size: float = 0.2
Fraction of the dataset to hold out for evaluation in each fold.
- field type: str = ''
To be set automatically. Cross Validation strategy alias to use (“kfold”, “stratifiedkfold”, etc.). Reference: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.model_selection
- field params: dict = {}
To be set automatically. Parameters that will be passed to the corresponding Scikit-learn classes. Please refer to the official Scikit-learn documentation for details.
- get_cv_params(objective: Optional[logml.common.ModelingTask] = None, generator_type: Optional[str] = None) -> Tuple[str, dict]
Returns CV class type and parameters.
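The coverage difference described for split_type can be illustrated with a small standard-library sketch. It is not how logml or Scikit-learn build their splits: the helper names are hypothetical, the k-fold helper uses strided slicing rather than contiguous blocks, and the shuffle helper simply draws an independent random test set per fold.

```python
import random


def kfold_test_sets(n_samples, n_folds):
    """Partition indices 0..n_samples-1 into n_folds disjoint test sets."""
    indices = list(range(n_samples))
    # Strided slicing puts every index into exactly one fold.
    return [set(indices[i::n_folds]) for i in range(n_folds)]


def shuffle_test_sets(n_samples, n_folds, test_size, seed=None):
    """Draw n_folds independent random test sets of a fixed fraction."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_size))
    return [set(rng.sample(range(n_samples), n_test)) for _ in range(n_folds)]


def coverage(test_sets, n_samples):
    """Fraction of samples that appear in at least one test set."""
    return len(set().union(*test_sets)) / n_samples


# k-fold test sets tile the whole dataset, so coverage is always 1.0.
kfold_cov = coverage(kfold_test_sets(100, 5), 100)

# Independently sampled test sets can overlap, so some samples may never be tested.
shuffle_cov = coverage(shuffle_test_sets(100, 5, 0.2, seed=0), 100)
```

With k-fold every sample is evaluated exactly once, while with shuffle-style sampling the coverage depends on n_folds, test_size, and the random seed.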
- logml.configuration.cross_validation.make_cv_params(stratify: bool = False, cv_type: logml.configuration.cross_validation.CVSplitType = CVSplitType.KFOLD, n_folds=100, test_size=0.25, random_state=None, generator_type: Optional[str] = None) -> Tuple[str, Dict]
Prepare the CV class type and its params.
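To make the shape of the returned (type, params) pair concrete, here is a hypothetical stand-in, not the logml implementation. It assumes the alias is the lowercase name of a Scikit-learn splitter (the "kfold" and "stratifiedkfold" aliases appear in the field description above; "shufflesplit" and "stratifiedshufflesplit" are guesses) and that the params are that splitter's constructor arguments. The generator_type argument is left out because its meaning is not documented here.

```python
from typing import Any, Dict, Optional, Tuple


def sketch_cv_params(
    stratify: bool = False,
    cv_type: str = "kfold",
    n_folds: int = 100,
    test_size: float = 0.25,
    random_state: Optional[int] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical stand-in; NOT logml's make_cv_params."""
    prefix = "stratified" if stratify else ""
    params: Dict[str, Any]
    if cv_type == "kfold":
        # Scikit-learn's KFold rejects a random_state unless shuffle=True.
        params = {
            "n_splits": n_folds,
            "shuffle": random_state is not None,
            "random_state": random_state,
        }
        return prefix + "kfold", params
    if cv_type == "shuffle":
        # ShuffleSplit-style splitters take the held-out fraction directly.
        params = {
            "n_splits": n_folds,
            "test_size": test_size,
            "random_state": random_state,
        }
        return prefix + "shufflesplit", params  # alias name is an assumption
    raise ValueError(f"Unknown cv_type: {cv_type!r}")


alias, params = sketch_cv_params(stratify=True, cv_type="kfold", n_folds=5, random_state=42)
```

Note that test_size only matters for the shuffle variant in this sketch, since k-fold derives the held-out fraction from the number of folds.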