logml.models.base

Functions

calculate_metrics(objective, y_true, y_pred)

Returns metrics for given model and data.

cast_params_to_int(params, names)

Given a dictionary of model params, casts the values for 'names' to int.

complete_proba(raw_y_pred_proba, ...)

Adds inverse probabilities for a second class that the model missed while predicting.

Classes

BaseModel([params, logger])

LogML model interface.

CVMetricsResult(metric_results)

Maintains full set of metrics, one per CV fold.

MetricsResult([values, errors, ...])

Result of estimator evaluation.

class logml.models.base.BaseModel(params: Optional[dict] = None, logger=None)

Bases: object

LogML model interface.

DEFAULT_PARAMS = {}
PARAMS_SPACE = {}
TASK = None
TAGS = None
F_MODEL = None
FE_MODEL_ATTRIBUTE = 'coef_'
get_estimator(**kwargs)

Returns a default estimator.

get_models_attributes(attribute_name: str) → Union[List[object], object]

Retrieves a given attribute from available models.

iterate_over_folds(dataset: logml.data.ModelingDataset) → list

Returns a list of CV indexes zipped with their models: [((train idx, test idx), model), …]

evaluate(dataset: logml.data.ModelingDataset, return_loss: bool = False, return_score: bool = False, metric: Union[str, list] = None, predict_args: dict = None, **kwargs) → CVMetricsResult

Returns metrics on CV folds.

evaluate_fold(estimator, x_features: numpy.ndarray, y_true: numpy.ndarray, return_loss: Optional[bool] = None, return_score: Optional[bool] = None, metric: Optional[Union[str, list]] = None, class_labels=None, predict_args: Optional[dict] = None, **kwargs) → logml.models.base.MetricsResult

Evaluates the estimator on a single fold.

predict_fold(estimator, x_features: numpy.ndarray, class_labels=None, **kwargs) → Tuple[numpy.ndarray, numpy.ndarray]

Predicts for a single fold.

predict(dataset: logml.data.ModelingDataset, **kwargs) → Tuple[numpy.ndarray, numpy.ndarray]

Predicts using the final model.

predict_cv(dataset: logml.data.ModelingDataset, **kwargs) → Tuple[List, List]

Scores a given dataset and returns the predictions.

get_raw_feature_importances() → list

Returns feature importance values extracted from the inner models.

get_feature_importance(dataset: logml.data.ModelingDataset) → Optional[dict]

Returns the median feature importance per feature across CV folds.
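To illustrate the per-feature median across folds, here is a minimal sketch. It is not LogML's implementation: the function name `median_importance_sketch`, the explicit `feature_names` argument, and the assumption that the raw importances arrive as one equally ordered vector per fold are all hypothetical.

```python
from statistics import median
from typing import Dict, List


def median_importance_sketch(
    fold_importances: List[List[float]], feature_names: List[str]
) -> Dict[str, float]:
    """Hypothetical helper: median importance per feature across CV folds."""
    # zip(*rows) transposes fold-major data into one tuple of values per feature.
    per_feature = zip(*fold_importances)
    return {name: median(values) for name, values in zip(feature_names, per_feature)}


# Three folds, three features (made-up numbers for illustration only).
folds = [[0.5, 0.1, 0.4], [0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
fi = median_importance_sketch(folds, ["age", "dose", "weight"])
```

The median is less sensitive than the mean to a single fold that produces an outlying importance value.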

property final_model: Any
fit(dataset: logml.data.ModelingDataset, fit_params: Optional[Dict] = None, train_final_model=False)

Fits a model for each CV fold.

is_fit()
logml.models.base.cast_params_to_int(params: dict, names: List[str])

Given a dictionary of model params, casts the values for ‘names’ to int. This is needed because HPO may sample float values for parameters that a model expects to be strictly integer.
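The casting step can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the library's code: the name `cast_params_to_int_sketch` is hypothetical, and whether the real function rounds, skips missing names, or modifies `params` in place is not stated in this reference.

```python
from typing import List


def cast_params_to_int_sketch(params: dict, names: List[str]) -> dict:
    """Hypothetical helper: return a copy of params with `names` cast to int."""
    result = dict(params)
    for name in names:
        # Assumption: names that are absent or set to None are left untouched.
        if result.get(name) is not None:
            # Assumption: round first so that 6.999999 becomes 7 rather than 6.
            result[name] = int(round(result[name]))
    return result


# Values as an HPO sampler might produce them (made-up numbers).
sampled = {"n_estimators": 120.0, "max_depth": 7.0, "learning_rate": 0.05}
casted = cast_params_to_int_sketch(sampled, ["n_estimators", "max_depth"])
```

Only the listed names are converted, so genuinely continuous parameters such as `learning_rate` keep their float values.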

class logml.models.base.MetricsResult(values: typing.Dict[str, float] = <factory>, errors: typing.Dict[str, float] = <factory>, not_applicable: typing.Dict[str, float] = <factory>, loss_or_score: typing.Optional[str] = None)

Bases: object

Result of estimator evaluation.

values: Dict[str, float]
errors: Dict[str, float]
not_applicable: Dict[str, float]
loss_or_score: Optional[str] = None
get_value(metric_name: str) → float
class logml.models.base.CVMetricsResult(metric_results: List[logml.models.base.MetricsResult])

Bases: object

Maintains full set of metrics, one per CV fold.

get_mean_value(metric_name: str) → float
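As a rough illustration of how per-fold results could be aggregated, the sketch below uses stand-in classes that mirror the documented fields. The `*Sketch` class names and the method bodies are assumptions; the reference only documents the signatures.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class MetricsResultSketch:
    """Stand-in mirroring the documented MetricsResult fields."""
    values: Dict[str, float] = field(default_factory=dict)
    errors: Dict[str, float] = field(default_factory=dict)
    not_applicable: Dict[str, float] = field(default_factory=dict)
    loss_or_score: Optional[str] = None

    def get_value(self, metric_name: str) -> float:
        # Assumption: the value is looked up in `values` by metric name.
        return self.values[metric_name]


class CVMetricsResultSketch:
    """Stand-in holding one MetricsResultSketch per CV fold."""

    def __init__(self, metric_results: List[MetricsResultSketch]):
        self.metric_results = metric_results

    def get_mean_value(self, metric_name: str) -> float:
        # Assumption: a plain arithmetic mean over all folds.
        return mean(r.get_value(metric_name) for r in self.metric_results)


cv = CVMetricsResultSketch([
    MetricsResultSketch(values={"rmse": 1.0}),
    MetricsResultSketch(values={"rmse": 2.0}),
    MetricsResultSketch(values={"rmse": 3.0}),
])
mean_rmse = cv.get_mean_value("rmse")
```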
logml.models.base.calculate_metrics(objective: logml.common.ModelingTask, y_true: numpy.ndarray, y_pred: numpy.ndarray, y_pred_proba: Optional[numpy.ndarray] = None, return_loss: Optional[bool] = None, return_score: Optional[bool] = None, metric: Optional[Union[str, list]] = None, **kwargs) → logml.models.base.MetricsResult

Returns metrics for given model and data.

logml.models.base.complete_proba(raw_y_pred_proba: numpy.ndarray, pred_class_labels: numpy.ndarray, class_labels: numpy.ndarray) → numpy.ndarray

Adds inverse probabilities for a second class that the model missed while predicting. Pass model.classes_ as pred_class_labels.

Raises RuntimeError when impossible to compensate.
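The following sketch shows one way such a completion might work when a fold's training data contained only one of two classes. It is an assumption-laden illustration, not the library's code: the name `complete_proba_sketch`, the reading of "inverse" as `1 - p`, and the exact conditions that trigger `RuntimeError` are all hypothetical.

```python
import numpy as np


def complete_proba_sketch(raw_y_pred_proba, pred_class_labels, class_labels):
    """Hypothetical helper: return one probability column per class label."""
    raw = np.asarray(raw_y_pred_proba, dtype=float)
    if raw.ndim == 1:
        raw = raw.reshape(-1, 1)
    pred_labels = list(pred_class_labels)
    all_labels = list(class_labels)
    missing = [label for label in all_labels if label not in pred_labels]
    if not missing and len(pred_labels) == len(all_labels):
        # Nothing is missing: only align the column order with class_labels.
        return raw[:, [pred_labels.index(label) for label in all_labels]]
    # Assumption: only a binary problem with exactly one unseen class
    # can be compensated; everything else is reported as an error.
    if len(all_labels) != 2 or len(missing) != 1 or raw.shape[1] != 1:
        raise RuntimeError("Cannot complete probabilities for missing classes")
    full = np.empty((raw.shape[0], 2))
    seen_col = all_labels.index(pred_labels[0])
    full[:, seen_col] = raw[:, 0]
    # Assumption: the unseen class receives the complement 1 - p.
    full[:, 1 - seen_col] = 1.0 - raw[:, 0]
    return full


# A model that saw only class 0 returns a single column of ones.
proba = complete_proba_sketch(np.array([[1.0], [1.0]]), np.array([0]), np.array([0, 1]))
```

With a scikit-learn style estimator, the second argument would typically be `model.classes_`, as the description above suggests.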