logml.analysis.items.modeling
ML Analysis and related routines
Functions
get_stratum | Returns a corresponding Strata for a given stratum id.
stratum_high_resources_estimator | Default estimate of high resource requirements, based on the number of rows and columns in the dataset.
Classes
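AggregateFeatureImportanceStep | FI Aggregation Step.
EdaStep | EDA analysis
ExtractFeatureImportanceStep | FI extraction step.
GenerateReportStep | Generate Report Step.
ModelingAnalysisResult | Modeling analysis result.
ModelingItemBase | Wraps analysis algo into data loading/saving and config.
ModelingTransformer | Wraps analysis algo into data loading/saving and config.
PackageReportStep | Package report step
SelectModelsStep | Models selection step.
SurvivalAnalysisResult | Survival step result
SurvivalAnalysisStep | Survival analysis step.
TrainModelStep | Train Model Step.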
- logml.analysis.items.modeling.get_stratum(global_cfg, stratum_id: str) Optional[logml.configuration.stratification.Strata]
Returns a corresponding Strata for a given stratum id.
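The signature above returns an Optional value, so callers presumably need to handle the case where no stratum matches. The sketch below illustrates that lookup contract only. `Strata`, `GlobalCfg`, the `strata` attribute, and `get_stratum_sketch` are hypothetical stand-ins, not logml's real types or implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Strata:
    """Hypothetical stand-in for logml.configuration.stratification.Strata."""
    stratum_id: str


@dataclass
class GlobalCfg:
    """Hypothetical stand-in for the global config; the real layout may differ."""
    strata: List[Strata] = field(default_factory=list)


def get_stratum_sketch(global_cfg: GlobalCfg, stratum_id: str) -> Optional[Strata]:
    """Return the stratum whose id matches, or None when there is no match."""
    for stratum in global_cfg.strata:
        if stratum.stratum_id == stratum_id:
            return stratum
    return None


cfg = GlobalCfg(strata=[Strata("all"), Strata("treated")])
found = get_stratum_sketch(cfg, "treated")    # a Strata instance
missing = get_stratum_sketch(cfg, "unknown")  # None: the id is not configured
```

Checking the result for None before use keeps a misspelled stratum id from surfacing later as an attribute error.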
- class logml.analysis.items.modeling.ModelingConfigBase
Bases:
pydantic.main.BaseModel
Config with common modeling params
Show JSON schema
{ "title": "ModelingConfigBase", "description": "Config with common modeling params", "type": "object", "properties": { "stratum_id": { "title": "Stratum Id", "type": "string" }, "problem_id": { "title": "Problem Id", "type": "string" } }, "required": [ "stratum_id" ] }
- field stratum_id: str [Required]
- field problem_id: Optional[str] [Required]
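To make the JSON schema above concrete, here is a minimal standard-library sketch that checks a parameter dict against it. The real class relies on pydantic for validation; `SCHEMA` is copied from the schema shown above, while `check_params` and its messages are hypothetical helpers written for illustration.

```python
from typing import Any, Dict, List

# Fragment copied from the ModelingConfigBase JSON schema shown above.
SCHEMA: Dict[str, Any] = {
    "properties": {
        "stratum_id": {"type": "string"},
        "problem_id": {"type": "string"},
    },
    "required": ["stratum_id"],
}


def check_params(params: Dict[str, Any], schema: Dict[str, Any] = SCHEMA) -> List[str]:
    """Return a list of problems; an empty list means the params look valid."""
    problems = []
    # Every name listed under "required" must be present.
    for name in schema["required"]:
        if name not in params:
            problems.append(f"missing required field: {name}")
    # Every supplied value must match the declared type.
    for name, value in params.items():
        spec = schema["properties"].get(name)
        if spec is None:
            # Flagging unknown fields is a choice of this sketch only.
            problems.append(f"unknown field: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"field {name} must be a string")
    return problems


ok = check_params({"stratum_id": "all"})
bad = check_params({"problem_id": 3})
```

Here `ok` is an empty list, while `bad` reports both the missing `stratum_id` and the non-string `problem_id`.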
- class logml.analysis.items.modeling.ModelingItemBase(*args, **kwargs)
Bases:
logml.analysis.base_item.AnalysisItem
Wraps analysis algo into data loading/saving and config.
- make_params() dict
Make global params for modeling.
- abstract run()
See parent description.
- abstract get_result()
See parent description.
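The step classes that follow share one shape: a `LABEL`, a `run()` method, and a `get_result()` method that usually returns nothing because results are saved directly. The sketch below shows that pattern with Python's `abc` module. `ItemBaseSketch` and `EchoStep` are hypothetical stand-ins, and the dict returned by `make_params` is an assumed shape, not logml's actual output.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class ItemBaseSketch(ABC):
    """Hypothetical stand-in for a modeling item base class."""

    LABEL: str = ""

    def __init__(self, params: Dict[str, Any]):
        self.params = params

    def make_params(self) -> dict:
        """Collect the parameters shared by all steps (assumed shape)."""
        return {
            "stratum_id": self.params["stratum_id"],
            "problem_id": self.params.get("problem_id"),
        }

    @abstractmethod
    def run(self) -> None:
        """Execute the step."""

    @abstractmethod
    def get_result(self) -> Optional[Any]:
        """Return the step result, if any."""


class EchoStep(ItemBaseSketch):
    """Toy concrete step: records what it was asked to run."""

    LABEL = "echo"

    def __init__(self, params: Dict[str, Any]):
        super().__init__(params)
        self.log = []

    def run(self) -> None:
        self.log.append((self.LABEL, self.make_params()))

    def get_result(self) -> None:
        # Mirrors the documented steps, which save results directly.
        return None


step = EchoStep({"stratum_id": "all"})
step.run()
```

Because `run()` and `get_result()` are abstract, the base class cannot be instantiated on its own; only subclasses that implement both can.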
- class logml.analysis.items.modeling.EdaStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
EDA analysis
- LABEL = 'run_eda'
- PARAMS_CLS
- run()
Run modeling step
- get_result()
Modeling saves results directly, nothing to return here.
- classmethod estimate_resources(res: logml.analysis.common.JobResourcesReqs, cfg: GlobalConfig = None, df: Optional[pandas.core.frame.DataFrame] = None, strata_shapes: Optional[Dict[str, tuple]] = None, item_params: Any = None) None
See parent description
- get_paths_to_release() Optional[List[logml.analysis.base_item.ReleasePath]]
See parent description.
- class logml.analysis.items.modeling.ModelingTransformerStepConfig
Bases:
logml.analysis.items.modeling.ModelingConfigBase
Config for modeling dataset generation.
Show JSON schema
{ "title": "ModelingTransformerStepConfig", "description": "Config for modeling dataset generation.", "type": "object", "properties": { "stratum_id": { "title": "Stratum Id", "type": "string" }, "problem_id": { "title": "Problem Id", "type": "string" }, "n_dataset": { "title": "N Dataset", "type": "integer" } }, "required": [ "stratum_id", "n_dataset" ] }
- Fields
- field n_dataset: int [Required]
- class logml.analysis.items.modeling.ModelingTransformer(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
Wraps analysis algo into data loading/saving and config.
- LABEL = 'modeling_data_transform'
- PARAMS_CLS
alias of
logml.analysis.items.modeling.ModelingTransformerStepConfig
- run()
Run modeling step
- get_result()
Modeling saves results directly, nothing to return here.
- classmethod estimate_resources(res: logml.analysis.common.JobResourcesReqs, cfg: GlobalConfig = None, df: Optional[pandas.core.frame.DataFrame] = None, strata_shapes: Optional[Dict[str, tuple]] = None, item_params: Any = None) None
See parent description
- class logml.analysis.items.modeling.TrainModelStepConfig
Bases:
logml.analysis.items.modeling.ModelingConfigBase
Config for model training.
Show JSON schema
{ "title": "TrainModelStepConfig", "description": "Config for model training.", "type": "object", "properties": { "stratum_id": { "title": "Stratum Id", "type": "string" }, "problem_id": { "title": "Problem Id", "type": "string" }, "model_name": { "title": "Model Name", "type": "string" } }, "required": [ "stratum_id", "model_name" ] }
- Fields
- field model_name: str [Required]
- class logml.analysis.items.modeling.TrainModelStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
Train Model Step.
- LABEL = 'train_model'
- PARAMS_CLS
- run()
Run modeling step
- get_result()
Modeling saves results directly, nothing to return here.
- classmethod estimate_resources(res: logml.analysis.common.JobResourcesReqs, cfg: GlobalConfig = None, df: Optional[pandas.core.frame.DataFrame] = None, strata_shapes: Optional[Dict[str, tuple]] = None, item_params: Any = None) None
See parent description
- class logml.analysis.items.modeling.SelectModelsStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
Models selection step.
- LABEL = 'select_models'
- PARAMS_CLS
- run()
Run modeling step
- get_result()
Modeling saves results directly, nothing to return here.
- class logml.analysis.items.modeling.ExtractFeatureImportanceStepConfig
Bases:
logml.analysis.items.modeling.ModelingConfigBase
Parameters for FI extraction.
Show JSON schema
{ "title": "ExtractFeatureImportanceStepConfig", "description": "Parameters for FI extraction.", "type": "object", "properties": { "stratum_id": { "title": "Stratum Id", "type": "string" }, "problem_id": { "title": "Problem Id", "type": "string" }, "dataset_n": { "title": "Dataset N", "type": "integer" }, "model_name": { "title": "Model Name", "type": "string" } }, "required": [ "stratum_id", "dataset_n", "model_name" ] }
- Fields
- field dataset_n: int [Required]
- field model_name: str [Required]
- class logml.analysis.items.modeling.ExtractFeatureImportanceStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
FI extraction step.
- LABEL = 'extract_fi'
- PARAMS_CLS
alias of
logml.analysis.items.modeling.ExtractFeatureImportanceStepConfig
- run()
Run feature importance extraction step
- get_result()
Modeling saves results directly, nothing to return here.
- classmethod estimate_resources(res: logml.analysis.common.JobResourcesReqs, cfg: GlobalConfig = None, df: Optional[pandas.core.frame.DataFrame] = None, strata_shapes: Optional[Dict[str, tuple]] = None, item_params: Any = None) None
See parent description.
- class logml.analysis.items.modeling.ModelingAnalysisResult(selected_models: Optional[List[logml.model_search.common.ModelEvaluationData]] = None, fi_models: Optional[List[str]] = None, objective: Optional[logml.configuration.modeling.ModelingTaskSpec] = None, stratum_id: Optional[str] = None, problem_id: Optional[str] = None, num_features: int = -1, top_features: Optional[List[str]] = None, analysis_name: Optional[str] = None, unique_name: Optional[str] = None, status: Optional[str] = None, fi_relative_loss: Optional[Dict[str, float]] = None)
Bases:
logml.analysis.base_item.AnalysisResult
Modeling analysis result.
- class logml.analysis.items.modeling.AggregateFeatureImportanceStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
FI Aggregation Step.
Note that this step is completely optional: in any case, all results are required. This step only generates a short summary table, which remains available even if the reporting step fails for some reason.
- LABEL = 'combine_fi'
- PARAMS_CLS
- run()
See parent description.
- generate_analysis_result(fi_runner, max_top_features_rank=30)
Generates (high-level) modeling analysis result
- get_result()
Modeling saves results directly, nothing to return here.
- generate_analysis_metadata() Optional[logml.analysis.base_item.AnalysisMetadata]
Creates metadata object.
The main use of the metadata object is to point to high-level analysis results (for example, the final result of Survival Modeling, as opposed to a sub-level analysis step such as model search). Those results are then gathered and rendered on the summary report page.
- get_paths_to_release() Optional[List[logml.analysis.base_item.ReleasePath]]
See parent description.
- class logml.analysis.items.modeling.SurvivalAnalysisResult(stratum_id: Optional[str] = None, problem_id: Optional[str] = None, num_features: int = -1, top_features: Optional[List[str]] = None, analysis_name: Optional[str] = None, unique_name: Optional[str] = None, status: Optional[str] = None)
Bases:
logml.analysis.base_item.AnalysisResult
Survival step result
- class logml.analysis.items.modeling.SurvivalAnalysisStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
Survival analysis step.
- LABEL = 'survival_analysis'
- PARAMS_CLS
- run()
Run modeling step
- generate_analysis_result()
Generate artifacts.
- get_result()
Modeling saves results directly, nothing to return here.
- generate_analysis_metadata() Optional[logml.analysis.base_item.AnalysisMetadata]
Creates metadata object.
The main use of the metadata object is to point to high-level analysis results (for example, the final result of Survival Modeling, as opposed to a sub-level analysis step such as model search). Those results are then gathered and rendered on the summary report page.
- class logml.analysis.items.modeling.GenerateReportStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
Generate Report Step.
- LABEL = 'generate_report'
- run()
Run modeling step
- make_params() dict
Make global params for modeling.
- get_result()
See parent description.
- classmethod estimate_resources(res: logml.analysis.common.JobResourcesReqs, cfg: GlobalConfig = None, df: Optional[pandas.core.frame.DataFrame] = None, strata_shapes: Optional[Dict[str, tuple]] = None, item_params: Any = None) None
See parent description.
- class logml.analysis.items.modeling.PackageReportStep(*args, **kwargs)
Bases:
logml.analysis.items.modeling.ModelingItemBase
Package report step
- LABEL = 'package_report'
- run()
Run modeling step
- make_params() dict
Make global params for modeling.
- get_result()
See parent description.
- logml.analysis.items.modeling.stratum_high_resources_estimator(res: logml.analysis.common.JobResourcesReqs, strata_shapes: Optional[Dict[str, tuple]] = None, params: Optional[Any] = None, cpu_mul: float = 1.0, mem_mul: float = 2.0)
Default estimate of high resource requirements, based on the number of rows and columns in the dataset.
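The documentation does not state the formula behind this estimate. The sketch below shows one plausible way a shape-based estimator with `cpu_mul` and `mem_mul` multipliers could work. Everything in it is an assumption made for illustration: `ResourcesSketch` stands in for `JobResourcesReqs`, `strata_shapes` is assumed to map a stratum id to a `(rows, columns)` tuple, and the 8-bytes-per-cell and CPU heuristics are invented, not taken from logml.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ResourcesSketch:
    """Hypothetical stand-in for logml.analysis.common.JobResourcesReqs."""
    cpus: int = 1
    memory_gb: float = 1.0


def estimate_sketch(
    res: ResourcesSketch,
    strata_shapes: Optional[Dict[str, Tuple[int, int]]] = None,
    cpu_mul: float = 1.0,
    mem_mul: float = 2.0,
) -> None:
    """Update `res` in place from the largest stratum (made-up heuristic)."""
    if not strata_shapes:
        return  # nothing to base an estimate on; keep the defaults
    # Pick the stratum with the most cells.
    rows, cols = max(strata_shapes.values(), key=lambda shape: shape[0] * shape[1])
    # Assumption: 8 bytes per cell (float64), scaled by mem_mul for working copies.
    data_gb = rows * cols * 8 / 1024 ** 3
    res.memory_gb = max(res.memory_gb, data_gb * mem_mul)
    # Assumption: one extra CPU per 100 million cells, scaled by cpu_mul.
    res.cpus = max(res.cpus, int(cpu_mul * (1 + rows * cols // 100_000_000)))


res = ResourcesSketch()
estimate_sketch(res, {"all": (1_000_000, 512), "treated": (1000, 512)})
```

Taking the maximum with the existing values means the sketch only ever raises the request, so a small stratum never lowers defaults that were set elsewhere.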