logml.configuration.survival_analysis
- class logml.configuration.survival_analysis.SurvivalAnalysisMethod
Bases: pydantic.main.BaseModel
Configures a specific survival analysis method.
For the list of registered methods, see Survival Analysis Methods.
Show JSON schema
{ "title": "SurvivalAnalysisMethod", "description": "Configures specific survival analysis method.\n\nThere are the following registered methods: :lml:ref:`Survival Analysis Methods`.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Enable or disable this analysis.", "default": true, "type": "boolean" }, "method_id": { "title": "Method Id", "description": "Alias of survival analysis method to use. Refer to :lml:ref:`Survival Analysis Methods` for details.", "type": "string" }, "params": { "title": "Params", "description": "Parameters specific for the chosen analysis method. Refer to :lml:ref:`Survival Analysis Methods` to find out exact configurationstructure for the method.", "default": {}, "type": "object" } }, "required": [ "method_id" ] }
- field enable: bool = True
Enable or disable this analysis.
- field method_id: str [Required]
Alias of the survival analysis method to use. Refer to Survival Analysis Methods for details.
- field params: Dict = {}
Parameters specific to the chosen analysis method. Refer to Survival Analysis Methods for the exact configuration structure of the method.
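The three fields above can be pictured with a small stand-in. This is a minimal sketch that uses stdlib dataclasses rather than the actual pydantic model, so it runs without logml installed; `MethodConfig`, `method_from_dict`, the alias `"kaplan_meier"` and the parameter `"alpha"` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class MethodConfig:
    """Stand-in mirroring the documented fields of SurvivalAnalysisMethod."""

    method_id: str                 # required: alias of the method to use
    enable: bool = True            # the analysis is enabled by default
    params: Dict[str, Any] = field(default_factory=dict)  # method-specific


def method_from_dict(raw: Dict[str, Any]) -> MethodConfig:
    """Build a MethodConfig from a parsed configuration mapping."""
    if "method_id" not in raw:
        # Mirrors the "required" entry of the JSON schema.
        raise ValueError("method_id is required")
    return MethodConfig(
        method_id=raw["method_id"],
        enable=raw.get("enable", True),
        params=dict(raw.get("params", {})),
    )


# Hypothetical alias and parameter; consult Survival Analysis Methods
# for the aliases and parameters that are actually registered.
cfg = method_from_dict({"method_id": "kaplan_meier", "params": {"alpha": 0.05}})
```

Only `method_id` has to be supplied; `enable` and `params` fall back to the documented defaults.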
- class logml.configuration.survival_analysis.SurvivalAnalysisSetup
Bases: pydantic.main.BaseModel
Defines the configuration for a “survival analysis problem”, also called a “setup” for short.
Show JSON schema
{ "title": "SurvivalAnalysisSetup", "description": "Defines configuration for a \"survival analysis problem\", also called briefly \"setup\".", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Enable or disable survival anaysis problem.", "default": true, "type": "boolean" }, "survival_metric": { "title": "Survival Metric", "description": "Column of the input dataset that contains \"time-to-event\" values. Usually this is overall\n survival (OS) or progression-free-survival (PFS) time. NOTE: deprecated, please use\n \"dataset_metadata\" section.\n ", "default": "", "type": "string" }, "event_observed": { "title": "Event Observed", "description": "Query-like expression that indicates \"events\" (\"uncensored\") samples. For example:\n \"OS_CNSR == 1\". See :ref:`Dataset Queries` for details. NOTE: deprecated, please use\n \"dataset_metadata\" section.\n ", "default": "", "type": "string" }, "event_column": { "title": "Event Column", "description": "Column used for event calculation. (We have to specify it so that is can be removed\n from features list after the dataset preprocessing).\n If you specify `event_observed: \"OS_CNSR == 1\"`, then also put `event_column: OS_CNSR`.\n If not specified, we attempt to extract column name from the `event_observed` expression.\n NOTE: deprecated, please use \"dataset_metadata\" section.\n ", "default": "", "type": "string" }, "dataset_preprocessing": { "title": "Dataset Preprocessing", "description": "Defines dataset preprocessing configuration. 
It runs before `time` and `event` data extraction, so make sure the final dataset contains both columns.", "default": { "enable": true, "preset": { "enable": false, "features_list": [], "remove_correlated_features": true, "nans_per_row_fraction_threshold": 0.9, "nans_fraction_threshold": 0.7, "apply_log1p_to_target": false, "drop_datetime_columns": true, "drop_dna_wt": false, "imputer": "median" }, "steps": [] }, "allOf": [ { "$ref": "#/definitions/DatasetPreprocessingSection" } ] }, "methods": { "title": "Methods", "description": "List of configurations for specific survival analysis methods to apply.", "default": [], "type": "array", "items": { "$ref": "#/definitions/SurvivalAnalysisMethod" } } }, "definitions": { "DatasetPreprocessingPresetSection": { "title": "DatasetPreprocessingPresetSection", "description": "Defines 'syntax sugar' for semi-automated data preprocessing steps generation.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Whether to enable automated generation of preprocessing steps.", "default": true, "type": "boolean" }, "features_list": { "title": "Features List", "description": "Defines a list of features (referenced by regexps) that should be selected. 
Additional option\n is just to reference a configuration file that contains the required list of features:\n ...\n features_list: sub_cfg/features_list.yaml # a config file\n ...\n ", "default": [], "anyOf": [ { "type": "string" }, { "type": "array", "items": { "type": "string" } } ] }, "remove_correlated_features": { "title": "Remove Correlated Features", "description": "Whether to include a step that removes correlated features.", "default": true, "type": "boolean" }, "nans_per_row_fraction_threshold": { "title": "Nans Per Row Fraction Threshold", "description": "Defines maximum acceptable fraction of NaNs within a row.", "default": 0.9, "type": "number" }, "nans_fraction_threshold": { "title": "Nans Fraction Threshold", "description": "Defines maximum acceptable fraction of NaNs within a column.", "default": 0.7, "type": "number" }, "apply_log1p_to_target": { "title": "Apply Log1P To Target", "description": "Whether to apply log1p transformation to target column (applicable only for regression problems).", "default": false, "type": "boolean" }, "drop_datetime_columns": { "title": "Drop Datetime Columns", "description": "Whether to drop date time columns.", "default": true, "type": "boolean" }, "drop_dna_wt": { "title": "Drop Dna Wt", "description": "Whether to drop DNA WT values after one-hot-encoding.", "default": false, "type": "boolean" }, "imputer": { "title": "Imputer", "description": "Imputer to use. Possible values: (median, mice)", "default": "median", "type": "string" } } }, "PreprocessingStep": { "title": "PreprocessingStep", "description": "Defines data preprocessing step.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Whether to enable preprocessing step.", "default": true, "type": "boolean" }, "transformer": { "title": "Transformer", "description": "Alias of transformer to use. 
Please refer to :lml:ref:`Data Transformers` for details.", "type": "string" }, "params": { "title": "Params", "description": "Parameters that will be passed to the correspoding transformer instance.", "default": {}, "type": "object" } }, "required": [ "transformer" ] }, "DatasetPreprocessingSection": { "title": "DatasetPreprocessingSection", "description": "Defines data preprocessing section for modeling/survival setup.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Whether to enable Preprocessing Pipeline for dataset transformation.", "default": true, "type": "boolean" }, "preset": { "title": "Preset", "default": { "enable": false, "features_list": [], "remove_correlated_features": true, "nans_per_row_fraction_threshold": 0.9, "nans_fraction_threshold": 0.7, "apply_log1p_to_target": false, "drop_datetime_columns": true, "drop_dna_wt": false, "imputer": "median" }, "allOf": [ { "$ref": "#/definitions/DatasetPreprocessingPresetSection" } ] }, "steps": { "title": "Steps", "description": "Defines a list of preprocessing steps (transformations) to apply. See :lml:ref:`Data Transformers` for details.", "default": [], "type": "array", "items": { "$ref": "#/definitions/PreprocessingStep" } } } }, "SurvivalAnalysisMethod": { "title": "SurvivalAnalysisMethod", "description": "Configures specific survival analysis method.\n\nThere are the following registered methods: :lml:ref:`Survival Analysis Methods`.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Enable or disable this analysis.", "default": true, "type": "boolean" }, "method_id": { "title": "Method Id", "description": "Alias of survival analysis method to use. Refer to :lml:ref:`Survival Analysis Methods` for details.", "type": "string" }, "params": { "title": "Params", "description": "Parameters specific for the chosen analysis method. 
Refer to :lml:ref:`Survival Analysis Methods` to find out exact configurationstructure for the method.", "default": {}, "type": "object" } }, "required": [ "method_id" ] } } }
- Fields
- field enable: bool = True
Enable or disable this survival analysis problem.
- field survival_metric: str = ''
Column of the input dataset that contains “time-to-event” values. Usually this is the overall survival (OS) or progression-free survival (PFS) time. NOTE: deprecated; use the “dataset_metadata” section instead.
- field event_observed: str = ''
Query-like expression that identifies “event” (“uncensored”) samples. For example: “OS_CNSR == 1”. See Dataset Queries for details. NOTE: deprecated; use the “dataset_metadata” section instead.
- field event_column: str = ''
Column used for event calculation. It has to be specified so that it can be removed from the features list after dataset preprocessing. If you specify event_observed: “OS_CNSR == 1”, then also set event_column: OS_CNSR. If it is not specified, an attempt is made to extract the column name from the event_observed expression. NOTE: deprecated; use the “dataset_metadata” section instead.
- field dataset_preprocessing: logml.configuration.modeling.DatasetPreprocessingSection = DatasetPreprocessingSection(enable=True, preset=DatasetPreprocessingPresetSection(enable=False, features_list=[], remove_correlated_features=True, nans_per_row_fraction_threshold=0.9, nans_fraction_threshold=0.7, apply_log1p_to_target=False, drop_datetime_columns=True, drop_dna_wt=False, imputer='median'), steps=[])
Defines dataset preprocessing configuration. It runs before time and event data extraction, so make sure the final dataset contains both columns.
- field methods: List[logml.configuration.survival_analysis.SurvivalAnalysisMethod] = []
List of configurations for specific survival analysis methods to apply.
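The fallback described for event_column (extracting the column name from the event_observed expression) could look roughly like the following. This is only a sketch of one plausible approach, taking the first identifier in the expression; the helper name `guess_event_column` is hypothetical and the actual extraction logic in logml may differ.

```python
import re
from typing import Optional

# An identifier: a letter or underscore followed by letters, digits, underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def guess_event_column(event_observed: str, event_column: str = "") -> Optional[str]:
    """Return the explicit event column, or guess it from the expression.

    Assumption: the first identifier in the expression is the column name.
    """
    if event_column:
        return event_column
    match = _IDENTIFIER.search(event_observed)
    return match.group(0) if match else None


column = guess_event_column("OS_CNSR == 1")
```

Setting event_column explicitly avoids relying on any such heuristic, which is why the field description recommends providing both values together.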
- get_target_methods() List[str]
Returns enabled SA methods.
- get_sa_method(method_id: str)
Returns corresponding method for a given section.
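The two helper methods can be sketched on stand-in classes. The behavior below is an assumption inferred from the signatures and one-line descriptions: `get_target_methods` is taken to return the `method_id` of every enabled method, and `get_sa_method` to look a method up by its `method_id`. `MethodConfig`, `SetupConfig` and the aliases used are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class MethodConfig:
    """Stand-in for SurvivalAnalysisMethod."""

    method_id: str
    enable: bool = True
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SetupConfig:
    """Stand-in for SurvivalAnalysisSetup (only the fields used here)."""

    enable: bool = True
    methods: List[MethodConfig] = field(default_factory=list)

    def get_target_methods(self) -> List[str]:
        # Assumed behavior: aliases of the methods whose `enable` flag is set.
        return [m.method_id for m in self.methods if m.enable]

    def get_sa_method(self, method_id: str) -> Optional[MethodConfig]:
        # Assumed behavior: first method whose alias matches, else None.
        for method in self.methods:
            if method.method_id == method_id:
                return method
        return None


setup = SetupConfig(
    methods=[
        MethodConfig("kaplan_meier"),           # hypothetical alias
        MethodConfig("cox_ph", enable=False),   # hypothetical alias, disabled
    ]
)
enabled = setup.get_target_methods()
cox = setup.get_sa_method("cox_ph")
```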
- class logml.configuration.survival_analysis.SurvivalAnalysisSection
Bases: pydantic.main.BaseModel
Defines the survival analysis section.
The goal of survival analysis is to examine the data from a survival modeling perspective and draw conclusions from the results.
An example is the Kaplan-Meier univariate survival estimator, which shows how well the median value of a feature separates the data into two groups.
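To make the Kaplan-Meier example concrete, here is a minimal pure-Python sketch: samples are split at the median of one feature, and the product-limit estimate is computed for each group. It is an illustration of the technique, not the logml implementation; the function names and the toy data are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) pairs at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all samples that share the same time.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: S(t) = S(t-) * (1 - d / n).
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored samples both leave the risk set.
        at_risk -= removed
    return curve


def km_by_median(feature, times, events):
    """Split samples at the feature median and estimate a curve per group."""
    cut = median(feature)
    low = [(t, e) for f, t, e in zip(feature, times, events) if f <= cut]
    high = [(t, e) for f, t, e in zip(feature, times, events) if f > cut]
    low_curve = kaplan_meier([t for t, _ in low], [e for _, e in low])
    high_curve = kaplan_meier([t for t, _ in high], [e for _, e in high])
    return low_curve, high_curve


# Toy data: feature value, time-to-event, and event flag (True = uncensored).
feature = [1, 2, 3, 4, 5, 6]
times = [2, 3, 4, 8, 9, 10]
events = [True, True, False, True, False, True]
low_curve, high_curve = km_by_median(feature, times, events)
```

If the two curves are clearly separated, the feature's median is a useful split point; a formal comparison would typically add a test such as the log-rank test, which is omitted here.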
Show JSON schema
{ "title": "SurvivalAnalysisSection", "description": "Defines survival analysis section.\n\nSurvival analysis goal is to consider the data from survival modeling perspecive and\nthen make a conclusion based on the results.\n\nExample is Kaplan-Meier univariate survival estimator, which shows how good median feature\nvalues separate data to two groups.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Enable or disable Survival Anaysis.", "default": true, "type": "boolean" }, "problems": { "title": "Problems", "description": "Defines list of \"survival problem setup\" configurations. Usually problems are similar to one another, but have different target variables.", "default": {}, "type": "object", "additionalProperties": { "$ref": "#/definitions/SurvivalAnalysisSetup" } } }, "definitions": { "DatasetPreprocessingPresetSection": { "title": "DatasetPreprocessingPresetSection", "description": "Defines 'syntax sugar' for semi-automated data preprocessing steps generation.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Whether to enable automated generation of preprocessing steps.", "default": true, "type": "boolean" }, "features_list": { "title": "Features List", "description": "Defines a list of features (referenced by regexps) that should be selected. 
Additional option\n is just to reference a configuration file that contains the required list of features:\n ...\n features_list: sub_cfg/features_list.yaml # a config file\n ...\n ", "default": [], "anyOf": [ { "type": "string" }, { "type": "array", "items": { "type": "string" } } ] }, "remove_correlated_features": { "title": "Remove Correlated Features", "description": "Whether to include a step that removes correlated features.", "default": true, "type": "boolean" }, "nans_per_row_fraction_threshold": { "title": "Nans Per Row Fraction Threshold", "description": "Defines maximum acceptable fraction of NaNs within a row.", "default": 0.9, "type": "number" }, "nans_fraction_threshold": { "title": "Nans Fraction Threshold", "description": "Defines maximum acceptable fraction of NaNs within a column.", "default": 0.7, "type": "number" }, "apply_log1p_to_target": { "title": "Apply Log1P To Target", "description": "Whether to apply log1p transformation to target column (applicable only for regression problems).", "default": false, "type": "boolean" }, "drop_datetime_columns": { "title": "Drop Datetime Columns", "description": "Whether to drop date time columns.", "default": true, "type": "boolean" }, "drop_dna_wt": { "title": "Drop Dna Wt", "description": "Whether to drop DNA WT values after one-hot-encoding.", "default": false, "type": "boolean" }, "imputer": { "title": "Imputer", "description": "Imputer to use. Possible values: (median, mice)", "default": "median", "type": "string" } } }, "PreprocessingStep": { "title": "PreprocessingStep", "description": "Defines data preprocessing step.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Whether to enable preprocessing step.", "default": true, "type": "boolean" }, "transformer": { "title": "Transformer", "description": "Alias of transformer to use. 
Please refer to :lml:ref:`Data Transformers` for details.", "type": "string" }, "params": { "title": "Params", "description": "Parameters that will be passed to the correspoding transformer instance.", "default": {}, "type": "object" } }, "required": [ "transformer" ] }, "DatasetPreprocessingSection": { "title": "DatasetPreprocessingSection", "description": "Defines data preprocessing section for modeling/survival setup.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Whether to enable Preprocessing Pipeline for dataset transformation.", "default": true, "type": "boolean" }, "preset": { "title": "Preset", "default": { "enable": false, "features_list": [], "remove_correlated_features": true, "nans_per_row_fraction_threshold": 0.9, "nans_fraction_threshold": 0.7, "apply_log1p_to_target": false, "drop_datetime_columns": true, "drop_dna_wt": false, "imputer": "median" }, "allOf": [ { "$ref": "#/definitions/DatasetPreprocessingPresetSection" } ] }, "steps": { "title": "Steps", "description": "Defines a list of preprocessing steps (transformations) to apply. See :lml:ref:`Data Transformers` for details.", "default": [], "type": "array", "items": { "$ref": "#/definitions/PreprocessingStep" } } } }, "SurvivalAnalysisMethod": { "title": "SurvivalAnalysisMethod", "description": "Configures specific survival analysis method.\n\nThere are the following registered methods: :lml:ref:`Survival Analysis Methods`.", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Enable or disable this analysis.", "default": true, "type": "boolean" }, "method_id": { "title": "Method Id", "description": "Alias of survival analysis method to use. Refer to :lml:ref:`Survival Analysis Methods` for details.", "type": "string" }, "params": { "title": "Params", "description": "Parameters specific for the chosen analysis method. 
Refer to :lml:ref:`Survival Analysis Methods` to find out exact configurationstructure for the method.", "default": {}, "type": "object" } }, "required": [ "method_id" ] }, "SurvivalAnalysisSetup": { "title": "SurvivalAnalysisSetup", "description": "Defines configuration for a \"survival analysis problem\", also called briefly \"setup\".", "type": "object", "properties": { "enable": { "title": "Enable", "description": "Enable or disable survival anaysis problem.", "default": true, "type": "boolean" }, "survival_metric": { "title": "Survival Metric", "description": "Column of the input dataset that contains \"time-to-event\" values. Usually this is overall\n survival (OS) or progression-free-survival (PFS) time. NOTE: deprecated, please use\n \"dataset_metadata\" section.\n ", "default": "", "type": "string" }, "event_observed": { "title": "Event Observed", "description": "Query-like expression that indicates \"events\" (\"uncensored\") samples. For example:\n \"OS_CNSR == 1\". See :ref:`Dataset Queries` for details. NOTE: deprecated, please use\n \"dataset_metadata\" section.\n ", "default": "", "type": "string" }, "event_column": { "title": "Event Column", "description": "Column used for event calculation. (We have to specify it so that is can be removed\n from features list after the dataset preprocessing).\n If you specify `event_observed: \"OS_CNSR == 1\"`, then also put `event_column: OS_CNSR`.\n If not specified, we attempt to extract column name from the `event_observed` expression.\n NOTE: deprecated, please use \"dataset_metadata\" section.\n ", "default": "", "type": "string" }, "dataset_preprocessing": { "title": "Dataset Preprocessing", "description": "Defines dataset preprocessing configuration. 
It runs before `time` and `event` data extraction, so make sure the final dataset contains both columns.", "default": { "enable": true, "preset": { "enable": false, "features_list": [], "remove_correlated_features": true, "nans_per_row_fraction_threshold": 0.9, "nans_fraction_threshold": 0.7, "apply_log1p_to_target": false, "drop_datetime_columns": true, "drop_dna_wt": false, "imputer": "median" }, "steps": [] }, "allOf": [ { "$ref": "#/definitions/DatasetPreprocessingSection" } ] }, "methods": { "title": "Methods", "description": "List of configurations for specific survival analysis methods to apply.", "default": [], "type": "array", "items": { "$ref": "#/definitions/SurvivalAnalysisMethod" } } } } } }
- Fields
- field enable: bool = True
Enable or disable Survival Analysis.
- field problems: Dict[str, logml.configuration.survival_analysis.SurvivalAnalysisSetup] = {}
Defines the “survival problem setup” configurations, keyed by problem name. Usually problems are similar to one another, but have different target variables.
- get_target_problems() List[str]
Returns modeling setups for which a given section is enabled.
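A rough picture of how the section and its problems fit together, again on stand-in dataclasses. The behavior of `get_target_problems` shown here is an assumption: it is taken to return the names of the enabled setups, and to return nothing when the section itself is disabled. The problem names `"OS"` and `"PFS"` are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SetupConfig:
    """Stand-in for SurvivalAnalysisSetup (only the `enable` flag)."""

    enable: bool = True


@dataclass
class SectionConfig:
    """Stand-in for SurvivalAnalysisSection."""

    enable: bool = True
    problems: Dict[str, SetupConfig] = field(default_factory=dict)

    def get_target_problems(self) -> List[str]:
        # Assumed behavior: a disabled section yields no problems at all.
        if not self.enable:
            return []
        return [name for name, setup in self.problems.items() if setup.enable]


section = SectionConfig(
    problems={
        "OS": SetupConfig(),               # placeholder problem name
        "PFS": SetupConfig(enable=False),  # placeholder, disabled
    }
)
targets = section.get_target_problems()
```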