logml.configuration.survival_analysis

class logml.configuration.survival_analysis.SurvivalAnalysisMethod

Bases: pydantic.main.BaseModel

Configures specific survival analysis method.

For the list of registered methods, see Survival Analysis Methods.

Show JSON schema

{
   "title": "SurvivalAnalysisMethod",
   "description": "Configures specific survival analysis method.\n\nThere are the following registered methods: :lml:ref:`Survival Analysis Methods`.",
   "type": "object",
   "properties": {
      "enable": {
         "title": "Enable",
         "description": "Enable or disable this analysis.",
         "default": true,
         "type": "boolean"
      },
      "method_id": {
         "title": "Method Id",
         "description": "Alias of survival analysis method to use. Refer to :lml:ref:`Survival Analysis Methods` for details.",
         "type": "string"
      },
      "params": {
         "title": "Params",
         "description": "Parameters specific to the chosen analysis method. Refer to :lml:ref:`Survival Analysis Methods` for the exact configuration structure of the method.",
         "default": {},
         "type": "object"
      }
   },
   "required": [
      "method_id"
   ]
}

Fields

enable (bool)
method_id (str)
params (Dict)

field enable: bool = True: Enable or disable this analysis.

field method_id: str [Required]: Alias of survival analysis method to use. Refer to Survival Analysis Methods for details.

field params: Dict = {}: Parameters specific to the chosen analysis method. Refer to Survival Analysis Methods for the exact configuration structure of the method.
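The three fields above can be illustrated without importing logml. The following is a minimal sketch that mirrors the documented defaults (enable is True, params is an empty dict) and the single required field. The helper name normalize_method and the alias "kaplan_meier" are hypothetical and used for illustration only; the real aliases are listed under Survival Analysis Methods.

```python
from typing import Any, Dict


def normalize_method(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the documented defaults and check the one required field."""
    if "method_id" not in raw:
        raise ValueError("method_id is required")
    return {
        "enable": raw.get("enable", True),      # documented default: True
        "method_id": raw["method_id"],          # required, no default
        "params": dict(raw.get("params", {})),  # documented default: {}
    }


# "kaplan_meier" is a hypothetical alias, not a confirmed registered method.
method = normalize_method({"method_id": "kaplan_meier"})
print(method)
```

In the library itself this validation is done by the pydantic model; the sketch only shows which keys a method entry is expected to carry.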

class logml.configuration.survival_analysis.SurvivalAnalysisSetup

Bases: pydantic.main.BaseModel

Defines configuration for a “survival analysis problem”, also called a “setup” for short.

Show JSON schema

{
   "title": "SurvivalAnalysisSetup",
   "description": "Defines configuration for a \"survival analysis problem\", also called a \"setup\" for short.",
   "type": "object",
   "properties": {
      "enable": {
         "title": "Enable",
         "description": "Enable or disable this survival analysis problem.",
         "default": true,
         "type": "boolean"
      },
      "survival_metric": {
         "title": "Survival Metric",
         "description": "Column of the input dataset that contains \"time-to-event\" values. Usually this is overall\n            survival (OS) or progression-free-survival (PFS) time. NOTE: deprecated, please use\n            \"dataset_metadata\" section.\n        ",
         "default": "",
         "type": "string"
      },
      "event_observed": {
         "title": "Event Observed",
         "description": "Query-like expression that indicates \"events\" (\"uncensored\") samples. For example:\n            \"OS_CNSR == 1\". See :ref:`Dataset Queries` for details. NOTE: deprecated, please use\n            \"dataset_metadata\" section.\n        ",
         "default": "",
         "type": "string"
      },
      "event_column": {
         "title": "Event Column",
         "description": "Column used for event calculation. (It must be specified so that it can be removed\n            from the features list after dataset preprocessing).\n            If you specify `event_observed: \"OS_CNSR == 1\"`, then also set `event_column: OS_CNSR`.\n            If not specified, an attempt is made to extract the column name from the `event_observed` expression.\n            NOTE: deprecated, please use the \"dataset_metadata\" section.\n        ",
         "default": "",
         "type": "string"
      },
      "dataset_preprocessing": {
         "title": "Dataset Preprocessing",
         "description": "Defines dataset preprocessing configuration. It runs before `time` and `event` data extraction, so make sure the final dataset contains both columns.",
         "default": {
            "enable": true,
            "preset": {
               "enable": false,
               "features_list": [],
               "remove_correlated_features": true,
               "nans_per_row_fraction_threshold": 0.9,
               "nans_fraction_threshold": 0.7,
               "apply_log1p_to_target": false,
               "drop_datetime_columns": true,
               "drop_dna_wt": false,
               "imputer": "median"
            },
            "steps": []
         },
         "allOf": [
            {
               "$ref": "#/definitions/DatasetPreprocessingSection"
            }
         ]
      },
      "methods": {
         "title": "Methods",
         "description": "List of configurations for specific survival analysis methods to apply.",
         "default": [],
         "type": "array",
         "items": {
            "$ref": "#/definitions/SurvivalAnalysisMethod"
         }
      }
   },
   "definitions": {
      "DatasetPreprocessingPresetSection": {
         "title": "DatasetPreprocessingPresetSection",
         "description": "Defines 'syntax sugar' for semi-automated data preprocessing steps generation.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Whether to enable automated generation of preprocessing steps.",
               "default": true,
               "type": "boolean"
            },
            "features_list": {
               "title": "Features List",
               "description": "Defines a list of features (referenced by regexps) that should be selected. Additional option\n            is just to reference a configuration file that contains the required list of features:\n            ...\n            features_list: sub_cfg/features_list.yaml  # a config file\n            ...\n        ",
               "default": [],
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "array",
                     "items": {
                        "type": "string"
                     }
                  }
               ]
            },
            "remove_correlated_features": {
               "title": "Remove Correlated Features",
               "description": "Whether to include a step that removes correlated features.",
               "default": true,
               "type": "boolean"
            },
            "nans_per_row_fraction_threshold": {
               "title": "Nans Per Row Fraction Threshold",
               "description": "Defines maximum acceptable fraction of NaNs within a row.",
               "default": 0.9,
               "type": "number"
            },
            "nans_fraction_threshold": {
               "title": "Nans Fraction Threshold",
               "description": "Defines maximum acceptable fraction of NaNs within a column.",
               "default": 0.7,
               "type": "number"
            },
            "apply_log1p_to_target": {
               "title": "Apply Log1P To Target",
               "description": "Whether to apply log1p transformation to target column (applicable only for regression problems).",
               "default": false,
               "type": "boolean"
            },
            "drop_datetime_columns": {
               "title": "Drop Datetime Columns",
               "description": "Whether to drop date time columns.",
               "default": true,
               "type": "boolean"
            },
            "drop_dna_wt": {
               "title": "Drop Dna Wt",
               "description": "Whether to drop DNA WT values after one-hot-encoding.",
               "default": false,
               "type": "boolean"
            },
            "imputer": {
               "title": "Imputer",
               "description": "Imputer to use. Possible values: (median, mice)",
               "default": "median",
               "type": "string"
            }
         }
      },
      "PreprocessingStep": {
         "title": "PreprocessingStep",
         "description": "Defines data preprocessing step.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Whether to enable preprocessing step.",
               "default": true,
               "type": "boolean"
            },
            "transformer": {
               "title": "Transformer",
               "description": "Alias of transformer to use. Please refer to :lml:ref:`Data Transformers` for details.",
               "type": "string"
            },
            "params": {
               "title": "Params",
               "description": "Parameters that will be passed to the corresponding transformer instance.",
               "default": {},
               "type": "object"
            }
         },
         "required": [
            "transformer"
         ]
      },
      "DatasetPreprocessingSection": {
         "title": "DatasetPreprocessingSection",
         "description": "Defines data preprocessing section for modeling/survival setup.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Whether to enable Preprocessing Pipeline for dataset transformation.",
               "default": true,
               "type": "boolean"
            },
            "preset": {
               "title": "Preset",
               "default": {
                  "enable": false,
                  "features_list": [],
                  "remove_correlated_features": true,
                  "nans_per_row_fraction_threshold": 0.9,
                  "nans_fraction_threshold": 0.7,
                  "apply_log1p_to_target": false,
                  "drop_datetime_columns": true,
                  "drop_dna_wt": false,
                  "imputer": "median"
               },
               "allOf": [
                  {
                     "$ref": "#/definitions/DatasetPreprocessingPresetSection"
                  }
               ]
            },
            "steps": {
               "title": "Steps",
               "description": "Defines a list of preprocessing steps (transformations) to apply. See :lml:ref:`Data Transformers` for details.",
               "default": [],
               "type": "array",
               "items": {
                  "$ref": "#/definitions/PreprocessingStep"
               }
            }
         }
      },
      "SurvivalAnalysisMethod": {
         "title": "SurvivalAnalysisMethod",
         "description": "Configures specific survival analysis method.\n\nThere are the following registered methods: :lml:ref:`Survival Analysis Methods`.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Enable or disable this analysis.",
               "default": true,
               "type": "boolean"
            },
            "method_id": {
               "title": "Method Id",
               "description": "Alias of survival analysis method to use. Refer to :lml:ref:`Survival Analysis Methods` for details.",
               "type": "string"
            },
            "params": {
               "title": "Params",
               "description": "Parameters specific to the chosen analysis method. Refer to :lml:ref:`Survival Analysis Methods` for the exact configuration structure of the method.",
               "default": {},
               "type": "object"
            }
         },
         "required": [
            "method_id"
         ]
      }
   }
}

Fields

dataset_preprocessing (logml.configuration.modeling.DatasetPreprocessingSection)
enable (bool)
event_column (str)
event_observed (str)
methods (List[logml.configuration.survival_analysis.SurvivalAnalysisMethod])
survival_metric (str)

field enable: bool = True: Enable or disable survival anaysis problem.

field survival_metric: str = '': Column of the input dataset that contains “time-to-event” values. Usually this is overall survival (OS) or progression-free-survival (PFS) time. NOTE: deprecated, please use “dataset_metadata” section.

field event_observed: str = '': Query-like expression that indicates “events” (“uncensored”) samples. For example: “OS_CNSR == 1”. See Dataset Queries for details. NOTE: deprecated, please use “dataset_metadata” section.

field event_column: str = '': Column used for event calculation. It must be specified so that it can be removed from the features list after dataset preprocessing. If you specify event_observed: “OS_CNSR == 1”, then also set event_column: OS_CNSR. If not specified, an attempt is made to extract the column name from the event_observed expression. NOTE: deprecated, please use the “dataset_metadata” section.
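The fallback described for event_column (extracting the column name from the event_observed expression) can be pictured with a short sketch. This is not the library's implementation; the helper guess_event_column is hypothetical and assumes the expression starts with the column identifier, as in the documented example “OS_CNSR == 1”.

```python
import re
from typing import Optional


def guess_event_column(event_observed: str) -> Optional[str]:
    """Return the leading identifier of a query-like expression, if any.

    Assumption: the column name is the first token of the expression.
    """
    match = re.match(r"\s*([A-Za-z_][A-Za-z0-9_]*)", event_observed)
    return match.group(1) if match else None


print(guess_event_column("OS_CNSR == 1"))  # OS_CNSR
print(guess_event_column(""))              # None
```

An expression that does not begin with the column name would defeat this kind of guess, which is one reason to set event_column explicitly (or to move to the “dataset_metadata” section, as the deprecation notes recommend).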

field dataset_preprocessing: logml.configuration.modeling.DatasetPreprocessingSection = DatasetPreprocessingSection(enable=True, preset=DatasetPreprocessingPresetSection(enable=False, features_list=[], remove_correlated_features=True, nans_per_row_fraction_threshold=0.9, nans_fraction_threshold=0.7, apply_log1p_to_target=False, drop_datetime_columns=True, drop_dna_wt=False, imputer='median'), steps=[]): Defines dataset preprocessing configuration. It runs before time and event data extraction, so make sure the final dataset contains both columns.
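To make the two NaN thresholds of the preset concrete, here is a rough sketch in plain Python. It assumes that a column or row is kept when its NaN fraction does not exceed the threshold and that columns are filtered before rows; both are assumptions, and the helper names are hypothetical. The column names in the sample data are made up.

```python
import math
from typing import Dict, List, Tuple

Row = Dict[str, float]


def nan_fraction(values: List[float]) -> float:
    """Fraction of NaN entries in a list (0.0 for an empty list)."""
    if not values:
        return 0.0
    return sum(1 for v in values if math.isnan(v)) / len(values)


def drop_sparse(
    rows: List[Row],
    columns: List[str],
    nans_per_row_fraction_threshold: float = 0.9,  # documented default
    nans_fraction_threshold: float = 0.7,          # documented default
) -> Tuple[List[Row], List[str]]:
    """Drop columns, then rows, whose NaN fraction exceeds the threshold."""
    kept_columns = [
        c for c in columns
        if nan_fraction([r[c] for r in rows]) <= nans_fraction_threshold
    ]
    kept_rows = [
        {c: r[c] for c in kept_columns}
        for r in rows
        if nan_fraction([r[c] for c in kept_columns]) <= nans_per_row_fraction_threshold
    ]
    return kept_rows, kept_columns


nan = math.nan
rows = [
    {"age": 61.0, "marker": nan, "OS": 12.0},
    {"age": 54.0, "marker": nan, "OS": 30.0},
    {"age": nan, "marker": nan, "OS": 7.0},
    {"age": 70.0, "marker": 1.5, "OS": 18.0},
]
kept_rows, kept_columns = drop_sparse(rows, ["age", "marker", "OS"])
print(kept_columns, len(kept_rows))

# Preprocessing runs before time/event extraction, so the time column must survive.
assert "OS" in kept_columns
```

With the default thresholds, the "marker" column (75% NaN) is dropped and all four rows are kept. The final assertion reflects the note above: whatever preprocessing is configured, the time and event columns still have to be present afterwards.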

field methods: List[logml.configuration.survival_analysis.SurvivalAnalysisMethod] = []: List of configurations for specific survival analysis methods to apply.

get_target_methods() → List[str]: Returns enabled SA methods.

get_sa_method(method_id: str): Returns corresponding method for a given section.
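The two helper methods can be pictured on a plain dict that has the same shape as a setup. This is only a sketch of the documented behaviour (returning the enabled methods and looking one up by its method_id), not the library's code. The function names, the aliases "kaplan_meier" and "cox_ph", and the choice to return None for an unknown id are assumptions.

```python
from typing import Any, Dict, List, Optional

Setup = Dict[str, Any]

# Hypothetical setup; the method aliases are illustrative only.
setup: Setup = {
    "enable": True,
    "methods": [
        {"method_id": "kaplan_meier", "params": {}},
        {"method_id": "cox_ph", "enable": False},
    ],
}


def target_methods(cfg: Setup) -> List[str]:
    """Ids of methods whose 'enable' flag is True (the documented default)."""
    return [m["method_id"] for m in cfg.get("methods", []) if m.get("enable", True)]


def find_method(cfg: Setup, method_id: str) -> Optional[Dict[str, Any]]:
    """First method entry with the given id, or None if there is none."""
    for m in cfg.get("methods", []):
        if m["method_id"] == method_id:
            return m
    return None


print(target_methods(setup))
print(find_method(setup, "cox_ph"))
```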

class logml.configuration.survival_analysis.SurvivalAnalysisSection

Bases: pydantic.main.BaseModel

Defines survival analysis section.

The goal of survival analysis is to examine the data from a survival modeling perspective and draw conclusions from the results.

An example is the Kaplan-Meier univariate survival estimator, which shows how well the median value of a feature separates the data into two groups.
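The Kaplan-Meier idea mentioned above can be sketched in a few lines of standard-library Python. This is a textbook product-limit estimator combined with a median split, written for illustration under stated assumptions; it is not the code logml runs, and the data, function names, and the "low"/"high" group labels are made up.

```python
from statistics import median
from typing import Dict, List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = leaving = 0
        # Group all samples that share the same time point.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += 1 if pairs[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored samples both leave the risk set
    return curve


def split_by_median(feature, durations, events) -> Dict[str, Tuple[list, list]]:
    """Split samples into 'low' (<= median) and 'high' (> median) groups."""
    cut = median(feature)
    groups: Dict[str, Tuple[list, list]] = {"low": ([], []), "high": ([], [])}
    for value, d, e in zip(feature, durations, events):
        key = "high" if value > cut else "low"
        groups[key][0].append(d)
        groups[key][1].append(e)
    return groups


# Made-up data: one feature, time-to-event values, and event flags.
feature = [0.2, 0.9, 0.4, 0.8, 0.1, 0.7]
durations = [5, 20, 8, 25, 3, 15]
events = [True, True, True, False, True, True]

groups = split_by_median(feature, durations, events)
curves = {name: kaplan_meier(d, e) for name, (d, e) in groups.items()}
for name, curve in curves.items():
    print(name, curve)
```

If the two curves lie far apart, the median of the feature separates the samples well. In practice the gap would be checked with a statistical test such as the log-rank test rather than by eye.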

Show JSON schema

{
   "title": "SurvivalAnalysisSection",
   "description": "Defines survival analysis section.\n\nThe goal of survival analysis is to examine the data from a survival modeling perspective and\ndraw conclusions from the results.\n\nAn example is the Kaplan-Meier univariate survival estimator, which shows how well the median\nvalue of a feature separates the data into two groups.",
   "type": "object",
   "properties": {
      "enable": {
         "title": "Enable",
         "description": "Enable or disable Survival Analysis.",
         "default": true,
         "type": "boolean"
      },
      "problems": {
         "title": "Problems",
         "description": "Defines list of \"survival problem setup\" configurations. Usually problems are similar to one another, but have different target variables.",
         "default": {},
         "type": "object",
         "additionalProperties": {
            "$ref": "#/definitions/SurvivalAnalysisSetup"
         }
      }
   },
   "definitions": {
      "DatasetPreprocessingPresetSection": {
         "title": "DatasetPreprocessingPresetSection",
         "description": "Defines 'syntax sugar' for semi-automated data preprocessing steps generation.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Whether to enable automated generation of preprocessing steps.",
               "default": true,
               "type": "boolean"
            },
            "features_list": {
               "title": "Features List",
               "description": "Defines a list of features (referenced by regexps) that should be selected. Additional option\n            is just to reference a configuration file that contains the required list of features:\n            ...\n            features_list: sub_cfg/features_list.yaml  # a config file\n            ...\n        ",
               "default": [],
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "array",
                     "items": {
                        "type": "string"
                     }
                  }
               ]
            },
            "remove_correlated_features": {
               "title": "Remove Correlated Features",
               "description": "Whether to include a step that removes correlated features.",
               "default": true,
               "type": "boolean"
            },
            "nans_per_row_fraction_threshold": {
               "title": "Nans Per Row Fraction Threshold",
               "description": "Defines maximum acceptable fraction of NaNs within a row.",
               "default": 0.9,
               "type": "number"
            },
            "nans_fraction_threshold": {
               "title": "Nans Fraction Threshold",
               "description": "Defines maximum acceptable fraction of NaNs within a column.",
               "default": 0.7,
               "type": "number"
            },
            "apply_log1p_to_target": {
               "title": "Apply Log1P To Target",
               "description": "Whether to apply log1p transformation to target column (applicable only for regression problems).",
               "default": false,
               "type": "boolean"
            },
            "drop_datetime_columns": {
               "title": "Drop Datetime Columns",
               "description": "Whether to drop date time columns.",
               "default": true,
               "type": "boolean"
            },
            "drop_dna_wt": {
               "title": "Drop Dna Wt",
               "description": "Whether to drop DNA WT values after one-hot-encoding.",
               "default": false,
               "type": "boolean"
            },
            "imputer": {
               "title": "Imputer",
               "description": "Imputer to use. Possible values: (median, mice)",
               "default": "median",
               "type": "string"
            }
         }
      },
      "PreprocessingStep": {
         "title": "PreprocessingStep",
         "description": "Defines data preprocessing step.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Whether to enable preprocessing step.",
               "default": true,
               "type": "boolean"
            },
            "transformer": {
               "title": "Transformer",
               "description": "Alias of transformer to use. Please refer to :lml:ref:`Data Transformers` for details.",
               "type": "string"
            },
            "params": {
               "title": "Params",
               "description": "Parameters that will be passed to the corresponding transformer instance.",
               "default": {},
               "type": "object"
            }
         },
         "required": [
            "transformer"
         ]
      },
      "DatasetPreprocessingSection": {
         "title": "DatasetPreprocessingSection",
         "description": "Defines data preprocessing section for modeling/survival setup.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Whether to enable Preprocessing Pipeline for dataset transformation.",
               "default": true,
               "type": "boolean"
            },
            "preset": {
               "title": "Preset",
               "default": {
                  "enable": false,
                  "features_list": [],
                  "remove_correlated_features": true,
                  "nans_per_row_fraction_threshold": 0.9,
                  "nans_fraction_threshold": 0.7,
                  "apply_log1p_to_target": false,
                  "drop_datetime_columns": true,
                  "drop_dna_wt": false,
                  "imputer": "median"
               },
               "allOf": [
                  {
                     "$ref": "#/definitions/DatasetPreprocessingPresetSection"
                  }
               ]
            },
            "steps": {
               "title": "Steps",
               "description": "Defines a list of preprocessing steps (transformations) to apply. See :lml:ref:`Data Transformers` for details.",
               "default": [],
               "type": "array",
               "items": {
                  "$ref": "#/definitions/PreprocessingStep"
               }
            }
         }
      },
      "SurvivalAnalysisMethod": {
         "title": "SurvivalAnalysisMethod",
         "description": "Configures specific survival analysis method.\n\nThere are the following registered methods: :lml:ref:`Survival Analysis Methods`.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Enable or disable this analysis.",
               "default": true,
               "type": "boolean"
            },
            "method_id": {
               "title": "Method Id",
               "description": "Alias of survival analysis method to use. Refer to :lml:ref:`Survival Analysis Methods` for details.",
               "type": "string"
            },
            "params": {
               "title": "Params",
               "description": "Parameters specific to the chosen analysis method. Refer to :lml:ref:`Survival Analysis Methods` for the exact configuration structure of the method.",
               "default": {},
               "type": "object"
            }
         },
         "required": [
            "method_id"
         ]
      },
      "SurvivalAnalysisSetup": {
         "title": "SurvivalAnalysisSetup",
         "description": "Defines configuration for a \"survival analysis problem\", also called a \"setup\" for short.",
         "type": "object",
         "properties": {
            "enable": {
               "title": "Enable",
               "description": "Enable or disable this survival analysis problem.",
               "default": true,
               "type": "boolean"
            },
            "survival_metric": {
               "title": "Survival Metric",
               "description": "Column of the input dataset that contains \"time-to-event\" values. Usually this is overall\n            survival (OS) or progression-free-survival (PFS) time. NOTE: deprecated, please use\n            \"dataset_metadata\" section.\n        ",
               "default": "",
               "type": "string"
            },
            "event_observed": {
               "title": "Event Observed",
               "description": "Query-like expression that indicates \"events\" (\"uncensored\") samples. For example:\n            \"OS_CNSR == 1\". See :ref:`Dataset Queries` for details. NOTE: deprecated, please use\n            \"dataset_metadata\" section.\n        ",
               "default": "",
               "type": "string"
            },
            "event_column": {
               "title": "Event Column",
               "description": "Column used for event calculation. (It must be specified so that it can be removed\n            from the features list after dataset preprocessing).\n            If you specify `event_observed: \"OS_CNSR == 1\"`, then also set `event_column: OS_CNSR`.\n            If not specified, an attempt is made to extract the column name from the `event_observed` expression.\n            NOTE: deprecated, please use the \"dataset_metadata\" section.\n        ",
               "default": "",
               "type": "string"
            },
            "dataset_preprocessing": {
               "title": "Dataset Preprocessing",
               "description": "Defines dataset preprocessing configuration. It runs before `time` and `event` data extraction, so make sure the final dataset contains both columns.",
               "default": {
                  "enable": true,
                  "preset": {
                     "enable": false,
                     "features_list": [],
                     "remove_correlated_features": true,
                     "nans_per_row_fraction_threshold": 0.9,
                     "nans_fraction_threshold": 0.7,
                     "apply_log1p_to_target": false,
                     "drop_datetime_columns": true,
                     "drop_dna_wt": false,
                     "imputer": "median"
                  },
                  "steps": []
               },
               "allOf": [
                  {
                     "$ref": "#/definitions/DatasetPreprocessingSection"
                  }
               ]
            },
            "methods": {
               "title": "Methods",
               "description": "List of configurations for specific survival analysis methods to apply.",
               "default": [],
               "type": "array",
               "items": {
                  "$ref": "#/definitions/SurvivalAnalysisMethod"
               }
            }
         }
      }
   }
}

Fields

enable (bool)
problems (Dict[str, logml.configuration.survival_analysis.SurvivalAnalysisSetup])

field enable: bool = True: Enable or disable Survival Anaysis.

field problems: Dict[str, logml.configuration.survival_analysis.SurvivalAnalysisSetup] = {}: Defines the “survival problem setup” configurations, keyed by problem name. Usually problems are similar to one another, but have different target variables.

get_target_problems() → List[str]: Returns modeling setups for which a given section is enabled.
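As a rough picture of how a section, its problems, and get_target_problems() fit together, the sketch below uses a plain dict shaped like the schema above. The problem names "os" and "pfs" and the method alias are hypothetical. The helper assumes that a problem counts as a target when its own enable flag is True, and it deliberately ignores the section-level enable flag, whose effect is not described here.

```python
from typing import Any, Dict, List

# Hypothetical section; keys of "problems" are problem names chosen by the user.
section: Dict[str, Any] = {
    "enable": True,
    "problems": {
        "os": {"methods": [{"method_id": "kaplan_meier"}]},
        "pfs": {"enable": False, "methods": [{"method_id": "kaplan_meier"}]},
    },
}


def target_problems(cfg: Dict[str, Any]) -> List[str]:
    """Names of problems whose 'enable' flag is True (the documented default)."""
    return [
        name
        for name, setup in cfg.get("problems", {}).items()
        if setup.get("enable", True)
    ]


print(target_problems(section))
```

Because problems is a mapping rather than a list, each setup can be addressed by its name, which is convenient when several setups differ only in their target variables.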