logml.analysis.items.greedy_split

Functions

_calc_and_plot_km(surv_targret, censor, ...)

Kaplan-Meier plot for two conditions.

_calc_km(surv_targret, censor, label_1, ...)

Kaplan-Meier models for two conditions.

_plot_km(kmf1, kmf2, result[, ax])

Kaplan-Meier plot for two conditions.

_run_search(x_data, feature_columns, time, event)

Run the search.

Classes

FindGeneSet(x_data, surv_target, censor[, ...])

Finds 'best' gene set.

GreedySplitAnalysisResult([stratas])

Result object that is produced by Greedy Split analysis item.

GreedySplitStrata(summary, km_fitter1, ...)

Strata-level result that is produced by Greedy Split analysis item.

GreedySplitSurvivalAnalysis(cfg[, logger])

Wraps the analysis algorithm with data loading/saving and configuration handling.

class logml.analysis.items.greedy_split.GreedySplitAnalysisConfig

Bases: pydantic.main.BaseModel

Config definition for Greedy Split analysis item.

JSON schema:
{
   "title": "GreedySplitAnalysisConfig",
   "description": "Config definition for Greedy Split analysis item.",
   "type": "object",
   "properties": {
      "survival_column": {
         "title": "Survival Column",
         "type": "string"
      },
      "event_column": {
         "title": "Event Column",
         "type": "string"
      },
      "event_query": {
         "title": "Event Query",
         "type": "string"
      },
      "stratify_column": {
         "title": "Stratify Column",
         "default": "",
         "type": "string"
      },
      "features": {
         "title": "Features",
         "default": [],
         "type": "array",
         "items": {}
      },
      "n_features_to_select": {
         "title": "N Features To Select",
         "default": 0,
         "type": "integer"
      },
      "min_split_size": {
         "title": "Min Split Size",
         "default": 0,
         "type": "integer"
      },
      "max_stratify_values": {
         "title": "Max Stratify Values",
         "default": 20,
         "type": "integer"
      },
      "n_jobs": {
         "title": "N Jobs",
         "default": 1,
         "type": "integer"
      },
      "input_ref": {
         "title": "Input Ref",
         "default": "$default",
         "type": "string"
      }
   },
   "required": [
      "survival_column",
      "event_column",
      "event_query"
   ]
}

Fields
field survival_column: str [Required]
field event_column: str [Required]
field event_query: str [Required]
field stratify_column: str = ''
field features: list = []
field n_features_to_select: int = 0
field min_split_size: int = 0
field max_stratify_values: int = 20
field n_jobs: int = 1
field input_ref: str = '$default'
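
To make the required and optional fields concrete, here is a minimal sketch that fills in the three required fields and applies the documented defaults. It uses a plain dictionary instead of the real pydantic model so that it runs without logml installed. The helper `with_defaults`, the column names `os_months` and `os_status`, and the syntax of the query string are all assumptions made for illustration.

```python
import copy

# Documented defaults for the optional fields of GreedySplitAnalysisConfig.
DEFAULTS = {
    "stratify_column": "",
    "features": [],
    "n_features_to_select": 0,
    "min_split_size": 0,
    "max_stratify_values": 20,
    "n_jobs": 1,
    "input_ref": "$default",
}
REQUIRED = ("survival_column", "event_column", "event_query")


def with_defaults(params: dict) -> dict:
    """Hypothetical helper (not part of logml): validate and fill defaults."""
    missing = [key for key in REQUIRED if key not in params]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    unknown = set(params) - set(REQUIRED) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    merged = copy.deepcopy(DEFAULTS)  # avoid sharing the mutable list default
    merged.update(params)
    return merged


# Column names and the query string are made up for this example.
config = with_defaults({
    "survival_column": "os_months",
    "event_column": "os_status",
    "event_query": "os_status == 1",
    "n_features_to_select": 3,
})
```

Only the three required keys must be supplied; every other key falls back to the default listed under Fields.
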
class logml.analysis.items.greedy_split.GreedySplitStrata(summary: pandas.core.frame.DataFrame, km_fitter1: object, km_fitter2: object, stat: float, pvalue: float, adjusted_pvalue: float, cox_ph: dict)

Bases: object

Strata-level result that is produced by Greedy Split analysis item.

summary: pandas.core.frame.DataFrame
km_fitter1: object
km_fitter2: object
stat: float
pvalue: float
adjusted_pvalue: float
cox_ph: dict
get_summary(unused_short: bool = True)

Get the analysis summary.
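
The `km_fitter1` and `km_fitter2` attributes presumably hold the Kaplan-Meier models for the two conditions being compared. For readers unfamiliar with the estimator, the following pure-Python sketch shows the product-limit calculation that such a model performs. It is not the fitter that logml uses; the function name `kaplan_meier` and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed follow-up times.
    events: 1 if the event occurred at that time, 0 if the sample was censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every sample leaves the risk set at its time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Five samples; the ones at time 2 (second entry) and time 4 are censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
# Survival drops to about 0.8, 0.6 and 0.3 at times 1, 2 and 3.
```

Censored samples reduce the number at risk without lowering the survival estimate, which is why the censored observation at time 4 adds no step to the curve.
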

class logml.analysis.items.greedy_split.GreedySplitAnalysisResult(stratas: Optional[Dict[str, logml.analysis.items.greedy_split.GreedySplitStrata]] = None)

Bases: logml.analysis.base_item.AnalysisResult

Result object that is produced by Greedy Split analysis item.

stratas: Dict[str, logml.analysis.items.greedy_split.GreedySplitStrata] = None
get_summary(short: bool = True)

Get summary object.

Parameters

short – If true, return a short (one-line) summary; otherwise return the full summary.

Returns

Formattable summary result.

Return type

object

class logml.analysis.items.greedy_split.GreedySplitSurvivalAnalysis(cfg: logml.analysis.items.greedy_split.GreedySplitAnalysisConfig, logger=None, **kwargs)

Bases: logml.analysis.base_item.AnalysisItem

Wraps the analysis algorithm with data loading/saving and configuration handling.

LABEL = 'greedy_split'
PARAMS_CLS

alias of logml.analysis.items.greedy_split.GreedySplitAnalysisConfig

RESULT_CLS

alias of logml.analysis.items.greedy_split.GreedySplitAnalysisResult

classmethod estimate_resources(res: logml.analysis.common.JobResourcesReqs, cfg: GlobalConfig = None, df: Optional[pandas.core.frame.DataFrame] = None, strata_shapes: Optional[Dict[str, tuple]] = None, item_params: Any = None) → None

See parent description.

Use the minimum amount of memory and set the number of CPUs per strata/jobs combination.

get_feature_columns(dataframe: pandas.core.frame.DataFrame) → list

Return list of features.

get_event_data(dataframe: pandas.core.frame.DataFrame) → numpy.array

Execute the event query to build the survival event data.

run()

Run end-to-end analysis.

get_result() → logml.analysis.items.greedy_split.GreedySplitAnalysisResult

Return final analysis result.

generate_analysis_metadata() → Optional[logml.analysis.base_item.AnalysisMetadata]

Create the metadata object.

class logml.analysis.items.greedy_split.FindGeneSet(x_data: pandas.core.frame.DataFrame, surv_target: numpy.array, censor: numpy.array, genes: Optional[list] = None, num_genes_to_select: int = 0, min_split_size: int = 0, use_any=False, interactive: bool = False, title: Optional[str] = None, logger=None, label=None, variance_threshold=0.0001)

Bases: object

Finds ‘best’ gene set.

property result

Returns greedy split analysis result.

run()

Greedy search for gene set.
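
The reference above does not spell out the search procedure. As a rough illustration of what a greedy forward selection with a minimum split size can look like, the sketch below adds one feature at a time and keeps the feature that most improves a caller-supplied score. Everything in it is an assumption made for illustration, not logml's implementation: the function `greedy_select`, the rule that a sample is in the "high" group when every selected feature exceeds its median, and the toy score `mean_time_gap`. A real survival analysis would more plausibly score splits with a survival statistic.

```python
from statistics import median


def greedy_select(rows, features, score, n_select, min_split_size=0):
    """Hypothetical greedy forward selection over binary splits.

    rows: list of dicts mapping feature name -> numeric value.
    score: callable taking one boolean per row (group membership) and
           returning a number to maximise.
    """
    cutoffs = {f: median(r[f] for r in rows) for f in features}
    selected, best_score = [], float("-inf")
    while len(selected) < n_select:
        best_candidate = None
        for f in features:
            if f in selected:
                continue
            trial = selected + [f]
            # Assumed rule: "high" means every selected feature is above its median.
            mask = [all(r[g] > cutoffs[g] for g in trial) for r in rows]
            n_high = sum(mask)
            if min(n_high, len(rows) - n_high) < min_split_size:
                continue  # one side of the split would be too small
            s = score(mask)
            if s > best_score:
                best_candidate, best_score = f, s
        if best_candidate is None:
            break  # no remaining feature improves the score
        selected.append(best_candidate)
    return selected, best_score


# Toy data: two made-up features and a follow-up time per sample.
rows = [
    {"g1": 1, "g2": 5, "time": 2},
    {"g1": 2, "g2": 1, "time": 3},
    {"g1": 8, "g2": 2, "time": 9},
    {"g1": 9, "g2": 6, "time": 10},
]


def mean_time_gap(mask):
    """Toy score: absolute difference in mean time between the two groups."""
    high = [r["time"] for r, m in zip(rows, mask) if m]
    low = [r["time"] for r, m in zip(rows, mask) if not m]
    return abs(sum(high) / len(high) - sum(low) / len(low))


# min_split_size=1 keeps both groups non-empty, so the toy score never divides by zero.
selected, gap = greedy_select(rows, ["g1", "g2"], mean_time_gap,
                              n_select=2, min_split_size=1)
```

On this toy data the search stops after one feature, because adding the second feature lowers the score. A greedy search of this kind evaluates each remaining feature once per round, so it is cheap, but it can miss feature combinations that only work well together.
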