LogML Pipeline
=============================

When you provide LogML with data and a configuration file, it first determines which individual steps are to be executed and their dependencies. A typical dependency chain for each analysis looks like this (see the sketch at the end of this section):

- Data preprocessing: prepare the data for modeling. The result is a LogML Dataset entity.
- Specific analysis: accept the dataset and perform the analysis:
  - For example, Feature Importance analysis performs modeling and extracts the features essential for explaining the target.
- Report generation: use the artifacts generated by the analysis step to render a visualization of the results.
- Result packaging: archive the report and important analysis artifacts.

Essentially, an "analysis" is a set of LogML components (modules) that receives incoming data, builds a model of some sort - a statistical or machine learning model - and then draws a conclusion about the data. In general, there are three types of questions LogML analyses try to answer:

- What is the relation between covariates and target variables? (Modeling and Survival Analysis)
- Given groups of samples, which features define their separation? (Expression Analysis)
- Is there anything specific about the statistical properties of the data? (Exploratory Analysis)

See the following sections for details on the kinds of analyses LogML provides.
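To make the step/dependency idea concrete, here is a minimal sketch in plain Python of how a preprocessing -> analysis -> report -> packaging chain can be resolved and executed in dependency order. All names in it (`Step`, `execute`, the step labels) are hypothetical illustrations, not part of the actual LogML API.

```python
# Hypothetical sketch of a step/dependency chain; not LogML's real API.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Step:
    name: str
    run: Callable[[dict], object]           # consumes upstream artifacts, returns its own
    depends_on: List[str] = field(default_factory=list)

def execute(steps: Dict[str, Step]) -> dict:
    """Run steps in dependency order via a simple depth-first traversal."""
    artifacts: dict = {}

    def visit(name: str) -> None:
        if name in artifacts:               # already executed
            return
        for dep in steps[name].depends_on:  # run upstream steps first
            visit(dep)
        artifacts[name] = steps[name].run(artifacts)

    for name in steps:
        visit(name)
    return artifacts

# The four-step chain described above, with placeholder step bodies.
pipeline = {
    "preprocessing": Step("preprocessing", lambda a: "dataset"),
    "feature_importance": Step(
        "feature_importance",
        lambda a: f"model built on {a['preprocessing']}",
        depends_on=["preprocessing"],
    ),
    "report": Step("report", lambda a: "rendered report",
                   depends_on=["feature_importance"]),
    "packaging": Step("packaging", lambda a: "archive.zip",
                      depends_on=["report"]),
}

print(execute(pipeline))
```

Because each step only declares what it depends on, the same mechanism covers any analysis kind: swapping Feature Importance for another analysis changes the middle step, while preprocessing, reporting, and packaging stay in place.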