logml.data.readers
Functions
filter_params_for_reader | Selects applicable params for the reader of a given file.
get_file_reader | Parses a given file's extension and returns the corresponding 'pd.read_XXX' reader function.
load_dataframe | Utility for loading a given file.
parse_dates | Parses date columns according to the given params.
read_dataframe | Reads a dataframe defined by the given params, applies strata filtering.
sanitize_column | Replaces disallowed characters in a given column name, casts to lower case.
sanitize_columns | Sanitizes columns of a given dataframe.
- logml.data.readers.sanitize_column(col_name: str, **kwargs) → str
Replaces disallowed characters in a given column name and casts it to lower case. Additional rules/flags could be added.
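The exact character rules are not listed on this page, so the following is only a minimal sketch of the idea. The function name sanitize_column_sketch and the choice of allowed characters (letters, digits, '_' and '.') are assumptions for illustration, not the library's implementation.

```python
import re


def sanitize_column_sketch(col_name: str) -> str:
    """Hypothetical stand-in for sanitize_column; the real rules may differ."""
    # Assumption: anything except letters, digits, '_' and '.' is disallowed.
    # Each run of disallowed characters collapses into a single underscore.
    cleaned = re.sub(r"[^0-9a-zA-Z_.]+", "_", col_name.strip())
    return cleaned.lower()


print(sanitize_column_sketch("Tumor Size (mm)"))  # tumor_size_mm_
print(sanitize_column_sketch("  Age "))           # age
```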
- logml.data.readers.sanitize_columns(dataframe: pandas.core.frame.DataFrame, **kwargs) → pandas.core.frame.DataFrame
Sanitizes columns of a given dataframe.
- Special flags:
col_prefix - a given prefix will be prepended to all column names
replace_dot - whether ‘.’ should be replaced in column names or not
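How the two flags could interact is sketched below with plain pandas. sanitize_columns_sketch is a hypothetical stand-in: the cleaning rule, the default values, and the order in which the prefix is applied are all assumptions.

```python
import re

import pandas as pd


def sanitize_columns_sketch(dataframe, col_prefix="", replace_dot=True):
    """Hypothetical stand-in for sanitize_columns; real defaults may differ."""

    def clean(name):
        text = str(name)
        if replace_dot:
            text = text.replace(".", "_")
        # Assumption: only letters, digits, '_' and '.' are allowed.
        text = re.sub(r"[^0-9a-zA-Z_.]+", "_", text).lower()
        return f"{col_prefix}{text}"

    # rename() returns a new dataframe and leaves the input untouched.
    return dataframe.rename(columns=clean)


df = pd.DataFrame({"Patient ID": [1, 2], "Gene.Name": ["TP53", "BRCA1"]})
prefixed = sanitize_columns_sketch(df, col_prefix="clin_")
dotted = sanitize_columns_sketch(df, replace_dot=False)
print(list(prefixed.columns))  # ['clin_patient_id', 'clin_gene_name']
print(list(dotted.columns))    # ['patient_id', 'gene.name']
```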
- logml.data.readers.get_file_reader(path: str) → Callable
Parses a given file’s extension and returns the corresponding ‘pd.read_XXX’ reader function.
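A dispatch of this kind can be sketched as a lookup table keyed by extension. The set of supported extensions and the error raised for unknown ones are assumptions here; this page does not say which formats the real function handles.

```python
import os

import pandas as pd

# Assumed mapping for illustration; the real function may support other formats.
_READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,
    ".json": pd.read_json,
    ".parquet": pd.read_parquet,
    ".pkl": pd.read_pickle,
}


def get_file_reader_sketch(path: str):
    """Hypothetical stand-in for get_file_reader."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in _READERS:
        raise ValueError(f"Unsupported file extension: {extension!r}")
    return _READERS[extension]


reader = get_file_reader_sketch("data/Cohort.CSV")
print(reader.__name__)  # read_csv
```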
- logml.data.readers.filter_params_for_reader(path: str, **kwargs) → Dict
Selects the params applicable to the reader of a given file.
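One plausible way to select applicable params is to compare the keyword arguments against the reader's signature. The sketch below takes the reader directly instead of a path, and filter_params_sketch is a hypothetical name; whether the real function works this way is not stated here.

```python
import inspect

import pandas as pd


def filter_params_sketch(reader, **kwargs):
    """Keep only the kwargs that appear as named parameters of `reader`."""
    accepted = inspect.signature(reader).parameters
    return {key: value for key, value in kwargs.items() if key in accepted}


filtered = filter_params_sketch(
    pd.read_csv,
    sep=";",
    encoding="utf-8",
    col_prefix="clin_",  # not a pd.read_csv parameter, so it is dropped
)
print(filtered)  # {'sep': ';', 'encoding': 'utf-8'}
```

This keeps flags such as col_prefix from reaching a reader that would reject them as unexpected keyword arguments.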
- logml.data.readers.parse_dates(dataframe: pandas.core.frame.DataFrame, **kwargs) → pandas.core.frame.DataFrame
Parses date columns using the following params:
parse_dates - list of columns to parse
dateformat - expected format
datetime_errors - whether to raise or ignore date parsing exceptions
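The three params map naturally onto pd.to_datetime, as the sketch below shows. parse_dates_sketch is a hypothetical stand-in, and passing datetime_errors straight through as the errors argument is an assumption about how the real function behaves.

```python
import pandas as pd


def parse_dates_sketch(dataframe, parse_dates=None, dateformat=None,
                       datetime_errors="raise"):
    """Hypothetical stand-in for parse_dates."""
    result = dataframe.copy()  # leave the caller's dataframe untouched
    for column in parse_dates or []:
        # Assumption: datetime_errors is forwarded as pandas' `errors` option.
        result[column] = pd.to_datetime(
            result[column], format=dateformat, errors=datetime_errors
        )
    return result


df = pd.DataFrame({"visit": ["2021-01-05", "2021-02-17"], "score": [3, 5]})
parsed = parse_dates_sketch(df, parse_dates=["visit"], dateformat="%Y-%m-%d")
print(parsed.dtypes)
```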
- logml.data.readers.load_dataframe(path: str, **kwargs)
Utility for loading a given file.
If the input is a CSV file, the following additional parsing flags are applied:
parse_dates - list of columns that contain dates
dateformat - for dates parsing
datetime_errors - whether to raise or ignore date parsing exceptions
sep - separator in csv files
encoding - csv files encoding
header - whether a given file has a header or not
sanitize_columns - whether column names need to be cleaned or not
col_prefix - if column sanitizing is required, a given prefix will be prepended to all ‘clean’ column names
replace_dot - whether ‘.’ should be replaced in column names or not
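To show how these flags might fit together for a CSV input, here is a self-contained sketch built on pd.read_csv. load_csv_sketch is hypothetical: the defaults, the order of the steps (dates are parsed before columns are renamed), and the cleaning rule are assumptions rather than documented behaviour.

```python
import io
import re

import pandas as pd


def load_csv_sketch(source, **kwargs):
    """Hypothetical stand-in for the CSV branch of load_dataframe."""
    # Pull out the flags that pd.read_csv should not receive.
    date_columns = kwargs.pop("parse_dates", None) or []
    dateformat = kwargs.pop("dateformat", None)
    datetime_errors = kwargs.pop("datetime_errors", "raise")
    sanitize = kwargs.pop("sanitize_columns", False)
    col_prefix = kwargs.pop("col_prefix", "")
    replace_dot = kwargs.pop("replace_dot", True)

    # Remaining kwargs (sep, encoding, header) go to the reader as-is.
    frame = pd.read_csv(source, **kwargs)

    for column in date_columns:
        frame[column] = pd.to_datetime(
            frame[column], format=dateformat, errors=datetime_errors
        )

    if sanitize:
        def clean(name):
            text = str(name).replace(".", "_") if replace_dot else str(name)
            return col_prefix + re.sub(r"[^0-9a-zA-Z_.]+", "_", text).lower()

        frame = frame.rename(columns=clean)
    return frame


raw = io.StringIO("Patient ID;Visit.Date\n1;05/01/2021\n2;17/02/2021\n")
frame = load_csv_sketch(
    raw, sep=";", header=0, parse_dates=["Visit.Date"],
    dateformat="%d/%m/%Y", sanitize_columns=True, col_prefix="clin_",
)
print(list(frame.columns))  # ['clin_patient_id', 'clin_visit_date']
```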
- logml.data.readers.read_dataframe(global_params: Dict) → pandas.core.frame.DataFrame
Reads a dataframe defined by the given params and applies strata filtering.