mydatapreprocessing.helpers package
Helper functions that are used across the whole library.
They were written mostly for internal use but were added to the public API because they may be helpful.
mydatapreprocessing.helpers.get_copy_or_view(data: DataFrameOrArrayGeneric, inplace: bool) → DataFrameOrArrayGeneric
The DataFrame copy function has to be cast to keep type hints correct; this helper takes care of that.
Parameters: - data (DataFrameOrArrayGeneric) – Input data
- inplace (bool) – If True, the original data is returned; if False, a copy is returned.
Returns: Copy or original data.
Return type: DataFrameOrArrayGeneric
Example
>>> import numpy as np
>>> from mydatapreprocessing.helpers import get_copy_or_view
>>> a = np.array([1, 2, 3])
>>> b = get_copy_or_view(a, inplace=True)
>>> id(a) == id(b)
True
>>> b = get_copy_or_view(a, inplace=False)
>>> id(a) == id(b)
False
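The typing problem described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `DataT` and `copy_or_same` are hypothetical names, and `DataT` only stands in for `DataFrameOrArrayGeneric`.

```python
from typing import TypeVar, cast

import numpy as np
import pandas as pd

# Hypothetical type variable standing in for the library's DataFrameOrArrayGeneric.
DataT = TypeVar("DataT", pd.DataFrame, np.ndarray)


def copy_or_same(data: DataT, inplace: bool) -> DataT:
    """Return `data` itself when `inplace` is True, otherwise a copy."""
    if inplace:
        return data
    # The cast only informs the type checker; it does nothing at runtime.
    return cast(DataT, data.copy())


frame = pd.DataFrame({"a": [1, 2, 3]})
same = copy_or_same(frame, inplace=True)
copied = copy_or_same(frame, inplace=False)
copied.loc[0, "a"] = 100  # Changes only the copy, not `frame`.
```

Because the return type is tied to the input type, a caller that passes a DataFrame gets a DataFrame back in the eyes of the type checker, and an array stays an array.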
mydatapreprocessing.helpers.check_column_in_df(df: pd.DataFrame, name_or_index: PandasIndex, source: None | str = None) → None
Raise an error if the given column is not in the DataFrame.
Parameters: - df (pd.DataFrame) – Input data.
- name_or_index (PandasIndex) – Integer index, name or pandas.Index.
- source (str, optional) – Used in the raised message to reference the wanted column. Defaults to None.
Raises: KeyError – If the column is not found in the DataFrame.
Example
>>> import pandas as pd
>>> from mydatapreprocessing.helpers import check_column_in_df
>>> df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])
>>> check_column_in_df(df, "a")
>>> check_column_in_df(df, "z")
Traceback (most recent call last):
KeyError...
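A check of this kind might look roughly like the sketch below. It is not the library's code: `ensure_column` is a hypothetical name, integers are assumed to be column positions, and `source` is assumed to be a short text naming where the column was requested.

```python
from typing import Optional, Union

import pandas as pd


def ensure_column(
    df: pd.DataFrame, column: Union[int, str], source: Optional[str] = None
) -> None:
    """Raise KeyError when `column` cannot be found in `df` (hypothetical helper)."""
    if isinstance(column, int):
        # Assumption: integers are treated as positions, not as labels.
        found = 0 <= column < len(df.columns)
    else:
        found = column in df.columns
    if not found:
        # Assumption: `source` names the place where the column was requested.
        origin = f" (requested by {source})" if source else ""
        raise KeyError(f"Column {column!r} not found in DataFrame{origin}.")


df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])
ensure_column(df, "a")  # Passes silently.
ensure_column(df, 1)  # Position 1 exists, so this passes too.

try:
    ensure_column(df, "z", source="prediction settings")
except KeyError as err:
    message = err.args[0]
```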
mydatapreprocessing.helpers.get_column_name(df: pd.DataFrame, index: PandasIndex) → str | pd.Index
Return an index that can be used to access the column directly.
In user input, a column can be defined by its name or by its integer index, and selecting the column uses different syntax in each case. This function verifies that the column is available. If an integer index is given, it is converted to the column name, so the selection syntax is always the same.
Parameters: - df (pd.DataFrame) – Input data
- index (PandasIndex) – Column name; an integer index is also accepted.
Example
>>> import pandas as pd
>>> from mydatapreprocessing.helpers import get_column_name
>>> df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])
>>> get_column_name(df, "b")
'b'
>>> get_column_name(df, 2)
'c'
>>> get_column_name(df, "z")
Traceback (most recent call last):
KeyError...
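The benefit of normalizing the user input can be illustrated with a small sketch. `resolve_column` is a hypothetical stand-in written for this example, assuming that integers are column positions; it is not the library's implementation.

```python
from typing import Hashable, Union

import pandas as pd


def resolve_column(df: pd.DataFrame, column: Union[int, str]) -> Hashable:
    """Return the column label for a name or an integer position (hypothetical helper)."""
    if isinstance(column, int):
        if not 0 <= column < len(df.columns):
            raise KeyError(f"Column position {column} is out of range.")
        # Translate the position into the label stored in the column index.
        return df.columns[column]
    if column not in df.columns:
        raise KeyError(f"Column {column!r} not found in DataFrame.")
    return column


df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])

# After resolution, both kinds of input use the same label-based selection.
by_name = df[resolve_column(df, "b")]
by_position = df[resolve_column(df, 2)]
```

Without this step, the calling code would need `df["b"]` for names and `df.iloc[:, 2]` for positions, with a separate availability check for each.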