:py:mod:`bluecast.preprocessing.feature_selection`
==================================================

.. py:module:: bluecast.preprocessing.feature_selection


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   bluecast.preprocessing.feature_selection.BoostaRootaWrapper
   bluecast.preprocessing.feature_selection.BoostARoota


Functions
~~~~~~~~~

.. autoapisummary::

   bluecast.preprocessing.feature_selection._create_shadow
   bluecast.preprocessing.feature_selection._reduce_vars_xgb
   bluecast.preprocessing.feature_selection._reduce_vars_sklearn
   bluecast.preprocessing.feature_selection._BoostARoota


.. py:class:: BoostaRootaWrapper(class_problem: Literal[binary, multiclass, regression], random_state)

   Bases: :py:obj:`bluecast.preprocessing.custom.CustomPreprocessing`

   .. py:method:: fit_transform(df: pandas.DataFrame, targets: pandas.Series) -> Tuple[pandas.DataFrame, Optional[pandas.Series]]


   .. py:method:: transform(df: pandas.DataFrame, target: Optional[pandas.Series] = None, predicton_mode: bool = False) -> Tuple[pandas.DataFrame, Optional[pandas.Series]]



.. py:class:: BoostARoota(metric=None, clf=None, cutoff=200, iters=10, max_rounds=100, delta=0.1, silent=True)

   Bases: :py:obj:`object`

   .. py:method:: fit(x, y)


   .. py:method:: transform(x)


   .. py:method:: fit_transform(x, y)



.. py:function:: _create_shadow(x_train)

   Take all X variables, create copies of them, and randomly shuffle the copies.

   :param x_train: the dataframe to create shadow features on
   :return: a dataframe of twice the original width, and the names of the shadow
            columns so they can be removed later


.. py:function:: _reduce_vars_xgb(x, y, metric, this_round, cutoff, n_iterations, delta, silent)

   Run one round of variable reduction using XGBoost.

   :param x: input dataframe (X)
   :param y: target variable
   :param metric: metric to optimize in XGBoost
   :param this_round: number of the current round, so it can be printed to screen
   :return: tuple of the stopping criterion and the variables to keep


.. py:function:: _reduce_vars_sklearn(x, y, clf, this_round, cutoff, n_iterations, delta, silent)

   Run one round of variable reduction using a scikit-learn classifier.

   :param x: input dataframe (X)
   :param y: target variable
   :param clf: the fully specified classifier passed in by the user
   :param this_round: number of the current round, so it can be printed to screen
   :return: tuple of the stopping criterion and the variables to keep


.. py:function:: _BoostARoota(x, y, metric, clf, cutoff, iters, max_rounds, delta, silent)

   Loop over reduction rounds until the stopping criterion is met.

   :param x: one-hot encoded feature dataframe (X)
   :param y: labels for the target variable
   :param metric: the metric to optimize in XGBoost
   :return: names of the variables to keep
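
The shadow-feature step described for ``_create_shadow`` can be illustrated with a minimal sketch. This is not the library's implementation: the function name ``create_shadow_sketch``, the ``seed`` argument, and the ``shadow_`` column prefix are assumptions made for illustration only. It shows the idea stated in the docstring: copy every column, shuffle each copy, and return a dataframe of twice the width together with the shadow column names.

```python
import numpy as np
import pandas as pd


def create_shadow_sketch(x_train: pd.DataFrame, seed: int = 0):
    """Hypothetical stand-in for _create_shadow; not the library's code."""
    rng = np.random.default_rng(seed)  # seed is an assumption, for repeatability
    shadow = x_train.copy()
    for col in shadow.columns:
        # Permute each copied column independently, so the shadow column keeps
        # the original value distribution but loses any link to the target.
        shadow[col] = rng.permutation(shadow[col].to_numpy())

    # The "shadow_" prefix is illustrative; the real naming scheme may differ.
    shadow_names = [f"shadow_{col}" for col in x_train.columns]
    shadow.columns = shadow_names

    # Reset both indexes so the columns line up row by row
    # (note: this discards the original index).
    combined = pd.concat(
        [x_train.reset_index(drop=True), shadow.reset_index(drop=True)],
        axis=1,
    )
    return combined, shadow_names


x = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10.0, 20.0, 30.0, 40.0]})
combined, shadow_names = create_shadow_sketch(x)
```

The returned names make it straightforward to drop the shadow columns again once importances have been compared, for example with ``combined.drop(columns=shadow_names)``.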