bluecast.ml_modelling.catboost
Module Contents
Classes
CatboostModel | CatBoost classification model, mirroring the XgboostModel structure.
- class bluecast.ml_modelling.catboost.CatboostModel(class_problem: Literal[binary, multiclass], conf_training: bluecast.config.training_config.TrainingConfig | None = None, conf_catboost: bluecast.config.training_config.CatboostTuneParamsConfig | None = None, conf_params_catboost: bluecast.config.training_config.CatboostFinalParamConfig | None = None, experiment_tracker: bluecast.experimentation.tracking.ExperimentTracker | None = None, custom_in_fold_preprocessor: bluecast.preprocessing.custom.CustomPreprocessing | None = None, cat_columns: List[str | float | int] | None = None, single_fold_eval_metric_func: bluecast.evaluation.eval_metrics.ClassificationEvalWrapper | None = None)
Bases: bluecast.ml_modelling.base_classes.CatboostBaseModel
CatBoost classification model, mirroring the XgboostModel structure.
This class can train and/or tune a CatBoost model, handle sample weighting, and do repeated stratified cross-validation or single-fold training.
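The sample weighting mentioned above is not specified further on this page. As a rough illustration only, the following sketch computes the common "balanced" weighting, where each row is weighted by n_samples / (n_classes * count of its class). The function name `balanced_sample_weights` is hypothetical, and whether CatboostModel uses exactly this scheme is an assumption.

```python
import numpy as np


def balanced_sample_weights(y) -> np.ndarray:
    """Hypothetical helper: weight each row inversely to its class frequency.

    Assumption: "balanced" weighting, n_samples / (n_classes * class_count).
    """
    y = np.asarray(y).ravel()
    classes, inverse, counts = np.unique(
        y, return_inverse=True, return_counts=True
    )
    class_weights = len(y) / (len(classes) * counts)
    # Map each row back to the weight of its class.
    return class_weights[inverse]


# Three rows of class 0 and one row of class 1: the rare class gets more weight.
y = np.array([0, 0, 0, 1])
weights = balanced_sample_weights(y)
```

Weights of this kind could then be passed on when building the training data, for example through the `weight` argument of `catboost.Pool`.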
- fit(x_train: pandas.DataFrame, x_test: pandas.DataFrame, y_train: pandas.Series, y_test: pandas.Series) → catboost.CatBoost
Train a CatBoost classification model. This includes optional hyperparameter tuning via ‘orchestrate_hyperparameter_tuning’, then final training on the entire data (or a subset if needed).
- autotune(x_train: pandas.DataFrame, x_test: pandas.DataFrame, y_train: pandas.Series, y_test: pandas.Series) → None
- train_single_fold_model(train_pool: catboost.Pool, test_pool: catboost.Pool, y_test: pandas.Series, params: Dict[str, Any]) → float
- _fine_tune_precise(tuned_params: Dict[str, Any], x_train: pandas.DataFrame, y_train: pandas.Series, x_test: pandas.DataFrame, y_test: pandas.Series) → float
Manual repeated stratified K-fold approach, similar to _fine_tune_precise in XgboostModel.
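To illustrate what a manual repeated stratified K-fold loop looks like in general, here is a minimal sketch using scikit-learn. It is not the BlueCast implementation: `repeated_cv_score` is a hypothetical name, LogisticRegression stands in for the CatBoost model to keep the example light, and log loss is an assumed metric.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import RepeatedStratifiedKFold


def repeated_cv_score(X, y, n_splits=5, n_repeats=2, seed=0) -> float:
    """Hypothetical helper: mean log loss over repeated stratified folds."""
    splitter = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    fold_losses = []
    for train_idx, test_idx in splitter.split(X, y):
        # Stand-in estimator (assumption); a CatBoost model would go here.
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        probs = model.predict_proba(X[test_idx])
        fold_losses.append(log_loss(y[test_idx], probs, labels=[0, 1]))
    # A single float summarises all n_splits * n_repeats folds.
    return float(np.mean(fold_losses))


# Synthetic binary data, used only for demonstration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
mean_loss = repeated_cv_score(X, y)
```

Averaging over repeated splits reduces the variance of the score compared with a single train/test split, at the cost of fitting the model n_splits * n_repeats times.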
- fine_tune(x_train: pandas.DataFrame, x_test: pandas.DataFrame, y_train: pandas.Series, y_test: pandas.Series) → None
Grid-search style fine-tuning, analogous to the XGBoost model’s method. It creates an objective function that uses a small parameter space from create_fine_tune_search_space(), then calls _optimize_and_plot_grid_search_study() to run an Optuna study with GridSampler.
- predict(df: pandas.DataFrame) → Tuple[numpy.ndarray, numpy.ndarray]
Predict probabilities and classes on new data. Returns (predicted_probs, predicted_classes).
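How the predicted classes are derived from the probabilities is not stated here. One plausible scheme, shown purely as an assumption, thresholds the positive-class probability for binary problems and takes the arg-max for multiclass problems. The helper `probs_to_classes` and the default threshold of 0.5 are hypothetical.

```python
import numpy as np


def probs_to_classes(probs, threshold: float = 0.5) -> np.ndarray:
    """Hypothetical helper: turn predicted probabilities into class labels."""
    probs = np.asarray(probs)
    if probs.ndim == 1:
        # Binary case: one positive-class probability per row.
        return (probs > threshold).astype(int)
    # Multiclass case: one column per class, pick the most likely one.
    return probs.argmax(axis=1)


binary_classes = probs_to_classes(np.array([0.2, 0.7, 0.5]))
multi_classes = probs_to_classes(
    np.array([[0.1, 0.6, 0.3], [0.8, 0.1, 0.1]])
)
```

Note that with a strict `>` comparison, a probability exactly equal to the threshold maps to class 0.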
- predict_proba(df: pandas.DataFrame) → numpy.ndarray
Predict class probabilities for new data (only relevant for classification).