:py:mod:`bluecast.evaluation.error_analysis_regression`
=======================================================

.. py:module:: bluecast.evaluation.error_analysis_regression


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   bluecast.evaluation.error_analysis_regression.OutOfFoldDataReaderRegression
   bluecast.evaluation.error_analysis_regression.OutOfFoldDataReaderRegressionCV
   bluecast.evaluation.error_analysis_regression.ErrorAnalyserRegressionMixin
   bluecast.evaluation.error_analysis_regression.ErrorDistributionRegressionPlotterMixin
   bluecast.evaluation.error_analysis_regression.ErrorAnalyserRegression
   bluecast.evaluation.error_analysis_regression.ErrorAnalyserRegressionCV


.. py:class:: OutOfFoldDataReaderRegression(bluecast_instance: bluecast.blueprints.cast_regression.BlueCastRegression)

   Bases: :py:obj:`bluecast.evaluation.base_classes.DataReader`

   .. py:method:: read_data_from_bluecast_instance() -> polars.DataFrame

      Read the out-of-fold datasets from the defined storage location.

      :return: Out-of-fold dataset.


   .. py:method:: read_data_from_bluecast_cv_instance() -> polars.DataFrame

      Always fails when called. Use ``read_data_from_bluecast_instance`` instead.

      :return: Will raise an error.


.. py:class:: OutOfFoldDataReaderRegressionCV(bluecast_instance: bluecast.blueprints.cast_cv_regression.BlueCastCVRegression)

   Bases: :py:obj:`bluecast.evaluation.base_classes.DataReader`

   .. py:method:: read_data_from_bluecast_instance() -> polars.DataFrame

      Always fails when called. Use ``read_data_from_bluecast_cv_instance`` instead.

      :return: Will raise an error.


   .. py:method:: read_data_from_bluecast_cv_instance() -> polars.DataFrame

      Read the out-of-fold datasets from the defined storage location.

      :return: Concatenated out-of-fold dataset.


.. py:class:: ErrorAnalyserRegressionMixin

   Bases: :py:obj:`bluecast.evaluation.base_classes.ErrorAnalyser`

   .. py:method:: analyse_errors(df: Union[pandas.DataFrame, polars.DataFrame], descending: bool = True) -> polars.DataFrame

      Find the mean absolute error for every subsegment.

      :param df: Preprocessed out-of-fold DataFrame.
      :param descending: Whether errors are sorted in descending order in the final DataFrame.
      :return: Polars DataFrame with all subsegments and the mean absolute error in each of them.


.. py:class:: ErrorDistributionRegressionPlotterMixin(ignore_columns_during_visualization: List[str])

   Bases: :py:obj:`bluecast.evaluation.base_classes.ErrorDistributionPlotter`

   .. py:method:: plot_error_distributions(df: polars.DataFrame, target_column: str = 'target_quantiles')


.. py:class:: ErrorAnalyserRegression(bluecast_instance: bluecast.blueprints.cast_regression.BlueCastRegression, ignore_columns_during_visualization=None)

   Bases: :py:obj:`OutOfFoldDataReaderRegression`, :py:obj:`bluecast.evaluation.base_classes.ErrorPreprocessor`, :py:obj:`ErrorAnalyserRegressionMixin`, :py:obj:`ErrorDistributionRegressionPlotterMixin`

   .. py:method:: stack_predictions_by_class(df: polars.DataFrame) -> polars.DataFrame

      Add a column with the binned target.

      :param df: Polars DataFrame with original targets.
      :return: Polars DataFrame with an additional binned targets column.


   .. py:method:: calculate_errors(df: Union[pandas.DataFrame, polars.DataFrame]) -> polars.DataFrame

      Analyse errors of predictions on out-of-fold data.

      :param df: DataFrame holding out-of-fold data and predictions.
      :return: Polars DataFrame with additional 'prediction_error' column.


   .. py:method:: analyse_segment_errors() -> polars.DataFrame

      Pipeline for error analysis.

      Reads the out-of-fold datasets from the output location defined in the
      training config inside the provided BlueCast instance, preprocesses the
      data and calculates errors for all subsegments of the data. Numerical
      columns are split into quantiles to form subsegments.

      :return: Polars DataFrame with subsegments and errors.


.. py:class:: ErrorAnalyserRegressionCV(bluecast_instance: bluecast.blueprints.cast_cv_regression.BlueCastCVRegression, ignore_columns_during_visualization=None)

   Bases: :py:obj:`OutOfFoldDataReaderRegressionCV`, :py:obj:`bluecast.evaluation.base_classes.ErrorPreprocessor`, :py:obj:`ErrorAnalyserRegressionMixin`, :py:obj:`ErrorDistributionRegressionPlotterMixin`

   .. py:method:: stack_predictions_by_class(df: polars.DataFrame) -> polars.DataFrame

      Add a column with the binned target.

      :param df: Polars DataFrame with original targets.
      :return: Polars DataFrame with an additional binned targets column.


   .. py:method:: calculate_errors(df: Union[pandas.DataFrame, polars.DataFrame])

      Analyse errors of predictions on out-of-fold data.

      :param df: DataFrame holding out-of-fold data and predictions.
      :param loss_func: Function that takes (y_true, y_pred) and returns a score. Will be used to evaluate prediction errors.
      :return: None


   .. py:method:: analyse_segment_errors() -> polars.DataFrame

      Pipeline for error analysis.

      Reads the out-of-fold datasets from the output location defined in the
      training config inside the provided BlueCast instance, preprocesses the
      data and calculates errors for all subsegments of the data. Numerical
      columns are split into quantiles to form subsegments.

      :return: Polars DataFrame with subsegments and errors.
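
The segment-wise analysis described for ``analyse_segment_errors`` (bin numerical columns into quantiles, then report the mean absolute error per subsegment) can be illustrated with a small standard-library sketch. This is not BlueCast's implementation: the helper names ``bin_into_quantiles`` and ``mean_abs_error_by_segment``, the ``q0``..``q3`` segment labels, the choice of four bins and the toy column names are all assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean, quantiles


def bin_into_quantiles(values, n_bins=4):
    """Map each numeric value to a quantile-bin index in [0, n_bins - 1]."""
    cuts = quantiles(values, n=n_bins)  # yields n_bins - 1 cut points
    return [sum(v > c for c in cuts) for v in values]


def mean_abs_error_by_segment(rows, target, prediction, features, descending=True):
    """Return (feature, segment, mean absolute error) tuples, one per subsegment.

    Hypothetical helper: mirrors the documented idea, not the library code.
    """
    errors = [abs(r[target] - r[prediction]) for r in rows]
    results = []
    for feat in features:
        column = [r[feat] for r in rows]
        # Numerical columns are binned into quantiles; others are used as-is.
        if all(isinstance(v, (int, float)) for v in column):
            segments = [f"q{b}" for b in bin_into_quantiles(column)]
        else:
            segments = column
        grouped = defaultdict(list)
        for seg, err in zip(segments, errors):
            grouped[seg].append(err)
        results.extend((feat, seg, mean(errs)) for seg, errs in grouped.items())
    # Largest errors first by default, like the `descending` flag above.
    return sorted(results, key=lambda t: t[2], reverse=descending)


# Toy out-of-fold data (made up for this example).
rows = [
    {"city": "A", "size": 10, "target": 100.0, "prediction": 90.0},
    {"city": "A", "size": 20, "target": 110.0, "prediction": 108.0},
    {"city": "B", "size": 30, "target": 120.0, "prediction": 100.0},
    {"city": "B", "size": 40, "target": 130.0, "prediction": 100.0},
]
segment_errors = mean_abs_error_by_segment(
    rows, "target", "prediction", ["city", "size"]
)
for feature, segment, mae in segment_errors:
    print(f"{feature:>5} {segment:>3} {mae:6.2f}")
```

With a fitted ``BlueCastRegression`` instance whose out-of-fold data has been stored, the signatures above suggest that the library equivalent is ``ErrorAnalyserRegression(bluecast_instance).analyse_segment_errors()``, which returns a Polars DataFrame rather than a list of tuples.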