bluecast.preprocessing.train_test_split

This module contains functions to split data into train and test sets.

The train-test split can be done in two ways:
  • Randomly
  • Based on a provided order (e.g. time)

Module Contents

Functions

train_test_split_cross(df, target_col[, train_size, ...])

Randomly split data into train and test sets, optionally with stratification.

train_test_split_time(df, target_col, split_by_col[, ...])

Split data into train and test sets based on a provided order (e.g. time).

train_test_split(df, target_col[, split_by_col, ...])

bluecast.preprocessing.train_test_split.train_test_split_cross(df: pandas.DataFrame, target_col: str, train_size=0.8, random_state: int = 100, stratify: bool = False)

Randomly split data into train and test sets, optionally with stratification.
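To illustrate the idea of a random split with optional stratification, here is a minimal sketch in plain pandas. It is not BlueCast's implementation: the helper name random_split_sketch, its return shape (two DataFrames), and the choice to stratify on the target column are all assumptions made for illustration. Only the parameter names and defaults mirror the signature above.

```python
import pandas as pd


def random_split_sketch(df, target_col, train_size=0.8, random_state=100, stratify=False):
    """Hypothetical helper (not part of BlueCast): random train/test partition.

    Assumes df has a unique index so that train rows can be dropped by label.
    """
    if stratify:
        # Draw the same fraction from every target class so that class
        # proportions are preserved in the train partition.
        train = df.groupby(target_col, group_keys=False).sample(
            frac=train_size, random_state=random_state
        )
    else:
        # Plain random sample without regard to class balance.
        train = df.sample(frac=train_size, random_state=random_state)
    # Every row not drawn into train becomes part of the test partition.
    test = df.drop(train.index)
    return train, test


df = pd.DataFrame({"feature": range(10), "target": [0, 1] * 5})
train, test = random_split_sketch(df, "target", stratify=True)
```

With stratify=True, each class contributes the same fraction of its rows to the train partition. A fixed random_state makes the split reproducible.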

bluecast.preprocessing.train_test_split.train_test_split_time(df: pandas.DataFrame, target_col: str, split_by_col: str, train_size: float = 0.8)

Split data into train and test sets based on a provided order (e.g. time).
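An order-based split can be sketched as follows, again in plain pandas and not as BlueCast's actual code. The helper name ordered_split_sketch, the ascending sort, the integer cutoff rule, and the two-DataFrame return shape are assumptions. The point is only that the earliest rows form the train set and the latest rows form the test set, so no future information leaks into training.

```python
import pandas as pd


def ordered_split_sketch(df, split_by_col, train_size=0.8):
    """Hypothetical helper (not part of BlueCast): split along an ordering column."""
    # Sort ascending so the earliest observations come first.
    ordered = df.sort_values(split_by_col)
    # The first train_size share of rows goes to train, the remainder to test.
    cutoff = int(len(ordered) * train_size)
    return ordered.iloc[:cutoff], ordered.iloc[cutoff:]


# Rows are deliberately stored in reverse chronological order.
df = pd.DataFrame({
    "day": pd.date_range("2024-01-01", periods=10, freq="D")[::-1],
    "target": range(10),
})
train, test = ordered_split_sketch(df, "day")
```

No random_state is needed here because the partition is fully determined by the values in the ordering column.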

bluecast.preprocessing.train_test_split.train_test_split(df: pandas.DataFrame, target_col: str, split_by_col: str | None = None, train_size: float = 0.8, random_state: int = 0, stratify: bool = False)
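The source gives no description for this function. Its signature combines the parameters of the two functions above, which suggests, but does not confirm, that it chooses between an order-based and a random split depending on whether split_by_col is given. The sketch below shows that dispatch pattern under this assumption. The name split_sketch and its behaviour are hypothetical and do not reproduce BlueCast's code, and stratification is left out for brevity.

```python
import pandas as pd


def split_sketch(df, split_by_col=None, train_size=0.8, random_state=0):
    """Hypothetical dispatcher (assumed behaviour, not BlueCast's implementation)."""
    if split_by_col is not None:
        # An ordering column was given: split along that order.
        ordered = df.sort_values(split_by_col)
        cutoff = int(len(ordered) * train_size)
        return ordered.iloc[:cutoff], ordered.iloc[cutoff:]
    # No ordering column: fall back to a reproducible random split.
    train = df.sample(frac=train_size, random_state=random_state)
    return train, df.drop(train.index)


df = pd.DataFrame({"step": range(10, 0, -1), "target": range(10)})
ordered_train, ordered_test = split_sketch(df, split_by_col="step")
random_train, random_test = split_sketch(df)
```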