
Train machine learning models with optional stacking and feature selection
Source: R/machine_learning.R
compute_features.training.ML.Rd
This function trains one or more machine learning models using repeated k-fold cross-validation, with optional model stacking and Boruta-based feature selection. It supports stratified cross-validation, including folds stratified by cohort when cohort information (batch_id) is available.
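The two fold schemes described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the helpers make_stratified_folds() and make_lodo_folds() are hypothetical names, and the sketch assumes that class-stratified folds keep the class balance of the target in each fold and that cohort-based folds hold out one cohort at a time.

```r
# Hypothetical helper (not part of the package): repeated k-fold indices,
# stratified by class so each fold keeps roughly the original class balance.
make_stratified_folds <- function(y, k = 10, n_rep = 5, seed = 1) {
  set.seed(seed)
  y <- as.factor(y)
  folds <- list()
  for (r in seq_len(n_rep)) {
    fold_id <- integer(length(y))
    for (lvl in levels(y)) {
      idx <- which(y == lvl)
      # Assign shuffled fold labels 1..k within each class
      fold_id[idx] <- sample(rep(seq_len(k), length.out = length(idx)))
    }
    for (f in seq_len(k)) {
      # Store the held-out row indices for this repetition and fold
      folds[[sprintf("Rep%d.Fold%02d", r, f)]] <- which(fold_id == f)
    }
  }
  folds
}

# Hypothetical helper: leave-one-dataset-out folds, one held-out cohort per fold.
make_lodo_folds <- function(batch_id) {
  split(seq_along(batch_id), batch_id)
}

# Toy data: 30 responders, 20 non-responders, three cohorts
y <- rep(c("R", "NR"), times = c(30, 20))
batch <- rep(c("cohortA", "cohortB", "cohortC"), length.out = 50)

folds <- make_stratified_folds(y, k = 5, n_rep = 2)
lodo <- make_lodo_folds(batch)
```

Each element of folds holds the held-out indices for one repetition and fold, and each element of lodo holds the indices of one cohort. How the function itself combines class and cohort information is not specified here, so treat the cohort handling as an assumption.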
Usage
compute_features.training.ML(
  features_train,
  target_var,
  trait.positive,
  metric = "Accuracy",
  stack,
  k_folds = 10,
  n_rep = 5,
  LODO = FALSE,
  batch_id = NULL,
  file_name = NULL,
  ncores = NULL,
  return = FALSE,
  fold_construction_fun = NULL,
  fold_construction_args_fixed = NULL,
  fold_construction_args_tunable = NULL
)
Arguments
- features_train
A data frame containing the features used for training, with samples as rows.
- target_var
A vector containing the target variable to predict.
- trait.positive
Value in target_var to be considered as the positive class.
- metric
Character. Metric used for hyperparameter tuning and model selection. Supported values are "Accuracy", "AUROC", and "AUPRC". Default is "Accuracy".
- stack
Logical. Whether to perform model stacking. Default is FALSE.
- k_folds
Integer. Number of folds to use in cross-validation. Default is 10.
- n_rep
Integer. Number of repetitions of the cross-validation. Default is 5.
- LODO
Logical. If TRUE, constructs folds stratified by cohorts (Leave-One-Dataset-Out CV).
- batch_id
A vector indicating the cohort or batch for each sample (required only if LODO = TRUE).
- file_name
Character. File name used to save plots in the Results/ directory.
- ncores
Integer. Number of cores to use for parallelization. If NULL, detectCores() - 1 is used.
- return
Logical. Whether to return and save the plots generated by the function.
- fold_construction_fun
Function. A custom function used to construct the cross-validation folds. This function must accept a bestune argument, which is used internally to inject optimized parameters after hyperparameter tuning. If bestune = NULL, the function explores a parameter grid across folds (parallelized with foreach); if bestune is provided, the optimized parameters are applied to rebuild the features on the full training data.
- fold_construction_args_fixed
List. Arguments passed to fold_construction_fun that remain fixed during both cross-validation and final training.
- fold_construction_args_tunable
List. Arguments passed to fold_construction_fun that define the hyperparameters to be tuned during cross-validation. Each element should contain candidate values for tuning.
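The bestune contract for fold_construction_fun can be illustrated with a small sketch. Only the requirement to accept a bestune argument comes from the documentation above. The function name top_variance_features, its other arguments (data, n_top, center), the shape of bestune, and the return values are all hypothetical, and the grid is explored sequentially with lapply rather than with foreach to keep the example free of dependencies.

```r
# Hypothetical fold-construction function. Only the `bestune` argument is
# required by the documentation; all other names and the return structure
# are assumptions made for illustration.
top_variance_features <- function(data, n_top = NULL, center = TRUE,
                                  bestune = NULL) {
  build <- function(n) {
    vars <- apply(data, 2, var)
    keep <- names(sort(vars, decreasing = TRUE))[seq_len(min(n, ncol(data)))]
    # Keep the n most variable columns, optionally mean-centered
    scale(data[, keep, drop = FALSE], center = center, scale = FALSE)
  }
  if (is.null(bestune)) {
    # Tuning mode: build one candidate feature set per grid value
    # (sequential here; the package is documented to parallelize with foreach)
    out <- lapply(n_top, build)
    names(out) <- paste0("n_top=", n_top)
    return(out)
  }
  # Final mode: rebuild the features once with the selected value
  build(bestune$n_top)
}

set.seed(42)
x <- as.data.frame(matrix(rnorm(200), nrow = 20,
                          dimnames = list(NULL, paste0("f", 1:10))))

fixed <- list(center = TRUE)        # plays the role of fold_construction_args_fixed
tunable <- list(n_top = c(2, 5))    # plays the role of fold_construction_args_tunable

# Tuning mode: bestune is left as NULL, so every candidate value is explored
grid <- do.call(top_variance_features, c(list(data = x), fixed, tunable))

# Final mode: the selected value is injected through bestune
final <- do.call(top_variance_features,
                 c(list(data = x), fixed, list(bestune = list(n_top = 5))))
```

The split between fixed and tunable mirrors the two argument lists: fixed values are passed unchanged in both modes, while each tunable element carries the candidate values to compare. How the package actually assembles these arguments when calling the function is not documented here, so the do.call pattern is only one plausible arrangement.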