
Train and evaluate machine learning models on previously constructed k folds
Source: R/machine_learning.R
compute_custom_k_fold_CV.Rd
This function performs k-fold cross-validation on custom folds, that is, folds built beforehand with custom functions, as required for cohort-dependent algorithms (see the vignette for more information). It supports hyperparameter tuning over a grid and returns a model object that mimics caret's training output, including performance metrics and predictions.
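To make "previously constructed folds" concrete, here is a minimal base R sketch of how such a list could be built. The element names `train` and `test`, the `label` column, and the per-fold scaling step are assumptions for illustration only; the structure the package actually expects is described in the vignette.

```r
# Hypothetical fold layout: `train`, `test`, and `label` are illustrative
# names, not the package's documented structure.
set.seed(42)
n <- 60
dat <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  label = factor(rep(c("A", "B"), each = n / 2))
)

k <- 3
# Assign every row to one of k folds, balanced within each class.
fold_id <- ave(seq_len(n), dat$label,
               FUN = function(i) sample(rep_len(seq_len(k), length(i))))

feature_cols <- c("x1", "x2")
processed_folds <- lapply(seq_len(k), function(f) {
  train <- dat[fold_id != f, ]
  test  <- dat[fold_id == f, ]
  # Example of a step that depends on the samples it sees: centering and
  # scaling parameters are estimated on the training part only, then
  # applied unchanged to the test part.
  mu  <- colMeans(train[, feature_cols])
  sdv <- apply(train[, feature_cols], 2, sd)
  train[, feature_cols] <- as.data.frame(
    scale(train[, feature_cols], center = mu, scale = sdv))
  test[, feature_cols] <- as.data.frame(
    scale(test[, feature_cols], center = mu, scale = sdv))
  list(train = train, test = test)
})
```

Processing each fold separately keeps information from the held-out samples out of the training features, which is the usual reason for building folds by hand rather than letting caret resample.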
Arguments
- processed_folds
A list of folds. Each fold contains processed training and test data with features.
- ml_method
A character string indicating the machine learning model to use, as supported by the caret package (e.g., "rf", "svmRadial", "glmnet").
- tuneGrid
Optional. A data frame specifying the grid of hyperparameters to evaluate. If NULL, a default grid of length 3 is generated using caret's getModelInfo().
- ncores
Integer. Number of cores to use for parallelization. If not given, detectCores() - 1 will be used.
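As an illustration of the `tuneGrid` and `ncores` arguments, the sketch below builds grids for the three example methods and computes the documented core-count fallback. The column names are caret's tuning parameters for those methods; the grid values are arbitrary, and the guard against a single-core or undetectable machine is an addition, not necessarily what the function does internally.

```r
# One column per tuning parameter, one row per combination to evaluate.
grid_rf     <- data.frame(mtry = c(2, 4, 6))
grid_svm    <- expand.grid(sigma = c(0.01, 0.1), C = c(0.5, 1, 2))
grid_glmnet <- expand.grid(alpha  = c(0, 0.5, 1),
                           lambda = 10^seq(-3, -1, length.out = 3))

# Documented default when `ncores` is not given: detectCores() - 1.
# max(1L, ..., na.rm = TRUE) is an extra guard added for this sketch.
ncores <- max(1L, parallel::detectCores() - 1L, na.rm = TRUE)

# A call would then look like this (requires the package and real folds):
# res <- compute_custom_k_fold_CV(
#   processed_folds = processed_folds,
#   ml_method       = "svmRadial",
#   tuneGrid        = grid_svm,
#   ncores          = ncores
# )
```

Every row of the grid is fitted once per fold, so the total number of model fits is the number of grid rows times the number of folds.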
Value
A list with the following components:
- Results_folds: A data frame summarizing average cross-validated Accuracy, Kappa, and their standard deviations for each hyperparameter combination.
- Prediction_folds: A data frame of predictions from each fold, including class probabilities, observed and predicted labels, and hyperparameter values.
- Resample_matrix: A data frame summarizing Accuracy and Kappa per fold for the best-tuned model.
- Besttune: List of optimized hyperparameters.
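The relationship between per-fold predictions and the summary tables can be sketched in base R. The toy data frame and its column names (`fold`, `mtry`, `obs`, `pred`) are hypothetical stand-ins for Prediction_folds, and the helper `accuracy_kappa()` is written for this example; the package's internal computation may differ.

```r
# Toy predictions: two folds, one hyperparameter value.
preds <- data.frame(
  fold = rep(1:2, each = 4),
  mtry = 2,
  obs  = factor(c("A", "A", "B", "B", "A", "A", "B", "B")),
  pred = factor(c("A", "A", "B", "A", "A", "B", "B", "A"))
)

# Accuracy and Cohen's Kappa from observed and predicted labels.
accuracy_kappa <- function(obs, pred) {
  lev <- union(levels(obs), levels(pred))
  tab <- table(factor(obs, lev), factor(pred, lev))
  n   <- sum(tab)
  po  <- sum(diag(tab)) / n                        # observed agreement
  pe  <- sum(rowSums(tab) * colSums(tab)) / n^2    # agreement expected by chance
  c(Accuracy = po, Kappa = (po - pe) / (1 - pe))
}

# One row per fold and hyperparameter value (Resample_matrix-like).
per_fold <- do.call(rbind, lapply(
  split(preds, list(preds$fold, preds$mtry), drop = TRUE),
  function(d) data.frame(fold = d$fold[1], mtry = d$mtry[1],
                         t(accuracy_kappa(d$obs, d$pred)))
))

# Mean and standard deviation across folds (Results_folds-like).
means <- aggregate(cbind(Accuracy, Kappa) ~ mtry, data = per_fold, FUN = mean)
sds   <- aggregate(cbind(Accuracy, Kappa) ~ mtry, data = per_fold, FUN = sd)
names(sds)[-1] <- c("AccuracySD", "KappaSD")
results <- merge(means, sds, by = "mtry")
```

The `AccuracySD` and `KappaSD` names follow the convention caret uses in its own results table.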
Details
This function performs the following:
1. Trains models for each fold and hyperparameter combination.
2. Predicts on the held-out test data of each fold.
3. Aggregates prediction results and evaluates Accuracy and Kappa for each fold and hyperparameter set.
4. Selects the best-performing hyperparameter set based on mean Accuracy across folds.
5. Trains the final model on the full dataset using the selected hyperparameters.
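The steps above can be sketched end to end in base R. This is not the package's implementation: a small k-nearest-neighbours function stands in for the caret model, the fold layout (`train`, `test`, `label`) is hypothetical, the loop runs sequentially, and only Accuracy is computed to keep the example short.

```r
# Stand-in learner (assumption): k-nearest neighbours by majority vote.
knn_predict <- function(train_x, train_y, test_x, k) {
  train_x <- as.matrix(train_x)
  test_x  <- as.matrix(test_x)
  apply(test_x, 1, function(row) {
    d     <- sqrt(colSums((t(train_x) - row)^2))  # distance to each training row
    votes <- table(train_y[order(d)[seq_len(k)]])
    names(votes)[which.max(votes)]
  })
}

# Toy data and pre-built folds.
set.seed(1)
n   <- 90
dat <- data.frame(
  x1    = c(rnorm(45, 0), rnorm(45, 2)),
  x2    = c(rnorm(45, 0), rnorm(45, 2)),
  label = factor(rep(c("A", "B"), each = 45))
)
fold_id <- sample(rep_len(1:3, n))
folds <- lapply(1:3, function(f) {
  list(train = dat[fold_id != f, ], test = dat[fold_id == f, ])
})
feats     <- c("x1", "x2")
tune_grid <- data.frame(k = c(1, 5, 9))

# Train on each fold for each grid row, predict on the held-out part,
# and score every fold and hyperparameter combination.
per_fold <- do.call(rbind, lapply(seq_along(folds), function(f) {
  tr <- folds[[f]]$train
  te <- folds[[f]]$test
  do.call(rbind, lapply(tune_grid$k, function(k) {
    pred <- knn_predict(tr[, feats], tr$label, te[, feats], k)
    data.frame(fold = f, k = k, Accuracy = mean(pred == te$label))
  }))
}))

# Pick the grid row with the highest mean Accuracy across folds.
mean_acc  <- aggregate(Accuracy ~ k, data = per_fold, FUN = mean)
best_tune <- mean_acc[which.max(mean_acc$Accuracy), "k", drop = FALSE]

# "Final model" on the full dataset; for kNN that is the stored data
# plus the selected k.
final_model <- list(x = dat[, feats], y = dat$label, k = best_tune$k)
```

Because `which.max()` returns the first maximum, ties in mean Accuracy resolve to the earliest row of the grid in this sketch; how the package breaks ties is not stated here.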