This function prepares a dataset for k-fold cross-validation with the multideconv framework. For each fold, it builds training and test datasets by computing deconvolution subgroup features from the deconvolution matrix. It also processes the entire dataset once to provide a final processed training set.

Usage

prepare_multideconv_folds(deconv, folds, coldata)

Arguments

folds

A list of integer vectors indicating row indices for the training set in each fold. The test set is implicitly defined as the complement.
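
As an illustration of the expected shape of folds, the sketch below builds a named list of training row indices in base R and recovers the implicit test set for one fold. The sample count, the number of folds, and the fold names are arbitrary choices made for this example.

```r
set.seed(42)
n_samples <- 20  # illustrative sample count
k <- 5           # illustrative number of folds

# Randomly assign every sample to exactly one held-out group
fold_id <- sample(rep(seq_len(k), length.out = n_samples))

# Each list element holds the TRAINING row indices of one fold
folds <- lapply(seq_len(k), function(i) which(fold_id != i))
names(folds) <- paste0("Fold", seq_len(k))

# The test set of a fold is the complement of its training indices
test_idx <- setdiff(seq_len(n_samples), folds[["Fold1"]])
```

Naming the list elements is optional; per the Value section, a name is carried through as fold_name when present.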

coldata

A data frame of sample metadata (e.g., sample annotations). It must match the number and order of samples in deconv.

deconv

A matrix or data frame of deconvolution features (samples x features) that also contains a column named target with the class labels.
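
A quick sanity check before calling the function can catch misaligned inputs. The sketch below builds a toy feature table and a matching coldata, then verifies the documented requirements. The feature names, the batch column, and the use of row names to check sample order are assumptions made for this example, not requirements stated by the package.

```r
set.seed(1)
sample_ids <- paste0("S", 1:6)

# Toy feature table (samples x features) with the required target column;
# the feature names are placeholders
deconv <- data.frame(
  feature_1 = runif(6),
  feature_2 = runif(6),
  target = factor(rep(c("R", "NR"), each = 3)),
  row.names = sample_ids
)

# Toy sample metadata in the same sample order
coldata <- data.frame(batch = rep(c("A", "B"), 3), row.names = sample_ids)

# Checks mirroring the documented requirements
stopifnot(
  "target" %in% colnames(deconv),
  nrow(coldata) == nrow(deconv),
  identical(rownames(coldata), rownames(deconv))  # assumes row names carry sample IDs
)
```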

Value

A list of two elements:

  • processed_folds: A list of folds, where each fold contains:

    • train_data: Processed training data with cell group features and target column.

    • test_data: Test data projected into the learned cell group feature space.

    • obs_test: True class labels for the test set.

    • rowIndex: Row indices corresponding to the test set.

    • fold_name: Optional fold name if provided in the folds list.

  • train_cell_data_final: Final cell group feature matrix for the full dataset, including the target column.

Details

For each fold, the function runs compute.deconvolution.analysis() on the training set and applies the resulting projection to the test set, so the test samples are represented in the cell group feature space learned from the training samples. It also runs multideconv on the full dataset to return the complete processed training set.
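
The fit-on-train, project-on-test pattern described above can be sketched in base R. This is not the package's implementation: fit_stub() and project_stub() are hypothetical stand-ins (simple column centering) that take the place of the multideconv analysis, and the toy data is random. The sketch only shows how the documented output elements relate to the training indices in folds.

```r
set.seed(7)
# Toy data: 12 samples, 3 placeholder features, plus the target column
deconv <- data.frame(f1 = rnorm(12), f2 = rnorm(12), f3 = rnorm(12),
                     target = rep(c("A", "B"), 6))

# Training row indices for 3 folds; the test set is the complement
fold_id <- rep(1:3, length.out = 12)
folds <- lapply(1:3, function(i) which(fold_id != i))

# Hypothetical stand-in for the training-side analysis: learns column means
fit_stub <- function(train_x) list(center = colMeans(train_x))
# Hypothetical stand-in for the projection: applies the learned centering
project_stub <- function(x, fit) sweep(as.matrix(x), 2, fit$center)

feature_cols <- setdiff(colnames(deconv), "target")

processed_folds <- lapply(folds, function(train_idx) {
  test_idx <- setdiff(seq_len(nrow(deconv)), train_idx)
  # Fit on training rows only, so nothing from the test rows leaks in
  fit <- fit_stub(deconv[train_idx, feature_cols])
  list(
    train_data = data.frame(project_stub(deconv[train_idx, feature_cols], fit),
                            target = deconv$target[train_idx]),
    test_data = data.frame(project_stub(deconv[test_idx, feature_cols], fit)),
    obs_test = deconv$target[test_idx],
    rowIndex = test_idx
  )
})

# One pass over the full dataset, analogous to train_cell_data_final
full_fit <- fit_stub(deconv[, feature_cols])
train_cell_data_final <- data.frame(
  project_stub(deconv[, feature_cols], full_fit),
  target = deconv$target
)
```

Fitting inside the loop, on the training rows only, is what keeps each fold's test set a fair estimate of performance on unseen samples.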