The function calculates cell abundance based on cell type signatures using different methods and signatures. Methods available are Quantiseq, MCP, XCell, CIBERSORTx, EpiDISH, DWLS and DeconRNASeq. Provided signatures included signatures based on bulk and methylation data (7 methods and 10 signature in total). Signatures are present in the src/signatures directory, user can add its own signatures by adding the .txt files in this same folder. Second generation methods to perform deconvolution based on single cell data are also available if scRNAseq object is provided.
Usage
compute.deconvolution(
raw.counts,
methods = c("Quantiseq", "CBSX", "Epidish", "DeconRNASeq", "DWLS"),
signatures_exclude = NULL,
normalized = TRUE,
doParallel = FALSE,
workers = NULL,
return = TRUE,
create_signature = FALSE,
credentials.mail = NULL,
credentials.token = NULL,
sc_deconv = FALSE,
sc_matrix = NULL,
sc_metadata = NULL,
methods_sc = c("Autogenes", "BayesPrism", "Bisque", "CPM", "MuSic", "SCDC"),
cell_label = NULL,
sample_label = NULL,
cell_markers = NULL,
methods_sig = c("DWLS", "CIBERSORTx", "MOMF", "BSeqsc"),
name_sc_signature = NULL,
file_name = NULL
)
Arguments
- raw.counts
A matrix with the raw counts (samples as columns and genes symbols as rows)
- methods
A character vector with the deconvolution methods to run. Default are "Quantiseq", "MCP", "xCell", "CBSX", "Epidish", "DeconRNASeq", "DWLS"
- signatures_exclude
A character vector with the signatures to exclude from the src/signatures folder.
- normalized
If raw.counts are not available, user can input its normalized counts. In that case this argument need to be set to False.
- doParallel
Whether to do or not parallelization. Only CBSX and DWLS methods will run in parallel.
- workers
Number of processes available to run on parallel. If no number is set, this will correspond to detectCores() - 1
- return
Whether to save or not the csv file with the deconvolution features
- create_signature
Whether to create or not the signatures using the methods MOMF, CBSX, DWLS and BSeq-SC. If TRUE, sc_matrix shuld be provide.
- credentials.mail
(Optional) Credential email for running CIBERSORTx. If not provided, CIBERSORTx method will not be run.
- credentials.token
(Optional) Credential token for running CIBERSORTx. If not provided, CIBERSORTx method will not be run.
- sc_deconv
Whether to run or not deconvolution methods based on single cell.
- sc_matrix
If sc_deconv = T, the matrix of counts across cells from the scRNAseq object is provided.
- sc_metadata
Dataframe with metadata from the single cell object. The matrix should include the columns cell_label and sample_label.
- methods_sc
A character vector with the sc-deconvolution methods to run. Default are "Autogenes", "BayesPrism", "Bisque", "CPM", "MuSic", "SCDC"
- cell_label
If sc_deconv = T, a character vector indicating the cell labels (same order as the count matrix)
- sample_label
If sc_deconv = T, a character vector indicating the cell samples IDs (same order as the count matrix)
- cell_markers
Named list with the genes markers names as Symbol per cell types to be used to create the signature using the BSeq-SC method. If NULL, the method will be ignored during the signature creation.
- methods_sig
A character vector specifying which methods to run. Options are "DWLS", "CIBERSORTx", "MOMF", and "BSeqsc". Default runs all available methods.
- name_sc_signature
If sc_deconv = T, the name you want to give to the signature generated
- file_name
File name for the csv files and plots saved in the Results/ directory
References
Sturm, G., Finotello, F., Petitprez, F., Zhang, J. D., Baumbach, J., Fridman, W. H., ..., List, M., Aneichyk, T. (2019). Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology. Bioinformatics, 35(14), i436-i445. https://doi.org/10.1093/bioinformatics/btz363
Benchmarking second-generation methods for cell-type deconvolution of transcriptomic data. Dietrich, Alexander and Merotto, Lorenzo and Pelz, Konstantin and Eder, Bernhard and Zackl, Constantin and Reinisch, Katharina and Edenhofer, Frank and Marini, Federico and Sturm, Gregor and List, Markus and Finotello, Francesca. (2024) https://doi.org/10.1101/2024.06.10.598226
Examples
data("raw_counts")
data("cell_labels")
data("sample_labels")
data("metacells_data")
data("metacells_metadata")
data("pseudobulk")
deconv = compute.deconvolution(raw_counts, normalized = TRUE,
methods = c("Epidish", "DeconRNASeq"), return = FALSE)
#> Performing TPM normalization ................................................................................
#>
#> Converting input to matrix.
#> Running deconvolution using the following methods...............................................................
#>
#> * Epidish
#> * DeconRNASeq
#>
#> The following method-signature combinations are going to be calculated...............................................................
#>
#> Methods
#> * Epidish
#> * DeconRNASeq
#>
#> Signatures
#> * BPRNACan
#> * BPRNACan3DProMet
#> * BPRNACanProMet
#> * CBSX-HNSCC-scRNAseq
#> * CBSX-Melanoma-scRNAseq
#> * CBSX-NSCLC-PBMCs-scRNAseq
#> * CCLE-TIL10
#> * LM22
#> * TIL10
#>
#> Running DeconRNASeq...............................................................
#>
#> Loading required package: Biobase
#> Loading required package: BiocGenerics
#> Loading required package: generics
#>
#> Attaching package: ‘generics’
#> The following objects are masked from ‘package:base’:
#>
#> as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#> setequal, union
#>
#> Attaching package: ‘BiocGenerics’
#> The following objects are masked from ‘package:stats’:
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from ‘package:base’:
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
#> mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#> rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
#> unsplit, which.max, which.min
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#>
#> Attaching package: ‘pcaMethods’
#> The following object is masked from ‘package:stats’:
#>
#> loadings
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#> PC9 PC10
#> R2 0.00393 0.00282
#> Cumulative R2 0.97802 0.98084
#>
#> Attention: the number of pure cell types = 10 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#> PC9 PC10
#> R2 0.00393 0.00282
#> Cumulative R2 0.97802 0.98084
#>
#> Attention: the number of pure cell types = 10 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#> PC9 PC10
#> R2 0.00393 0.00282
#> Cumulative R2 0.97802 0.98084
#>
#> Attention: the number of pure cell types = 10 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#> PC9 PC10
#> R2 0.00393 0.00282
#> Cumulative R2 0.97802 0.98084
#>
#> Attention: the number of pure cell types = 10 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#>
#> Attention: the number of pure cell types = 8 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631
#>
#> Attention: the number of pure cell types = 6 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#> PC9 PC10 PC11
#> R2 0.00393 0.00282 0.00234
#> Cumulative R2 0.97802 0.98084 0.98318
#>
#> Attention: the number of pure cell types = 11 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#> PC9 PC10 PC11 PC12 PC13 PC14 PC15 PC16
#> R2 0.00393 0.00282 0.00234 0.00183 0.00178 0.00162 0.0014 0.00123
#> Cumulative R2 0.97802 0.98084 0.98318 0.98501 0.98679 0.98841 0.9898 0.99104
#> PC17 PC18 PC19 PC20 PC21 PC22
#> R2 0.0011 0.00105 0.00098 0.00083 0.00075 0.00069
#> Cumulative R2 0.9921 0.99319 0.99417 0.99500 0.99575 0.99644
#>
#> Attention: the number of pure cell types = 22 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 16
#>
#> Running Epidish...............................................................
#>
#>
#> Running DeconRNASeq...............................................................
#>
#> svd calculated PCA
#> Importance of component(s):
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8
#> R2 0.6788 0.1242 0.08553 0.05229 0.01417 0.0082 0.00602 0.00494
#> Cumulative R2 0.6788 0.8029 0.88847 0.94076 0.95493 0.9631 0.96915 0.97409
#> PC9 PC10
#> R2 0.00393 0.00282
#> Cumulative R2 0.97802 0.98084
#>
#> Attention: the number of pure cell types = 10 defined in the signature matrix;
#>
#> PCA results indicate that the number of cell types in the mixtures = 1
#>
#> Running Epidish...............................................................
#>
#> Preprocessing deconvolution features...............................................................
#>
#> Checking consistency in deconvolution cell fractions across patients...............................................................
#>
#>
#> Total sum across samples of combination Epidish_BPRNACan_ is 1
#> Total sum across samples of combination Epidish_BPRNACanProMet is 1
#> Total sum across samples of combination Epidish_BPRNACan3DProMet is 1
#> Total sum across samples of combination Epidish_CBSX.HNSCC.scRNAseq is 1
#> Total sum across samples of combination Epidish_CBSX.Melanoma.scRNAseq is 1
#> Total sum across samples of combination Epidish_CBSX.NSCLC.PBMCs.scRNAseq is 1
#> Total sum across samples of combination Epidish_CCLE.TIL10 is 1
#> Total sum across samples of combination Epidish_TIL10 is 1
#> Total sum across samples of combination Epidish_LM22 is 1
#> Total sum across samples of combination DeconRNASeq_BPRNACan_ is 1
#> Total sum across samples of combination DeconRNASeq_BPRNACanProMet is 1
#> Total sum across samples of combination DeconRNASeq_BPRNACan3DProMet is 1
#> Total sum across samples of combination DeconRNASeq_CBSX.HNSCC.scRNAseq is 1
#> Total sum across samples of combination DeconRNASeq_CBSX.Melanoma.scRNAseq is 1
#> Total sum across samples of combination DeconRNASeq_CBSX.NSCLC.PBMCs.scRNAseq is 1
#> Total sum across samples of combination DeconRNASeq_CCLE.TIL10 is 1
#> Total sum across samples of combination DeconRNASeq_TIL10 is 1
#> Total sum across samples of combination DeconRNASeq_LM22 is 1
#> Total sum across samples of combination Epidish_CBSX.Melanoma.scRNAseq is 1