Installation

library(cBioPortalData)
library(AnVIL)
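
The library() calls above assume both packages are already installed. A minimal sketch for installing any that are missing, assuming BiocManager is used as the installer and a CRAN mirror URL is acceptable for fetching BiocManager itself:

```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager", repos = "https://cloud.r-project.org")
}

# Work out which of the two packages are not yet installed
pkgs <- c("cBioPortalData", "AnVIL")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Install only what is missing
if (length(missing_pkgs) > 0) {
    BiocManager::install(missing_pkgs)
}
```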

Introduction

This vignette describes the two main user-facing functions for downloading and representing data from cBioPortal. cBioDataPack uses cBioPortal's legacy data distribution method (packaged tarballs). cBioPortalData offers a more flexible approach, obtaining data through the cBioPortal API based on several parameters, including the available molecular profiles.

Two main interfaces

cBioDataPack: Obtain Study Data as Zipped Tarballs

This function accesses the packaged study data from cBioPortal and returns an integrative MultiAssayExperiment representation.

## Use ask=FALSE for non-interactive use
cBioDataPack("laml_tcga", ask = FALSE)
## A MultiAssayExperiment object of 12 listed
##  experiments with user-defined names and respective classes.
##  Containing an ExperimentList class object of length 12:
##  [1] CNA: SummarizedExperiment with 24776 rows and 191 columns
##  [2] RNA_Seq_expression_median: SummarizedExperiment with 19720 rows and 179 columns
##  [3] RNA_Seq_mRNA_median_all_sample_Zscores: SummarizedExperiment with 19720 rows and 179 columns
##  [4] RNA_Seq_v2_expression_median: SummarizedExperiment with 20531 rows and 173 columns
##  [5] RNA_Seq_v2_mRNA_median_Zscores: SummarizedExperiment with 20440 rows and 173 columns
##  [6] RNA_Seq_v2_mRNA_median_all_sample_Zscores: SummarizedExperiment with 20531 rows and 173 columns
##  [7] cna_hg19.seg: RaggedExperiment with 13571 rows and 191 columns
##  [8] linear_CNA: SummarizedExperiment with 24776 rows and 191 columns
##  [9] methylation_hm27: SummarizedExperiment with 10919 rows and 194 columns
##  [10] methylation_hm450: SummarizedExperiment with 10919 rows and 194 columns
##  [11] mutations_extended: RaggedExperiment with 2584 rows and 197 columns
##  [12] mutations_mskcc: RaggedExperiment with 2584 rows and 197 columns
## Functionality:
##  experiments() - obtain the ExperimentList instance
##  colData() - the primary/phenotype DataFrame
##  sampleMap() - the sample coordination DataFrame
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment
##  *Format() - convert into a long or wide DataFrame
##  assays() - convert ExperimentList to a SimpleList of matrices
##  exportClass() - save all data to files
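
The accessors listed under "Functionality" can be used to inspect the returned object. The sketch below assumes network access and that the experiment names still match the output shown above (for example "CNA"); the object name laml is chosen here for illustration.

```r
library(cBioPortalData)
library(MultiAssayExperiment)

# Download (or load from cache) the packaged study data
laml <- cBioDataPack("laml_tcga", ask = FALSE)

# Names of the experiments held in the ExperimentList
names(experiments(laml))

# Column names of the clinical / phenotype DataFrame
head(colnames(colData(laml)))

# Extract a single experiment with `[[` if it is present
if ("CNA" %in% names(laml)) {
    cna <- laml[["CNA"]]
    dim(cna)
}
```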

cBioPortalData: Obtain data from the cBioPortal API

This function provides a more flexible and granular way to request a MultiAssayExperiment object based on a study ID, molecular profiles, a gene panel, and a sample list.

cbio <- cBioPortal()
acc <- cBioPortalData(api = cbio, by = "hugoGeneSymbol", studyId = "acc_tcga",
    genePanelId = "IMPACT341",
    molecularProfileIds = c("acc_tcga_rppa", "acc_tcga_linear_CNA")
)
## harmonizing input:
##   removing 1 colData rownames not in sampleMap 'primary'
acc
## A MultiAssayExperiment object of 2 listed
##  experiments with user-defined names and respective classes.
##  Containing an ExperimentList class object of length 2:
##  [1] acc_tcga_rppa: SummarizedExperiment with 57 rows and 46 columns
##  [2] acc_tcga_linear_CNA: SummarizedExperiment with 339 rows and 90 columns
## Functionality:
##  experiments() - obtain the ExperimentList instance
##  colData() - the primary/phenotype DataFrame
##  sampleMap() - the sample coordination DataFrame
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment
##  *Format() - convert into a long or wide DataFrame
##  assays() - convert ExperimentList to a SimpleList of matrices
##  exportClass() - save all data to files
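
The identifiers passed to cBioPortalData have to come from somewhere. As a hedged sketch, the package's query helpers (getStudies, molecularProfiles, genePanels, sampleLists) can be used to look them up. The column names used below (studyId, molecularProfileId, genePanelId, sampleListId) are assumed to mirror the API field names, and network access is required.

```r
library(cBioPortalData)

cbio <- cBioPortal()

# Studies available through the API
studies <- getStudies(cbio)
head(studies[["studyId"]])

# Molecular profiles for one study; these feed 'molecularProfileIds'
profiles <- molecularProfiles(cbio, studyId = "acc_tcga")
profiles[["molecularProfileId"]]

# Gene panels; these feed 'genePanelId'
panels <- genePanels(cbio)
head(panels[["genePanelId"]])

# Sample lists defined for the study
lists <- sampleLists(cbio, studyId = "acc_tcga")
lists[["sampleListId"]]
```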

Clearing the cache

cBioDataPack

An interrupted download can leave the cache corrupted. The user can clear the cache for a particular study with the removeCache function. Note that this function only works for data downloaded through the cBioDataPack function.

removeCache("laml_tcga")
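
One way to combine the two functions is to retry a failed download after clearing the study's cache. This is only a sketch: it assumes a corrupt cache surfaces as an R error from cBioDataPack, which may not hold in every case, and it retries a single time.

```r
library(cBioPortalData)

# Try the download once; on error, clear the study's cache and retry once
laml <- tryCatch(
    cBioDataPack("laml_tcga", ask = FALSE),
    error = function(e) {
        message("Download failed: ", conditionMessage(e))
        message("Clearing cached files for 'laml_tcga' and retrying")
        removeCache("laml_tcga")
        cBioDataPack("laml_tcga", ask = FALSE)
    }
)
```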

cBioPortalData

To clear the entire cBioPortalData cache, it is recommended to remove the cache directory:

## recursive = TRUE is needed for unlink() to delete a directory
unlink("~/.cache/cBioPortalData/", recursive = TRUE)
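
Because unlink() only deletes directories when recursive = TRUE, a small wrapper can make the intent explicit and report whether the directory is actually gone. The helper name clear_cache_dir is hypothetical (it is not part of cBioPortalData), and the sketch runs against a throwaway directory rather than the real cache.

```r
# Hypothetical helper (not part of cBioPortalData): remove a cache
# directory and report whether it is gone afterwards.
clear_cache_dir <- function(path) {
    path <- path.expand(path)
    if (!dir.exists(path)) {
        return(FALSE)  # nothing to remove
    }
    # recursive = TRUE is required for unlink() to delete a directory
    unlink(path, recursive = TRUE)
    !dir.exists(path)
}

# Demonstrate on a temporary directory instead of the real cache
demo_dir <- file.path(tempdir(), "cBioPortalData-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "dummy.txt"))
removed <- clear_cache_dir(demo_dir)
removed
```

For the real cache, the corresponding call would be clear_cache_dir("~/.cache/cBioPortalData").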

sessionInfo

sessionInfo()
## R Under development (unstable) (2021-01-05 r79797)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] cBioPortalData_2.3.16        MultiAssayExperiment_1.17.10
##  [3] SummarizedExperiment_1.21.1  Biobase_2.51.0              
##  [5] GenomicRanges_1.43.3         GenomeInfoDb_1.27.6         
##  [7] IRanges_2.25.6               S4Vectors_0.29.7            
##  [9] BiocGenerics_0.37.1          MatrixGenerics_1.3.1        
## [11] matrixStats_0.58.0           AnVIL_1.3.18                
## [13] dplyr_1.0.4                  BiocStyle_2.19.1            
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-6              bit64_4.0.5              
##  [3] filelock_1.0.2            progress_1.2.2           
##  [5] httr_1.4.2                GenomicDataCommons_1.15.0
##  [7] tools_4.1.0               R6_2.5.0                 
##  [9] DBI_1.1.1                 withr_2.4.1              
## [11] tidyselect_1.1.0          prettyunits_1.1.1        
## [13] TCGAutils_1.11.7          bit_4.0.4                
## [15] curl_4.3                  compiler_4.1.0           
## [17] cli_2.3.0                 rvest_0.3.6              
## [19] formatR_1.7               xml2_1.3.2               
## [21] DelayedArray_0.17.7       rtracklayer_1.51.4       
## [23] bookdown_0.21             readr_1.4.0              
## [25] askpass_1.1               rappdirs_0.3.3           
## [27] rapiclient_0.1.3          RCircos_1.2.1            
## [29] Rsamtools_2.7.1           stringr_1.4.0            
## [31] digest_0.6.27             rmarkdown_2.6            
## [33] XVector_0.31.1            pkgconfig_2.0.3          
## [35] htmltools_0.5.1.1         dbplyr_2.1.0             
## [37] fastmap_1.1.0             limma_3.47.6             
## [39] rlang_0.4.10              rstudioapi_0.13          
## [41] RSQLite_2.2.3             BiocIO_1.1.2             
## [43] generics_0.1.0            jsonlite_1.7.2           
## [45] BiocParallel_1.25.4       RCurl_1.98-1.2           
## [47] magrittr_2.0.1            GenomeInfoDbData_1.2.4   
## [49] futile.logger_1.4.3       Matrix_1.3-2             
## [51] Rcpp_1.0.6                lifecycle_0.2.0          
## [53] stringi_1.5.3             yaml_2.2.1               
## [55] RaggedExperiment_1.15.1   RJSONIO_1.3-1.4          
## [57] zlibbioc_1.37.0           BiocFileCache_1.15.1     
## [59] grid_4.1.0                blob_1.2.1               
## [61] crayon_1.4.0              lattice_0.20-41          
## [63] Biostrings_2.59.2         splines_4.1.0            
## [65] GenomicFeatures_1.43.3    hms_1.0.0                
## [67] KEGGREST_1.31.1           ps_1.5.0                 
## [69] knitr_1.31                pillar_1.4.7             
## [71] rjson_0.2.20              codetools_0.2-18         
## [73] biomaRt_2.47.4            futile.options_1.0.1     
## [75] XML_3.99-0.5              glue_1.4.2               
## [77] evaluate_0.14             lambda.r_1.2.4           
## [79] data.table_1.13.6         BiocManager_1.30.10      
## [81] vctrs_0.3.6               png_0.1-7                
## [83] tidyr_1.1.2               openssl_1.4.3            
## [85] purrr_0.3.4               assertthat_0.2.1         
## [87] cachem_1.0.3              xfun_0.20                
## [89] restfulr_0.0.13           survival_3.2-7           
## [91] tibble_3.0.6              RTCGAToolbox_2.21.5      
## [93] GenomicAlignments_1.27.2  AnnotationDbi_1.53.1     
## [95] memoise_2.0.0             ellipsis_0.3.1