R/download_study.R
download_study.Rd
Download the gene- or exon-level RangedSummarizedExperiment-class objects provided by the recount project. Alternatively, download the counts, metadata, or file information for a given SRA study id. You can also download the sample bigWig files or the mean coverage bigWig file.
download_study(
  project,
  type = "rse-gene",
  outdir = project,
  download = TRUE,
  version = 2,
  ...
)
project: A character vector with one SRA study id.
type: Specifies which files to download. The options are:
rse-gene: the gene-level RangedSummarizedExperiment-class object in a file named rse_gene.Rdata.
rse-exon: the exon-level RangedSummarizedExperiment-class object in a file named rse_exon.Rdata.
rse-jx: the exon-exon junction level RangedSummarizedExperiment-class object in a file named rse_jx.Rdata.
rse-tx: the transcript level RangedSummarizedExperiment-class object in a file named rse_tx.RData.
counts-gene: the gene-level counts in a tsv file named counts_gene.tsv.gz.
counts-exon: the exon-level counts in a tsv file named counts_exon.tsv.gz.
counts-jx: the exon-exon junction level counts in a tsv file named counts_jx.tsv.gz.
phenotype: the phenotype data for the study in a tsv file named after the study id, that is, the value of project followed by .tsv.
files-info: the files information for the given study (including md5sum hashes) in a tsv file named files_info.tsv.
samples: one bigWig file per sample in the study.
mean: one mean bigWig file for the samples in the study, with each sample normalized to a 40 million 100 bp library using the total coverage sum (area under the coverage curve, AUC) for the given sample.
all: Downloads all the above types. Note that it might take some time if the project has many samples. When using type = 'all', a small delay is added before each download request to avoid request issues.
rse-fc: Downloads the FANTOM-CAT/recount2 rse file described in Imada, Sanchez, et al., bioRxiv, 2019.
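The normalization used for the mean bigWig file can be illustrated with a small base R sketch. This is one reading of the description above, not the code recount uses to build the files: each sample's coverage is rescaled so that its AUC equals that of a library of 40 million 100 bp reads, and the scaled samples are then averaged base by base. The coverage vectors and sample names are hypothetical toy data.

```r
## Hypothetical toy data: per-base coverage for two samples
## (illustrative values, not real recount data).
coverage <- list(
    sampleA = c(0, 2, 4, 4, 2, 0),
    sampleB = c(2, 2, 6, 10, 2, 2)
)

## Target library: 40 million reads of 100 bp each.
target_auc <- 40e6 * 100

## The AUC of a sample is its total coverage sum.
auc <- vapply(coverage, sum, numeric(1))

## Rescale each sample so that its AUC matches the target.
scaled <- Map(function(cov, a) cov * target_auc / a, coverage, auc)

## Average the scaled samples base by base.
mean_coverage <- Reduce(`+`, scaled) / length(scaled)
```

After scaling, every sample has the same total coverage, so deeply sequenced samples do not dominate the mean.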
outdir: The destination directory for the downloaded file(s). Alternatively, check the SciServer section of the vignette to see how to access all the recount data via an R Jupyter Notebook.
download: Whether to download the files or just get the download URLs.
version: A single integer specifying which version of the files to download. Valid options are 1 and 2, as described in https://jhubiostatistics.shinyapps.io/recount/ under the documentation tab. Briefly, version 1 counts are based on reduced exons, while version 2 counts are based on disjoint exons. This argument mostly matters for the exon counts. Defaults to version 2 (disjoint exons). Use version = 1 for backward compatibility with exon counts prior to version 1.5.3 of the package.
...: Additional arguments passed to download.
Returns invisibly the URL(s) for the files that were downloaded.
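Because the value is returned invisibly, it has to be assigned to be kept. The sketch below collects the URLs for several file types without downloading anything. It assumes the recount package is installed (the code is skipped otherwise) and that the type values other than "rse-gene" match the option labels used by the package. It loops over the types one at a time rather than passing them as a vector.

```r
## Assumes the recount package is installed; the block is skipped otherwise.
if (requireNamespace("recount", quietly = TRUE)) {
    ## File types to look up (values other than "rse-gene" are assumed labels).
    types <- c("rse-gene", "counts-gene", "phenotype")

    ## download = FALSE only builds the URLs, so nothing is written to disk.
    ## The result is returned invisibly, so assign it to keep it.
    urls <- lapply(types, function(type) {
        recount::download_study("SRP009615", type = type, download = FALSE)
    })
    names(urls) <- types

    ## Print the collected URLs explicitly.
    print(urls)
}
```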
Check http://stackoverflow.com/a/34383991 if you need to find the effective URLs. For example, http://duffel.rail.bio/recount/DRP000366/bw/mean_DRP000366.bw points to a link from SciServer.
Transcript quantifications are described in Fu et al, bioRxiv, 2018. https://www.biorxiv.org/content/10.1101/247346v2
FANTOM-CAT/recount2 quantifications are described in Imada, Sanchez, et al., bioRxiv, 2019. https://www.biorxiv.org/content/10.1101/659490v1
## Find the URL to download the RangedSummarizedExperiment for the
## Geuvadis consortium study.
url <- download_study("ERP001942", download = FALSE)
## See the actual URL
url
#> [1] "http://duffel.rail.bio/recount/v2/ERP001942/rse_gene.Rdata"
if (FALSE) {
    ## Download the example data included in the package for study SRP009615
    url2 <- download_study("SRP009615")
    url2

    ## Load the data
    load(file.path("SRP009615", "rse_gene.Rdata"))

    ## Compare the data
    library("testthat")
    expect_equivalent(rse_gene, rse_gene_SRP009615)
}
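The files information table includes md5sum hashes, which can be used to check that a download completed correctly. The following base R sketch shows the idea on a toy file. The column names file_name and md5sum and the in-memory table are assumptions for illustration; the actual layout of files_info.tsv may differ, so inspect the file before adapting this.

```r
## Hypothetical stand-in for a downloaded file (toy content, not real data).
outdir <- tempfile("study_")
dir.create(outdir)
data_file <- file.path(outdir, "counts_gene.tsv.gz")
writeLines("toy content", data_file)

## Hypothetical stand-in for files_info.tsv. The column names are assumed.
## The expected hash is computed from the toy file itself, purely so the
## example is self-contained; in practice it would be read from the table.
files_info <- data.frame(
    file_name = "counts_gene.tsv.gz",
    md5sum = unname(tools::md5sum(data_file)),
    stringsAsFactors = FALSE
)

## Recompute the hashes of the local files and compare them with the table.
observed <- tools::md5sum(file.path(outdir, files_info$file_name))
ok <- unname(observed) == files_info$md5sum
stopifnot(all(ok))
```

A mismatch would point to an incomplete or corrupted download, in which case the file should be downloaded again.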