This report creates the CSV files with the candidate differentially expressed regions (DERs) found using derfinder
(Collado-Torres, Frazee, Love, Irizarry, et al., 2015) via the single-base level approach on the Snyder and Hippo data sets. It also has Venn diagrams illustrating the overlap with known annotation.
The following code shows how to generate the CSV files from the R
objects.
## Setup
suppressMessages(library("GenomicRanges"))
suppressMessages(library("derfinder"))
suppressMessages(library("derfinderPlot"))
## Experiments
exps <- c('snyder', 'hippo')
## Find the latest run from each experiment
run <- sapply(exps, function(exp) {
runs <- dir(file.path('..', exp, 'derAnalysis'), pattern = 'run')
runs[length(runs)]
})
## Load regions and annotation data
regions <- lapply(exps, function(exp) {
load(file.path('..', exp, 'derAnalysis', run[exp], 'fullRegions.Rdata'))
res <- fullRegions
names(res) <- seq_len(length(res))
return(res)
})
anno <- lapply(exps, function(exp) {
load(file.path('..', exp, 'derAnalysis', run[exp],
'fullAnnotatedRegions.Rdata'))
## Fix region labels: its intergenic, not intragenic
if('intragenic' %in% names(fullAnnotatedRegions$countTable)) {
## Fix countTable
names(fullAnnotatedRegions$countTable)[names(fullAnnotatedRegions$countTable) == 'intragenic'] <- 'intergenic'
}
res <- fullAnnotatedRegions
return(res)
})
## Fix names
names(regions) <- names(anno) <- exps
## Perform check and create csv files
check <- vector("list", length(exps))
names(check) <- exps
for(exp in exps) {
## Peform check
check[[exp]] <- identical(nrow(anno[[exp]]$countTable),
length(regions[[exp]]))
## Export regions information into plain text
write.csv(as.data.frame(regions[[exp]]), file = paste0("supplementaryFile",
which(exp == exps), ".csv"), quote = FALSE, row.names = FALSE)
## You can later read them in R using:
# read.csv("supplementaryFile1.csv")
# read.csv("supplementaryFile2.csv")
## Compress
system(paste0('gzip supplementaryFile', which(exp == exps), '.csv'))
}
## Check that the rows match
unlist(check)
## snyder hippo
## TRUE TRUE
CSV files correspond to experiments snyder, hippo from runs run3-v1.0.10, run3-v1.0.10 respectively.
The following Venn diagrams show how many candidate differentially expressed regions (DERs) overlap with an exon, intron, or intergenic regions of the UCSC hg19 knownGene annotation. By default, a minimum overlap of 20 base pairs is required to say that a candidate DER overlaps any of feature. The annotation overlap was done using the mergeResults()
function from derfinder
(Collado-Torres, Frazee, Love, Irizarry, et al., 2015) while the Venn diagrams were made using derfinderPlot
(Collado-Torres, Jaffe, and Leek, 2015).
A maximum of three Venn diagrams are shown per data set. The first one uses all the candidate DERs, the second one uses only the candidate DERs that had a significant q-value (by default less than 0.10), and the third uses only the candidate DERs that had a FWER adjusted p-value less than 0.05.
for(exp in exps) {
## Using all candidate DERs
vennRegions(anno[[exp]], main=paste("\n", exp, "using UCSC.hg19.knownGene"),
counts.col="blue")
## Using candidate DERs with a significant q-value
if(sum(regions[[exp]]$significantQval == "TRUE") > 0)
vennRegions(anno[[exp]], regions[[exp]]$significantQval == "TRUE",
main=paste("\n\n", exp, "using UCSC.hg19.knownGene\nRestricted to significant q-value candidate DERs"),
counts.col = "blue")
## Using candidate DERs with a significant FWER adjusted p-value
if(sum(regions[[exp]]$significantFWER == "TRUE") > 0)
vennRegions(anno[[exp]], regions[[exp]]$significantFWER == "TRUE",
main=paste("\n\n", exp, "using UCSC.hg19.knownGene\nRestricted to significant FWER adjusted p-value candidate DERs"),
counts.col="blue")
}
The supplementary files 1 and 2 contain the information for the candidate DERs found in CSV format. The CSV files contain the following columns:
qvalue
(with contributions from Andrew J. Bass, Dabney, and Robinson, 2015) function from the package with the same name.annotateNearest()
from the bumphunter
package. They were calculated using the UCSC hg19 annotation. name refers to the nearest gene.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.bumphunter::annotateNearest()
.Date the report was generated.
## [1] "2016-02-20 10:27:43 EST"
Wallclock time spent generating the report.
## Time difference of 21.416 secs
R
session information.
## Session info -----------------------------------------------------------------------------------------------------------
## setting value
## version R version 3.2.2 (2015-08-14)
## system x86_64, darwin13.4.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## tz America/New_York
## date 2016-02-20
## Packages ---------------------------------------------------------------------------------------------------------------
## package * version date source
## acepack 1.3-3.3 2014-11-24 CRAN (R 3.2.0)
## AnnotationDbi 1.32.3 2015-12-24 Bioconductor
## bibtex 0.4.0 2014-12-31 CRAN (R 3.2.0)
## Biobase 2.30.0 2015-10-14 Bioconductor
## BiocGenerics * 0.16.1 2015-11-06 Bioconductor
## BiocInstaller 1.20.1 2015-11-18 Bioconductor
## BiocParallel 1.4.3 2015-12-16 Bioconductor
## BiocStyle * 1.8.0 2015-10-14 Bioconductor
## biomaRt 2.26.1 2015-11-23 Bioconductor
## Biostrings 2.38.3 2016-01-02 Bioconductor
## biovizBase 1.18.0 2015-10-14 Bioconductor
## bitops 1.0-6 2013-08-17 CRAN (R 3.2.0)
## BSgenome 1.38.0 2015-10-14 Bioconductor
## bumphunter 1.10.0 2015-10-14 Bioconductor
## Cairo 1.5-9 2015-09-26 CRAN (R 3.2.0)
## cluster 2.0.3 2015-07-21 CRAN (R 3.2.2)
## codetools 0.2-14 2015-07-15 CRAN (R 3.2.2)
## colorspace 1.2-6 2015-03-11 CRAN (R 3.2.0)
## DBI 0.3.1 2014-09-24 CRAN (R 3.2.0)
## derfinder * 1.5.19 2015-12-15 Bioconductor
## derfinderHelper 1.4.1 2015-11-03 Bioconductor
## derfinderPlot * 1.4.1 2015-11-03 Bioconductor
## devtools 1.10.0 2016-01-23 CRAN (R 3.2.3)
## dichromat 2.0-0 2013-01-24 CRAN (R 3.2.0)
## digest 0.6.9 2016-01-08 CRAN (R 3.2.3)
## doRNG 1.6 2014-03-07 CRAN (R 3.2.0)
## evaluate 0.8 2015-09-18 CRAN (R 3.2.0)
## foreach 1.4.3 2015-10-13 CRAN (R 3.2.0)
## foreign 0.8-66 2015-08-19 CRAN (R 3.2.0)
## formatR 1.2.1 2015-09-18 CRAN (R 3.2.0)
## Formula 1.2-1 2015-04-07 CRAN (R 3.2.0)
## futile.logger 1.4.1 2015-04-20 CRAN (R 3.2.0)
## futile.options 1.0.0 2010-04-06 CRAN (R 3.2.0)
## GenomeInfoDb * 1.6.3 2016-01-26 Bioconductor
## GenomicAlignments 1.6.3 2016-01-06 Bioconductor
## GenomicFeatures 1.22.12 2016-01-28 Bioconductor
## GenomicFiles 1.6.2 2015-12-30 Bioconductor
## GenomicRanges * 1.22.4 2016-01-30 Bioconductor
## GGally 1.0.1 2016-01-14 CRAN (R 3.2.3)
## ggbio 1.18.3 2016-01-13 Bioconductor
## ggplot2 2.0.0 2015-12-18 CRAN (R 3.2.3)
## graph 1.48.0 2015-10-14 Bioconductor
## gridExtra 2.0.0 2015-07-14 CRAN (R 3.2.0)
## gtable 0.1.2 2012-12-05 CRAN (R 3.2.0)
## Hmisc 3.17-1 2015-12-18 CRAN (R 3.2.3)
## htmltools 0.3 2015-12-29 CRAN (R 3.2.3)
## httr 1.1.0 2016-01-28 CRAN (R 3.2.3)
## IRanges * 2.4.6 2015-12-12 Bioconductor
## iterators 1.0.8 2015-10-13 CRAN (R 3.2.0)
## knitcitations * 1.0.7 2015-10-28 CRAN (R 3.2.0)
## knitr 1.12.3 2016-01-22 CRAN (R 3.2.3)
## lambda.r 1.1.7 2015-03-20 CRAN (R 3.2.0)
## lattice 0.20-33 2015-07-14 CRAN (R 3.2.2)
## latticeExtra 0.6-26 2013-08-15 CRAN (R 3.2.0)
## limma 3.26.7 2016-01-28 Bioconductor
## locfit 1.5-9.1 2013-04-20 CRAN (R 3.2.0)
## lubridate 1.5.0 2015-12-03 CRAN (R 3.2.3)
## magrittr 1.5 2014-11-22 CRAN (R 3.2.0)
## Matrix 1.2-3 2015-11-28 CRAN (R 3.2.2)
## matrixStats 0.50.1 2015-12-15 CRAN (R 3.2.3)
## memoise 1.0.0 2016-01-29 CRAN (R 3.2.3)
## munsell 0.4.2 2013-07-11 CRAN (R 3.2.0)
## nnet 7.3-11 2015-08-30 CRAN (R 3.2.0)
## OrganismDbi 1.12.1 2015-12-23 Bioconductor
## pkgmaker 0.22 2014-05-14 CRAN (R 3.2.0)
## plyr 1.8.3 2015-06-12 CRAN (R 3.2.1)
## qvalue 2.2.2 2016-01-08 Bioconductor
## R6 2.1.2 2016-01-26 CRAN (R 3.2.3)
## RBGL 1.46.0 2015-10-14 Bioconductor
## RColorBrewer 1.1-2 2014-12-07 CRAN (R 3.2.0)
## Rcpp 0.12.3 2016-01-10 CRAN (R 3.2.3)
## RCurl 1.95-4.7 2015-06-30 CRAN (R 3.2.1)
## RefManageR 0.10.5 2016-01-02 CRAN (R 3.2.3)
## registry 0.3 2015-07-08 CRAN (R 3.2.1)
## reshape 0.8.5 2014-04-23 CRAN (R 3.2.0)
## reshape2 1.4.1 2014-12-06 CRAN (R 3.2.0)
## RJSONIO 1.3-0 2014-07-28 CRAN (R 3.2.0)
## rmarkdown * 0.9.2 2016-01-01 CRAN (R 3.2.3)
## rngtools 1.2.4 2014-03-06 CRAN (R 3.2.0)
## rpart 4.1-10 2015-06-29 CRAN (R 3.2.2)
## Rsamtools 1.22.0 2015-10-14 Bioconductor
## RSQLite 1.0.0 2014-10-25 CRAN (R 3.2.0)
## rtracklayer 1.30.1 2015-10-22 Bioconductor
## S4Vectors * 0.8.11 2016-01-29 Bioconductor
## scales 0.3.0 2015-08-25 CRAN (R 3.2.0)
## stringi 1.0-1 2015-10-22 CRAN (R 3.2.0)
## stringr 1.0.0 2015-04-30 CRAN (R 3.2.0)
## SummarizedExperiment 1.0.2 2016-01-01 Bioconductor
## survival 2.38-3 2015-07-02 CRAN (R 3.2.2)
## VariantAnnotation 1.16.4 2015-12-09 Bioconductor
## XML 3.98-1.3 2015-06-30 CRAN (R 3.2.0)
## xtable 1.8-0 2015-11-02 CRAN (R 3.2.0)
## XVector 0.10.0 2015-10-14 Bioconductor
## yaml 2.1.13 2014-06-12 CRAN (R 3.2.0)
## zlibbioc 1.16.0 2015-10-14 Bioconductor
This report was generated using BiocStyle
(Morgan, Oleś, and Huber, 2016) with knitr
(Xie, 2014) and rmarkdown
(Allaire, Cheng, Xie, McPherson, et al., 2016) running behind the scenes.
Citations made with knitcitations
(Boettiger, 2015). Citation file: venn.bib.
[1] J. Allaire, J. Cheng, Y. Xie, J. McPherson, et al. rmarkdown: Dynamic Documents for R. R package version 0.9.2. 2016. URL: http://CRAN.R-project.org/package=rmarkdown.
[2] J. D. S. with contributions from Andrew J. Bass, A. Dabney and D. Robinson. qvalue: Q-value estimation for false discovery rate control. R package version 2.2.2. 2015. URL: http://github.com/jdstorey/qvalue.
[3] C. Boettiger. knitcitations: Citations for ‘Knitr’ Markdown Files. R package version 1.0.7. 2015. URL: http://CRAN.R-project.org/package=knitcitations.
[4] L. Collado-Torres, A. C. Frazee, M. I. Love, R. A. Irizarry, et al. “derfinder: Software for annotation-agnostic RNA-seq differential expression analysis”. In: bioRxiv (2015). DOI: 10.1101/015370. URL: http://www.biorxiv.org/content/early/2015/02/19/015370.abstract.
[5] L. Collado-Torres, A. E. Jaffe and J. T. Leek. derfinderPlot: Plotting functions for derfinder. https://github.com/leekgroup/derfinderPlot - R package version 1.4.1. 2015. URL: http://www.bioconductor.org/packages/release/bioc/html/derfinderPlot.html.
[6] M. Morgan, A. Oleś and W. Huber. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 1.8.0. 2016. URL: https://github.com/Bioconductor/BiocStyle.
[7] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014. URL: http://www.crcpress.com/product/isbn/9781466561595.