Compare vs count-based methods

This report compares the original implementation of derfinder available at alyssafrazee/derfinder, DESeq2, edgeR-robust and our implementation of derfinder using the exonic segments described in evaluate.html.

Counts-based analysis

This section has the code for running edgeR-robust and DESeq2 on the simulation data set using the known exons as features.

This first code chunk loads the necessary data.

library('edgeR')

## Loading required package: limma

library('DESeq2')

## Loading required package: S4Vectors
## Loading required package: stats4
## Loading required package: BiocGenerics
## Loading required package: parallel
## 
## Attaching package: 'BiocGenerics'
## 
## The following objects are masked from 'package:parallel':
## 
##     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
##     clusterExport, clusterMap, parApply, parCapply, parLapply,
##     parLapplyLB, parRapply, parSapply, parSapplyLB
## 
## The following object is masked from 'package:limma':
## 
##     plotMA
## 
## The following object is masked from 'package:stats':
## 
##     xtabs
## 
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, append, as.data.frame, as.vector, cbind,
##     colnames, do.call, duplicated, eval, evalq, Filter, Find, get,
##     intersect, is.unsorted, lapply, Map, mapply, match, mget,
##     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
##     rbind, Reduce, rep.int, rownames, sapply, setdiff, sort,
##     table, tapply, union, unique, unlist, unsplit
## 
## Loading required package: IRanges
## Loading required package: GenomicRanges
## Loading required package: GenomeInfoDb
## Loading required package: Rcpp
## Loading required package: RcppArmadillo

library('GenomicRanges')

## Load data
load("../coverageToExon/covToEx-ucsc.Rdata")
load("../derAnalysis/run2-v1.0.10/groupInfo.Rdata")
if(file.exists("../derAnalysis/run2-v1.0.10/colsubset.Rdat")) {
    load("../derAnalysis/run2-v1.0.10/colsubset.Rdata")
} else {
    colsubset <- seq_len(length(groupInfo))
}

## GenomicState object
if(file.exists('/home/epi/ajaffe/Lieber/Projects/RNAseq/derannotator/rdas/GenomicState.Hsapiens.UCSC.hg19.knownGene.rda')) {
    load('/home/epi/ajaffe/Lieber/Projects/RNAseq/derannotator/rdas/GenomicState.Hsapiens.UCSC.hg19.knownGene.rda')
} else if(file.exists('../../GenomicState.Hsapiens.UCSC.hg19.knownGene.rda')) {
    load('../../GenomicState.Hsapiens.UCSC.hg19.knownGene.rda')
} else {
    stop('Missing UCSC hg19 genomic state object')
}

## Annotation used
exons <- GenomicState.Hsapiens.UCSC.hg19.knownGene$fullGenome
exons <- exons[exons$theRegion == 'exon']

## Round matrix and remove exons with 0s
counts <- round(covToEx[, colsubset])
nonzero <- sapply(rowSums(counts), function(x) {x > 0})

DESeq2

The following code performs the DESeq2 analysis. Code is based on edgeR_Robust supplementary code. The main change is that it has been modified for the multi-group scenario.

## Round matrix and specify design
dse <- DESeqDataSetFromMatrix(counts[nonzero, ], data.frame(group = groupInfo), ~ group)

## converting counts to integer mode

## Perform DE analysis
system.time( dse <- DESeq(dse, test = 'LRT', reduced = ~ 1) )

## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing

##    user  system elapsed 
##   0.812   0.020   0.832

## Extract results
deseq <- exons[nonzero]
mcols(deseq) <- cbind(mcols(deseq), results(dse))

## Which are significant?
deseq$sig <- deseq$padj < 0.05
deseq$sig[is.na(deseq$sig)] <- FALSE

## Save results
save(deseq, file = 'deseq.Rdata')

edgeR-robust

The following code performs the DESeq2 analysis. Code is based on edgeR_Robust supplementary code. The main change is that it has been modified for the multi-group scenario.

## Determine design matrix
design <- model.matrix(~ groupInfo)

## Perform DE analysis
d <- DGEList(counts = counts[nonzero, ], group = groupInfo)
d <- calcNormFactors(d)
system.time(dw <- estimateGLMRobustDisp(d, design = design, prior.df = 10, maxit = 6))

##    user  system elapsed 
##   1.907   0.050   1.957

fw <- glmFit(dw, design = design, coef = 2:3)
lrw <- glmLRT(fw, coef = 2:3)

## Extract results
edger <- exons[nonzero]
mcols(edger) <- cbind(mcols(edger), DataFrame(lrw$table))
edger$pvalue <-  lrw$table$PValue
edger$padj <- p.adjust(lrw$table$PValue, 'BH')

## Which are significant?
edger$sig <- edger$padj < 0.05
edger$sig[is.na(edger$sig)] <- FALSE

## Save results
save(edger, file = 'edger.Rdata')

Comparison

library('TxDb.Hsapiens.UCSC.hg19.knownGene')

## Loading required package: GenomicFeatures
## Loading required package: AnnotationDbi
## Loading required package: Biobase
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.

#library('knitr')
library('derfinder')

## Want to contribute a new feature? Fork derfinder at
##  https://github.com/lcolladotor/derfinder/fork
## Then submit a pull request =)

library('derfinderHelper')
library('derfinderPlot')

## Find out what's changed in derfinderPlot with
## news(Version == "1.1.6", package = "derfinderPlot")

library('qvalue')
#library('bumphunter')
load('../simulation_info.Rdata')
load('../derAnalysis/run2-v1.0.10/fullRegions.Rdata')
load('../derAnalysis/run2-v1.0.10/models.Rdata')
load('../derAnalysis/run2-v1.0.10/chr22/optionsStats.Rdata')
load('../CoverageInfo/fullCov.Rdata')
names(fullRegions) <- seq_len(length(fullRegions))

Exonic segments

Just as in evaluate.html we can compare the results against the exonic segments. From that document:

Next we can evaluate the simulation by classifying the exonic segments as whether they should be DE or not. Then, we can find if the DERs overlap such segments.

The following code is a subset of evaluate.html and generates the exonic segments.

## Find exons
txdb <- keepSeqlevels(TxDb.Hsapiens.UCSC.hg19.knownGene, 'chr22')
txinfo <- select(txdb, keys = chosen$ucsckg_id, columns = columns(txdb), keytype = 'TXNAME')

## Buiild GRangesList with exons grouped by transcript
tx <- split(GRanges(seqnames = txinfo$EXONCHROM, IRanges(start = txinfo$EXONSTART, end = txinfo$EXONEND), strand = txinfo$EXONSTRAND), txinfo$TXNAME)
tx <- tx[match(chosen$ucsckg_id, names(tx))]

## Gene level: DE if at least one transcript is DE
gene <- data.frame(gene_id = unique(chosen$gene_id))
gene$DE <- sapply(gene$gene_id, function(x) { any(chosen$DE[chosen$gene_id == x])  })
gene$case <- sapply(gene$gene_id, function(x) { unique(chosen$case[chosen$gene_id == x])  })

## Identify exonic segments
segments <- GRangesList(lapply(gene$gene_id, function(x) {
    i <- chosen$ucsckg_id[ chosen$gene_id == x]
    
    ## Find segments
    segs <- disjoin(unlist(tx[i]))
    ov <- findOverlaps(segs, tx[i])
    
    ## Find DE status per segment
    segs$DE <- as.vector(tapply(subjectHits(ov), queryHits(ov), function(y) {
        any(chosen$DE[ chosen$gene_id == x])
    }))
    
    ## Finish
    return(segs)
}))
names(segments) <- gene$gene_id
segs <- unlist(segments)

derfinder-original

The original derfinder implementation does not support multi-group comparisons. So we performed all pair comparisons for the three groups. We can extract those that are differentially expressed (states 3 and 4) with a FDR adjusted p-value < 0.05.

load('../deranalysis-original/chr22/regions-merged-chr22-AB.Rdata')
original <- list('AB' = regions.merged)
load('../deranalysis-original/chr22/regions-merged-chr22-AC.Rdata')
original <- c(original, list('AC' = regions.merged))
load('../deranalysis-original/chr22/regions-merged-chr22-BC.Rdata')
original <- c(original, list('BC' = regions.merged))
original <- lapply(original, function(x) { 
    GRanges(seqnames = x$chr, ranges = IRanges(x$start, x$end), strand = '*', state = x$state, mean.t = x$mean.t, mean.fold.change = x$mean.fold.change)
})

## Load p-value info
original <- lapply(names(original), function(x) {
    load(paste0('../deranalysis-original/chr22/pvals-chr22-', x, '.Rdata'))
    original[[x]]$pvalue <- pvals
    return(original[[x]])
})


## Identify DE regions
ori.de <- lapply(original, function(x) {
    state.idx <- x$state > 2
    x[state.idx & !is.na(x$pvalue)]
})

## Adjust p-values for FDR and select significant ones
ori.de.sig <- do.call(c, ori.de)
ori.de.sig$qvalue <- qvalue(ori.de.sig$pvalue)$qvalue
ori.de.sig <- ori.de.sig[ori.de.sig$qvalue < 0.05]

Due to the simulation setup, only one of the three groups is differentially expressed at a time. One option is to compare the exonic segments versus the resulting DERs and ask if the exonic segments overlap at least one DER as shown below.

## Overlaps at least 1 DER
addmargins(table('DE status' = segs$DE, 'Overlaps DER (sig) -- original, min 1' = countOverlaps(segs, ori.de.sig) > 0))

##          Overlaps DER (sig) -- original, min 1
## DE status FALSE TRUE Sum
##     FALSE    96   15 111
##     TRUE     11  158 169
##     Sum     107  173 280

The results from the new implementation are shown below (extracted from evaluate.html). The original implementation has 15 false positive (versus 0 in the new implemenation) and less false negatives (11 versus 30).

## Compare against new derfinder:
## Check result with FDR sig 
fullRegions$significantFDR <- factor(fullRegions$qvalues < 0.05, levels = c('TRUE', 'FALSE'))
fdr <-  fullRegions[fullRegions$significantFDR == 'TRUE']
addmargins(table('DE status' = segs$DE, 'Overlaps DER (sig FDR) -- new' = countOverlaps(segs, fdr) > 0))

##          Overlaps DER (sig FDR) -- new
## DE status FALSE TRUE Sum
##     FALSE   111    0 111
##     TRUE     30  139 169
##     Sum     141  139 280

However, because of the simulation setup, with the original implementation we would actually expect at least two DERs to overlap each exonic segment that was set to be DE. When doing this comparison, we get 3 false positives and 28 false negatives versus 0 and 30 in the new implementation.

## Identical results? to derfinder-original if we ask segments to overlap 
## two DERs given that we are doing 3 group comparisons with 1 group
## higher/lower than the other two
addmargins(table('DE status' = segs$DE, 'Overlaps DER (sig) -- original, min 2' = countOverlaps(segs, ori.de.sig) > 1))

##          Overlaps DER (sig) -- original, min 2
## DE status FALSE TRUE Sum
##     FALSE   108    3 111
##     TRUE     28  141 169
##     Sum     136  144 280

identical(
    addmargins(table('DE status' = segs$DE, 'Overlaps DER (sig)' = countOverlaps(segs, fdr) > 0)),
    addmargins(table('DE status' = segs$DE, 'Overlaps DER (sig)' = countOverlaps(segs, ori.de.sig) > 1))
)

## [1] FALSE

While the global numbers are similar, the agreement at the exonic segment level is not perfect between the original and new implementations as shown below.

addmargins(table('new' = countOverlaps(segs, fdr) > 0, 'original -- min 2' = countOverlaps(segs, ori.de.sig) > 1))

##        original -- min 2
## new     FALSE TRUE Sum
##   FALSE   129   12 141
##   TRUE      7  132 139
##   Sum     136  144 280

The following code was adapted from evaluate.html and explores the false negatives when requiring at least 2 DERs to overlap each exonic segment. As with the new implementation, the original implementation struggles in genes where one of the transcripts was set to be DE while the other wasn't.

## Explore false negative segments using sig DERs
seg.fn <- which(segs$DE & !countOverlaps(segs, ori.de.sig) > 1)

## Some segments are short
summary(width(segs[seg.fn]))

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     5.0    69.0   155.5   191.5   223.2   853.0

## new:
##
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     5.0    73.5   119.0   242.7   241.0  1844.0
chosen[chosen$gene_id == '6523', ]

##     tx_idx tx_n tx_i gene_id  ucsckg_id fasta_i    DE group1 group2 group3
## 175    400    2    1    6523 uc003amc.3     457 FALSE normal normal normal
## 176    400    2    2    6523 uc011alz.2     458  TRUE normal normal   high
##     width readspertx mean1 mean2 mean3  case
## 175  5061       2024  2024  2024  2024 oneDE
## 176  4779       1912  1912  1912  3824 oneDE

## 6 / 28 of the segmeents are from gene with id 6523
## new is 9 / 30 for the same gene
tail(sort(table(names(segs[seg.fn]))))

## 
##   7494  10738 339669   3976   6948   6523 
##      1      2      2      2      4      6

## Cases of the genes with at least one FN segment
table(tapply(subset(chosen, gene_id %in% names(seg.fn))$case, subset(chosen, gene_id %in% names(seg.fn))$gene_id, unique))

## 
##   bothDE    oneDE singleDE 
##        2       11        4

## new:
## 
##   bothDE    oneDE singleDE 
##        1       10        5

## Type of gene where the segments come from. Mostly oneDE genes
table(sapply(names(segs[seg.fn]), function(x) { unique(chosen$case[chosen$gene_id == x]) }))

## 
##   bothDE    oneDE singleDE 
##        2       22        4

## new:
## 
##   bothDE    oneDE singleDE 
##        1       24        5

Counts-based

Just as a verification step, the exonic segments used are disjoint as well as the exons used for the counts-based methods. Also, each exonic segment overlaps only one exon.

max(countOverlaps(segs)) - 1

## [1] 0

max(countOverlaps(exons)) - 1

## [1] 0

table(countOverlaps(segs, exons))

## 
##   1 
## 280

DESeq2

Next, we can compare the exonic segments versus the significant DE exons as determined by DESeq2. Significance can be determined by regular p-values, FDR FDR adjusted p-values (BH method), and FWER adjusted p-values (Holm method). In all cases we use a cutoff of < 0.05.

count_comp <- function(info, ptype = 'padj', cut = 0.05) {
    if(ptype == 'padj') {
        idx <- info$padj < cut
    } else if (ptype == 'pvalue') {
        idx <- info$pvalue < cut
    } else {
        p <- p.adjust(info$pvalue, ptype)
        idx <- p < cut
    }
    idx[is.na(idx)] <- FALSE

    ## Overlaps at least 1 DE exon
    addmargins(table('DE status' = segs$DE, 'Overlaps DE exon' = countOverlaps(segs, info[idx]) > 0))
}

## Regular p-values
count_comp(deseq, ptype = 'pvalue')

##          Overlaps DE exon
## DE status FALSE TRUE Sum
##     FALSE   101   10 111
##     TRUE      6  163 169
##     Sum     107  173 280

## FDR adjusted p-values by method BH
count_comp(deseq)

##          Overlaps DE exon
## DE status FALSE TRUE Sum
##     FALSE   104    7 111
##     TRUE      7  162 169
##     Sum     111  169 280

## FWER adjusted p-values by method Holm
count_comp(deseq, ptype = 'holm')

##          Overlaps DE exon
## DE status FALSE TRUE Sum
##     FALSE   110    1 111
##     TRUE     10  159 169
##     Sum     120  160 280

identical(count_comp(deseq, ptype = 'holm'), count_comp(deseq, ptype = 'bonferroni'))

## [1] TRUE

As expected the number of false positives is the highest with regular p-values and the lowest with FWER adjusted p-values. In all cases, there are fewer false negatives compared to derfinder.

edgeR-robust

Similar results are shown below when using edgeR-robust.

## Regular p-values
count_comp(edger, ptype = 'pvalue')

##          Overlaps DE exon
## DE status FALSE TRUE Sum
##     FALSE    92   19 111
##     TRUE      6  163 169
##     Sum      98  182 280

## FDR adjusted p-values by method BH
count_comp(edger)

##          Overlaps DE exon
## DE status FALSE TRUE Sum
##     FALSE    98   13 111
##     TRUE      6  163 169
##     Sum     104  176 280

## FWER adjusted p-values by method Holm
count_comp(edger, ptype = 'holm')

##          Overlaps DE exon
## DE status FALSE TRUE Sum
##     FALSE   110    1 111
##     TRUE      8  161 169
##     Sum     118  162 280

identical(count_comp(edger, ptype = 'holm'), count_comp(edger, ptype = 'bonferroni'))

## [1] TRUE

Overlap

## Some formatting and subsets
fullRegions$sigFDR <- as.logical(fullRegions$significantFDR)
fullRegs20 <- fullRegions[width(fullRegions) >= 20]

## Overlap table for all 4 cases
ov_table <- function(ders, counts, query = 'der', minov = 0) {
    if(query == 'der') {
        if(minov == 0) {
            res <- addmargins(table('Significant DER (FDR)' = ders$sigFDR, 'Overlaps sig DE exon' = countOverlaps(ders, counts[counts$sig]) > 0))
        } else {
            res <- addmargins(table(ders$sigFDR, countOverlaps(ders, counts[counts$sig], minoverlap = minov) > 0, dnn = c('Significant DER (FDR)', paste0('Overlaps sig DE exon (min ', minov, 'bp)'))))
        }
    } else if (query == 'counts') {
        if(minov == 0) {
            res <- addmargins(table('Significant DE exon' = counts$sig, 'Overlaps sig DER (FDR)' = countOverlaps(counts, ders[ders$sigFDR]) > 0))
        } else {
            res <- addmargins(table(counts$sig[width(counts) >= minov], countOverlaps(counts[width(counts) >= minov], ders[ders$sigFDR], minoverlap = minov) > 0, dnn = c('Significant DE exon', paste0('Overlaps sig DER (FDR, min ', minov, 'bp)'))))
        }
    }
    return(res)
}

## Explore mistmatched cases for DERs vs Exons direction
explore_ov <- function(ders, counts, case = "FALSE-TRUE", minov = 0L) {
    if(case == 'FALSE-TRUE') {
        i <- which(countOverlaps(ders, counts[counts$sig], minoverlap = minov) > 0 & !ders$sigFDR)
    } else if (case == 'TRUE-FALSE') {
        i <- which(!countOverlaps(ders, counts[counts$sig], minoverlap = minov) > 0 & ders$sigFDR)
    } else{
        stop('invalid case')
    }
    if(length(i) == 0) return("No such cases")
    
    if(case == 'FALSE-TRUE') {
        res <- list(
            n_overlaps = table(countOverlaps(ders[i], counts[counts$sig], minoverlap = minov)),
            width_der = summary(width(ders[i])),
            ders_per_exon_table = table(table(subjectHits(findOverlaps(ders[i], counts[counts$sig], minoverlap = minov)))),
            ders_per_exon = sort(table(subjectHits(findOverlaps(ders[i], counts[counts$sig], minoverlap = minov)))),
            i = i
        )
    } else {
        res <- list(
            width_der = summary(width(ders[i])),
            distance_nearest_sum = summary(mcols(distanceToNearest(ders[i], counts, ignore.strand = TRUE))$distance),
            distance_nearest_sig_sum = summary(mcols(distanceToNearest(ders[i], counts[counts$sig], ignore.strand = TRUE))$distance),
            distance_nearest = distanceToNearest(ders[i], counts, ignore.strand = TRUE),
            distance_nearest_sig = distanceToNearest(ders[i], counts[counts$sig], ignore.strand = TRUE),
            i = i
        )
    }
    
    return(res)
}

## Explore mistmatched cases for Exons vs DERs direction
explore_ov_counts <- function(ders, counts, case = "FALSE-TRUE", minov = 0L) {
    counts <- counts[width(counts) >= minov]
    if(case == 'FALSE-TRUE') {
        i <- which(countOverlaps(counts, ders[ders$sigFDR], minoverlap = minov) > 0 & !counts$sig)
    } else if (case == 'TRUE-FALSE') {
        i <- which(!countOverlaps(counts, ders[ders$sigFDR], minoverlap = minov) > 0 & counts$sig)
    } else{
        stop('invalid case')
    }
    if(length(i) == 0) return("No such cases")
    
    if(case == 'FALSE-TRUE') {
        res <- list(
            n_overlaps = table(countOverlaps(counts[i], ders[ders$sigFDR], minoverlap = minov)),
            width_exon = summary(width(counts[i])),
            exons_per_der_table = table(table(subjectHits(findOverlaps(counts[i], ders[ders$sigFDR], minoverlap = minov)))),
            exons_per_der = sort(table(subjectHits(findOverlaps(counts[i], ders[ders$sigFDR], minoverlap = minov)))),
            i = i
        )
    } else {
        res <- list(
            width_exon = summary(width(counts[i])),
            distance_nearest_sum = summary(mcols(distanceToNearest(counts[i], ders, ignore.strand = TRUE))$distance),
             distance_nearest_sig_sum = summary(mcols(distanceToNearest(counts[i], ders[ders$sigFDR], ignore.strand = TRUE))$distance),
            distance_nearest = distanceToNearest(counts[i], ders, ignore.strand = TRUE),
            distance_nearest_sig = distanceToNearest(counts[i], ders[ders$sigFDR], ignore.strand = TRUE),
            i = i
        )
    }
    
    return(res)
}

noNA <- function(x) {
    x[!is.na(x)]
}

DESeq2

Query: DERs

We can first compare the results by using the DERs as the query and the exons as the subject. The following output shows the comparison using all DERs and exploring the mismatched cases. Then its repeated using the DERs ≥ 20 bp and a minimum overlap of 20bp.

For the mismatched cases of non-significant DERs overlapping a significant exon, we check:

how many exons each DER overlaps,
the width of the DERs
the frequency table of how many DERs overlap the same exon

For the other mismatched case, we check:

the width of the DERs
distance to nearest exon (regardless of exon size)
distance to nearest significant DE exon (ibidem)

## Overlap between DERs and significant DE exons
ov_table(fullRegions, deseq)

##                      Overlaps sig DE exon
## Significant DER (FDR) FALSE TRUE Sum
##                 FALSE    15  301 316
##                 TRUE      0  153 153
##                 Sum      15  454 469

## Explore mismatched cases
noNA(explore_ov(fullRegions, deseq)[1:3])

## $n_overlaps
## 
##   1   2 
## 293   8 
## 
## $width_der
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.000   2.000   5.236   6.000  37.000 
## 
## $ders_per_exon_table
## 
##  1  2  3  4  5  6  7  9 10 11 15 36 76 
## 14 11  8  2  4  5  2  2  1  2  1  1  1

noNA(explore_ov(fullRegions, deseq, 'TRUE-FALSE')[1:3])

## [1] "No such cases"

## Min 20 bp overlap, using only DERs 20 bp long
ov_table(fullRegs20, deseq, minov = 20L)

##                      Overlaps sig DE exon (min 20bp)
## Significant DER (FDR) FALSE TRUE Sum
##                 FALSE     1   15  16
##                 TRUE      0  153 153
##                 Sum       1  168 169

## Explore mismatched cases, min 20bp overlap
noNA(explore_ov(fullRegs20, deseq, minov = 20L)[1:3])

## $n_overlaps
## 
##  1  2 
## 14  1 
## 
## $width_der
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   20.00   22.50   25.00   26.27   30.00   37.00 
## 
## $ders_per_exon_table
## 
##  1  6 
## 10  1

noNA(explore_ov(fullRegs20, deseq, 'TRUE-FALSE', minov = 20L)[1:3])

## [1] "No such cases"

Most of the non-significant DERs that overlap a significant exon are shorter than 20bp with the longest being 37bp long. When restricting the analysis to 20bp or longer DERs, 6 the DERs overlap a single exon. It is an exon from a gene with two isoforms where only one of them was set to be DE. As can be seen in the figure below, the F-statistics curve oscillates and a lower cutoff could have resulted in significant DERs in this exon.

## Identify long exon
long_exon <- deseq[which(deseq$sig)[as.integer(names(explore_ov(fullRegs20, deseq, minov = 20L)$ders_per_exon)[explore_ov(fullRegs20, deseq, minov = 20L)$ders_per_exon == 6])]]

## What was it set to be?
mcols(segs[subjectHits(findOverlaps(long_exon, segs))])$DE

## [1] TRUE

gene[gene$gene_id %in% unique(names(segs[subjectHits(findOverlaps(long_exon, segs))])),]

##    gene_id   DE  case
## 51    6523 TRUE oneDE

## Calculate F-stats
range <- start(long_exon):end(long_exon)
dat <- fullCov$chr22[range, ]

## Log2 transform
for(i in 1:30) dat[[i]] <- log2(dat[[i]] + 32) 

## Calculate f-stats
fstats <- as.numeric(fstats.apply(data = dat, mod = models$mod, mod0 = models$mod0))

## Find annotation
annoReg <- annotateRegions(long_exon, GenomicState.Hsapiens.UCSC.hg19.knownGene$fullGenome, verbose = FALSE)
## Remove symbol name because it gets chomped on the plot
mcols(annoReg$annotationList[[1]])$symbol <- NA

## Plot long exon
long_exon_plot <- function() {
    plotRegionCoverage(long_exon, getRegionCoverage(fullCov, long_exon, verbose = FALSE), groupInfo, data.frame(name = "sig DE exon via DESeq2, no sig DERs", distance = NA, region = 'SLC5A1'), annoReg, verbose = FALSE, ask = FALSE, txdb = txdb)

    ## Add F-stat track
    par(fig = c(0, 1, 0.065, 0.125), new = TRUE, xaxt = 'n', oma = c(0, 0, 0, 0), mar = c(0, 4.5, 0, 1.1))
    plot(y = fstats, x = range, ylab = 'F-stat', type = 'l', xlab = '', bty = 'n', ylim = c(0, max(fstats[is.finite(fstats)], optionsStats$cutoffFstatUsed) * 1.1), yaxt = 'n')
    axis(2, at = c(0, 10, 20, 30), c(0, 10, 20, NA), las = 2, tick = TRUE)
    abline(h = optionsStats$cutoffFstatUsed, col = 'red')
    abline(h = 0, col = 'grey')
}

## Save in pdf
pdf(file = 'long_exon.pdf', width = 10, height = 7)
long_exon_plot()
dev.off()

## pdf 
##   2

#system('open long_exon.pdf')

## Render in png
long_exon_plot()

Query: exons

We can now repeat the comparison using the exons as the query and the DERs as the subject.

For the mismatched cases of non-significant exons overlapping a significant DER, we check:

how many DERs each exon overlaps,
the width of the exons
the frequency table of how many exons overlap the same DER

For the other mismatched case, we check:

the width of the exons
distance to nearest DER (regardless of DER size)
distance to nearest significant DER (ibidem)

## Overlap between exons and significant DERs
ov_table(fullRegions, deseq, 'counts')

##                    Overlaps sig DER (FDR)
## Significant DE exon FALSE TRUE Sum
##               FALSE   120    0 120
##               TRUE     22  129 151
##               Sum     142  129 271

## Explore mismatched cases
noNA(explore_ov_counts(fullRegions, deseq)[1:3])

## [1] "No such cases"

noNA(explore_ov_counts(fullRegions, deseq, 'TRUE-FALSE')[1:3])

## $width_exon
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    60.0   105.0   123.5   551.0   219.0  4971.0 
## 
## $distance_nearest_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       0       0  338500  401300 2986000 
## 
## $distance_nearest_sig_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     240    7876   99000  448000  531800 3512000

## Overlap between exons and significant DERs, min 20 bp
ov_table(fullRegions, deseq, 'counts', 20L)

##                    Overlaps sig DER (FDR, min 20bp)
## Significant DE exon FALSE TRUE Sum
##               FALSE   119    0 119
##               TRUE     22  129 151
##               Sum     141  129 270

## Explore mismatched cases
noNA(explore_ov_counts(fullRegions, deseq, minov = 20L)[1:3])

## [1] "No such cases"

noNA(explore_ov_counts(fullRegions, deseq, 'TRUE-FALSE', minov = 20L)[1:3])

## $width_exon
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    60.0   105.0   123.5   551.0   219.0  4971.0 
## 
## $distance_nearest_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       0       0  338500  401300 2986000 
## 
## $distance_nearest_sig_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     240    7876   99000  448000  531800 3512000

We can further explore the cases where significant DE exons did not overlap a significant DER and compare against the exonic segments.

## Further explore 
i <- explore_ov_counts(fullRegions, deseq, 'TRUE-FALSE')[['i']]

## Exclude cases where an exon overlaps exonic segments from both strands
i.simple <- i[countOverlaps(deseq[i], segs, ignore.strand = TRUE) == 1]

## Number of cases excluded because of strand overlaps
length(i) - length(i.simple)

## [1] 4

## Check with exonic segments
table(mcols(segs)$DE[subjectHits(findOverlaps(deseq[i.simple], segs ))])

## 
## FALSE  TRUE 
##     3    15

## Gene cases that were set to be DE
table(gene$case[gene$gene_id %in% unique(names(segs)[intersect(subjectHits(findOverlaps(deseq[i.simple], segs )), which(mcols(segs)$DE))])])

## 
##    oneDE singleDE 
##        3        3

## Exon mistmatched cases (DE), what was the gene status?
table(sapply(names(segs)[intersect(subjectHits(findOverlaps(deseq[i.simple], segs )), which(mcols(segs)$DE))], function(x) { gene$case[gene$gene_id == x] }))

## 
##    oneDE singleDE 
##       12        3

## Gene cases that were not set to be DE
table(gene$case[gene$gene_id %in% unique(names(segs)[intersect(subjectHits(findOverlaps(deseq[i.simple], segs )), which(!mcols(segs)$DE))])])

## 
##      noneDE singleNotDE 
##           1           1

## Exon mistmatched cases (not DE), what was the gene status?
table(sapply(names(segs)[intersect(subjectHits(findOverlaps(deseq[i.simple], segs )), which(!mcols(segs)$DE))], function(x) { gene$case[gene$gene_id == x] }))

## 
##      noneDE singleNotDE 
##           2           1

Out of the 22 cases in disagreement (min overlap 0 bp), 4 of them are due to overlapping exons from both strands. 15 of these exons are correctly detected by DESeq2 whereas 3 are not. The ones incorrectly identified by derfinder to not be DE are mostly (12 / 15) from genes with two isoforms with only one of them DE.

## Identify longest exon with the error
err_deseq2 <- segs[intersect(subjectHits(findOverlaps(deseq[i.simple], segs )), which(!mcols(segs)$DE))]
err_deseq2 <- err_deseq2[which.max(width(err_deseq2))]

## What was it set to be?
mcols(err_deseq2)$DE

## [1] FALSE

## Calculate F-stats
range <- start(err_deseq2):end(err_deseq2)
dat <- fullCov$chr22[range, ]

## Log2 transform
for(i in 1:30) dat[[i]] <- log2(dat[[i]] + 32) 

## Calculate f-stats
fstats <- as.numeric(fstats.apply(data = dat, mod = models$mod, mod0 = models$mod0))

## Find annotation
annoReg <- annotateRegions(err_deseq2, GenomicState.Hsapiens.UCSC.hg19.knownGene$fullGenome, verbose = FALSE)
## Remove symbol name because it gets chomped on the plot
mcols(annoReg$annotationList[[1]])$symbol <- NA

## Plot long exon
fp_plot <- function() {
    plotRegionCoverage(err_deseq2, getRegionCoverage(fullCov, err_deseq2, verbose = FALSE), groupInfo, data.frame(name = "False positive via DESeq2/edgeR-robust", distance = NA, region = 'SF3A1'), annoReg, verbose = FALSE, ask = FALSE, txdb = txdb)

    ## Add F-stat track
    par(fig = c(0, 1, 0.065, 0.125), new = TRUE, xaxt = 'n', oma = c(0, 0, 0, 0), mar = c(0, 4.5, 0, 1.1))
    plot(y = fstats, x = range, ylab = 'F-stat', type = 'l', xlab = '', bty = 'n', ylim = c(0, max(fstats[is.finite(fstats)], optionsStats$cutoffFstatUsed) * 1.1), yaxt = 'n')
    axis(2, at = c(0, 5, 10), c(0, 5, 10), las = 2, tick = TRUE)
    abline(h = optionsStats$cutoffFstatUsed, col = 'red')
    abline(h = 0, col = 'grey')
}

## Save in pdf
pdf(file = 'fp_deseq2.pdf', width = 10, height = 7)
fp_plot()
dev.off()

## pdf 
##   2

#system('open fp_deseq2.pdf')

## Render in png
fp_plot()

edgeR-robust

Query: DERs

Similar comparison using DERs as query and exons as subject with edgeR-robust results.

## Overlap between DERs and significant DE exons
ov_table(fullRegions, edger)

##                      Overlaps sig DE exon
## Significant DER (FDR) FALSE TRUE Sum
##                 FALSE    13  303 316
##                 TRUE      0  153 153
##                 Sum      13  456 469

## Explore mismatched cases
noNA(explore_ov(fullRegions, edger)[1:3])

## $n_overlaps
## 
##   1   2 
## 295   8 
## 
## $width_der
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.000   2.000   5.228   6.000  37.000 
## 
## $ders_per_exon_table
## 
##  1  2  3  4  5  6  7  9 10 11 15 36 76 
## 14 12  8  2  4  5  2  2  1  2  1  1  1

noNA(explore_ov(fullRegions, edger, 'TRUE-FALSE')[1:3])

## [1] "No such cases"

## Min 20 bp overlap, using only DERs 20 bp long
ov_table(fullRegs20, edger, minov = 20L)

##                      Overlaps sig DE exon (min 20bp)
## Significant DER (FDR) FALSE TRUE Sum
##                 FALSE     1   15  16
##                 TRUE      0  153 153
##                 Sum       1  168 169

## Explore mismatched cases, min 20bp overlap
noNA(explore_ov(fullRegs20, edger, minov = 20L)[1:3])

## $n_overlaps
## 
##  1  2 
## 14  1 
## 
## $width_der
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   20.00   22.50   25.00   26.27   30.00   37.00 
## 
## $ders_per_exon_table
## 
##  1  6 
## 10  1

noNA(explore_ov(fullRegs20, edger, 'TRUE-FALSE', minov = 20L)[1:3])

## [1] "No such cases"

The results are fairly similar to those from using DESeq2.

Query: exons

Similar comparison using exons as query and DERs as subject with edgeR-robust results.

## Overlap between exons and significant DERs
ov_table(fullRegions, edger, 'counts')

##                    Overlaps sig DER (FDR)
## Significant DE exon FALSE TRUE Sum
##               FALSE   114    0 114
##               TRUE     28  129 157
##               Sum     142  129 271

## Explore mismatched cases
noNA(explore_ov_counts(fullRegions, edger)[1:3])

## [1] "No such cases"

noNA(explore_ov_counts(fullRegions, edger, 'TRUE-FALSE')[1:3])

## $width_exon
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    13.0   100.8   123.5   465.3   222.2  4971.0 
## 
## $distance_nearest_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       0     873  267300    9045 2986000 
## 
## $distance_nearest_sig_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     240    8392   95680  375600  509100 3512000

## Overlap between exons and significant DERs, min 20 bp
ov_table(fullRegions, edger, 'counts', 20L)

##                    Overlaps sig DER (FDR, min 20bp)
## Significant DE exon FALSE TRUE Sum
##               FALSE   114    0 114
##               TRUE     27  129 156
##               Sum     141  129 270

## Explore mismatched cases
noNA(explore_ov_counts(fullRegions, edger, minov = 20L)[1:3])

## [1] "No such cases"

noNA(explore_ov_counts(fullRegions, edger, 'TRUE-FALSE', minov = 20L)[1:3])

## $width_exon
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    60.0   105.0   128.0   482.1   224.5  4971.0 
## 
## $distance_nearest_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       0     809  277100    9577 2986000 
## 
## $distance_nearest_sig_sum
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     240    8914   95990  389400  515900 3512000

As before, we can further explore the mismatched cases where edgeR-robust finds DE signal but derfinder does not.

## Further explore 
j <- explore_ov_counts(fullRegions, edger, 'TRUE-FALSE')[['i']]

## Exclude cases where an exon overlaps exonic segments from both strands
j.simple <- j[countOverlaps(edger[j], segs, ignore.strand = TRUE) == 1]

## Number of cases excluded because of strand overlaps
length(j) - length(j.simple)

## [1] 5

## Check with exonic segments
table(mcols(segs)$DE[subjectHits(findOverlaps(edger[j.simple], segs ))])

## 
## FALSE  TRUE 
##     7    16

## Gene cases that were set to be DE
table(gene$case[gene$gene_id %in% unique(names(segs)[intersect(subjectHits(findOverlaps(edger[j.simple], segs )), which(mcols(segs)$DE))])])

## 
##    oneDE singleDE 
##        4        3

## Exon mistmatched cases (DE), what was the gene status?
table(sapply(names(segs)[intersect(subjectHits(findOverlaps(edger[j.simple], segs )), which(mcols(segs)$DE))], function(x) { gene$case[gene$gene_id == x] }))

## 
##    oneDE singleDE 
##       13        3

## Gene cases that were not set to be DE
table(gene$case[gene$gene_id %in% unique(names(segs)[intersect(subjectHits(findOverlaps(edger[j.simple], segs )), which(!mcols(segs)$DE))])])

## 
##      noneDE singleNotDE 
##           3           1

## Exon mistmatched cases (not DE), what was the gene status?
table(sapply(names(segs)[intersect(subjectHits(findOverlaps(edger[j.simple], segs )), which(!mcols(segs)$DE))], function(x) { gene$case[gene$gene_id == x] }))

## 
##      noneDE singleNotDE 
##           6           1

overall

The results are fairly similar to those from using DESeq2. As shown below, they agree on all but 6 of the exons analyzed. In particular, the longest false positive exon by DESeq2 is also a FP with edgeR-robust.

## edgeR vs DESeq2
addmargins(table('edgeR-robust' = edger$sig, 'DESeq2' = deseq$sig))

##             DESeq2
## edgeR-robust FALSE TRUE Sum
##        FALSE   114    0 114
##        TRUE      6  151 157
##        Sum     120  151 271

## Find FP errors
err_edger <- segs[intersect(subjectHits(findOverlaps(edger[j.simple], segs )), which(!mcols(segs)$DE))]

## Does it overlap FP from DESeq2? Yes
countOverlaps(err_deseq2, err_edger) == 1

## 10291 
##  TRUE

The following is a coverage plot of the longest exon where all methods agree that its differentially expressed.

## Find longest exon where edgeR and DEseq2 agree that it's DE
l <- which.max(width(edger)[which(edger$sig & deseq$sig)])
agree <- edger[which(edger$sig & deseq$sig)[l]]

## Width of exon
width(agree)

## [1] 5069

## DERs this exon overlaps
width(fullRegions[subjectHits(findOverlaps(agree, fullRegions))])

## [1] 4999    6    2    1

mcols(fullRegions[subjectHits(findOverlaps(agree, fullRegions))])$sigFDR

## [1]  TRUE FALSE FALSE FALSE

## Case
gene[gene$gene_id == unique(names(segs[subjectHits(findOverlaps(agree,segs))])), ]

##    gene_id   DE   case
## 38   27439 TRUE bothDE

## Calculate F-stats
range <- start(agree):end(agree)
dat <- fullCov$chr22[range, ]

## Log2 transform
for(i in 1:30) dat[[i]] <- log2(dat[[i]] + 32) 

## Calculate f-stats
fstats <- as.numeric(fstats.apply(data = dat, mod = models$mod, mod0 = models$mod0))

## Find annotation
annoReg <- annotateRegions(agree, GenomicState.Hsapiens.UCSC.hg19.knownGene$fullGenome, verbose = FALSE)
## Remove symbol name because it gets chomped on the plot
mcols(annoReg$annotationList[[1]])$symbol <- NA

## Plot long exon
agree_plot <- function() {
    plotRegionCoverage(agree, getRegionCoverage(fullCov, agree, verbose = FALSE), groupInfo, data.frame(name = "Full agreement: differentially expressed", distance = NA, region = 'CECR6'), annoReg, verbose = FALSE, ask = FALSE, txdb = txdb)

    ## Add F-stat track
    par(fig = c(0, 1, 0.065, 0.125), new = TRUE, xaxt = 'n', oma = c(0, 0, 0, 0), mar = c(0, 4.5, 0, 1.1))
    plot(y = fstats, x = range, ylab = 'F-stat', type = 'l', xlab = '', bty = 'n', ylim = c(0, max(fstats[is.finite(fstats)], optionsStats$cutoffFstatUsed) * 1.1), yaxt = 'n')
    axis(2, at = c(0, 50, 100, 150, 200, 250), c(0, NA, 100, NA, 200, NA), las = 2, tick = TRUE)
    abline(h = optionsStats$cutoffFstatUsed, col = 'red')
    abline(h = 0, col = 'grey')
}

## Save in pdf
pdf(file = 'agree.pdf', width = 10, height = 7)
agree_plot()
dev.off()

## pdf 
##   2

#system('open agree.pdf')

## Render in png
agree_plot()

The following is a coverage plot of the longest exon where all methods agree that its not differentially expressed. For derfinder, we are requiring DERs to be at least 20 bp long.

## Find longest exon where edgeR and DEseq2 agree that it's not DE
## Exclude the one from chr12
l <- which(width(edger)[which(!edger$sig & !deseq$sig)] == 5789)
agree_notde <- edger[which(!edger$sig & !deseq$sig)[l]]

## Width of exon
width(agree_notde)

## [1] 5789

## DERs this exon overlaps
width(fullRegions[subjectHits(findOverlaps(agree_notde, fullRegions))])

## [1] 17  7  1  1

mcols(fullRegions[subjectHits(findOverlaps(agree, fullRegions))])$sigFDR

## [1]  TRUE FALSE FALSE FALSE

## Case
gene[gene$gene_id == unique(names(segs[subjectHits(findOverlaps(agree_notde, segs))])), ]

##    gene_id    DE   case
## 52    6942 FALSE noneDE

## Calculate F-stats
range <- start(agree_notde):end(agree_notde)
dat <- fullCov$chr22[range, ]

## Log2 transform
for(i in 1:30) dat[[i]] <- log2(dat[[i]] + 32) 

## Calculate f-stats
fstats <- as.numeric(fstats.apply(data = dat, mod = models$mod, mod0 = models$mod0))

## Find annotation
annoReg <- annotateRegions(agree_notde, GenomicState.Hsapiens.UCSC.hg19.knownGene$fullGenome, verbose = FALSE)
## Remove symbol name because it gets chomped on the plot
mcols(annoReg$annotationList[[1]])$symbol <- NA

## Plot long exon
agree_notde_plot <- function() {
    plotRegionCoverage(agree_notde, getRegionCoverage(fullCov, agree_notde, verbose = FALSE), groupInfo, data.frame(name = "Full agreement: not differentially expressed", distance = NA, region = 'TCF20'), annoReg, verbose = FALSE, ask = FALSE, txdb = txdb)

    ## Add F-stat track
    par(fig = c(0, 1, 0.065, 0.125), new = TRUE, xaxt = 'n', oma = c(0, 0, 0, 0), mar = c(0, 4.5, 0, 1.1))
    plot(y = fstats, x = range, ylab = 'F-stat', type = 'l', xlab = '', bty = 'n', ylim = c(0, max(fstats[is.finite(fstats)], optionsStats$cutoffFstatUsed) * 1.1), yaxt = 'n')
    axis(2, at = c(0, 5, 10), c(0, 5, 10), las = 2, tick = TRUE)
    abline(h = optionsStats$cutoffFstatUsed, col = 'red')
    abline(h = 0, col = 'grey')
}

## Save in pdf
pdf(file = 'agree_notde.pdf', width = 10, height = 7)
agree_notde_plot()
dev.off()

## pdf 
##   2

#system('open agree_notde.pdf')

## Render in png
agree_notde_plot()

Conclusions

Globally, the results between the original implementation and our implementation of derfinder are very similar, with both resulting in the same empirical power and absence of false positives. Both struggle in the scenario where one of the two isoforms of a gene was set to be differentially expressed.

emp_power <- function(info, ptype) {
    m <- count_comp(info, ptype)
    round(m[2, 2] / m[2, 3] * 100, 2)
}
emp_fpr <- function(info, ptype) {
    m <- count_comp(info, ptype)
    round(m[1, 2] / m[1, 3] * 100, 2)
}
emp_fdr <- function(info, ptype) {
    m <- count_comp(info, ptype)
    round(m[1, 2] / m[3, 2] * 100, 2)
}

When adjusting p-values to control the FDR, both DESeq2 and edgeR-robust result in higher empirical power than derfinder: 95.86 and 96.45 respectively versus 82.25. However, both DESeq2 and edgeR-robust have non-zero false positive rates: 6.31 and 11.71 respectively. Their false discovery rate is: 4.14 and 7.39 respectively. Note that the number of false positives is reduced to 1 in both cases when controlling the FWER.

We knew from evaluate.html that derfinder struggled in the scenario where a two transcript gene had only one of them set to be differentially expressed. DESeq2 and edgeR-robust are able to correctly identify such exons as differentially expressed in most cases, but they also introduce some occasional false positives whose coverage plots do not reveal apparent differences between the groups.

Reproducibility

## Reproducibility info
Sys.time()

## [1] "2015-04-06 21:38:54 EDT"

proc.time()

##    user  system elapsed 
## 106.257   2.013 118.414

options(width = 120)
devtools::session_info()

## Session info-----------------------------------------------------------------------------------------------------------

##  setting  value                                             
##  version  R Under development (unstable) (2014-11-01 r66923)
##  system   x86_64, darwin10.8.0                              
##  ui       X11                                               
##  language (EN)                                              
##  collate  en_US.UTF-8                                       
##  tz       America/New_York

## Packages---------------------------------------------------------------------------------------------------------------

##  package                           * version     date       source                                    
##  acepack                             1.3.3.3     2013-05-03 CRAN (R 3.2.0)                            
##  annotate                            1.45.4      2015-03-21 Bioconductor                              
##  AnnotationDbi                     * 1.29.21     2015-04-03 Bioconductor                              
##  Biobase                           * 2.27.3      2015-03-27 Bioconductor                              
##  BiocGenerics                      * 0.13.11     2015-04-03 Bioconductor                              
##  BiocParallel                        1.1.21      2015-03-24 Bioconductor                              
##  biomaRt                             2.23.5      2014-11-22 Bioconductor                              
##  Biostrings                          2.35.12     2015-03-26 Bioconductor                              
##  biovizBase                          1.15.3      2015-03-30 Bioconductor                              
##  bitops                              1.0.6       2013-08-17 CRAN (R 3.2.0)                            
##  BSgenome                            1.35.20     2015-03-27 Bioconductor                              
##  bumphunter                          1.7.6       2015-03-13 Github (lcolladotor/bumphunter@37d10e7)   
##  cluster                             2.0.1       2015-01-31 CRAN (R 3.2.0)                            
##  codetools                           0.2.11      2015-03-10 CRAN (R 3.2.0)                            
##  colorout                          * 1.0.2       2014-11-03 local                                     
##  colorspace                          1.2.6       2015-03-11 CRAN (R 3.2.0)                            
##  DBI                                 0.3.1       2014-09-24 CRAN (R 3.2.0)                            
##  derfinder                         * 1.1.18      2015-04-01 Bioconductor                              
##  derfinderHelper                   * 1.1.6       2015-03-15 Bioconductor                              
##  derfinderPlot                     * 1.1.6       2015-03-14 Github (lcolladotor/derfinderPlot@1319754)
##  DESeq2                            * 1.7.50      2015-04-04 Bioconductor                              
##  devtools                            1.6.1       2014-10-07 CRAN (R 3.2.0)                            
##  dichromat                           2.0.0       2013-01-24 CRAN (R 3.2.0)                            
##  digest                              0.6.8       2014-12-31 CRAN (R 3.2.0)                            
##  doRNG                               1.6         2014-03-07 CRAN (R 3.2.0)                            
##  edgeR                             * 3.9.15      2015-04-03 Bioconductor                              
##  evaluate                            0.5.5       2014-04-29 CRAN (R 3.2.0)                            
##  foreach                             1.4.2       2014-04-11 CRAN (R 3.2.0)                            
##  foreign                             0.8.63      2015-02-20 CRAN (R 3.2.0)                            
##  formatR                             1.0         2014-08-25 CRAN (R 3.2.0)                            
##  Formula                             1.2.0       2015-01-20 CRAN (R 3.2.0)                            
##  futile.logger                       1.4         2015-03-21 CRAN (R 3.2.0)                            
##  futile.options                      1.0.0       2010-04-06 CRAN (R 3.2.0)                            
##  genefilter                          1.49.2      2014-10-21 Bioconductor                              
##  geneplotter                         1.45.0      2014-10-14 Bioconductor                              
##  GenomeInfoDb                      * 1.3.16      2015-03-27 Bioconductor                              
##  GenomicAlignments                   1.3.33      2015-04-06 Bioconductor                              
##  GenomicFeatures                   * 1.19.36     2015-03-30 Bioconductor                              
##  GenomicFiles                        1.3.15      2015-04-01 Bioconductor                              
##  GenomicRanges                     * 1.19.52     2015-04-04 Bioconductor                              
##  GGally                              0.4.8       2014-08-26 CRAN (R 3.2.0)                            
##  ggbio                               1.15.3      2015-04-02 Bioconductor                              
##  ggplot2                             1.0.0       2014-05-21 CRAN (R 3.2.0)                            
##  graph                               1.45.3      2015-04-03 Bioconductor                              
##  gridExtra                           0.9.1       2012-08-09 CRAN (R 3.2.0)                            
##  gtable                              0.1.2       2012-12-05 CRAN (R 3.2.0)                            
##  Hmisc                               3.14.5      2014-09-12 CRAN (R 3.2.0)                            
##  htmltools                           0.2.6       2014-09-08 CRAN (R 3.2.0)                            
##  IRanges                           * 2.1.43      2015-03-07 Bioconductor                              
##  iterators                           1.0.7       2014-04-11 CRAN (R 3.2.0)                            
##  knitr                               1.7         2014-10-13 CRAN (R 3.2.0)                            
##  knitrBootstrap                      1.0.0       2014-11-03 Github (jimhester/knitrBootstrap@76c41f0) 
##  lambda.r                            1.1.7       2015-03-20 CRAN (R 3.2.0)                            
##  lattice                             0.20.31     2015-03-30 CRAN (R 3.2.0)                            
##  latticeExtra                        0.6.26      2013-08-15 CRAN (R 3.2.0)                            
##  limma                             * 3.23.12     2015-04-05 Bioconductor                              
##  locfit                              1.5.9.1     2013-04-20 CRAN (R 3.2.0)                            
##  markdown                            0.7.4       2014-08-24 CRAN (R 3.2.0)                            
##  MASS                                7.3.40      2015-03-21 CRAN (R 3.2.0)                            
##  Matrix                              1.2.0       2015-04-04 CRAN (R 3.2.0)                            
##  matrixStats                         0.14.0      2015-02-14 CRAN (R 3.2.0)                            
##  mime                                0.3         2015-03-29 CRAN (R 3.2.0)                            
##  munsell                             0.4.2       2013-07-11 CRAN (R 3.2.0)                            
##  nnet                                7.3.9       2015-02-11 CRAN (R 3.2.0)                            
##  OrganismDbi                         1.9.15      2015-03-30 Bioconductor                              
##  pkgmaker                            0.22        2014-05-14 CRAN (R 3.2.0)                            
##  plyr                                1.8.1       2014-02-26 CRAN (R 3.2.0)                            
##  proto                               0.3.10      2012-12-22 CRAN (R 3.2.0)                            
##  qvalue                            * 1.99.1      2015-04-04 Bioconductor                              
##  RBGL                                1.43.0      2014-10-14 Bioconductor                              
##  RColorBrewer                        1.1.2       2014-12-07 CRAN (R 3.2.0)                            
##  Rcpp                              * 0.11.5      2015-03-06 CRAN (R 3.2.0)                            
##  RcppArmadillo                     * 0.4.650.1.1 2015-02-26 CRAN (R 3.2.0)                            
##  RCurl                               1.95.4.5    2014-12-28 CRAN (R 3.2.0)                            
##  registry                            0.2         2012-01-24 CRAN (R 3.2.0)                            
##  reshape                             0.8.5       2014-04-23 CRAN (R 3.2.0)                            
##  reshape2                            1.4.1       2014-12-06 CRAN (R 3.2.0)                            
##  rmarkdown                           0.3.3       2014-09-17 CRAN (R 3.2.0)                            
##  rngtools                            1.2.4       2014-03-06 CRAN (R 3.2.0)                            
##  rpart                               4.1.9       2015-02-24 CRAN (R 3.2.0)                            
##  Rsamtools                           1.19.49     2015-03-27 Bioconductor                              
##  RSQLite                             1.0.0       2014-10-25 CRAN (R 3.2.0)                            
##  rstudioapi                          0.2         2014-12-31 CRAN (R 3.2.0)                            
##  rtracklayer                         1.27.11     2015-04-01 Bioconductor                              
##  S4Vectors                         * 0.5.22      2015-03-06 Bioconductor                              
##  scales                              0.2.4       2014-04-22 CRAN (R 3.2.0)                            
##  stringr                             0.6.2       2012-12-06 CRAN (R 3.2.0)                            
##  survival                            2.38.1      2015-02-24 CRAN (R 3.2.0)                            
##  TxDb.Hsapiens.UCSC.hg19.knownGene * 3.0.0       2014-09-26 Bioconductor                              
##  VariantAnnotation                   1.13.46     2015-03-26 Bioconductor                              
##  XML                                 3.98.1.1    2013-06-20 CRAN (R 3.2.0)                            
##  xtable                              1.7.4       2014-09-12 CRAN (R 3.2.0)                            
##  XVector                           * 0.7.4       2015-02-08 Bioconductor                              
##  yaml                                2.1.13      2014-06-12 CRAN (R 3.2.0)                            
##  zlibbioc                            1.13.3      2015-03-23 Bioconductor