bumphunterExample.Rmd
The bumphunter package can be used for methylation analyses that aim to identify differentially methylated regions. The bumphunter vignette explains the data set used in this example in greater detail.
## Load bumphunter
library("bumphunter")
## Loading required package: S4Vectors
## Loading required package: stats4
## Loading required package: BiocGenerics
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, aperm, append, as.data.frame, basename, cbind,
## colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
## get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
## match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## Position, rank, rbind, Reduce, rownames, sapply, saveRDS, setdiff,
## table, tapply, union, unique, unsplit, which.max, which.min
##
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:utils':
##
## findMatches
## The following objects are masked from 'package:base':
##
## expand.grid, I, unname
## Loading required package: IRanges
## Loading required package: GenomeInfoDb
## Loading required package: GenomicRanges
## Loading required package: foreach
## Loading required package: iterators
## Loading required package: parallel
## Loading required package: locfit
## locfit 1.5-9.10 2024-06-24
## Create data from the vignette
pos <- list(
    pos1 = seq(1, 1000, 35),
    pos2 = seq(2001, 3000, 35),
    pos3 = seq(1, 1000, 50)
)
chr <- rep(paste0("chr", c(1, 1, 2)), times = sapply(pos, length))
pos <- unlist(pos, use.names = FALSE)
## Find clusters
cl <- clusterMaker(chr, pos, maxGap = 300)
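To make the role of maxGap concrete, here is a minimal base R sketch of gap-based clustering. It is not the clusterMaker implementation: `gap_clusters` is a hypothetical helper that assumes the input is already sorted by chromosome and position, and it starts a new cluster whenever the chromosome changes or the gap to the previous position exceeds the threshold.

```r
# Positions and chromosomes as in the example above
pos_list <- list(seq(1, 1000, 35), seq(2001, 3000, 35), seq(1, 1000, 50))
chr <- rep(paste0("chr", c(1, 1, 2)), times = lengths(pos_list))
pos <- unlist(pos_list)

# Hypothetical helper (illustration only): assumes sorted input
gap_clusters <- function(chr, pos, max_gap = 300) {
    stopifnot(length(chr) == length(pos))
    n <- length(pos)
    # A new cluster starts at the first position, at a chromosome change,
    # or when the distance to the previous position is larger than max_gap
    new_cluster <- c(TRUE, chr[-1] != chr[-n] | diff(pos) > max_gap)
    cumsum(new_cluster)
}

cl_sketch <- gap_clusters(chr, pos, max_gap = 300)
table(cl_sketch)
```

With the positions from this example, the sketch yields three clusters of 29, 29 and 20 positions, which agrees with the clusterL column reported further down.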
## Build simulated bumps
Indexes <- split(seq_along(cl), cl)
beta1 <- rep(0, length(pos))
for (i in seq_along(Indexes)) {
    ind <- Indexes[[i]]
    x <- pos[ind]
    z <- scale(x, median(x), max(x) / 12)
    beta1[ind] <- i * (-1)^(i + 1) * pmax(1 - abs(z)^3, 0)^3 ## multiply by i to vary size
}
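The bump shape used in the loop above is the tricube function applied to a centered and rescaled position. The sketch below isolates that shape. The name `tricube` is introduced here for illustration, and the positions mirror the first cluster of the example.

```r
# Tricube weight: 1 at the center, falling smoothly to 0 for |z| >= 1
tricube <- function(z) pmax(1 - abs(z)^3, 0)^3

# Positions of the first cluster in the example
x <- seq(1, 1000, 35)

# Same centering and scaling as scale(x, median(x), max(x) / 12)
z <- (x - median(x)) / (max(x) / 12)

# Positions that receive a non-zero simulated effect
bump_pos <- x[tricube(z) > 0]
bump_pos
```

For this cluster the non-zero effect spans positions 421 to 561. Only the central part of that span has a simulated effect larger than the 0.5 cutoff used later, which is consistent with the chr1 456-526 region in the results table.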
## Build data
beta0 <- 3 * sin(2 * pi * pos / 720)
X <- cbind(rep(1, 20), rep(c(0, 1), each = 10))
set.seed(23852577)
error <- matrix(rnorm(20 * length(beta1), 0, 1), ncol = 20)
y <- t(X[, 1]) %x% beta0 + t(X[, 2]) %x% beta1 + error
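The Kronecker products in the last line expand each per-position profile across samples: every column of y is a sample, and the group effect beta1 is added only to samples whose second design column equals 1. A small sketch with made-up numbers (4 samples, 3 positions; the `_toy` names are illustrative) shows that this is the same as an outer product.

```r
# Toy design: intercept plus a two-group indicator (values are illustrative)
X_toy <- cbind(rep(1, 4), rep(c(0, 1), each = 2))
beta1_toy <- c(0.5, -1, 2)

# Row vector of group indicators (1 x 4) Kronecker the per-position effects
signal <- t(X_toy[, 2]) %x% beta1_toy

# Same matrix as an outer product: positions in rows, samples in columns
signal_outer <- outer(beta1_toy, X_toy[, 2])

dim(signal)
isTRUE(all.equal(signal, signal_outer))
```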
## Perform bumphunting
tab <- bumphunter(y, X, chr, pos, cl, cutoff = .5)
## [bumphunterEngine] Using a single core (backend: doSEQ, version: 1.5.2).
## [bumphunterEngine] Computing coefficients.
## [bumphunterEngine] Finding regions.
## [bumphunterEngine] Found 15 bumps.
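The cutoff determines which positions can form a bump: estimates above cutoff or below -cutoff become candidates, and consecutive candidates with the same sign are merged into one region. The following base R sketch illustrates that idea on made-up estimates. It is a simplification for intuition, not the bumphunter implementation; in particular, it assumes all positions belong to a single cluster.

```r
# Made-up per-position estimates, all assumed to lie in one cluster
est <- c(0.1, 0.7, 0.9, 0.6, 0.2, -0.8, -0.6, 0.3)
cutoff <- 0.5

# +1 above the cutoff, -1 below the negative cutoff, 0 otherwise
state <- ifelse(est > cutoff, 1L, ifelse(est < -cutoff, -1L, 0L))

# Label runs of identical state, then keep only the non-zero runs
run_id <- cumsum(c(TRUE, state[-1] != state[-length(state)]))
keep <- state != 0L
candidates <- split(which(keep), run_id[keep])

# Index range of each candidate region
regions_idx <- t(sapply(candidates, range))
colnames(regions_idx) <- c("indexStart", "indexEnd")
regions_idx
```

Here positions 2 to 4 form one positive candidate region and positions 6 to 7 form one negative region.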
## Explore data
lapply(tab, head)
## $table
## chr start end value area cluster indexStart indexEnd L
## 10 chr1 2316 2631 -1.5814747 15.8147473 2 39 48 10
## 7 chr2 451 551 1.5891293 4.7673878 3 68 70 3
## 2 chr1 456 526 1.0678828 3.2036485 1 14 16 3
## 5 chr1 2176 2211 0.7841794 1.5683589 2 35 36 2
## 6 chr1 2841 2841 1.2010184 1.2010184 2 54 54 1
## 4 chr1 771 771 0.7780902 0.7780902 1 23 23 1
## clusterL
## 10 29
## 7 20
## 2 29
## 5 29
## 6 29
## 4 29
##
## $coef
## [,1]
## [1,] 0.60960932
## [2,] -0.09052769
## [3,] -0.21482638
## [4,] 0.13053755
## [5,] -0.21723642
## [6,] 0.39934961
##
## $fitted
## [,1]
## [1,] 0.60960932
## [2,] -0.09052769
## [3,] -0.21482638
## [4,] 0.13053755
## [5,] -0.21723642
## [6,] 0.39934961
##
## $pvaluesMarginal
## [1] NA
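In the table above, value is the average estimate within a region, L is the number of positions, and area appears to be the absolute sum of the estimates, so area should equal |value| * L. This can be checked on the first rows printed above; the numbers below are copied by hand from that output.

```r
# First four rows of tab$table as printed above (copied by hand)
tab_head <- data.frame(
    value = c(-1.5814747, 1.5891293, 1.0678828, 0.7841794),
    area = c(15.8147473, 4.7673878, 3.2036485, 1.5683589),
    L = c(10, 3, 3, 2)
)

# area should match |value| * L up to the printed rounding
area_check <- abs(tab_head$value) * tab_head$L
isTRUE(all.equal(area_check, tab_head$area, tolerance = 1e-6))
```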
Once we have the regions, we can build the required GRanges object.
library("GenomicRanges")
## Build GRanges with sequence lengths
regions <- GRanges(
    seqnames = tab$table$chr,
    IRanges(start = tab$table$start, end = tab$table$end),
    strand = "*", value = tab$table$value, area = tab$table$area,
    cluster = tab$table$cluster, L = tab$table$L, clusterL = tab$table$clusterL
)
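As an alternative sketch, GenomicRanges provides makeGRangesFromDataFrame(), which can build the same kind of object directly from a data frame that has chr, start and end columns. The toy data frame `bumps` below stands in for tab$table and copies two rows from the output above; with keep.extra.columns = TRUE, the remaining columns become metadata columns.

```r
library("GenomicRanges")

# Toy stand-in for tab$table (two rows copied from the printed output)
bumps <- data.frame(
    chr = c("chr1", "chr2"),
    start = c(2316, 451),
    end = c(2631, 551),
    value = c(-1.5814747, 1.5891293),
    area = c(15.8147473, 4.7673878)
)

# chr/start/end are detected by name; the other columns become metadata
regions_toy <- makeGRangesFromDataFrame(bumps, keep.extra.columns = TRUE)
regions_toy
```

Applied to the full tab$table, this approach would also carry over indexStart and indexEnd, which the explicit GRanges() call above leaves out.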
## Assign chr lengths
seqlengths(regions) <- seqlengths(
    getChromInfoFromUCSC("hg19", as.Seqinfo = TRUE)
)[
    names(seqlengths(regions))
]
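getChromInfoFromUCSC() normally retrieves the chromosome information from UCSC, which requires an internet connection. When working offline, one possible workaround is to supply the lengths by hand. The sketch below uses a small toy GRanges object and the hg19 lengths of chr1 and chr2; the names `gr_toy` and `hg19_lengths` are introduced here for illustration only.

```r
library("GenomicRanges")

# Toy regions on the two chromosomes used in this example
gr_toy <- GRanges(
    seqnames = c("chr1", "chr2"),
    ranges = IRanges(start = c(2316, 451), end = c(2631, 551))
)

# hg19 (GRCh37) chromosome lengths typed in by hand
hg19_lengths <- c(chr1 = 249250621L, chr2 = 243199373L)

# Index by name so the order matches the seqlevels of the object
seqlengths(gr_toy) <- hg19_lengths[names(seqlengths(gr_toy))]
seqlengths(gr_toy)
```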
## Explore the regions
regions
## GRanges object with 15 ranges and 5 metadata columns:
## seqnames ranges strand | value area cluster L
## <Rle> <IRanges> <Rle> | <numeric> <numeric> <numeric> <numeric>
## [1] chr1 2316-2631 * | -1.581475 15.81475 2 10
## [2] chr2 451-551 * | 1.589129 4.76739 3 3
## [3] chr1 456-526 * | 1.067883 3.20365 1 3
## [4] chr1 2176-2211 * | 0.784179 1.56836 2 2
## [5] chr1 2841 * | 1.201018 1.20102 2 1
## ... ... ... ... . ... ... ... ...
## [11] chr1 631 * | 0.618603 0.618603 1 1
## [12] chr1 1 * | 0.609609 0.609609 1 1
## [13] chr1 2911 * | -0.576423 0.576423 2 1
## [14] chr2 251 * | -0.556160 0.556160 3 1
## [15] chr1 2806 * | -0.521606 0.521606 2 1
## clusterL
## <integer>
## [1] 29
## [2] 20
## [3] 29
## [4] 29
## [5] 29
## ... ...
## [11] 29
## [12] 29
## [13] 29
## [14] 20
## [15] 29
## -------
## seqinfo: 2 sequences from an unspecified genome
Now that we have identified a set of differentially methylated regions, we can create the HTML report. Note that this report has less information than the DiffBind example because these results do not include a p-value variable, which is why pvalueVars and significantVar are set to NULL below.
## Load regionReport
library("regionReport")
## Now create the report
report <- renderReport(regions, "Example bumphunter",
    pvalueVars = NULL,
    densityVars = c(
        "Area" = "area", "Value" = "value",
        "Cluster Length" = "clusterL"
    ), significantVar = NULL,
    output = "bumphunter-example", outdir = "bumphunter-example",
    device = "png"
)
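densityVars is a named character vector: the values are metadata column names of the regions, and the names appear to serve as display labels in the report. A quick base R check, sketched below, can catch a misspelled column before rendering. The vector `available` is typed in by hand from the metadata columns printed above; in a live session it could instead be taken from colnames(mcols(regions)).

```r
# Labels (names) and metadata columns (values), as passed to renderReport()
density_vars <- c(
    "Area" = "area", "Value" = "value",
    "Cluster Length" = "clusterL"
)

# Metadata columns of `regions`, copied by hand from the printed object
available <- c("value", "area", "cluster", "L", "clusterL")

# Any requested variable that is not a metadata column
missing_vars <- setdiff(density_vars, available)
if (length(missing_vars) > 0) {
    stop("Not found in the regions: ", paste(missing_vars, collapse = ", "))
}
```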
You can view the final report at bumphunter-example/bumphunter-example.html.
In case the link does not work, a pre-compiled version of this document and its corresponding report are available at leekgroup.github.io/regionReportSupp/.
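After rendering, the report location can also be checked from R. The sketch below assumes the outdir and output values used in the renderReport() call above, and it only opens a browser in an interactive session.

```r
# Path implied by outdir = "bumphunter-example" and output = "bumphunter-example"
report_path <- file.path("bumphunter-example", "bumphunter-example.html")

if (file.exists(report_path)) {
    # Open the report only when running interactively
    if (interactive()) browseURL(report_path)
} else {
    message("Report not found at ", report_path, "; run renderReport() first.")
}
```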
## Date generated:
Sys.time()
## [1] "2024-12-12 21:49:36 UTC"
## Time spent making this page:
proc.time()
## user system elapsed
## 9.847 0.651 10.512
## R and packages info:
options(width = 120)
sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.4.2 (2024-10-31)
## os Ubuntu 24.04.1 LTS
## system x86_64, linux-gnu
## ui X11
## language en
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz UTC
## date 2024-12-12
## pandoc 3.5 @ /usr/bin/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-8 2024-09-12 [1] RSPM (R 4.4.0)
## AnnotationDbi 1.68.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## backports 1.5.0 2024-05-23 [1] RSPM (R 4.4.0)
## base64enc 0.1-3 2015-07-28 [2] RSPM (R 4.4.0)
## bibtex 0.5.1 2023-01-26 [1] RSPM (R 4.4.0)
## Biobase 2.66.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## BiocGenerics * 0.52.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## BiocIO 1.16.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## BiocManager 1.30.25 2024-08-28 [2] CRAN (R 4.4.2)
## BiocParallel 1.40.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## BiocStyle * 2.34.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## Biostrings 2.74.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## bit 4.5.0.1 2024-12-03 [1] RSPM (R 4.4.0)
## bit64 4.5.2 2024-09-22 [1] RSPM (R 4.4.0)
## bitops 1.0-9 2024-10-03 [1] RSPM (R 4.4.0)
## blob 1.2.4 2023-03-17 [1] RSPM (R 4.4.0)
## bookdown 0.41 2024-10-16 [1] RSPM (R 4.4.0)
## BSgenome 1.74.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## bslib 0.8.0 2024-07-29 [2] RSPM (R 4.4.0)
## bumphunter * 1.48.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## cachem 1.1.0 2024-05-16 [2] RSPM (R 4.4.0)
## checkmate 2.3.2 2024-07-29 [1] RSPM (R 4.4.0)
## cli 3.6.3 2024-06-21 [2] RSPM (R 4.4.0)
## cluster 2.1.8 2024-12-11 [3] RSPM (R 4.4.0)
## codetools 0.2-20 2024-03-31 [3] CRAN (R 4.4.2)
## colorspace 2.1-1 2024-07-26 [1] RSPM (R 4.4.0)
## crayon 1.5.3 2024-06-20 [2] RSPM (R 4.4.0)
## curl 6.0.1 2024-11-14 [2] RSPM (R 4.4.0)
## data.table 1.16.4 2024-12-06 [1] RSPM (R 4.4.0)
## DBI 1.2.3 2024-06-02 [1] RSPM (R 4.4.0)
## DEFormats 1.34.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## DelayedArray 0.32.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## derfinder 1.40.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## derfinderHelper 1.40.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## desc 1.4.3 2023-12-10 [2] RSPM (R 4.4.0)
## DESeq2 1.46.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## digest 0.6.37 2024-08-19 [2] RSPM (R 4.4.0)
## doRNG 1.8.6 2023-01-16 [1] RSPM (R 4.4.0)
## dplyr 1.1.4 2023-11-17 [1] RSPM (R 4.4.0)
## edgeR 4.4.1 2024-12-02 [1] Bioconductor 3.20 (R 4.4.2)
## evaluate 1.0.1 2024-10-10 [2] RSPM (R 4.4.0)
## fansi 1.0.6 2023-12-08 [2] RSPM (R 4.4.0)
## fastmap 1.2.0 2024-05-15 [2] RSPM (R 4.4.0)
## foreach * 1.5.2 2022-02-02 [1] RSPM (R 4.4.0)
## foreign 0.8-87 2024-06-26 [3] CRAN (R 4.4.2)
## Formula 1.2-5 2023-02-24 [1] RSPM (R 4.4.0)
## fs 1.6.5 2024-10-30 [2] RSPM (R 4.4.0)
## generics 0.1.3 2022-07-05 [1] RSPM (R 4.4.0)
## GenomeInfoDb * 1.42.1 2024-11-28 [1] Bioconductor 3.20 (R 4.4.2)
## GenomeInfoDbData 1.2.13 2024-12-10 [1] Bioconductor
## GenomicAlignments 1.42.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## GenomicFeatures 1.58.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## GenomicFiles 1.42.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## GenomicRanges * 1.58.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## ggplot2 3.5.1 2024-04-23 [1] RSPM (R 4.4.0)
## glue 1.8.0 2024-09-30 [2] RSPM (R 4.4.0)
## gridExtra 2.3 2017-09-09 [1] RSPM (R 4.4.0)
## gtable 0.3.6 2024-10-25 [1] RSPM (R 4.4.0)
## Hmisc 5.2-1 2024-12-02 [1] RSPM (R 4.4.0)
## htmlTable 2.4.3 2024-07-21 [1] RSPM (R 4.4.0)
## htmltools 0.5.8.1 2024-04-04 [2] RSPM (R 4.4.0)
## htmlwidgets 1.6.4 2023-12-06 [2] RSPM (R 4.4.0)
## httr 1.4.7 2023-08-15 [1] RSPM (R 4.4.0)
## IRanges * 2.40.1 2024-12-05 [1] Bioconductor 3.20 (R 4.4.2)
## iterators * 1.0.14 2022-02-05 [1] RSPM (R 4.4.0)
## jquerylib 0.1.4 2021-04-26 [2] RSPM (R 4.4.0)
## jsonlite 1.8.9 2024-09-20 [2] RSPM (R 4.4.0)
## KEGGREST 1.46.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## knitr 1.49 2024-11-08 [2] RSPM (R 4.4.0)
## knitrBootstrap 1.0.3 2024-02-06 [1] RSPM (R 4.4.0)
## lattice 0.22-6 2024-03-20 [3] CRAN (R 4.4.2)
## lifecycle 1.0.4 2023-11-07 [2] RSPM (R 4.4.0)
## limma 3.62.1 2024-11-03 [1] Bioconductor 3.20 (R 4.4.2)
## locfit * 1.5-9.10 2024-06-24 [1] RSPM (R 4.4.0)
## lubridate 1.9.4 2024-12-08 [1] RSPM (R 4.4.0)
## magrittr 2.0.3 2022-03-30 [2] RSPM (R 4.4.0)
## markdown 1.13 2024-06-04 [1] RSPM (R 4.4.0)
## Matrix 1.7-1 2024-10-18 [3] CRAN (R 4.4.2)
## MatrixGenerics 1.18.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## matrixStats 1.4.1 2024-09-08 [1] RSPM (R 4.4.0)
## memoise 2.0.1 2021-11-26 [2] RSPM (R 4.4.0)
## munsell 0.5.1 2024-04-01 [1] RSPM (R 4.4.0)
## nnet 7.3-19 2023-05-03 [3] CRAN (R 4.4.2)
## pillar 1.9.0 2023-03-22 [2] RSPM (R 4.4.0)
## pkgconfig 2.0.3 2019-09-22 [2] RSPM (R 4.4.0)
## pkgdown 2.1.1 2024-09-17 [2] RSPM (R 4.4.0)
## plyr 1.8.9 2023-10-02 [1] RSPM (R 4.4.0)
## png 0.1-8 2022-11-29 [1] RSPM (R 4.4.0)
## qvalue 2.38.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## R6 2.5.1 2021-08-19 [2] RSPM (R 4.4.0)
## ragg 1.3.3 2024-09-11 [2] RSPM (R 4.4.0)
## Rcpp 1.0.13-1 2024-11-02 [2] RSPM (R 4.4.0)
## RCurl 1.98-1.16 2024-07-11 [1] RSPM (R 4.4.0)
## RefManageR 1.4.0 2022-09-30 [1] RSPM (R 4.4.0)
## regionReport * 1.41.0 2024-12-12 [1] Bioconductor
## reshape2 1.4.4 2020-04-09 [1] RSPM (R 4.4.0)
## restfulr 0.0.15 2022-06-16 [1] RSPM (R 4.4.0)
## rjson 0.2.23 2024-09-16 [1] RSPM (R 4.4.0)
## rlang 1.1.4 2024-06-04 [2] RSPM (R 4.4.0)
## rmarkdown 2.29 2024-11-04 [2] RSPM (R 4.4.0)
## rngtools 1.5.2 2021-09-20 [1] RSPM (R 4.4.0)
## rpart 4.1.23 2023-12-05 [3] CRAN (R 4.4.2)
## Rsamtools 2.22.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## RSQLite 2.3.9 2024-12-03 [1] RSPM (R 4.4.0)
## rstudioapi 0.17.1 2024-10-22 [2] RSPM (R 4.4.0)
## rtracklayer 1.66.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## S4Arrays 1.6.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## S4Vectors * 0.44.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## sass 0.4.9 2024-03-15 [2] RSPM (R 4.4.0)
## scales 1.3.0 2023-11-28 [1] RSPM (R 4.4.0)
## sessioninfo 1.2.2 2021-12-06 [2] RSPM (R 4.4.0)
## SparseArray 1.6.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## statmod 1.5.0 2023-01-06 [1] RSPM (R 4.4.0)
## stringi 1.8.4 2024-05-06 [2] RSPM (R 4.4.0)
## stringr 1.5.1 2023-11-14 [2] RSPM (R 4.4.0)
## SummarizedExperiment 1.36.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## systemfonts 1.1.0 2024-05-15 [2] RSPM (R 4.4.0)
## textshaping 0.4.1 2024-12-06 [2] RSPM (R 4.4.0)
## tibble 3.2.1 2023-03-20 [2] RSPM (R 4.4.0)
## tidyselect 1.2.1 2024-03-11 [1] RSPM (R 4.4.0)
## timechange 0.3.0 2024-01-18 [1] RSPM (R 4.4.0)
## UCSC.utils 1.2.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## utf8 1.2.4 2023-10-22 [2] RSPM (R 4.4.0)
## VariantAnnotation 1.52.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## vctrs 0.6.5 2023-12-01 [2] RSPM (R 4.4.0)
## xfun 0.49 2024-10-31 [2] RSPM (R 4.4.0)
## XML 3.99-0.17 2024-06-25 [1] RSPM (R 4.4.0)
## xml2 1.3.6 2023-12-04 [2] RSPM (R 4.4.0)
## XVector 0.46.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
## yaml 2.3.10 2024-07-26 [2] RSPM (R 4.4.0)
## zlibbioc 1.52.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
##
## [1] /__w/_temp/Library
## [2] /usr/local/lib/R/site-library
## [3] /usr/local/lib/R/library
##
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────