This function uses the pre-computed mean coverage for a given SRA project to identify the expressed regions (ERs) for a given chromosome. It returns a GRanges-class object with the expressed regions as defined by findRegions.

expressed_regions(
  project,
  chr,
  cutoff,
  outdir = NULL,
  maxClusterGap = 300L,
  chrlen = NULL,
  verbose = TRUE,
  ...
)

Arguments

project

A character vector with one SRA study id.

chr

A character vector with the name of the chromosome.

cutoff

The base-pair level cutoff applied to the mean coverage to define the expressed regions.

outdir

The directory containing the file(s) previously downloaded with download_study. If outdir is specified but the files are missing, they will get downloaded first. By default outdir is set to NULL, in which case the data is read from the web. We only recommend downloading the full data if you will use it several times.

maxClusterGap

This determines the maximum gap, in base pairs, between candidate ERs for them to be grouped into the same cluster.

chrlen

The chromosome length in base pairs. If it's NULL, the chromosome length is extracted from the Rail-RNA runs GitHub repository. Alternatively, check the SciServer section in the vignette to see how to access all the recount data via an R Jupyter Notebook.

verbose

If TRUE, basic status updates will be printed along the way.

...

Additional arguments passed to download_study when outdir is specified but the required files are missing.
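
To make the roles of cutoff and maxClusterGap more concrete, here is a minimal base R sketch on a made-up coverage vector. It is not the findRegions implementation: it assumes an ER is a maximal run of bases with mean coverage strictly above the cutoff, and that two candidate ERs share a cluster when the gap between them is at most the maximum gap. The exact boundary conventions, and the names coverage, max_gap and ers, are assumptions for illustration only.

```r
## Toy mean coverage for 20 bases (made-up numbers, not recount data)
coverage <- c(0, 0, 6, 7, 8, 0, 0, 0, 9, 9, 0, 0, 0, 0, 0, 0, 0, 10, 12, 0)
cutoff <- 5
max_gap <- 4 # hypothetical stand-in for maxClusterGap

## Maximal runs of consecutive bases above the cutoff
runs <- rle(coverage > cutoff)
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
ers <- data.frame(start = starts[runs$values], end = ends[runs$values])

## Number of bases separating each candidate ER from the previous one
gaps <- ers$start[-1] - ers$end[-nrow(ers)] - 1

## Start a new cluster whenever the gap exceeds max_gap
ers$cluster <- cumsum(c(TRUE, gaps > max_gap))
ers
```

In this toy case the first two candidate regions are 3 bases apart and end up in the same cluster, while the third is 7 bases away and starts a new one.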

Value

A GRanges-class object as created by findRegions.
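
Because the result is a GRanges object, the usual GenomicRanges accessors apply to it. The sketch below uses a small hand-made GRanges as a stand-in for real output (the coordinates are invented, and the metadata columns that findRegions adds are omitted), so it only illustrates the kind of summaries you might compute.

```r
library("GenomicRanges")

## Toy stand-in for the output of expressed_regions() (made-up coordinates)
regions <- GRanges(
    seqnames = "chrY",
    ranges = IRanges(start = c(3, 9, 18), end = c(5, 10, 19))
)

n_regions <- length(regions) # number of ERs
er_widths <- width(regions) # ER lengths in base pairs
total_bp <- sum(er_widths) # total bases covered by ERs

## Keep only the ERs that are at least 3 bp long (arbitrary threshold)
long_regions <- regions[er_widths >= 3]
```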

Author

Leonardo Collado-Torres

Examples

## Define expressed regions for study SRP009615, chrY
if (.Platform$OS.type != "windows") {
    ## Reading BigWig files is not supported by rtracklayer on Windows
    regions <- expressed_regions("SRP009615", "chrY",
        cutoff = 5L,
        maxClusterGap = 3000L
    )
}
#> 2023-05-07 05:53:36.849574 loadCoverage: loading BigWig file http://duffel.rail.bio/recount/SRP009615/bw/mean_SRP009615.bw
#> 2023-05-07 05:53:40.580696 loadCoverage: applying the cutoff to the merged data
#> 2023-05-07 05:53:40.993329 filterData: originally there were 57227415 rows, now there are 57227415 rows. Meaning that 0 percent was filtered.
#> 2023-05-07 05:53:40.995705 findRegions: identifying potential segments
#> 2023-05-07 05:53:40.999846 findRegions: segmenting information
#> 2023-05-07 05:53:41.000176 .getSegmentsRle: segmenting with cutoff(s) 5
#> 2023-05-07 05:53:41.044234 findRegions: identifying candidate regions
#> 2023-05-07 05:53:41.283391 findRegions: identifying region clusters
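if (FALSE) {
## Define the regions for multiple chrs
## (chrs was not defined in the original example; this set is illustrative)
chrs <- c("chrX", "chrY")
## lapply() always returns a list, one GRanges per chromosome
regs <- lapply(chrs, expressed_regions, project = "SRP009615", cutoff = 5L)
names(regs) <- chrs

## You can then combine them into a single GRanges object if you want to
library("GenomicRanges")
single <- unlist(GRangesList(regs))
}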