Compute read counts — read_counts • recount

As described in the recount workflow, the counts provided by the recount2 project are base-pair counts. You can scale them using scale_counts or compute the read counts using the area under coverage information (AUC). We use the AUC because Rail-RNA soft clips some reads.

read_counts(rse, use_paired_end = TRUE, round = FALSE)

Arguments

rse: A RangedSummarizedExperiment-class object as downloaded with download_study.
use_paired_end: A logical vector. When TRUE it uses the paired-end flag (colData(rse)$paired_end) to determine whether the sample is paired-end or not. If it's paired-end, then it divides the counts by 2 to return paired-end read counts instead of fragment counts. When FALSE, this information is ignored.
round: Whether to round the counts to integers or not.

Value

Returns a RangedSummarizedExperiment-class object with the read counts.

Details

Check the recount workflow for details about the counts provided by the recount2 project. Note that this function should not be used after scale_counts or it will return non-sensical results.

References

Collado-Torres L, Nellore A and Jaffe AE. recount workflow: Accessing over 70,000 human RNA-seq samples with Bioconductor version 1; referees: 1 approved, 2 approved with reservations. F1000Research 2017, 6:1558 doi: 10.12688/f1000research.12223.1.

See also

Author

Leonardo Collado-Torres

Examples


## Difference between read counts and reads downloaded by Rail-RNA
colSums(assays(read_counts(rse_gene_SRP009615))$counts) / 1e6 -
    colData(rse_gene_SRP009615)$reads_downloaded / 1e6
#>  SRR387777  SRR387778  SRR387779  SRR387780  SRR389077  SRR389078  SRR389079 
#>  -8.839994 -12.892541  -7.536284  -6.497571 -13.239233 -15.816168 -14.775568 
#>  SRR389080  SRR389081  SRR389082  SRR389083  SRR389084 
#>  -6.274001  -5.054220  -2.328694  -5.265160  -4.421728 

## Paired-end read counts vs fragment counts (single-end counts)
download_study("DRP000499")
#> 2024-12-12 22:15:07.486836 downloading file rse_gene.Rdata to DRP000499
load("DRP000499/rse_gene.Rdata")
colSums(assays(read_counts(rse_gene, use_paired_end = FALSE))$counts) /
    colSums(assays(read_counts(rse_gene))$counts)
#> DRR001622 DRR001625 DRR001628 DRR001631 DRR001634 DRR001637 DRR001623 DRR001626 
#>         2         2         2         2         2         2         2         2 
#> DRR001629 DRR001632 DRR001635 DRR001638 DRR001642 DRR001624 DRR001627 DRR001630 
#>         2         2         2         2         2         2         2         2 
#> DRR001633 DRR001636 DRR001639 DRR001640 DRR001641 
#>         2         2         2         2         2 

## Difference between paired-end read counts vs paired-end reads downloaded
colSums(assays(read_counts(rse_gene))$counts) / 1e6 -
    colData(rse_gene)$reads_downloaded / 1e6 / 2
#>   DRR001622   DRR001625   DRR001628   DRR001631   DRR001634   DRR001637 
#>  -2.0348249  -0.6138274 -53.1563756  -9.8438273  -8.4707991  -7.3001527 
#>   DRR001623   DRR001626   DRR001629   DRR001632   DRR001635   DRR001638 
#>  -1.0163589  -0.9350952 -54.8523133 -15.8955635  -8.8314352  -8.0629589 
#>   DRR001642   DRR001624   DRR001627   DRR001630   DRR001633   DRR001636 
#>  -7.8494200  -1.5618029  -1.2820004 -16.5891652  -8.5003669 -12.5271479 
#>   DRR001639   DRR001640   DRR001641 
#> -15.3775958  -9.8064514  -7.3778134