This function appends sample metadata to a RangedSummarizedExperiment-class object from the recount2 project. The sample metadata comes from curation efforts that are independent of the original recount2 project. Currently, the only available source is the recount_brain project, described in more detail at http://lieberinstitute.github.io/recount-brain/.
add_metadata(rse, source = "recount_brain_v2", is_tcga = FALSE, verbose = TRUE)
rse: A RangedSummarizedExperiment-class object as downloaded with download_study. If this argument is not specified, the function returns the raw metadata table.
source: A valid source name. The only options currently supported are recount_brain_v1 and recount_brain_v2.
is_tcga: Set to TRUE only when rse is from TCGA. Otherwise set to FALSE (default).
verbose: If TRUE, a message is printed stating where the metadata file is being downloaded to.
A RangedSummarizedExperiment-class object with the sample metadata columns appended to the colData() slot.
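To illustrate what "appended" means here, the following is a minimal base R sketch, not the package's actual implementation. It assumes samples are matched to the curated table by run accession, and that samples absent from the curated table keep their row and receive NA in the new columns, which is consistent with the example output further down. The objects toy_coldata and toy_meta, and the column name run_id, are hypothetical stand-ins.

```r
## Toy stand-in for colData(rse): one row per sample, keyed by run accession
toy_coldata <- data.frame(
    run = c("SRR0000001", "SRR0000002", "SRR0000003"),
    auc = c(100, 200, 300),
    stringsAsFactors = FALSE
)

## Toy stand-in for a curated metadata table (hypothetical columns)
toy_meta <- data.frame(
    run_id = c("SRR0000002", "SRR0000009"),
    sex = c("female", "male"),
    age = c(45, 60),
    stringsAsFactors = FALSE
)

## Locate each sample in the curated table; unmatched samples give NA
map <- match(toy_coldata$run, toy_meta$run_id)
message("found ", sum(!is.na(map)), " out of ", length(map), " samples")

## Indexing by the map keeps the sample order and fills gaps with NA
appended <- cbind(
    toy_coldata,
    toy_meta[map, setdiff(colnames(toy_meta), "run_id"), drop = FALSE]
)
rownames(appended) <- NULL
appended
```

The number of rows never changes in this sketch: only the second toy sample gains curated values, while the other two are padded with NA.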
For source = "recount_brain_v1" and source = "recount_brain_v2", the metadata columns are described at http://lieberinstitute.github.io/recount-brain/. Alternatively, you can explore recount_brain_v2 interactively at https://jhubiostatistics.shinyapps.io/recount-brain/.
If you use the recount_brain data, please cite Razmara et al., bioRxiv, 2019: https://www.biorxiv.org/content/10.1101/618025v1. A bib file is available via citation('recount').
Razmara et al., bioRxiv, 2019. https://www.biorxiv.org/content/10.1101/618025v1
## Load the package that provides add_metadata() and the example data
library("recount")
## Add the sample metadata to an example rse_gene object
rse_gene <- add_metadata(rse_gene_SRP009615, "recount_brain_v2")
#> 2024-05-21 17:45:27.937509 downloading the recount_brain metadata to /tmp/RtmpJLggZ6/recount_brain_v2.Rdata
#> Loading objects:
#> recount_brain
#> 2024-05-21 17:45:28.540394 found 0 out of 12 samples in the recount_brain metadata
## Explore the metadata
colData(rse_gene)
#> DataFrame with 12 rows and 85 columns
#> project sample experiment run
#> <character> <character> <character> <character>
#> SRR387777 SRP009615 SRS281685 SRX110461 SRR387777
#> SRR387778 SRP009615 SRS281686 SRX110462 SRR387778
#> SRR387779 SRP009615 SRS281687 SRX110463 SRR387779
#> SRR387780 SRP009615 SRS281688 SRX110464 SRR387780
#> SRR389077 SRP009615 SRS282369 SRX111299 SRR389077
#> ... ... ... ... ...
#> SRR389080 SRP009615 SRS282372 SRX111302 SRR389080
#> SRR389081 SRP009615 SRS282373 SRX111303 SRR389081
#> SRR389082 SRP009615 SRS282374 SRX111304 SRR389082
#> SRR389083 SRP009615 SRS282375 SRX111305 SRR389083
#> SRR389084 SRP009615 SRS282376 SRX111306 SRR389084
#> read_count_as_reported_by_sra reads_downloaded
#> <integer> <integer>
#> SRR387777 30631853 30631853
#> SRR387778 37001306 37001306
#> SRR387779 40552001 40552001
#> SRR387780 32466352 32466352
#> SRR389077 27819603 27819603
#> ... ... ...
#> SRR389080 34856203 34856203
#> SRR389081 23351679 23351679
#> SRR389082 18144828 18144828
#> SRR389083 24417368 24417368
#> SRR389084 23060084 23060084
#> proportion_of_reads_reported_by_sra_downloaded paired_end
#> <numeric> <logical>
#> SRR387777 1 FALSE
#> SRR387778 1 FALSE
#> SRR387779 1 FALSE
#> SRR387780 1 FALSE
#> SRR389077 1 FALSE
#> ... ... ...
#> SRR389080 1 FALSE
#> SRR389081 1 FALSE
#> SRR389082 1 FALSE
#> SRR389083 1 FALSE
#> SRR389084 1 FALSE
#> sra_misreported_paired_end mapped_read_count auc
#> <logical> <integer> <numeric>
#> SRR387777 FALSE 28798572 1029494445
#> SRR387778 FALSE 33170281 1184877985
#> SRR387779 FALSE 37322762 1336528969
#> SRR387780 FALSE 29970735 1073178116
#> SRR389077 FALSE 24966859 893978355
#> ... ... ... ...
#> SRR389080 FALSE 32469994 1163527939
#> SRR389081 FALSE 21904197 781685955
#> SRR389082 FALSE 17199795 616048853
#> SRR389083 FALSE 22499386 806323346
#> SRR389084 FALSE 21957003 787795710
#> sharq_beta_tissue sharq_beta_cell_type biosample_submission_date
#> <character> <character> <character>
#> SRR387777 blood k562 2011-12-05T15:40:03...
#> SRR387778 blood k562 2011-12-05T15:40:03...
#> SRR387779 blood k562 2011-12-05T15:40:03...
#> SRR387780 blood k562 2011-12-05T15:40:03...
#> SRR389077 blood k562 2011-12-13T11:26:05...
#> ... ... ... ...
#> SRR389080 blood k562 2011-12-13T11:26:05...
#> SRR389081 blood k562 2011-12-13T11:26:05...
#> SRR389082 blood k562 2011-12-13T11:26:05...
#> SRR389083 blood k562 2011-12-13T11:26:05...
#> SRR389084 blood k562 2011-12-13T11:26:05...
#> biosample_publication_date biosample_update_date avg_read_length
#> <character> <character> <integer>
#> SRR387777 2011-12-07T09:29:59... 2014-08-27T04:18:20... 36
#> SRR387778 2011-12-07T09:29:59... 2014-08-27T04:18:21... 36
#> SRR387779 2011-12-07T09:29:59... 2014-08-27T04:18:21... 36
#> SRR387780 2011-12-07T09:29:59... 2014-08-27T04:18:22... 36
#> SRR389077 2011-12-13T11:26:06... 2014-08-27T04:22:14... 36
#> ... ... ... ...
#> SRR389080 2011-12-13T11:26:06... 2014-08-27T04:22:15... 36
#> SRR389081 2011-12-13T11:26:06... 2014-08-27T04:22:16... 36
#> SRR389082 2011-12-13T11:26:06... 2014-08-27T04:22:16... 36
#> SRR389083 2011-12-13T11:26:06... 2014-08-27T04:22:17... 36
#> SRR389084 2011-12-13T11:26:06... 2014-08-27T04:22:17... 36
#> geo_accession bigwig_file title
#> <character> <character> <character>
#> SRR387777 GSM836270 SRR387777.bw K562 cells with shRN..
#> SRR387778 GSM836271 SRR387778.bw K562 cells with shRN..
#> SRR387779 GSM836272 SRR387779.bw K562 cells with shRN..
#> SRR387780 GSM836273 SRR387780.bw K562 cells with shRN..
#> SRR389077 GSM847561 SRR389077.bw K562 cells with shRN..
#> ... ... ... ...
#> SRR389080 GSM847564 SRR389080.bw K562 cells with shRN..
#> SRR389081 GSM847565 SRR389081.bw K562 cells with shRN..
#> SRR389082 GSM847566 SRR389082.bw K562 cells with shRN..
#> SRR389083 GSM847567 SRR389083.bw K562 cells with shRN..
#> SRR389084 GSM847568 SRR389084.bw K562 cells with shRN..
#> characteristics
#> <CharacterList>
#> SRR387777 cells: K562,shRNA expression: no,treatment: Puromycin
#> SRR387778 cells: K562,shRNA expression: ye..,treatment: Puromycin..
#> SRR387779 cells: K562,shRNA expression: no,treatment: Puromycin
#> SRR387780 cells: K562,shRNA expression: ye..,treatment: Puromycin..
#> SRR389077 cell line: K562,shRNA expression: no..,treatment: Puromycin
#> ... ...
#> SRR389080 cell line: K562,shRNA expression: ex..,treatment: Puromycin..
#> SRR389081 cell line: K562,shRNA expression: no..,treatment: Puromycin
#> SRR389082 cell line: K562,shRNA expression: ex..,treatment: Puromycin..
#> SRR389083 cell line: K562,shRNA expression: no..,treatment: Puromycin
#> SRR389084 cell line: K562,shRNA expression: ex..,treatment: Puromycin..
#> age age_units assay_type_s avgspotlen_l bioproject_s
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> biosample_s brain_bank brodmann_area cell_line center_name_s
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> clinical_stage_1 clinical_stage_2 consent_s development disease
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> disease_status experiment_s hemisphere insertsize_l instrument_s
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> library_name_s librarylayout_s libraryselection_s librarysource_s
#> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA
#> SRR387778 NA NA NA NA
#> SRR387779 NA NA NA NA
#> SRR387780 NA NA NA NA
#> SRR389077 NA NA NA NA
#> ... ... ... ... ...
#> SRR389080 NA NA NA NA
#> SRR389081 NA NA NA NA
#> SRR389082 NA NA NA NA
#> SRR389083 NA NA NA NA
#> SRR389084 NA NA NA NA
#> loaddate_s mbases_l mbytes_l organism_s pathology platform_s
#> <logical> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA NA
#> SRR387778 NA NA NA NA NA NA
#> SRR387779 NA NA NA NA NA NA
#> SRR387780 NA NA NA NA NA NA
#> SRR389077 NA NA NA NA NA NA
#> ... ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA NA
#> SRR389081 NA NA NA NA NA NA
#> SRR389082 NA NA NA NA NA NA
#> SRR389083 NA NA NA NA NA NA
#> SRR389084 NA NA NA NA NA NA
#> pmi pmi_units preparation present_in_recount race
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> releasedate_s rin sample_name_s sample_origin sex
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> sra_sample_s sra_study_s tissue_site_1 tissue_site_2 tissue_site_3
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> tumor_type viability Study_full drugName_full drug_info_full
#> <logical> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA NA
#> SRR387778 NA NA NA NA NA
#> SRR387779 NA NA NA NA NA
#> SRR387780 NA NA NA NA NA
#> SRR389077 NA NA NA NA NA
#> ... ... ... ... ... ...
#> SRR389080 NA NA NA NA NA
#> SRR389081 NA NA NA NA NA
#> SRR389082 NA NA NA NA NA
#> SRR389083 NA NA NA NA NA
#> SRR389084 NA NA NA NA NA
#> drug_type_full full_260_280 count_file_identifier Dataset
#> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA
#> SRR387778 NA NA NA NA
#> SRR387779 NA NA NA NA
#> SRR387780 NA NA NA NA
#> SRR389077 NA NA NA NA
#> ... ... ... ... ...
#> SRR389080 NA NA NA NA
#> SRR389081 NA NA NA NA
#> SRR389082 NA NA NA NA
#> SRR389083 NA NA NA NA
#> SRR389084 NA NA NA NA
#> brodmann_ontology brodmann_synonyms brodmann_parents
#> <logical> <logical> <logical>
#> SRR387777 NA NA NA
#> SRR387778 NA NA NA
#> SRR387779 NA NA NA
#> SRR387780 NA NA NA
#> SRR389077 NA NA NA
#> ... ... ... ...
#> SRR389080 NA NA NA
#> SRR389081 NA NA NA
#> SRR389082 NA NA NA
#> SRR389083 NA NA NA
#> SRR389084 NA NA NA
#> brodmann_parents_label disease_ontology tissue tissue_ontology
#> <logical> <logical> <logical> <logical>
#> SRR387777 NA NA NA NA
#> SRR387778 NA NA NA NA
#> SRR387779 NA NA NA NA
#> SRR387780 NA NA NA NA
#> SRR389077 NA NA NA NA
#> ... ... ... ... ...
#> SRR389080 NA NA NA NA
#> SRR389081 NA NA NA NA
#> SRR389082 NA NA NA NA
#> SRR389083 NA NA NA NA
#> SRR389084 NA NA NA NA
#> tissue_synonyms tissue_parents tissue_parents_label
#> <logical> <logical> <logical>
#> SRR387777 NA NA NA
#> SRR387778 NA NA NA
#> SRR387779 NA NA NA
#> SRR387780 NA NA NA
#> SRR389077 NA NA NA
#> ... ... ... ...
#> SRR389080 NA NA NA
#> SRR389081 NA NA NA
#> SRR389082 NA NA NA
#> SRR389083 NA NA NA
#> SRR389084 NA NA NA
## For a list of studies present in recount_brain check
## http://lieberinstitute.github.io/recount-brain/.
## recount_brain_v2 includes GTEx and TCGA brain samples in addition to the
## recount_brain_v1 data, plus ontology information.
## Obtain all the recount_brain_v2 metadata if you want to
## explore the metadata manually
recount_brain_v2 <- add_metadata(source = "recount_brain_v2")
#> 2024-05-21 17:45:28.620153 downloading the recount_brain metadata to /tmp/RtmpJLggZ6/recount_brain_v2.Rdata
#> Loading objects:
#> recount_brain
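In the first example above, the message reports that 0 out of 12 samples were found, and every appended column is NA. A quick way to check which samples actually gained curated metadata is to look for rows with at least one non-NA value among the appended columns. The sketch below uses a hypothetical plain data.frame (toy) instead of a real colData() object, and the choice of added_cols is an assumption made for illustration.

```r
## Toy table standing in for the part of colData() of interest here
toy <- data.frame(
    run = c("SRR0000001", "SRR0000002", "SRR0000003"),
    sex = c(NA, "female", NA),
    age = c(NA, 45, NA),
    stringsAsFactors = FALSE
)

## Columns that were appended from the curated metadata (hypothetical)
added_cols <- c("sex", "age")

## TRUE for samples with at least one non-NA appended value
has_metadata <- rowSums(!is.na(toy[, added_cols, drop = FALSE])) > 0

## Keep only the samples that gained curated metadata
matched <- toy[has_metadata, , drop = FALSE]
matched
```

With real data, the same logical vector could be used to subset the columns of the RangedSummarizedExperiment object, since its samples are stored as columns.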