Subsetting enrichment object #17

Closed
mdozmorov opened this issue Dec 22, 2018 · 6 comments

Comments

@mdozmorov

Hi Guangchuang,

The cnetplot is a really attractive feature. I wonder if it is possible to subset the enrichment object that goes into the cnetplot function?

Use case: run an enrichment analysis with pvalueCutoff = 1 to see all results. Plotting them all would be infeasible. How can I subset the enrichment object to, say, the ten most significant terms, and then plot it with cnetplot?

Thanks,
Mikhail

@GuangchuangYu
Member

> require(DOSE)
Loading required package: DOSE
DOSE v3.8.0  For help: https://guangchuangyu.github.io/DOSE

If you use DOSE in published research, please cite:
Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis. Bioinformatics 2015, 31(4):608-609

> data(geneList)
> de = names(geneList)[1:100]
> x = enrichDO(de, qvalueCutoff=1, pvalueCutoff=1)
> dim(x)
[1] 383   9
> head(x, 2)
                       ID            Description GeneRatio  BgRatio
DOID:0060071 DOID:0060071 pre-malignant neoplasm      5/77  22/8007
DOID:5295       DOID:5295     intestinal disease      9/77 157/8007
                   pvalue     p.adjust       qvalue
DOID:0060071 1.671524e-06 0.0006401937 0.0004609887
DOID:5295    1.759049e-05 0.0027885022 0.0020079362
                                                  geneID Count
DOID:0060071                    6280/6278/10232/332/4321     5
DOID:5295    4312/6279/3627/10563/4283/890/366/4902/3620     9
> y = x[x$pvalue < 0.05, ]
> dim(y)
[1] 121   9
> tail(y, 2)
                 ID                    Description GeneRatio  BgRatio
DOID:7474 DOID:7474 malignant pleural mesothelioma      2/77  36/8007
DOID:8692 DOID:8692               myeloid leukemia      4/77 142/8007
              pvalue  p.adjust    qvalue             geneID Count
DOID:7474 0.04659310 0.1487096 0.1070824          10232/332     2
DOID:8692 0.04748286 0.1502970 0.1082254 820/10232/332/3620     4
> tail(x, 2)
                 ID     Description GeneRatio  BgRatio    pvalue  p.adjust
DOID:5679 DOID:5679 retinal disease      1/77 358/8007 0.9709661 0.9735079
DOID:936   DOID:936   brain disease      1/77 454/8007 0.9890750 0.9890750
             qvalue geneID Count
DOID:5679 0.7010006   4312     1
DOID:936  0.7122101   3627     1

@GuangchuangYu
Member

Oops, with plain subsetting like this, y will be a data.frame rather than an enrichResult object.

@GuangchuangYu
Member

An asis parameter has been introduced, with FALSE as the default value (DOSE >= 3.9.1). Setting asis = TRUE keeps the enrichResult class when subsetting.

> y = x[x$pvalue < 0.05, asis=T]
> class(y)
[1] "enrichResult"
attr(,"package")
[1] "DOSE"
> y
#
# over-representation test
#
#...@organism    Homo sapiens
#...@ontology    DO
#...@keytype     ENTREZID
#...@gene        chr [1:100] "4312" "8318" "10874" "55143" "55388" "991" "6280" "2305" ...
#...pvalues adjusted by 'BH' with cutoff <1
#...121 enriched terms found
'data.frame':   121 obs. of  9 variables:
 $ ID         : chr  "DOID:0060071" "DOID:5295" "DOID:8719" "DOID:3007" ...
 $ Description: chr  "pre-malignant neoplasm" "intestinal disease" "in situ carcinoma" "breast ductal carcinoma" ...
 $ GeneRatio  : chr  "5/77" "9/77" "4/77" "4/77" ...
 $ BgRatio    : chr  "22/8007" "157/8007" "18/8007" "29/8007" ...
 $ pvalue     : num  1.67e-06 1.76e-05 2.18e-05 1.56e-04 2.08e-04 ...
 $ p.adjust   : num  0.00064 0.00279 0.00279 0.0136 0.0136 ...
 $ qvalue     : num  0.000461 0.002008 0.002008 0.009796 0.009796 ...
 $ geneID     : chr  "6280/6278/10232/332/4321" "4312/6279/3627/10563/4283/890/366/4902/3620" "6280/6278/10232/332" "6280/6279/4751/6286" ...
 $ Count      : int  5 9 4 4 13 6 13 5 5 6 ...
#...Citation
  Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an
  R/Bioconductor package for Disease Ontology Semantic and Enrichment
  analysis. Bioinformatics 2015, 31(4):608-609
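
To illustrate the idea behind asis, here is a minimal, self-contained sketch using a toy S4 class. The class name ToyEnrich, its slots, and the example data are all hypothetical; this is not DOSE's actual enrichResult implementation, only one way such a `[` method could behave.

```r
library(methods)

# "ToyEnrich" is a hypothetical stand-in, not DOSE's enrichResult class.
setClass("ToyEnrich",
         representation(result = "data.frame", ontology = "character"))

# By default `[` returns the subsetted data.frame; with asis = TRUE it
# returns the original object with only its result table replaced.
setMethod("[", signature(x = "ToyEnrich", i = "ANY", j = "ANY"),
          function(x, i, j, asis = FALSE, ...) {
              y <- x@result[i, j, ...]
              if (!asis) return(y)
              x@result <- y
              x
          })

# Made-up example data, for illustration only.
toy <- new("ToyEnrich",
           result = data.frame(ID = c("T1", "T2", "T3"),
                               pvalue = c(0.001, 0.02, 0.9),
                               stringsAsFactors = FALSE),
           ontology = "DO")

keep <- toy@result$pvalue < 0.05
df_only    <- toy[keep, ]              # plain data.frame, metadata dropped
same_class <- toy[keep, asis = TRUE]   # still a ToyEnrich, metadata kept
```

For the ten-most-significant-terms case from the question, the analogous call on a real enrichResult would presumably be x[1:10, asis = TRUE], since the result table shown above is ordered by increasing p-value. The subsetted object could then be passed to cnetplot.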

@mdozmorov
Author

Yes, I was missing the asis = T part. I tried assigning the class, and something like as.enrichResult, but nothing worked. Excellent, thanks, @GuangchuangYu

@GuangchuangYu
Member

Another solution is to use dplyr verbs.

@mdozmorov
Author

Thanks, @GuangchuangYu, the https://github.com/YuLab-SMU/clusterProfiler.dplyr functionality is a great addition! On my list to try for the next analysis.

GuangchuangYu pushed a commit that referenced this issue Dec 5, 2021