Merge pull request #539 from maxulysse/dev_annotation
Improving annotation
FriederikeHanssen committed May 12, 2022
2 parents b76d09d + 42509e6 commit 818e8e6
Showing 24 changed files with 339 additions and 333 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -47,6 +47,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#511](https://github.com/nf-core/sarek/pull/511) - Sync `TEMPLATE` with `tools` `2.3.2`
- [#520](https://github.com/nf-core/sarek/pull/520) - Improve annotation subworkflows
- [#537](https://github.com/nf-core/sarek/pull/537) - Update workflow figure
- [#539](https://github.com/nf-core/sarek/pull/539) - Update `CITATIONS.md`

### Fixed

@@ -78,6 +79,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#485](https://github.com/nf-core/sarek/pull/485) - `--skip_qc`, `--skip_markduplicates` and `--skip_bqsr` are now `--skip_tools`
- [#538](https://github.com/nf-core/sarek/pull/538) - `--sequencing_center` is now `--seq_center`
- [#538](https://github.com/nf-core/sarek/pull/538) - `--markdup_java_options` has been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `--annotate_tools` has been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `--cadd_cache`, `--cadd_indels`, `--cadd_indels_tbi`, `--cadd_wg_snvs`, `--cadd_wg_snvs_tbi` have been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `--genesplicer` has been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `conf/genomes.config` and `params.genomes_base` have been removed

## [2.7.1](https://github.com/nf-core/sarek/releases/tag/2.7.1) - Pårtejekna

17 changes: 17 additions & 0 deletions CITATIONS.md
@@ -71,8 +71,23 @@
> Danecek P, Auton A, Abecasis G, et al.: The variant call format and VCFtools. Bioinformatics. 2011 Aug 1;27(15):2156-8. doi: 10.1093/bioinformatics/btr330. Epub 2011 Jun 7. PubMed PMID: 21653522; PubMed Central PMCID: PMC3137218.
- [VEP](https://pubmed.ncbi.nlm.nih.gov/27268795/)

> McLaren W, Gil L, Hunt SE, et al.: The Ensembl Variant Effect Predictor. Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4. PubMed PMID: 27268795; PubMed Central PMCID: PMC4893825.
- [dbNSFP](https://pubmed.ncbi.nlm.nih.gov/33261662/)

> Liu X, et al.: dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Med. 2020 Dec 2;12(1):103. doi: 10.1186/s13073-020-00803-9. PubMed PMID: 33261662; PubMed Central PMCID: PMC7709417.
- [LOFTEE](https://pubmed.ncbi.nlm.nih.gov/32461654/)

> Karczewski KJ, et al.: The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020 May;581(7809):434-443. doi: 10.1038/s41586-020-2308-7. PubMed PMID: 32461654; PubMed Central PMCID: PMC7334197.
- [SpliceAI](https://pubmed.ncbi.nlm.nih.gov/30661751/)

> Jaganathan K, et al.: Predicting Splicing from Primary Sequence with Deep Learning. Cell. 2019 Jan 24;176(3):535-548.e24. doi: 10.1016/j.cell.2018.12.015. PubMed PMID: 30661751.
- [SpliceRegion](https://github.com/Ensembl/VEP_plugins/blob/release/106/SpliceRegion.pm)

## R packages

- [R](https://www.R-project.org/)
@@ -88,6 +103,7 @@
> Trevor L Davis (2018). optparse: Command Line Option Parser.
- [RColorBrewer](https://CRAN.R-project.org/package=RColorBrewer)

> Erich Neuwirth (2014). RColorBrewer: ColorBrewer Palettes.
## Software packaging/containerisation tools
@@ -107,4 +123,5 @@
- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
6 changes: 4 additions & 2 deletions assets/email_template.html
@@ -4,8 +4,10 @@
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1" />

<!-- prettier-ignore -->
<meta name="description" content="nf-core/sarek: An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing" />
<meta
name="description"
content="nf-core/sarek: An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing"
/>
<title>nf-core/sarek Pipeline Report</title>
</head>
<body>
36 changes: 0 additions & 36 deletions conf/genomes.config

This file was deleted.

12 changes: 8 additions & 4 deletions conf/igenomes.config
@@ -28,7 +28,8 @@ params {
known_indels_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/{1000G_phase1,Mills_and_1000G_gold_standard}.indels.b37.vcf.idx"
mappability = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/Control-FREEC/out100m2_hg19.gem"
snpeff_db = 'GRCh37.75'
vep_cache_version = '104'
snpeff_genome = 'GRCh37'
vep_cache_version = 104
vep_genome = 'GRCh37'
vep_species = 'homo_sapiens'
}
@@ -51,7 +52,8 @@ params {
known_indels_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/{Mills_and_1000G_gold_standard.indels.hg38,beta/Homo_sapiens_assembly38.known_indels}.vcf.gz.tbi"
mappability = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/Control-FREEC/out100m2_hg38.gem"
snpeff_db = 'GRCh38.99'
vep_cache_version = '104'
snpeff_genome = 'GRCh38'
vep_cache_version = 104
vep_genome = 'GRCh38'
vep_species = 'homo_sapiens'
}
@@ -78,7 +80,8 @@ params {
mappability = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Control-FREEC/GRCm38_68_mm10.gem"
readme = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/README.txt"
snpeff_db = 'GRCm38.99'
vep_cache_version = '102'
snpeff_genome = 'GRCm38'
vep_cache_version = 102
vep_genome = 'GRCm38'
vep_species = 'mus_musculus'
}
@@ -101,7 +104,8 @@ params {
bwa = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/BWAIndex/version0.6.0/"
fasta = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/WholeGenomeFasta/genome.fa"
snpeff_db = 'WBcel235.99'
vep_cache_version = '104'
snpeff_genome = 'WBcel235'
vep_cache_version = 104
vep_genome = 'WBcel235'
vep_species = 'caenorhabditis_elegans'
}
68 changes: 28 additions & 40 deletions conf/modules.config
@@ -963,24 +963,32 @@ process{
withName: 'VCFTOOLS_SUMMARY'{
ext.args = "--FILTER-summary"
}
}

// ANNOTATE
process {

withName: 'ENSEMBLVEP' {
ext.args = '--everything --filter_common --per_gene --total_length --offline'
container = { "nfcore/vep:104.3.${params.genome}" }
ext.args = [
'--everything --filter_common --per_gene --total_length --offline',
(params.vep_dbnsfp && params.dbnsfp) ? '--plugin dbNSFP,dbNSFP.gz,rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF' : '',
(params.vep_loftee) ? '--plugin LoF,loftee_path:/opt/conda/envs/nf-core-vep-104.3/share/ensembl-vep-104.3-0' : '',
(params.vep_spliceai && params.spliceai_snv && params.spliceai_indel) ? '--plugin SpliceAI,snv=spliceai_scores.raw.snv.hg38.vcf.gz,indel=spliceai_scores.raw.indel.hg38.vcf.gz' : '',
(params.vep_spliceregion) ? '--plugin SpliceRegion' : ''
].join(' ').trim()
if (!params.vep_cache) container = { params.vep_genome ? "nfcore/vep:104.3.${params.vep_genome}" : "nfcore/vep:104.3.${params.genome}" }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/reports/EnsemblVEP/${meta.id}/${meta.variantcaller}" },
pattern: "*html"
]
}

withName: ".*:ANNOTATION_MERGE:ENSEMBLVEP" {
ext.prefix = {"${meta.id}_snpEff"}
}

withName: 'SNPEFF' {
ext.args = '-nodownload -canon -v'
container = { "nfcore/snpeff:5.0.${params.genome}" }
if (!params.snpeff_cache) container = { params.snpeff_genome ? "nfcore/snpeff:5.0.${params.snpeff_genome}" : "nfcore/snpeff:5.0.${params.genome}" }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/reports/SnpEff/${meta.id}/${meta.variantcaller}" },
@@ -989,56 +997,36 @@ process {
]
}

withName: 'ANNOTATION_BGZIPTABIX' {
withName: "NFCORE_SAREK:SAREK:ANNOTATE:.*:TABIX_BGZIPTABIX" {
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/annotation/${meta.id}/${meta.variantcaller}" },
pattern: "*{gz,gz.tbi}"
]
}
}

if (params.tools && (params.tools.contains('snpeff') || params.tools.contains('merge'))) {
process {
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_SNPEFF:ANNOTATION_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff.ann.vcf"}
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/annotation/${meta.id}/${meta.variantcaller}" },
pattern: "*{gz,gz.tbi}",
saveAs: { params.tools.contains('snpeff') ? it : null }
]
}
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_SNPEFF:TABIX_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff.ann.vcf"}
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/annotation/${meta.id}/${meta.variantcaller}" },
pattern: "*{gz,gz.tbi}",
saveAs: { params.tools.contains('snpeff') ? it : null }
]
}
}

if (params.tools && (params.tools.contains('vep'))) {
process {
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_ENSEMBLVEP:ANNOTATION_BGZIPTABIX' {
ext.prefix = {"${meta.id}_VEP.ann.vcf"}
}
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_ENSEMBLVEP:TABIX_BGZIPTABIX' {
ext.prefix = {"${meta.id}_VEP.ann.vcf"}
}
}

if (params.tools && (params.tools.contains('merge'))) {
process {
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_MERGE:ANNOTATION_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff_VEP.ann.vcf"}
}
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_MERGE:TABIX_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff_VEP.ann.vcf"}
}
}

process {
// MULTIQC

withName:'MULTIQC' {
errorStrategy = {task.exitStatus == 143 ? 'retry' : 'ignore'}
ext.args = { params.multiqc_config ? "--config $multiqc_custom_config" : "" }
}
}

// process {
// withName: CUSTOM_DUMPSOFTWAREVERSIONS {
// publishDir = [
// mode: params.publish_dir_mode,
// path: { "${params.outdir}/pipeline_info" },
// pattern: '*_versions.yml'
// }
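
Note on the `ENSEMBLVEP` change in `conf/modules.config`: the new `ext.args` is built from a list of optional plugin strings joined with spaces, and the container tag falls back from `params.vep_genome` to `params.genome`. The plain Groovy sketch below mimics both patterns outside Nextflow. The `params` map and its values are illustrative assumptions, and the plugin strings are shortened from the ones in the diff. The `findAll()` and Elvis (`?:`) variants are possible alternatives, not what the commit uses.

```groovy
// Stand-in for Nextflow's `params` object; every value here is an assumption.
def params = [
    genome          : 'GRCh38',
    vep_genome      : null,        // unset, so the fallback to `genome` applies
    vep_dbnsfp      : true,
    dbnsfp          : 'dbNSFP.gz',
    vep_loftee      : false,       // disabled plugin contributes an empty string
    vep_spliceregion: true,
]

// Same shape as the list in the diff, with shortened plugin strings.
def parts = [
    '--everything --offline',
    (params.vep_dbnsfp && params.dbnsfp) ? '--plugin dbNSFP,dbNSFP.gz,rs_dbSNP' : '',
    params.vep_loftee                    ? '--plugin LoF' : '',
    params.vep_spliceregion              ? '--plugin SpliceRegion' : '',
]

// As in the commit: trim() only strips the ends, so an empty entry in the
// middle leaves a doubled space (harmless once the shell splits the words).
def joined = parts.join(' ').trim()
assert joined == '--everything --offline --plugin dbNSFP,dbNSFP.gz,rs_dbSNP  --plugin SpliceRegion'

// Alternative: findAll() without a closure keeps only truthy (non-empty) entries.
def compact = parts.findAll().join(' ')
assert compact == '--everything --offline --plugin dbNSFP,dbNSFP.gz,rs_dbSNP --plugin SpliceRegion'

// The ternary fallback in the diff can also be written with the Elvis operator.
def image = "nfcore/vep:104.3.${params.vep_genome ?: params.genome}".toString()
assert image == 'nfcore/vep:104.3.GRCh38'
```

Both joining styles yield the same arguments once the command line is split on whitespace; the `findAll()` form only produces a tidier string in logs.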