Improving annotation #539

Merged
merged 28 commits into from
May 12, 2022
Changes from 26 commits
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -46,6 +46,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#511](https://github.com/nf-core/sarek/pull/511) - Sync `TEMPLATE` with `tools` `2.3.2`
- [#520](https://github.com/nf-core/sarek/pull/520) - Improve annotation subworkflows
- [#537](https://github.com/nf-core/sarek/pull/537) - Update workflow figure
- [#539](https://github.com/nf-core/sarek/pull/539) - Update `CITATIONS.md`

### Fixed

@@ -77,6 +78,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#485](https://github.com/nf-core/sarek/pull/485) - `--skip_qc`, `--skip_markduplicates` and `--skip_bqsr` is now `--skip_tools`
- [#538](https://github.com/nf-core/sarek/pull/538) - `--sequencing_center` is now `--seq_center`
- [#538](https://github.com/nf-core/sarek/pull/538) - `--markdup_java_options` has been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `--annotate_tools` has been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `--cadd_cache`, `--cadd_indels`, `--cadd_indels_tbi`, `--cadd_wg_snvs`, `--cadd_wg_snvs_tbi` have been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `--genesplicer` has been removed
- [#539](https://github.com/nf-core/sarek/pull/539) - `conf/genomes.config` and `params.genomes_base` have been removed
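
For anyone updating an existing configuration, the renamed and removed parameters listed in this changelog section could be migrated roughly as follows. This is a sketch only: the values `'MyCenter'` and `'markduplicates'` are placeholders, and the comma-separated format for `skip_tools` is an assumption that this changelog does not state.

```groovy
// Hypothetical user config, for example passed with `-c custom.config`.
params {
    // Before #538: sequencing_center = 'MyCenter'
    seq_center = 'MyCenter'

    // Before #485: skip_qc, skip_markduplicates and skip_bqsr were separate options.
    // Assumption: skip_tools takes a comma-separated list of tool names.
    skip_tools = 'markduplicates'

    // Removed in #538 and #539, so delete these if they are still set:
    // markdup_java_options, annotate_tools, genesplicer, cadd_cache,
    // cadd_indels, cadd_indels_tbi, cadd_wg_snvs, cadd_wg_snvs_tbi
}
```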

## [2.7.1](https://github.com/nf-core/sarek/releases/tag/2.7.1) - Pårtejekna

17 changes: 17 additions & 0 deletions CITATIONS.md
@@ -71,8 +71,23 @@
> Danecek P, Auton A, Abecasis G, et al.: The variant call format and VCFtools. Bioinformatics. 2011 Aug 1;27(15):2156-8. doi: 10.1093/bioinformatics/btr330. Epub 2011 Jun 7. PubMed PMID: 21653522; PubMed Central PMCID: PMC3137218.

- [VEP](https://pubmed.ncbi.nlm.nih.gov/27268795/)

> McLaren W, Gil L, Hunt SE, et al.: The Ensembl Variant Effect Predictor. Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4. PubMed PMID: 27268795; PubMed Central PMCID: PMC4893825.

- [dbNSFP](https://pubmed.ncbi.nlm.nih.gov/33261662/)

> Liu X, et al.: dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Med. 2020 Dec 2;12(1):103. doi: 10.1186/s13073-020-00803-9. PubMed PMID: 33261662; PubMed Central PMCID: PMC7709417.

- [LOFTEE](https://pubmed.ncbi.nlm.nih.gov/32461654/)

> Karczewski KJ, et al.: The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020 May;581(7809):434-443. doi: 10.1038/s41586-020-2308-7. PubMed PMID: 32461654; PubMed Central PMCID: PMC7334197.

- [SpliceAI](https://pubmed.ncbi.nlm.nih.gov/30661751/)

> Jaganathan K, et al.: Predicting Splicing from Primary Sequence with Deep Learning. Cell. 2019 Jan 24;176(3):535-548.e24. doi: 10.1016/j.cell.2018.12.015. PubMed PMID: 30661751.

- [SpliceRegion](https://github.com/Ensembl/VEP_plugins/blob/release/106/SpliceRegion.pm)

## R packages

- [R](https://www.R-project.org/)
@@ -88,6 +103,7 @@
> Trevor L Davis (2018). optparse: Command Line Option Parser.

- [RColorBrewer](https://CRAN.R-project.org/package=RColorBrewer)

> Erich Neuwirth (2014). RColorBrewer: ColorBrewer Palettes.

## Software packaging/containerisation tools
@@ -107,4 +123,5 @@
- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
142 changes: 42 additions & 100 deletions assets/email_template.html
@@ -1,111 +1,53 @@
<html>
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">

<!-- prettier-ignore -->
<meta name="description" content="nf-core/sarek: An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing" />
<title>nf-core/sarek Pipeline Report</title>
</head>
<body>
<div style="font-family: Helvetica, Arial, sans-serif; padding: 30px; max-width: 800px; margin: 0 auto">
<img src="cid:nfcorepipelinelogo" />
<meta name="description" content="nf-core/sarek: An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing">
<title>nf-core/sarek Pipeline Report</title>
</head>
<body>
<div style="font-family: Helvetica, Arial, sans-serif; padding: 30px; max-width: 800px; margin: 0 auto;">

<h1>nf-core/sarek v${version}</h1>
<h2>Run Name: $runName</h2>
<img src="cid:nfcorepipelinelogo">

<% if (!success){ out << """
<div
style="
color: #a94442;
background-color: #f2dede;
border-color: #ebccd1;
padding: 15px;
margin-bottom: 20px;
border: 1px solid transparent;
border-radius: 4px;
"
>
<h4 style="margin-top: 0; color: inherit">nf-core/sarek execution completed unsuccessfully!</h4>
<h1>nf-core/sarek v${version}</h1>
<h2>Run Name: $runName</h2>

<% if (!success){
out << """
<div style="color: #a94442; background-color: #f2dede; border-color: #ebccd1; padding: 15px; margin-bottom: 20px; border: 1px solid transparent; border-radius: 4px;">
<h4 style="margin-top:0; color: inherit;">nf-core/sarek execution completed unsuccessfully!</h4>
<p>The exit status of the task that caused the workflow execution to fail was: <code>$exitStatus</code>.</p>
<p>The full error message was:</p>
<pre style="white-space: pre-wrap; overflow: visible; margin-bottom: 0">${errorReport}</pre>
</div>
""" } else { out << """
<div
style="
color: #3c763d;
background-color: #dff0d8;
border-color: #d6e9c6;
padding: 15px;
margin-bottom: 20px;
border: 1px solid transparent;
border-radius: 4px;
"
>
<pre style="white-space: pre-wrap; overflow: visible; margin-bottom: 0;">${errorReport}</pre>
</div>
"""
} else {
out << """
<div style="color: #3c763d; background-color: #dff0d8; border-color: #d6e9c6; padding: 15px; margin-bottom: 20px; border: 1px solid transparent; border-radius: 4px;">
nf-core/sarek execution completed successfully!
</div>
""" } %>
</div>
"""
}
%>

<p>The workflow was completed at <strong>$dateComplete</strong> (duration: <strong>$duration</strong>)</p>
<p>The command used to launch the workflow was as follows:</p>
<pre
style="
white-space: pre-wrap;
overflow: visible;
background-color: #ededed;
padding: 15px;
border-radius: 4px;
margin-bottom: 30px;
"
>
$commandLine</pre
>
<p>The workflow was completed at <strong>$dateComplete</strong> (duration: <strong>$duration</strong>)</p>
<p>The command used to launch the workflow was as follows:</p>
<pre style="white-space: pre-wrap; overflow: visible; background-color: #ededed; padding: 15px; border-radius: 4px; margin-bottom:30px;">$commandLine</pre>

<h3>Pipeline Configuration:</h3>
<table
style="
width: 100%;
max-width: 100%;
border-spacing: 0;
border-collapse: collapse;
border: 0;
margin-bottom: 30px;
"
>
<tbody style="border-bottom: 1px solid #ddd">
<% out << summary.collect{ k,v -> "
<tr>
<th
style="
text-align: left;
padding: 8px 0;
line-height: 1.42857143;
vertical-align: top;
border-top: 1px solid #ddd;
"
>
$k
</th>
<td
style="
text-align: left;
padding: 8px;
line-height: 1.42857143;
vertical-align: top;
border-top: 1px solid #ddd;
"
>
<pre style="white-space: pre-wrap; overflow: visible">$v</pre>
</td>
</tr>
" }.join("\n") %>
</tbody>
</table>
<h3>Pipeline Configuration:</h3>
<table style="width:100%; max-width:100%; border-spacing: 0; border-collapse: collapse; border:0; margin-bottom: 30px;">
<tbody style="border-bottom: 1px solid #ddd;">
<% out << summary.collect{ k,v -> "<tr><th style='text-align:left; padding: 8px 0; line-height: 1.42857143; vertical-align: top; border-top: 1px solid #ddd;'>$k</th><td style='text-align:left; padding: 8px; line-height: 1.42857143; vertical-align: top; border-top: 1px solid #ddd;'><pre style='white-space: pre-wrap; overflow: visible;'>$v</pre></td></tr>" }.join("\n") %>
</tbody>
</table>

<p>nf-core/sarek</p>
<p><a href="https://github.com/nf-core/sarek">https://github.com/nf-core/sarek</a></p>
</div>
</body>
<p>nf-core/sarek</p>
<p><a href="https://github.com/nf-core/sarek">https://github.com/nf-core/sarek</a></p>

</div>

</body>
</html>
36 changes: 0 additions & 36 deletions conf/genomes.config

This file was deleted.

12 changes: 8 additions & 4 deletions conf/igenomes.config
@@ -28,7 +28,8 @@ params {
known_indels_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/{1000G_phase1,Mills_and_1000G_gold_standard}.indels.b37.vcf.idx"
mappability = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/Control-FREEC/out100m2_hg19.gem"
snpeff_db = 'GRCh37.75'
vep_cache_version = '104'
snpeff_genome = 'GRCh37'
vep_cache_version = 104
vep_genome = 'GRCh37'
vep_species = 'homo_sapiens'
}
@@ -51,7 +52,8 @@ params {
known_indels_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/{Mills_and_1000G_gold_standard.indels.hg38,beta/Homo_sapiens_assembly38.known_indels}.vcf.gz.tbi"
mappability = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/Control-FREEC/out100m2_hg38.gem"
snpeff_db = 'GRCh38.99'
vep_cache_version = '104'
snpeff_genome = 'GRCh38'
vep_cache_version = 104
vep_genome = 'GRCh38'
vep_species = 'homo_sapiens'
}
@@ -78,7 +80,8 @@ params {
mappability = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Control-FREEC/GRCm38_68_mm10.gem"
readme = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/README.txt"
snpeff_db = 'GRCm38.99'
vep_cache_version = '102'
snpeff_genome = 'GRCm38'
vep_cache_version = 102
vep_genome = 'GRCm38'
vep_species = 'mus_musculus'
}
@@ -101,7 +104,8 @@ params {
bwa = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/BWAIndex/version0.6.0/"
fasta = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/WholeGenomeFasta/genome.fa"
snpeff_db = 'WBcel235.99'
vep_cache_version = '104'
snpeff_genome = 'WBcel235'
vep_cache_version = 104
vep_genome = 'WBcel235'
vep_species = 'caenorhabditis_elegans'
}
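
The hunks in this file add a `snpeff_genome` key next to `snpeff_db` and change `vep_cache_version` from a quoted string to an integer. A custom genome entry following the same pattern might look like the sketch below. The genome name `MyGenome` is hypothetical, and the enclosing `genomes { ... }` scope is assumed from the usual iGenomes config layout rather than shown in this diff.

```groovy
// Hypothetical custom genome definition; all values are placeholders
// copied from the GRCh38 entry for illustration.
params {
    genomes {
        'MyGenome' {
            snpeff_db         = 'GRCh38.99'      // snpEff database name
            snpeff_genome     = 'GRCh38'         // key added in this PR
            vep_cache_version = 104              // now an integer, not '104'
            vep_genome        = 'GRCh38'
            vep_species       = 'homo_sapiens'
        }
    }
}
```

In `conf/modules.config`, `snpeff_genome` and `vep_genome` are used to pick the container tag when no cache is supplied, with `params.genome` as the fallback.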
68 changes: 28 additions & 40 deletions conf/modules.config
@@ -953,24 +953,32 @@ process{
withName: 'VCFTOOLS_SUMMARY'{
ext.args = "--FILTER-summary"
}
}

// ANNOTATE
process {

withName: 'ENSEMBLVEP' {
ext.args = '--everything --filter_common --per_gene --total_length --offline'
container = { "nfcore/vep:104.3.${params.genome}" }
ext.args = [
'--everything --filter_common --per_gene --total_length --offline',
(params.vep_dbnsfp && params.dbnsfp) ? '--plugin dbNSFP,dbNSFP.gz,rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF' : '',
(params.vep_loftee) ? '--plugin LoF,loftee_path:/opt/conda/envs/nf-core-vep-104.3/share/ensembl-vep-104.3-0' : '',
(params.vep_spliceai && params.spliceai_snv && params.spliceai_indel) ? '--plugin SpliceAI,snv=spliceai_scores.raw.snv.hg38.vcf.gz,indel=spliceai_scores.raw.indel.hg38.vcf.gz' : '',
(params.vep_spliceregion) ? '--plugin SpliceRegion' : ''
].join(' ').trim()
if (!params.vep_cache) container = { params.vep_genome ? "nfcore/vep:104.3.${params.vep_genome}" : "nfcore/vep:104.3.${params.genome}" }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/reports/EnsemblVEP/${meta.id}/${meta.variantcaller}" },
pattern: "*html"
]
}

withName: ".*:ANNOTATION_MERGE:ENSEMBLVEP" {
ext.prefix = {"${meta.id}_snpEff"}
}

withName: 'SNPEFF' {
ext.args = '-nodownload -canon -v'
container = { "nfcore/snpeff:5.0.${params.genome}" }
if (!params.snpeff_cache) container = { params.snpeff_genome ? "nfcore/snpeff:5.0.${params.snpeff_genome}" : "nfcore/snpeff:5.0.${params.genome}" }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/reports/SnpEff/${meta.id}/${meta.variantcaller}" },
@@ -979,56 +987,36 @@ process {
]
}

withName: 'ANNOTATION_BGZIPTABIX' {
withName: "NFCORE_SAREK:SAREK:ANNOTATE:.*:TABIX_BGZIPTABIX" {
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/annotation/${meta.id}/${meta.variantcaller}" },
pattern: "*{gz,gz.tbi}"
]
}
}
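
A note on the `ENSEMBLVEP` `ext.args` construction in this hunk: joining the list with `' '` keeps the empty strings of disabled plugins, so the result can contain consecutive spaces in the middle, which `trim()` does not remove. That is harmless on a command line. If a tidier string is wanted, dropping empty entries before joining is one option. The following is a sketch with a single plugin shown, not the code merged in this PR:

```groovy
process {
    withName: 'ENSEMBLVEP' {
        // Groovy truth treats '' as false, so findAll { it } drops
        // the entries of plugins that are switched off before joining.
        ext.args = [
            '--everything --filter_common --per_gene --total_length --offline',
            params.vep_spliceregion ? '--plugin SpliceRegion' : ''
        ].findAll { it }.join(' ')
    }
}
```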

if (params.tools && (params.tools.contains('snpeff') || params.tools.contains('merge'))) {
process {
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_SNPEFF:ANNOTATION_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff.ann.vcf"}
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/annotation/${meta.id}/${meta.variantcaller}" },
pattern: "*{gz,gz.tbi}",
saveAs: { params.tools.contains('snpeff') ? it : null }
]
}
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_SNPEFF:TABIX_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff.ann.vcf"}
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/annotation/${meta.id}/${meta.variantcaller}" },
pattern: "*{gz,gz.tbi}",
saveAs: { params.tools.contains('snpeff') ? it : null }
]
}
}

if (params.tools && (params.tools.contains('vep'))) {
process {
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_ENSEMBLVEP:ANNOTATION_BGZIPTABIX' {
ext.prefix = {"${meta.id}_VEP.ann.vcf"}
}
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_ENSEMBLVEP:TABIX_BGZIPTABIX' {
ext.prefix = {"${meta.id}_VEP.ann.vcf"}
}
}

if (params.tools && (params.tools.contains('merge'))) {
process {
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_MERGE:ANNOTATION_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff_VEP.ann.vcf"}
}
withName: 'NFCORE_SAREK:SAREK:ANNOTATE:ANNOTATION_MERGE:TABIX_BGZIPTABIX' {
ext.prefix = {"${meta.id}_snpEff_VEP.ann.vcf"}
}
}

process {
// MULTIQC

withName:'MULTIQC' {
errorStrategy = {task.exitStatus == 143 ? 'retry' : 'ignore'}
ext.args = { params.multiqc_config ? "--config $multiqc_custom_config" : "" }
}
}

// process {
// withName: CUSTOM_DUMPSOFTWAREVERSIONS {
// publishDir = [
// mode: params.publish_dir_mode,
// path: { "${params.outdir}/pipeline_info" },
// pattern: '*_versions.yml'
// }
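
Based on the conditions in the `ENSEMBLVEP` block of `conf/modules.config`, each VEP plugin is only added when its switch is set, and dbNSFP and SpliceAI additionally require their data file parameters. A user config enabling them might look like the sketch below. All paths are placeholders, the `tools` value is inferred from the `params.tools.contains('vep')` check in the removed lines, and any companion index files the pipeline may expect are not covered by this diff.

```groovy
// Hypothetical user config; paths are placeholders.
params {
    tools = 'vep'

    // Plugins that only need their switch
    vep_loftee       = true
    vep_spliceregion = true

    // dbNSFP is added only when both the switch and the file are set
    vep_dbnsfp = true
    dbnsfp     = '/path/to/dbNSFP.gz'

    // SpliceAI is added only when the switch and both score files are set
    vep_spliceai   = true
    spliceai_snv   = '/path/to/spliceai_scores.raw.snv.hg38.vcf.gz'
    spliceai_indel = '/path/to/spliceai_scores.raw.indel.hg38.vcf.gz'
}
```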