feat: mutational signatures (#308)
* initial commit

* change to sigprofilerassignment

* use siglasso

* add missing rules

* save intermediate changes

* add tooltips

* refactoring

* introduce prior

* change output

* update cosmic descriptions

* fix template

* fix header

* cleanup

* fixes and cleanup

* Update config/config.yaml

* Update config/config.yaml

* Update workflow/rules/mutational_signatures.smk

* Update workflow/rules/mutational_signatures.smk

* Update workflow/rules/mutational_signatures.smk

---------

Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
FelixMoelder and johanneskoester committed Jun 20, 2024
1 parent 940439e commit f037305
Showing 24 changed files with 452 additions and 8 deletions.
6 changes: 6 additions & 0 deletions .test/config-chm-eval/config.yaml
@@ -28,6 +28,12 @@ mutational_burden:
    - SOMATIC_TUMOR_MEDIUM
    - SOMATIC_TUMOR_HIGH

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

calling:
  delly:
    activate: true
6 changes: 6 additions & 0 deletions .test/config-giab/config.yaml
@@ -49,6 +49,12 @@ mutational_burden:
    - somatic_tumor_medium
    - somatic_tumor_low

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

# printing of variants in a matrix, sorted by recurrence
report:
  # if stratification is deactivated, one oncoprint for all
6 changes: 6 additions & 0 deletions .test/config-no-candidate-filtering/config.yaml
@@ -29,6 +29,12 @@ mutational_burden:
    - SOMATIC_TUMOR_MEDIUM
    - SOMATIC_TUMOR_HIGH

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

calling:
  delly:
    activate: true
6 changes: 6 additions & 0 deletions .test/config-simple/config.yaml
@@ -28,6 +28,12 @@ mutational_burden:
  events:
    - present

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

calling:
  delly:
    activate: true
6 changes: 6 additions & 0 deletions .test/config-sra/config.yaml
@@ -26,6 +26,12 @@ mutational_burden:
  events:
    - changed_only

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

calling:
  delly:
    activate: true
6 changes: 6 additions & 0 deletions .test/config-target-regions/config.yaml
@@ -29,6 +29,12 @@ mutational_burden:
  events:
    - present

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

calling:
  delly:
    activate: true
6 changes: 6 additions & 0 deletions .test/config-target-regions/config_multiple_beds.yaml
@@ -31,6 +31,12 @@ mutational_burden:
  events:
    - present

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

calling:
  delly:
    activate: true
6 changes: 6 additions & 0 deletions .test/config_primers/config.yaml
@@ -30,6 +30,12 @@ mutational_burden:
    - SOMATIC_TUMOR_MEDIUM
    - SOMATIC_TUMOR_HIGH

# Plotting of known mutational signatures
mutational_signatures:
  activate: false
  events:
    - some_id

calling:
  delly:
    activate: true
7 changes: 7 additions & 0 deletions config/config.yaml
@@ -57,6 +57,13 @@ mutational_burden:
    - somatic_tumor_medium
    - somatic_tumor_high

# Quantify known mutational signatures (human only)
mutational_signatures:
  activate: false
  events:
    # select events (callsets, defined under calling/fdr-control/events) to evaluate
    - some_id

# Sets the minimum average coverage for each gene.
# Genes with lower average coverage will not be considered in gene coverage datavzrd report
# If not present min_avg_coverage will be set to 0 rendering all genes.
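The comment in the config above says that the listed events must be callsets defined under calling/fdr-control/events. A minimal sketch of that consistency check follows, assuming the config has already been parsed into a plain dict. The helper name `undefined_signature_events`, the event `typo_id`, and the contents of the event entry are illustrative assumptions, not part of the workflow.

```python
def undefined_signature_events(config):
    """Return events listed under mutational_signatures that are not
    defined under calling/fdr-control/events (empty list if deactivated)."""
    section = config.get("mutational_signatures", {})
    if not section.get("activate", False):
        return []
    defined = config.get("calling", {}).get("fdr-control", {}).get("events", {})
    return [event for event in section.get("events") or [] if event not in defined]


# Hypothetical parsed config; event names and event contents are placeholders.
example_config = {
    "mutational_signatures": {"activate": True, "events": ["some_id", "typo_id"]},
    "calling": {"fdr-control": {"events": {"some_id": {"varlociraptor": ["present"]}}}},
}

print(undefined_signature_events(example_config))
```

With `activate: false`, as in the shipped configs, the check returns an empty list regardless of the event names.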
1 change: 1 addition & 0 deletions workflow/Snakefile
@@ -38,6 +38,7 @@ include: "rules/datavzrd.smk"
include: "rules/fusion_calling.smk"
include: "rules/testcase.smk"
include: "rules/population.smk"
include: "rules/mutational_signatures.smk"


batches = "all"
5 changes: 0 additions & 5 deletions workflow/envs/pysam.yaml

This file was deleted.

10 changes: 10 additions & 0 deletions workflow/envs/pystats.yaml
@@ -0,0 +1,10 @@
channels:
  - conda-forge
  - bioconda
dependencies:
  - biopython =1.83
  - pysam =0.22
  - python =3.7
  - pandas =2.2
  - altair =5.3
  - vl-convert-python =1.3
4 changes: 4 additions & 0 deletions workflow/envs/siglasso.post-deploy.sh
@@ -0,0 +1,4 @@
#!/usr/bin/env bash
set -o pipefail

Rscript -e "devtools::install_github('gersteinlab/siglasso@9c5e6b0')"
11 changes: 11 additions & 0 deletions workflow/envs/siglasso.yaml
@@ -0,0 +1,11 @@
channels:
  - conda-forge
  - bioconda
dependencies:
  - r-base =4.3
  - r-devtools =2.4
  - r-glmnet =4.1
  - r-nnls =1.5
  - r-rcolorbrewer =1.1
  - r-colorramps =2.3
  - r-tidyverse =2.0
87 changes: 87 additions & 0 deletions workflow/resources/cosmic_signature_desc_v3.4.tsv
@@ -0,0 +1,87 @@
Signature Description
SBS1 Spontaneous deamination of 5-methylcytosine (clock-like signature)
SBS2 Activity of APOBEC family of cytidine deaminases
SBS3 Defective homologous recombination DNA damage repair
SBS4 Tobacco smoking
SBS5 Unknown (clock-like signature)
SBS6 Defective DNA mismatch repair
SBS7a Ultraviolet light exposure
SBS7b Ultraviolet light exposure
SBS7c Ultraviolet light exposure
SBS7d Ultraviolet light exposure
SBS8 Unknown
SBS9 Polymerase eta somatic hypermutation activity
SBS10a Polymerase epsilon exonuclease domain mutations
SBS10b Polymerase epsilon exonuclease domain mutations
SBS10c Defective POLD1 proofreading
SBS10d Defective POLD1 proofreading
SBS11 Temozolomide treatment
SBS12 Unknown
SBS13 Activity of APOBEC family of cytidine deaminases
SBS14 Concurrent polymerase epsilon mutation and defective DNA mismatch repair
SBS15 Defective DNA mismatch repair
SBS16 Unknown
SBS17a Unknown
SBS17b Unknown
SBS18 Damage by reactive oxygen species
SBS19 Unknown
SBS20 Concurrent POLD1 mutations and defective DNA mismatch repair
SBS21 Defective DNA mismatch repair
SBS22a Aristolochic acid exposure
SBS22b Aristolochic acid exposure
SBS23 Unknown
SBS24 Aflatoxin exposure
SBS25 Chemotherapy treatment
SBS26 Defective DNA mismatch repair
SBS27 Possible sequencing artefact
SBS28 Unknown
SBS29 Tobacco chewing
SBS30 Defective DNA base excision repair due to NTHL1 mutations
SBS31 Platinum chemotherapy treatment
SBS32 Azathioprine treatment
SBS33 Unknown
SBS34 Unknown
SBS35 Platinum chemotherapy treatment
SBS36 Defective DNA base excision repair due to MUTYH mutations
SBS37 Unknown
SBS38 Indirect effect of ultraviolet light
SBS39 Unknown
SBS40a Unknown
SBS40b Unknown
SBS40c Unknown
SBS41 Unknown
SBS42 Haloalkane exposure
SBS43 Unknown. Possible sequencing artefact
SBS44 Defective DNA mismatch repair
SBS45 Possible artefact due to 8-oxo-guanine introduced during sequencing
SBS46 Possible sequencing artefact
SBS47 Possible sequencing artefact
SBS48 Possible sequencing artefact
SBS49 Possible sequencing artefact
SBS50 Possible sequencing artefact
SBS51 Possible sequencing artefact
SBS52 Possible sequencing artefact
SBS53 Possible sequencing artefact
SBS54 Possible sequencing artefact. Possible contamination with germline variants
SBS55 Possible sequencing artefact
SBS56 Possible sequencing artefact
SBS57 Possible sequencing artefact
SBS58 Possible sequencing artefact
SBS59 Possible sequencing artefact
SBS60 Possible sequencing artefact
SBS84 Activity of activation-induced cytidine deaminase (AID)
SBS85 Indirect effects of activation-induced cytidine deaminase (AID)
SBS86 Unknown chemotherapy treatment
SBS87 Thiopurine chemotherapy treatment
SBS88 Colibactin exposure (E.coli bacteria carrying pks pathogenicity island)
SBS89 Unknown
SBS90 Duocarmycin exposure
SBS91 Unknown
SBS92 Tobacco smoking
SBS93 Unknown
SBS94 Unknown
SBS95 Possible sequencing artefact
SBS96 Unknown
SBS97 Unknown
SBS98 Unknown
SBS99 Melphalan exposure
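The description table above maps each COSMIC signature ID to a short aetiology text, for example for use in report tooltips. A minimal sketch of reading it into a lookup, assuming the file is tab-separated with the `Signature` and `Description` header shown; the function name `load_signature_descriptions` is hypothetical, and the sample uses three rows copied from the table.

```python
import csv
import io


def load_signature_descriptions(handle):
    """Map each signature ID (e.g. 'SBS1') to its description text."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Signature"]: row["Description"] for row in reader}


# Three rows copied from the table above; tab separation is an assumption.
sample = io.StringIO(
    "Signature\tDescription\n"
    "SBS1\tSpontaneous deamination of 5-methylcytosine (clock-like signature)\n"
    "SBS4\tTobacco smoking\n"
    "SBS27\tPossible sequencing artefact\n"
)
descriptions = load_signature_descriptions(sample)

# Signatures whose description flags a likely artefact.
artefacts = [sig for sig, desc in descriptions.items() if "artefact" in desc.lower()]
print(artefacts)
```

To read the real file, pass an open handle on `workflow/resources/cosmic_signature_desc_v3.4.tsv` instead of the in-memory sample.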
@@ -283,6 +283,7 @@ views:
binned max vaf:
display-mode: hidden
swissprot:
optional: true
display-mode: hidden
hgvsg:
display-mode: hidden
@@ -388,6 +389,7 @@ views:
ensembl:
url: https://www.ensembl.org/Homo_sapiens/Transcript/Summary?t={feature}
hgvsc:
optional: true
custom: ?read_file(input.linkouts)
consequence:
plot:
2 changes: 1 addition & 1 deletion workflow/rules/calling.smk
@@ -42,7 +42,7 @@ rule varlociraptor_preprocess:
     input:
         ref=genome,
         ref_idx=genome_fai,
-        candidates=get_candidate_calls(),
+        candidates=get_candidate_calls,
         bam="results/recal/{sample}.bam",
         bai="results/recal/{sample}.bai",
         alignment_props="results/alignment-properties/{group}/{sample}.json",
19 changes: 17 additions & 2 deletions workflow/rules/common.smk
@@ -189,6 +189,7 @@ def get_final_output(wildcards):
        )
    )
    final_output.extend(get_mutational_burden_targets())
    final_output.extend(get_mutational_signature_targets())

    if is_activated("population/db"):
        final_output.append(lookup(dpath="population/db/path", within=config))
@@ -616,6 +617,20 @@ def get_mutational_burden_targets():
    return mutational_burden_targets


def get_mutational_signature_targets():
    mutational_signature_targets = []
    if is_activated("mutational_signatures"):
        for group in variants_groups:
            mutational_signature_targets.extend(
                expand(
                    "results/plots/mutational_signatures/{group}.{event}.svg",
                    group=variants_groups,
                    event=config["mutational_signatures"].get("events"),
                )
            )
    return mutational_signature_targets


def get_scattered_calls(ext="bcf"):
    def inner(wildcards):
        caller = "arriba" if wildcards.calling_type == "fusions" else variant_caller
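The target function above builds one plot path per group and event. A minimal sketch of that expansion follows, using `itertools.product` as a stand-in for Snakemake's `expand()` (it is not the real implementation). The helper `expand_targets` and the group names are hypothetical. Under these assumptions, looping over the groups while also passing the full group list to the expansion yields each target once per group, so the sketch deduplicates at the end.

```python
from itertools import product


def expand_targets(pattern, **wildcards):
    """Fill `pattern` with every combination of the given wildcard values
    (a stand-in for Snakemake's expand(), not the real implementation)."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(wildcards[name] for name in names))
    ]


pattern = "results/plots/mutational_signatures/{group}.{event}.svg"
groups = ["tumor_a", "tumor_b"]  # hypothetical group names
events = ["some_id"]

# Same shape as the function above: a loop over groups around a full expansion.
targets = []
for group in groups:
    targets.extend(expand_targets(pattern, group=groups, event=events))

unique_targets = list(dict.fromkeys(targets))  # drop repeats, keep order
print(len(targets), len(unique_targets))
```

A single call with `group=groups` outside any loop would produce the same unique list directly.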
@@ -663,9 +678,9 @@ def get_gather_annotated_calls_input(ext="bcf"):
     return inner


-def get_candidate_calls():
+def get_candidate_calls(wc):
     filter = config["calling"]["filter"].get("candidates")
-    if filter:
+    if filter and wc.caller != "arriba":
         return "results/candidate-calls/{group}.{caller}.{scatteritem}.filtered.bcf"
     else:
         return "results/candidate-calls/{group}.{caller}.{scatteritem}.bcf"
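The change above makes the candidate path depend on the `caller` wildcard, so that calls from arriba are never taken from the filtered file. A minimal sketch of that selection logic outside Snakemake follows, with `SimpleNamespace` standing in for the wildcards object and the config passed in explicitly. The function name `candidate_calls_path`, the filter expression, and the caller name `freebayes` are illustrative assumptions.

```python
from types import SimpleNamespace


def candidate_calls_path(wc, config):
    """Mirror of the selection logic above, with config passed explicitly."""
    use_filter = config["calling"]["filter"].get("candidates")
    suffix = ".filtered.bcf" if use_filter and wc.caller != "arriba" else ".bcf"
    return "results/candidate-calls/{group}.{caller}.{scatteritem}" + suffix


# Hypothetical config with a candidate filter set.
config = {"calling": {"filter": {"candidates": "some_filter_expression"}}}

for caller in ("freebayes", "arriba"):
    wc = SimpleNamespace(caller=caller)
    print(caller, candidate_calls_path(wc, config))
```

Because the function now needs the wildcards, a rule has to hand over the function itself as the input, so that it is evaluated once per job with that job's wildcards.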