Fix custom annotation of breakends and non-supported SVs (exact mode) #1498

Merged

nuno-agostinho
Contributor

@nuno-agostinho nuno-agostinho commented Sep 14, 2023

ENSVAR-5610, fixes #1397

Changelog

  • Custom exact matching improvements:
    • Avoid matching breakpoints with SNPs when using custom annotation in exact mode
    • Only perform this check when using custom annotation in VCF format (other formats do not contain variant type information in a standardised way)
  • Bug fix: avoid skipping all subsequent SVs after skipping one
  • Improvements to warning messages thrown when skipping variants:
    • Warn when skipping any non-supported SV type
    • Trim whitespace from the input to improve readability of messages
    • Bug fix: avoid calling skipped_variant_msg() with ensembl-io's parser

Documentation

Testing

Run VEP with custom annotation in exact mode on input containing breakends and NON_REF SVs, such as:

1   2642609        .                               G  A  .  .  .  .  .
1 2642609   nonREF  G  <NON_REF>  892  PASS  AN=6  GT  0/1
chr11   99819752        nonREF       C       <NON_REF>        892     PASS    AN=6       GT    0/1
1       713044  DUP1     C       <CN=2>  .       .       END=755966
1       713044  DUP2     C       <CN2>  .       .       END=755966
1       713044  DUP2     C       <INS:FHJDSJH>  .       .       END=755966
chr11   99819752        MantaBND:2716:0:1:0:0:0:0       C       CAACTGAG]chr11:99820576]        892     PASS    SVTYPE=BND;MATEID=MantaBND:2716:0:1:0:0:0:1;SVINSLEN=7;SVINSSEQ=AACTGAG;BND_DEPTH=266;MATE_BND_DEPTH=24;AC=3;AN=6       GT    0/1
chr11   99820576        MantaBND:2716:0:1:0:0:0:1       C       CCTCAGTT]chr11:99819752]        892     PASS    SVTYPE=BND;MATEID=MantaBND:2716:0:1:0:0:0:0;SVINSLEN=7;SVINSSEQ=CTCAGTT;BND_DEPTH=24;MATE_BND_DEPTH=266;AC=3;AN=6       GT    0/1

Run VEP using a command such as:

gnomAD=input/gnomad.genomes.v3.1.2.sites.stripped.vcf.gz
vep --i $vcf --vcf --database --custom $gnomAD,gnomAD,vcf,exact,0,AF,HN

Output should not contain custom annotation for NON_REF and breakend variants.
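
To decide which of the test records above are expected to come out without custom annotation, it can help to sort them by ALT syntax. The following is a minimal sketch, not VEP code: `classify_alt` is a hypothetical helper that only looks at the ALT string, following the VCF conventions for `<NON_REF>`, bracketed (mated) breakends and other symbolic alleles. Single breakends written with a dot are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of VEP: classify a VCF ALT allele by its syntax.
sub classify_alt {
  my ($alt) = @_;
  return 'non_ref'  if $alt eq '<NON_REF>';
  return 'breakend' if $alt =~ /[\[\]]/;   # mated breakend, e.g. CAACTGAG]chr11:99820576]
  return 'symbolic' if $alt =~ /^<.+>$/;   # e.g. <CN=2>, <INS:FHJDSJH>
  return 'sequence';                       # plain bases, e.g. A
}

# A few of the test records above, reduced to CHROM POS ID REF ALT.
my @records = (
  '1 2642609 . G A',
  '1 2642609 nonREF G <NON_REF>',
  '1 713044 DUP1 C <CN=2>',
  'chr11 99819752 MantaBND:2716:0:1:0:0:0:0 C CAACTGAG]chr11:99820576]',
);

for my $record (@records) {
  # split ' ' tolerates the mixed spaces and tabs of the pasted records
  my ($chrom, $pos, $id, $ref, $alt) = split ' ', $record;
  printf "%s:%s %s -> %s\n", $chrom, $pos, $alt, classify_alt($alt);
}
```

Records classified as `non_ref` or `breakend` are the ones that, per the expectation above, should carry no custom annotation in the output.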

@nuno-agostinho nuno-agostinho marked this pull request as ready for review September 14, 2023 14:22
Comment on lines +686 to +689
## unsupported SV types
if ($self->isa('Bio::EnsEMBL::VEP::Parser')) {
  $self->skipped_variant_msg("$abbrev type is not supported") unless $res;
}
Contributor

Should we not also report skipped variants even when the parser is the ensembl-io one?

Suggested change
- ## unsupported SV types
- if ($self->isa('Bio::EnsEMBL::VEP::Parser')) {
-   $self->skipped_variant_msg("$abbrev type is not supported") unless $res;
- }
+ ## unsupported SV types - always use vep parser
+ Bio::EnsEMBL::VEP::Parser->skipped_variant_msg("$abbrev type is not supported") unless $res;

Contributor Author

@nuno-agostinho nuno-agostinho Sep 21, 2023

Hey Nakib! We only use ensembl-io directly for reading custom annotation, so reporting here means warning the user about every type of variant skipped in the custom annotation (and there can be a lot of them in some cases). Is this information useful by default or just noise?

What if, for custom annotation, it is only printed when using --verbose? Something like:

Suggested change
- ## unsupported SV types
- if ($self->isa('Bio::EnsEMBL::VEP::Parser')) {
-   $self->skipped_variant_msg("$abbrev type is not supported") unless $res;
- }
+ ## unsupported SV types
+ if ($self->isa('Bio::EnsEMBL::VEP::Parser') || $self->param('verbose')) {
+   Bio::EnsEMBL::VEP::Parser->skipped_variant_msg("$abbrev type is not supported") unless $res;
+ }
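
The effect of the proposed condition can be sketched with toy classes. This is only an illustration under stated assumptions: `Toy::VEPParser` and `Toy::IOParser` are hypothetical stand-ins for the VEP and ensembl-io parsers, `should_warn` is an invented name, and the sketch assumes both parsers expose a `param` method.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; these are NOT the real VEP or ensembl-io classes.
package Toy::VEPParser;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub param { my ($self, $key) = @_; return $self->{$key} }

package Toy::IOParser;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub param { my ($self, $key) = @_; return $self->{$key} }

package main;

# Returns 1 when a "type is not supported" warning would be printed.
sub should_warn {
  my ($parser, $res) = @_;
  return 0 if $res;    # the variant was created, nothing was skipped
  return ($parser->isa('Toy::VEPParser') || $parser->param('verbose')) ? 1 : 0;
}

for my $case (
  [ 'VEP parser',                  Toy::VEPParser->new ],
  [ 'ensembl-io parser',           Toy::IOParser->new ],
  [ 'ensembl-io parser + verbose', Toy::IOParser->new(verbose => 1) ],
) {
  my ($label, $parser) = @$case;
  printf "%-28s warn=%d\n", $label, should_warn($parser, 0);
}
```

Under these assumptions, skipped input variants are always reported, while skipped custom annotation records are reported only with --verbose.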

Comment on lines +344 to +345
# do not match if only one of the types is defined
return 0 if defined $ref_class xor defined $vf_class;
Contributor

This is redundant with the next check as it makes sure both are defined.

Suggested change
- # do not match if only one of the types is defined
- return 0 if defined $ref_class xor defined $vf_class;

Contributor Author

This is not redundant:

  • The other check only returns 0 if both variables are defined and not equal.
  • If one is defined but the other is not, then they are not equal, and should also return 0. This check is required for this condition.

Contributor

Ah, sorry, I missed that it was an if. You could instead modify the next condition to use unless:

return 0 unless defined $ref_class && defined $vf_class && $ref_class eq $vf_class;

The condition can only be true if both are defined and equal.
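
The two formulations in this thread can be compared case by case. The sketch below is an illustration only: the sub names are hypothetical, the second check in `match_two_checks` is reconstructed from its description in the thread ("only returns 0 if both variables are defined and not equal"), and returning 1 afterwards is an assumption about what follows in the real code.

```perl
use strict;
use warnings;

# xor guard followed by the "both defined and not equal" check.
sub match_two_checks {
  my ($ref_class, $vf_class) = @_;
  # do not match if only one of the types is defined
  return 0 if (defined $ref_class xor defined $vf_class);
  # assumed form of the next check, reconstructed from the discussion
  return 0 if defined $ref_class && defined $vf_class && $ref_class ne $vf_class;
  return 1;    # assumption: matching continues from here
}

# Single combined condition, as suggested in the review.
sub match_single_unless {
  my ($ref_class, $vf_class) = @_;
  return 0 unless defined $ref_class && defined $vf_class && $ref_class eq $vf_class;
  return 1;
}

for my $case ([ 'SNV', 'SNV' ], [ 'SNV', 'BND' ], [ 'SNV', undef ], [ undef, undef ]) {
  my ($ref, $vf) = @$case;
  printf "%-6s %-6s two-checks=%d single-unless=%d\n",
    $ref // 'undef', $vf // 'undef',
    match_two_checks($ref, $vf), match_single_unless($ref, $vf);
}
```

Under these assumptions the two versions agree except when neither type is defined: the xor version lets matching continue, while the combined unless returns 0. Which behaviour is wanted depends on whether records without type information (for example, from non-VCF custom annotation) should still be allowed to match.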

Contributor

@nakib103 nakib103 left a comment

LGTM!

@nakib103 nakib103 merged commit d6c71c9 into Ensembl:postreleasefix/111 Sep 29, 2023
1 check passed
@nuno-agostinho nuno-agostinho deleted the fix/custom-skipping branch September 29, 2023 14:24