Releases: nanoporetech/medaka

v2.0.0

11 Sep 13:55

Switched from TensorFlow to PyTorch.

Existing models for recent basecallers have been converted to the new format.
PyTorch-format models carry a _pt suffix in the filename.

Changed

  • Inference is now performed using PyTorch instead of TensorFlow.
  • The medaka consensus command has been renamed to medaka inference to reflect
    its function in running an arbitrary model and avoid confusion with medaka_consensus.
  • The medaka stitch command has been renamed to medaka sequence to reflect its
    function in creating a consensus sequence.
  • The medaka variant command has been renamed to medaka vcf to reflect its function
    in consolidating variants and avoid confusion with medaka_variant.
  • Order of arguments to medaka vcf has been changed to be more consistent
    with medaka sequence.
  • The helper script medaka_haploid_variant has been renamed medaka_variant to
    save typing.
  • The --ignore_read_groups option is now available in more medaka subcommands, including inference.
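
The renames listed above can be applied mechanically when updating older scripts. The following is a minimal sketch, not part of medaka: the names SUBCOMMAND_RENAMES, SCRIPT_RENAMES, and migrate_argv are hypothetical, and the mapping is taken only from this list. It does not reorder the arguments of medaka vcf, and it does not handle the removed medaka snp command, so both still need manual review.

```python
# Old -> new subcommand names, taken from the v2.0.0 "Changed" list.
SUBCOMMAND_RENAMES = {
    "consensus": "inference",
    "stitch": "sequence",
    "variant": "vcf",
}

# Old -> new helper script names from the same list.
SCRIPT_RENAMES = {
    "medaka_haploid_variant": "medaka_variant",
}


def migrate_argv(argv):
    """Return a copy of argv with pre-2.0.0 command names replaced.

    Only the program name and the subcommand are rewritten; every other
    argument is passed through unchanged (hypothetical helper).
    """
    if not argv:
        return []
    new_argv = list(argv)
    if new_argv[0] == "medaka" and len(new_argv) > 1:
        # "medaka <subcommand> ..." form: rename the subcommand if needed.
        new_argv[1] = SUBCOMMAND_RENAMES.get(new_argv[1], new_argv[1])
    else:
        # Helper-script form: rename the script itself if needed.
        new_argv[0] = SCRIPT_RENAMES.get(new_argv[0], new_argv[0])
    return new_argv
```

For example, migrate_argv(["medaka", "stitch", "ARGS"]) returns ["medaka", "sequence", "ARGS"], where "ARGS" is a placeholder for whatever arguments the original call used.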

Removed

  • The medaka snp command has been removed. This was long defunct as diploid SNP calling
    had been deprecated, and medaka variant is used to create VCFs for current models.
  • Loading models in hdf format has been deprecated.
  • Deleted minimap2 and racon wrappers in medaka/wrapper.py.

Added

  • Release conda packages for Linux (x86 and aarch64) and macOS (arm64).
  • Option --lr_schedule allows using a cosine learning rate schedule in training.
  • Option --max_valid_samples sets the number of samples in a training validation batch.
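
For readers unfamiliar with the term, a cosine learning rate schedule lowers the learning rate from a maximum to a minimum along half a cosine wave. The sketch below shows the common textbook form only; the function name cosine_lr and its parameters are assumptions for illustration, and medaka's own schedule may differ in details such as warm-up or restarts.

```python
import math


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine-annealed learning rate at a given 0-based training step.

    Generic formula, not medaka's implementation:
    lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * step / total_steps))
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so steps past the end stay at lr_min.
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```

The rate starts at lr_max, passes through the midpoint of lr_max and lr_min halfway through training, and reaches lr_min at total_steps.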

Fixed

  • Training models with DiploidLabelScheme uses categorical cross-entropy loss
    instead of binary cross-entropy.
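
The difference between the two losses named in this fix can be shown with plain Python. This is a generic sketch rather than medaka's training code, and both function names are hypothetical: categorical cross-entropy scores a single probability distribution over mutually exclusive classes, whereas binary cross-entropy treats every class as an independent yes/no decision.

```python
import math


def categorical_cross_entropy(probs, target_index):
    """Negative log-probability assigned to the single correct class."""
    return -math.log(probs[target_index])


def binary_cross_entropy(probs, targets):
    """Mean of independent per-class binary losses.

    probs and targets are equal-length sequences; targets holds 0 or 1.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(probs)
```

With probabilities [0.7, 0.2, 0.1] and the first class correct, the categorical loss depends only on the 0.7 entry, while the binary loss also includes a term for each of the two incorrect classes.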

v1.12.1

12 Jul 13:56

(Probably) the final version of medaka using TensorFlow. Future versions will use
PyTorch instead.

Fixed

  • medaka_consensus: only keep bam tags if input file matches joint polishing pipeline.
  • Pin numpy to <2.0.0.

Added

  • Consensus and variant models lookup for v3.5.1 Dorado models.

v1.12.0

20 May 10:04

Fixed

  • tandem: Use haplotag 0 in unphased mode.
  • tandem: Don't run consensus if regions set is empty.

Added

  • Models for version 5 basecaller models.
  • Expose sym_indels option for training.
  • Expose --min_mapq minimum mapping quality alignment filtering option for medaka consensus.
  • tandem: Option --ignore_read_groups to ignore read groups present in input file.
  • Wrapper script medaka_consensus_joint and convenience tools (prepare_tagged_bam,
    get_model_dtypes) to facilitate joint polishing with multiple datatypes.

v1.11.3

06 Dec 14:28

Added

  • Consensus and variant models for v4.3.0 dorado models.

v1.11.2

29 Nov 22:14

Added

  • Parsing model information from fastq headers output by Guppy and MinKNOW.

Changed

  • Additional explanatory information in VCF INFO fields concerning depth calculations.

v1.11.1

24 Oct 14:25

Fixed

  • Do not exit if model cannot be interpreted, use the default instead.
  • An issue with co-ordinate handling in computing variants from alignments.

Added

  • Ability to use basecaller model name as --model argument.
  • Better handling of errors when running abpoa.

v1.11.0

23 Oct 14:59

Fixed

  • Correct suffix of consensus file when medaka_consensus outputs a fastq.

Added

  • Choice of model file can be introspected from input files. For BAM files the
    read group (RG) headers are searched according to the dorado specification,
    whilst for .fastq files the comment sections of a number of reads are checked
    for corresponding read group information. In the latter case see the README for
    information on correctly converting basecaller output to .fastq whilst
    maintaining the relevant meta information.
  • medaka tools resolve_model can display the model that would automatically
    be used for a given input file.
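
To illustrate the kind of lookup described above, the sketch below pulls basecaller model names out of the @RG lines of a SAM/BAM header supplied as text. It is not medaka's implementation. It assumes that the read group DS field contains a space-separated basecall_model=<name> entry, which is a reading of the dorado specification rather than something stated in these notes; the function name and the example model string are likewise illustrative.

```python
def basecaller_models_from_header(header_text):
    """Collect distinct basecall_model values from @RG header lines.

    Assumes each @RG line may carry a DS field of space-separated
    key=value items, one of which is basecall_model=<name>.
    """
    models = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # SAM header fields are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            tag, _, value = field.partition(":")
            if tag != "DS":
                continue
            for item in value.split():
                key, _, model = item.partition("=")
                if key == "basecall_model" and model and model not in models:
                    models.append(model)
    return models


# Illustrative header; the read group ID and model name are examples only.
example_header = (
    "@HD\tVN:1.6\tSO:unknown\n"
    "@RG\tID:run1\tPL:ONT\t"
    "DS:basecall_model=dna_r10.4.1_e8.2_400bps_sup@v4.3.0 runid=run1\n"
)
found = basecaller_models_from_header(example_header)
```

A header with no matching read group yields an empty list, in which case a caller would need to fall back to some default, as the v1.11.1 notes describe for models that cannot be interpreted.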

Changed

  • If no model is provided on the command-line interface (medaka consensus,
    medaka_consensus, and medaka_haploid_variant), medaka will automatically attempt
    to choose the appropriate model.

v1.10.0

16 Oct 15:22

Changed

  • Tensorflow logging level no longer set from Python.
  • spoa and parasail are now strict requirements.

Fixed

  • Sort VCF before annotating in medaka_haploid_variant.
  • Ignore errors when deleting temporary files.
  • The output of the first POA run not being used in the second iteration in the smolecule command.

Added

  • Support for Python 3.11.
  • --spoa_min_coverage option for the smolecule command.

Removed

  • Support for Python 3.7.

v1.9.1

15 Aug 13:32

Fixed

  • A long-standing bug in pileup_counts that manifests for single-position pileups on ARM64.

v1.9.0

09 Aug 16:22

Added

  • medaka tandem for targeted tandem repeat variant calling.