Add files to Nextclade Dataset
kimandrews committed May 10, 2024
1 parent 751cb5c commit afbe56e
Showing 4 changed files with 1,311 additions and 6 deletions.
3 changes: 3 additions & 0 deletions nextclade_dataset/CHANGELOG.md
@@ -0,0 +1,3 @@
## Unreleased

Initial release.
29 changes: 29 additions & 0 deletions nextclade_dataset/README.md
@@ -0,0 +1,29 @@
# Measles dataset

| Key | Value |
| ----------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| authors | [Nextstrain](https://nextstrain.org) |
| reference | NC_001498.1 |
| workflow | https://github.com/nextstrain/measles/tree/main/nextclade |
| path | `nextstrain/measles` |


## Scope of this dataset

This dataset assigns genotypes to measles samples based on [criteria outlined by the WHO](https://www.who.int/publications/i/item/WER8709).

The WHO has defined 24 measles genotypes based on N gene and H gene sequences from 28 reference strains. For new measles samples, genotypes can be assigned based on genetic similarity to the reference strains at the "N450" region (a 450 bp region of the N gene).

The tree used in this dataset includes N450 sequences for the 28 reference strains, along with other representative strains for each genotype.
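
As a rough illustration of similarity-based assignment, the sketch below picks the genotype whose reference sequence has the fewest mismatches to a query. It is not how Nextclade works internally (Nextclade places sequences on the reference tree); the function names and the toy 8-nt sequences are hypothetical stand-ins for aligned N450 sequences.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatches between two aligned sequences.

    Positions where either sequence has N or a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )


def assign_genotype(query: str, references: dict[str, list[str]]) -> tuple[str, int]:
    """Return (genotype, distance) for the closest reference sequence.

    Ties are broken alphabetically by genotype name, an arbitrary choice
    made only to keep this sketch deterministic.
    """
    distance, genotype = min(
        (hamming(query, seq), name)
        for name, seqs in references.items()
        for seq in seqs
    )
    return genotype, distance


# Hypothetical toy data: real N450 sequences are 450 nt long.
toy_references = {
    "A": ["ACGTACGT"],
    "B3": ["ACGTTCGA"],
    "D8": ["TCGAACGT"],
}
toy_query = "ACGTTCGN"
toy_result = assign_genotype(toy_query, toy_references)
```

A real workflow would first align each query to the N450 reference, which is the step Nextclade performs before placing the sequence on the tree.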

## Features

This dataset supports:

- Assignment of genotypes
- Minimal sequence QC
- Phylogenetic placement

## What are Nextclade datasets

Read more about Nextclade datasets in the Nextclade documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html
35 changes: 29 additions & 6 deletions nextclade_dataset/pathogen.json
@@ -3,13 +3,36 @@
     "reference": "measles_reference_N450.fasta",
     "pathogenJson": "pathogen.json",
     "genomeAnnotation": "measles_reference_N450.gff3",
-    "treeJson": "measles_nextclade.json"
+    "treeJson": "measles_nextclade.json",
+    "examples": "sequences.fasta",
+    "readme": "README.md",
+    "changelog": "CHANGELOG.md"
   },
   "attributes": {
-    "name": "Measles"
+    "name": "Measles (N450)",
+    "reference name": "Ichinose-B95a",
+    "reference accession": "NC_001498.1"
   },
-  "schemaVersion": "3.0.0",
-  "version": {
-    "tag": "unreleased"
-  }
+  "schemaVersion": "1.0.0",
+  "alignmentParams": {
+    "minSeedCover": 0.01,
+    "minLength": 400
+  },
+  "qc": {
+    "missingData": {
+      "enabled": true,
+      "missingDataThreshold": 20,
+      "scoreBias": 4
+    },
+    "mixedSites": {
+      "enabled": true,
+      "mixedSitesThreshold": 4
+    },
+    "frameShifts": {
+      "enabled": true
+    },
+    "stopCodons": {
+      "enabled": true
+    }
+  }
 }
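
To make the new `qc` thresholds concrete, here is a minimal sketch of how they might translate into per-sequence scores. The two formulas are an assumption based on Nextclade's documented QC rules (a score of 100 or more is flagged as bad) and should be checked against the Nextclade documentation; the function names are hypothetical.

```python
import json

# QC values copied from the pathogen.json shown in this commit.
QC_CONFIG = json.loads("""
{
  "missingData": {"enabled": true, "missingDataThreshold": 20, "scoreBias": 4},
  "mixedSites": {"enabled": true, "mixedSitesThreshold": 4}
}
""")


def missing_data_score(total_missing: int, cfg: dict = QC_CONFIG["missingData"]) -> float:
    """Assumed rule: Ns up to scoreBias are free, then the score scales linearly."""
    if not cfg["enabled"]:
        return 0.0
    raw = (total_missing - cfg["scoreBias"]) * 100 / cfg["missingDataThreshold"]
    return max(0.0, raw)


def mixed_sites_score(total_mixed: int, cfg: dict = QC_CONFIG["mixedSites"]) -> float:
    """Assumed rule: the score reaches 100 at mixedSitesThreshold ambiguous sites."""
    if not cfg["enabled"]:
        return 0.0
    return 100 * total_mixed / cfg["mixedSitesThreshold"]


# Under these assumptions, a 450-nt N450 sequence with 24 Ns scores 100.
example_missing = missing_data_score(24)
example_mixed = mixed_sites_score(2)
```

If these formulas hold, the values suit a 450 bp target: roughly 5% missing bases (24 of 450) or 4 mixed sites is enough to flag a sequence.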