Home
Jennifer Chang edited this page Aug 20, 2021 · 31 revisions
2012 FreeBayes
- Garrison, E. and Marth, G., 2012. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:1207.3907.
2013 FALCON, FALCON-unzip, FALCON-Phase
- Chin, C.S., Alexander, D.H., Marks, P., Klammer, A.A., Drake, J., Heiner, C., Clum, A., Copeland, A., Huddleston, J., Eichler, E.E. and Turner, S.W., 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature methods, 10(6), pp.563-569.
- Chin, C.S., Peluso, P., Sedlazeck, F.J., Nattestad, M., Concepcion, G.T., Clum, A., Dunn, C., O'Malley, R., Figueroa-Balderas, R., Morales-Cruz, A. and Cramer, G.R., 2016. Phased diploid genome assembly with single-molecule real-time sequencing. Nature methods, 13(12), pp.1050-1054.
- Kronenberg, Z.N., Rhie, A., Koren, S., Concepcion, G.T., Peluso, P., Munson, K.M., Porubsky, D., Kuhn, K., Mueller, K.A., Low, W.Y. and Hiendleder, S., 2021. Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C. Nature communications, 12(1), pp.1-10.
- "Thus, we suggest the following genome assembly workflow: (1) partially phased long-read assembly, (2) FALCON-Phase on primary contigs and haplotigs, (3) scaffolding with Hi-C data, and (4) FALCON-Phase on scaffolds."
2016 minimap2
- Li, H., 2016. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics, 32(14), pp.2103-2110.
- Li, H., 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34(18), pp.3094-3100.
2018 purge_haplotigs, purge_dups
- Roach, M.J., Schmidt, S.A. and Borneman, A.R., 2018. Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies. BMC bioinformatics, 19(1), pp.1-10.
- purge_haplotigs
- Guan, D., McCarthy, S.A., Wood, J., Howe, K., Wang, Y. and Durbin, R., 2020. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics, 36(9), pp.2896-2898.
- C source code at https://github.com/dfguan/purge_dups
- Pipeline outline: (1) minimap2 (Li, 2018), (2) create windows by contig and self-align, (3) remove haplotigs, (4) chain overlaps (this step treats the shorter contig specially; more detail in the Supplementary Material).
- "Following this [Scaff10x] with a round of polishing with Arrow closed a number of gaps, reducing contig number further and increasing contig N50." Open question: does Arrow itself merge contigs, or is the effect due to Scaff10x?
- "To our knowledge, scaffolders that use long-range information, such as Scaff10X with linked reads or SALSA with Hi-C data, do not handle heterozygous overlaps. We therefore recommend applying purge_dups directly after initial assembly, prior to scaffolding."
- "In conclusion, purge_dups can significantly improve genome assemblies by removing overlaps and haplotigs caused by sequence divergence in heterozygous regions." In short: it removes false duplications while retaining assembly completeness, and it improves scaffolding.
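The quoted result above is stated in terms of contig number and contig N50. As a reminder of how that metric behaves, here is a minimal sketch in Python. The function name `n50` and the toy contig lengths are made up for illustration and do not come from any of the papers listed here. N50 is taken as the length of the contig at which the running total, counted from the longest contig down, first reaches half of the total assembly length.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), before and after a gap
# between the two longest contigs is closed so that they become one.
before = [50, 40, 30, 20, 10]
after = [90, 30, 20, 10]

print(len(before), n50(before))
print(len(after), n50(after))
```

In this toy case the total length is unchanged, but joining two contigs lowers the contig count and raises N50, which is the pattern the quotation describes. It says nothing about which tool was responsible for the joins.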
2020 Merqury
- Rhie, A., Walenz, B.P., Koren, S. and Phillippy, A.M., 2020. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome biology, 21(1), pp.1-27.
2021 merfin, mitoVGP
- Formenti, G., Rhie, A., Walenz, B.P., Thibaud-Nissen, F., Shafin, K., Koren, S., Myers, E.W., Jarvis, E.D. and Phillippy, A.M., 2021. Merfin: improved variant filtering and polishing via k-mer validation. bioRxiv.
- Formenti, G., Rhie, A., Balacco, J., Haase, B., Mountcastle, J., Fedrigo, O., Brown, S., Capodiferro, M.R., Al-Ajli, F.O., Ambrosini, R. and Houde, P., 2021. Complete vertebrate mitogenomes reveal widespread repeats and gene duplications. Genome biology, 22(1), pp.1-22.
- Rhie, A., McCarthy, S.A., Fedrigo, O., Damas, J., Formenti, G., Koren, S., Uliano-Silva, M., Chow, W., Fungtammasan, A., Kim, J. and Lee, C., 2021. Towards complete and error-free genome assemblies of all vertebrate species. Nature, 592(7856), pp.737-746.
2021 ag100pest update
- Childers, A.K., Geib, S.M., Sim, S.B., Poelchau, M.F., Coates, B.S., Simmonds, T.J., Scully, E.D., Smith, T.P., Childers, C.P., Corpuz, R.L. and Hackett, K., 2021. The USDA-ARS Ag100Pest Initiative: High-Quality Genome Assemblies for Agricultural Pest Arthropod Research. Insects, 12(7), p.626.
- Figure 1: general workflow
- Bioproject: https://www.ncbi.nlm.nih.gov/bioproject/555319
- "Ag100Pest began by using continuous long reads (CLRs) for assembly (details not presented herein) as the improved HiFi procedure [33] had not yet been developed"