Abstract
Genome assembly uses sequence similarity to merge sequencing reads into longer contiguous sequences (contigs). Scaffolds are contigs linked together across gaps: the order and orientation of the contigs are known, but the exact sequence connecting two contigs is not, so each gap is represented by a run of Ns whose length estimates the gap length. Here we give recommendations for genome assembly with different sequencing technologies, describe organelle assembly, and review how to perform assembly quality control.
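The relationship between scaffolds, contigs, and N-filled gaps can be illustrated with a short sketch. This is not code from the chapter: the function name `split_scaffold` and the `min_gap` parameter are hypothetical, and the sketch assumes a scaffold is a plain string in which any run of at least `min_gap` N characters marks a gap whose length is the estimated gap size.

```python
import re


def split_scaffold(scaffold, min_gap=1):
    """Split a scaffold string into its contigs and estimated gap lengths.

    Assumption: every run of at least `min_gap` N (or n) characters is a gap,
    and the length of the run is the estimated gap length.
    """
    # Leading or trailing Ns do not join two contigs, so drop them.
    scaffold = scaffold.strip("Nn")
    gap_pattern = re.compile("[Nn]{%d,}" % min_gap)

    contigs = []
    gaps = []
    pos = 0
    for match in gap_pattern.finditer(scaffold):
        contigs.append(scaffold[pos:match.start()])  # sequence before the gap
        gaps.append(match.end() - match.start())     # number of Ns in the gap
        pos = match.end()
    contigs.append(scaffold[pos:])                   # sequence after the last gap
    return contigs, gaps


contigs, gaps = split_scaffold("ACGTNNNNNGGCC")
print(contigs)  # ['ACGT', 'GGCC']
print(gaps)     # [5]
```

Order and orientation are preserved because the contigs are returned in the order they appear in the scaffold; only the sequence inside each gap is unknown.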
Acknowledgment
The work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported under Contract No. DE-AC02-05CH11231.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Clum, A. (2018). Genome Assembly. In: de Vries, R., Tsang, A., Grigoriev, I. (eds) Fungal Genomics. Methods in Molecular Biology, vol 1775. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7804-5_13
DOI: https://doi.org/10.1007/978-1-4939-7804-5_13
Published:
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7803-8
Online ISBN: 978-1-4939-7804-5
eBook Packages: Springer Protocols