Genome Assembly

Fungal Genomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1775)

Abstract

Genome assembly uses sequence similarity among sequencing reads to merge them into longer contiguous sequences (contigs). Scaffolds are contigs linked across gaps: the order and orientation of the contigs are known, but the exact sequence connecting two contigs is not, so each gap is represented by a run of Ns whose length estimates the gap size. Here we give recommendations for genome assembly with different sequencing technologies, describe organelle assembly, and review how to perform assembly quality control.
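
The relationship between scaffolds, contigs, and N-filled gaps described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names `split_scaffold` and `n50`, the `min_gap` parameter, and the toy sequence are hypothetical, and N50 is used here only as an assumed example of a contiguity statistic that assembly quality control commonly reports.

```python
import re


def split_scaffold(scaffold, min_gap=1):
    """Split a scaffold into contigs at runs of N (hypothetical helper).

    Returns the contig sequences and the length of each N run.
    Runs shorter than min_gap are left inside the contig.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_gap)
    contigs = [piece for piece in pattern.split(scaffold) if piece]
    gaps = [len(match.group()) for match in pattern.finditer(scaffold)]
    return contigs, gaps


def n50(lengths):
    """Largest length L such that sequences of length >= L
    together cover at least half of the total length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Toy scaffold: three contigs joined by estimated gaps of 5 and 3 bases.
scaffold = "ACGTACGTAC" + "N" * 5 + "GGCC" + "N" * 3 + "TTAACCGG"
contigs, gaps = split_scaffold(scaffold)
contig_n50 = n50([len(c) for c in contigs])
print(contigs, gaps, contig_n50)
```

For the toy scaffold the contigs are 10, 4, and 8 bases long, so the contig N50 is 8. Real assemblies would be read from FASTA files, and any N runs at the ends of a scaffold are counted as gaps by this sketch.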

Acknowledgment

The work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported under Contract No. DE-AC02-05CH11231.

Corresponding author

Correspondence to Alicia Clum.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Clum, A. (2018). Genome Assembly. In: de Vries, R., Tsang, A., Grigoriev, I. (eds) Fungal Genomics. Methods in Molecular Biology, vol 1775. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7804-5_13

  • DOI: https://doi.org/10.1007/978-1-4939-7804-5_13

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7803-8

  • Online ISBN: 978-1-4939-7804-5

  • eBook Packages: Springer Protocols