Publication

A Snakemake Toolkit for the Batch Assembly, Annotation and Phylogenetic Analysis of Mitochondrial Genomes and Ribosomal Genes From Genome Skims of Museum Collections

Issue Date
2024-10-28
Subject Terms
assembly
bioinformatics
DNA barcode
genome skim
iBOL
museomics
phylogenetics
snakemake
Abstract
Low coverage ‘genome‐skims’ are often used to assemble organelle genomes and ribosomal gene sequences for cost‐effective phylogenetic and barcoding studies. Natural history collections hold invaluable biological information, yet poor preservation resulting in degraded DNA often hinders polymerase chain reaction‐based analyses. However, it is possible to generate libraries and sequence the short fragments typical of degraded DNA to generate genome‐skims from museum collections. Here we introduce a snakemake toolkit comprised of three pipelines skim2mito, skim2rrna and gene2phylo, designed to unlock the genomic potential of historical museum specimens using genome skimming. Specifically, skim2mito and skim2rrna perform the batch assembly, annotation and phylogenetic analysis of mitochondrial genomes and nuclear ribosomal genes, respectively, from low‐coverage genome skims. The third pipeline gene2phylo takes a set of gene alignments and performs phylogenetic analysis of individual genes, partitioned analysis of concatenated alignments and a phylogenetic analysis based on gene trees. We benchmark our pipelines with simulated data, followed by testing with a novel genome skimming dataset from both recent and historical solariellid gastropod samples. We show that the toolkit can recover mitochondrial and ribosomal genes from poorly preserved museum specimens of the gastropod family Solariellidae, and the phylogenetic analysis is consistent with our current understanding of taxonomic relationships.
The generation of bioinformatic pipelines that facilitate processing large quantities of sequence data from the vast repository of specimens held in natural history museum collections will greatly aid species discovery and exploration of biodiversity over time, ultimately aiding conservation efforts in the face of a changing planet.
Citation
White, O.W., Hall, A., Price, B.W., Williams, S.T. and Clark, M.D. (2025), A Snakemake Toolkit for the Batch Assembly, Annotation and Phylogenetic Analysis of Mitochondrial Genomes and Ribosomal Genes From Genome Skims of Museum Collections. Mol Ecol Resour, 25: e14036. https://doi.org/10.1111/1755-0998.14036
Type
Journal Article
Item Description
Copyright © 2024 The Author(s). Molecular Ecology Resources published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. The attached file is the published version of the article.
NHM Repository
ISSN
1755-098X
EISSN
1755-0998