Molecular and Cellular Biochemistry Faculty Publications

Long-Read Sequencing of the Zebrafish Genome Reorganizes Genomic Architecture

Yelena Chernyavskaya, University of Kentucky
Xiaofei Zhang, University of Kentucky
Jinze Liu, Virginia Commonwealth University
Jessica S. Blackburn, University of Kentucky

Abstract

BACKGROUND: Nanopore sequencing technology has revolutionized the field of genome biology with its ability to generate extra-long reads that can resolve regions of the genome that were previously inaccessible to short-read sequencing platforms. Over 50% of the zebrafish genome consists of difficult-to-map, highly repetitive, low-complexity elements that pose inherent problems for short-read sequencers and assemblers.

RESULTS: We used long-read nanopore sequencing to generate a de novo assembly of the zebrafish genome and compared our assembly to the current reference genome, GRCz11. The new assembly identified 1697 novel insertions and deletions over one kilobase in length and placed 106 previously unlocalized scaffolds. We also discovered additional sites of retrotransposon integration previously unreported in GRCz11 and observed the expression of these transposable elements in adult zebrafish under physiologic conditions, implying they have active mobility in the zebrafish genome and contribute to the ever-changing genomic landscape.

CONCLUSIONS: We used nanopore sequencing to improve upon and resolve the issues plaguing the current zebrafish reference assembly, GRCz11. Zebrafish is a prominent model of human disease, and our corrected assembly will be useful for studies relying on interspecies comparisons and precise linkage of genetic events to disease phenotypes.

Document Type

Article

Publication Date

2-10-2022

Notes/Citation Information

Published in BMC Genomics, v. 23, article no. 116.

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Digital Object Identifier (DOI)

https://doi.org/10.1186/s12864-022-08349-3

Funding Information

Funding for this project was provided by the National Institutes of Health (DP2CA228043) and the Kentucky Pediatric Cancer Research Trust Foundation (to JSB). This research was also supported by the Biostatistics and Bioinformatics Shared Resource Facility of the University of Kentucky Markey Cancer Center (P30CA177558) and the VCU Massey Cancer Center Bioinformatics Core (P30CA016059).

Repository Citation

Chernyavskaya, Yelena; Zhang, Xiaofei; Liu, Jinze; and Blackburn, Jessica S., "Long-Read Sequencing of the Zebrafish Genome Reorganizes Genomic Architecture" (2022). Molecular and Cellular Biochemistry Faculty Publications. 196.
https://uknowledge.uky.edu/biochem_facpub/196

12864_2022_8349_MOESM1_ESM.zip (272 kB)
Additional file 1: Figure S1. Read length distribution and sequenced bases generated by each group across all libraries used in assembly generation. Figure S2. Tukey box and whiskers plot of average depth at the telomeric regions of all chromosomes in the zebrafish genome. Figure S3. Association plot of Chr 4 in ZF1 and GRCz11 assemblies illustrating many small sequence differences between the two builds. Figure S4. BUSCO analysis of GRCz11 reference assembly and ZF1 assembly using vertebrate-specific single-copy orthologs. Table S1. Chromosomal location of GRCz11 unlocalized scaffolds bearing > 99% coverage in GRCz11. Table S2. Deletions mapped to insertions in ZF1 assembly. Table S3. Primers used for RT-qPCR.

Included in

Biochemistry, Biophysics, and Structural Biology Commons, Bioinformatics Commons, Genomics Commons
