Kristen A. Panfilio, University of Cologne, Germany
Iris M. Vargas Jentzsch, University of Cologne, Germany
Joshua B. Benoit, University of Cincinnati
Deniz Erezyilmaz, University of Oxford, UK
Yuichiro Suzuki, Wellesley College
Stefano Colella, University of Montpellier, France
Hugh M. Robertson, University of Illinois at Urbana-Champaign
Monica F. Poelchau, National Agricultural Library
Robert M. Waterhouse, University of Lausanne, Switzerland
Panagiotis Ioannidis, University of Geneva, Switzerland
Matthew T. Weirauch, University of Cincinnati
Daniel S. T. Hughes, Baylor College of Medicine
Shwetha C. Murali, University of Washington
John H. Werren, University of Rochester
Chris G. C. Jacobs, Leiden University, The Netherlands
Elizabeth J. Duncan, University of Otago, New Zealand
David Armisén, Université de Lyon, France
Barbara M. I. Vreede, The Hebrew University of Jerusalem, Israel
Patrice Baa-Puyoulet, Université de Lyon, France
Chloé S. Berger, Université de Lyon, France
Chun-Che Chang, National Taiwan University, Taiwan
Hsu Chao, Baylor College of Medicine
Mei-Ju M. Chen, National Agricultural Library
Yen-Ta Chen, University of Cologne, Germany
Christopher P. Childers, National Agricultural Library
Ariel D. Chipman, The Hebrew University of Jerusalem, Israel
Andrew G. Cridge, University of Otago, New Zealand
Antonin J. J. Crumière, Université de Lyon, France
Peter K. Dearden, University of Otago, New Zealand
Elise M. Didion, University of Cincinnati
Subba Reddy Palli, University of Kentucky
Jayendra Nath Shukla, University of Kentucky


Background: The Hemiptera (aphids, cicadas, and true bugs) are a key insect order, with high diversity for feeding ecology and excellent experimental tractability for molecular genetics. Building upon recent sequencing of hemipteran pests such as phloem-feeding aphids and blood-feeding bed bugs, we present the genome sequence and comparative analyses centered on the milkweed bug Oncopeltus fasciatus, a seed feeder of the family Lygaeidae.

Results: The 926-Mb Oncopeltus genome is well represented by the current assembly and official gene set. We use our genomic and RNA-seq data not only to characterize the protein-coding gene repertoire and perform isoform-specific RNAi, but also to elucidate patterns of molecular evolution and physiology. We find ongoing, lineage-specific expansion and diversification of repressive C2H2 zinc finger proteins. The discovery of intron gain and turnover specific to the Hemiptera also prompted the evaluation of lineage and genome size as predictors of gene structure evolution. Furthermore, we identify enzymatic gains and losses that correlate with feeding biology, particularly for reductions associated with derived, fluid nutrition feeding.

Conclusions: With the milkweed bug, we now have a critical mass of sequenced species for a hemimetabolous insect order and close outgroup to the Holometabola, substantially improving the diversity of insect genomics. We thereby define commonalities among the Hemiptera and delve into how hemipteran genomes reflect distinct feeding ecologies. Given Oncopeltus’s strength as an experimental model, these new sequence resources bolster the foundation for molecular research and highlight technical considerations for the analysis of medium-sized invertebrate genomes.

Notes/Citation Information

Published in Genome Biology, v. 20, article no. 64, p. 1-26.

© The Author(s). 2019

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

Funding Information

Funding for genome sequencing, assembly, and automated annotation was provided by the National Institutes of Health (NIH) grant U54 HG003273 (NHGRI) to RAG. We also acknowledge funding for the project from the German Research Foundation (DFG) grants PA 2044/1-1 and SFB 680 project A12 to KAP. Support for specific analyses was provided by the Swiss National Science Foundation with grant 31003A_143936 to EMZ and PP00P3_170664 to RMW; the European Research Council grant ERC-CoG #616346 to AK; DFG grant SFB 680 project A1 to SiR; the National Science Foundation with grant US NSF DEB1257053 to JHW; NIH grant R01GM113230 (NIGMS) to LP; and by NIH grants 5R01GM080203 (NIGMS) and 5R01HG004483 (NHGRI) and by the Director, Office of Science, Office of Basic Energy Sciences, U.S. Department of Energy, Contract No. DE-AC02-05CH11231 to MCMT.

Related Content

All sequence data are publicly available at the NCBI, BioProject number PRJNA229125, and in the USDA Ag Data Commons data access system. In addition, assembled scaffolds, gene models, and a browser are available at the National Agricultural Library. The OncfaCyc metabolism database is available within the ArthropodaCyc collection.

13059_2019_1660_MOESM1_ESM.pdf (6142 kB)
Additional file 1: Supplementary notes, figures, and small tables.

13059_2019_1660_MOESM2_ESM.xlsx (2226 kB)
Additional file 2: Large supporting tables.

13059_2019_1660_MOESM3_ESM.fasta (128 kB)
Additional file 3: Chemoreceptor sequences in FASTA format.