Abstract
A major challenge to integrating public metabolic resources is the use of different nomenclatures by individual databases. This paper presents md_harmonize, an open-source Python package for harmonizing compounds and metabolic reactions across various metabolic databases. The md_harmonize package utilizes a neighborhood-specific graph coloring method for generating a unique identifier for each compound via atom identifiers based on a compound’s chemical structure. The resulting harmonized compounds and reactions can be used for various downstream analyses, including the construction of atom-resolved metabolic networks and models for metabolic flux analysis. Parts of the md_harmonize package have been optimized using a variety of computational techniques to make certain NP-complete problems handled by the software tractable for these specific use cases. The software is available on GitHub and through the Python Package Index, with end-user documentation hosted on GitHub Pages.
Document Type
Article
Publication Date
12-2023
Digital Object Identifier (DOI)
https://doi.org/10.3390/metabo13121199
Funding Information
The research was funded by the United States National Science Foundation (NSF), grant number 2020026.
Repository Citation
Jin, Huan and Moseley, Hunter N. B., "md_harmonize: A Python Package for Atom-Level Harmonization of Public Metabolic Databases" (2023). Markey Cancer Center Faculty Publications. 365.
https://uknowledge.uky.edu/markey_facpub/365
Included in
Biochemistry Commons, Endocrinology, Diabetes, and Metabolism Commons, Molecular Biology Commons, Oncology Commons
Notes/Citation Information
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).