Abstract
Background/Objectives: Pathway annotations of non-macromolecular (relatively small) biomolecules facilitate biological and biomedical interpretation of metabolomics datasets. However, low pathway annotation levels of detected biomolecules hinder this type of interpretation. Thus, predicting the pathway involvement of detected but unannotated biomolecules has a high potential to improve metabolomics data analysis and omics integration. Past publications have only made use of the Kyoto Encyclopedia of Genes and Genomes-derived datasets to develop machine learning models to predict pathway involvement. However, to our knowledge, the Reactome knowledgebase has not been utilized to develop these types of predictive models.
Methods: We created a dataset ready for machine learning using chemical representations of all pathway-annotated compounds available from the Reactome knowledgebase. Next, we trained and evaluated a multilayer perceptron binary classifier using combined metabolite-pathway paired feature vectors engineered from this new dataset.
Results: While models trained on a prior corresponding KEGG dataset with 502 pathways scored a mean Matthew’s correlation coefficient (MCC) of 0.847 and a 0.0098 standard deviation, the models trained on the Reactome dataset with 3985 pathways demonstrated improved performance with a mean MCC of 0.916, but with a higher standard deviation of 0.0149.
Conclusions: These results indicate that the pathways in Reactome can also be effectively predicted, greatly increasing the number of human-defined pathways available for prediction.
Document Type
Article
Publication Date
3-2025
Digital Object Identifier (DOI)
https://doi.org/10.3390/metabo15030161
Funding Information
This research was funded by the National Science Foundation, grant number 2020026 (PI Moseley), and by the National Institutes of Health, grant number P42 ES007380 (University of Kentucky Superfund Research Program Grant; PI Pennell). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Science Foundation or the National Institute of Environmental Health Sciences.
Repository Citation
Huckvale, Erik D. and Moseley, Hunter N. B., "Predicting the Pathway Involvement of Compounds Annotated in the Reactome Knowledgebase" (2025). Markey Cancer Center Faculty Publications. 402.
https://uknowledge.uky.edu/markey_facpub/402
Included in
Biochemistry Commons, Endocrinology, Diabetes, and Metabolism Commons, Molecular Biology Commons, Oncology Commons

Notes/Citation Information
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/ licenses/by/4.0/).