Formulation is critical in drug delivery: the wrong formulation can render a drug product useless. A substantial amount of preclinical (animal and in vitro) work must be completed before a new drug candidate can be tested in humans, and these cGxP studies typically cost $3-$5 million. If the wrong drug product formulation is tested, new iterations of the formulation must be tested at additional cost.

Data-driven computational science can help reduce this cost. In the absence of existing human exposure, a battery of preclinical tests must be performed in at least two species before the FDA will permit testing in humans. However, for many drugs (such as those that begin as natural products) there is a history of human exposure. In these cases, computer modeling of exposure across a population may be adequate to permit phase 1 studies of a candidate formulation in humans.

The CDC’s National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews with physical examinations, including laboratory results. The NHANES database can be mined to determine exposure to a food additive, and early human formulation testing can then be conducted at levels below those to which the US population is ordinarily exposed through food. These data can be combined with data mined from international chemical shipments to validate an exposure model. This paper describes the data-driven formulation testing process using a new candidate Ebola treatment that, unlike vaccines, can be used after a person has contracted the disease. The drug candidate’s mechanism of action potentially permits its use against all strains of the virus, a characteristic that vaccines might not share.
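The idea of testing at levels below ordinary dietary exposure can be illustrated with a minimal sketch. It assumes that each survey respondent has already been reduced to an estimated daily intake of the additive, a body weight, and a survey sample weight. The `Respondent` fields, the function names, and the choice of the weighted median as the reference level are all hypothetical and are not taken from the paper or from the actual NHANES variable names.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Respondent:
    """One survey record (hypothetical fields, not real NHANES variable names)."""
    intake_mg_per_day: float  # estimated daily intake of the additive from food
    body_weight_kg: float     # measured body weight
    sample_weight: float      # number of people this respondent represents


def weighted_percentile(values: Sequence[float],
                        weights: Sequence[float],
                        pct: float) -> float:
    """Return the smallest value whose cumulative weight share reaches pct (0-100)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and the same length")
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = total * pct / 100.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]


def exposure_mg_per_kg(respondents: List[Respondent], pct: float) -> float:
    """Population exposure (mg per kg body weight per day) at a weighted percentile."""
    doses = [r.intake_mg_per_day / r.body_weight_kg for r in respondents]
    weights = [r.sample_weight for r in respondents]
    return weighted_percentile(doses, weights, pct)


def is_below_background(candidate_dose_mg_per_kg: float,
                        respondents: List[Respondent],
                        pct: float = 50.0) -> bool:
    """True if the candidate dose is under the chosen population exposure level.

    Using the weighted median (pct=50) as the reference is an assumption
    made for this sketch only.
    """
    return candidate_dose_mg_per_kg < exposure_mg_per_kg(respondents, pct)


# Illustrative, made-up records (not survey data).
respondents = [
    Respondent(intake_mg_per_day=70.0, body_weight_kg=70.0, sample_weight=1000.0),
    Respondent(intake_mg_per_day=160.0, body_weight_kg=80.0, sample_weight=3000.0),
    Respondent(intake_mg_per_day=30.0, body_weight_kg=60.0, sample_weight=1000.0),
]
```

The survey weights matter here because respondents do not represent equal shares of the population, so an unweighted percentile could misstate the typical exposure. Which percentile counts as "ordinary" exposure is a scientific and regulatory judgment that this sketch leaves open.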

Notes/Citation Information

Published in Procedia Computer Science, v. 108C, p. 1612-1621.

© 2017 The Authors

Under a Creative Commons license.

Funding Information

The project described was supported by the NIH National Center for Advancing Translational Sciences through grant number UL1TR001998. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1053575.