Doctor of Philosophy (PhD)
Agriculture, Food and Environment
Plant and Soil Sciences
Dr. Seth DeBolt
Cellular expansion in plants is a complex process driven by internal turgor pressure that is constrained by an expansible cell wall. The main structural element of the cell wall is cellulose. Cellulose is vital to plant fitness, and the protein complex that synthesizes it is an excellent target for small-molecule inhibition in herbicide development. In this thesis, many small molecules (SMs) from a diverse library were screened in search of new cellulose biosynthesis inhibitors (CBIs). Loss of cellular expansion was the primary phenotype used to identify putative CBIs; the screen therefore followed a forward chemical genetics approach. Reverse chemical genetics would require that a CELLULOSE SYNTHASE (CESA) variant, one of the proteins responsible for cellulose biosynthesis, be expressed in a form suitable for screening. Unfortunately, CESA is a very large protein and quite recalcitrant to in vitro assay. To advance the forward genetics paradigm, this thesis explores two main technologies: (1) robotics to increase screening throughput and, in parallel, (2) in silico methodologies to reduce false positive rates and streamline mechanism-of-action discovery.
Within in silico modeling and drug discovery, a major goal is to interpret how a SM might interact with a protein or an enzyme, or act as an inhibitor of a protein-protein interaction. One approach is to screen multiple SMs against a target site on or within the protein surface and to estimate binding energies as well as top-ranking SM conformations. As noted above, it is technically impractical to isolate CESA variants for a reverse chemical genetics in vitro assay. For this reason, it was my goal to perform a reverse in silico chemical genetics screen and to use the information derived from computational refinements to drive efficient benchtop biology.
Forward chemical genetic screening and virtual screening form a cycle of continuous refinement and reiteration, each leading to the other, in either direction. In this cycle, virtual screening can be used before or after benchtop biology. If virtual screening is performed first, its results can guide the purchase of the portions of a library most likely to give benchtop biology a higher hit percentage. Conversely, virtual screening can be done after benchtop biology has determined the SM-protein interaction, which can help elucidate the mechanism of action of the SM against the protein. In addition, virtual screening after experimental validation supports a search of chemical space for additional molecules, or variations of the hit molecule, that might have higher activity.
Within this body of work, forward chemical genetics was applied to Arabidopsis thaliana (referred to herein as Arabidopsis) to investigate cellular expansion by screening 50,000 SMs with a liquid handling robot for suppression of seedling expansion. The use of liquid handling robotics to aliquot and screen these compounds at working concentrations emerged as a stand-alone publication but is generally integrated into the ‘screening’ portion of the discovery pipeline. Exploiting the rapid radicle (root) cell expansion observed in plant seedling development allowed 96-well plates to be used to observe the influence of individual chemicals on expansion. Light microscopy was used to score whole plates of 80 chemicals at a time. The initial screen for expansion inhibition identified roughly 3,000 of the 50,000 screened SMs as bioactive at 100 µM, a 6% hit rate.
Phenotypic effects of SMs on Arabidopsis were placed in one of eight categories based on phenotypic aberrations: (1) normal growth, (2) stunted roots, (3) severely stunted roots, (4) bleached, (5) colored root hairs, (6) other, (7) incomplete germination, and (8) no germination. Two of the eight categories, stunted roots and severely stunted roots, were of interest because they were the first line of evidence that a SM could be a CBI. One SM was identified as a CBI and named fluopipamine; it forms the focal point for much of the thesis. Other compounds were identified as probable CBIs but could not be characterized in as much detail. Nine lines of evidence supported fluopipamine as a cellulose synthase 1 (CESA1) antagonist: (1) etiolation prevention, (2) ectopic lignification, (3) ectopic lignification at or below 100 µM, (4) decreased radiolabeled glucose uptake, (5) loss of anisotropic cellular growth, (6) decreased cellulose synthase complex accumulation and movement in the plasma membrane, (7) bred resistance verified with a cleaved polymorphic sequence assay, (8) cross resistance to a known Arabidopsis mutant, and (9) in silico docking.
To cast a wider net over chemical space, an in-house method was developed to create a pairwise similarity matrix from SM structures converted into bit vectors. Initially, the DUDE database, containing roughly 22,000 SMs and 102 of their protein targets, was used as a truth set. The similarity matrix of DUDE SMs was clustered via Markov Clustering, and the resultant clusters were assessed for quality. Cluster quality was measured crudely, by how well SMs grouped according to protein target. The purpose of this approach was to identify optimal parameters on a truth set so that, when the method is applied to new SMs, they cluster optimally by protein target. Scripts were written that extract SMs from the clustered results based on a list of anchor SMs. For example, in reverse chemical genetics screens, SMs that share structural similarity could form clusters associated with the same protein target.
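The similarity matrix and the cluster-quality check described above can be sketched as follows. This is only an illustration under stated assumptions, not the thesis code: fingerprints are stored as Python integers used as bit vectors, the Tanimoto coefficient is assumed as the similarity measure (the abstract does not name one for this step), purity is taken as the share of SMs that match their cluster's majority target, and all compound and target names are made up. The clustering step itself is left out.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as integer bit vectors."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def similarity_matrix(fingerprints: Dict[str, int]) -> Dict[Tuple[str, str], float]:
    """Upper triangle of the pairwise matrix, keyed by (name_a, name_b)."""
    return {
        (a, b): tanimoto(fingerprints[a], fingerprints[b])
        for a, b in combinations(sorted(fingerprints), 2)
    }


def cluster_purity(clusters: List[List[str]], target_of: Dict[str, str]) -> float:
    """Fraction of SMs whose protein target is the majority target of their cluster."""
    total = sum(len(cluster) for cluster in clusters)
    if total == 0:
        return 0.0
    majority = sum(
        Counter(target_of[sm] for sm in cluster).most_common(1)[0][1]
        for cluster in clusters
        if cluster
    )
    return majority / total


# Hypothetical toy data: three 8-bit fingerprints.
fps = {"sm_a": 0b11110000, "sm_b": 0b11100000, "sm_c": 0b00001111}
matrix = similarity_matrix(fps)

# Hypothetical clustering result and target labels for four SMs.
toy_clusters = [["sm_a", "sm_b"], ["sm_c", "sm_d"]]
toy_targets = {"sm_a": "T1", "sm_b": "T1", "sm_c": "T2", "sm_d": "T1"}
purity = cluster_purity(toy_clusters, toy_targets)
```

In a parameter sweep of the kind described, a purity score like this would be recomputed for each clustering setting and the setting with the best score retained.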
Additionally, a reverse ligand- and structure-based virtual screening approach was taken to probe all 111 million PubChem compounds in search of putative CBIs. Three SMs (quinoxyphen, flupoxam, and fluopipamine) were compared against every PubChem compound using four different fingerprint types, and the top percentage of Dice similarity comparisons was retained. This resulted in roughly 75,000 SMs of high similarity to flupoxam, quinoxyphen, or fluopipamine. Roughly 53,000 SMs obey Lipinski’s rule of five and roughly 1,600 are lead-like. Modeling was performed for roughly 72,000 SMs against a wild-type and six mutant CESA1 proteins using AutoDockGPU. This yielded SMs with binding affinity equal to or better than that of known CBIs. For example, 42 lead-like SMs had better binding affinity than fluopipamine in the CESA1 G1009S model.
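The Dice ranking and the rule-of-five filter mentioned above can be illustrated with a short sketch. It assumes fingerprints held as Python integers used as bit vectors and descriptor values (molecular weight, logP, hydrogen-bond donors and acceptors) already computed elsewhere. The function names, the candidate names, and the retained fraction are hypothetical, and the common convention of tolerating at most one rule-of-five violation is an assumption, since the abstract does not state the cutoff used.

```python
from typing import Dict, List


def dice(a: int, b: int) -> float:
    """Dice similarity of two fingerprints stored as integer bit vectors."""
    bits_a = bin(a).count("1")
    bits_b = bin(b).count("1")
    if bits_a + bits_b == 0:
        return 0.0
    return 2 * bin(a & b).count("1") / (bits_a + bits_b)


def top_fraction(query: int, library: Dict[str, int], fraction: float) -> List[str]:
    """Names of the library SMs in the top `fraction` by Dice similarity to the query."""
    scored = sorted(((dice(query, fp), name) for name, fp in library.items()), reverse=True)
    keep = max(1, int(len(scored) * fraction))
    return [name for _, name in scored[:keep]]


def lipinski_violations(mw: float, logp: float, hbd: int, hba: int) -> int:
    """Count rule-of-five violations: MW > 500, logP > 5, donors > 5, acceptors > 10."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


def passes_rule_of_five(mw: float, logp: float, hbd: int, hba: int,
                        max_violations: int = 1) -> bool:
    """Assumed convention: at most one violation is tolerated."""
    return lipinski_violations(mw, logp, hbd, hba) <= max_violations


# Hypothetical toy query and library of 8-bit fingerprints.
query_fp = 0b11110000
library = {
    "cand_1": 0b11110000,
    "cand_2": 0b11100000,
    "cand_3": 0b00001111,
    "cand_4": 0b11000011,
}
hits = top_fraction(query_fp, library, fraction=0.5)
```

With real data the same ranking would be repeated per query compound and per fingerprint type, and the surviving hits passed through the descriptor filter before docking.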
This work is an example of how in silico, in vitro, and in vivo biology can be combined to yield insight into how a SM interacts with a protein target. It also explores how active compounds can be used to generate lists of SMs that could have high affinity in vivo for protein targets of interest. Given the vast amount of data present within an organism, it is imperative that biology enlist the help of computational biologists going forward.
Funding was provided by National Science Foundation Cooperative Agreement 1849213 from 2016 to 2021.
This study was also supported by the Department of Energy Office of Science Graduate Student Research Program from 2019 to 2020.
Amos, B. Kirtley, "LEVERAGING CHEMICAL AND COMPUTATIONAL BIOLOGY TO PROBE THE CELLULOSE SYNTHASE COMPLEX" (2021). Theses and Dissertations--Plant and Soil Sciences. 148.
Program_that_comapres_a_SM_to_PubChem.py (14 kB)
Program_that_creates_a_network_from_comparisons.py (18 kB)
Program_to_assess_FP_distribution_within_DUDE.py (11 kB)
Sample_curated_list_of_plant_active_compounds.xlsx (18 kB)
Script_to_assess_cluster_purity.py (6 kB)
Script_to_move_files_to_sub_directories.py (1 kB)
Script_to_perform_a_BFS.py (5 kB)