Computational de-orphanization of the olive oil biophenol oleacein: Discovery of new metabolic and epigenetic targets

The health promoting effects of extra virgin olive oil (EVOO) relate to its unique repertoire of phenolic compounds. Here, we used a chemoinformatics approach to computationally identify endogenous ligands and assign putative biomolecular targets to oleacein, one of the most abundant secoiridoids in EVOO. Using a structure-based virtual profiling software tool and reference databases containing more than 9,000 binding sites protein cavities, we identified 996 putative oleacein targets involving more than 700 proteins. We subsequently identified the high-level functions of oleacein in terms of biomolecular interactions, signaling pathways, and protein-protein interaction (PPI) networks. Delineation of the oleacein target landscape revealed that the most significant modules affected by oleacein were associated with metabolic processes (e.g., glucose and lipid metabolism) and chromatin-modifying enzymatic activities (i.e., histone post-translational modifications). We experimentally confirmed that, in a low-micromolar physiological range (< 20 m mol/L), oleacein was capable of inhibiting the catalytic activities of predicted metabolic and epigenetic targets including nicotinamide N-methyltransferase, ATP-citrate lyase, lysine-specific demethylase 6A, and N-methyltransferase 4. Our computational de-orphanization of oleacein provides new mechanisms through which EVOO biophenols might operate as chemical prototypes capable of modulating the biologic machinery of healthy aging.


M A N U S C R I P T
A C C E P T E D ACCEPTED MANUSCRIPT

INTRODUCTION
The ability of the "Mediterranean diet", which reflects the dietary patterns found in olive-growing areas of the Mediterranean basin, to significantly reduce aging-related morbidity and promote increased life expectancy can be largely attributed to the unique nutraceutical properties of extra virgin olive oil (EVOO) (Colomer and Menendez, 2006;Menendez and Lupu, 2006;Escrich et al., 2007;López-Miranda et al., 2010;Fernández del Río et al., 2016;Piroddi et al., 2017). The positive influence of EVOO on human health has been historically ascribed to its high content of monounsaturated fatty acids (e.g., oleic acid; 18:1n-9). However, it has been shown that other oleic acid-rich oils but lacking the characteristic functional components of EVOO (e.g., biophenols) such as high-oleic canola or high-oleic sunflower oils do not share the same ability to improve, for example, cardiovascular prognosis, and to concurrently lowering the incidence of cancer and neurodegeneration (Pérez-Jiménez et al., 2007;Visioli and Bernardini, 2011;Menendez et al., 2013;Reboredo-Rodríguez et al., 2018). Accordingly, the pharma-nutritional properties of EVOO are now largely attributed to its unique repertoire of EVOO biophenolic compounds including simple phenols (e.g., tyrosol and hydroxytyrosol) and secoiridoids (e.g., oleuropein aglycone, ligstroside aglycone, oleocanthal, and oleacein).
Although significant efforts have been made to determine the physiological mechanisms regulated by EVOO phenolics, their actual molecular interactors remain largely unknown. Indeed, only a few studies have explored in depth the molecular mechanisms through which EVOO-specific phenolics may provide health benefits (Beauchamp et al., 2005;Vazquez-Martin et al., 2012;Menendez et al., 2013;Corominas-Faja et al., 2018;Pang and Chin, 2018;Verdura et al., 2018). In this regard, we recently envisioned that in silico, computational approaches might be M A N U S C R I P T A C C E P T E D ACCEPTED MANUSCRIPT 4 particularly useful to elucidate hypothesis-generating pharmacological effects, mechanisms of action, and targets underlying the health-promoting activities of EVOO-related biophenols. Taking advantage of the so-called biological activity spectra (BAS), an intrinsic property of a compound that is largely dependent on its structure and reflects pharmacological effects, physiological, and biochemical mechanisms of action, and specific toxicities, we recently employed PASS (Prediction of Activity Spectra for Substances) software (Lagunin et al., 2000(Lagunin et al., , 2010Stepanchikova et al., 2003) to analyze the BAS of the complex EVOO secoiridoids oleuropein aglycone and decarboxymethyl oleuropein aglycone (oleacein) (Corominas-Faja et al., 2014). This approach highlighted a number of putative pharmacological effects (e.g., anti-oxidant, anti-inflammatory, anti-neoplastic, and chemopreventive) that might underlie the health-promoting effects of some EVOO biophenols, but the overall picture remained incomplete with regards to their mechanism of action at the molecular level.
We here hypothesized that a reverse pharmacology computational approach might be helpful to "de-orphanize" endogenous ligands and assign molecular functions to EVOO biophenols. Because the oleuropein degradation product oleacein (dialdehydic form of decarboxymethyl elenolic acid linked to hydroxytyrosol; 3,4-DHPEA-EDA) is one of the most abundant EVOO secoiridoids and has been proposed as the key effector of the nutritional and beneficial effects of EVOO given its capacity to survive to the acidic conditions of stomach and be available for absorption into the systemic circulation (Nardi et al., 2014;Costanzo et al., 2017;Corominas-Faja et al., 2018;Lombardo et al., 2018;Celano et al., 2019), we decided to use a systematic chemoinformatics approach coupled to laboratory-based confirmatory testing to discover new biomolecular targets through which oleacein M A N U S C R I P T A C C E P T E D ACCEPTED MANUSCRIPT 5 might operate as a poly-therapeutic tool capable of modulating the biologic machinery of healthy aging.

RESULTS
We used a virtual profiling approach employing the structure-based software Ixchel (Cuyàs et al., 2018), which can perform docking calculations of a given molecule against a database of ~9,000 binding sites protein cavities (curated from RSCB PDB according to the UnitProtKB human entries) and returns the binding energy of every possible interaction. Through this approach we identified 1,251 putative targets.
Using a binding energy "significance" threshold of -6.0 kcal/mol to filter the dockingbased inverse virtual screening, which was chosen considering the size of the molecule and to ensuring 50% inhibitory concentrations (IC 50 ) values within the physiological micromolar range of oleacein (Xu et al., 2018;Gimeno et al., 2019), we finally selected 996 putative oleacein targets involving >700 different proteins (supplementary Table S1).

Gene enrichment analyses and construction of the oleacein target
landscape. To quantitatively assess whether the selected group of targets was more enriched with genes belonging to a specific Gene Ontology (GO) term, or with genes involved in a particular pathway more than would be expected by chance, we utilized Cytoscape, a network visualization and analysis tool, for GO enrichment analyses (Shannon et al., 2003). This included also enrichment analyses of biological pathways and reactions supplied by KEGG, Reactome, and Wikipathways (supplementary Table S2 Because the aforementioned approach generated a large number of enriched GO terms, we then used ClueGO, a Cytoscape plug-in that strongly improves biological interpretation of large lists of genes (Bindea et al, 2009) and integrates GO terms as well as KEGG pathways to create functionally organized GO/pathway term networks maintaining the most significant term as the group representative (supplementary Tables S4, S5, S6, S7).
A visualization of the GO enrichment analysis for putative oleacein targets regarding biological process (BP), cellular compartment (CC), and molecular function (MF) is shown in Fig.1. Among these terms, peptidyl-amino acid modification, membrane raft, and protein kinase activity were the most significant in the BP, MF, and CC groups, respectively. Also, candidate proteins were enriched in terms of histone phosphorylation (BP), caveola (CC), and histone kinase activity (MF). We then performed pathway enrichment analysis to assess the gene-associated pathways potentially targeted by oleacein (Fig. 2). Overall, the greatest number of proteins were involved in central carbon metabolism in cancer ( Fig. 2; supplementary File S8) (Luo and Brouwer, 2013). A circus plot summarizing the relationships between effectors within KEGG pathways and the functions triggered by them is shown in Fig. 3. Figure 4 illustrates the GO maps for BP, CC, and MF terms, with each node representing a specific GO term and links between nodes illustrating the interaction M A N U S C R I P T A C C E P T E D ACCEPTED MANUSCRIPT 7 between GO terms. The GO map for BP included numerous metabolic processes, whereas caveolae/raft membrane microdomains and cytoplasmic vesicle dominated the CC and MF GO maps, respectively. Figure 5 shows a visualization of the oleacein target landscape using a circle-packing graph of core proteins grouped into higher order functions, in which the circles are sized by relative abundances of proteins contributing to each function. This analysis supports the notion that oleacein preferentially targets metabolic-related processes and histone-related functions.

Experimental validation of oleacein-targeted metabolic and epigenetic
targets. We finally selected four different metabolic and epigenetic enzymes, namely nicotinamide N-methyltransferase (NNMT), which catalyzes the N-methylation of nicotinamide, pyridines, and other analogs using S-adenosyl-l-methionine (SAM) as donor (Hoshino et al., 1982), ATP-citrate lyase (ACLY), a cytoplasmic enzyme that catalyzes the production of acetyl-CoA from citrate, CoA, and ATP (Srere, 1959), lysine-specific demethylase 6A/UTX (KDM6A) (Agger et al., 2007), which acts as a component of the COMPASS complex to control gene activation by catalyzing demethylation of H3K27me2/3 in the chromatin of genes and enhancers, and DOT1L (also called lysine N-methyltransferase 4), a SAM-dependent methyltransferase that can add up to three methyl groups to histone H3 lysine 79 (H3K79) (Feng et al., 2002), to computationally rationalize the binding mode of oleacein and experimentally verify in vitro its inhibitory activity towards them.  Table 2).

DISCUSSION
EVOO biophenols constitute biologically pre-validated chemical prototypes capable of interacting with health-promoting molecular targets. However, although it is widely accepted that the chemical architectures of natural products might offer unique opportunities for drug discovery, the bulk of the chemical environment specifically present in the phenolic fraction of EVOO is essentially uncharted. We here provide biocomputational evidence to suggest that the capacity of the EVOO secoiridoid oleacein to provide health benefits likely involves a significant number of metabolic and epigenetic targets.
Proteomics is the technology most commonly employed for identifying molecular binding counterparts for naturally occurring chemotypes. However, target prediction software tools are becoming increasingly useful to assign putative M A N U S C R I P T A C C E P T E D ACCEPTED MANUSCRIPT 9 biomolecular targets to natural products. In this line, the virtual profiling structurebased software tool Ixchel allowed us to predict that the oleacein scaffold displays potential for engaging almost 1,000 putative targets involving more than 700 proteins.
Our integrative analysis initially aimed to provide evidence-based statements associating the predicted oleacein targets with specific ontology terms, and to interpret their high-level functions in terms of signaling pathways. Regarding GO annotation, the most significantly enriched terms were associated with cellular protein modification processes. It is noteworthy that protein post-translational modifications (PPMs), either spontaneous or physiological/pathological, are epigenetic mechanisms emerging as critical contributors to aging and aging-related diseases (Santos and Lindner, 2017;Vanhooren et al., 2015). Furthermore, the finding of lipid raft microdomains (caveolae) as the preferential cellular compartment of oleacein, adds a spatial dimension to the PPM nature of oleacein because the key function of caveolae is to concentrate signaling molecules, allowing their rapid activation upon demand by PPMs (Ortegren et al., 2007;Layne et al., 2011). These findings support the notion that plant-derived phenolics can target and change the activation status of caveolae-associated signaling proteins, and agree with earlier studies from our group suggesting that EVOO phenolics such as oleacein could accumulate in lipid raft microdomains and subsequently interfere with the dimerization and activation (e.g., auto-phosphorylation) of cancer-related protein tyrosine kinases such as HER2 (Menendez et al., 2008(Menendez et al., , 2009. Regarding pathway enrichment, the majority of significantly enriched terms were associated with pathways in cancer, whereas the majority of predicted targets were involved in central carbon metabolism in cancer, which is known to involve those proteins providing specific adaptations of cellular metabolism to support growth and survival in cancer cells.
We constructed functional protein association networks with the aim of interpreting the impact of oleacein on complex biological processes. A bird's-eye view of the oleacein target landscape revealed that the most significant modules were associated with metabolic processes (e.g., glucose and lipid metabolism) and chromatin-modifying enzymatic activities (i.e., histone post-translational modifications). These findings agree with and further expand on a previously described cell-based phenotypic drug discovery strategy coupled to mechanism-ofaction profiling and target deconvolution demonstrating the ability of oleacein to target cancer stem cells via metabolo-epigenetic mechanisms (Corominas-Faja et al., 2018). Nonetheless, because experimental confirmation is required for these predictions, we selected two metabolic (i.e., NNMT and ACLY) and two epigenetic (KDM6A/UTX and DOT1L) processes to directly evaluate the regulatory effects of oleacein. The acetyl-CoA-synthesizing enzyme ACLY, a strategic enzyme linking glycolytic and lipidic metabolism, is known to be aberrantly expressed in many types of tumors, and also to couple changes of intermediate metabolism to the chromatinmediated regulation of transcription during aging (Wellen et al., 2009;Zaidi et al., 2012;Peleg et al., 2016;Zhao et al., 2016;Granchi, 2018). NNMT impairs the methylation potential of cells and brings about an altered epigenetic state by consuming methyl units from SAM to create the stable metabolic product 1methylnicotinamide (MNA), which is associated with obesity, type-2 diabetes, cancer, and aging (Ulanovskaya et al., 2013;Kraus et al., 2014;Hong et al., 2015Hong et al., , 2018.

CONCLUSIONS
In summary, although significant efforts have been made to clarify the physiological mechanisms regulated by EVOO phenolics, their actual molecular interactors remain largely unknown. Here, we have outlined a first-in-class systematic chemoinformatics approach to computationally predict biomolecular targets of the EVOO secoiridoid oleacein. We provide computational evidence coupled to experimental validation to suggest that one of the most representative EVOO biophenols can operate as a polytherapeutic tool via targeting of metabolic and epigenetic mechanisms. From a pharmacophore perspective, our target de-orphanization of oleacein might illuminate innovative scaffolds that could be employed in rational chemical biology and drug discovery approaches aimed to design new EVOO phenolics-inspired chemical entities.

M A N U S C R I P T
A C C E P T E D

Virtual profiling.
Virtual profiling was performed with the structure-based software tool Ixchel (http://www.mindthebyte.com), using the chemical structure of oleacein as a seed. Ixchel is a structure-based VP tool that performs docking calculations of a molecule (SDF or SMILE file) against an in-house developed database of almost 9,000 binding sites protein cavities curated from RSCB PDB according to UniProtKB human entries. Ixchel returns the binding energy of every possible interaction, which allows the classification and prediction of the targets.

Gene Enrichment Analysis. Gene enrichment analysis was performed using
ClueGO, a Cytoscape plug-in, interrogating GO terms (Biological Process, Cellular Component and Molecular Function; www.geneontology.org) and KEGG pathways (www.genome.jp/kegg). The selection criteria comprised a two-sided hypergeometric

M A N U S C R I P T A C C E P T E D ACCEPTED MANUSCRIPT
13 test (enrichment/depletion) followed by a p-value correction Bonferroni test, filtering for q-values < 0.05. A final step involved grouping terms with more than 25% of common genes with a level of agreement greater than 0.40 based on the Cohen's kappa coefficient. The term with the lowest q-value was designated as group representative.

NNMT.
NNMT activity was measured using the N′-Nicotinamide After the enzymatic reaction, the reaction mixtures were discarded and each well was washed three times with TBST buffer and slowly shaken with blocking buffer (BPS#52100) for 10 minutes. The wells were emptied and 100 µL of diluted primary M A N U S C R I P T

A C C E P T E D ACCEPTED MANUSCRIPT
15 antibody in blocking buffer (BPS#52140Y) was added, and the plate was slowly shaken for 10 minutes at RT. After removal of the primary antibody mix, 100 µL of diluted HRP-labeled secondary antibody (BPS#52131H) was added and the plate was slowly shaken for 30 minutes at RT. As before, the plate was emptied and washed three times with blocking buffer for 10 minutes at RT. Blocking buffer was finally discarded and 100 µL of freshy prepared HRP chemiluminiscent substrate was added to each well. Luminescence was immediately measured using a BioTek Synergy 2 microplate reader. Luminescence data were converted to DOT1L activity (%) and the IC 50 value was determined as mentioned above.
Serial dilution of the compounds was first performed in 3.3% DMSO/assay buffer.
From this step, 3 µL of compound is added to 4 µL of enzyme and is incubated for 30 minutes at RT. After this incubation, 3 µL of substrate is added to initiate the reaction.
The final DMSO concentration is 1%. After enzymatic reactions, 5 µL of anti-mouse acceptor beads (PerkinElmer, diluted 1:500 with 1× detection buffer) or 5 µL of antirabbit acceptor beads (PerkinElmer, diluted 1:500 with 1× detection buffer) and 5 µL of primary antibody (BPS#52140E,F, diluted 1:200 with 1x detection buffer) were added to the reaction mix. After brief shaking, the plate was incubated for 30 minutes. The AlphaScreen intensity data were analyzed and compared using Graphpad Prism software (GraphPad Software Inc.). In the absence of oleacein, the A-screen or fluorescence intensity (F t ) was defined as 100% activity. In the absence of enzyme, the intensity (F b ) was defined as 0% activity. The percent activity in the presence of oleacein was calculated according to the following equation: %activity = where F=the A-screen intensity in the presence of oleacein. A-screen data were converted to KDM6A/UTX activity (%) and the IC 50 value was determined as mentioned above.

Statistical analysis. All statistical analyses were performed using GraphPad
Prism software (GraphPad Software Inc.). Data are presented as mean ± S.D.
Comparisons of means of ≥ 3 groups were performed by ANOVA with the post-hoc Bonferroni correction when appropriate. P values < 0.05 were considered to be statistically significant (denoted as *). All statistical tests were two-sided.       and PHE245. This group of residues was maintained after docking and MD simulations, and also LYS187 and PHE223 were revealed to be key residues as both are involved in TT8 binding. Another group of residues interacting with oleacein after the MD simulations were GLY163, GLN168 and VAL240, and all stabilized the molecule by establishing hydrogen bonds interactions.