Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for Relevance to Radiography and Histopathology Imaging Biomarkers of Clinical Outcomes

Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for... original reports abstract SPECIAL SERIES: INFORMATICS TOOLS FOR CANCER RESEARCH AND CARE Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for Relevance to Radiography and Histopathology Imaging Biomarkers of Clinical Outcomes 1,2 2,3 1 2,3 1 Olivier Gevaert, PhD ; Mohsen Nabian, PhD ; Shaimaa Bakr, MS ; Celine Everaert, PhD ; Jayendra Shinde, PhD ; 2,3 4 4 2,5 6 2 Artur Manukyan, PhD ; Ted Liefeld, SM ; Thorin Tabor, MS ; Jishu Xu, PhD ; Joachim Lupberger, PhD ; Brian J. Haas, MS ; 6 7 4 2,3 3 Thomas F. Baumert, MD ; Mikel Hernaez, PhD ; Michael Reich, BS ; Francisco J. Quintana, PhD ; Erik J. Uhlmann, MD ; 3 4 2,8 2,3 Anna M. Krichevsky, PhD ; Jill P. Mesirov, PhD ; Vincent Carey, PhD ; and Nathalie Pochet, PhD PURPOSE The availability of increasing volumes of multiomics, imaging, and clinical data in complex diseases such as cancer opens opportunities for the formulation and development of computational imaging genomics methods that can link multiomics, imaging, and clinical data. METHODS Here, we present the Imaging-AMARETTO algorithms and software tools to systematically interrogate regulatory networks derived from multiomics data within and across related patient studies for their relevance to radiography and histopathology imaging features predicting clinical outcomes. RESULTS To demonstrate its utility, we applied Imaging-AMARETTO to integrate three patient studies of brain tumors, specifically, multiomics with radiography imaging data from The Cancer Genome Atlas (TCGA) glio- blastoma multiforme (GBM) and low-grade glioma (LGG) cohorts and transcriptomics with histopathology imaging data from the Ivy Glioblastoma Atlas Project (IvyGAP) GBM cohort. Our results show that Imaging- AMARETTO recapitulates known key drivers of tumor-associated microglia and macrophage mechanisms, mediated by STAT3, AHR, and CCR2, and neurodevelopmental and stemness mechanisms, mediated by OLIG2. Imaging-AMARETTO provides interpretation of their underlying molecular mechanisms in light of imaging biomarkers of clinical outcomes and uncovers novel master drivers, THBS1 and MAP2, that establish relationships across these distinct mechanisms. CONCLUSION Our network-based imaging genomics tools serve as hypothesis generators that facilitate the interrogation of known and uncovering of novel hypotheses for follow-up with experimental validation studies. We anticipate that our Imaging-AMARETTO imaging genomics tools will be useful to the community of bio- medical researchers for applications to similar studies of cancer and other complex diseases with available multiomics, imaging, and clinical data. JCO Clin Cancer Inform 4:421-435. © 2020 by American Society of Clinical Oncology Licensed under the Creative Commons Attribution 4.0 License ASSOCIATED CONTENT 4,5 INTRODUCTION patients with glioblastoma multiforme (GBM) and Appendix approximately 500 patients with low-grade glioma Author affiliations Major collaborative initiatives have unleashed a myriad and support (LGG). The TCIA Visually AcceSAble Rembrandt of multiomics, clinical, and imaging data for large 7-9 information (if Images (VASARI) project curated a feature set of patient cohorts in studies of cancer, such as multio- applicable) appear at approximately 30 magnetic resonance imaging mics and clinical data from The Cancer Genome Atlas the end of this 1 (MRI)–derived features on the basis of specialists’ article. (TCGA) and the Clinical Proteomic Tumor Analysis 2 review that is available for approximately 200 patients Accepted on March Consortium (CPTAC) and radiographic and histopa- with GBM and approximately 180 patients with LGG. 16, 2020 and thology imaging data from The Cancer Imaging Ar- published at 3 chive (TCIA). For example, the brain tumor section of A trade-off exists between the number of patients in- ascopubs.org/journal/ TCGA provides multiomics profiles, including RNA cluded in a data set and the depth of analysis that has cci on May 8, 2020: sequencing (RNA-seq), DNA copy number variation, moved to increasing levels of refinement, ranging from DOI https://doi.org/10. 10-12 1200/CCI.19.00125 and DNA methylation data for approximately 500 studying tissues, to cell populations, to single-cell 421 Gevaert et al CONTEXT Key Objective Imaging-AMARETTO provides software tools for imaging genomics through multiomics, clinical, and imaging data fusion within and across multiple patient studies of cancer, toward better diagnostic and prognostic models of cancer. Knowledge Generated Our network-based imaging genomics tools serve as powerful hypothesis generators that facilitate the testing of known hypotheses and uncovering of novel hypotheses for follow-up with experimental validation studies. Our case study that integrated multiple studies of brain cancer illustrates how Imaging-AMARETTO can be used for imaging diagnostics and prognostics by interrogating multimodal and multiscale networks for imaging biomarkers to identify their clinically relevant underlying molecular mechanisms. Relevance We anticipate that our Imaging-AMARETTO tools for network-based fusion of multiomics, clinical, and imaging data will directly lead to better diagnostic and prognostic models of cancer. In addition, our tools for network biology and medicine will open new avenues for drug discovery by integrating pharmacogenomics data into these networks, toward better therapeutics of cancer. sequencing. For example, the Ivy Glioblastoma Atlas Project mechanisms, with interpretation of the underlying mo- 13-15 (IvyGAP) provides 270 transcriptomic profiles refined lecular mechanisms in light of imaging biomarkers of clinical outcomes. through histopathology imaging and annotated by a con- sensus of histopathologists for studying anatomic structures METHODS and cancer stem cells for a subset of approximately 30 The *AMARETTO Software Architecture patients from the TCGA GBM cohort. The *AMARETTO framework (Fig 1) provides tools for In parallel, quantitative imaging provides tools that are network-based fusion of multiomics, clinical, and imaging capable of processing large volumes of radiography and data within and across multiple patient studies of cancer. histopathology images, such as deep convolutional neural Specifically, this framework offers modular and comple- networks. The promising field of radiogenomics is based on mentary solutions to multimodal and multiscale aspects the idea that entities at different scales, such as molecules, of network-based modeling within and across studies cells, and tissues, are linked to one another and, therefore, of cancer through the AMARETTO and Community- may be modeled as a whole. Studies have shown that AMARETTO algorithms, respectively. In this work, we quantitative image features extracted from radiography present an imaging genomics software toolbox that com- imaging data are associated and predictive of gene ex- prises the newly formulated Imaging-AMARETTO and 17-20 pression patterns from tissues of matched tumors. Imaging-Community-AMARETTO algorithms that together Recent efforts further expand this work to link multio- facilitate interpretation of patient-derived multiomics net- mics data with both radiography and histopathology works for their relevance to radiography and histopathology 21,22 imaging, toward developing methods for imaging imaging biomarkers of clinical outcomes. Resources of the genomics. *AMARETTO software toolbox are available through These large archives of multimodal and multiscale data GitHub, Bioconductor, R Jupyter Notebook, GenePattern, sources provide complementary insights into the mech- GenomeSpace, and GenePattern Notebook. anistic basis of cancer, toward better diagnosis and Imaging-AMARETTO From a User’s Perspective treatment, and open unprecedented modeling opportu- nities to link multiomics data with clinical and imaging The workflow for multiomics, clinical, and imaging data phenotypes. As a solution to imaging genomics, we in- fusion includes utilities to learn networks from individual troduce the Imaging-AMARETTO software tools to sys- patient cohorts using Imaging-AMARETTO and to link tematically interrogate networks derived from multiomics networks across multiple related patient cohorts using data for relevance to imaging biomarkers of clinical out- Imaging-Community-AMARETTO. Together, these work- comes. We demonstrate the utility of these imaging ge- flows allow users to integrate patient tumor-derived mul- nomics tools by integrating three patient brain tumor tiomics or transcriptional profiles with clinical and databases, including the TCGA GBM and LGG and the molecular characteristics and radiography and histopa- IvyGAP GBM cohorts. We uncover known and novel thology imaging features within and across related patient drivers of tumor-associated microglia and macrophage cohorts. The Imaging-AMARETTO source code in R is mechanisms, and neurodevelopmental and stemness available from GitHub. An R Jupyter Notebook that 422 © 2020 by American Society of Clinical Oncology Community 2 Community M Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Regulatory Mechanisms of Drivers from Multiomics Subnetworks of Regulatory Circuits RNA TCGA GBM MET GENE GENE GENE CNV IvyGAP GBM Networks of Regulatory Circuits Circuit 1 Circuit 2 TCGA LGG Circuits Target genes Driver genes Activators Circuit 3 Circuit N Repressors ... (2) Community-AMARETTO (1) AMARETTO Clinical Disease Molecular Disease Radiography Imaging Histopathology Imaging Biomarker Discovery Biomarker Discovery Biomarker Discovery Biomarker Discovery (3) Imaging-AMARETTO and Imaging-Community-AMARETTO FIG 1. The Imaging-AMARETTO and Imaging-Community-AMARETTO software architecture. The overall framework offers modular and com- plementary solutions to multimodal and multiscale aspects of network-based modeling within and across multiple studies of cancer. Specifically, (1) The AMARETTO algorithm learns networks of regulatory modules or circuits (circuits of drivers and target genes) from functional genomics or multiomics data (eg, DNA copy number variation [CNV], DNA methylation [MET], RNA gene expression [RNA]) within each study of cancer (eg, within The Cancer Genome Atlas [TCGA] glioblastoma multiforme [GBM], Ivy Glioblastoma Atlas Project [IvyGAP] GBM, or TCGA low-grade glioma [LGG] cohorts separately); (2) the Community-AMARETTO algorithm learns communities or subnetworks of regulatory circuits that are shared or distinct across networks derived from multiple studies of cancer (eg, across the TCGA GBM, IvyGAP GBM, and TCGA LGG cohorts); and (3) the Imaging-AMARETTO and Imaging-Community-AMARETTO algorithms associate these circuits (AMARETTO) and subnetworks (Community- AMARETTO) to clinical, molecular, and imaging-derived biomarkers by mapping radiography and histopathology imaging data onto the networks and assessing their clinical relevance for imaging diagnostics. provides stepwise guidelines for running the source code MSigDB Hallmark). When genetic or epigenetic data are directly from GitHub for its application to brain cancer is also available, they can help to guide the selection of available from Google Colaboratory. candidate driver genes. Potential cancer drivers are identified as somatic recurrent cancer aberrations from Imaging-AMARETTO supports multiple workflows: a pa- genetic and epigenetic data sources using GISTIC and tient cohort with only transcriptional profiles and a patient 32,33 MethylMix. The GISTIC algorithm is used to identify cohort with multiomics profiles. When only RNA-seq data copy number amplifications and deletions from DNA copy are available for a cohort, a predefined list of candidate number variation data. The MethylMix algorithm is used to driver genes is required, which can be selected or uploaded identify hyper- and hypomethylated sites from DNA by the user. Predefined lists of known drivers are available 26,27 methylation data. The user also specifies the number of for collections of transcription factors (TFutils and 28,29 regulatory modules to be learned from the data and the Molecular Signatures Database [MSigDB] C3) and cancer driver genes (COSMIC Cancer Gene Census, percentage of most varying genes to be included in the JCO Clinical Cancer Informatics 423 Community 1 Gevaert et al analysis. Networks of regulatory modules are inferred from To validate the predicted drivers as regulators of their RNA-seq data using an iterative optimization procedure. targets in modules and communities, we assess whether 34-37 The algorithm starts with an initialization step that activator or repressor drivers have a direct or indirect im- clusters the genes into modules of co-expressed target pact on their targets using experimental genetic pertur- genes. For each of these modules, we learn the regulatory bation data. The user can test signatures derived from programs as a linear combination of candidate driver genes genetic perturbation experiments, such as signatures of that best predict their target genes’ expression using Elastic target genes bound to transcription factors measured in Net–regularized regression. Target genes are then reas- protein-DNA–binding chromatin immunoprecipitation signed to the regulatory programs that best explain their sequencing (ChIP-Seq) experiments (Encyclopedia of 46 47 gene expression levels as estimated by the predictive power DNA Elements, ChIP-X Enrichment Analysis, of the regulatory programs’ respective regularized re- Harmonizome )ordefined by motif binding (MSigDB gression models when predicting the target genes’ C3), or signatures of genes induced or repressed in re- 34-37 expression. The algorithm iterates over these two steps sponse to genetic knockdown or overexpression experi- until convergence. This analysis generates a network of ments of drivers (Library of Integrated Network-Based 49,50 regulatory modules, defined as a group of target genes Cellular Signatures [LINCS]/Connectivity Map [CMAP], collectively activated or repressed by their associated Harmonizome). drivers. To characterize regulatory modules for clinical outcomes and molecular biomarkers, the user can submit pheno- Imaging-Community-AMARETTO: Linking types known for all or subsets of samples and specify the Imaging-AMARETTO Networks Across Cohorts statistical hypothesis tests to use for each phenotype. Ex- To compare and integrate networks of regulatory modules amples of clinical and molecular phenotypes include across multiple cohorts, the user can submit two or more survival data, molecular subclasses (eg, mesenchymal, Imaging-AMARETTO networks and optionally add known proneural, or classical GBM, astrocytoma or oligoden- networks as collections of signatures to guide subnet- droglioma LGG) and biomarkers (eg, IDH mutation, EGFR work learning and interpretation, such as immune cell amplification, MGMT methylation status). Our imple- 39,40 41-43 (CIBERSORT ) and stemness signatures. The mentation supports survival analysis using Cox proportional algorithm creates a module map of all pairwise com- hazards regression, nominal two-class and multiclass parisons between modules across multiple networks to analysis using the Wilcoxon rank sum and Kruskal-Wallis assess the extent of overlapping genes between all pairs tests, and continuous or ordinal analysis using the Pearson of modules (−log P value, hypergeometric test). This linear and Spearman rank correlation tests. These clinical module map is partitioned using an edge betweenness and molecular phenotype associations are assessed for community detection algorithm (Girvan-Newman ) that each of the regulatory modules in individual cohorts and groups the modules into subnetworks or communities combined for communities across cohorts. across the multiple networks. These communities repre- Finally, to interpret regulatory modules for relevance to sent shared behavior across two or more cohorts, and radiography or histopathology imaging features, associa- modules not assigned to communities are reported as tions with these imaging features can be assessed. Ex- distinct behavior specific for each cohort. This analysis amples of radiography and histopathology phenotypes generates subnetworks or communities of regulatory include the 30 TCIA VASARI MRI features defined by expert modules that are shared or distinct across multiple consensus for the TCGA GBM and LGG cohorts, the Imaging-AMARETTO networks derived from multiple co- 13-15 IvyGAP histopathology imaging features characterizing horts and further refines shared and distinct behavior of RNA-seq samples refined for anatomic structures and modules with respect to their specific drivers. cancer stem cells defined by expert consensus for the Downstream Utility for Interpreting Clinical and IvyGAP GBM cohort, and radiography and histopathology Experimental Outcomes imaging features derived using quantitative imaging 20,52-54 methods. We developed several downstream utilities that facilitate interpretation of the Imaging-AMARETTO networks, in- Users are provided with all results in the form of hypertext cluding functional characterization, driver validation, clin- markup language (HTML) reports that are generated in an ical correlation, and imaging association. To functionally automated manner for individual cohorts using Imaging- characterize modules and communities, we provide sig- AMARETTO and multiple cohorts using Imaging-Commu- natures from known gene sets databases (MSigDB) that nity-AMARETTO. These reports include searchable tables can be augmented with user-defined signatures, such as within and across modules and communities, including 39,40 45 41-43 immune cell, stromal cell, and stemness signa- statistics (ie, coefficients, P values, false discovery rate tures. Regulatory modules and communities are assessed [FDR] values) for functional enrichment, driver validation, for enrichment in these known functional categories clinical and molecular biomarkers, and radiography and (hypergeometric test). histopathology imaging features. These reports also include 424 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer heat map visualizations for modules (Figs 2-6) and graph Diametrically opposed, higher expression levels of OLIG2 visualizations for communities (Appendix Fig A1). Source modules (Figs 4-6) are associated with better survival in code is also provided to convert Imaging-AMARETTO and GBM and LGG, and these modules also distinguish be- Imaging-Community-AMARETTO networks for depositing tween molecular subclasses of GBM and LGG but in the networks in the NDEx network database, taking advan- opposite direction. In GBM the classical and proneural tage of its interactive features. subclasses are represented by higher expression of these modules compared with the mesenchymal subclass. In RESULTS LGG, the oligodendroglioma subtype is characterized by To demonstrate its utility, we applied the Imaging- higher expression of these modules compared with the AMARETTO workflow to three studies of brain tumors: astrocytoma subtype. multiomics profiles from approximately 500 patients and approximately 30 radiography MRI features for Imaging-AMARETTO Deciphers Histopathology Imaging approximately 200 patients from the TCGA GBM Biomarkers of Key Driver Mechanisms cohort, multiomics profiles from approximately 500 Histopathology imaging features of anatomic structures patients and approximately 30 radiography MRI fea- show that higher expression of STAT3, AHR, and CCR2 tures for approximately 180 patients from the TCGA modules (Fig 3) distinguishes between samples derived LGG cohort, and for a subset of approximately 30 from the cellular tumor compared with those from leading patients from the TCGA GBM patient cohort 270 tran- edge and infiltrating tumor regions. Higher expression of scriptomic profiles refined through histopathology im- OLIG2 modules distinguishes infiltrating tumor from cel- aging and annotated with imaging features that lular tumor samples. characterize anatomic structures for 122 samples and Features representative of cancer stem cells show that cancer stem cells for 148 samples were used from the higher expression of STAT3, AHR, and CCR2 modules IvyGAP GBM project. (Fig 3) distinguishes cancer stem-cell samples from their Disease progression in glioma is characterized by in- non–stem cancer cell counterparts. This observation is filtration of resident microglia and peripheral macrophages consistent across the distinct substructures of the cellular in the tumor microenvironment and by pervasive infiltration tumor, including hyperplastic blood vessels, microvascular of tumor cells in the healthy surroundings of the tumor. proliferation, perinecrotic zone, and pseudopalisading cells Understanding microglia and macrophage physiology and around necrosis. Diametrically opposed, higher expression its complex interactions with tumor cells can elucidate their of OLIG2 modules distinguishes non–stem cancer cells roles in glioma progression and uncover potentially in- from cancer stem cells consistently across these micro- teresting druggable targets. vascular and necrosis substructures. Our results show that Imaging-AMARETTO captures these hallmarks of glioma, for example, key drivers of tumor- Imaging-AMARETTO Deciphers Radiography Imaging associated microglia and macrophage mechanisms Biomarkers of Key Driver Mechanisms mediated by STAT3, AHR, and CCR2, and neuro- Radiographic image features of STAT3, AHR, and CCR2 developmental and stemness mechanisms that involve modules (Figs 2 and 6) are highly consistent across GBM 59,60 OLIG2. Our findings recapitulate recent discoveries and and LGG. Higher expression is associated with a higher provide interpretation of the molecular mechanisms in light proportion of enhancing tumor, lower proportion of non- of imaging biomarkers of clinical outcomes. Of note, Im- enhancing tumor, and less cortical involvement. These aging-Community-AMARETTO also uncovers novel key modules also distinguish between measures of thickness of master drivers that are shared by these distinct key enhancing margin in both GBM and LGG. In GBM these mechanisms. STAT3, AHR, and CCR2 modules show higher expression in association with eloquent cortex, while in LGG, they show Imaging-AMARETTO Deciphers Clinical Relevance of higher expression in association with enhancement Multiomics Modules of Key Driver Mechanisms intensity. Of clinical relevance, higher expression levels of STAT3, AHR,and CCR2 modules (Figs 2, 3,and 6)are asso- Features of OLIG2 modules (Figs 4-6) are also consistent ciated with shorter survival in GBM and LGG, and these across GBM and LGG and diametrically opposed to those of modules also distinguish between molecular subclasses STAT3, AHR, and CCR2. In both GBM and LGG, higher of GBM and LGG. In GBM, the mesenchymal subclass is expression is associated with a higher proportion of non- represented by higher expression of these modules enhancing tumor and lower proportion of enhancing tumor. compared with the classical and proneural subclasses. In GBM, higher expression is also associated within speech In LGG, the astrocytoma subtype is characterized by receptive eloquent cortex, while in LGG, higher expression higher expression of these modules compared with the is associated with cortical involvement and the presence oligodendroglioma subtype. of cysts. JCO Clinical Cancer Informatics 425 Gevaert et al Driver Genes Expression Target Genes Phenotype Methylation Expression Associations STAT3 STAT3 STAT3 AHR Normalized Methylation state gene expression CNV or methylation alterations Predefined driver list Driver genes weights Phenotype associations Hypermethylated Methylation aberrations COSMIC Cancer Gene Clinical and molecular phenotypes Census Hypomethylated Not altered Imaging phenotypes −4 −2 0 2 4 −0.2 −0.1 0 0.1 0.2 Not altered Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q subtype No No IDHmut-codel Astrocytoma Oligoastrocytoma Yes Yes IDHmut-non-codel IDHwt Oligodendroglioma f4 f5 f6 f11 f20 No Yes 01234 02468 02468 1234 FIG 2. Imaging-AMARETTO predicts STAT3 and AHR as known drivers of tumor-associated microglia and macrophage mechanisms in low-grade glioma (LGG). These heat maps present module 125 from The Cancer Genome Atlas (TCGA) LGG cohort that is a member of community 5 shared across the three cohorts (TCGA glioblastoma multiforme [GBM], Ivy Glioblastoma Atlas Project GBM, and TCGA LGG). For all patient-derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA methylation and RNA gene expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes eight driver genes (FNDC3B, IQGAP1, ANO6, ELK3, STAT3, TMOD3, CASP8, and ITPRIPL2) that jointly act as activator drivers of the 114 target genes in this module, including AHR and STAT3. Six driver genes are methylation driven (hypomethylation of ANO6, CASP8, ITPRIPL2, STAT3, and TMOD3 and hypermethylation of ANO6 and ELK3, inversely associated with their gene expression levels). Survival analysis reveals that increased expression of the genes in this module is associated with shorter survival (P = 7.3e-11; false discovery rate [FDR] = 1.0e-9). This module distinguishes between the histological subtypes (P = 1.6e-15; FDR = 2.1e-14), with lower expression representing the oligodendroglioma subtype (P = 1.0e-14; FDR = 1.1e-13) and higher expression reflecting the astrocytoma subtype (P = 1.5e-11; FDR = 2.8e-10). IDH mutation (mut) status and 1p19q subtypes are associated with the module expression (P = 3.9e-22; FDR = 1.5e-21), with higher expression levels representing the wild-type (wt) status. Association of Visually AcceSAble Rembrandt Images magnetic resonance imaging features with this module shows that the proportion of enhancing tumor (f5; P = 8.96e-7; FDR = 0.0000134) and enhancement intensity (f4; P = .0000395; FDR = 0.000494) are correlated with gene expression, while the proportions of nonenhancing tumor (f6; P = .0000452; FDR = 0.00036) and cortical involvement (f20; P = .0175; FDR = 0.118) are inversely correlated with expression. Module gene expression levels also distinguish between the thicknesses of enhancing margin (f11; P = .00137; FDR = 0.02). CNV, copy number variation. 426 © 2020 by American Society of Clinical Oncology ANO6 CASP8 ELK3 ITPRIPL2 STAT3 TMOD3 ELK3 TMOD3 IQGAP1 FNDC3B CASP8 ITPRIPL2 STAT3 ANO6 ERAP1 GALNT10 SLC24A1 DDR2 WWTR1 HELB SFT2D2 ATL3 PTPN13 LYST IGFBP5 ITGA11 SLC40A1 LAX1 TWSG1 CD2AP ZNF217 S1PR3 MAP7D3 ABCA13 MATN2 CRISPLD1 TNFAIP3 TNFRSF10A CFLAR LRCH1 CHSY1 RAG1 HEG1 ITGAV FEM1C TNFRSF19 CUBN RNFT1 BTN2A3 TEP1 AFF1 NFKB1 FZD5 RIPK1 FAM46A PPP1R3B ZFP36L2 MAGT1 TMED10 RB1 CDH11 MOBKL1B SOAT1 FNDC3B SPPL2A SLC35F5 FLNA VCL TRAM2 IQGAP1 FYCO1 REST ELK3 TMOD3 PCDH18 ANTXR2 PALLD CALD1 ERI1 TRAM1 CMTM6 PI4K2B TTC26 CALU PLCE1 IQGAP2 STAT3 ANO6 ADAM9 CLIC4 FRRS1 CREB3L2 TLN1 GNS PGM2 ROD1 ZNF468 CRLF3 TTF2 DNAJC10 DCBLD2 AHR CD274 ITGB3 MYOF RAB27A OSMR RBMS1 SEC24D FAM114A1 LTBP2 BTBD19 COL27A1 SEPN1 GALNT2 LOC100129034 TNC SH3PXD2B PHEX LOC653653 CROT PLEKHA9 PLA2G4A LAMA4 SIPA1L2 FRMD3 NRP2 C4orf36 Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q subtype f4 f5 f6 f11 f20 Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Driver Genes Target Genes Phenotype Expression Expression Associations THBS1 CCR2 AHR THBS1 CCR2 Normalized gene expression CNV or methylation alterations Predefined driver list Driver genes weights Phenotype associations Not altered COSMIC Cancer Gene Census Clinical and molecular phenotypes Imaging phenotypes −10 −5 0 5 10 −0.2 −0.1 0 0.1 0.2 Mesenchymal subclass Classical subclass Proneural subclass No No No Yes Yes Yes Anatomic structures CT IT LE region CTmvp CTpan CT No No No No No IT Yes Yes Yes Yes Yes LE Cancer stem cells CT stem cells CThbv stem cells CTmvp stem cells CT controls No No No CT reference genes Yes Yes Yes CThbv reference genes CTmvp reference genes CTpan stem cells CTpnz stem cells No No Yes Yes FIG 3. Imaging-AMARETTO predicts AHR and CCR2 as known drivers and identifies TBHS1 as a novel driver of tumor-associated microglia and macrophage mechanisms in glioblastoma multiforme (GBM). These heat maps present module 64 from the Ivy Glioblastoma Atlas Project (IvyGAP) GBM cohort that is a member of community 5 shared across the three cohorts (The Cancer Genome Atlas [TCGA] GBM, IvyGAP GBM, and TCGA low-grade glioma). For all patient- derived samples (rows), the heat maps show driver genes’ functional genomics profiles (columns), specifically RNA gene expression profiles (left panel); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes nine driver genes (THBS1, CLEC2B, TNFAIP3, DSE, RNF149, MGP, CCR2, CSPG5,and CKB) that jointly act as activators (eg, CCR2, a squamous cell carcinoma tumor-rejection antigen recognized by T lymphocytes and candidate for specific immunotherapy) and repressors (CSPG5 and CKB) that drive the 87 target genes in this module, including AHR, CCR2,and THBS1. Association analyses confirm that higher expression of module genes reflects samples derived from patients in the mesenchymal subclass (P = .000047; false discovery rate [FDR] = 0.00015), while lower expression represents the classical subclass (P = .032; FDR = 0.067). Association of histopathology imaging features that study anatomic structures reveals that genes in this module distinguish between the samples derived from distinct anatomic structures (P = 1.8e-15; FDR = 4.3e-15), (continued on following page) JCO Clinical Cancer Informatics 427 CKB CSPG5 RNF149 CLEC2B TNFAIP3 DSE THBS1 MGP CCR2 RNF217 APLF TNIP3 PI15 C9orf3 LMCD1 MTA3 TNFRSF10A STX3 BICC1 ITGBL1 ZNF185 MICALCL CXCL1 MME FBXO22−AS1 BRE−AS1 SH2D2A SLC39A8 IL1R1 TGM2 BDKRB2 THBD CCL8 CYP1B1 CXCL6 COL13A1 HMGA2 AHR CLIC6 GEM CLEC2B THBS1 SEC24D ADAMTS1 PRSS23 PTBP3 CD2AP SLFN5 CD1D AMICA1 CCR2 ACP5 TM4SF1 SLFN11 FAM176A NABP1 RAB38 RUNX2 FCER1A TACSTD2 CCL3 C10orf125 CD52 EDEM1 P4HA3 CCL26 FCGR3B FHL2 SUSD1 SLC4A7 CPD DCBLD2 KCTD9 GCLM MAF MIR22HG REXO2 SRGN CREG1 RNF149 MRC1 IL7R SUMF1 IL13RA2 SEL1L3 C13orf33 SLC20A1 RARRES1 IGJ C11orf70 ITK CD200R1 INHBA C3orf52 IL18R1 LTB Mesenchymal subclass Classical subtype Proneural subclass Anatomic structures CT region IT region LE region CTmvp CTpan Cancer stem cells CT stem cells CThbv stem cells CTmvp stem cells CTpan stem cells CTpnz stem cells Gevaert et al Imaging-Community-AMARETTO Uncovers Known Key testing novel hypotheses of THBS1 and MAP2 as master and Novel Master Drivers Linking Mechanisms regulators of shared mechanisms that involve macrophage infiltration, vascularization, tumorigenesis, invasion, stem- Recent discoveries of STAT3, AHR,and CCR2 as drivers ness, and neurogenesis in glioma. of tumor-associated microglia and macrophage mechanisms are captured by modules in communities 1 DISCUSSION and 5: TCGA LGG module 125 (Fig 2) shows hypo- We developed the Imaging-AMARETTO algorithms and methylation of STAT3 as activator driver of AHR,with software tools for imaging genomics to facilitate systematic higher expression associated with shorter survival and interrogation of regulatory networks derived from multio- astrocytoma LGG, and IvyGAP GBM module 64 (Fig 3) mics data within and across related patient studies for their shows that higher expression of AHR and CCR2 is as- relevance to radiography and histopathology imaging fea- sociated with the presence of cancer stem cells and tures that predict clinical outcomes. We demonstrated its microvascular substructures and suggests as novel acti- utility through application to three patient studies of brain vator driver, THBS1, that plays important roles in mac- tumors, including multiomics and radiography imaging 61 62 rophage infiltration and angiogenesis, vascularization, data from the TCGA GBM and LGG studies and tran- and tumorigenesis in glioma. OLIG2 as a driver of scriptional and histopathology imaging data from the Ivy- 60,63,64 neurodevelopmental and stemness mechanisms is GAP GBM study. captured by modules in community 2: (1) TCGA LGG Our results show that Imaging-AMARETTO recapitulates module 91 (Fig 4) shows hypomethylation of OLIG2 as known key drivers of tumor-associated microglia and activator driver of this module, with higher expression macrophage mechanisms (STAT3, AHR, and CCR2) and associated with better survival, oligodendrocyte LGG, neurodevelopmental and stemness mechanisms (OLIG2). and IDH1 wild-type status; (2) TCGA GBM module 75 Imaging-AMARETTO provides interpretation of the un- (Fig 5)isdrivenby OLIG2, with higher expression asso- derlying molecular mechanisms in light of imaging bio- ciated with proneural and classical versus mesenchymal markers of clinical outcomes, and Imaging-Community- GBM, and suggests as novel repressor driver THBS1 and AMARETTO also uncovered novel master drivers THBS1 as novel activator driver hypomethylation of neuronal and MAP2 that establish relationships across these distinct marker MAP2 that plays important roles in microtubule- 65,66 67 mechanisms. associated neurogenesis and reduces invasiveness and stemness in glioma; and (3) TCGA GBM module 98 Of note, the querying of the Imaging-AMARETTO networks (Fig 6)shows CCR2 and OLIG2 co-acting as activator and for modules whose elevated expression is inversely asso- repressor drivers, respectively, highlighting their di- ciated with proportions of enhancing tumor and cancer ametrically opposed behavior, with higher CCR2 and stem cells on radiography and histopathology imaging, lower OLIG2 expression associated with mesenchymal respectively, shows that these modules are putatively versus proneural and classical GBM and suggesting as coregulated by activator drivers OLIG2 and MAP2 and novel repressor driver hypomethylation of MAP2,con- repressor drivers STAT3, AHR, CCR2, and THBS1. Thus, sistent with observations in TCGA GBM module 75 (Fig 5). we hypothesize that restoration of the function of OLIG2 and Using knockdown experiments of THBS1 from LINCS/ MAP2 and attenuation of the expression of STAT3, AHR, CMAP, we confirmed that THBS1 acts as activator and CCR2, and THBS1 potentially shift their target genes’ ex- repressor of its targets in IvyGAP GBM module 64 pression to more benign functional states associated with (Fig 3) and TCGA GBM module 75 (Fig 5), respectively. better survival in GBM and LGG. Thus, Imaging-Community-AMARETTO (Appendix Fig A1) This case study illustrates how Imaging-AMARETTO can be identified THBS1 and MAP2 as novel master drivers across used for imaging diagnostics and prognostics by in- the three STAT3, AHR, CCR2, and OLIG2 communities that terrogating multimodal and multiscale networks for imaging provide new insights into how these distinct key mecha- biomarkers to identify their clinically relevant underlying nisms are linked in glioma. Interesting avenues for further molecular mechanisms. Our network-based imaging ge- exploration with experimental validation studies include nomics tools are powerful hypothesis generators that FIG 3. (Continued). where cellular tumor (CT; P = .0000078; FDR = 0.000013) samples and, in particular, with substructure microvascular proliferation (CTmvp; P = 8.7e-14; FDR = 6.2e-13) and pseudopalisading cells around necrosis (CTpan; P = .018; FDR = 0.027) show elevated expression of module genes, while leading edge (LE; P = .018; FDR = 0.023) and infiltrating tumor (IT; P = .0014; FDR = 0.0045) samples show lower expression. Association of histopathology imaging features targeting cancer stem cells reveals that samples derived from cancer stem cells are generally associated with higher module gene expression compared with nonstem cells (P = 4.0e-16; FDR = 8.6e-15) and, specifically, elevated expression in stem-cell v control samples from substructures of the CT (P = .0000059; FDR = 0.00018), including hyperplastic blood vessels (CThbv; P = 1.2e-11; FDR = 5.2e-10), perinecrotic zone (CTpnz; P = 3.4e-9; FDR = 3.4e-8), CTpan (P = .000087; FDR = 0.00032), and CTmvp (P = .0039; FDR = 0.029). CNV, copy number variation. 428 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Driver Genes Phenotype Target Genes Expression Associations Expression Methylation OLIG2 OLIG2 OLIG2 Normalized Methylation state gene expression CNV or methylation drivers Predefined driver list Driver genes weights Phenotype associations Hypermethylated Methylation aberrations Driver not predefined Clinical and molecular phenotypes Hypomethylated Not altered COSMIC Cancer Gene Census Imaging phenotypes −4 −2 0 2 4 −0.4 −0.2 0 0.2 0.4 Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q subtype Astrocytoma No No IDHmut-codel Oligoastrocytoma Yes Yes IDHmut-non-codel Oligodendroglioma IDHwt f5 f6 f8 f20 No No Yes Yes 02468 02468 FIG 4. Imaging-AMARETTO predicts OLIG2 as known driver of neurodevelopmental and stemness mechanisms in low-grade glioma (LGG). These heat maps present module 91 from The Cancer Genome Atlas (TCGA) LGG cohort that is a member of community 2 shared across the three cohorts (TCGA glioblastoma multiforme [GBM], Ivy Glioblastoma Atlas Project GBM, and TCGA LGG). For all patient-derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA methylation and RNA gene expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes nine driver genes (SOX8, OLIG2, EBF4, NLGN2, SHD, RANBP17, FERMT1, LOC254559, and C11orf63) that jointly act as activator and repressor (C11orf63) drivers of the 94 target genes in this module, including OLIG2. Six driver genes are methylation driven (hypomethylation of FERMT1 and OLIG2 and hypermethylation of C11orf63, NLGN2, RANBP17, and SOX8, inversely associated with their gene expression levels). Survival analysis reveals that increased expression of the genes in this module is associated with better survival (P = 7.1e-11; false discovery rate [FDR] = 1.0e-9). This module distinguishes between the histological subtypes (P = .00047; FDR = 0.00084), with lower expression representing the astrocytoma subtype (P = .00016; FDR = 0.00039) and higher expression reflecting the oligodendroglioma subtype (P = .0027; FDR = 0.0047). IDH mutation (mut) status and 1p19q subtypes are associated with the module expression (P = 2.0e-26; FDR = 1.1e-25), with lower expression levels representing the wild-type (wt) status. Association of Visually AcceSAble Rembrandt Images magnetic resonance imaging features with this module shows that the proportion of enhancing tumor (f5; P = .016; FDR = 0.032) is inversely correlated with gene expression, while the proportion of nonenhancing tumor (f6; P = .032; FDR = 0.066), cortical involvement (f20; P = .018; FDR = 0.12), and the presence of cysts (f8; P = .0095; FDR = 0.14) are correlated with expression. CNV, copy number variation. JCO Clinical Cancer Informatics 429 C11orf63 FERMT1 NLGN2 OLIG2 RANBP17 SOX8 C11orf63 FERMT1 LOC254559 OLIG2 SHD SOX8 NLGN2 EBF4 RANBP17 POM121L10P NXN FAM22D FAM22A TMEM121 ACAP3 ANKRD13B NLGN2 EBF4 GPSM1 TNK2 TBKBP1 LOC283174 NPPA DGCR2 PTCHD2 HES6 BCAN DLL1 DLL3 UPK2 VPS37D FAM110B RCOR2 SHD SOX8 CSNK1E PHF21B KDM4B C17orf69 PLK1S1 SNHG1 MGC21881 KIF26A TFAP4 ZNF34 BOP1 FRAT1 FRAT2 FXYD2 SPATA9 SYNPO2L DAPL1 LPPR1 MYT1 ACCN4 ZCCHC24 RTKN OLIG1 OLIG2 NEU4 BCAR1 AMOTL2 C12orf34 RBP3 ZC4H2 H2AFY2 HMX1 C2orf27A ASB13 EFS PCBP4 H1F0 LOC84989 HIST3H2A PCGF6 C6orf134 VAX2 POLR2F MARCKS SULF2 KLRK1 KLRC3 KLRC2 PAX1 SOX3 MDFI MARCKSL1 KIAA0114 FLJ10038 ZSCAN2 C17orf100 C22orf27 PHLPP1 AKAP7 EHD1 TLE6 CORO7 AG2 FAM110A LOC100132288 TRIM62 REC8 MXD4 Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q. subtype f5 f6 f8 f20 Gevaert et al Driver Genes Expression Target Genes Phenotype Expression Associations MAP2 OLIG2 MAP2 THBS1 Normalized CNV state Methylation state gene expression CNV or methylation drivers Predefined driver list Driver genes weights Phenotype associations Deleted Hypomethylated Copy number alterations COSMIC Cancer Gene Census Clinical and molecular phenotypes Amplified Not altered Methylation aberrations Imaging phenotypes (VASARI) −10 −5 0 5 10 −0.4 −0.2 0 0.2 0.4 Not altered Not altered Molecular subclasses Mesenchymal Classical Proneural Classical No No No G-CIMP Yes Yes Yes Mesenchymal Neural Proneural f3 f6 f6 (< 5%) f6 (< 33%) No No No Yes 1234 5 Yes Yes FIG 5. Imaging-AMARETTO predicts OLIG2 as known driver and identifies MAP2 and THBS1 as novel drivers of neurodevelopmental and stemness mechanisms in glioblastoma multiforme (GBM). These heat maps present module 75 from The Cancer Genome Atlas (TCGA) GBM cohort that is a member of community 2 shared across the three cohorts (TCGA GBM, Ivy Glioblastoma Atlas Project GBM, and TCGA low-grade glioma). For all patient- derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA copy number variation (CNV), DNA methylation, and RNA genes’ expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes 10 driver genes (CSPG5, MAP2, OLIG2, CKB, GCSH, PPP2R2B, HEPACAM, CDH10, CTNND2, and THBS1) that jointly act as activator and repressor (THBS1) drivers of the 84 target genes in this module. One driver gene, GCSH, is copy number driven (somatic recurrent copy number deletions and amplifications associated with its gene expression), and three driver genes are methylation driven (hypomethylation of CDH10, MAP2, and PPP2R2B, inversely associated with their gene expression). Higher expression of genes in this module comprises the classical (P = 9.9e-13; false discovery rate [FDR] = 5.1e-12) and proneural (P = .0000024; FDR = 0.0000056) molecular subclasses, while lower expression of genes in this module represents the mesenchymal molecular subclass (P = 6.1e-40; FDR = 2.3e-38). Association of Visually AcceSAble Rembrandt Images (VASARI) magnetic resonance imaging features with this module shows that the proportion of nonenhancing tumor (f6; P = .013; FDR = 0.060, with nonenhancing proportion , 33%, P = .0014; FDR = 0.027, and nonenhancing proportion , 5%, P = .032; FDR = 0.17) and speech receptive eloquent cortex (f3; P = .046; FDR = 0.75) are correlated with the module gene expression. facilitate the testing of known hypotheses and uncovering of to better diagnostic and prognostic models of cancer and novel hypotheses for follow-up with experimental validation will open new avenues for drug discovery by integrating studies. We anticipate that our tools for network-based pharmacogenomic data into these networks, toward better 69,70 fusion of multiomics, clinical, and imaging data will lead therapeutics of cancer. 430 © 2020 by American Society of Clinical Oncology GCSH CNV CDH10 Methylation MAP2 PPP2R2B CTNND2 HEPACAM CKB PPP2R2B GCSH CDH10 OLIG2 MAP2 CSPG5 THBS1 MGC35440 CDKN2AIPNL C1orf96 HMGCR ARL6IP1 FIBIN NCALD C20orf58 CNTFR NCAN MMD2 LOC388419 NLGN3 CSPG5 SEZ6L ASCL1 ABAT FAM77D APCDD1 LDHB ACOX3 HAPLN1 C9orf58 KCNJ16 LRRC3B C8orf46 LOC389073 RLBP1 CIDEC PPP2R2B TMEM58 BCAN TTYH1 C1orf61 TUBB2B GPM6A METRN BMP7 ATP6V0E2 LRRC4B SYT17 BAI2 CDK2AP1 DBI KLHL25 FGFBP3 C18orf51 SLC24A3 CLEC4F LMO1 APBB2 LRRN1 CLCN2 ASTN1 FAM5B TRIM9 PCDH10 CPNE5 C1orf21 TSPAN12 SEZ6 MPPED2 RIC3 KCNA2 BAI3 C17orf76 ROBO2 CA14 CDH1 COL9A2 HAPLN4 CCDC12 ACAD8 MANEAL SOX2 DCLK2 C3orf39 ZNF501 BHLHB9 MOSC1 LRRN2 MAGED4B LOC402573 PKD1L1 Molecular subclasses Mesenchymal Classical Proneural f3 f6 f6 ( < 5 % ) f6 ( < 30 % ) Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Driver Genes Expression Target Genes Phenotype Expression Associations MAP2 CCR2 OLIG2 MAP2 Normalized Methylation state gene expression CNV or methylation drivers Predefined driver list Driver genes weights Phenotype associations Hypermethylated Methylation aberrations COSMIC Cancer Gene Census Clinical and molecular phenotypes Hypomethylated Not altered Imaging phenotypes −4 −2 0 2 4 −0.04 −0.02 0 0.02 0.04 Not altered Molecular subclasses Mesenchymal Classical Proneural IDH1 status Classical No No No Wild type G-CIMP Yes Yes Yes R132C Mesenchymal R132G Neural R132H Proneural f5 f6 (< 5%) f6 (< 33%) No 1234 12345 Yes FIG 6. Imaging-AMARETTO predicts CCR2 and OLIG2 as co-acting known activator and repressor drivers and MAP2 as novel repressor driver linking tumor-associated microglia and macrophage mechanisms with neurodevelopmental and stemness mechanisms in glioblastoma multiforme (GBM). These heat maps present module 98 from The Cancer Genome Atlas (TCGA) GBM cohort that is a member of community 2 shared across the three cohorts (TCGA GBM, Ivy Glioblastoma Atlas Project GBM, and TCGA low-grade glioma). For all patient-derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA methylation and RNA gene expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes 10 driver genes (TEC, SPINT1, CCR2, MAN1A1, OLIG2, NPAS3, MAP2, RUFY3, NTRK3, and CSPG5) that jointly act as activator (TEC, SPINT1, CCR2, MAN1A1) and repressor (OLIG2, NPAS3, MAP2, RUFY3, NTRK3, CSPG5) drivers of the 98 target genes in this module. Two driver genes, MAP2 and RUFY3, are methylation driven (hypomethylation of MAP2 and hyper- and hypomethylation of RUFY3, inversely associated with their gene expression levels). Higher expression of the genes in this module represents the mesenchymal molecular subclass (P = 5.1e-35; false discovery rate [FDR] = 5.9e-34), while lower expression of genes in this module comprises the classical (P = 6.1e-15; FDR = 3.8e-14), G-CIMP (P = .000012; FDR = 0.000030), and proneural (P = .00034; FDR = 0.00062) molecular subclasses. IDH1 mutation status is associated with the module expression (P = .00069; FDR = 0.0019), with higher expression levels representing the wild-type status. Association of Visually AcceSAble Rembrandt Images magnetic resonance imaging features with this module shows that the proportion of nonenhancing tumor is inversely correlated (f6; P = .045; FDR = 0.13) with nonenhancing proportion , 33% (P = .0086; FDR = 0.067) and the proportion of enhancing tumor is correlated (f5; P = .020; FDR = 0.28) with the module gene expression. JCO Clinical Cancer Informatics 431 Methylation MAP2 RUFY3 CCR2 MAN1A1 SPINT1 TEC RUFY3 OLIG2 MAP2 CSPG5 NTRK3 NPAS3 NLRP13 WDR4 EIF2S3 GSR TACSTD2 TM4SF19 TRPV4 ATP8B1 EDARADD INDOL1 SULT1E1 PRRX2 CRABP2 ASAM ZNF185 TPBG CAPN6 CYP26A1 GNAT2 WT1 ENOSF1 VANGL1 RHOD STAP2 C4orf7 CSF2 BDKRB1 MOCOS TNFRSF9 C3orf52 LGALS12 PAX8 ROR2 PTGER2 MME NTRK1 TNIP3 CHMP4C STEAP4 PSTPIP2 MCTP2 TEC GIPC2 TNFRSF10A ARHGEF5 TFAP2C MTL5 IL11 ADAMTSL1 AKR1B10 TMEM166 HGF CD70 RAB38 MGC16075 KCTD9 BFSP1 C6orf150 ELF4 MBOAT1 GCHFR TIFA H2AFJ SLC38A2 SLC38A4 TCN1 MBNL3 FRRS1 GPR1 FRK PTGFR ENPP1 SULT1C2 LONRF3 HIST1H2BK HIST2H2AA3 DNAJC1 ITLN2 WBSCR28 LAMA3 RNF168 C1orf211 DEFB1 TUFT1 DMRTA1 CPA6 XKRX EPHA3 RP6−213H19.1 FLJ21986 ASB9 PMAIP1 HOXB5 HOXB6 MGC45800 DKKL1 OSR2 WNT5B Molecular subclasses Mesenchymal Classical Proneural IDH1 status f5 f6 ( < 5 % ) f6 ( < 33 % ) Gevaert et al Financial support: Olivier Gevaert, Anna M. Krichevsky, Jill P. Mesirov, AFFILIATIONS Vincent Carey, Nathalie Pochet Stanford Center for Biomedical Informatics Research, Department of Collection and assembly of data: Olivier Gevaert, Mohsen Nabian, Artur Medicine and Biomedical Data Science, Stanford University, Stanford, Manukyan, Jishu Xu, Nathalie Pochet CA Data analysis and interpretation: Olivier Gevaert, Mohsen Nabian, Celine Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, Everaert, Jayendra Shinde, Artur Manukyan, Ted Liefeld, Thorin Tabor, MA Joachim Lupberger, Brian J. Haas, Thomas F. Baumert, Mikel Hernaez, Ann Romney Center for Neurologic Diseases, Department of Neurology, Michael Reich, Francisco J. Quintana, Anna M. Krichevsky, Vincent Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 4 Carey, Nathalie Pochet Department of Medicine, University of California, San Diego, San Diego, Manuscript writing: All authors CA 5 Final approval of manuscript: All authors Rush University Medical Center, Chicago, IL 6 Accountable for all aspects of the work: All authors INSERM, U1110, Institut de Recherche sur les Maladies Virales et ´ ´ Hepatiques, Universite de Strasbourg, Institut Hopitalo-Universitaire, Hopitaux Universitaires de Strasbourg, Strasbourg, France AUTHORS’ DISCLOSURES OF POTENTIAL CONFLICTS OF Carl R. Woese Institute for Genomic Biology, University of Illinois at INTEREST Urbana-Champaign, Champaign, IL The following represents disclosure information provided by authors of Channing Division of Network Medicine, Brigham and Women’s this manuscript. All relationships are considered compensated unless Hospital, Harvard Medical School, Boston, MA otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the CORRESPONDING AUTHOR subject matter of this manuscript. For more information about ASCO’s Nathalie Pochet, PhD, Brigham and Women’s Hospital and Harvard conflict of interest policy, please refer to www.asco.org/rwc or ascopubs. Medical School, 60 Fenwood Rd, Boston, MA 02115, Broad Institute of org/cci/author-center. MIT and Harvard, 415 Main Street, Cambridge, MA 02142; e-mail: Open Payments is a public database containing information reported by npochet@broadinstitute.org. companies about payments made to US-licensed physicians (Open Payments). EQUAL CONTRIBUTION Olivier Gevaert O.G. and M.N. are equally contributing authors. Research Funding: Paragon Development Systems (Inst), Lucence J.P.M., V.C., and N.P. are equally contributing authors. Diagnostics (Inst) Brian J. Haas SUPPORT Consulting or Advisory Role: Immuneering, Diamond Age Data Science Supported by the National Cancer Institute (NCI) Informatics Technology Thomas F. Baumert for Cancer Research (R21CA209940 [O.G., T.F.B., J.P.M., N.P.], Stock and Other Ownership Interests: Alentis Therapeutics, Alentis U01CA214846 [V.C.], U01CA214846 Collaborative Set-aside [O.G., Therapeutics (Inst) A.M.K., V.C., N.P.], U24CA194107 [J.P.M.], U24CA220341 [J.P.M.], Research Funding: Janssen Research & Development (Inst), Alentis U24CA180922 [B.J.H., N.P., A. Regev]), NCI (R01CA215072 [A.M.K.], Therapeutics (Inst) U01CA217851 [O.G.], U01CA199241 [O.G.], Stanford CTD [O.G.]), Patents, Royalties, Other Intellectual Property: Patent applications on National Institute of Allergy and Infectious Diseases (R03AI131066 Claudin-1 targeting monoclonal antibodies for liver disease, fibrosis, and [T.F.B., N.P.]), and National Institute of Biomedical Imaging and cancer (Inst); patent applications on liver disease drug discovery systems Bioengineering (R01EB020527 [O.G.]). The content is solely the (Inst) responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Vincent Carey Employment: CleanSlate (I) SOFTWARE AND RESOURCES Honoraria: Gilead Sciences (I) The source code of Imaging-AMARETTO is available from GitHub, and Research Funding: Bayer AG a Notebook for its application to the three brain cancer cohorts is No other potential conflicts of interest were reported. available from Google Colaboratory, including links to interactive HTML reports and NDEx networks: http://portals.broadinstitute.org/pochetlab/ JCO_CCI_Imaging-AMARETTO/Imaging-AMARETTO_Software_ ACKNOWLEDGMENT Resources.html. We thank Aviv Regev, PhD, for helpful discussions on developing network biology and medicine approaches for studying complex human diseases. We also thank Howard Weiner, MD, and Vijay Kuchroo, DVM, PhD, for AUTHOR CONTRIBUTIONS helpful insights into deciphering the mechanistic basis of gliomas. Conception and design: Olivier Gevaert, Mohsen Nabian, Shaimaa Bakr, Artur Manukyan, Erik J. Uhlmann, Jill P. Mesirov, Vincent Carey, Nathalie Pochet REFERENCES 1. National Cancer Institute: The Cancer Genome Atlas Program, 2018. https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga 2. National Cancer Institute Office of Cancer Clinical Proteomics Research: OCCPR: A leader in cancer proteomics and proteogenomics, 2019. https://proteomics. cancer.gov 3. The Cancer Imaging Archive: Welcome to The Cancer Imaging Archive, 2019. https://www.cancerimagingarchive.net 432 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer 4. Cancer Genome Atlas Research Network: Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455:1061- 1068, 2008 [Erratum: Nature 494:506, 2013] 5. Brennan CW, Verhaak RGW, McKenna A, et al: The somatic genomic landscape of glioblastoma. Cell 155:462-477, 2013 [Erratum: Cell 157:753, 2014] 6. Brat DJ, Verhaak RGW, Aldape KD, et al: Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N Engl J Med 372:2481-2498, 2015 7. Bakas S, Akbari H, Sotiras A, et al: Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci Data 4:170117, 2017 8. Cancer Imaging Archive: VASARI Research Project. https://wiki.cancerimagingarchive.net/display/Public/VASARI+Research+Project 9. Cancer Imaging Archive: TCGA Glioma Phenotype Research Group. https://wiki.cancerimagingarchive.net/display/Public/TCGA+Glioma+Phenotype+Research+Group 10. Patel AP, Tirosh I, Trombetta JJ, et al: Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344:1396-1401, 2014 11. Tirosh I, Venteicher AS, Hebert C, et al: Single-cell RNA-seq supports a developmental hierarchy in human oligodendroglioma. Nature 539:309-313, 2016 12. Filbin MG, Tirosh I, Hovestadt V, et al: Developmental and oncogenic programs in H3K27M gliomas dissected by single-cell RNA-seq. Science 360:331-335, 13. Puchalski RB, Shah N, Miller J, et al: An anatomic transcriptional atlas of human glioblastoma. Science 360:660-663, 2018 14. IvyGAP: Ivy Glioblastoma Atlas Project, 2019. https://glioblastoma.alleninstitute.org 15. IvyGAP: Data download. https://glioblastoma.alleninstitute.org/static/download.html 16. Rutman AM, Kuo MD: Radiogenomics: Creating a link between molecular diagnostics and diagnostic imaging. Eur J Radiol 70:232-241, 2009 17. Segal E, Sirlin CB, Ooi C, et al: Decoding global gene expression programs in liver cancer by noninvasive imaging. Nat Biotechnol 25:675-680, 2007 18. Diehn M, Nardini C, Wang DS, et al: Identification of noninvasive imaging surrogates for brain tumor gene-expression modules. Proc Natl Acad Sci U S A 105:5213-5218, 2008 19. Gevaert O, Xu J, Hoang CD, et al: Non-small cell lung cancer: Identifying prognostic imaging biomarkers by leveraging public gene expression microarray data-- methods and preliminary results. Radiology 264:387-396, 2012 20. Itakura H, Achrol AS, Mitchell LA, et al: Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular pathway activities. Sci Transl Med 7:303ra138, 2015 21. Cheerla A, Gevaert O: Deep learning with multimodal representation for pancancer prognosis prediction. Bioinformatics 35:i446-i454, 2019 22. Momeni A, Thibault M, Gevaert O: Dropout-enabled ensemble learning for multi-scale biomedical data. in Crimi A, Bakas S, Kuijf H, et al (eds): Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. Cham, Switzerland, Springer International Publishing, 2019, pp 407-415 23. Broad Institute: The *AMARETTO framework for network biology and medicine: Linking disease, drivers, targets and drugs via graph-based fusion of multi- omics, clinical, imaging, and perturbation data. http://portals.broadinstitute.org/pochetlab/amaretto.html 24. GitHub: broadinstitute/ImagingAMARETTO. https://github.com/broadinstitute/ImagingAMARETTO 25. Google Colaboratory: Imaging-AMARETTO: An imaging genomics software tool to systematically interrogate multi-omics networks for relevance to radiography and histopathology imaging biomarkers of clinical outcomes with application to studies of brain tumors. https://colab.research.google.com/drive/14u1KZJ3Gf- 9qjDycyBKzBiN5VzzOa2xU#scrollTo=LujO14znmO0J 26. Stubbs BJ, Gopaulakrishnan S, Glass K, et al: TFutils: Data structures for transcription factor bioinformatics. F1000Res 8:152, 2019 27. Carey V, Gopaulakrishnan S: TFutils: Bioconductor version release (3.10). https://bioconductor.org/packages/TFutils 28. Subramanian A, Tamayo P, Mootha VK, et al: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102:15545-15550, 2005 29. Liberzon A, Birger C, Thorvaldsdottir H, et al: The Molecular Signatures Database (MSigDB) Hallmark gene set collection. Cell Syst 1:417-425, 2015 30. COSMIC: COSMIC, the Catalogue of Somatic Mutations in Cancer. https://cancer.sanger.ac.uk/cosmic 31. Mermel CH, Schumacher SE, Hill B, et al: GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol 12:R41, 2011 32. Cedoz PL, Prunello M, Brennan K, et al: MethylMix 2.0: An R package for identifying DNA methylation genes. Bioinformatics 34:3044-3046, 2018 33. Gevaert O, Tibshirani R, Plevritis SK: Pancancer analysis of DNA methylation-driven genes using MethylMix. Genome Biol 16:17, 2015 34. Champion M, Brennan K, Croonenborghs T, et al: Module analysis captures pancancer genetically and epigenetically deregulated cancer driver genes for smoking and antiviral response. EBioMedicine 27:156-166, 2018 35. Shinde J, Everaert C, Bakr S, et al: AMARETTO: Regulatory network inference and driver gene evaluation using integrative multi-omics analysis and penalized regression: Bioconductor version release 3.10, 2019. https://bioconductor.org/packages/AMARETTO 36. Gevaert O, Villalobos V, Sikic BI, et al: Identification of ovarian cancer driver genes by using module network integration of multi-omics data. Interface Focus 3: 20130013, 2013 [Erratum Interface Focus 4:20140023, 2014] 37. Gevaert O, Plevritis S: Identifying master regulators of cancer and their downstream targets by integrating genomic and epigenomic features. Pac Symp Biocomput 123-134, 2013 38. Zou H, Hastie T: Regularization and variable selection via the elastic net. J R Stat Soc Series B Stat Methodol 67:301-320, 2005 39. Newman AM, Liu CL, Green MR, et al: Robust enumeration of cell subsets from tissue expression profiles. Nat Methods 12:453-457, 2015 40. Stanford University: CIBERSORT. https://cibersort.stanford.edu 41. Ben-Porath I, Thomson MW, Carey VJ, et al: An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet 40:499-507, 2008 42. Marson A, Levine SS, Cole MF, et al: Connecting microRNA genes to the core transcriptional regulatory circuitry of embryonic stem cells. Cell 134:521-533, 43. Kim J, Woo AJ, Chu J, et al: A Myc network accounts for similarities between embryonic stem and cancer cell transcription programs. Cell 143:313-324, 2010 44. Newman MEJ, Girvan M: Finding and evaluating community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys 69:026113, 2004 45. Baryawno N, Przybylski D, Kowalczyk MS, et al: A cellular taxonomy of the bone marrow stroma in homeostasis and leukemia. Cell 177:1915-1932.e16, 2019 46. ENCODE Project Consortium: The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306:636-640, 2004 47. Lachmann A, Xu H, Krishnan J, et al: ChEA: Transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics 26:2438-2444, 2010 48. Rouillard AD, Gundersen GW, Fernandez NF, et al: The Harmonizome: A collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database (Oxford) 2016:baw100, 2016 JCO Clinical Cancer Informatics 433 Gevaert et al 49. Lamb J, Crawford ED, Peck D, et al: The Connectivity Map: Using gene-expression signatures to connect small molecules, genes, and disease. Science 313:1929-1935, 2006 50. Subramanian A, Narayan R, Corsello SM, et al: A next generation Connectivity Map: L1000 platform and the first 1,000,000 profiles. Cell 171:1437-1452.e17, 51. Verhaak RGW, Hoadley KA, Purdom E, et al: Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17:98-110, 2010 52. Gevaert O, Mitchell LA, Achrol AS, et al: Glioblastoma multiforme: Exploratory radiogenomic analysis by using quantitative image features. Radiology 273:168-174, 2014 53. Liu TT, Achrol AS, Mitchell LA, et al: Magnetic resonance perfusion image features uncover an angiogenic subgroup of glioblastoma patients with poor survival and better response to antiangiogenic treatment. Neuro-oncol 19:997-1007, 2017 54. Nicolasjilwan M, Hu Y, Yan C, et al: Addition of MR imaging features and genetic biomarkers strengthens glioblastoma survival prediction in TCGA patients. J Neuroradiol 42:212-221, 2015 55. NDEx: Welcome to the Network Data Exchange. https://home.ndexbio.org 56. GenePattern: AMARETTO supporting data files. https://datasets.genepattern.org/?prefix=data/module_support_files/Amaretto 57. Carey V: ivygapSE: A SummarizedExperiment for Ivy-GAP data: Bioconductor version release 3.10, 2019. https://bioconductor.org/packages/ivygapSE 58. Sevenich L: Brain-resident microglia and blood-borne macrophages orchestrate central nervous system inflammation in neurodegenerative disorders and brain cancer. Front Immunol 9:697, 2018 59. Takenaka MC, Gabriely G, Rothhammer V, et al: Control of tumor-associated macrophages and T cells in glioblastoma via AHR and CD39. Nat Neurosci 22: 729-740, 2019 [Erratum: Nat Neurosci 22:1533, 2019] 60. Krichevsky AM, Uhlmann EJ: Oligonucleotide therapeutics as a new class of drugs for malignant brain tumors: Targeting mRNAs, regulatory RNAs, mutations, combinations, and beyond. Neurotherapeutics 16:319-347, 2019 61. Offer S, Menard JA, Perez JE, et al: Extracellular lipid loading augments hypoxic paracrine signaling and promotes glioma angiogenesis and macrophage infiltration. J Exp Clin Cancer Res 38:241, 2019 62. Daubon T, Leon C, Clarke K, et al: Deciphering the complex role of thrombospondin-1 in glioblastoma development. Nat Commun 10:1146, 2019 63. Suva` ML, Rheinbay E, Gillespie SM, et al: Reconstructing and reprogramming the tumor-propagating potential of glioblastoma stem-like cells. Cell 157:580-594, 2014 64. Ceccarelli M, Barthel FP, Malta TM, et al: Molecular profiling reveals biologically discrete subsets and pathways of progression in diffuse glioma. Cell 164:550-563, 2016 65. Gao L, Huang S, Zhang H, et al: Suppression of glioblastoma by a drug cocktail reprogramming tumor cells into neuronal like cells. Sci Rep 9:3462, 2019 [Erratum: Sci Rep 10:2971, 2020] 66. Yuan J, Zhang F, Hallahan D, et al: Reprogramming glioblastoma multiforme cells into neurons by protein kinase inhibitors. J Exp Clin Cancer Res 37:181, 2018 67. Zhou Y, Wu S, Liang C, et al: Transcriptional upregulation of microtubule-associated protein 2 is involved in the protein kinase A-induced decrease in the invasiveness of glioma cells. Neuro-oncol 17:1578-1588, 2015 68. Yi R, Feng J, Yang S, et al: miR-484/MAP2/c-Myc-positive regulatory loop in glioma promotes tumor-initiating properties through ERK1/2 signaling. J Mol Histol 49:209-218, 2018 69. Emmert-Streib F, Dehmer M, Haibe-Kains B: Gene regulatory networks and their applications: Understanding biological and medical problems in terms of networks. Front Cell Dev Biol 2:38, 2014 70. Wooden B, Goossens N, Hoshida Y, et al: Using big data to discover diagnostics and therapeutics for gastrointestinal and liver diseases. Gastroenterology 152:53-67.e3, 2017 71. Imaging-AMARETTO HTML report of module 125 from the TCGA LGG cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_LGG/AMARETTOhtmls/modules/module125.html 72. Imaging-AMARETTO HTML report of module 64 from the IvyGAP GBM cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/Ivygap_GBM/AMARETTOhtmls/modules/module64.html 73. Imaging-AMARETTO HTML report of module 91 from the TCGA LGG cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_LGG/AMARETTOhtmls/modules/module91.html 74. Imaging-AMARETTO HTML report of module 75 from the TCGA GBM cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_GBM/AMARETTOhtmls/modules/module75.html 75. Imaging-AMARETTO HTML report of module 98 from the TCGA GBM cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_GBM/AMARETTOhtmls/modules/module98.html nn n 434 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer APPENDIX Communities 1 11 21 31 2 12 22 32 3 13 23 33 4 14 24 34 5 15 25 35 6 16 26 36 7 17 27 37 8 18 28 38 9 19 29 10 20 30 C2 C5 C1 Networks TCGA GBM cohort TCGA LGG cohort IvyGAP GBM cohort Stemness signatures Immune signatures FIG A1. Imaging-Community-AMARETTO identifies known drivers AHR, STAT3, CCR2,and OLIG2 and uncovers novel master drivers THBS1 and MAP2 that link distinct key mechanisms that underlie glioma. In this Imaging-Community- AMARETTO graph, the nodes represent the regulatory modules or circuits that are learned from the three studies of brain cancer (The Cancer Genome Atlas [TCGA] glioblastoma multiforme [GBM], Ivy Glioblastoma Project [IvyGAP] GBM, and TCGA low-grade glioma [LGG]; node sizes are scaled by the number of driver and target genes in the modules), the edges represent the extent of overlapping genes between the modules across the three cohorts (edge thickness is scaled with the significance of the overlapping genes between modules), and the clouds represent how the modules across the three cohorts are grouped into the communities or subnetworks that are learned using the Girvan-Newman edge betweenness community detection algorithm. Imaging-Community-AMARETTO organized modules regulated by known drivers of tumor-associated microglia and macrophage mechanisms STAT3, AHR,and CCR2 and neurodevelopmental and stemness mechanism OLIG2 into three communities. Community 5 (C5) links STAT3, AHR,and CCR2, and C1 links AHR and CCR2 as activators of shared modules. Modules regulated by OLIG2 are represented in C2 with OLIG2 as activator of its modules. Of note, C2 also contains a module that links OLIG2 and CCR2 co-acting as repressor and activator, respectively (Fig 6). Imaging-Community-AMARETTO also uncovered THBS1 and MAP2 as novel master drivers across the three STAT3, AHR, CCR2,and OLIG2 communities that provide new insights into how these distinct key mechanisms are linked in glioma. THBS1 is an activator driver of three modules in C1 and C5, and a repressor driver of three modules in C2. MAP2 is a repressor driver of three modules in C1 and C5 and an activator driver of six modules in C2 (except repressor of TCGA GBM module 98). Taken together, C1 links AHR and THBS1 (TCGA GBM module 79). C2 links OLIG2, MAP2, and THBS1 (TCGA GBM module 75; Fig 5); OLIG2 and MAP2 (IvyGAP GBM module 38, TCGA GBM module 61); CCR2, OLIG2,and MAP2 (TCGA GBM module 98; Fig 6). C5 links CCR2, AHR,and THBS1 (IvyGAP GBM module 64; Fig 3)and AHR and STAT3 (TCGA LGG module 125; Fig 2). Using genetic knockdown experiments of THBS1 from LINCS/Connectivity Map, we confirmed that THBS1 acts as activator and repressor drivers of its targets in IvyGAP GBM module 64 (Fig 3) and TCGA GBM module 75 (Fig 5), respectively (http://portals.broadinstitute.org/ pochetlab/JCO_CCI_Imaging-AMARETTO/Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA- LGG/; www.ndexbio.org/#/network/16820740-d7ea-11e9-bb65-0ac135e8bacf). JCO Clinical Cancer Informatics 435 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png JCO Clinical Cancer Informatics Wolters Kluwer Health

Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for Relevance to Radiography and Histopathology Imaging Biomarkers of Clinical Outcomes

Loading next page...
 
/lp/wolters-kluwer-health/imaging-amaretto-an-imaging-genomics-software-tool-to-interrogate-5RejDU5Nm5

References

References for this paper are not available at this time. We will be adding them shortly, thank you for your patience.

Publisher
Wolters Kluwer Health
Copyright
(C) 2020 American Society of Clinical Oncology
ISSN
2473-4276
DOI
10.1200/CCI.19.00125
Publisher site
See Article on Publisher Site

Abstract

original reports abstract SPECIAL SERIES: INFORMATICS TOOLS FOR CANCER RESEARCH AND CARE Imaging-AMARETTO: An Imaging Genomics Software Tool to Interrogate Multiomics Networks for Relevance to Radiography and Histopathology Imaging Biomarkers of Clinical Outcomes 1,2 2,3 1 2,3 1 Olivier Gevaert, PhD ; Mohsen Nabian, PhD ; Shaimaa Bakr, MS ; Celine Everaert, PhD ; Jayendra Shinde, PhD ; 2,3 4 4 2,5 6 2 Artur Manukyan, PhD ; Ted Liefeld, SM ; Thorin Tabor, MS ; Jishu Xu, PhD ; Joachim Lupberger, PhD ; Brian J. Haas, MS ; 6 7 4 2,3 3 Thomas F. Baumert, MD ; Mikel Hernaez, PhD ; Michael Reich, BS ; Francisco J. Quintana, PhD ; Erik J. Uhlmann, MD ; 3 4 2,8 2,3 Anna M. Krichevsky, PhD ; Jill P. Mesirov, PhD ; Vincent Carey, PhD ; and Nathalie Pochet, PhD PURPOSE The availability of increasing volumes of multiomics, imaging, and clinical data in complex diseases such as cancer opens opportunities for the formulation and development of computational imaging genomics methods that can link multiomics, imaging, and clinical data. METHODS Here, we present the Imaging-AMARETTO algorithms and software tools to systematically interrogate regulatory networks derived from multiomics data within and across related patient studies for their relevance to radiography and histopathology imaging features predicting clinical outcomes. RESULTS To demonstrate its utility, we applied Imaging-AMARETTO to integrate three patient studies of brain tumors, specifically, multiomics with radiography imaging data from The Cancer Genome Atlas (TCGA) glio- blastoma multiforme (GBM) and low-grade glioma (LGG) cohorts and transcriptomics with histopathology imaging data from the Ivy Glioblastoma Atlas Project (IvyGAP) GBM cohort. Our results show that Imaging- AMARETTO recapitulates known key drivers of tumor-associated microglia and macrophage mechanisms, mediated by STAT3, AHR, and CCR2, and neurodevelopmental and stemness mechanisms, mediated by OLIG2. Imaging-AMARETTO provides interpretation of their underlying molecular mechanisms in light of imaging biomarkers of clinical outcomes and uncovers novel master drivers, THBS1 and MAP2, that establish relationships across these distinct mechanisms. CONCLUSION Our network-based imaging genomics tools serve as hypothesis generators that facilitate the interrogation of known and uncovering of novel hypotheses for follow-up with experimental validation studies. We anticipate that our Imaging-AMARETTO imaging genomics tools will be useful to the community of bio- medical researchers for applications to similar studies of cancer and other complex diseases with available multiomics, imaging, and clinical data. JCO Clin Cancer Inform 4:421-435. © 2020 by American Society of Clinical Oncology Licensed under the Creative Commons Attribution 4.0 License ASSOCIATED CONTENT 4,5 INTRODUCTION patients with glioblastoma multiforme (GBM) and Appendix approximately 500 patients with low-grade glioma Author affiliations Major collaborative initiatives have unleashed a myriad and support (LGG). The TCIA Visually AcceSAble Rembrandt of multiomics, clinical, and imaging data for large 7-9 information (if Images (VASARI) project curated a feature set of patient cohorts in studies of cancer, such as multio- applicable) appear at approximately 30 magnetic resonance imaging mics and clinical data from The Cancer Genome Atlas the end of this 1 (MRI)–derived features on the basis of specialists’ article. (TCGA) and the Clinical Proteomic Tumor Analysis 2 review that is available for approximately 200 patients Accepted on March Consortium (CPTAC) and radiographic and histopa- with GBM and approximately 180 patients with LGG. 16, 2020 and thology imaging data from The Cancer Imaging Ar- published at 3 chive (TCIA). For example, the brain tumor section of A trade-off exists between the number of patients in- ascopubs.org/journal/ TCGA provides multiomics profiles, including RNA cluded in a data set and the depth of analysis that has cci on May 8, 2020: sequencing (RNA-seq), DNA copy number variation, moved to increasing levels of refinement, ranging from DOI https://doi.org/10. 10-12 1200/CCI.19.00125 and DNA methylation data for approximately 500 studying tissues, to cell populations, to single-cell 421 Gevaert et al CONTEXT Key Objective Imaging-AMARETTO provides software tools for imaging genomics through multiomics, clinical, and imaging data fusion within and across multiple patient studies of cancer, toward better diagnostic and prognostic models of cancer. Knowledge Generated Our network-based imaging genomics tools serve as powerful hypothesis generators that facilitate the testing of known hypotheses and uncovering of novel hypotheses for follow-up with experimental validation studies. Our case study that integrated multiple studies of brain cancer illustrates how Imaging-AMARETTO can be used for imaging diagnostics and prognostics by interrogating multimodal and multiscale networks for imaging biomarkers to identify their clinically relevant underlying molecular mechanisms. Relevance We anticipate that our Imaging-AMARETTO tools for network-based fusion of multiomics, clinical, and imaging data will directly lead to better diagnostic and prognostic models of cancer. In addition, our tools for network biology and medicine will open new avenues for drug discovery by integrating pharmacogenomics data into these networks, toward better therapeutics of cancer. sequencing. For example, the Ivy Glioblastoma Atlas Project mechanisms, with interpretation of the underlying mo- 13-15 (IvyGAP) provides 270 transcriptomic profiles refined lecular mechanisms in light of imaging biomarkers of clinical outcomes. through histopathology imaging and annotated by a con- sensus of histopathologists for studying anatomic structures METHODS and cancer stem cells for a subset of approximately 30 The *AMARETTO Software Architecture patients from the TCGA GBM cohort. The *AMARETTO framework (Fig 1) provides tools for In parallel, quantitative imaging provides tools that are network-based fusion of multiomics, clinical, and imaging capable of processing large volumes of radiography and data within and across multiple patient studies of cancer. histopathology images, such as deep convolutional neural Specifically, this framework offers modular and comple- networks. The promising field of radiogenomics is based on mentary solutions to multimodal and multiscale aspects the idea that entities at different scales, such as molecules, of network-based modeling within and across studies cells, and tissues, are linked to one another and, therefore, of cancer through the AMARETTO and Community- may be modeled as a whole. Studies have shown that AMARETTO algorithms, respectively. In this work, we quantitative image features extracted from radiography present an imaging genomics software toolbox that com- imaging data are associated and predictive of gene ex- prises the newly formulated Imaging-AMARETTO and 17-20 pression patterns from tissues of matched tumors. Imaging-Community-AMARETTO algorithms that together Recent efforts further expand this work to link multio- facilitate interpretation of patient-derived multiomics net- mics data with both radiography and histopathology works for their relevance to radiography and histopathology 21,22 imaging, toward developing methods for imaging imaging biomarkers of clinical outcomes. Resources of the genomics. *AMARETTO software toolbox are available through These large archives of multimodal and multiscale data GitHub, Bioconductor, R Jupyter Notebook, GenePattern, sources provide complementary insights into the mech- GenomeSpace, and GenePattern Notebook. anistic basis of cancer, toward better diagnosis and Imaging-AMARETTO From a User’s Perspective treatment, and open unprecedented modeling opportu- nities to link multiomics data with clinical and imaging The workflow for multiomics, clinical, and imaging data phenotypes. As a solution to imaging genomics, we in- fusion includes utilities to learn networks from individual troduce the Imaging-AMARETTO software tools to sys- patient cohorts using Imaging-AMARETTO and to link tematically interrogate networks derived from multiomics networks across multiple related patient cohorts using data for relevance to imaging biomarkers of clinical out- Imaging-Community-AMARETTO. Together, these work- comes. We demonstrate the utility of these imaging ge- flows allow users to integrate patient tumor-derived mul- nomics tools by integrating three patient brain tumor tiomics or transcriptional profiles with clinical and databases, including the TCGA GBM and LGG and the molecular characteristics and radiography and histopa- IvyGAP GBM cohorts. We uncover known and novel thology imaging features within and across related patient drivers of tumor-associated microglia and macrophage cohorts. The Imaging-AMARETTO source code in R is mechanisms, and neurodevelopmental and stemness available from GitHub. An R Jupyter Notebook that 422 © 2020 by American Society of Clinical Oncology Community 2 Community M Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Regulatory Mechanisms of Drivers from Multiomics Subnetworks of Regulatory Circuits RNA TCGA GBM MET GENE GENE GENE CNV IvyGAP GBM Networks of Regulatory Circuits Circuit 1 Circuit 2 TCGA LGG Circuits Target genes Driver genes Activators Circuit 3 Circuit N Repressors ... (2) Community-AMARETTO (1) AMARETTO Clinical Disease Molecular Disease Radiography Imaging Histopathology Imaging Biomarker Discovery Biomarker Discovery Biomarker Discovery Biomarker Discovery (3) Imaging-AMARETTO and Imaging-Community-AMARETTO FIG 1. The Imaging-AMARETTO and Imaging-Community-AMARETTO software architecture. The overall framework offers modular and com- plementary solutions to multimodal and multiscale aspects of network-based modeling within and across multiple studies of cancer. Specifically, (1) The AMARETTO algorithm learns networks of regulatory modules or circuits (circuits of drivers and target genes) from functional genomics or multiomics data (eg, DNA copy number variation [CNV], DNA methylation [MET], RNA gene expression [RNA]) within each study of cancer (eg, within The Cancer Genome Atlas [TCGA] glioblastoma multiforme [GBM], Ivy Glioblastoma Atlas Project [IvyGAP] GBM, or TCGA low-grade glioma [LGG] cohorts separately); (2) the Community-AMARETTO algorithm learns communities or subnetworks of regulatory circuits that are shared or distinct across networks derived from multiple studies of cancer (eg, across the TCGA GBM, IvyGAP GBM, and TCGA LGG cohorts); and (3) the Imaging-AMARETTO and Imaging-Community-AMARETTO algorithms associate these circuits (AMARETTO) and subnetworks (Community- AMARETTO) to clinical, molecular, and imaging-derived biomarkers by mapping radiography and histopathology imaging data onto the networks and assessing their clinical relevance for imaging diagnostics. provides stepwise guidelines for running the source code MSigDB Hallmark). When genetic or epigenetic data are directly from GitHub for its application to brain cancer is also available, they can help to guide the selection of available from Google Colaboratory. candidate driver genes. Potential cancer drivers are identified as somatic recurrent cancer aberrations from Imaging-AMARETTO supports multiple workflows: a pa- genetic and epigenetic data sources using GISTIC and tient cohort with only transcriptional profiles and a patient 32,33 MethylMix. The GISTIC algorithm is used to identify cohort with multiomics profiles. When only RNA-seq data copy number amplifications and deletions from DNA copy are available for a cohort, a predefined list of candidate number variation data. The MethylMix algorithm is used to driver genes is required, which can be selected or uploaded identify hyper- and hypomethylated sites from DNA by the user. Predefined lists of known drivers are available 26,27 methylation data. The user also specifies the number of for collections of transcription factors (TFutils and 28,29 regulatory modules to be learned from the data and the Molecular Signatures Database [MSigDB] C3) and cancer driver genes (COSMIC Cancer Gene Census, percentage of most varying genes to be included in the JCO Clinical Cancer Informatics 423 Community 1 Gevaert et al analysis. Networks of regulatory modules are inferred from To validate the predicted drivers as regulators of their RNA-seq data using an iterative optimization procedure. targets in modules and communities, we assess whether 34-37 The algorithm starts with an initialization step that activator or repressor drivers have a direct or indirect im- clusters the genes into modules of co-expressed target pact on their targets using experimental genetic pertur- genes. For each of these modules, we learn the regulatory bation data. The user can test signatures derived from programs as a linear combination of candidate driver genes genetic perturbation experiments, such as signatures of that best predict their target genes’ expression using Elastic target genes bound to transcription factors measured in Net–regularized regression. Target genes are then reas- protein-DNA–binding chromatin immunoprecipitation signed to the regulatory programs that best explain their sequencing (ChIP-Seq) experiments (Encyclopedia of 46 47 gene expression levels as estimated by the predictive power DNA Elements, ChIP-X Enrichment Analysis, of the regulatory programs’ respective regularized re- Harmonizome )ordefined by motif binding (MSigDB gression models when predicting the target genes’ C3), or signatures of genes induced or repressed in re- 34-37 expression. The algorithm iterates over these two steps sponse to genetic knockdown or overexpression experi- until convergence. This analysis generates a network of ments of drivers (Library of Integrated Network-Based 49,50 regulatory modules, defined as a group of target genes Cellular Signatures [LINCS]/Connectivity Map [CMAP], collectively activated or repressed by their associated Harmonizome). drivers. To characterize regulatory modules for clinical outcomes and molecular biomarkers, the user can submit pheno- Imaging-Community-AMARETTO: Linking types known for all or subsets of samples and specify the Imaging-AMARETTO Networks Across Cohorts statistical hypothesis tests to use for each phenotype. Ex- To compare and integrate networks of regulatory modules amples of clinical and molecular phenotypes include across multiple cohorts, the user can submit two or more survival data, molecular subclasses (eg, mesenchymal, Imaging-AMARETTO networks and optionally add known proneural, or classical GBM, astrocytoma or oligoden- networks as collections of signatures to guide subnet- droglioma LGG) and biomarkers (eg, IDH mutation, EGFR work learning and interpretation, such as immune cell amplification, MGMT methylation status). Our imple- 39,40 41-43 (CIBERSORT ) and stemness signatures. The mentation supports survival analysis using Cox proportional algorithm creates a module map of all pairwise com- hazards regression, nominal two-class and multiclass parisons between modules across multiple networks to analysis using the Wilcoxon rank sum and Kruskal-Wallis assess the extent of overlapping genes between all pairs tests, and continuous or ordinal analysis using the Pearson of modules (−log P value, hypergeometric test). This linear and Spearman rank correlation tests. These clinical module map is partitioned using an edge betweenness and molecular phenotype associations are assessed for community detection algorithm (Girvan-Newman ) that each of the regulatory modules in individual cohorts and groups the modules into subnetworks or communities combined for communities across cohorts. across the multiple networks. These communities repre- Finally, to interpret regulatory modules for relevance to sent shared behavior across two or more cohorts, and radiography or histopathology imaging features, associa- modules not assigned to communities are reported as tions with these imaging features can be assessed. Ex- distinct behavior specific for each cohort. This analysis amples of radiography and histopathology phenotypes generates subnetworks or communities of regulatory include the 30 TCIA VASARI MRI features defined by expert modules that are shared or distinct across multiple consensus for the TCGA GBM and LGG cohorts, the Imaging-AMARETTO networks derived from multiple co- 13-15 IvyGAP histopathology imaging features characterizing horts and further refines shared and distinct behavior of RNA-seq samples refined for anatomic structures and modules with respect to their specific drivers. cancer stem cells defined by expert consensus for the Downstream Utility for Interpreting Clinical and IvyGAP GBM cohort, and radiography and histopathology Experimental Outcomes imaging features derived using quantitative imaging 20,52-54 methods. We developed several downstream utilities that facilitate interpretation of the Imaging-AMARETTO networks, in- Users are provided with all results in the form of hypertext cluding functional characterization, driver validation, clin- markup language (HTML) reports that are generated in an ical correlation, and imaging association. To functionally automated manner for individual cohorts using Imaging- characterize modules and communities, we provide sig- AMARETTO and multiple cohorts using Imaging-Commu- natures from known gene sets databases (MSigDB) that nity-AMARETTO. These reports include searchable tables can be augmented with user-defined signatures, such as within and across modules and communities, including 39,40 45 41-43 immune cell, stromal cell, and stemness signa- statistics (ie, coefficients, P values, false discovery rate tures. Regulatory modules and communities are assessed [FDR] values) for functional enrichment, driver validation, for enrichment in these known functional categories clinical and molecular biomarkers, and radiography and (hypergeometric test). histopathology imaging features. These reports also include 424 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer heat map visualizations for modules (Figs 2-6) and graph Diametrically opposed, higher expression levels of OLIG2 visualizations for communities (Appendix Fig A1). Source modules (Figs 4-6) are associated with better survival in code is also provided to convert Imaging-AMARETTO and GBM and LGG, and these modules also distinguish be- Imaging-Community-AMARETTO networks for depositing tween molecular subclasses of GBM and LGG but in the networks in the NDEx network database, taking advan- opposite direction. In GBM the classical and proneural tage of its interactive features. subclasses are represented by higher expression of these modules compared with the mesenchymal subclass. In RESULTS LGG, the oligodendroglioma subtype is characterized by To demonstrate its utility, we applied the Imaging- higher expression of these modules compared with the AMARETTO workflow to three studies of brain tumors: astrocytoma subtype. multiomics profiles from approximately 500 patients and approximately 30 radiography MRI features for Imaging-AMARETTO Deciphers Histopathology Imaging approximately 200 patients from the TCGA GBM Biomarkers of Key Driver Mechanisms cohort, multiomics profiles from approximately 500 Histopathology imaging features of anatomic structures patients and approximately 30 radiography MRI fea- show that higher expression of STAT3, AHR, and CCR2 tures for approximately 180 patients from the TCGA modules (Fig 3) distinguishes between samples derived LGG cohort, and for a subset of approximately 30 from the cellular tumor compared with those from leading patients from the TCGA GBM patient cohort 270 tran- edge and infiltrating tumor regions. Higher expression of scriptomic profiles refined through histopathology im- OLIG2 modules distinguishes infiltrating tumor from cel- aging and annotated with imaging features that lular tumor samples. characterize anatomic structures for 122 samples and Features representative of cancer stem cells show that cancer stem cells for 148 samples were used from the higher expression of STAT3, AHR, and CCR2 modules IvyGAP GBM project. (Fig 3) distinguishes cancer stem-cell samples from their Disease progression in glioma is characterized by in- non–stem cancer cell counterparts. This observation is filtration of resident microglia and peripheral macrophages consistent across the distinct substructures of the cellular in the tumor microenvironment and by pervasive infiltration tumor, including hyperplastic blood vessels, microvascular of tumor cells in the healthy surroundings of the tumor. proliferation, perinecrotic zone, and pseudopalisading cells Understanding microglia and macrophage physiology and around necrosis. Diametrically opposed, higher expression its complex interactions with tumor cells can elucidate their of OLIG2 modules distinguishes non–stem cancer cells roles in glioma progression and uncover potentially in- from cancer stem cells consistently across these micro- teresting druggable targets. vascular and necrosis substructures. Our results show that Imaging-AMARETTO captures these hallmarks of glioma, for example, key drivers of tumor- Imaging-AMARETTO Deciphers Radiography Imaging associated microglia and macrophage mechanisms Biomarkers of Key Driver Mechanisms mediated by STAT3, AHR, and CCR2, and neuro- Radiographic image features of STAT3, AHR, and CCR2 developmental and stemness mechanisms that involve modules (Figs 2 and 6) are highly consistent across GBM 59,60 OLIG2. Our findings recapitulate recent discoveries and and LGG. Higher expression is associated with a higher provide interpretation of the molecular mechanisms in light proportion of enhancing tumor, lower proportion of non- of imaging biomarkers of clinical outcomes. Of note, Im- enhancing tumor, and less cortical involvement. These aging-Community-AMARETTO also uncovers novel key modules also distinguish between measures of thickness of master drivers that are shared by these distinct key enhancing margin in both GBM and LGG. In GBM these mechanisms. STAT3, AHR, and CCR2 modules show higher expression in association with eloquent cortex, while in LGG, they show Imaging-AMARETTO Deciphers Clinical Relevance of higher expression in association with enhancement Multiomics Modules of Key Driver Mechanisms intensity. Of clinical relevance, higher expression levels of STAT3, AHR,and CCR2 modules (Figs 2, 3,and 6)are asso- Features of OLIG2 modules (Figs 4-6) are also consistent ciated with shorter survival in GBM and LGG, and these across GBM and LGG and diametrically opposed to those of modules also distinguish between molecular subclasses STAT3, AHR, and CCR2. In both GBM and LGG, higher of GBM and LGG. In GBM, the mesenchymal subclass is expression is associated with a higher proportion of non- represented by higher expression of these modules enhancing tumor and lower proportion of enhancing tumor. compared with the classical and proneural subclasses. In GBM, higher expression is also associated within speech In LGG, the astrocytoma subtype is characterized by receptive eloquent cortex, while in LGG, higher expression higher expression of these modules compared with the is associated with cortical involvement and the presence oligodendroglioma subtype. of cysts. JCO Clinical Cancer Informatics 425 Gevaert et al Driver Genes Expression Target Genes Phenotype Methylation Expression Associations STAT3 STAT3 STAT3 AHR Normalized Methylation state gene expression CNV or methylation alterations Predefined driver list Driver genes weights Phenotype associations Hypermethylated Methylation aberrations COSMIC Cancer Gene Clinical and molecular phenotypes Census Hypomethylated Not altered Imaging phenotypes −4 −2 0 2 4 −0.2 −0.1 0 0.1 0.2 Not altered Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q subtype No No IDHmut-codel Astrocytoma Oligoastrocytoma Yes Yes IDHmut-non-codel IDHwt Oligodendroglioma f4 f5 f6 f11 f20 No Yes 01234 02468 02468 1234 FIG 2. Imaging-AMARETTO predicts STAT3 and AHR as known drivers of tumor-associated microglia and macrophage mechanisms in low-grade glioma (LGG). These heat maps present module 125 from The Cancer Genome Atlas (TCGA) LGG cohort that is a member of community 5 shared across the three cohorts (TCGA glioblastoma multiforme [GBM], Ivy Glioblastoma Atlas Project GBM, and TCGA LGG). For all patient-derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA methylation and RNA gene expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes eight driver genes (FNDC3B, IQGAP1, ANO6, ELK3, STAT3, TMOD3, CASP8, and ITPRIPL2) that jointly act as activator drivers of the 114 target genes in this module, including AHR and STAT3. Six driver genes are methylation driven (hypomethylation of ANO6, CASP8, ITPRIPL2, STAT3, and TMOD3 and hypermethylation of ANO6 and ELK3, inversely associated with their gene expression levels). Survival analysis reveals that increased expression of the genes in this module is associated with shorter survival (P = 7.3e-11; false discovery rate [FDR] = 1.0e-9). This module distinguishes between the histological subtypes (P = 1.6e-15; FDR = 2.1e-14), with lower expression representing the oligodendroglioma subtype (P = 1.0e-14; FDR = 1.1e-13) and higher expression reflecting the astrocytoma subtype (P = 1.5e-11; FDR = 2.8e-10). IDH mutation (mut) status and 1p19q subtypes are associated with the module expression (P = 3.9e-22; FDR = 1.5e-21), with higher expression levels representing the wild-type (wt) status. Association of Visually AcceSAble Rembrandt Images magnetic resonance imaging features with this module shows that the proportion of enhancing tumor (f5; P = 8.96e-7; FDR = 0.0000134) and enhancement intensity (f4; P = .0000395; FDR = 0.000494) are correlated with gene expression, while the proportions of nonenhancing tumor (f6; P = .0000452; FDR = 0.00036) and cortical involvement (f20; P = .0175; FDR = 0.118) are inversely correlated with expression. Module gene expression levels also distinguish between the thicknesses of enhancing margin (f11; P = .00137; FDR = 0.02). CNV, copy number variation. 426 © 2020 by American Society of Clinical Oncology ANO6 CASP8 ELK3 ITPRIPL2 STAT3 TMOD3 ELK3 TMOD3 IQGAP1 FNDC3B CASP8 ITPRIPL2 STAT3 ANO6 ERAP1 GALNT10 SLC24A1 DDR2 WWTR1 HELB SFT2D2 ATL3 PTPN13 LYST IGFBP5 ITGA11 SLC40A1 LAX1 TWSG1 CD2AP ZNF217 S1PR3 MAP7D3 ABCA13 MATN2 CRISPLD1 TNFAIP3 TNFRSF10A CFLAR LRCH1 CHSY1 RAG1 HEG1 ITGAV FEM1C TNFRSF19 CUBN RNFT1 BTN2A3 TEP1 AFF1 NFKB1 FZD5 RIPK1 FAM46A PPP1R3B ZFP36L2 MAGT1 TMED10 RB1 CDH11 MOBKL1B SOAT1 FNDC3B SPPL2A SLC35F5 FLNA VCL TRAM2 IQGAP1 FYCO1 REST ELK3 TMOD3 PCDH18 ANTXR2 PALLD CALD1 ERI1 TRAM1 CMTM6 PI4K2B TTC26 CALU PLCE1 IQGAP2 STAT3 ANO6 ADAM9 CLIC4 FRRS1 CREB3L2 TLN1 GNS PGM2 ROD1 ZNF468 CRLF3 TTF2 DNAJC10 DCBLD2 AHR CD274 ITGB3 MYOF RAB27A OSMR RBMS1 SEC24D FAM114A1 LTBP2 BTBD19 COL27A1 SEPN1 GALNT2 LOC100129034 TNC SH3PXD2B PHEX LOC653653 CROT PLEKHA9 PLA2G4A LAMA4 SIPA1L2 FRMD3 NRP2 C4orf36 Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q subtype f4 f5 f6 f11 f20 Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Driver Genes Target Genes Phenotype Expression Expression Associations THBS1 CCR2 AHR THBS1 CCR2 Normalized gene expression CNV or methylation alterations Predefined driver list Driver genes weights Phenotype associations Not altered COSMIC Cancer Gene Census Clinical and molecular phenotypes Imaging phenotypes −10 −5 0 5 10 −0.2 −0.1 0 0.1 0.2 Mesenchymal subclass Classical subclass Proneural subclass No No No Yes Yes Yes Anatomic structures CT IT LE region CTmvp CTpan CT No No No No No IT Yes Yes Yes Yes Yes LE Cancer stem cells CT stem cells CThbv stem cells CTmvp stem cells CT controls No No No CT reference genes Yes Yes Yes CThbv reference genes CTmvp reference genes CTpan stem cells CTpnz stem cells No No Yes Yes FIG 3. Imaging-AMARETTO predicts AHR and CCR2 as known drivers and identifies TBHS1 as a novel driver of tumor-associated microglia and macrophage mechanisms in glioblastoma multiforme (GBM). These heat maps present module 64 from the Ivy Glioblastoma Atlas Project (IvyGAP) GBM cohort that is a member of community 5 shared across the three cohorts (The Cancer Genome Atlas [TCGA] GBM, IvyGAP GBM, and TCGA low-grade glioma). For all patient- derived samples (rows), the heat maps show driver genes’ functional genomics profiles (columns), specifically RNA gene expression profiles (left panel); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes nine driver genes (THBS1, CLEC2B, TNFAIP3, DSE, RNF149, MGP, CCR2, CSPG5,and CKB) that jointly act as activators (eg, CCR2, a squamous cell carcinoma tumor-rejection antigen recognized by T lymphocytes and candidate for specific immunotherapy) and repressors (CSPG5 and CKB) that drive the 87 target genes in this module, including AHR, CCR2,and THBS1. Association analyses confirm that higher expression of module genes reflects samples derived from patients in the mesenchymal subclass (P = .000047; false discovery rate [FDR] = 0.00015), while lower expression represents the classical subclass (P = .032; FDR = 0.067). Association of histopathology imaging features that study anatomic structures reveals that genes in this module distinguish between the samples derived from distinct anatomic structures (P = 1.8e-15; FDR = 4.3e-15), (continued on following page) JCO Clinical Cancer Informatics 427 CKB CSPG5 RNF149 CLEC2B TNFAIP3 DSE THBS1 MGP CCR2 RNF217 APLF TNIP3 PI15 C9orf3 LMCD1 MTA3 TNFRSF10A STX3 BICC1 ITGBL1 ZNF185 MICALCL CXCL1 MME FBXO22−AS1 BRE−AS1 SH2D2A SLC39A8 IL1R1 TGM2 BDKRB2 THBD CCL8 CYP1B1 CXCL6 COL13A1 HMGA2 AHR CLIC6 GEM CLEC2B THBS1 SEC24D ADAMTS1 PRSS23 PTBP3 CD2AP SLFN5 CD1D AMICA1 CCR2 ACP5 TM4SF1 SLFN11 FAM176A NABP1 RAB38 RUNX2 FCER1A TACSTD2 CCL3 C10orf125 CD52 EDEM1 P4HA3 CCL26 FCGR3B FHL2 SUSD1 SLC4A7 CPD DCBLD2 KCTD9 GCLM MAF MIR22HG REXO2 SRGN CREG1 RNF149 MRC1 IL7R SUMF1 IL13RA2 SEL1L3 C13orf33 SLC20A1 RARRES1 IGJ C11orf70 ITK CD200R1 INHBA C3orf52 IL18R1 LTB Mesenchymal subclass Classical subtype Proneural subclass Anatomic structures CT region IT region LE region CTmvp CTpan Cancer stem cells CT stem cells CThbv stem cells CTmvp stem cells CTpan stem cells CTpnz stem cells Gevaert et al Imaging-Community-AMARETTO Uncovers Known Key testing novel hypotheses of THBS1 and MAP2 as master and Novel Master Drivers Linking Mechanisms regulators of shared mechanisms that involve macrophage infiltration, vascularization, tumorigenesis, invasion, stem- Recent discoveries of STAT3, AHR,and CCR2 as drivers ness, and neurogenesis in glioma. of tumor-associated microglia and macrophage mechanisms are captured by modules in communities 1 DISCUSSION and 5: TCGA LGG module 125 (Fig 2) shows hypo- We developed the Imaging-AMARETTO algorithms and methylation of STAT3 as activator driver of AHR,with software tools for imaging genomics to facilitate systematic higher expression associated with shorter survival and interrogation of regulatory networks derived from multio- astrocytoma LGG, and IvyGAP GBM module 64 (Fig 3) mics data within and across related patient studies for their shows that higher expression of AHR and CCR2 is as- relevance to radiography and histopathology imaging fea- sociated with the presence of cancer stem cells and tures that predict clinical outcomes. We demonstrated its microvascular substructures and suggests as novel acti- utility through application to three patient studies of brain vator driver, THBS1, that plays important roles in mac- tumors, including multiomics and radiography imaging 61 62 rophage infiltration and angiogenesis, vascularization, data from the TCGA GBM and LGG studies and tran- and tumorigenesis in glioma. OLIG2 as a driver of scriptional and histopathology imaging data from the Ivy- 60,63,64 neurodevelopmental and stemness mechanisms is GAP GBM study. captured by modules in community 2: (1) TCGA LGG Our results show that Imaging-AMARETTO recapitulates module 91 (Fig 4) shows hypomethylation of OLIG2 as known key drivers of tumor-associated microglia and activator driver of this module, with higher expression macrophage mechanisms (STAT3, AHR, and CCR2) and associated with better survival, oligodendrocyte LGG, neurodevelopmental and stemness mechanisms (OLIG2). and IDH1 wild-type status; (2) TCGA GBM module 75 Imaging-AMARETTO provides interpretation of the un- (Fig 5)isdrivenby OLIG2, with higher expression asso- derlying molecular mechanisms in light of imaging bio- ciated with proneural and classical versus mesenchymal markers of clinical outcomes, and Imaging-Community- GBM, and suggests as novel repressor driver THBS1 and AMARETTO also uncovered novel master drivers THBS1 as novel activator driver hypomethylation of neuronal and MAP2 that establish relationships across these distinct marker MAP2 that plays important roles in microtubule- 65,66 67 mechanisms. associated neurogenesis and reduces invasiveness and stemness in glioma; and (3) TCGA GBM module 98 Of note, the querying of the Imaging-AMARETTO networks (Fig 6)shows CCR2 and OLIG2 co-acting as activator and for modules whose elevated expression is inversely asso- repressor drivers, respectively, highlighting their di- ciated with proportions of enhancing tumor and cancer ametrically opposed behavior, with higher CCR2 and stem cells on radiography and histopathology imaging, lower OLIG2 expression associated with mesenchymal respectively, shows that these modules are putatively versus proneural and classical GBM and suggesting as coregulated by activator drivers OLIG2 and MAP2 and novel repressor driver hypomethylation of MAP2,con- repressor drivers STAT3, AHR, CCR2, and THBS1. Thus, sistent with observations in TCGA GBM module 75 (Fig 5). we hypothesize that restoration of the function of OLIG2 and Using knockdown experiments of THBS1 from LINCS/ MAP2 and attenuation of the expression of STAT3, AHR, CMAP, we confirmed that THBS1 acts as activator and CCR2, and THBS1 potentially shift their target genes’ ex- repressor of its targets in IvyGAP GBM module 64 pression to more benign functional states associated with (Fig 3) and TCGA GBM module 75 (Fig 5), respectively. better survival in GBM and LGG. Thus, Imaging-Community-AMARETTO (Appendix Fig A1) This case study illustrates how Imaging-AMARETTO can be identified THBS1 and MAP2 as novel master drivers across used for imaging diagnostics and prognostics by in- the three STAT3, AHR, CCR2, and OLIG2 communities that terrogating multimodal and multiscale networks for imaging provide new insights into how these distinct key mecha- biomarkers to identify their clinically relevant underlying nisms are linked in glioma. Interesting avenues for further molecular mechanisms. Our network-based imaging ge- exploration with experimental validation studies include nomics tools are powerful hypothesis generators that FIG 3. (Continued). where cellular tumor (CT; P = .0000078; FDR = 0.000013) samples and, in particular, with substructure microvascular proliferation (CTmvp; P = 8.7e-14; FDR = 6.2e-13) and pseudopalisading cells around necrosis (CTpan; P = .018; FDR = 0.027) show elevated expression of module genes, while leading edge (LE; P = .018; FDR = 0.023) and infiltrating tumor (IT; P = .0014; FDR = 0.0045) samples show lower expression. Association of histopathology imaging features targeting cancer stem cells reveals that samples derived from cancer stem cells are generally associated with higher module gene expression compared with nonstem cells (P = 4.0e-16; FDR = 8.6e-15) and, specifically, elevated expression in stem-cell v control samples from substructures of the CT (P = .0000059; FDR = 0.00018), including hyperplastic blood vessels (CThbv; P = 1.2e-11; FDR = 5.2e-10), perinecrotic zone (CTpnz; P = 3.4e-9; FDR = 3.4e-8), CTpan (P = .000087; FDR = 0.00032), and CTmvp (P = .0039; FDR = 0.029). CNV, copy number variation. 428 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Driver Genes Phenotype Target Genes Expression Associations Expression Methylation OLIG2 OLIG2 OLIG2 Normalized Methylation state gene expression CNV or methylation drivers Predefined driver list Driver genes weights Phenotype associations Hypermethylated Methylation aberrations Driver not predefined Clinical and molecular phenotypes Hypomethylated Not altered COSMIC Cancer Gene Census Imaging phenotypes −4 −2 0 2 4 −0.4 −0.2 0 0.2 0.4 Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q subtype Astrocytoma No No IDHmut-codel Oligoastrocytoma Yes Yes IDHmut-non-codel Oligodendroglioma IDHwt f5 f6 f8 f20 No No Yes Yes 02468 02468 FIG 4. Imaging-AMARETTO predicts OLIG2 as known driver of neurodevelopmental and stemness mechanisms in low-grade glioma (LGG). These heat maps present module 91 from The Cancer Genome Atlas (TCGA) LGG cohort that is a member of community 2 shared across the three cohorts (TCGA glioblastoma multiforme [GBM], Ivy Glioblastoma Atlas Project GBM, and TCGA LGG). For all patient-derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA methylation and RNA gene expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes nine driver genes (SOX8, OLIG2, EBF4, NLGN2, SHD, RANBP17, FERMT1, LOC254559, and C11orf63) that jointly act as activator and repressor (C11orf63) drivers of the 94 target genes in this module, including OLIG2. Six driver genes are methylation driven (hypomethylation of FERMT1 and OLIG2 and hypermethylation of C11orf63, NLGN2, RANBP17, and SOX8, inversely associated with their gene expression levels). Survival analysis reveals that increased expression of the genes in this module is associated with better survival (P = 7.1e-11; false discovery rate [FDR] = 1.0e-9). This module distinguishes between the histological subtypes (P = .00047; FDR = 0.00084), with lower expression representing the astrocytoma subtype (P = .00016; FDR = 0.00039) and higher expression reflecting the oligodendroglioma subtype (P = .0027; FDR = 0.0047). IDH mutation (mut) status and 1p19q subtypes are associated with the module expression (P = 2.0e-26; FDR = 1.1e-25), with lower expression levels representing the wild-type (wt) status. Association of Visually AcceSAble Rembrandt Images magnetic resonance imaging features with this module shows that the proportion of enhancing tumor (f5; P = .016; FDR = 0.032) is inversely correlated with gene expression, while the proportion of nonenhancing tumor (f6; P = .032; FDR = 0.066), cortical involvement (f20; P = .018; FDR = 0.12), and the presence of cysts (f8; P = .0095; FDR = 0.14) are correlated with expression. CNV, copy number variation. JCO Clinical Cancer Informatics 429 C11orf63 FERMT1 NLGN2 OLIG2 RANBP17 SOX8 C11orf63 FERMT1 LOC254559 OLIG2 SHD SOX8 NLGN2 EBF4 RANBP17 POM121L10P NXN FAM22D FAM22A TMEM121 ACAP3 ANKRD13B NLGN2 EBF4 GPSM1 TNK2 TBKBP1 LOC283174 NPPA DGCR2 PTCHD2 HES6 BCAN DLL1 DLL3 UPK2 VPS37D FAM110B RCOR2 SHD SOX8 CSNK1E PHF21B KDM4B C17orf69 PLK1S1 SNHG1 MGC21881 KIF26A TFAP4 ZNF34 BOP1 FRAT1 FRAT2 FXYD2 SPATA9 SYNPO2L DAPL1 LPPR1 MYT1 ACCN4 ZCCHC24 RTKN OLIG1 OLIG2 NEU4 BCAR1 AMOTL2 C12orf34 RBP3 ZC4H2 H2AFY2 HMX1 C2orf27A ASB13 EFS PCBP4 H1F0 LOC84989 HIST3H2A PCGF6 C6orf134 VAX2 POLR2F MARCKS SULF2 KLRK1 KLRC3 KLRC2 PAX1 SOX3 MDFI MARCKSL1 KIAA0114 FLJ10038 ZSCAN2 C17orf100 C22orf27 PHLPP1 AKAP7 EHD1 TLE6 CORO7 AG2 FAM110A LOC100132288 TRIM62 REC8 MXD4 Histological subtype Astrocytoma Oligodendroglioma IDH 1p19q. subtype f5 f6 f8 f20 Gevaert et al Driver Genes Expression Target Genes Phenotype Expression Associations MAP2 OLIG2 MAP2 THBS1 Normalized CNV state Methylation state gene expression CNV or methylation drivers Predefined driver list Driver genes weights Phenotype associations Deleted Hypomethylated Copy number alterations COSMIC Cancer Gene Census Clinical and molecular phenotypes Amplified Not altered Methylation aberrations Imaging phenotypes (VASARI) −10 −5 0 5 10 −0.4 −0.2 0 0.2 0.4 Not altered Not altered Molecular subclasses Mesenchymal Classical Proneural Classical No No No G-CIMP Yes Yes Yes Mesenchymal Neural Proneural f3 f6 f6 (< 5%) f6 (< 33%) No No No Yes 1234 5 Yes Yes FIG 5. Imaging-AMARETTO predicts OLIG2 as known driver and identifies MAP2 and THBS1 as novel drivers of neurodevelopmental and stemness mechanisms in glioblastoma multiforme (GBM). These heat maps present module 75 from The Cancer Genome Atlas (TCGA) GBM cohort that is a member of community 2 shared across the three cohorts (TCGA GBM, Ivy Glioblastoma Atlas Project GBM, and TCGA low-grade glioma). For all patient- derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA copy number variation (CNV), DNA methylation, and RNA genes’ expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes 10 driver genes (CSPG5, MAP2, OLIG2, CKB, GCSH, PPP2R2B, HEPACAM, CDH10, CTNND2, and THBS1) that jointly act as activator and repressor (THBS1) drivers of the 84 target genes in this module. One driver gene, GCSH, is copy number driven (somatic recurrent copy number deletions and amplifications associated with its gene expression), and three driver genes are methylation driven (hypomethylation of CDH10, MAP2, and PPP2R2B, inversely associated with their gene expression). Higher expression of genes in this module comprises the classical (P = 9.9e-13; false discovery rate [FDR] = 5.1e-12) and proneural (P = .0000024; FDR = 0.0000056) molecular subclasses, while lower expression of genes in this module represents the mesenchymal molecular subclass (P = 6.1e-40; FDR = 2.3e-38). Association of Visually AcceSAble Rembrandt Images (VASARI) magnetic resonance imaging features with this module shows that the proportion of nonenhancing tumor (f6; P = .013; FDR = 0.060, with nonenhancing proportion , 33%, P = .0014; FDR = 0.027, and nonenhancing proportion , 5%, P = .032; FDR = 0.17) and speech receptive eloquent cortex (f3; P = .046; FDR = 0.75) are correlated with the module gene expression. facilitate the testing of known hypotheses and uncovering of to better diagnostic and prognostic models of cancer and novel hypotheses for follow-up with experimental validation will open new avenues for drug discovery by integrating studies. We anticipate that our tools for network-based pharmacogenomic data into these networks, toward better 69,70 fusion of multiomics, clinical, and imaging data will lead therapeutics of cancer. 430 © 2020 by American Society of Clinical Oncology GCSH CNV CDH10 Methylation MAP2 PPP2R2B CTNND2 HEPACAM CKB PPP2R2B GCSH CDH10 OLIG2 MAP2 CSPG5 THBS1 MGC35440 CDKN2AIPNL C1orf96 HMGCR ARL6IP1 FIBIN NCALD C20orf58 CNTFR NCAN MMD2 LOC388419 NLGN3 CSPG5 SEZ6L ASCL1 ABAT FAM77D APCDD1 LDHB ACOX3 HAPLN1 C9orf58 KCNJ16 LRRC3B C8orf46 LOC389073 RLBP1 CIDEC PPP2R2B TMEM58 BCAN TTYH1 C1orf61 TUBB2B GPM6A METRN BMP7 ATP6V0E2 LRRC4B SYT17 BAI2 CDK2AP1 DBI KLHL25 FGFBP3 C18orf51 SLC24A3 CLEC4F LMO1 APBB2 LRRN1 CLCN2 ASTN1 FAM5B TRIM9 PCDH10 CPNE5 C1orf21 TSPAN12 SEZ6 MPPED2 RIC3 KCNA2 BAI3 C17orf76 ROBO2 CA14 CDH1 COL9A2 HAPLN4 CCDC12 ACAD8 MANEAL SOX2 DCLK2 C3orf39 ZNF501 BHLHB9 MOSC1 LRRN2 MAGED4B LOC402573 PKD1L1 Molecular subclasses Mesenchymal Classical Proneural f3 f6 f6 ( < 5 % ) f6 ( < 30 % ) Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer Driver Genes Expression Target Genes Phenotype Expression Associations MAP2 CCR2 OLIG2 MAP2 Normalized Methylation state gene expression CNV or methylation drivers Predefined driver list Driver genes weights Phenotype associations Hypermethylated Methylation aberrations COSMIC Cancer Gene Census Clinical and molecular phenotypes Hypomethylated Not altered Imaging phenotypes −4 −2 0 2 4 −0.04 −0.02 0 0.02 0.04 Not altered Molecular subclasses Mesenchymal Classical Proneural IDH1 status Classical No No No Wild type G-CIMP Yes Yes Yes R132C Mesenchymal R132G Neural R132H Proneural f5 f6 (< 5%) f6 (< 33%) No 1234 12345 Yes FIG 6. Imaging-AMARETTO predicts CCR2 and OLIG2 as co-acting known activator and repressor drivers and MAP2 as novel repressor driver linking tumor-associated microglia and macrophage mechanisms with neurodevelopmental and stemness mechanisms in glioblastoma multiforme (GBM). These heat maps present module 98 from The Cancer Genome Atlas (TCGA) GBM cohort that is a member of community 2 shared across the three cohorts (TCGA GBM, Ivy Glioblastoma Atlas Project GBM, and TCGA low-grade glioma). For all patient-derived samples (rows), the heat maps show driver genes’ multiomics profiles (columns), including DNA methylation and RNA gene expression data (left panels); target genes’ (columns) RNA gene expression levels (middle panel); and relevant biomarkers (columns), including clinical and molecular and imaging phenotypes (right panels). This module includes 10 driver genes (TEC, SPINT1, CCR2, MAN1A1, OLIG2, NPAS3, MAP2, RUFY3, NTRK3, and CSPG5) that jointly act as activator (TEC, SPINT1, CCR2, MAN1A1) and repressor (OLIG2, NPAS3, MAP2, RUFY3, NTRK3, CSPG5) drivers of the 98 target genes in this module. Two driver genes, MAP2 and RUFY3, are methylation driven (hypomethylation of MAP2 and hyper- and hypomethylation of RUFY3, inversely associated with their gene expression levels). Higher expression of the genes in this module represents the mesenchymal molecular subclass (P = 5.1e-35; false discovery rate [FDR] = 5.9e-34), while lower expression of genes in this module comprises the classical (P = 6.1e-15; FDR = 3.8e-14), G-CIMP (P = .000012; FDR = 0.000030), and proneural (P = .00034; FDR = 0.00062) molecular subclasses. IDH1 mutation status is associated with the module expression (P = .00069; FDR = 0.0019), with higher expression levels representing the wild-type status. Association of Visually AcceSAble Rembrandt Images magnetic resonance imaging features with this module shows that the proportion of nonenhancing tumor is inversely correlated (f6; P = .045; FDR = 0.13) with nonenhancing proportion , 33% (P = .0086; FDR = 0.067) and the proportion of enhancing tumor is correlated (f5; P = .020; FDR = 0.28) with the module gene expression. JCO Clinical Cancer Informatics 431 Methylation MAP2 RUFY3 CCR2 MAN1A1 SPINT1 TEC RUFY3 OLIG2 MAP2 CSPG5 NTRK3 NPAS3 NLRP13 WDR4 EIF2S3 GSR TACSTD2 TM4SF19 TRPV4 ATP8B1 EDARADD INDOL1 SULT1E1 PRRX2 CRABP2 ASAM ZNF185 TPBG CAPN6 CYP26A1 GNAT2 WT1 ENOSF1 VANGL1 RHOD STAP2 C4orf7 CSF2 BDKRB1 MOCOS TNFRSF9 C3orf52 LGALS12 PAX8 ROR2 PTGER2 MME NTRK1 TNIP3 CHMP4C STEAP4 PSTPIP2 MCTP2 TEC GIPC2 TNFRSF10A ARHGEF5 TFAP2C MTL5 IL11 ADAMTSL1 AKR1B10 TMEM166 HGF CD70 RAB38 MGC16075 KCTD9 BFSP1 C6orf150 ELF4 MBOAT1 GCHFR TIFA H2AFJ SLC38A2 SLC38A4 TCN1 MBNL3 FRRS1 GPR1 FRK PTGFR ENPP1 SULT1C2 LONRF3 HIST1H2BK HIST2H2AA3 DNAJC1 ITLN2 WBSCR28 LAMA3 RNF168 C1orf211 DEFB1 TUFT1 DMRTA1 CPA6 XKRX EPHA3 RP6−213H19.1 FLJ21986 ASB9 PMAIP1 HOXB5 HOXB6 MGC45800 DKKL1 OSR2 WNT5B Molecular subclasses Mesenchymal Classical Proneural IDH1 status f5 f6 ( < 5 % ) f6 ( < 33 % ) Gevaert et al Financial support: Olivier Gevaert, Anna M. Krichevsky, Jill P. Mesirov, AFFILIATIONS Vincent Carey, Nathalie Pochet Stanford Center for Biomedical Informatics Research, Department of Collection and assembly of data: Olivier Gevaert, Mohsen Nabian, Artur Medicine and Biomedical Data Science, Stanford University, Stanford, Manukyan, Jishu Xu, Nathalie Pochet CA Data analysis and interpretation: Olivier Gevaert, Mohsen Nabian, Celine Cell Circuits Program, Broad Institute of MIT and Harvard, Cambridge, Everaert, Jayendra Shinde, Artur Manukyan, Ted Liefeld, Thorin Tabor, MA Joachim Lupberger, Brian J. Haas, Thomas F. Baumert, Mikel Hernaez, Ann Romney Center for Neurologic Diseases, Department of Neurology, Michael Reich, Francisco J. Quintana, Anna M. Krichevsky, Vincent Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 4 Carey, Nathalie Pochet Department of Medicine, University of California, San Diego, San Diego, Manuscript writing: All authors CA 5 Final approval of manuscript: All authors Rush University Medical Center, Chicago, IL 6 Accountable for all aspects of the work: All authors INSERM, U1110, Institut de Recherche sur les Maladies Virales et ´ ´ Hepatiques, Universite de Strasbourg, Institut Hopitalo-Universitaire, Hopitaux Universitaires de Strasbourg, Strasbourg, France AUTHORS’ DISCLOSURES OF POTENTIAL CONFLICTS OF Carl R. Woese Institute for Genomic Biology, University of Illinois at INTEREST Urbana-Champaign, Champaign, IL The following represents disclosure information provided by authors of Channing Division of Network Medicine, Brigham and Women’s this manuscript. All relationships are considered compensated unless Hospital, Harvard Medical School, Boston, MA otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the CORRESPONDING AUTHOR subject matter of this manuscript. For more information about ASCO’s Nathalie Pochet, PhD, Brigham and Women’s Hospital and Harvard conflict of interest policy, please refer to www.asco.org/rwc or ascopubs. Medical School, 60 Fenwood Rd, Boston, MA 02115, Broad Institute of org/cci/author-center. MIT and Harvard, 415 Main Street, Cambridge, MA 02142; e-mail: Open Payments is a public database containing information reported by npochet@broadinstitute.org. companies about payments made to US-licensed physicians (Open Payments). EQUAL CONTRIBUTION Olivier Gevaert O.G. and M.N. are equally contributing authors. Research Funding: Paragon Development Systems (Inst), Lucence J.P.M., V.C., and N.P. are equally contributing authors. Diagnostics (Inst) Brian J. Haas SUPPORT Consulting or Advisory Role: Immuneering, Diamond Age Data Science Supported by the National Cancer Institute (NCI) Informatics Technology Thomas F. Baumert for Cancer Research (R21CA209940 [O.G., T.F.B., J.P.M., N.P.], Stock and Other Ownership Interests: Alentis Therapeutics, Alentis U01CA214846 [V.C.], U01CA214846 Collaborative Set-aside [O.G., Therapeutics (Inst) A.M.K., V.C., N.P.], U24CA194107 [J.P.M.], U24CA220341 [J.P.M.], Research Funding: Janssen Research & Development (Inst), Alentis U24CA180922 [B.J.H., N.P., A. Regev]), NCI (R01CA215072 [A.M.K.], Therapeutics (Inst) U01CA217851 [O.G.], U01CA199241 [O.G.], Stanford CTD [O.G.]), Patents, Royalties, Other Intellectual Property: Patent applications on National Institute of Allergy and Infectious Diseases (R03AI131066 Claudin-1 targeting monoclonal antibodies for liver disease, fibrosis, and [T.F.B., N.P.]), and National Institute of Biomedical Imaging and cancer (Inst); patent applications on liver disease drug discovery systems Bioengineering (R01EB020527 [O.G.]). The content is solely the (Inst) responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Vincent Carey Employment: CleanSlate (I) SOFTWARE AND RESOURCES Honoraria: Gilead Sciences (I) The source code of Imaging-AMARETTO is available from GitHub, and Research Funding: Bayer AG a Notebook for its application to the three brain cancer cohorts is No other potential conflicts of interest were reported. available from Google Colaboratory, including links to interactive HTML reports and NDEx networks: http://portals.broadinstitute.org/pochetlab/ JCO_CCI_Imaging-AMARETTO/Imaging-AMARETTO_Software_ ACKNOWLEDGMENT Resources.html. We thank Aviv Regev, PhD, for helpful discussions on developing network biology and medicine approaches for studying complex human diseases. We also thank Howard Weiner, MD, and Vijay Kuchroo, DVM, PhD, for AUTHOR CONTRIBUTIONS helpful insights into deciphering the mechanistic basis of gliomas. Conception and design: Olivier Gevaert, Mohsen Nabian, Shaimaa Bakr, Artur Manukyan, Erik J. Uhlmann, Jill P. Mesirov, Vincent Carey, Nathalie Pochet REFERENCES 1. National Cancer Institute: The Cancer Genome Atlas Program, 2018. https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga 2. National Cancer Institute Office of Cancer Clinical Proteomics Research: OCCPR: A leader in cancer proteomics and proteogenomics, 2019. https://proteomics. cancer.gov 3. The Cancer Imaging Archive: Welcome to The Cancer Imaging Archive, 2019. https://www.cancerimagingarchive.net 432 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer 4. Cancer Genome Atlas Research Network: Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455:1061- 1068, 2008 [Erratum: Nature 494:506, 2013] 5. Brennan CW, Verhaak RGW, McKenna A, et al: The somatic genomic landscape of glioblastoma. Cell 155:462-477, 2013 [Erratum: Cell 157:753, 2014] 6. Brat DJ, Verhaak RGW, Aldape KD, et al: Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N Engl J Med 372:2481-2498, 2015 7. Bakas S, Akbari H, Sotiras A, et al: Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci Data 4:170117, 2017 8. Cancer Imaging Archive: VASARI Research Project. https://wiki.cancerimagingarchive.net/display/Public/VASARI+Research+Project 9. Cancer Imaging Archive: TCGA Glioma Phenotype Research Group. https://wiki.cancerimagingarchive.net/display/Public/TCGA+Glioma+Phenotype+Research+Group 10. Patel AP, Tirosh I, Trombetta JJ, et al: Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344:1396-1401, 2014 11. Tirosh I, Venteicher AS, Hebert C, et al: Single-cell RNA-seq supports a developmental hierarchy in human oligodendroglioma. Nature 539:309-313, 2016 12. Filbin MG, Tirosh I, Hovestadt V, et al: Developmental and oncogenic programs in H3K27M gliomas dissected by single-cell RNA-seq. Science 360:331-335, 13. Puchalski RB, Shah N, Miller J, et al: An anatomic transcriptional atlas of human glioblastoma. Science 360:660-663, 2018 14. IvyGAP: Ivy Glioblastoma Atlas Project, 2019. https://glioblastoma.alleninstitute.org 15. IvyGAP: Data download. https://glioblastoma.alleninstitute.org/static/download.html 16. Rutman AM, Kuo MD: Radiogenomics: Creating a link between molecular diagnostics and diagnostic imaging. Eur J Radiol 70:232-241, 2009 17. Segal E, Sirlin CB, Ooi C, et al: Decoding global gene expression programs in liver cancer by noninvasive imaging. Nat Biotechnol 25:675-680, 2007 18. Diehn M, Nardini C, Wang DS, et al: Identification of noninvasive imaging surrogates for brain tumor gene-expression modules. Proc Natl Acad Sci U S A 105:5213-5218, 2008 19. Gevaert O, Xu J, Hoang CD, et al: Non-small cell lung cancer: Identifying prognostic imaging biomarkers by leveraging public gene expression microarray data-- methods and preliminary results. Radiology 264:387-396, 2012 20. Itakura H, Achrol AS, Mitchell LA, et al: Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular pathway activities. Sci Transl Med 7:303ra138, 2015 21. Cheerla A, Gevaert O: Deep learning with multimodal representation for pancancer prognosis prediction. Bioinformatics 35:i446-i454, 2019 22. Momeni A, Thibault M, Gevaert O: Dropout-enabled ensemble learning for multi-scale biomedical data. in Crimi A, Bakas S, Kuijf H, et al (eds): Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. Cham, Switzerland, Springer International Publishing, 2019, pp 407-415 23. Broad Institute: The *AMARETTO framework for network biology and medicine: Linking disease, drivers, targets and drugs via graph-based fusion of multi- omics, clinical, imaging, and perturbation data. http://portals.broadinstitute.org/pochetlab/amaretto.html 24. GitHub: broadinstitute/ImagingAMARETTO. https://github.com/broadinstitute/ImagingAMARETTO 25. Google Colaboratory: Imaging-AMARETTO: An imaging genomics software tool to systematically interrogate multi-omics networks for relevance to radiography and histopathology imaging biomarkers of clinical outcomes with application to studies of brain tumors. https://colab.research.google.com/drive/14u1KZJ3Gf- 9qjDycyBKzBiN5VzzOa2xU#scrollTo=LujO14znmO0J 26. Stubbs BJ, Gopaulakrishnan S, Glass K, et al: TFutils: Data structures for transcription factor bioinformatics. F1000Res 8:152, 2019 27. Carey V, Gopaulakrishnan S: TFutils: Bioconductor version release (3.10). https://bioconductor.org/packages/TFutils 28. Subramanian A, Tamayo P, Mootha VK, et al: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102:15545-15550, 2005 29. Liberzon A, Birger C, Thorvaldsdottir H, et al: The Molecular Signatures Database (MSigDB) Hallmark gene set collection. Cell Syst 1:417-425, 2015 30. COSMIC: COSMIC, the Catalogue of Somatic Mutations in Cancer. https://cancer.sanger.ac.uk/cosmic 31. Mermel CH, Schumacher SE, Hill B, et al: GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol 12:R41, 2011 32. Cedoz PL, Prunello M, Brennan K, et al: MethylMix 2.0: An R package for identifying DNA methylation genes. Bioinformatics 34:3044-3046, 2018 33. Gevaert O, Tibshirani R, Plevritis SK: Pancancer analysis of DNA methylation-driven genes using MethylMix. Genome Biol 16:17, 2015 34. Champion M, Brennan K, Croonenborghs T, et al: Module analysis captures pancancer genetically and epigenetically deregulated cancer driver genes for smoking and antiviral response. EBioMedicine 27:156-166, 2018 35. Shinde J, Everaert C, Bakr S, et al: AMARETTO: Regulatory network inference and driver gene evaluation using integrative multi-omics analysis and penalized regression: Bioconductor version release 3.10, 2019. https://bioconductor.org/packages/AMARETTO 36. Gevaert O, Villalobos V, Sikic BI, et al: Identification of ovarian cancer driver genes by using module network integration of multi-omics data. Interface Focus 3: 20130013, 2013 [Erratum Interface Focus 4:20140023, 2014] 37. Gevaert O, Plevritis S: Identifying master regulators of cancer and their downstream targets by integrating genomic and epigenomic features. Pac Symp Biocomput 123-134, 2013 38. Zou H, Hastie T: Regularization and variable selection via the elastic net. J R Stat Soc Series B Stat Methodol 67:301-320, 2005 39. Newman AM, Liu CL, Green MR, et al: Robust enumeration of cell subsets from tissue expression profiles. Nat Methods 12:453-457, 2015 40. Stanford University: CIBERSORT. https://cibersort.stanford.edu 41. Ben-Porath I, Thomson MW, Carey VJ, et al: An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat Genet 40:499-507, 2008 42. Marson A, Levine SS, Cole MF, et al: Connecting microRNA genes to the core transcriptional regulatory circuitry of embryonic stem cells. Cell 134:521-533, 43. Kim J, Woo AJ, Chu J, et al: A Myc network accounts for similarities between embryonic stem and cancer cell transcription programs. Cell 143:313-324, 2010 44. Newman MEJ, Girvan M: Finding and evaluating community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys 69:026113, 2004 45. Baryawno N, Przybylski D, Kowalczyk MS, et al: A cellular taxonomy of the bone marrow stroma in homeostasis and leukemia. Cell 177:1915-1932.e16, 2019 46. ENCODE Project Consortium: The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306:636-640, 2004 47. Lachmann A, Xu H, Krishnan J, et al: ChEA: Transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics 26:2438-2444, 2010 48. Rouillard AD, Gundersen GW, Fernandez NF, et al: The Harmonizome: A collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database (Oxford) 2016:baw100, 2016 JCO Clinical Cancer Informatics 433 Gevaert et al 49. Lamb J, Crawford ED, Peck D, et al: The Connectivity Map: Using gene-expression signatures to connect small molecules, genes, and disease. Science 313:1929-1935, 2006 50. Subramanian A, Narayan R, Corsello SM, et al: A next generation Connectivity Map: L1000 platform and the first 1,000,000 profiles. Cell 171:1437-1452.e17, 51. Verhaak RGW, Hoadley KA, Purdom E, et al: Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17:98-110, 2010 52. Gevaert O, Mitchell LA, Achrol AS, et al: Glioblastoma multiforme: Exploratory radiogenomic analysis by using quantitative image features. Radiology 273:168-174, 2014 53. Liu TT, Achrol AS, Mitchell LA, et al: Magnetic resonance perfusion image features uncover an angiogenic subgroup of glioblastoma patients with poor survival and better response to antiangiogenic treatment. Neuro-oncol 19:997-1007, 2017 54. Nicolasjilwan M, Hu Y, Yan C, et al: Addition of MR imaging features and genetic biomarkers strengthens glioblastoma survival prediction in TCGA patients. J Neuroradiol 42:212-221, 2015 55. NDEx: Welcome to the Network Data Exchange. https://home.ndexbio.org 56. GenePattern: AMARETTO supporting data files. https://datasets.genepattern.org/?prefix=data/module_support_files/Amaretto 57. Carey V: ivygapSE: A SummarizedExperiment for Ivy-GAP data: Bioconductor version release 3.10, 2019. https://bioconductor.org/packages/ivygapSE 58. Sevenich L: Brain-resident microglia and blood-borne macrophages orchestrate central nervous system inflammation in neurodegenerative disorders and brain cancer. Front Immunol 9:697, 2018 59. Takenaka MC, Gabriely G, Rothhammer V, et al: Control of tumor-associated macrophages and T cells in glioblastoma via AHR and CD39. Nat Neurosci 22: 729-740, 2019 [Erratum: Nat Neurosci 22:1533, 2019] 60. Krichevsky AM, Uhlmann EJ: Oligonucleotide therapeutics as a new class of drugs for malignant brain tumors: Targeting mRNAs, regulatory RNAs, mutations, combinations, and beyond. Neurotherapeutics 16:319-347, 2019 61. Offer S, Menard JA, Perez JE, et al: Extracellular lipid loading augments hypoxic paracrine signaling and promotes glioma angiogenesis and macrophage infiltration. J Exp Clin Cancer Res 38:241, 2019 62. Daubon T, Leon C, Clarke K, et al: Deciphering the complex role of thrombospondin-1 in glioblastoma development. Nat Commun 10:1146, 2019 63. Suva` ML, Rheinbay E, Gillespie SM, et al: Reconstructing and reprogramming the tumor-propagating potential of glioblastoma stem-like cells. Cell 157:580-594, 2014 64. Ceccarelli M, Barthel FP, Malta TM, et al: Molecular profiling reveals biologically discrete subsets and pathways of progression in diffuse glioma. Cell 164:550-563, 2016 65. Gao L, Huang S, Zhang H, et al: Suppression of glioblastoma by a drug cocktail reprogramming tumor cells into neuronal like cells. Sci Rep 9:3462, 2019 [Erratum: Sci Rep 10:2971, 2020] 66. Yuan J, Zhang F, Hallahan D, et al: Reprogramming glioblastoma multiforme cells into neurons by protein kinase inhibitors. J Exp Clin Cancer Res 37:181, 2018 67. Zhou Y, Wu S, Liang C, et al: Transcriptional upregulation of microtubule-associated protein 2 is involved in the protein kinase A-induced decrease in the invasiveness of glioma cells. Neuro-oncol 17:1578-1588, 2015 68. Yi R, Feng J, Yang S, et al: miR-484/MAP2/c-Myc-positive regulatory loop in glioma promotes tumor-initiating properties through ERK1/2 signaling. J Mol Histol 49:209-218, 2018 69. Emmert-Streib F, Dehmer M, Haibe-Kains B: Gene regulatory networks and their applications: Understanding biological and medical problems in terms of networks. Front Cell Dev Biol 2:38, 2014 70. Wooden B, Goossens N, Hoshida Y, et al: Using big data to discover diagnostics and therapeutics for gastrointestinal and liver diseases. Gastroenterology 152:53-67.e3, 2017 71. Imaging-AMARETTO HTML report of module 125 from the TCGA LGG cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_LGG/AMARETTOhtmls/modules/module125.html 72. Imaging-AMARETTO HTML report of module 64 from the IvyGAP GBM cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/Ivygap_GBM/AMARETTOhtmls/modules/module64.html 73. Imaging-AMARETTO HTML report of module 91 from the TCGA LGG cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_LGG/AMARETTOhtmls/modules/module91.html 74. Imaging-AMARETTO HTML report of module 75 from the TCGA GBM cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_GBM/AMARETTOhtmls/modules/module75.html 75. Imaging-AMARETTO HTML report of module 98 from the TCGA GBM cohort. http://portals.broadinstitute.org/pochetlab/JCO_CCI_Imaging-AMARETTO/ Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA-LGG/TCGA_GBM/AMARETTOhtmls/modules/module98.html nn n 434 © 2020 by American Society of Clinical Oncology Imaging-AMARETTO: A Software Tool for Imaging Genomics in Cancer APPENDIX Communities 1 11 21 31 2 12 22 32 3 13 23 33 4 14 24 34 5 15 25 35 6 16 26 36 7 17 27 37 8 18 28 38 9 19 29 10 20 30 C2 C5 C1 Networks TCGA GBM cohort TCGA LGG cohort IvyGAP GBM cohort Stemness signatures Immune signatures FIG A1. Imaging-Community-AMARETTO identifies known drivers AHR, STAT3, CCR2,and OLIG2 and uncovers novel master drivers THBS1 and MAP2 that link distinct key mechanisms that underlie glioma. In this Imaging-Community- AMARETTO graph, the nodes represent the regulatory modules or circuits that are learned from the three studies of brain cancer (The Cancer Genome Atlas [TCGA] glioblastoma multiforme [GBM], Ivy Glioblastoma Project [IvyGAP] GBM, and TCGA low-grade glioma [LGG]; node sizes are scaled by the number of driver and target genes in the modules), the edges represent the extent of overlapping genes between the modules across the three cohorts (edge thickness is scaled with the significance of the overlapping genes between modules), and the clouds represent how the modules across the three cohorts are grouped into the communities or subnetworks that are learned using the Girvan-Newman edge betweenness community detection algorithm. Imaging-Community-AMARETTO organized modules regulated by known drivers of tumor-associated microglia and macrophage mechanisms STAT3, AHR,and CCR2 and neurodevelopmental and stemness mechanism OLIG2 into three communities. Community 5 (C5) links STAT3, AHR,and CCR2, and C1 links AHR and CCR2 as activators of shared modules. Modules regulated by OLIG2 are represented in C2 with OLIG2 as activator of its modules. Of note, C2 also contains a module that links OLIG2 and CCR2 co-acting as repressor and activator, respectively (Fig 6). Imaging-Community-AMARETTO also uncovered THBS1 and MAP2 as novel master drivers across the three STAT3, AHR, CCR2,and OLIG2 communities that provide new insights into how these distinct key mechanisms are linked in glioma. THBS1 is an activator driver of three modules in C1 and C5, and a repressor driver of three modules in C2. MAP2 is a repressor driver of three modules in C1 and C5 and an activator driver of six modules in C2 (except repressor of TCGA GBM module 98). Taken together, C1 links AHR and THBS1 (TCGA GBM module 79). C2 links OLIG2, MAP2, and THBS1 (TCGA GBM module 75; Fig 5); OLIG2 and MAP2 (IvyGAP GBM module 38, TCGA GBM module 61); CCR2, OLIG2,and MAP2 (TCGA GBM module 98; Fig 6). C5 links CCR2, AHR,and THBS1 (IvyGAP GBM module 64; Fig 3)and AHR and STAT3 (TCGA LGG module 125; Fig 2). Using genetic knockdown experiments of THBS1 from LINCS/Connectivity Map, we confirmed that THBS1 acts as activator and repressor drivers of its targets in IvyGAP GBM module 64 (Fig 3) and TCGA GBM module 75 (Fig 5), respectively (http://portals.broadinstitute.org/ pochetlab/JCO_CCI_Imaging-AMARETTO/Imaging-AMARETTO_HTML_Report_TCGA-GBM_IVYGAP-GBM_TCGA- LGG/; www.ndexbio.org/#/network/16820740-d7ea-11e9-bb65-0ac135e8bacf). JCO Clinical Cancer Informatics 435

Journal

JCO Clinical Cancer InformaticsWolters Kluwer Health

Published: May 8, 2020

References