Integrating ontologies of human diseases, phenotypes, and radiological diagnosis

Abstract

Mappings between ontologies enable reuse and interoperability of biomedical knowledge. The Radiology Gamuts Ontology (RGO), an ontology of 16 918 diseases, interventions, and imaging observations, provides a resource for differential diagnosis and automated textual report understanding in radiology. An automated process with subsequent manual review was used to identify exact and partial matches of RGO entities to the Disease Ontology (DO) and the Human Phenotype Ontology (HPO). Exact mappings identified equivalent concepts; partial mappings identified subclass and superclass relationships. A total of 7913 distinct RGO entities (46.8%) were mapped to one or both of the two target ontologies. Integration of RGO’s causal knowledge resulted in 9605 axioms that expressed direct causal relationships between DO diseases and HPO phenotypic abnormalities, and allowed one to formulate queries about causal relations using the abstraction properties in those two ontologies. The mappings can be used to support automated diagnostic reasoning, data mining, and knowledge discovery.

Keywords: biomedical ontologies, differential diagnosis, radiology, systems integration, knowledge representation

BACKGROUND AND SIGNIFICANCE

Biomedical ontologies express knowledge in a human-readable and machine-computable form [1]. A key strength of the ontology framework is the ability to join related ontologies by mapping identical or related terms. Such mappings enable systems to integrate each ontology’s content to create a broader, more general resource that can be applied across domains. In that way, the knowledge expressed in the ontologies can be shared and reused across domains.

Radiology Gamuts Ontology

Clinical imaging plays a central role in medical diagnosis and treatment.
Imaging findings represent part of an individual’s phenotype, and can characterize genomic, epigenomic, and gene-expression patterns that enable targeted therapies under the rubric of precision medicine. In radiology, a key area for knowledge is the set of relationships between imaging findings and diagnoses. To that end, the Radiology Gamuts Ontology (RGO) provides a formal representation of differential diagnosis in radiology [2]. RGO comprises a large set of classes that include disorders (eg Stevens-Johnson syndrome; RGO:13884), interventions (eg talc pleurodesis; RGO:25171), and imaging manifestations (eg pericardial effusion; RGO:3706). RGO includes the subsumption (is_a) relation to define a conventional hierarchy between more specific and more general concepts. The ontology’s hierarchy is relatively “flat”: it includes 1782 is_a relations, and most of its entities are subclasses of the top-level Entity class. RGO specifies a causal (may_cause) relation and its inverse (may_be_caused_by) to relate entities to other disorders and to imaging manifestations. RGO’s causal relation does not equate to logical implication, nor should one consider an observation’s specified causes to be exhaustive. RGO thus supports “open-world” inference, while still providing useful information to link diagnoses and imaging observations.

To improve the ability to automate reasoning about diseases and their manifestations in medical imaging examinations, we sought to map RGO concepts to ontologies that organize and characterize human diseases and the manifestations of those diseases.

Disease Ontology

The Disease Ontology (DO) provides an extensive, open-source vocabulary of human disease to enable integration of disease-associated biomedical data [3,4]. The ontology is organized hierarchically: its top-level class, disease (DOID:0000004), has 8 primary subclasses, such as disease by infectious agent (DOID:0050117).
Subsumption (is_a) relations specify disease subclasses within the hierarchy. DO’s classes can have more than one parent; for example, thyroid cancer (DOID:0001781) has parents endocrine gland cancer (DOID:0000170) and thyroid gland disease (DOID:0000050). DO has been used to unify disease annotations across model organisms and improve interoperability between biological and clinical human disease-related data [5,6]. The ontology also has been applied to identify the proteins related to diseases through analysis of the number of interactions with other proteins and the disease relationships of those proteins [7,8]. DO’s disease classification has provided a framework to identify and define relationships between diseases and phenotypes, genotypes, and various other disease attributes from other ontologies [4].

Human Phenotype Ontology

The Human Phenotype Ontology (HPO) provides a structured and controlled vocabulary for the phenotypic features of hereditary, congenital, and acquired diseases [9–12]. HPO focused initially on monogenic diseases, but now includes features of more than 3400 common disorders [13]. HPO’s top-level class All (HP:0000001) has 5 subclasses: clinical modifier, mode of inheritance, frequency, mortality/aging, and phenotypic abnormality (HP:0000118), the last of which serves as the top-level node of the phenotypic abnormality subontology. HPO classes can have more than one parent; for example, gout (HP:0001997) has superclasses arthritis (HP:0001369) and hyperuricemia (HP:0002149).
Some HPO classes incorporate logical definitions using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains [11]. HPO has been applied to increase interoperability of phenotype information in rare diseases [14], to link and interpret the phenotypic abnormalities that result from genomic variation [11], and to extract phenotype concepts from electronic health record texts to identify potential causal genes [15]. HPO supports formation of differential diagnoses based on matching of clinical information to phenotypic abnormalities at varying levels of specificity in the ontology’s hierarchy [10].

MATERIALS AND METHODS

We explored potential relationships from RGO to DO and HPO. At the time of this study, RGO (version 0.7; release 2018-02-01) incorporated 16 918 classes (plus 2954 synonyms), 1782 subsumption relations, and 55 569 causal relations. DO (release 2017-11-28; uploaded to NCBO BioPortal 6 February 2018) contained 12 498 classes, of which 3799 were flagged as obsolete. HPO (release 2018-07-25) contained 17 261 classes, of which 3655 were flagged as obsolete. The current work excluded obsolete classes, and considered the 8699 active DO classes and 13 606 active HPO classes. For the purposes of our analysis, we defined the “primary subclasses” as the 8 direct subclasses of the top-level DO entity disease (DOID:0000004) and the 25 direct subclasses of the HPO entity phenotypic abnormality (HP:0000118).

All of the ontologies were accessed through the National Center for Biomedical Ontology (NCBO) BioPortal web site (http://bioportal.bioontology.org/). NCBO Annotator [16,17] was used to annotate all of the RGO terms with their longest matching strings in DO and HPO, including synonyms; the software was accessed through NCBO BioPortal web services [18,19]. Matches were reviewed manually. We tallied the number and types of mappings from RGO to DO and HPO.
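
The annotation step described above could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the endpoint URL, the ontology acronyms DOID and HP, and the helper name build_annotator_url are assumptions that should be checked against the current BioPortal documentation, and an API key is required to actually send the request.

```python
from urllib.parse import urlencode

# Assumed REST endpoint of the NCBO Annotator; not stated in the article.
ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def build_annotator_url(term, api_key, ontologies=("DOID", "HP")):
    """Build a request URL asking for the longest match of one RGO term.

    build_annotator_url is a hypothetical helper; the acronyms DOID and HP
    are assumed to be the BioPortal names of DO and HPO.
    """
    params = {
        "text": term,
        "ontologies": ",".join(ontologies),
        "longest_only": "true",       # keep only the longest matching string
        "exclude_synonyms": "false",  # let synonyms match as well
        "apikey": api_key,
    }
    return f"{ANNOTATOR_URL}?{urlencode(params)}"


# Example with an RGO label quoted in the article and a placeholder key.
url = build_annotator_url("pericardial effusion", "YOUR-API-KEY")
```

The sketch only builds the URL so that it runs without network access; fetching it and parsing the JSON response, followed by manual review of each candidate, would complete the step.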
We identified the primary subclasses of each mapped RGO term, and tallied the distribution of RGO entities into those subclasses. To demonstrate the impact of the mappings, we computed the number of axioms linking DO and HPO terms through RGO’s causal relation, and provided examples to show how the integration can support inference across the ontologies.

RESULTS

Mapping of terms

NCBO Annotator identified 7143 candidate mappings of RGO entities to DO and 8398 candidate mappings to HPO. Manual review changed a number of the candidate mappings, as the following examples show. The RGO term cleft hand or foot (RGO:12205) was mapped initially to “cleft hand” as a synonym of ectrodactyly (HP:0100257), which has subclasses of split hand (HP:0001171) and split foot (HP:0001839). The RGO term was determined to be equivalent to the HPO term ectrodactyly. Similarly, carpal and/or tarsal fusion (RGO:12199) was matched partially to the HPO term “tarsal fusion” (tarsal synostosis; HP:0008368), but was determined to be equivalent to that term’s parent, synostosis of carpals/tarsals (HP:0100266). Although gastric volvulus (RGO:3751) mapped originally to volvulus (HP:0002580), HPO defines volvulus as an “Abnormal twisting of a portion of intestine…”; thus, the RGO term was mapped instead as a subclass of the HPO term abnormality of the stomach (HP:0002577). The RGO term vocal cord paralysis or paresis (RGO:9427) was mapped automatically to vocal cord paralysis (HP:0001605). On review, the RGO entity was mapped to that term and to vocal cord paresis (HP:0001604) with inverse_is_a relations.

Overall, 5485 RGO terms (32.4%) were mapped to DO terms and 4819 RGO terms (28.5%) were mapped to HPO terms; a total of 7913 distinct RGO entities (46.8%) were mapped to one or both of the 2 target ontologies. From RGO to DO, there were 2104 equivalent concepts, 3357 subclass (is_a) relations, and 24 superclass (inverse_is_a) relations.
From RGO to HPO, there were 2057 equivalent concepts, 2723 subclass relations, and 39 superclass relations.

We identified the “ancestors” of RGO classes in the DO and HPO hierarchies (Tables 1 and 2). For example, cervical lymphadenopathy (RGO:13) was mapped to the equivalent HPO term (HP:0025289). The hierarchy of HPO classes allows one to follow the subsumption relations to identify its ancestor, abnormality of the immune system (HP:0002715), which is a subclass of phenotypic abnormality.

Table 1. Distribution of mapped Radiology Gamuts Ontology (RGO) entities and Disease Ontology (DO) entities by top-level disease subclass. Because some DO entities have two or more parents, DO classes may have more than one ancestor, and hence the total number of ancestors is greater than the number of classes. Note that 1030 RGO classes, such as Cantrell syndrome (RGO:921), were mapped as subclasses of the DO class syndrome (DOID:225)

Disease Subclass | Mapped RGO Classes | All DO Classes
disease by infectious agent (DOID:0050117) | 161 | 439
disease of anatomical entity (DOID:7) | 2757 | 5747
disease of cellular proliferation (DOID:14566) | 1101 | 2632
disease of mental health (DOID:150) | 33 | 334
disease of metabolism (DOID:0014667) | 191 | 445
genetic disease (DOID:630) | 224 | 529
physical disorder (DOID:0080015) | 82 | 53
syndrome (DOID:225) | 1107 | 146
Total | 5656 | 10 325

Table 2. Distribution of Human Phenotype Ontology entities by top-level phenotypic abnormality subclass. Some HPO entities have two or more parents; hence, classes may have multiple ancestors

Phenotypic Abnormality Subclass | Mapped RGO Classes | All HPO Classes
abnormal cellular phenotype (HP:0025354) | 0 | 49
abnormality of blood and blood-forming tissues (HP:0001871) | 163 | 563
abnormality of connective tissue (HP:0003549) | 160 | 204
abnormality of head or neck (HP:0000152) | 400 | 1323
abnormality of limbs (HP:0040064) | 278 | 2734
abnormality of metabolism/homeostasis (HP:0001939) | 214 | 1007
abnormality of prenatal development or birth (HP:0001197) | 34 | 135
abnormality of the breast (HP:0000769) | 13 | 32
abnormality of the cardiovascular system (HP:0001626) | 532 | 1111
abnormality of the digestive system (HP:0025031) | 426 | 596
abnormality of the ear (HP:0000598) | 53 | 292
abnormality of the endocrine system (HP:0000818) | 126 | 398
abnormality of the eye (HP:0000478) | 127 | 1102
abnormality of the genitourinary system (HP:0000119) | 359 | 851
abnormality of the immune system (HP:0002715) | 414 | 633
abnormality of the integument (HP:0001574) | 220 | 886
abnormality of the musculature (HP:0003011) | 125 | 592
abnormality of the nervous system (HP:0000707) | 543 | 1737
abnormality of the respiratory system (HP:0002086) | 289 | 377
abnormality of the skeletal system (HP:0000924) | 1030 | 3615
abnormality of the thoracic cavity (HP:0045027) | 3 | 4
abnormality of the voice (HP:0001608) | 4 | 29
constitutional symptom (HP:0025142) | 14 | 69
growth abnormality (HP:0001507) | 57 | 96
neoplasm (HP:0002664) | 985 | 584
Total | 6569 | 19 019

Integration across ontologies

Through RGO’s causal relationships, we computed 9605 axioms that posited direct causal relationships between a disease (DO class) and a phenotypic abnormality (HPO class).
Figure 1 illustrates the three distinct types of axioms that could be posited, all of which incorporated a causal relation between 2 RGO entities. Most of the axioms (n = 7058, type A) linked a disease and a phenotypic abnormality by equivalence relations to the causally linked RGO entities. For example, through their equivalence relations to corresponding RGO terms, one can posit that angiosarcoma (DOID:0001816) may cause splenomegaly (HP:0001744). An additional 2545 axioms (type B) involved an equivalence relation from a disease to an RGO entity, and an is_a relation from the second RGO entity to the phenotypic abnormality. There were 2 axioms of type C, which included is_a relations from diseases tricuspid atresia (DOID:0080169) and tricuspid valve stenosis (DOID:4078) to tricuspid atresia or stenosis (RGO:2336). The direct DO-RGO-HPO causal relationships involved 1062 distinct diseases (DO classes) and 611 distinct phenotypic abnormalities (HPO classes). Example axioms are shown in Supplementary Appendix A.

Figure 1. Examples of causal mappings from DO to HPO classes, based on equivalence or subsumption mappings from DO to RGO and from RGO to HPO. DO (“Disease”), RGO (“Gamuts Entity”), and HPO (“Phenotypic Abnormality”) classes are indicated by yellow (framed), white, and blue (unframed) rectangles, respectively. There were 7058 type A, 2545 type B, and 2 type C mappings.
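
The three axiom types can be illustrated with a small sketch. It is not the authors' implementation: the function name derive_axioms, the tuple layout, and every identifier containing "hyp" are hypothetical, and the handling of the phenotype side of type C axioms is an assumption, since the text only describes their disease side.

```python
from collections import defaultdict


def derive_axioms(do_maps, hpo_maps, may_cause):
    """Compose DO-to-HPO causal axioms through RGO (illustrative only).

    do_maps / hpo_maps: (rgo_id, target_id, relation) triples, where relation
    is "equivalent", "is_a" (RGO is_a target) or "inverse_is_a" (target is_a RGO).
    may_cause: (cause_rgo_id, effect_rgo_id) pairs.
    Returns a set of (disease_id, phenotype_id, axiom_type) triples.
    """
    diseases = defaultdict(list)
    for rgo, doid, rel in do_maps:
        # A disease inherits the cause role only if it equals the RGO
        # entity or is a subclass of it.
        if rel in ("equivalent", "inverse_is_a"):
            diseases[rgo].append((doid, rel))

    phenotypes = defaultdict(list)
    for rgo, hp, rel in hpo_maps:
        # The effect generalizes only to an equal or broader HPO class.
        if rel in ("equivalent", "is_a"):
            phenotypes[rgo].append((hp, rel))

    axioms = set()
    for cause, effect in may_cause:
        for doid, d_rel in diseases.get(cause, []):
            for hp, p_rel in phenotypes.get(effect, []):
                if d_rel == "inverse_is_a":
                    kind = "C"  # disease is_a RGO entity (assumed labeling)
                elif p_rel == "is_a":
                    kind = "B"  # equivalence, then RGO is_a phenotype
                else:
                    kind = "A"  # equivalence on both sides
                axioms.add((doid, hp, kind))
    return axioms


# Toy data: DO, HPO and RGO:2336 IDs come from the text; "hyp" IDs are made up.
do_maps = [
    ("RGO:hyp-angiosarcoma", "DOID:0001816", "equivalent"),
    ("RGO:2336", "DOID:0080169", "inverse_is_a"),
    ("RGO:2336", "DOID:4078", "inverse_is_a"),
]
hpo_maps = [
    ("RGO:hyp-splenomegaly", "HP:0001744", "equivalent"),
    ("RGO:hyp-finding", "HP:hyp-broader-class", "is_a"),
]
may_cause = [
    ("RGO:hyp-angiosarcoma", "RGO:hyp-splenomegaly"),
    ("RGO:hyp-angiosarcoma", "RGO:hyp-finding"),
    ("RGO:2336", "RGO:hyp-finding"),
]
axioms = derive_axioms(do_maps, hpo_maps, may_cause)
```

Mappings in the opposite direction (an RGO entity that is narrower than the disease, or broader than the phenotype) are deliberately skipped here, because a causal statement about a narrower class does not carry over to its parent.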
One also can apply the transitive property of RGO’s may_cause relation to identify entities with indirect causal links [20]. Although transitivity can be applied exhaustively, such relationships typically have little clinical relevance; it is, however, reasonable to consider axioms involving one or two causal relations. Applying only the is_a relations and at most two causal relations in RGO (ie without the is_a relations within DO and HPO), one can express 28 536 “indirect” causal axioms from DO to HPO classes. For example:

    Crohn disease (DOID:0008778) = Crohn disease (RGO:34)
    may_cause osteomalacia (RGO:32782)
    may_cause protrusio acetabuli (RGO:12478) = protrusio acetabuli (HP:0003179)

The integration also allows one to formulate queries of varying levels of abstraction within the DO and HPO class hierarchies. Figure 2 illustrates the axiom scleroderma (DOID:0000418) may_cause splenomegaly (HP:0001744) and displays the higher-level DO and HPO classes. As shown in the figure, scleroderma has 1 DO primary subclass: disease of anatomical entity (DOID:0000007). The HPO term splenomegaly has 2 primary subclasses: abnormality of the digestive system (HP:0025031) and abnormality of the immune system (HP:0002715). The value of abstraction is that one can pose queries such as, “Which musculoskeletal system disease may cause an abnormality of the digestive system?”

Figure 2. An example mapping shows how the mappings enable RGO’s causal knowledge to link DO and HPO classes, in this case, from scleroderma to splenomegaly. One can use the subsumption hierarchies of DO and HPO to pose sophisticated queries. For example, the causal relationship from scleroderma to splenomegaly satisfies a query such as, “Which musculoskeletal system disease(s) may cause an abnormality of the digestive system?”

DISCUSSION

RGO, a reference knowledge source for differential diagnosis in radiology, is available through NCBO BioPortal (https://bioportal.bioontology.org/ontologies/GAMUTS) and is planned to be updated annually. Incorporation of prior and conditional probability data into RGO is underway to support clinical diagnostic reasoning. Ontology-based methods can guide named-entity recognition in clinical texts such as clinic notes and radiology reports. Our group has extracted data from 1.7 million radiology reports to determine conditional probability of the entire range of RGO entities. The prevalence and relative likelihood of diseases based on the patient’s imaging findings allow one to determine the most likely diagnoses, and to use the value of information to select the most informative questions to ask. RGO has been integrated with the Orphanet Rare Disease Ontology (ORDO) to support the use of RGO for differential diagnosis of rare disease [21].

The mappings allow one to reason more generally over the ontologies. As described above, one can query which musculoskeletal system disease (a DO subclass) may cause abnormality of the digestive system (an HPO class). The mappings thus enable more complex and robust queries over the ontologies, and allow one to use the ontologies’ knowledge to answer sophisticated questions. The mapped ontologies also provide additional terms to support data mining from textual information in the electronic health record.
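
The indirect causal chains and the abstraction queries discussed above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' method: the function names are hypothetical, the chain search ignores RGO's own is_a relations, the hierarchies are toy fragments with direct parent links, and the musculoskeletal class is given a placeholder ID because the text does not state its DO identifier.

```python
def causal_chains(start, may_cause, max_links=2):
    """Return every may_cause path with 1..max_links links that begins at start."""
    found, frontier = [], [[start]]
    for _ in range(max_links):
        frontier = [
            path + [effect]
            for path in frontier
            for effect in may_cause.get(path[-1], ())
            if effect not in path  # skip cycles
        ]
        found.extend(frontier)
    return found


def ancestors(cls, parents):
    """Return cls plus every class reachable from it through is_a links."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def causes_within(axioms, disease_class, phenotype_class, do_parents, hpo_parents):
    """Select (disease, phenotype) axioms whose ends fall under the given classes."""
    return sorted(
        (d, p)
        for d, p in axioms
        if disease_class in ancestors(d, do_parents)
        and phenotype_class in ancestors(p, hpo_parents)
    )


# Chain quoted in the text: Crohn disease -> osteomalacia -> protrusio acetabuli.
rgo_may_cause = {"RGO:34": ["RGO:32782"], "RGO:32782": ["RGO:12478"]}
chains = causal_chains("RGO:34", rgo_may_cause)

# Toy hierarchies; MSK is a placeholder, not a real DO identifier.
MSK = "DOID:hyp-musculoskeletal-system-disease"
do_parents = {"DOID:0000418": [MSK], MSK: ["DOID:0000007"]}
hpo_parents = {"HP:0001744": ["HP:0025031", "HP:0002715"]}
axioms = {("DOID:0000418", "HP:0001744")}
hits = causes_within(axioms, MSK, "HP:0025031", do_parents, hpo_parents)
```

Capping the search at two links mirrors the article's remark that longer transitive chains tend to have little clinical relevance; the limit is exposed as max_links so it can be varied.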
The growing number and size of biomedical ontologies have engendered interest in merging and integrating content of related ontologies [22–24]. Automated approaches have been developed to take advantage of concepts’ hierarchical relationships to guide the process of ontology alignment and integration, including tools such as OPTIMA [25] and PROMPT [26]. Because of RGO’s predominantly “flat” architecture (15 242, or 90.1%, of its 16 918 entities are direct subclasses of the top-level class), the investigators chose not to use such systems to merge or align the ontologies.

The process of mapping the RGO entities to the HPO and DO concepts using the automated NCBO Annotator web service demonstrated the need for manual review of the partial matches. Decisions regarding how best to describe the relationship between an RGO entity and an HPO or DO concept required careful review of the ontology’s concept and occasional pursuit of textual definitions. For example, the RGO entity bronchial chondromalacia (RGO:32677) was identified initially as an inexact match for chondromalacia (DOID:2557); because DO classifies chondromalacia as an articular cartilage disease, the authors identified the potential relation as a non-match. Similarly, the RGO entity dilatation of esophageal stricture (RGO:32918) was a partial match with dilatation (HP:0002617); this relation was also described as a non-match because HPO’s textual definition of dilatation is an “abnormal outpouching or sac-like dilatation in the wall of an artery, vein or the heart.”

Equivalence mappings between ontologies promote interoperability of the data they describe, but the ability to perform such mappings may be limited by differences in focus and granularity between the ontologies.
Mapping to equivalent concepts in the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) vocabulary has been identified for 30% of HPO classes [27] and 65% of DO classes [28]. RGO includes a large number of terms that are specific to imaging examinations, such as extrinsic impression on thoracic esophagus (RGO:3470) and lytic skeletal lesion (RGO:25747). Hence, mapping of 47% of RGO entities to DO and HPO is highly satisfactory. For ontologies where limited equivalence mappings can be identified, partial mappings can provide a “next-best approach for traversing between the two systems” [27].

CONCLUSION

The integration of genotypic and phenotypic knowledge across different sources of biomedical data, such as experimental and clinical data repositories, is increasingly critical to discover new treatments for both rare and common diseases [29]. Exact and partial mappings from RGO to DO and HPO allow one to compute axioms that express causal relationships between DO disease entities and HPO imaging phenotypes. Although an automated approach identified candidate mappings, manual review was essential to assure that the mappings were meaningful and semantically correct. The mappings allowed RGO concepts to be categorized within the hierarchies of disease and phenotypic abnormalities, which adds useful semantic structure to RGO terms, and enables one to posit axioms that link DO and HPO entities at multiple levels in their hierarchies. This information can support clinical diagnosis, data mining, and knowledge discovery.

FUNDING

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

CONTRIBUTORS

MTF and CEK performed the analysis and wrote the manuscript. RWF incorporated the analysis into the ontology. All authors reviewed and approved the manuscript.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.
Competing interests: None

REFERENCES

1. Bodenreider O. Biomedical ontologies in action: role in knowledge management, data integration and decision support. Yearb Med Inform 2008; 67–79.
2. Budovec JJ, Lam CA, Kahn CE Jr. Radiology Gamuts Ontology: differential diagnosis for the Semantic Web. RadioGraphics 2014; 34 (1): 254–64.
3. Schriml LM, Arze C, Nadendla S, et al. Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Res 2012; 40 (D1): D940–6.
4. Kibbe WA, Arze C, Felix V, et al. Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res 2015; 43 (D1): D1071–8.
5. Bello SM, Shimoyama M, Mitraka E, et al. Disease Ontology: improving and unifying disease annotations across species. Dis Model Mech 2018; 11 (3): dmm032839.
6. Schriml LM, Mitraka E. The Disease Ontology: fostering interoperability between biological and clinical human disease-related data. Mamm Genome 2015; 26 (9–10): 584–9.
7. Carson MB, Lu H. Network-based prediction and knowledge mining of disease genes. BMC Med Genomics 2015; 8 (Suppl 2): S9.
8. LePendu P, Musen MA, Shah NH. Enabling enrichment analysis with the Human Disease Ontology. J Biomed Inform 2011; 44 (Suppl 1): S31–8.
9. Robinson PN, Kohler S, Bauer S, Seelow D, Horn D, Mundlos S. The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet 2008; 83 (5): 610–5.
10. Robinson PN, Mundlos S. The Human Phenotype Ontology. Clin Genet 2010; 77 (6): 525–34.
11. Köhler S, Doelken SC, Mungall CJ, et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res 2014; 42 (D1): D966–74.
12. Köhler S, Vasilevsky NA, Engelstad M, et al. The Human Phenotype Ontology in 2017. Nucleic Acids Res 2017; 45 (D1): D865–76.
13. Groza T, Kohler S, Moldenhauer D, et al. The Human Phenotype Ontology: semantic unification of common and rare disease. Am J Hum Genet 2015; 97 (1): 111–24.
14. Maiella S, Olry A, Hanauer M, et al. Harmonising phenomics information for a better interoperability in the rare disease field. Eur J Med Genet 2018; 61 (11): 706–14.
15. Son JH, Xie G, Yuan C, et al. Deep phenotyping on electronic health records facilitates genetic diagnosis by clinical exomes. Am J Hum Genet 2018; 103 (1): 58–73.
16. Jonquet C, Shah NH, Musen MA. The open biomedical annotator. Summit Transl Bioinform 2009; 2009: 56–60.
17. Shah NH, Bhatia N, Jonquet C, Rubin D, Chiang AP, Musen MA. Comparison of concept recognizers for building the Open Biomedical Annotator. BMC Bioinformatics 2009; 10 (Suppl 9): S14.
18. Noy NF, Shah NH, Whetzel PL, et al. BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Res 2009; 37 (Web Server issue): W170–3.
19. Whetzel PL, Noy NF, Shah NH, et al. BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications. Nucleic Acids Res 2011; 39 (Web Server issue): W541–5.
20. Kahn CE Jr. Transitive closure of subsumption and causal relations in a large ontology for radiology diagnosis. J Biomed Inform 2016; 61: 27–33.
21. Kahn CE Jr. Integrating ontologies of rare diseases and radiological diagnosis. J Am Med Inform Assoc 2015; 22 (6): 1164–8.
22. Dragisic Z, Ivanova V, Li H, Lambrix P. Experiences from the anatomy track in the ontology alignment evaluation initiative. J Biomed Semantics 2017; 8 (1): 56.
23. Harrow I, Jimenez-Ruiz E, Splendiani A, et al. Matching disease and phenotype ontologies in the ontology alignment evaluation initiative. J Biomed Semantics 2017; 8 (1): 55.
24. Kolyvakis P, Kalousis A, Smith B, Kiritsis D. Biomedical ontology alignment: an approach based on representation learning. J Biomed Semantics 2018; 9 (1): 21.
25. Doshi P, Kolli R, Thomas C. Inexact matching of ontology graphs using expectation-maximization. Web Semantics 2009; 7 (2): 90–106.
26. Noy NF, Musen MA. The PROMPT suite: interactive tools for ontology merging and mapping. Int J Hum Comput Stud 2003; 59 (6): 983–1024.
27. Dhombres F, Bodenreider O. Interoperability between phenotypes in research and healthcare terminologies–Investigating partial mappings between HPO and SNOMED CT. J Biomed Semantics 2016; 7: 3.
28. Raje S, Bodenreider O. Interoperability of disease concepts in clinical and research ontologies: contrasting coverage and structure in the Disease Ontology and SNOMED CT. Stud Health Technol Inform 2017; 245: 925–9.
29. Denaxas SC. Integrating bio-ontologies and controlled clinical terminologies: from base pairs to bedside phenotypes. Methods Mol Biol 2017; 1446: 275–87.

© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Journal of the American Medical Informatics Association, Oxford University Press

Integrating ontologies of human diseases, phenotypes, and radiological diagnosis


Publisher
Oxford University Press
Copyright
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com
ISSN
1067-5027
eISSN
1527-974X
DOI
10.1093/jamia/ocy161

Abstract

Mappings between ontologies enable reuse and interoperability of biomedical knowledge. The Radiology Gamuts Ontology (RGO), an ontology of 16 918 diseases, interventions, and imaging observations, provides a resource for differential diagnosis and automated textual report understanding in radiology. An automated process with subsequent manual review was used to identify exact and partial matches of RGO entities to the Disease Ontology (DO) and the Human Phenotype Ontology (HPO). Exact mappings identified equivalent concepts; partial mappings identified subclass and superclass relationships. A total of 7913 distinct RGO entities (46.8%) were mapped to one or both of the two target ontologies. Integration of RGO’s causal knowledge resulted in 9605 axioms that expressed direct causal relationships between DO diseases and HPO phenotypic abnormalities, and allowed one to formulate queries about causal relations using the abstraction properties in those two ontologies. The mappings can be used to support automated diagnostic reasoning, data mining, and knowledge discovery.

Keywords: biomedical ontologies, differential diagnosis, radiology, systems integration, knowledge representation

BACKGROUND AND SIGNIFICANCE

Biomedical ontologies express knowledge in a human-readable and machine-computable form.1 A key strength of the ontology framework is the ability to join related ontologies by mapping identical or related terms. Such mappings enable systems to integrate each ontology’s content to create a broader, more general resource that can be applied across domains. In that way, the knowledge expressed in the ontologies can be shared and reused across domains.

Radiology Gamuts Ontology

Clinical imaging plays a central role in medical diagnosis and treatment. Imaging findings represent part of an individual’s phenotype, and can characterize genomic, epigenomic, and gene-expression patterns that enable targeted therapies under the rubric of precision medicine.
In radiology, a key area for knowledge is the set of relationships between imaging findings and diagnoses. To that end, the Radiology Gamuts Ontology (RGO) provides a formal representation of differential diagnosis in radiology.2 RGO comprises a large set of classes that include disorders (eg Stevens-Johnson syndrome; RGO:13884), interventions (eg talc pleurodesis; RGO:25171), and imaging manifestations (eg pericardial effusion; RGO:3706). RGO includes the subsumption (is_a) relation to define a conventional hierarchy between more specific and more general concepts. The ontology’s hierarchy is relatively “flat”: the ontology includes 1782 is_a relations; most of the ontology’s entities are subclasses of the top-level Entity class. RGO specifies a causal (may_cause) relation and its inverse (may_be_caused_by) to relate entities to other disorders and to imaging manifestations. RGO’s causal relation does not equate to logical implication, nor should one consider an observation’s specified causes to be exhaustive. RGO thus supports “open-world” inference, but provides useful information to link diagnoses and imaging observations.

To improve the ability to automate reasoning about diseases and their manifestations in medical imaging examinations, we sought to map RGO concepts to ontologies that organize and characterize human diseases and the manifestations of those diseases.

Disease Ontology

The Disease Ontology (DO) provides an extensive, open-source vocabulary of human disease to enable integration of disease-associated biomedical data.3,4 The ontology is organized hierarchically: its top-level class, disease (DOID:0000004), has 8 primary subclasses, such as disease by infectious agent (DOID:0050117). Subsumption (is_a) relations specify disease subclasses within the hierarchy. DO’s classes can have more than one parent; for example, thyroid cancer (DOID:0001781) has parents endocrine gland cancer (DOID:0000170) and thyroid gland disease (DOID:0000050).
DO has been used to unify disease annotations across model organisms and improve interoperability between biological and clinical human disease-related data.5,6 The ontology also has been applied to identify the proteins related to diseases through analysis of the number of interactions with other proteins and the disease relationships of those proteins.7,8 DO’s disease classification has provided a framework to identify and define relationships between diseases and phenotypes, genotypes, and various other disease attributes from other ontologies.4

Human Phenotype Ontology

The Human Phenotype Ontology (HPO) provides a structured and controlled vocabulary for the phenotypic features of hereditary, congenital, and acquired diseases.9–12 HPO focused initially on monogenic diseases, but now includes features of more than 3400 common disorders.13 HPO’s top-level class All (HP:0000001) has 5 subclasses: clinical modifier, mode of inheritance, frequency, mortality/aging, and phenotypic abnormality (HP:0000118), the last of which serves as the top-level node of the phenotypic abnormality subontology. HPO classes can have more than one parent; for example, gout (HP:0001997) has superclasses arthritis (HP:0001369) and hyperuricemia (HP:0002149). Some HPO classes incorporate logical definitions using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains.11 HPO has been applied to increase interoperability of phenotype information in rare diseases,14 to link and interpret the phenotypic abnormalities that result from genomic variation,11 and to extract phenotype concepts from electronic health record texts to identify potential causal genes.15 HPO supports formation of differential diagnoses based on matching of clinical information to phenotypic abnormalities at varying levels of specificity in the ontology’s hierarchy.10

MATERIALS AND METHODS

We explored potential relationships from RGO to DO and HPO.
At the time of this study, RGO (version 0.7; release 2018-02-01) incorporated 16 918 classes (plus 2954 synonyms), 1782 subsumption relations, and 55 569 causal relations. DO (release 2017-11-28; uploaded to NCBO BioPortal 6 February 2018) contained 12 498 classes, of which 3799 were flagged as obsolete. HPO (release 2018-07-25) contained 17 261 classes, of which 3655 were flagged as obsolete. The current work excluded obsolete classes, and considered the 8699 active DO classes and 13 606 active HPO classes. For the purposes of our analysis, we defined the “primary subclasses” as the 8 direct subclasses of the top-level DO entity disease (DOID:0000004) and the 25 direct subclasses of the HPO entity phenotypic abnormality (HP:0000118).

All of the ontologies were accessed through the National Center for Biomedical Ontology (NCBO) BioPortal web site (http://bioportal.bioontology.org/). NCBO Annotator16,17 was used to annotate all of the RGO terms with their longest matching strings in DO and HPO, including synonyms; the software was accessed through NCBO BioPortal web services.18,19 Matches were reviewed manually.

We tallied the number and types of mappings from RGO to DO and HPO. We identified the primary subclasses of each mapped RGO term, and tallied the distribution of RGO entities into those subclasses. To demonstrate the impact of the mappings, we computed the number of axioms linking DO and HPO terms through RGO’s causal relation, and provided examples to show how the integration can support inference across the ontologies.

RESULTS

Mapping of terms

NCBO Annotator identified 7143 candidate mappings of RGO entities to DO and 8398 candidate mappings to HPO.
The RGO term cleft hand or foot (RGO:12205) was mapped initially to “cleft hand” as a synonym of ectrodactyly (HP:0100257), which has subclasses of split hand (HP:0001171) and split foot (HP:0001839). The RGO term was determined to be equivalent to the HPO term ectrodactyly. Similarly, carpal and/or tarsal fusion (RGO:12199) was matched partially to the HPO term “tarsal fusion” (tarsal synostosis; HP:0008368), but was determined to be equivalent to that term’s parent, synostosis of carpals/tarsals (HP:0100266). Although gastric volvulus (RGO:3751) mapped originally to volvulus (HP:0002580), HPO defines volvulus as an “Abnormal twisting of a portion of intestine…”; thus, the RGO term was mapped instead as a subclass of the HPO term abnormality of the stomach (HP:0002577). The RGO term vocal cord paralysis or paresis (RGO:9427) was mapped automatically to vocal cord paralysis (HP:0001605). On review, the RGO entity was mapped to that term and to vocal cord paresis (HP:0001604) with inverse_is_a relations.

Overall, 5485 RGO terms (32.4%) were mapped to DO terms and 4819 RGO terms (28.5%) were mapped to HPO terms; a total of 7913 distinct RGO entities (46.8%) were mapped to one or both of the 2 target ontologies. From RGO to DO, there were 2104 equivalent concepts, 3357 subclass (is_a) relations, and 24 superclass (inverse_is_a) relations. From RGO to HPO, there were 2057 equivalent concepts, 2723 subclass relations, and 39 superclass relations.

We identified the “ancestors” of RGO classes in the DO and HPO hierarchies (Tables 1 and 2). For example, cervical lymphadenopathy (RGO:13) was mapped to the equivalent HPO term (HP:0025289). The hierarchy of HPO classes allows one to follow the subsumption relations to identify its ancestor, abnormality of the immune system (HP:0002715), which is a subclass of phenotypic abnormality.

Table 1. Distribution of mapped Radiology Gamuts Ontology (RGO) entities and Disease Ontology (DO) entities by top-level disease subclass. Because some DO entities have two or more parents, DO classes may have more than one ancestor, and hence the total number of ancestors is greater than the number of classes. Note that 1030 RGO classes, such as Cantrell syndrome (RGO:921), were mapped as subclasses of the DO class syndrome (DOID:225).

| Disease Subclass | Mapped RGO Classes | All DO Classes |
|---|---|---|
| disease by infectious agent (DOID:0050117) | 161 | 439 |
| disease of anatomical entity (DOID:7) | 2757 | 5747 |
| disease of cellular proliferation (DOID:14566) | 1101 | 2632 |
| disease of mental health (DOID:150) | 33 | 334 |
| disease of metabolism (DOID:0014667) | 191 | 445 |
| genetic disease (DOID:630) | 224 | 529 |
| physical disorder (DOID:0080015) | 82 | 53 |
| syndrome (DOID:225) | 1107 | 146 |
| Total | 5656 | 10 325 |

Table 2. Distribution of Human Phenotype Ontology entities by top-level phenotypic abnormality subclass. Some HPO entities have two or more parents; hence, classes may have multiple ancestors.

| Phenotypic Abnormality Subclass | Mapped RGO Classes | All HPO Classes |
|---|---|---|
| abnormal cellular phenotype (HP:0025354) | 0 | 49 |
| abnormality of blood and blood-forming tissues (HP:0001871) | 163 | 563 |
| abnormality of connective tissue (HP:0003549) | 160 | 204 |
| abnormality of head or neck (HP:0000152) | 400 | 1323 |
| abnormality of limbs (HP:0040064) | 278 | 2734 |
| abnormality of metabolism/homeostasis (HP:0001939) | 214 | 1007 |
| abnormality of prenatal development or birth (HP:0001197) | 34 | 135 |
| abnormality of the breast (HP:0000769) | 13 | 32 |
| abnormality of the cardiovascular system (HP:0001626) | 532 | 1111 |
| abnormality of the digestive system (HP:0025031) | 426 | 596 |
| abnormality of the ear (HP:0000598) | 53 | 292 |
| abnormality of the endocrine system (HP:0000818) | 126 | 398 |
| abnormality of the eye (HP:0000478) | 127 | 1102 |
| abnormality of the genitourinary system (HP:0000119) | 359 | 851 |
| abnormality of the immune system (HP:0002715) | 414 | 633 |
| abnormality of the integument (HP:0001574) | 220 | 886 |
| abnormality of the musculature (HP:0003011) | 125 | 592 |
| abnormality of the nervous system (HP:0000707) | 543 | 1737 |
| abnormality of the respiratory system (HP:0002086) | 289 | 377 |
| abnormality of the skeletal system (HP:0000924) | 1030 | 3615 |
| abnormality of the thoracic cavity (HP:0045027) | 3 | 4 |
| abnormality of the voice (HP:0001608) | 4 | 29 |
| constitutional symptom (HP:0025142) | 14 | 69 |
| growth abnormality (HP:0001507) | 57 | 96 |
| neoplasm (HP:0002664) | 985 | 584 |
| Total | 6569 | 19 019 |

Integration across ontologies

Through RGO’s causal relationships, we computed 9605 axioms that posited direct causal relationships between a disease (DO class) and a phenotypic abnormality (HPO class).
Figure 1 illustrates the three distinct types of axioms that could be posited, all of which incorporated a causal relation between 2 RGO entities. Most of the axioms (n = 7058, type A) linked a disease and a phenotypic abnormality by equivalence relations to the causally-linked RGO entities. For example, through their equivalence relations to corresponding RGO terms, one can posit that angiosarcoma (DOID:0001816) may cause splenomegaly (HP:0001744). An additional 2545 axioms (type B) involved an equivalence relation from a disease to an RGO entity, and an is_a relation from the second RGO entity to the phenotypic abnormality. There were 2 axioms of type C, which included is_a relations from diseases tricuspid atresia (DOID:0080169) and tricuspid valve stenosis (DOID:4078) to tricuspid atresia or stenosis (RGO:2336). The direct DO-RGO-HPO causal relationships involved 1062 distinct diseases (DO classes) and 611 distinct phenotypic abnormalities (HPO classes). Example axioms are shown in Supplementary Appendix A.

Figure 1. Examples of causal mappings from DO to HPO classes, based on equivalence or subsumption mappings from DO to RGO and from RGO to HPO. DO (“Disease”), RGO (“Gamuts Entity”), and HPO (“Phenotypic Abnormality”) classes are indicated by yellow (framed), white, and blue (unframed) rectangles. There were 7058 type A, 2545 type B, and 2 type C mappings.
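The composition of mappings and causal relations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `direct_axioms`, the tuple layout, and every identifier containing `EXAMPLE` are hypothetical placeholders, and the sketch assumes that a type C axiom ends in an equivalence on the HPO side, which the text does not specify.

```python
from collections import defaultdict

# Axiom type by (DO-to-RGO relation, RGO-to-HPO relation).
# Assumption: type C pairs an is_a on the DO side with an equivalence on the HPO side.
AXIOM_TYPES = {
    ("equivalent", "equivalent"): "A",
    ("equivalent", "is_a"): "B",
    ("is_a", "equivalent"): "C",
}

def direct_axioms(do_to_rgo, rgo_may_cause, rgo_to_hpo):
    """Compose DO->RGO mappings, RGO may_cause edges, and RGO->HPO mappings."""
    diseases_for = defaultdict(list)    # RGO cause  -> [(DO id, relation)]
    for do_id, relation, rgo_id in do_to_rgo:
        diseases_for[rgo_id].append((do_id, relation))
    phenotypes_for = defaultdict(list)  # RGO effect -> [(HPO id, relation)]
    for rgo_id, relation, hp_id in rgo_to_hpo:
        phenotypes_for[rgo_id].append((hp_id, relation))

    axioms = set()
    for cause, effect in rgo_may_cause:
        for do_id, do_rel in diseases_for.get(cause, []):
            for hp_id, hp_rel in phenotypes_for.get(effect, []):
                kind = AXIOM_TYPES.get((do_rel, hp_rel))
                if kind is not None:  # skip combinations outside types A-C
                    axioms.add((do_id, hp_id, kind))
    return axioms

# Toy data. DOID:0001816, HP:0001744, DOID:0080169, DOID:4078 and RGO:2336 come
# from the text; every "EXAMPLE" identifier is a hypothetical placeholder.
DO_TO_RGO = [
    ("DOID:0001816", "equivalent", "RGO:EXAMPLE-angiosarcoma"),
    ("DOID:0080169", "is_a", "RGO:2336"),
    ("DOID:4078", "is_a", "RGO:2336"),
]
RGO_MAY_CAUSE = [
    ("RGO:EXAMPLE-angiosarcoma", "RGO:EXAMPLE-splenomegaly"),
    ("RGO:EXAMPLE-angiosarcoma", "RGO:EXAMPLE-finding"),
    ("RGO:2336", "RGO:EXAMPLE-effect"),
]
RGO_TO_HPO = [
    ("RGO:EXAMPLE-splenomegaly", "equivalent", "HP:0001744"),
    ("RGO:EXAMPLE-finding", "is_a", "HP:EXAMPLE-parent"),
    ("RGO:EXAMPLE-effect", "equivalent", "HP:EXAMPLE-effect"),
]

AXIOMS = direct_axioms(DO_TO_RGO, RGO_MAY_CAUSE, RGO_TO_HPO)
for do_id, hp_id, kind in sorted(AXIOMS):
    print(f"type {kind}: {do_id} may_cause {hp_id}")
```

Indexing the mappings by RGO identifier keeps the composition to a single pass over the causal relations, which matters when there are tens of thousands of them.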
One also can apply the transitive property of RGO’s may_cause relation to identify entities with indirect causal links.20 Although transitivity can be applied exhaustively, such relationships typically have little clinical relevance; it is, however, reasonable to consider axioms involving one or two causal relations. Applying only the is_a relations and at most two causal relations in RGO (ie without the is_a relations within DO and HPO), one can express 28 536 “indirect” causal axioms from DO to HPO classes. For example:

Crohn disease (DOID:0008778) = Crohn disease (RGO:34) may_cause osteomalacia (RGO:32782) may_cause protrusio acetabuli (RGO:12478) = protrusio acetabuli (HP:0003179)

The integration also allows one to formulate queries of varying levels of abstraction within the DO and HPO class hierarchies. Figure 2 illustrates the axiom scleroderma (DOID:0000418) may_cause splenomegaly (HP:0001744) and displays the higher-level DO and HPO classes. As shown in the figure, scleroderma has 1 DO primary subclass: disease of anatomical entity (DOID:0000007). The HPO term splenomegaly has 2 primary subclasses: abnormality of the digestive system (HP:0025031) and abnormality of the immune system (HP:0002715). The value of abstraction is that one can pose queries such as, “Which musculoskeletal system disease may cause an abnormality of the digestive system?”

Figure 2. An example mapping shows how the mappings enable RGO’s causal knowledge to link DO and HPO classes, in this case, from scleroderma to splenomegaly. One can use the subsumption hierarchies of DO and HPO to pose sophisticated queries. For example, the causal relationship between scleroderma and splenomegaly satisfies a query such as, “Which musculoskeletal system disease(s) may cause an abnormality of the digestive system?”

DISCUSSION

RGO, a reference knowledge source for differential diagnosis in radiology, is available through NCBO BioPortal (https://bioportal.bioontology.org/ontologies/GAMUTS) and is planned to be updated annually. Incorporation of prior and conditional probability data into RGO is underway to support clinical diagnostic reasoning. Ontology-based methods can guide named-entity recognition in clinical texts such as clinic notes and radiology reports. Our group has extracted data from 1.7 million radiology reports to determine conditional probability of the entire range of RGO entities. The prevalence and relative likelihood of diseases based on the patient’s imaging findings allow one to determine the most likely diagnoses, and to use the value of information to select the most informative questions to ask. RGO has been integrated with the Orphanet Rare Disease Ontology (ORDO) to support the use of RGO for differential diagnosis of rare disease.21

The mappings allow one to reason more generally over the ontologies. As described above, one can query which musculoskeletal system disease (a DO subclass) may cause abnormality of the digestive system (an HPO class). The mappings thus enable more complex and robust queries over the ontologies, and allow one to use the ontologies’ knowledge to answer sophisticated questions. The mapped ontologies also provide additional terms to support data mining from textual information in the electronic health record.
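Two ideas from the Results, causal chains of at most two may_cause steps and queries posed at a higher level of abstraction, can be sketched as follows. This is an illustrative sketch only: the function names, the dictionary layout, and the placeholder identifier `DOID:EXAMPLE-musculoskeletal` (the text does not give an identifier for musculoskeletal system disease) are assumptions, and the toy hierarchies omit the intermediate classes present in DO and HPO.

```python
def causal_chains(may_cause, start, max_steps=2):
    """Return every may_cause path from `start` with 1..max_steps edges."""
    paths, frontier = [], [(start,)]
    for _ in range(max_steps):
        extended = []
        for path in frontier:
            for effect in may_cause.get(path[-1], ()):
                if effect not in path:  # do not revisit an entity (avoids cycles)
                    extended.append(path + (effect,))
        paths.extend(extended)
        frontier = extended
    return paths

def ancestors(term, parents):
    """Return `term` plus all superclasses reachable through is_a links."""
    seen, stack = {term}, [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def diseases_causing(axioms, do_parents, hpo_parents, disease_class, phenotype_class):
    """Select (disease, phenotype) axioms that fall under the two given classes."""
    return [
        (disease, phenotype)
        for disease, phenotype in axioms
        if disease_class in ancestors(disease, do_parents)
        and phenotype_class in ancestors(phenotype, hpo_parents)
    ]

# Causal chain from the text: RGO:34 -> RGO:32782 -> RGO:12478.
RGO_MAY_CAUSE = {"RGO:34": ["RGO:32782"], "RGO:32782": ["RGO:12478"]}
CHAINS = causal_chains(RGO_MAY_CAUSE, "RGO:34")

# Toy hierarchies; intermediate classes are omitted and the musculoskeletal
# identifier is a hypothetical placeholder.
DO_PARENTS = {
    "DOID:0000418": ["DOID:EXAMPLE-musculoskeletal"],
    "DOID:EXAMPLE-musculoskeletal": ["DOID:0000007"],
}
HPO_PARENTS = {"HP:0001744": ["HP:0025031", "HP:0002715"]}
CAUSAL_AXIOMS = [("DOID:0000418", "HP:0001744")]  # scleroderma may_cause splenomegaly

MATCHES = diseases_causing(
    CAUSAL_AXIOMS, DO_PARENTS, HPO_PARENTS,
    "DOID:EXAMPLE-musculoskeletal", "HP:0025031",
)
print(CHAINS)
print(MATCHES)
```

Because `ancestors` follows every parent, a class with multiple parents, such as splenomegaly here, satisfies queries phrased in terms of either ancestor.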
The growing number and size of biomedical ontologies have engendered interest in merging and integrating content of related ontologies.22–24 Automated approaches have been developed to take advantage of concepts’ hierarchical relationships to guide the process of ontology alignment and integration, including tools such as OPTIMA25 and PROMPT.26 Because of RGO’s predominantly “flat” architecture – 15 242 (90.1%) of its 16 912 entities are direct subclasses of the top-level class – the investigators chose not to use such systems to merge or align the ontologies. The process of mapping the RGO entities to the HPO and DO concepts using the automated NCBO Annotator web service demonstrated the need for manual review of the partial matches. Decisions regarding how to best describe the relationship between an RGO entity and an HPO or DO concept required careful review of the ontology’s concept and occasional pursuit of textual definitions. For example, the RGO entity bronchial chondromalacia (RGO:32677) was identified initially as an inexact match for chondromalacia (DOID:2557); because DO classifies chondromalacia as an articular cartilage disease, the authors identified the potential relation as a non-match. In the case of the RGO entity dilatation of esophageal stricture (RGO:32918) that was a partial match with dilatation (HP:0002617), this relation was described as a non-match as well due to the HPO’s textual definition of dilatation, an “abnormal outpouching or sac-like dilatation in the wall of an artery, vein or the heart.” Equivalence mappings between ontologies promote interoperability of the data they describe, but the ability to perform such mappings may be limited by differences in focus and granularity between the ontologies. 
Mapping to equivalent concepts in the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) vocabulary has been identified for 30% of HPO classes27 and 65% of DO classes.28 RGO includes a large number of terms that are specific to imaging examinations, such as extrinsic impression on thoracic esophagus (RGO:3470) and lytic skeletal lesion (RGO:25747). Hence, mapping of 47% of RGO entities to DO and HPO is highly satisfactory. For ontologies where limited equivalence mappings can be identified, partial mappings can provide a “next-best approach for traversing between the two systems”.27

CONCLUSION

The integration of genotypic and phenotypic knowledge across different sources of biomedical data, such as experimental and clinical data repositories, is increasingly critical to discover new treatments for both rare and common diseases.29 Exact and partial mappings from RGO to DO and HPO allow one to compute axioms that express causal relationships between DO disease entities and HPO imaging phenotypes. Although an automated approach identified candidate mappings, manual review was essential to ensure that the mappings were meaningful and semantically correct. The mappings allowed RGO concepts to be categorized within the hierarchies of disease and phenotypic abnormalities, which adds useful semantic structure to RGO terms, and enables one to posit axioms that link DO and HPO entities at multiple levels in their hierarchies. This information can support clinical diagnosis, data mining, and knowledge discovery.

FUNDING

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

CONTRIBUTORS

MTF and CEK performed the analysis and wrote the manuscript. RWF incorporated the analysis into the ontology. All authors reviewed and approved the manuscript.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.
Competing interests: None

REFERENCES

1 Bodenreider O. Biomedical ontologies in action: role in knowledge management, data integration and decision support. Yearb Med Inform 2008; 67–79.
2 Budovec JJ, Lam CA, Kahn CE Jr. Radiology Gamuts Ontology: differential diagnosis for the Semantic Web. RadioGraphics 2014; 34 (1): 254–64.
3 Schriml LM, Arze C, Nadendla S, et al. Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Res 2012; 40 (D1): D940–6.
4 Kibbe WA, Arze C, Felix V, et al. Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res 2015; 43 (D1): D1071–8.
5 Bello SM, Shimoyama M, Mitraka E, et al. Disease Ontology: improving and unifying disease annotations across species. Dis Model Mech 2018; 11 (3): dmm032839.
6 Schriml LM, Mitraka E. The Disease Ontology: fostering interoperability between biological and clinical human disease-related data. Mamm Genome 2015; 26 (9–10): 584–9.
7 Carson MB, Lu H. Network-based prediction and knowledge mining of disease genes. BMC Med Genomics 2015; 8 (Suppl 2): S9.
8 LePendu P, Musen MA, Shah NH. Enabling enrichment analysis with the Human Disease Ontology. J Biomed Inform 2011; 44 (Suppl 1): S31–8.
9 Robinson PN, Kohler S, Bauer S, Seelow D, Horn D, Mundlos S. The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet 2008; 83 (5): 610–5.
10 Robinson PN, Mundlos S. The Human Phenotype Ontology. Clin Genet 2010; 77 (6): 525–34.
11 Köhler S, Doelken SC, Mungall CJ, et al. The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data. Nucleic Acids Res 2014; 42 (D1): D966–74.
12 Köhler S, Vasilevsky NA, Engelstad M, et al. The Human Phenotype Ontology in 2017. Nucleic Acids Res 2017; 45 (D1): D865–76.
13 Groza T, Kohler S, Moldenhauer D, et al. The Human Phenotype Ontology: semantic unification of common and rare disease. Am J Hum Genet 2015; 97 (1): 111–24.
14 Maiella S, Olry A, Hanauer M, et al. Harmonising phenomics information for a better interoperability in the rare disease field. Eur J Med Genet 2018; 61 (11): 706–14.
15 Son JH, Xie G, Yuan C, et al. Deep phenotyping on electronic health records facilitates genetic diagnosis by clinical exomes. Am J Hum Genet 2018; 103 (1): 58–73.
16 Jonquet C, Shah NH, Musen MA. The open biomedical annotator. Summit Transl Bioinform 2009; 2009: 56–60.
17 Shah NH, Bhatia N, Jonquet C, Rubin D, Chiang AP, Musen MA. Comparison of concept recognizers for building the Open Biomedical Annotator. BMC Bioinformatics 2009; 10 (Suppl 9): S14.
18 Noy NF, Shah NH, Whetzel PL, et al. BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Res 2009; 37 (Web Server issue): W170–3.
19 Whetzel PL, Noy NF, Shah NH, et al. BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications. Nucleic Acids Res 2011; 39 (Web Server issue): W541–5.
20 Kahn CE Jr. Transitive closure of subsumption and causal relations in a large ontology for radiology diagnosis. J Biomed Inform 2016; 61: 27–33.
21 Kahn CE Jr. Integrating ontologies of rare diseases and radiological diagnosis. J Am Med Inform Assoc 2015; 22 (6): 1164–8.
22 Dragisic Z, Ivanova V, Li H, Lambrix P. Experiences from the anatomy track in the ontology alignment evaluation initiative. J Biomed Semantics 2017; 8 (1): 56.
23 Harrow I, Jimenez-Ruiz E, Splendiani A, et al. Matching disease and phenotype ontologies in the ontology alignment evaluation initiative. J Biomed Semantics 2017; 8 (1): 55.
24 Kolyvakis P, Kalousis A, Smith B, Kiritsis D. Biomedical ontology alignment: an approach based on representation learning. J Biomed Semantics 2018; 9 (1): 21.
25 Doshi P, Kolli R, Thomas C. Inexact matching of ontology graphs using expectation-maximization. Web Semantics 2009; 7 (2): 90–106.
26 Noy NF, Musen MA. The PROMPT suite: interactive tools for ontology merging and mapping. Int J Hum Comput Stud 2003; 59 (6): 983–1024.
27 Dhombres F, Bodenreider O. Interoperability between phenotypes in research and healthcare terminologies–Investigating partial mappings between HPO and SNOMED CT. J Biomed Semantics 2016; 7: 3.
28 Raje S, Bodenreider O. Interoperability of disease concepts in clinical and research ontologies: contrasting coverage and structure in the Disease Ontology and SNOMED CT. Stud Health Technol Inform 2017; 245: 925–9.
29 Denaxas SC. Integrating bio-ontologies and controlled clinical terminologies: from base pairs to bedside phenotypes. Methods Mol Biol 2017; 1446: 275–87.
Google Scholar Crossref Search ADS PubMed © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Journal

Journal of the American Medical Informatics Association, Oxford University Press

Published: Feb 1, 2019
