A Framework to Support the Sharing and Reuse of Computable Phenotype Definitions Across Health Care Delivery and Clinical Research Applications

Abstract

Introduction: The ability to reproducibly identify clinically equivalent patient populations is critical to the vision of learning health care systems that implement and evaluate evidence-based treatments. The use of common or semantically equivalent phenotype definitions across research and health care use cases will support this aim. Currently, there is no single consolidated repository for computable phenotype definitions, making it difficult to find all definitions that already exist, and also hindering the sharing of definitions between user groups.

Method: Drawing from our experience in an academic medical center that supports a number of multisite research projects and quality improvement studies, we articulate a framework that will support the sharing of phenotype definitions across research and health care use cases, and highlight gaps and areas that need attention and collaborative solutions.

Framework: An infrastructure for re-using computable phenotype definitions and sharing experience across health care delivery and clinical research applications includes: access to a collection of existing phenotype definitions, information to evaluate their appropriateness for particular applications, a knowledge base of implementation guidance, supporting tools that are user-friendly and intuitive, and a willingness to use them.

Next Steps: We encourage prospective researchers and health administrators to re-use existing EHR-based condition definitions where appropriate and share their results with others to support a national culture of learning health care. There are a number of federally funded resources to support these activities, and research sponsors should encourage their use.
Acknowledgements

The SEDI project and ancillary study described were supported by Cooperative Agreement Number 1C1CMS331018-01-00 from the Department of Health and Human Services, Centers for Medicare & Medicaid Services. This publication was also made possible by the Patient Centered Outcomes Research Institute (PCORI) and the National Institutes of Health (NIH) Common Fund, through a cooperative agreement (U54 AT007748) from the Office of Strategic Coordination within the Office of the NIH Director and the Duke CTSA (UL1TR001117). Dr. Cameron is supported by grant 5 T32 DK007731 (Duke Training Grant in Nephrology). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the Patient Centered Outcomes Research Institute or the U.S. Department of Health and Human Services or any of its agencies. We are grateful to Shelley Rusincovitch of the Duke Translational Medicine Institute for her ideas in the formulation of this commentary and to the members of the PSQ Core of the NIH Collaboratory for their ideas and discussion related to achieving valid and scalable phenotype definitions for a number of conditions.

Keywords

Computable Phenotypes; Electronic Health Records; Data Standards; Learning Health Care Systems

This model/framework is available at EDM Forum Community: http://repository.edm-forum.org/egems/vol4/iss3/2

Disciplines

Health Information Technology

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.
eGEMs (Generating Evidence & Methods to improve patient outcomes), Vol. 4 [2016], Iss. 3, Art. 2
Richesson et al.: Enabling Re-Use of Computable Phenotype Definitions
Published by EDM Forum Community, 2016
DOI: 10.13063/2327-9214.1232

Rachel L. Richesson, PhD (i); Michelle M. Smerek (ii); C. Blake Cameron, MD (iii)

(i) Duke University School of Nursing, (ii) Duke Clinical Research Institute, (iii) Duke University School of Medicine

Introduction

Computable phenotypes, or electronic health record (EHR)-based condition definitions, enable the identification of cohorts of patients with certain diseases or clinical profiles for disease management registries, quality improvement programs, evaluation studies, and interventional research. Regardless of the application, cohort identification requires queries of clinical data stores that are both valid and reproducible. Currently, there is no single consolidated repository for computable phenotypes, making it difficult to find all definitions that already exist, and also hindering the sharing of definitions between user groups. Health services researchers and quality assessment groups, including the National Quality Forum (NQF), National Committee for Quality Assurance (NCQA), the Centers for Medicare & Medicaid Services (CMS), and the Agency for Healthcare Research and Quality (AHRQ), provide computable phenotypes on a number of websites [1-5]. In addition, researchers and registry developers create definitions utilizing different design and evaluation methods. Because the definitional logic is often underspecified or unreported in scientific journals, it is not clear if the findings reported in published research or quality improvement are comparable or relevant to clinical populations, hindering the application of evidence-based medical and nursing care.

We believe that a minimal set of well-constructed and explicit EHR-based phenotype definitions will create efficiencies for health care organizations that must increasingly support growing numbers of data requests related to comparative effectiveness research (CER), quality improvement, and chronic disease management. We further believe that such a set will facilitate synergies between research and care delivery, enabling "learning health care" practices and subsequently improving patient outcomes. A large-scale and multipurpose approach to sharing phenotype definitions will support the reuse of well-constructed and validated computable phenotypes, and will subsequently reduce the variation in definitions across conditions. Drawing from our experience at an academic medical center supporting a number of multisite research projects, we articulate a framework that will support the sharing of phenotype definitions across research and health care use cases, and highlight gaps or areas that need attention and collaborative solutions.

Background and Context

A "computable phenotype" is a definition of a condition, disease, or characteristic or clinical event that is based solely on data that can be processed by a computer. Computable phenotype definitions provide the specifications to identify populations of patients with conditions of interest, and can be combined with other criteria, such as age or other demographic information, to develop cohort populations for a variety of purposes.

Quality monitoring organizations (such as NQF, NCQA, and AHRQ) create computable phenotype definitions for the development and monitoring of health care quality measures. A number of research networks have developed phenotype definitions to enable the use of EHR data for observational research (including comparative effectiveness studies) and interventional trials [8-11]. Various multisite studies use these definitions to develop registries [12,13] for drug safety surveillance or chronic disease management. There are numerous and distinct use cases for computable phenotypes for health care delivery (e.g., personalized medicine, guideline-based care, chronic disease management, and quality measurement) and biomedical research (genomic, observational, CER, health services research, and interventional trials). Each use case represents different scientific disciplines whose phenotype development efforts have heretofore been undertaken in isolation, without the benefit of cooperation. Further, there are no standards of practice that encourage reuse of existing definitions or the use of common definitions for health care delivery and research uses.

The lack of coordination of phenotype definitions among researchers, clinicians, and administrators has led to the unintentional proliferation of numerous definitions for many conditions and clinical profiles. Because each definition applies different logic (e.g., various combinations of diagnosis or procedure codes, medications, or laboratory tests) for querying EHR data, the resulting cohorts are often not directly comparable. It is unknown how much semantic variation in definitions actually exists, because this information is often underspecified in research publications. A recent report on national trends in diabetes specifically lists several related conditions (including hypoglycemia, neuropathy, chronic kidney disease, peripheral vascular disease, cognitive decline, cancers, and even differentiating type 1 from type 2 diabetes) whose prevalence could not be reported due to inconsistent EHR documentation and definitions across the United States. The consequent likelihood that research, patient care, and quality measurement communities are using different phenotype definitions for the same condition is more concerning. The COPD Outcomes-based Network for Clinical Effectiveness & Research Translation (CONCERT) assessed 980 patients sampled from various EHR systems using a clinical phenotype definition for chronic obstructive pulmonary disease (COPD), and found that just over half of those met the criteria for the well-accepted research definition for the condition. Further, they found that the patient populations retrieved by the clinical and research definitions for COPD had significantly different comorbidities and risk factors. This implies that disease management registries and quality improvement programs might be identifying populations that are different from those used in the development of the evidence upon which those supporting treatment strategies and interventions are based.

The "research informs practice informs research" cycle that is the essence of learning health care systems entails that the clinical features used to define research and patient populations be well understood and comparable. Hence semantically equivalent phenotype definitions must be used to identify clinically equivalent populations. We believe that creating a centralized collection of explicitly defined computable phenotypes, with an accompanying knowledge base of development and validation documentation, is the first step toward consolidating effort and harmonizing definitions. Information, resources, and tools that facilitate the reuse of existing phenotypes will reduce the variation in phenotype definitions across all use cases, facilitate conversations between health care and research communities about how to compare definitions for different use cases, and ultimately lead to harmonization of definitions that will simplify and support the identification of clinically equivalent populations for research and health care purposes.

Framework Components

The reuse of phenotype definitions can be facilitated by their explicit representation and tools to support their evaluation and implementation in new applications. We propose that the deliberate and informed reuse of existing definitions will require four components: (1) searchable libraries of explicitly defined phenotype definitions; (2) supporting knowledge bases with information and methods; (3) tools to identify, evaluate, and implement existing phenotype definitions; and (4) motivated users and stakeholders to use them (Fig 1).

Figure 1. Overview of Framework to Support the Reuse of Phenotype Definitions in Learning Health Care Systems

[Figure: Research and health care each apply phenotype definitions and tools within learning health care systems, linked by the cycle "Research Informs Practice" and "Practice Informs Research". Supporting layers: Library of Computable Phenotypes (Definition | Purpose | Metadata | Validation results | Data features | Implementation experience); Knowledge Base (Information | Evidence | Methods | Case Studies); Motivation (Shared values | Shared vision | Perceived benefits | Incentives | Protections); Networks, Stakeholders, Healthcare Systems.]

Searchable Libraries of Phenotype Definitions

The sharing of information about computable phenotype definitions will allow implementers to reuse appropriate existing definitions rather than creating their own. This requires access to an ample set of phenotype definitions, along with information that enables them to be evaluated and easily implemented. The ideal library should be indexed so that users can search by a number of different features including the clinical condition; the data elements; logic; the intended use case; limitations; and orientation toward precision, sensitivity or specificity.

Mo and colleagues call for a formal computable representation of phenotype definitions that will enable scalability of the definitions by allowing them to be applied to different data systems [17]. Their desiderata include the following: human-readable and computable forms, structured rules, formalisms for temporal relations, representations for text searching and natural language processing, and interfaces for external software algorithms. They endorse the use of standardized terminologies, ontologies, and also the reuse of value sets.

Additional information can be included in the library or underlying knowledge base to support users' semantic understanding of the phenotype definition, and to enable selection of the appropriate definition to identify patient cohorts with the intended clinical features. Therefore, the definitions in a phenotype library should include metadata or supporting information about a definition, its intended use, the clinical rationale or research justification for the definition, and data about clinical and scientific validation in various health care settings. As an example, actual blood pressure measurements, even when they are available for long periods, did not contribute significantly to predictive models for hypertension control. Without clear supporting documentation, clinical subject matter experts may reject, as lacking face validity, well-validated phenotype definitions that do not match their expectations or intuition. Clinical practice and disease definitions change over time. Therefore, phenotype definitions in a phenotype library should reference the underlying clinical definitions or guidelines upon which they are based, in order to better identify legacy definitions that are out of date. In addition, phenotype definitions should conform to existing required and emerging terminologies and standards (e.g., SNOMED CT, LOINC, RxNorm, NDF-RT) for representing clinical data, as endorsed by the Office of the National Coordinator. Adherence to standards allows for a modular design that reduces development and implementation costs, particularly at scale where multiple use cases for that standard may exist concurrently.

Because phenotype definitions might perform differently when implemented in different patient populations and EHR systems, information about the performance of phenotypes in specific organizations should be collected from implementers and shared with future users. Implementation information is necessary to understand how standard definitions perform across diverse populations, heterogeneous organizations, and EHR systems. Specifically, information about the underlying population and quality (i.e., completeness, accuracy, consistency) of data that were used to validate the definitions has important implications for interpreting the validation results. For example, if a test population had 50 percent missing data in one of the defining variables for the phenotype, the provision of this information provides important contextual information about the definition's performance. Similarly, the testing of phenotype definitions in populations with high versus low prevalence of disease will yield different results. Recommendations for data quality assessment reporting in pragmatic trials and observational research [21,22] can provide insight into which data quality dimensions (e.g., completeness, accuracy) might be most useful to evaluate the phenotype definition.

To maximize the socialization and collaboration around shared phenotypes, the ideal phenotype library should support communication between phenotype developers and implementers. The Centers for Medicare & Medicaid Services (CMS) employs a standardized approach for enabling users to post questions and share comments, and for maintaining quality measure definitions across multiple programs. Such a framework could be adapted for use with computable phenotype libraries. During early development, draft phenotype specifications could be posted in a library for public review to evaluate feasibility and refine use cases. During the validation phase, the testing methodology could be opened to public comment. Once validated, the library could facilitate communications of best practices and feedback. This would allow implementers to share information about their experiences implementing phenotype definitions in their local systems, and allow others to ask questions to inform the many practical decisions that are made when implementing abstract logic in local data systems. A collaborative or interactive component would also allow users to relate their experience implementing definitions in different vendor systems and in different patient populations. Over time, the library could collect data on usage and impact, and aggregate published literature based on each phenotype. A record of projects that have used or endorsed different phenotype definitions can enhance understanding of phenotype intent and performance, and can assist potential implementers in the selection of appropriate phenotypes.

Because phenotype definitions are dynamic, the library should reference a phenotype life cycle, the current phenotype life cycle stage, and should include the status (e.g., in development, draft, final), as well as tracking the version number or date of last revision. Phenotypes could be marked as retired or archived in cases where clinical practice changes or the underlying clinical definitions or data standards become out of date.

The Phenotype Knowledge Base (PheKB) is a large and well-indexed portal for hosting computable phenotypes, though enhancements are needed to accommodate the above information requirements. PheKB includes human-readable definitions and machine-readable code in some cases, but the code is not fully executable across heterogeneous EHR systems. PheKB does have an interface for reporting contextual data and performance metrics of phenotype definitions, but a useful and usable display of these data is not yet standardized. Also, it is not known how widely PheKB is used outside of the Electronic Medical Records and Genomics (eMERGE) Network or Pharmacogenetic Research Network, whose goals are to implement decision support around clinically actionable genetic variants for clinical conditions. While several National Institutes of Health (NIH) Collaboratory and National Patient-Centered Clinical Research Network (PCORnet) investigators have added their phenotype definitions to PheKB, an increased uptake of PheKB by other research and clinical groups will require targeted marketing. Broader usage of PheKB might drive enhancements to the PheKB resource, but also will likely increase the number of user requirements. Several authoring tools exist, including the PheMA project, which provides generalizable computable representations and automated mapping tools. Other research networks, such as Observational Health Data Sciences and Informatics (OHDSI) [27] and PCORnet [11], include dedicated phenotype working groups and internal inventories of phenotype definitions.
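The indexing, life cycle, and validation-context requirements described in this section can be made concrete with a small sketch. The Python below is illustrative only: the record fields, the status values, and the `search` helper are assumptions chosen for this example and do not reflect the schema of PheKB or any existing library. The `positive_predictive_value` helper applies the standard relationship between sensitivity, specificity, and prevalence to show why the same definition can yield different results in high- versus low-prevalence populations.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValidationResult:
    """One site's reported performance for a definition (hypothetical fields)."""
    site: str
    prevalence: float          # proportion of the test population with the condition
    sensitivity: float
    specificity: float
    missing_data_rate: float   # share of records missing a defining variable


@dataclass
class PhenotypeRecord:
    """Hypothetical library entry; not the schema of any real repository."""
    condition: str
    intended_use: str          # e.g. "quality measurement", "genomic research"
    data_elements: List[str]
    status: str = "draft"      # assumed values: "in development", "draft", "final", "retired"
    version: str = "0.1"
    validations: List[ValidationResult] = field(default_factory=list)


def search(library: List[PhenotypeRecord],
           condition: Optional[str] = None,
           intended_use: Optional[str] = None,
           include_retired: bool = False) -> List[PhenotypeRecord]:
    """Return records matching the filters (case-insensitive exact match)."""
    hits = []
    for rec in library:
        if rec.status == "retired" and not include_retired:
            continue
        if condition and rec.condition.lower() != condition.lower():
            continue
        if intended_use and rec.intended_use.lower() != intended_use.lower():
            continue
        hits.append(rec)
    return hits


def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """PPV = true positives / all positives, expressed per unit of population."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        return float("nan")    # no positives at all, so PPV is undefined
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    library = [
        PhenotypeRecord("type 2 diabetes", "population screening",
                        ["diagnosis codes", "medications", "HbA1c"],
                        status="final", version="1.0"),
        PhenotypeRecord("type 2 diabetes", "genomic research",
                        ["diagnosis codes", "clinical notes (NLP)"],
                        status="retired", version="0.9"),
    ]
    for rec in search(library, condition="Type 2 Diabetes"):
        print(rec.condition, rec.intended_use, rec.status, rec.version)

    # Same definition (sensitivity 0.90, specificity 0.95), two test populations.
    for prevalence in (0.30, 0.02):
        ppv = positive_predictive_value(0.90, 0.95, prevalence)
        print(f"prevalence={prevalence:.2f} PPV={ppv:.2f}")
```

With these assumed figures, the positive predictive value is about 0.89 at 30 percent prevalence and about 0.27 at 2 percent prevalence, which is why a library entry would need to report the characteristics of the validation population alongside the performance metrics.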
Knowledge Base of Information and Methods

Researchers and health care organizations need information about how to develop, evaluate, and implement phenotype definitions. The particular use case influences the nature of the phenotype definition and system requirements. For example, in quality measurement, the purpose of the phenotype definition is to identify "bread and butter" instances of a particular condition. Patients whose disease status is negative or indeterminate are excluded. By contrast, genomic research usually aims to reliably identify both cases and controls (negative cases). The phenotype definition must identify with reasonable certainty not only patients who have the condition (i.e., have adequate sensitivity), but also patients who clearly do not have the condition (i.e., high specificity). Definitions used in disease management registries or population health promotion activities need higher sensitivity at the cost of specificity, whereas CER requires higher specificity and precision. Guidance from different health care and research communities can inform users about important features and performance thresholds for phenotype definitions for different use cases.

Information to clarify data dependencies and implementation requirements is needed to facilitate the sharing of phenotypes across groups. For example, some definitions include natural language processing (NLP) components that might not be feasible for some target systems. The information in the knowledge base can include methods and case studies from projects that have implemented the definitions in multiple organizations; their customizations and lessons learned can inform future users. Evidence-based practice guidelines that include justification for a definition's logic as well as the definition of "gold standard" for validation of EHR-based phenotype definitions should also be available.

Although PheKB does include some "knowledge" in the form of phenotype development methods and validation protocols, it is limited and not tailored for different phenotype users. Information for a broader range of use cases is needed. Rethinking Clinical Trials: The Living Textbook of Pragmatic Clinical Trials provides a model for disseminating information in the form of "lessons learned" and case studies, rather than as empirical research. Many other research-network websites and collaborative networks perform this function, but a central portal to the knowledge from various networks would support potential implementers from multiple domains.

Tools

Formal representations of computable phenotypes, mappings to reference coding systems and (common) information models, and executable code can support the implementation of definitions in different populations. Mo's desiderata highlight recommendations for clinical data representation to support phenotyping. These specifically call for structuring clinical data into queryable forms and for the use of a common data model to support customization for the variability and availability of EHR data among sites. Since there currently are a number of (different) common data models used in research networks [28-30], there is a need for tools and platforms to implement a given phenotype definition in different contexts. Knowledge, authoring tools, and vocabulary mapping tools to support these activities can also be centrally available through a shared knowledge base or links to a code sharing base like GitHub. Similarly, the implementation of these definitions requires terminology mappings (e.g., from drug class names in NDF-RT and medication sets from RxNorm to product codes in NDC). Terminology integration resources, such as RxNorm, the Unified Medical Language System (UMLS) and UMLS Terminology Services (UTS) tools, can benefit phenotype use cases in many networks. To be more broadly used, these tools should be centrally available with supporting instructions for people from many different domains and levels of technical expertise.

Specifically, tools are needed for the following uses: (1) searching for phenotype definitions that are endorsed or mandated; (2) browsing existing phenotypes to find ones that are potentially relevant and can be reused; (3) the display of relevant information to help potential implementers understand existing definitions and their strengths and limitations for particular uses; (4) the implementation of those definitions in local EHR systems with, e.g., executable code tailored to common data models, or mappings between coding systems; (5) developing new phenotype definitions if needed; and (6) reporting implementation results (along with characteristics of test data sets) for others to view (Table 1).

We see gaps and unmet needs in all areas except for phenotype development. At least two scalable authoring tools exist: PheMA with its execution support and OHDSI's CALYPSO (Criteria Assessment Logic for Your Population Study in Observational data) [33,34]. Xu et al. provide a detailed inventory of other search and authoring tools. In addition to guided phenotype authoring tools based on the underlying model of the phenotype library, other tools theoretically could support an "import and transformation" process that could take existing definitions developed locally (with local tools) and store them in the central repository for others to access and use.

Table 1. Types of Tools and Functionality Required to Support the Sharing and Reuse of Computable Phenotype Definitions Across Health Care Delivery and Clinical Research Applications

FUNCTION | PURPOSE | EXAMPLE OR POTENTIAL TOOL
Search for phenotype definitions. | Identify validated or endorsed phenotype definitions. | PheKB
Browse for phenotype definitions. | Assess landscape. | PheKB
Display pertinent context information. | Aid potential implementers in assessing a definition's fit for their use case. | needed*
Provide executable code in different formats (SQL, SAS, R, etc.) and crosswalks for mapping between different coding systems. | Implement phenotype definitions in heterogeneous systems. | PheKB, GitHub
Develop new phenotype definitions. | Create new definitions when existing ones aren't a good fit. | PheMA, CALYPSO
Display implementation results with characteristics of the data in which phenotypes were implemented. | Provide additional information users need to consider when determining whether a definition is a good fit for their use case. | needed*

Note: *This represents a gap where tooling is needed. We are not aware of existing tools that support this function.

The learning health system cannot exist on phenotypes alone. Any phenotype library would need to provide a service-based API that other computable clinical "services" might be able to access in a standardized way, e.g., electronic clinical-trial management tools that might access existing phenotype definitions in order to define the inclusion and exclusion criteria for a research trial. A number of functional components, e.g., standard models and vocabulary services, would in turn be needed to fully support the reuse of phenotype definitions on a grand scale.

Motivated Users and Stakeholders

The sharing of definitions and experience will require deliberate action on the part of potential phenotype developers and implementers, and useful and intuitive tools can support this behavior. Aligning existing computable phenotypes with users' needs will likely positively influence their uptake, as will engaging all stakeholders in the design and development of phenotype resources and tools described in this framework. Additionally, a number of approaches can be used to motivate individuals to search for existing definitions and to share the outcomes of computable phenotype implementations. Possible approaches include creating incentives, increasing perceived benefit, establishing new social norms, or regulating with policies or regulations.

Perceived Benefits and Value

Collaboration is fostered when the collaborators expect or perceive a beneficial outcome. The more beneficial or significant the outcome, the higher the participation and commitment level among collaborators will be. Wilcox et al. assert that the costs for sustaining research infrastructure can be covered if value can be created. Thus, clear demonstrations of reduced workload, reduced costs, or faster development resulting from the reuse of phenotype definitions might motivate potential users.

Incentives

Tangible incentives can be created through policy or legislation. Examples include quality reporting incentives (e.g., the CMS Physician Quality Reporting System and the financial rewards of the Meaningful Use program), and punitive consequences for noncompliance with Food and Drug Administration (FDA) reporting specifications. Although these types of incentives might be effective, they are time-consuming and expensive to achieve. Alternative incentives might derive from some sort of peer pressure from the scientific community to report phenotype definitions as part of the research protocol or study results reporting in publications, or rewards for such behavior from research sponsors or in academic promotion rubrics.

Shared Values and Principles

A set of agreed upon assumptions and principles for research networks, sponsors, and health care regulators to adopt is the first step in addressing the complex challenges to reusing phenotype definitions. These should include a stated commitment to reproducible science and the standardized reporting of phenotype definitions, use case, and validation results. Additional principles could include an expectation that users of computable phenotypes will search for and consider existing definitions before creating their own. For conditions where a phenotype definition already exists, researchers should carefully consider whether the benefit of developing new definitions tailored to their specific use cases outweighs the

Box 1. User Scenarios that Illustrate Benefit of Shared Phenotype Definitions

Scenario 1. An intervention specialist working for the Southeastern Diabetes Initiative (SEDI) wants to identify patients with type 2 diabetes across a number of health care providers in order to develop treatment programs and community interventions that will improve diabetes care. The specialist needs operational definitions for type 2 diabetes, as well as a number of associated conditions such as hypertension and chronic kidney disease.
She goes to a central phenotype library and finds definitions for each condition that are appropriate for broad population screening and that can be implemented in all the SEDI sites, including one with no capacity for accessing clinical notes. She shares a link for each selected phenotype definition, plus implementation guidance and appropriate code, with the data specialists at each SEDI site. Each site implements the definition and reports their results to the phenotype library. One SEDI site had problems with the code and reported this experience as well. The original developer of the phenotype contacted the SEDI site with a suggestion. This suggestion was helpful and was therefore added to the knowledge base for other SEDI sites to access and review. Later, the study was published in a journal and referenced the link to the computable phenotype logic and supporting implementation tools. Using these definitions and tools, a new group of researchers replicated the intervention in an urban population on the West Coast and published their findings. This scenario was enabled by the following: 1. Searchable libraries of explicitly defined phenotype definitions; 2. Supporting knowledge bases with information and methods; 3. Supporting tools; and 4. Users and stakeholders motivated to consider reusing existing definitions; benefits from reuse and shared phenotype definitions were realized by the users. Scenario 2. A clinician reviews the literature and finds a study of a new medical intervention for uncontrolled hypertension. She wants to implement it on a similar population in her clinic. The published article includes a narrative discussion of the inclusion and exclusion criteria (e.g., includes diagnosis of hypertension and excludes chronic kidney disease) with hyperlinks to a public phenotype library that hosts the computable phenotype specifications for the intervention population. 
The clinician points her data analyst to the phenotype specifications and requests a data warehouse query to estimate the number of patients that might be eligible for the planned intervention. After obtaining the required institutional approvals, she implements the intervention and conducts a formal quality improvement study. She publishes that study and references a public link to the phenotype library and knowledge base for the specific computable phenotype-definition logic and supporting implementation tools. Future implementers access the library for implementation details, rather than contacting this clinical investigator, allowing her more time to research and plan new chronic disease management interventions. This scenario was enabled by the following: 1. Searchable libraries of explicitly defined phenotype definitions; 2. Supporting knowledge bases with information and methods; 3. Supporting tools; and 4. Users and stakeholders motivated to consider reusing existing definitions; benefits from reuse and shared phenotype definitions were realized by the users. http://repository.edm-forum.org/egems/vol4/iss3/2 10 DOI: 10.13063/2327-9214.1232 Richesson et al.: Enabling Re-Use of Computable Phenotype Definitions Volume 4 (2016) Issue Number 3 losses incurred by sacrificing interoperability. Other curate authoritative phenotypes as a complement eGEMs principles might include that phenotype definitions to guideline development activities. Further, Generating Evidence & Methods should be placed in the public domain—regardless models of sharing behavior could be manufactured to improve patient outcomes of whether they derived from federally funded and made visible, such as online exchanges research, national quality reporting incentive between investigators that describe challenges or programs, or private ventures. Because phenotype observations in implementing particular definitions in definitions are developed for different purposes, certain settings. 
eGEMs populations, and settings, it is not feasible to define a Generating Evidence & Methods Protection from Risks to impr set of defini ove patient ou tions tcomes for all research needs. However, the explicit documentation and sharing of phenotype Inherent in understanding the motivation for sharing definitions and supporting evidence will enable is to understand what fears or hesitations research researchers to evaluate and select the best available investigators or project implementers might have. definitions for their populations and research needs. Anecdotally, the risks to sharing are concerns While there may be potential research integrity risks about publication, copyright, or inappropriate use. associated with using data or methods without Phenotype developers might not feel their definition full understanding of their limitations, a repository is of broad interest, thinking it too institution- or with information about the intent, maturity, and protocol specific to be of interest to other users, or limitations of particular phenotype definitions can they may have concern that it is not ready. These inform and empower potential users to use them factors need to be researched and understood in appropriately and at their own prudence. order to create stronger alternative inventives or beliefs that will motivate developers of phenotypes Vision of Shared Phenotype Definitions to share their definitions. The need for a shared or common vision has Discussion been identified as important success factors in collaborative projects. We provide a vision in the Computable phenotype definitions that are form of two scenarios that might motivate pan- developed and represented in an explicit and network or cross-use case sharing of phenotype standardized manner are necessary to ensure definitions (Box 1). the consistency of clinical populations sampled for different purposes. 
The use of semantically Communication, Marketing, and Engagement equivalent phenotype definitions can enable the Communication and marketing of a set of principles comparison of results across studies, and ensure and vision might enhance the engagement, that all patients can be reliably identified and offered participation, and support of stakeholders from evidence-based treatment options and opportunities multiple organizations and domains. Communication for research. We do not suggest that a single campaigns that inform potential users about the definition per condition is feasible, nor that one availability of existing computable phenotypes definition per use case will necessarily be sufficient. and increase their perception that reusing existing However, we do suggest that some minimum set of definitions will save them work, or produce a better definitions per condition can be identified to address definition (that has been previously tested) than the majority of use cases. It will be important to have they can do alone. Professional societies and medical resources and communication in place to ensure advocacy groups may choose to endorse and that the definitions are as accurate and scalable as Published by EDM Forum Community, 2016 11 eGEMs (Generating Evidence & Methods to improve patient outcomes), Vol. 4 [2016], Iss. 3, Art. 2 possible, and that users can identify the definitions build consensus in professional society guidelines in that are the best fit for their intended purposes. a rapid learning environment. Similarly, standardized processes to update and periodically revalidate Within research networks, member investigators definitions—as knowledge of disease increases, and have a vested interest in maintaining the health of as coded terminologies, EHRs and patterns of health the network, and therefore are well incentivized care delivery mature—will be required. 
to support policies, communication channels, and tools that enable and encourage the sharing and The creation of a culture for sharing, reusing, and reuse of phenotype definitions within the network. harmonizing phenotype definitions will require Sharing phenotype definitions across networks changes in thinking and behavior that can be or domains (e.g., from research to health care enhanced by the following call to action for quality improvement) will be more challenging researchers and clinicians: (1) champion cultural to motivate, as it involves multiple organizations changes and resource allocations that will enable the and complex systems whose incentive structures reuse of computable phenotype definitions where may differ. Evidence-based methods that support appropriate; (2) survey the landscape for existing the collaboration of diverse stakeholders to and previously validated definitions that will meet solve challenging problems in complex systems the particular need before creating a new definition, should be applied to support the sharing and and (3) provide phenotype definition logic and standardization of computable phenotypes between implementation performance or validation results, so health care and research. The lack of supporting that others can benefit from this knowledge. theories and methods for complex cross-boundary Box 2. Call to Action for Researchers and Clinicians collaborations illustrates a gap in learning health to Facilitate Learning Health Care Systems sciences that should be addressed. The learning health care paradigm will demand 1. Champion cultural changes and resource allocations continuous development and refinement of new that will enable the reuse of computable phenotype phenotypes to identify conditions of interest and definitions where appropriate. to reflect changes in health care practice and EHR 2. Survey the landscape for existing and previously validated definitions that will meet the particular need systems. 
Clinicians, health care administrators, before creating a new definition. investigators, and patients benefit from the use 3. Provide phenotype definition logic and implementation of explicitly defined and validated definitions performance or validation results, so that others can for sampling, potential research participant benefit from this knowledge. identification, and broader analyses using data from EHRs. Collaboration around the development The vision of shared phenotype definitions between of computable phenotypes for emerging diseases, especially where consensus in professional societies research and health care activities will ultimately is slow to emerge (e.g., the early years of HIV/ require governance structures to control curation of AIDS) or varies over time, e.g., the Diagnostic phenotype knowledge, raising a number of questions and Statistical Manual of Mental Disorders, Fifth that will need to be addressed: Who should be the Edition (DSM-5)’s new classification of the autism guardians of such knowledge—a centrally controlled spectrum—which is not concordant with prior federal agency or commercial entity, or both? What definitions, might expedite their investigation and are the types of criteria that would be used to http://repository.edm-forum.org/egems/vol4/iss3/2 12 DOI: 10.13063/2327-9214.1232 Richesson et al.: Enabling Re-Use of Computable Phenotype Definitions Volume 4 (2016) Issue Number 3 accept a phenotype definition into the repository? a knowledge base of implementation guidance, eGEMs Specifically, what gold standard evidence-based supporting tools that are user-friendly and intuitive, Generating Evidence & Methods practice guideline sources are deemed of sufficient to improve patient outcomes and a willingness to use them. We encourage quality to be acceptable as a basis for phenotype prospective researchers and health administrators definition? 
to reuse existing EHR-based condition definitions where appropriate and to share their results with The perceived benefits of shared phenotypes others to support a national culture of learning eGEMs might drive funding or advocacy for developing health care. A number of federally funded resources Generating Evidence & Methods and enhancing resources to support the sharing to improve patient outcomes support these activities, and research sponsors and reuse of computable phenotype definitions should encourage their use. across health care delivery and clinical research applications, but measurable results or return on Acknowledgments investment effort will be necessary to maintain them and motivate widespread use in learning The SEDI project and ancillary study described was health systems. Financial models for phenotype supported by Cooperative Agreement Number contributors and users will need to be explored. 1C1CMS331018-01-00 from the Department of Ultimately, the vision of shared phenotype definitions Health and Human Services, Centers for Medicare will only transpire if the libraries, knowledge bases, & Medicaid Services. This publication was also tools, and processes are usable and useful for users, made possible by the Patient Centered Outcomes and if the sharing of definitions creates efficiency for Research Institute (PCORI) and the National research and health care teams, as well as a synergy Institutes of Health (NIH) Common Fund, through a between them that benefits patients, payors, and cooperative agreement (U54 AT007748) from the other stakeholders. Office of Strategic Coordination within the Office of the NIH Director and the Duke CTSA (UL1TR001117). Conclusions and Call to Action Dr. 
Cameron is supported by grant 5 T32 DK007731 The implementation of learning health care systems (Duke Training Grant in Nephrology) is gaining momentum, and the ability to reproducibly The contents of this publication are solely the identify clinically equivalent patient populations is responsibility of the authors and do not necessarily critical to implementing and evaluating evidence- based treatments in health care systems. The use represent the official views of the Patient Centered of common or semantically equivalent phenotype Outcomes Research or the U.S. Department of definitions across research and health care use Health and Human Services or any of its agencies. cases can support this aim. A national infrastructure We are grateful to Shelley Rusincovitch of the Duke for reusing phenotype definitions and sharing Translational Medicine Institute for her ideas in the experience across health care delivery and clinical formulation of this commentary and to the members research applications will reduce duplicate efforts of the PSQ Core of the NIH Collaboratory for their and increase efficiencies. Both research and ideas and discussion related to achieving valid and provider communities need access to a collection of existing definitions, information to evaluate scalable phenotype definitions for a number of conditions. their appropriateness for particular applications, Published by EDM Forum Community, 2016 13 eGEMs (Generating Evidence & Methods to improve patient outcomes), Vol. 4 [2016], Iss. 3, Art. 2 15. Granger BB, Staton M, Peterson L, Rusincovitch SA. Prevalence References and Access of Secondary Source Medication Data: Evaluation of the Southeastern Diabetes Initiative (SEDI). AMIA Jt 1. NQF. Measure Developer Guide to Submitting Measures Summits Transl Sci Proc. 2015;2015:66-70. to NQF October 2013 2013. Available from: http://www. 16. Gregg EW, Li Y, Wang J, Burrows NR, Ali MK, Rolka D, et al. 
qualityforum.org/Measuring_Performance/Endorsed_ Changes in diabetes-related complications in the United Performance_Measures_Maintenance.aspx. States, 1990-2010. N Engl J Med. 2014;370(16):1514-23. 2. CMS. Chronic condition data warehouse: Centers for Medicare 17. Mo H, Thompson WK, Rasmussen LV, Pacheco JA, Jiang G, and Medicaid Services 2013 [cited 2013 May 27]. Available Kiefer R, et al. Desiderata for computable representations of from: http://www.ccwdata.org/index.htm. electronic health records-driven phenotype algorithms. J Am 3. CMS. eCQM Library: Centers for Medicare & Medicaid Services; Med Inform Assoc. 2015. 2015. Available from: https://www.cms.gov/regulations-and- 18. Sun J, McNaughton CD, Zhang P, Perer A, Gkoulalas-Divanis guidance/legislation/ehrincentiveprograms/ecqm_library.html. A, Denny JC, et al. Predicting changes in hypertension 4. NCQA. Healthcare Effectiveness Data and Information Set control using electronic health records from a chronic (HEDIS) Measures: National Committee for Quality Assurance; disease management program. J Am Med Inform Assoc. 2015 [cited 2015 September 28]. Available from: http://www. 2014;21(2):337-44. ncqa.org/HEDISQualityMeasurement/HEDISMeasures.aspx 19. ONC. Interoperability Standards Advisory (ISA): Office of the 5. CCS AH. Healthcare Cost and Utilization Project (HCUP). National Coordinator for Health IT; 2016 [updated January 19, Clinical Classifications Software (CCS) for ICD-9-CM. Rockville, 2016; cited 2016 April 4]. Available from: https://www.healthit. MD: Agency for Healthcare Research and Quality 2013. gov/standards-advisory. Available from: http://www.hcup-us.ahrq.gov/toolssoftware/ 20. Zozus MN, Hammond WE, Green BB, Kahn MG, Richesson RL, ccs/ccs.jsp. Rusincovitch SA, et al. Assessing Data Quality for Healthcare 6. Greene SM, Reid RJ, Larson EB. Implementing the learning Systems Data Used in Clinical Research (Version 1.0). An NIH health system: from concept to action. Ann Intern Med. 
Health Care Systems Research Collaboratory Phenotypes, 2012;157(3):207-10. Data Standards, and Data Quality Core White Paper. 2013 7. Richesson R.L., Smerek M. Electronic Health Records-Based [cited 2015 June 22]. Available from: https://sites.duke.edu/ Phenotyping. In: Collaboratory NR, editor. Rethinking Clinical rethinkingclinicaltrials/tag/data-quality/. Trials: A Living Textbook of Pragmatic Clinical Trials. Durham, 21. Holve E, Kahn M, Nahm M, Ryan P, Weiskopf N. A comprehensive framework for data quality assessment in CER. NC: Duke Clinical Research Institute; 2014. AMIA Jt Summits Transl Sci Proc. 2013;2013:86-8. 8. Richesson RL, Hammond WE, Nahm M, Wixted D, Simon 22. Kahn MG, Brown JS, Chun AT, Davidson BN, Meeker D, Ryan GE, Robinson JG, et al. Electronic health records based PB, et al. Transparent Reporting of Data Quality in Distributed phenotyping in next-generation clinical trials: a perspective Data Networks. eGEMs (Generating Evidence & Methods to from the NIH Health Care Systems Collaboratory. J Am Med improve patient outcomes) 2015;3(1):Article 7. Inform Assoc. 2013;20(e2):e226-31. 23. CMS. Quality Measure Development and Management 9. Collaboratory. NHCSR. Rethinking Clinical Trials. A Living Overview: Centers for Medicare & Medicaid Services; 2016 Textbook of Pragmatic Clinical Trials. Durham: Duke University; [cited 2016 April 11]. Available from: https://www.cms.gov/ Medicare/Quality-Initiatives-Patient-Assessment-Instruments/ 10. Rusincovitch SA. Practical Development and Implementation MMS/Downloads/Quality-Measure-Development-Lifecycle- of EHR Phenotypes. Presented on NIH Collaboratory Grand Overview.pdf. Rounds, November 15, 2013. 2013. Available from: https://www. 24. University. V. PheKB 2012 [cited 2013 May 24]. Available from: nihcollaboratory.org/Pages/Grand-Rounds-11-15-13.aspx. http://www.phekb.org/. 11. Fleurence RL, Curtis LH, Califf RM, Platt R, Selby JV, Brown 25. Xu J, Rasmussen LV, Shaw PL, Jiang G, Kiefer RC, Mo H, JS. 
Launching PCORnet, a national patient-centered clinical et al. Review and evaluation of electronic health records- research network. J Am Med Inform Assoc. 2014;21(4):578-82. driven phenotype algorithm authoring tools for clinical and 12. Nichols GA, Desai J, Elston Lafata J, Lawrence JM, O’Connor translational research. J Am Med Inform Assoc. 2015. PJ, Pathak RD, et al. Construction of a multisite DataLink using 26. PheMA. PheMA wiki: Phenotype Execution Modeling electronic health records for the identification, surveillance, Architecture project Mayo Clinic; 2015 [cited 2015 September prevention, and management of diabetes mellitus: the 28]. Available from: http://informatics.mayo.edu/phema/index. SUPREME-DM project. Preventing chronic disease. 2012;9:E110. php/Main_Page. 13. Prieto-Centurion V, Rolle AJ, Au DH, Carson SS, Henderson 27. Hripcsak G, Duke JD, Shah NH, Reich CG, Huser V, Schuemie AG, Lee TA, et al. Multicenter study comparing case definitions MJ, et al. Observational Health Data Sciences and Informatics used to identify patients with chronic obstructive pulmonary (OHDSI): Opportunities for Observational Researchers. Stud disease. Am J Respir Crit Care Med. 2014;190(9):989-95. Health Technol Inform. 2015;216:574-8. 14. Curtis LH, Weiner MG, Boudreau DM, Cooper WO, Daniel GW, 28. Mini-Sentinel. Distributed Database and Common Data Model: Nair VP, et al. Design considerations, architecture, and use of Mini-Sentinel Coordinating Center; 2015 [updated 1/9/2015; the Mini-Sentinel distributed data system. Pharmacoepidemiol cited 2015 Aug 30]. Available from: http://mini-sentinel.org/ Drug Saf. 2012;21 Suppl 1:23-31. data_activities/distributed_db_and_data/details.aspx?ID=105. http://repository.edm-forum.org/egems/vol4/iss3/2 14 DOI: 10.13063/2327-9214.1232 Richesson et al.: Enabling Re-Use of Computable Phenotype Definitions Volume 4 (2016) Issue Number 3 29. OHSDI. OMOP Common Data Model 2015 [cited 2015 34. OHDSI. 
Criteria Assessment Logic for Your Population Study August 30]. Available from: http://www.ohdsi.org/data- in Observational data (CALYPSO): Observational Health Data eGEMs standardization/the-common-data-model/. Sciences and Informatics; 2015 [updated August 8, 2015; cited Generating Evidence & Methods 30. PCORnet. PCORnet Common Data Model (CDM). Why, What, 2015 September 28]. Available from: http://www.ohdsi.org/ to improve patient outcomes and How? : PCORnet Coordinating Center; 2015 [cited 2015 analytic-tools/calypso-for-study-population-evaluation/. Aug 30]. Available from: http://www.pcornet.org/pcornet- 35. Wilcox A, Randhawa G, Embi P, Cao H, Kuperman G. common-data-model/. Sustainability Considerations for Health Research and Analytic 31. Jiang G, Solbrig HR, Kiefer R, Rasmussen LV, Mo H, Speltz P, Data Infrastructures. eGEMs (Generating Evidence & Methods et al. A Standards-based Semantic Metadata Repository to to improve patient outcomes). 2014;2(2):Article 8. Support EHR-driven Phenotype Authoring and Execution. 36. Bozeman B, Baoardman C. An Evidence-Based Assessment of eGEMs Stud Health Technol Inform. 2015;216:1098. Research Collaboration and Team Science: Patterns in Industry Generating Evidence & Methods 32. Richesson RL. An Informatics Framework for the Standardized and University-Industry Partnerships. Paper commissioned to improve patient outcomes Collection and Analysis of Medication Data in Networked for the National Research Council Study of the Science of Research: Journal of Biomedical Informatics - in press; 2014. Team Science. Workshop on Institutional and Organizational 33. OHDSI. CALYPSO (Criteria Assessment Logic for Your Supports for Team Science; October 24; Washington, D.C.: The Population Study in Observational data): Observational Health National Academies of Sciences, Engineering, and Medicine; Data Science and Informatics; 2016 [cited 2016 April 11]. 2013. Available from: http://www.ohdsi.org/analytic-tools/calypso- 37. 
Papadaki M, Hirsch G. Curing consortium fatigue. Sci Transl for-study-population-evaluation/. Med. 2013;5(200):200fs35. Published by EDM Forum Community, 2016 15 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png eGEMs Pubmed Central

A Framework to Support the Sharing and Reuse of Computable Phenotype Definitions Across Health Care Delivery and Clinical Research Applications

eGEMs, Volume 4 (3) – Jul 5, 2016


Publisher: PubMed Central
ISSN: 2327-9214
eISSN: 2327-9214
DOI: 10.13063/2327-9214.1232

Abstract

Introduction: The ability to reproducibly identify clinically equivalent patient populations is critical to the vision of learning health care systems that implement and evaluate evidence-based treatments. The use of common or semantically equivalent phenotype definitions across research and health care use cases will support this aim. Currently, there is no single consolidated repository for computable phenotype definitions, making it difficult to find all definitions that already exist, and also hindering the sharing of definitions between user groups.

Method: Drawing from our experience in an academic medical center that supports a number of multisite research projects and quality improvement studies, we articulate a framework that will support the sharing of phenotype definitions across research and health care use cases, and highlight gaps and areas that need attention and collaborative solutions.

Framework: An infrastructure for re-using computable phenotype definitions and sharing experience across health care delivery and clinical research applications includes: access to a collection of existing phenotype definitions, information to evaluate their appropriateness for particular applications, a knowledge base of implementation guidance, supporting tools that are user-friendly and intuitive, and a willingness to use them.

Next Steps: We encourage prospective researchers and health administrators to re-use existing EHR-based condition definitions where appropriate and share their results with others to support a national culture of learning health care. There are a number of federally funded resources to support these activities, and research sponsors should encourage their use.

Acknowledgements

The SEDI project and ancillary study described were supported by Cooperative Agreement Number 1C1CMS331018-01-00 from the Department of Health and Human Services, Centers for Medicare & Medicaid Services. This publication was also made possible by the Patient Centered Outcomes Research Institute (PCORI) and the National Institutes of Health (NIH) Common Fund, through a cooperative agreement (U54 AT007748) from the Office of Strategic Coordination within the Office of the NIH Director and the Duke CTSA (UL1TR001117). Dr. Cameron is supported by grant 5 T32 DK007731 (Duke Training Grant in Nephrology). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the Patient Centered Outcomes Research Institute or the U.S. Department of Health and Human Services or any of its agencies. We are grateful to Shelley Rusincovitch of the Duke Translational Medicine Institute for her ideas in the formulation of this commentary and to the members of the PSQ Core of the NIH Collaboratory for their ideas and discussion related to achieving valid and scalable phenotype definitions for a number of conditions.

Keywords

Computable Phenotypes; Electronic Health Records; Data Standards; Learning Health Care Systems

This model/framework is available at EDM Forum Community: http://repository.edm-forum.org/egems/vol4/iss3/2

Disciplines

Health Information Technology

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.

A Framework to Support the Sharing and Reuse of Computable Phenotype Definitions Across Health Care Delivery and Clinical Research Applications

Rachel L. Richesson, PhD (i); Michelle M. Smerek (ii); C. Blake Cameron, MD (iii)

(i) Duke University School of Nursing; (ii) Duke Clinical Research Institute; (iii) Duke University School of Medicine

Published by EDM Forum Community, 2016. eGEMs (Generating Evidence & Methods to improve patient outcomes), Vol. 4 [2016], Iss. 3, Art. 2

Introduction

Computable phenotypes, or electronic health record (EHR)-based condition definitions, enable the identification of cohorts of patients with certain diseases or clinical profiles for disease management registries, quality improvement programs, evaluation studies, and interventional research. Regardless of the application, cohort identification requires queries of clinical data stores that are both valid and reproducible. Currently, there is no single consolidated repository for computable phenotypes, making it difficult to find all definitions that already exist, and also hindering the sharing of definitions between user groups. Health services researchers and quality assessment groups (i.e., the National Quality Forum (NQF), the National Committee for Quality Assurance (NCQA), the Centers for Medicare & Medicaid Services (CMS), and the Agency for Healthcare Research and Quality (AHRQ)) provide computable phenotypes on a number of websites [1-5]. In addition, researchers and registry developers create definitions utilizing different design and evaluation methods. Because the definitional logic is often underspecified or unreported in scientific journals, it is not clear if the findings reported in published research or quality improvement are comparable or relevant to clinical populations, hindering the application of evidence-based medical and nursing care.

We believe that a minimal set of well-constructed and explicit EHR-based phenotype definitions will create efficiencies for health care organizations that must increasingly support growing numbers of data requests related to comparative effectiveness research (CER), quality improvement, and chronic disease management. We further believe that such a set will facilitate synergies between research and care delivery, enabling “learning health care” practices and subsequently improving patient outcomes. A large-scale and multipurpose approach to sharing phenotype definitions will support the reuse of well-constructed and validated computable phenotypes, and will subsequently reduce the variation in definitions across conditions. Drawing on our experience at an academic medical center supporting a number of multisite research projects, we articulate a framework that will support the sharing of phenotype definitions across research and health care use cases, and highlight gaps or areas that need attention and collaborative solutions.

Background and Context

A “computable phenotype” is a definition of a condition, disease, characteristic, or clinical event that is based solely on data that can be processed by a computer. Computable phenotype definitions provide the specifications to identify populations of patients with conditions of interest, and can be combined with other criteria, such as age or other demographic information, to develop cohort populations for a variety of purposes.

Quality monitoring organizations (such as NQF, NCQA, and AHRQ) create computable phenotype definitions for the development and monitoring of health care quality measures. A number of research networks have developed phenotype definitions to enable the use of EHR data for observational research (including comparative effectiveness studies) and interventional trials [8-11]. Various multisite studies [12,13] use these definitions to develop registries for drug safety surveillance or chronic disease management. There are numerous and distinct use cases for computable phenotypes for health care delivery (e.g., personalized medicine, guideline-based care, chronic disease management, and quality measurement) and biomedical research (genomic, observational, CER, health services research, and interventional trials). Each use case represents different scientific disciplines whose phenotype development efforts have heretofore been undertaken in isolation, without the benefit of cooperation. Further, there are no standards of practice that encourage reuse of existing definitions or the use of common definitions for health care delivery and research uses.

The lack of coordination of phenotype definitions among researchers, clinicians, and administrators has led to the unintentional proliferation of numerous definitions for many conditions and clinical profiles. Because each definition applies different logic (e.g., various combinations of diagnosis or procedure codes, medications, or laboratory tests) for querying EHR data, the resulting cohorts are often not directly comparable. It is unknown how much semantic variation in definitions actually exists, because this information is often underspecified in

quality improvement programs might be identifying populations that are different from those used in the development of the evidence upon which those supporting treatment strategies and interventions are based.

The “research informs practice informs research” cycle that is the essence of learning health care systems entails that the clinical features used to define research and patient populations be well understood and comparable. Hence semantically equivalent phenotype definitions must be used to identify clinically equivalent populations. We believe that creating a centralized collection of explicitly defined computable phenotypes, with an accompanying knowledge base of development and validation documentation, is the first step toward consolidating effort and harmonizing definitions.
research publications. A recent report on national Information, resources, and tools that facilitate trends in diabetes specifically lists several related the reuse of existing phenotypes will reduce the conditions (including hypoglycemia, neuropathy, variation in phenotype definitions across all use chronic kidney disease, peripheral vascular disease, cases, facilitate conversations between health care cognitive decline, cancers, and even differentiating and research communities about how to compare type 1 from type 2 diabetes) whose prevalence definitions for different use cases, and ultimately could not be reported due to inconsistent EHR lead to harmonization of definitions that will simplify documentation and definitions across the United and support the identification of clinically equivalent States. The consequent likelihood that research, populations for research and health care purposes. patient care, and quality measurement communities are using different phenotype definitions for the Framework Components same condition is more concerning. The COPD Outcomes-based Network for Clinical Effectiveness The reuse of phenotype definitions can be facilitated & Research Translation (CONCERT) assessed 980 by their explicit representation and tools to support patients sampled from various EHR systems using a their evaluation and implementation in new clinical phenotype definition for chronic obstructive applications. We propose that the deliberate and pulmonary disease (COPD), and found that just over informed reuse of existing definitions will require half of those met the criteria for the well-accepted four components: (1) searchable libraries of explicitly research definition for the condition. 
Further, they defined phenotype definitions; (2) supporting found that the patient populations retrieved by knowledge bases with information and methods; (3) the clinical and research definitions for COPD had tools to identify, evaluate, and implement existing significantly different comorbidities and risk factors. phenotype definitions; and (4) motivated users and stakeholders to use them (Fig 1). This implies that disease management registries and Published by EDM Forum Community, 2016 3 eGEMs (Generating Evidence & Methods to improve patient outcomes), Vol. 4 [2016], Iss. 3, Art. 2 Figure 1. Overview of Framework to Support the Reuse of Phenotype Definitions in Learning Health Care Systems Research Informs Practice RESEARCH HEALTH CARE Phenotype Phenotype Definition Definition LEARNING HEALTCARE SYSTEMS Tools Tools Practice Informs Research LIBRARY OF COMPUTABLE PHENOTYPES Definition | Purpose | Metadata | Validation results | Data features | Implementation experience KNOWLEDGE BASE Information | Evidence | Methods | Case Studies MOTIVATION Shared values | Shared vision | Perceived benefits | Incentives | Protections Networks Stakeholders Healthcare Systems http://repository.edm-forum.org/egems/vol4/iss3/2 4 DOI: 10.13063/2327-9214.1232 Richesson et al.: Enabling Re-Use of Computable Phenotype Definitions Volume 4 (2016) Issue Number 3 hypertension control. Without clear supporting Searchable Libraries of Phenotype eGEMs Definitions documentation, clinical subject matter experts Generating Evidence & Methods may reject, as lacking face validity, well-validated to improve patient outcomes The sharing of information about computable phenotype definitions that do not match their phenotype definitions will allow implementers to expectations or intuition. Clinical practice and reuse appropriate existing definitions rather than disease definitions change over time. Therefore, creating their own. 
This requires access to an ample phenotype definitions in a phenotype library should eGEMs set of phenotype definitions, along with information reference the underlying clinical definitions or Generating Evidence & Methods that enables them to be evaluated and easily to improve patient outcomes guidelines upon which they are based, in order to implemented. The ideal library should be indexed better identify legacy definitions that are out of date. so that users can search by a number of different In addition, phenotype definitions should conform features including the clinical condition; the data to existing required and emerging terminologies elements; logic; the intended use case; limitations; and standards—e.g., SNOMED CT, LOINC, RxNorm, and orientation toward precision, sensitivity or LOINC, NDF-RT—for representing clinical data, as specificity. endorsed by the Office of the National Coordinator. Adherence to standards allows for a modular design Mo and colleagues call for a formal computable that reduces development and implementation representation of phenotype definitions that will costs, particularly at scale where multiple use cases enable scalability of the definitions by allowing 17 for that standard may exist concurrently. them to be applied to different data systems. Their desiderata includes the following: human-readable Because phenotype definitions might perform and computable forms, structured rules, formalisms differently when implemented in different patient for temporal relations, representations for text populations and EHR systems, information about the searching and natural language processing, and performance of phenotypes in specific organizations interfaces for external software algorithms. They should be collected from implementers and shared endorse the use of standardized terminologies, with future users. Implementation information is ontologies, and also the reuse of value sets. 
necessary to understand how standard definitions perform across diverse populations, heterogeneous Additional information can be included in the library organizations and EHRs systems. Specifically, or underlying knowledge base to support users’ information about the underlying population and semantic understanding of the phenotype definition, quality (i.e., completeness, accuracy, consistency) of and to enable selection of the appropriate definition data that were used to validate the definitions have to identify patient cohorts with the intended clinical important implications for interpreting the validation features. Therefore, the definitions in a phenotype results. For example, if a test population had 50 library should include metadata or supporting percent missing data in one of the defining variables information about a definition, its intended use, the clinical rationale or research justification for for the phenotype, the provision of this information provides important contextual information about the definition, and data about clinical and scientific the definition’s performance. Similarly, the testing validation in various health care settings. As an example, actual blood pressure measurements, of phenotype definitions in populations with even when they are available for long periods, did high versus low prevalence of disease will yield not contribute significantly to predictive models for different results. Recommendations for data quality Published by EDM Forum Community, 2016 5 eGEMs (Generating Evidence & Methods to improve patient outcomes), Vol. 4 [2016], Iss. 3, Art. 2 assessment reporting in pragmatic trials and the current phenotype life cycle stage, and should 21,22 observational research can provide insight into include the status (e.g., in development, draft, final), which data quality dimensions (e.g., completeness, as well as tracking the version number or date of last accuracy) might be most useful to evaluate the revision. 
Phenotypes could be marked as retired or phenotype definition. archived in cases where clinical practice changes or the underlying clinical definitions or data standards To maximize the socialization and collaboration become out of date. around shared phenotypes, the ideal phenotype library should support communication between The Phenotype Knowledge Base (PheKB) is a large phenotype developers and implementers. The and well-indexed portal for hosting computable Centers for Medicare & Medicaid Services (CMS) phenotypes, though enhancements are needed to employs a standardized approach for enabling accommodate the above information requirements. users to post questions and share comments, and PheKB includes human-readable definitions and for maintaining quality measure definitions across machine-readable code in some cases, but the multiple programs. Such a framework could code is not fully executable across heterogeneous be adapted for use with computable phenotype EHR systems. The PheKB does have an interface libraries. During early development, draft phenotype for reporting contextual data and performance specifications could be posted in a library for public metrices of phenotype definitions, but a useful and review to evaluate feasibility and refine use cases. usable display of these data is not yet standardized. During the validation phase, the testing methodology Also, it is not known how widely PheKB is used could be opened to public comment. Once validated, outside of the Electronic Medical Records and the library could facilitate communications of Genomics (eMERGE) Network or Pharmacogenetic best practices and feedback. This would allow Research Network, whose goals are to implement implementers to share information about their decision support around clinically actionable experiences implementing phenotype definitions in genetic variants for clinical conditions. 
While several their local systems, and allow others to ask questions National Institutes of Health (NIH) Collaboratory to inform the many practical decisions that are made and National Patient-Centered Clinical Research when implementing abstract logic in local data Network (PCORnet) investigators have added their systems. A collaborative or interactive component phenotype definitions to PheKB, an increased uptake would also allow users to relate their experience of PheKB by other research and clinical groups will implementing definitions in different vendor systems require targeted marketing. Broader usage of PheKB and in different patient populations. Over time, the might drive enhancements to the PheKB resource, library could collect data on usage and impact, but also will likely increase the number of user and aggregate published literature based on each requirements. Several authoring tools exist, including phenotype. A record of projects that have used the PheMA project, which provides generalizable or endorsed different phenotype definitions can computable representations and automated enhance understanding of phenotype intent and mapping tools. Other research networks, such as performance, and can assist potential implementers Observational Health Data Sciences and Informatics in the selection of appropriate phenotypes. 27 11 (OHDSI) and PCORnet, include dedicated Because phenotype definitions are dynamic, the phenotype working groups and internal inventories library should reference a phenotype life cycle, of phenotype definitions. 
Knowledge Base of Information and Methods

Researchers and health care organizations need information about how to develop, evaluate, and implement phenotype definitions. The particular use case influences the nature of the phenotype definition and system requirements. For example, in quality measurement, the purpose of the phenotype definition is to identify “bread and butter” instances of a particular condition. Patients whose disease status is negative or indeterminate are excluded. By contrast, genomic research usually aims to reliably identify both cases and controls (negative cases). The phenotype definition must identify with reasonable certainty not only patients who have the condition (i.e., have adequate sensitivity), but also patients who clearly do not have the condition (i.e., high specificity). Definitions used in disease management registries or population health promotion activities need higher sensitivity at the cost of specificity, whereas CER requires higher specificity and precision. Guidance from different health care and research communities can inform users about important features and performance thresholds for phenotype definitions for different use cases.

Information to clarify data dependencies and implementation requirements is needed to facilitate the sharing of phenotypes across groups. For example, some definitions include natural language processing (NLP) components that might not be feasible for some target systems. The information in the knowledge base can include methods and case studies from projects that have implemented the definitions in multiple organizations; their customizations and lessons learned can inform future users. Evidence-based practice guidelines that include justification for a definition’s logic, as well as the definition of the “gold standard” for validation of EHR-based phenotype definitions, should also be available.

Although PheKB does include some “knowledge” in the form of phenotype development methods and validation protocols, it is limited and not tailored for different phenotype users. Information for a broader range of use cases is needed. Rethinking Clinical Trials: The Living Textbook of Pragmatic Clinical Trials provides a model for disseminating information in the form of “lessons learned” and case studies, rather than as empirical research. Many other research-network websites and collaborative networks perform this function, but a central portal to the knowledge from various networks would support potential implementers from multiple domains.

Tools

Formal representations of computable phenotypes, mappings to reference coding systems and (common) information models, and executable code can support the implementation of definitions in different populations. Mo’s desiderata highlight recommendations for clinical data representation to support phenotyping. These specifically call for structuring clinical data into queryable forms and using a common data model to support customization for the variability and availability of EHR data among sites. Since there currently are a number of (different) common data models used in research networks,[28-30] there is a need for tools and platforms to implement a given phenotype definition in different contexts. Knowledge, authoring tools, and vocabulary mapping tools to support these activities can also be centrally available through a shared knowledge base or links to a code sharing base like GitHub. Similarly, the implementation of these definitions requires terminology mappings (e.g., from drug class names in NDF-RT and medication sets from RxNorm to National Drug Code (NDC) product codes). Terminology integration resources, such as RxNorm, the Unified Medical Language System (UMLS), and UMLS Terminology Services (UTS) tools, can benefit phenotype use cases in many networks. To be more broadly used, these tools should be centrally available with supporting instructions for people from many different domains and levels of technical expertise.

Specifically, tools are needed for the following uses: (1) searching for phenotype definitions that are endorsed or mandated; (2) browsing existing phenotypes to find ones that are potentially relevant and can be reused; (3) the display of relevant information to help potential implementers understand existing definitions and their strengths and limitations for particular uses; (4) the implementation of those definitions in local EHR systems with, e.g., executable code tailored to common data models, or mappings between coding systems; (5) developing new phenotype definitions if needed; and (6) reporting implementation results (along with characteristics of test data sets) for others to view (Table 1).

Table 1. Types of Tools and Functionality Required to Support the Sharing and Reuse of Computable Phenotype Definitions Across Health Care Delivery and Clinical Research Applications

FUNCTION | PURPOSE | EXAMPLE OR POTENTIAL TOOL
Search for phenotype definitions. | Identify validated or endorsed phenotype definitions. | PheKB
Browse for phenotype definitions. | Assess landscape. | PheKB
Display pertinent context information. | Aid potential implementers in assessing a definition’s fit for their use case. | needed*
Provide executable code in different formats (SQL, SAS, R, etc.) and crosswalks for mapping between different coding systems. | Implement phenotype definitions in heterogeneous systems. | PheKB, GitHub
Develop new phenotype definitions. | Create new definitions when existing ones aren’t a good fit. | PheMA, CALYPSO
Display implementation results with characteristics of the data in which phenotypes were implemented. | Provide additional information users need to consider when determining whether a definition is a good fit for their use case. | needed*

Note: *This represents a gap where tooling is needed. We are not aware of existing tools that support this function.

We see gaps and unmet needs in all areas except for phenotype development. At least two scalable authoring tools exist: PheMA with its execution support and OHDSI’s CALYPSO (Criteria Assessment Logic for Your Population Study in Observational data).[33,34] Xu et al. provide a detailed inventory of other search and authoring tools. In addition to guided phenotype authoring tools based on the underlying model of the phenotype library, other tools theoretically could support an “import and transformation” process that could take existing definitions developed locally (with local tools) and store them in the central repository for others to access and use.

The learning health system cannot exist on phenotypes alone. Any phenotype library would need to provide a service-based API that other computable clinical “services” might be able to access in a standardized way, e.g., electronic clinical-trial management tools that might access existing phenotype definitions in order to define the inclusion and exclusion criteria for a research trial. A number of functional components, e.g., standard models and vocabulary services, would in turn be needed to fully support the reuse of phenotype definitions on a grand scale.

Motivated Users and Stakeholders

The sharing of definitions and experience will require deliberate action on the part of potential phenotype developers and implementers, and useful and intuitive tools can support this behavior. Aligning existing computable phenotypes with users’ needs will likely positively influence their uptake, as will engaging all stakeholders in the design and development of phenotype resources and tools described in this framework. Additionally, a number of approaches can be used to motivate individuals to search for existing definitions and to share the outcomes of computable phenotype implementations. Possible approaches include creating incentives, increasing perceived benefit, establishing new social norms, or establishing policies or regulations.

Perceived Benefits and Value

Collaboration is fostered when the collaborators expect or perceive a beneficial outcome. The more beneficial or significant the outcome, the higher the participation and commitment level among collaborators will be. Wilcox et al. assert that the costs for sustaining research infrastructure can be covered if value can be created. Thus, clear demonstrations of reduced workload, reduced costs, or faster development resulting from the reuse of phenotype definitions might motivate potential users.

Incentives

Tangible incentives can be created through policy or legislation. Examples include quality reporting incentives (e.g., the CMS Physician Quality Reporting System and the financial rewards of the Meaningful Use program), and punitive consequences for noncompliance with Food and Drug Administration (FDA) reporting specifications. Although these types of incentives might be effective, they are time-consuming and expensive to achieve. Alternative incentives might derive from peer pressure from the scientific community to report phenotype definitions as part of the research protocol or study results reporting in publications, or from rewards for such behavior from research sponsors or in academic promotion rubrics.

Shared Values and Principles

A set of agreed-upon assumptions and principles for research networks, sponsors, and health care regulators to adopt is the first step in addressing the complex challenges to reusing phenotype definitions. These should include a stated commitment to reproducible science and the standardized reporting of phenotype definitions, use case, and validation results. Additional principles could include an expectation that users of computable phenotypes will search for and consider existing definitions before creating their own. For conditions where a phenotype definition already exists, researchers should carefully consider whether the benefit of developing new definitions tailored to their specific use cases outweighs the losses incurred by sacrificing interoperability. Other principles might include that phenotype definitions should be placed in the public domain, regardless of whether they derived from federally funded research, national quality reporting incentive programs, or private ventures. Because phenotype definitions are developed for different purposes, populations, and settings, it is not feasible to define a set of definitions for all research needs. However, the explicit documentation and sharing of phenotype definitions and supporting evidence will enable researchers to evaluate and select the best available definitions for their populations and research needs. While there may be potential research integrity risks associated with using data or methods without full understanding of their limitations, a repository with information about the intent, maturity, and limitations of particular phenotype definitions can inform and empower potential users to use them appropriately and at their own discretion.

Vision of Shared Phenotype Definitions

The need for a shared or common vision has been identified as an important success factor in collaborative projects. We provide a vision in the form of two scenarios that might motivate pan-network or cross-use case sharing of phenotype definitions (Box 1).

Box 1. User Scenarios that Illustrate Benefit of Shared Phenotype Definitions

Scenario 1. An intervention specialist working for the Southeastern Diabetes Initiative (SEDI) wants to identify patients with type 2 diabetes across a number of health care providers in order to develop treatment programs and community interventions that will improve diabetes care. The specialist needs operational definitions for type 2 diabetes, as well as a number of associated conditions such as hypertension and chronic kidney disease. She goes to a central phenotype library and finds definitions for each condition that are appropriate for broad population screening and that can be implemented in all the SEDI sites, including one with no capacity for accessing clinical notes. She shares a link for each selected phenotype definition, plus implementation guidance and appropriate code, with the data specialists at each SEDI site. Each site implements the definition and reports its results to the phenotype library. One SEDI site had problems with the code and reported this experience as well. The original developer of the phenotype contacted the SEDI site with a suggestion. This suggestion was helpful and was therefore added to the knowledge base for other SEDI sites to access and review. Later, the study was published in a journal and referenced the link to the computable phenotype logic and supporting implementation tools. Using these definitions and tools, a new group of researchers replicated the intervention in an urban population on the West Coast and published their findings.

This scenario was enabled by the following:
1. Searchable libraries of explicitly defined phenotype definitions;
2. Supporting knowledge bases with information and methods;
3. Supporting tools; and
4. Users and stakeholders motivated to consider reusing existing definitions; benefits from reuse and shared phenotype definitions were realized by the users.

Scenario 2. A clinician reviews the literature and finds a study of a new medical intervention for uncontrolled hypertension. She wants to implement it on a similar population in her clinic. The published article includes a narrative discussion of the inclusion and exclusion criteria (e.g., includes diagnosis of hypertension and excludes chronic kidney disease) with hyperlinks to a public phenotype library that hosts the computable phenotype specifications for the intervention population. The clinician points her data analyst to the phenotype specifications and requests a data warehouse query to estimate the number of patients that might be eligible for the planned intervention. After obtaining the required institutional approvals, she implements the intervention and conducts a formal quality improvement study. She publishes that study and references a public link to the phenotype library and knowledge base for the specific computable phenotype-definition logic and supporting implementation tools. Future implementers access the library for implementation details, rather than contacting this clinical investigator, allowing her more time to research and plan new chronic disease management interventions.

This scenario was enabled by the following:
1. Searchable libraries of explicitly defined phenotype definitions;
2. Supporting knowledge bases with information and methods;
3. Supporting tools; and
4. Users and stakeholders motivated to consider reusing existing definitions; benefits from reuse and shared phenotype definitions were realized by the users.

Communication, Marketing, and Engagement

Communication and marketing of a set of principles and vision might enhance the engagement, participation, and support of stakeholders from multiple organizations and domains. Communication campaigns can inform potential users about the availability of existing computable phenotypes and increase their perception that reusing existing definitions will save them work, or produce a better definition (one that has been previously tested) than they could develop alone. Professional societies and medical advocacy groups may choose to endorse and curate authoritative phenotypes as a complement to guideline development activities. Further, models of sharing behavior could be manufactured and made visible, such as online exchanges between investigators that describe challenges or observations in implementing particular definitions in certain settings.

Protection from Risks

Understanding the motivation for sharing also requires understanding what fears or hesitations research investigators or project implementers might have. Anecdotally, the perceived risks of sharing are concerns about publication, copyright, or inappropriate use. Phenotype developers might not feel their definition is of broad interest, thinking it too institution- or protocol-specific to be of interest to other users, or they may be concerned that it is not ready. These factors need to be researched and understood in order to create stronger alternative incentives or beliefs that will motivate developers of phenotypes to share their definitions.

Discussion

Computable phenotype definitions that are developed and represented in an explicit and standardized manner are necessary to ensure the consistency of clinical populations sampled for different purposes. The use of semantically equivalent phenotype definitions can enable the comparison of results across studies, and ensure that all patients can be reliably identified and offered evidence-based treatment options and opportunities for research. We do not suggest that a single definition per condition is feasible, nor that one definition per use case will necessarily be sufficient. However, we do suggest that some minimum set of definitions per condition can be identified to address the majority of use cases. It will be important to have resources and communication in place to ensure that the definitions are as accurate and scalable as possible, and that users can identify the definitions that are the best fit for their intended purposes.

Within research networks, member investigators have a vested interest in maintaining the health of the network, and therefore are well incentivized to support policies, communication channels, and tools that enable and encourage the sharing and reuse of phenotype definitions within the network. Sharing phenotype definitions across networks or domains (e.g., from research to health care quality improvement) will be more challenging to motivate, as it involves multiple organizations and complex systems whose incentive structures may differ. Evidence-based methods that support the collaboration of diverse stakeholders to solve challenging problems in complex systems should be applied to support the sharing and standardization of computable phenotypes between health care and research. The lack of supporting theories and methods for complex cross-boundary collaborations illustrates a gap in learning health sciences that should be addressed.

The learning health care paradigm will demand continuous development and refinement of new phenotypes to identify conditions of interest and to reflect changes in health care practice and EHR systems.

build consensus in professional society guidelines in a rapid learning environment. Similarly, standardized processes to update and periodically revalidate definitions (as knowledge of disease increases, and as coded terminologies, EHRs, and patterns of health care delivery mature) will be required.

The creation of a culture for sharing, reusing, and harmonizing phenotype definitions will require changes in thinking and behavior that can be enhanced by the following call to action for researchers and clinicians: (1) champion cultural changes and resource allocations that will enable the reuse of computable phenotype definitions where appropriate; (2) survey the landscape for existing and previously validated definitions that will meet the particular need before creating a new definition; and (3) provide phenotype definition logic and implementation performance or validation results, so that others can benefit from this knowledge.

Box 2. Call to Action for Researchers and Clinicians to Facilitate Learning Health Care Systems
1. Champion cultural changes and resource allocations that will enable the reuse of computable phenotype definitions where appropriate.
2. Survey the landscape for existing and previously validated definitions that will meet the particular need
Clinicians, health care administrators, before creating a new definition. investigators, and patients benefit from the use 3. Provide phenotype definition logic and implementation of explicitly defined and validated definitions performance or validation results, so that others can for sampling, potential research participant benefit from this knowledge. identification, and broader analyses using data from EHRs. Collaboration around the development The vision of shared phenotype definitions between of computable phenotypes for emerging diseases, especially where consensus in professional societies research and health care activities will ultimately is slow to emerge (e.g., the early years of HIV/ require governance structures to control curation of AIDS) or varies over time, e.g., the Diagnostic phenotype knowledge, raising a number of questions and Statistical Manual of Mental Disorders, Fifth that will need to be addressed: Who should be the Edition (DSM-5)’s new classification of the autism guardians of such knowledge—a centrally controlled spectrum—which is not concordant with prior federal agency or commercial entity, or both? What definitions, might expedite their investigation and are the types of criteria that would be used to http://repository.edm-forum.org/egems/vol4/iss3/2 12 DOI: 10.13063/2327-9214.1232 Richesson et al.: Enabling Re-Use of Computable Phenotype Definitions Volume 4 (2016) Issue Number 3 accept a phenotype definition into the repository? a knowledge base of implementation guidance, eGEMs Specifically, what gold standard evidence-based supporting tools that are user-friendly and intuitive, Generating Evidence & Methods practice guideline sources are deemed of sufficient to improve patient outcomes and a willingness to use them. We encourage quality to be acceptable as a basis for phenotype prospective researchers and health administrators definition? 
to reuse existing EHR-based condition definitions where appropriate and to share their results with The perceived benefits of shared phenotypes others to support a national culture of learning eGEMs might drive funding or advocacy for developing health care. A number of federally funded resources Generating Evidence & Methods and enhancing resources to support the sharing to improve patient outcomes support these activities, and research sponsors and reuse of computable phenotype definitions should encourage their use. across health care delivery and clinical research applications, but measurable results or return on Acknowledgments investment effort will be necessary to maintain them and motivate widespread use in learning The SEDI project and ancillary study described was health systems. Financial models for phenotype supported by Cooperative Agreement Number contributors and users will need to be explored. 1C1CMS331018-01-00 from the Department of Ultimately, the vision of shared phenotype definitions Health and Human Services, Centers for Medicare will only transpire if the libraries, knowledge bases, & Medicaid Services. This publication was also tools, and processes are usable and useful for users, made possible by the Patient Centered Outcomes and if the sharing of definitions creates efficiency for Research Institute (PCORI) and the National research and health care teams, as well as a synergy Institutes of Health (NIH) Common Fund, through a between them that benefits patients, payors, and cooperative agreement (U54 AT007748) from the other stakeholders. Office of Strategic Coordination within the Office of the NIH Director and the Duke CTSA (UL1TR001117). Conclusions and Call to Action Dr. 
Cameron is supported by grant 5 T32 DK007731 The implementation of learning health care systems (Duke Training Grant in Nephrology) is gaining momentum, and the ability to reproducibly The contents of this publication are solely the identify clinically equivalent patient populations is responsibility of the authors and do not necessarily critical to implementing and evaluating evidence- based treatments in health care systems. The use represent the official views of the Patient Centered of common or semantically equivalent phenotype Outcomes Research or the U.S. Department of definitions across research and health care use Health and Human Services or any of its agencies. cases can support this aim. A national infrastructure We are grateful to Shelley Rusincovitch of the Duke for reusing phenotype definitions and sharing Translational Medicine Institute for her ideas in the experience across health care delivery and clinical formulation of this commentary and to the members research applications will reduce duplicate efforts of the PSQ Core of the NIH Collaboratory for their and increase efficiencies. Both research and ideas and discussion related to achieving valid and provider communities need access to a collection of existing definitions, information to evaluate scalable phenotype definitions for a number of conditions. their appropriateness for particular applications, Published by EDM Forum Community, 2016 13 eGEMs (Generating Evidence & Methods to improve patient outcomes), Vol. 4 [2016], Iss. 3, Art. 2 15. Granger BB, Staton M, Peterson L, Rusincovitch SA. Prevalence References and Access of Secondary Source Medication Data: Evaluation of the Southeastern Diabetes Initiative (SEDI). AMIA Jt 1. NQF. Measure Developer Guide to Submitting Measures Summits Transl Sci Proc. 2015;2015:66-70. to NQF October 2013 2013. Available from: http://www. 16. Gregg EW, Li Y, Wang J, Burrows NR, Ali MK, Rolka D, et al. 
qualityforum.org/Measuring_Performance/Endorsed_ Changes in diabetes-related complications in the United Performance_Measures_Maintenance.aspx. States, 1990-2010. N Engl J Med. 2014;370(16):1514-23. 2. CMS. Chronic condition data warehouse: Centers for Medicare 17. Mo H, Thompson WK, Rasmussen LV, Pacheco JA, Jiang G, and Medicaid Services 2013 [cited 2013 May 27]. Available Kiefer R, et al. Desiderata for computable representations of from: http://www.ccwdata.org/index.htm. electronic health records-driven phenotype algorithms. J Am 3. CMS. eCQM Library: Centers for Medicare & Medicaid Services; Med Inform Assoc. 2015. 2015. Available from: https://www.cms.gov/regulations-and- 18. Sun J, McNaughton CD, Zhang P, Perer A, Gkoulalas-Divanis guidance/legislation/ehrincentiveprograms/ecqm_library.html. A, Denny JC, et al. Predicting changes in hypertension 4. NCQA. Healthcare Effectiveness Data and Information Set control using electronic health records from a chronic (HEDIS) Measures: National Committee for Quality Assurance; disease management program. J Am Med Inform Assoc. 2015 [cited 2015 September 28]. Available from: http://www. 2014;21(2):337-44. ncqa.org/HEDISQualityMeasurement/HEDISMeasures.aspx 19. ONC. Interoperability Standards Advisory (ISA): Office of the 5. CCS AH. Healthcare Cost and Utilization Project (HCUP). National Coordinator for Health IT; 2016 [updated January 19, Clinical Classifications Software (CCS) for ICD-9-CM. Rockville, 2016; cited 2016 April 4]. Available from: https://www.healthit. MD: Agency for Healthcare Research and Quality 2013. gov/standards-advisory. Available from: http://www.hcup-us.ahrq.gov/toolssoftware/ 20. Zozus MN, Hammond WE, Green BB, Kahn MG, Richesson RL, ccs/ccs.jsp. Rusincovitch SA, et al. Assessing Data Quality for Healthcare 6. Greene SM, Reid RJ, Larson EB. Implementing the learning Systems Data Used in Clinical Research (Version 1.0). An NIH health system: from concept to action. Ann Intern Med. 
Health Care Systems Research Collaboratory Phenotypes, 2012;157(3):207-10. Data Standards, and Data Quality Core White Paper. 2013 7. Richesson R.L., Smerek M. Electronic Health Records-Based [cited 2015 June 22]. Available from: https://sites.duke.edu/ Phenotyping. In: Collaboratory NR, editor. Rethinking Clinical rethinkingclinicaltrials/tag/data-quality/. Trials: A Living Textbook of Pragmatic Clinical Trials. Durham, 21. Holve E, Kahn M, Nahm M, Ryan P, Weiskopf N. A comprehensive framework for data quality assessment in CER. NC: Duke Clinical Research Institute; 2014. AMIA Jt Summits Transl Sci Proc. 2013;2013:86-8. 8. Richesson RL, Hammond WE, Nahm M, Wixted D, Simon 22. Kahn MG, Brown JS, Chun AT, Davidson BN, Meeker D, Ryan GE, Robinson JG, et al. Electronic health records based PB, et al. Transparent Reporting of Data Quality in Distributed phenotyping in next-generation clinical trials: a perspective Data Networks. eGEMs (Generating Evidence & Methods to from the NIH Health Care Systems Collaboratory. J Am Med improve patient outcomes) 2015;3(1):Article 7. Inform Assoc. 2013;20(e2):e226-31. 23. CMS. Quality Measure Development and Management 9. Collaboratory. NHCSR. Rethinking Clinical Trials. A Living Overview: Centers for Medicare & Medicaid Services; 2016 Textbook of Pragmatic Clinical Trials. Durham: Duke University; [cited 2016 April 11]. Available from: https://www.cms.gov/ Medicare/Quality-Initiatives-Patient-Assessment-Instruments/ 10. Rusincovitch SA. Practical Development and Implementation MMS/Downloads/Quality-Measure-Development-Lifecycle- of EHR Phenotypes. Presented on NIH Collaboratory Grand Overview.pdf. Rounds, November 15, 2013. 2013. Available from: https://www. 24. University. V. PheKB 2012 [cited 2013 May 24]. Available from: nihcollaboratory.org/Pages/Grand-Rounds-11-15-13.aspx. http://www.phekb.org/. 11. Fleurence RL, Curtis LH, Califf RM, Platt R, Selby JV, Brown 25. Xu J, Rasmussen LV, Shaw PL, Jiang G, Kiefer RC, Mo H, JS. 
Launching PCORnet, a national patient-centered clinical et al. Review and evaluation of electronic health records- research network. J Am Med Inform Assoc. 2014;21(4):578-82. driven phenotype algorithm authoring tools for clinical and 12. Nichols GA, Desai J, Elston Lafata J, Lawrence JM, O’Connor translational research. J Am Med Inform Assoc. 2015. PJ, Pathak RD, et al. Construction of a multisite DataLink using 26. PheMA. PheMA wiki: Phenotype Execution Modeling electronic health records for the identification, surveillance, Architecture project Mayo Clinic; 2015 [cited 2015 September prevention, and management of diabetes mellitus: the 28]. Available from: http://informatics.mayo.edu/phema/index. SUPREME-DM project. Preventing chronic disease. 2012;9:E110. php/Main_Page. 13. Prieto-Centurion V, Rolle AJ, Au DH, Carson SS, Henderson 27. Hripcsak G, Duke JD, Shah NH, Reich CG, Huser V, Schuemie AG, Lee TA, et al. Multicenter study comparing case definitions MJ, et al. Observational Health Data Sciences and Informatics used to identify patients with chronic obstructive pulmonary (OHDSI): Opportunities for Observational Researchers. Stud disease. Am J Respir Crit Care Med. 2014;190(9):989-95. Health Technol Inform. 2015;216:574-8. 14. Curtis LH, Weiner MG, Boudreau DM, Cooper WO, Daniel GW, 28. Mini-Sentinel. Distributed Database and Common Data Model: Nair VP, et al. Design considerations, architecture, and use of Mini-Sentinel Coordinating Center; 2015 [updated 1/9/2015; the Mini-Sentinel distributed data system. Pharmacoepidemiol cited 2015 Aug 30]. Available from: http://mini-sentinel.org/ Drug Saf. 2012;21 Suppl 1:23-31. data_activities/distributed_db_and_data/details.aspx?ID=105. http://repository.edm-forum.org/egems/vol4/iss3/2 14 DOI: 10.13063/2327-9214.1232 Richesson et al.: Enabling Re-Use of Computable Phenotype Definitions Volume 4 (2016) Issue Number 3 29. OHSDI. OMOP Common Data Model 2015 [cited 2015 34. OHDSI. 
Criteria Assessment Logic for Your Population Study August 30]. Available from: http://www.ohdsi.org/data- in Observational data (CALYPSO): Observational Health Data eGEMs standardization/the-common-data-model/. Sciences and Informatics; 2015 [updated August 8, 2015; cited Generating Evidence & Methods 30. PCORnet. PCORnet Common Data Model (CDM). Why, What, 2015 September 28]. Available from: http://www.ohdsi.org/ to improve patient outcomes and How? : PCORnet Coordinating Center; 2015 [cited 2015 analytic-tools/calypso-for-study-population-evaluation/. Aug 30]. Available from: http://www.pcornet.org/pcornet- 35. Wilcox A, Randhawa G, Embi P, Cao H, Kuperman G. common-data-model/. Sustainability Considerations for Health Research and Analytic 31. Jiang G, Solbrig HR, Kiefer R, Rasmussen LV, Mo H, Speltz P, Data Infrastructures. eGEMs (Generating Evidence & Methods et al. A Standards-based Semantic Metadata Repository to to improve patient outcomes). 2014;2(2):Article 8. Support EHR-driven Phenotype Authoring and Execution. 36. Bozeman B, Baoardman C. An Evidence-Based Assessment of eGEMs Stud Health Technol Inform. 2015;216:1098. Research Collaboration and Team Science: Patterns in Industry Generating Evidence & Methods 32. Richesson RL. An Informatics Framework for the Standardized and University-Industry Partnerships. Paper commissioned to improve patient outcomes Collection and Analysis of Medication Data in Networked for the National Research Council Study of the Science of Research: Journal of Biomedical Informatics - in press; 2014. Team Science. Workshop on Institutional and Organizational 33. OHDSI. CALYPSO (Criteria Assessment Logic for Your Supports for Team Science; October 24; Washington, D.C.: The Population Study in Observational data): Observational Health National Academies of Sciences, Engineering, and Medicine; Data Science and Informatics; 2016 [cited 2016 April 11]. 2013. Available from: http://www.ohdsi.org/analytic-tools/calypso- 37. 
Papadaki M, Hirsch G. Curing consortium fatigue. Sci Transl for-study-population-evaluation/. Med. 2013;5(200):200fs35. Published by EDM Forum Community, 2016 15

Journal

eGEMs, PubMed Central

Published: Jul 5, 2016
