Category Archives: open data

Open Referral standard

The DDOD program is currently assisting the proponents of a new open standard for publishing human services data, called Open Referral.  Before we can justify promoting this standard and publishing data in it, we first need to develop clear and concise use cases.

The Background

Open Referral is a standard that originally came out of a Code for America initiative a couple of years ago, with the goal of automating updates to directories of the human services offered across many programs.  Doing so would not only make those services more discoverable, but also lower the cost of administration for service providers and referring organizations.
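
To make that concrete, here’s a rough sketch of the kind of machine-readable record such a standard describes, loosely patterned on Open Referral’s organization/service/location structure.  The field names here are illustrative, not the normative spec.

```python
# Illustrative only: loosely patterned on Open Referral's
# organization/service/location structure; field names are not normative.
service_record = {
    "organization": {
        "name": "Example Community Center",  # hypothetical provider
        "url": "http://example.org",
        "email": "info@example.org",
    },
    "service": {
        "name": "Emergency Food Pantry",  # hypothetical service
        "description": "Free groceries for households in need.",
        "eligibility": "Residents of Example County",
        "application_process": "Walk in during open hours",
    },
    "location": {
        "address": "100 Main St, Example City, ST 00000",
        "hours": "Mon-Fri 9:00-17:00",
    },
}
```

A referring organization could publish or consume records in roughly this shape without licensing a proprietary taxonomy.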

The Problem: A landscape of siloed directories

It’s hard to see the safety net. Which agencies provide what services to whom? Where and how can people access them? These details are always in flux. Nonprofit and government agencies are often under-resourced and overwhelmed, and it may not be a priority for them to push information out to attract more customers.

So there are many ‘referral services’ — such as call centers, resource directories, and web applications — that collect directory information about health, human, and social services. However, these directories are all locked in fragmented and redundant silos. As a result of this costly and ineffective status quo:

  • People in need have difficulty discovering and accessing services that can help them live better lives.
  • Service providers struggle to connect clients with other services that can help meet complex needs.
  • Decision-makers are unable to gauge the effectiveness of programs at improving community health.
  • Innovators are stymied by lack of access to data that could power valuable tools for any of the above.  

– Source: Open Referral project description

As for potential use cases, a small handful of government programs have been identified as candidate pilots.  These include:


The Competition

Open Referral is not without competing standards.  In fact, the AIRS/211 Taxonomy is already widely used among certified providers of information and referral services, such as iCarol.  However, AIRS/211 has two drawbacks in comparison with Open Referral.  

First, it’s not a free and open standard.  While there are sample PDFs available for parts of the taxonomy, a full spec requires a subscription.

“If you wish to evaluate the Taxonomy prior to subscribing, you can register for evaluation purposes and have access to the full Taxonomy for a limited period of time through the search function.”  – Source: AIRS/211 Download page and Subscription page

The taxonomy also requires an annual license fee, which could be a challenge for government and nonprofit organizations to fund in perpetuity.

“Organizations need a license to engage in any use of the Taxonomy.”
— Source: AIRS/211 Subscription page

Second, the AIRS/211 taxonomy is highly structured and extensive.  While that has advantages for consistency and interoperability, it raises other challenges: it imposes a steep learning curve and therefore creates potential barriers for organizations without technical expertise.  Open Referral positions itself as a more lightweight option.

It should also be noted that there’s a CivicServices schema defined for use with Schema.org.  Its approach is to embed machine-readable “Microdata” throughout human-readable HTML web pages.  Schema.org standards are intended to be interpreted by search engines like Google, Bing and Yahoo when indexing a website.  That said, the degree of adoption for CivicServices in particular – from either search engines or information publishers – is unclear at this point.


Onward!

In concept, the Open Referral standard would lower the cost and lag time for organizations to update relevant services for their constituents.  The standard is being evangelized by Greg Bloom, who started this work with Code for America and has been reaching out to organizations that would be consuming this data (such as Crisis Text Line, Purple Binder and iCarol) for the purpose of defining a compelling use case.

There’s a DDOD writeup on this topic at “Interoperability: Directories of health, human and social services”, intended to facilitate creation of practical use cases.


Further reading…

Additional information on Open Referral can be found at:

Open Data Discoverability

I’m adding a working document to cover the topic of open data discoverability and usability.  This appears to be an area in desperate need of attention.  I have come across it tangentially throughout much of my work, and it deserves to be aggregated and curated.  There are also some lingering opportunities to make practical use of semantic web concepts.  There are vast repositories of data assets throughout government, academia and industry that could be better leveraged.  So let’s make it happen.


DDOD featured on DigitalGov

The Demand-Driven Open Data (DDOD) program has recently been featured on DigitalGov.  (See DigitalGov article.)

It should be added that a major project in the works is the merging of DDOD tools and methodologies into the larger HealthData.gov program.  The effort seeks to maximize the value of existing data assets from across HHS agencies (CMS, FDA, CDC, NIH, etc.).  Already planned are new features to enhance data discoverability and usability.

We’re also looking into how to improve the growing knowledge base of DDOD use cases by leveraging semantic web and linked open data (LOD) concepts.  A couple of years ago, HHS organized the Health Data Platform Metadata Challenge – Health 2.0.  The findings from this exercise could be leveraged for both DDOD and HealthData.gov.


Plans for Demand-Driven Open Data 2.0

Demand-Driven Open Data (DDOD) is a component of HHS’s Health Data Initiative (HDI), represented publicly by HealthData.gov.  DDOD is a framework of tools and methods to provide a systematic, ongoing and transparent mechanism for industry and academia to tell HHS more about their data needs.  The DDOD project description has recently been updated on the HHS IDEA Lab website: http://www.hhs.gov/idealab/projects-item/demand-driven-open-data/.  The writeup includes the problem description, background and history, the DDOD solution and process, and future plans.

In November 2015, the project underwent an extensive evaluation of its activities and accomplishments from the prior year.  Based on the observations, plans are in place to deploy DDOD 2.0 in 2016.  On the process side, the new version will have clearly defined SOPs (standard operating procedures), better instructions for data requesters and data program owners, and up-front validation of use cases.  On the technology side, DDOD will integrate with the current HealthData.gov platform, with the goals of optimizing data discoverability and usability.  It will also include dashboards, data quality analytics, and automated validation of use case content.  These features will help guide DDOD operations and the HealthData.gov workflow.

Provider Network Directories on FHIR

I’ve done a lot of work on designing provider network directory schemas.  Much of it is described in this blog (“provider directories” tag) and in the related “Interoperability” entry on the DDOD website.  But so far, the effort has been focused on designing a standard data schema that could adequately represent the way the healthcare industry currently operates in terms of provider networks and health insurance coverage.  Now I’d like to highlight an important factor that’s been overlooked: the mechanics of moving this data between systems.

[Figure: Simplified provider network directory model]

In their recent machine-readability requirement for insurance issuers on health insurance marketplaces, CMS/CCIIO did not specify the transport mechanism for the QHP schema. The only requirement is to register the URL containing the data with HIOS (Health Insurance Oversight System). The URLs could point to a static page or to a dynamic RESTful query. I’d like to point out that CMS or third-party services have an opportunity to provide significant value to both consumer applications and transaction-oriented systems by adding a RESTful FHIR layer. Ideally, this would be done in front of globally aggregated datasets that have been registered in HIOS.  The resulting FHIR API would have resource types of Provider, Network and Plan, which correspond to the JSON files of the QHP provider directory schema.

Much of the usefulness of the machine-readable provider network requirement lies in enabling consumers to ask certain common questions when they need to select an insurance plan. (For example: Which insurance plans is my doctor in? Is she taking new patients at a desired facility under a particular plan? Which plans have the specialists I need in a specific geographic region?) These questions translate readily to FHIR queries using the Search interaction on any of the defined resource types, as sketched below.  With required monthly updates and potentially frequent changes in network and provider demographics, there are also use cases that benefit from the History interaction, either as a type-level change log or an instance-level version view.  Additionally, by adding search parameters, response record count limits, and pagination in front of network directory datasets, the traffic load on aggregated data servers could be handled much more efficiently.
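
Here’s a minimal sketch of how those consumer questions might map onto FHIR-style Search and History requests.  The base URL, the Provider/Network/Plan resource names (which follow this post’s proposed types rather than the standard FHIR resource list), and the search parameters are all assumptions for illustration.

```python
# Hypothetical FHIR-style queries against an aggregated provider network
# directory. BASE, the resource names, and the parameter names are
# illustrative assumptions, not a published API.
import requests

BASE = "https://example.org/fhir"  # hypothetical aggregator endpoint

# "Which insurance plans is my doctor in?"
plans = requests.get(f"{BASE}/Plan", params={"provider": "1234567890"})

# "Which plans have the specialists I need in a specific region?"
providers = requests.get(
    f"{BASE}/Provider",
    params={"specialty": "cardiology", "address-state": "TX", "_count": 50},
)

# "What changed since last month?" -- a type-level History change log
history = requests.get(
    f"{BASE}/Provider/_history",
    params={"_since": "2015-11-01T00:00:00Z"},
)
```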

I set up a server with an example of a FHIR API implemented for provider directories, although limited to the NPPES data model.  A big thanks to Dave McCallie for creating and sharing the original codebase: github.com/DavidXPortnoy/nppes_fhir_demo.  You can find the live non-production sandbox version here: http://fhir-dev.ddod.us:8080/nppes_fhir.  Here are a few sample queries of the kind you can run against it:
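
A quick sketch using Python’s requests library; the exact paths and search parameters are illustrative assumptions patterned on FHIR search, not the demo’s documented endpoints.

```python
# Illustrative queries against the sandbox; paths and parameters are
# assumptions patterned on FHIR search, not documented endpoints.
import requests

BASE = "http://fhir-dev.ddod.us:8080/nppes_fhir"

# Search practitioners by last name
r = requests.get(f"{BASE}/Practitioner", params={"name": "SMITH"})

# Read a single record by an NPI-based id (hypothetical)
r = requests.get(f"{BASE}/Practitioner/1234567890")
print(r.status_code, r.text[:200])
```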

I’m working on expanding the functionality of this server to accommodate the full provider network directory schema, including components of provider demographics, facilities, organizations, credentialing, insurance plans, plan coverage, and formularies.



Edit 10/2015: It should be said that my HHS Entrepreneur-in-Residence colleague, Alan Viars, has led an effort to build a robust API for NPPES for HHS IDEA Lab’s NPPES Modernization Project.  It’s designed to handle both the efficient read access wanted by many applications and robust methods for making changes.  Although initially it focused on providing the simplest purpose-built API possible, Alan is now looking at creating a version that would be based on FHIR practices.


Additional FHIR server implementations

The current FHIR server is quite simple.  It’s implemented in Python, with Elasticsearch as the document store for NPPES records, Flask as the Python web server, and Gunicorn as the WSGI gateway.  Let’s call it the Flask-Elasticsearch implementation.  There are a couple of other, more popular alternatives.
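
For flavor, here’s a minimal sketch of that Flask-Elasticsearch pattern: Flask handles the FHIR-style REST routing while Elasticsearch acts as the document store.  The index name, field names, and supported search parameter are illustrative assumptions, not the demo server’s actual implementation.

```python
# Minimal Flask + Elasticsearch sketch of a FHIR-style search endpoint.
# Index and field names are illustrative, not the real nppes_fhir code.
from elasticsearch import Elasticsearch
from flask import Flask, jsonify, request

app = Flask(__name__)
es = Elasticsearch()  # assumes a local Elasticsearch node

@app.route("/Practitioner", methods=["GET"])
def search_practitioners():
    # Translate a FHIR-style ?name= search parameter into an ES match query
    name = request.args.get("name", "")
    count = int(request.args.get("_count", 10))  # FHIR-style page size
    result = es.search(index="nppes", body={
        "query": {"match": {"last_name": name}},
        "size": count,
    })
    # Wrap the hits in a FHIR-ish Bundle envelope
    return jsonify({
        "resourceType": "Bundle",
        "entry": [hit["_source"] for hit in result["hits"]["hits"]],
    })

if __name__ == "__main__":
    app.run()  # behind Gunicorn in a deployed setup
```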

It seems that the most active FHIR open-source codebase is HAPI, located at https://github.com/jamesagnew/hapi-fhir.  It’s managed by James Agnew at University Health Network.  This is a Java/Maven library for creating both FHIR servers and clients.  Its ability to easily bolt FHIR onto any database makes it ideal for extending the API to existing applications.  It also enables existing apps to connect to other FHIR servers as a client.  This codebase is quite full-featured, supporting all current FHIR resource types, most operations, and both XML and JSON encodings.  Relative to the alternatives, it’s well documented as well.  There’s a live demo project available: http://fhirtest.uhn.ca/

Finally, FHIRbase, located at https://github.com/fhirbase/fhirbase, is a relational storage server for FHIR with a document API.  It uses PostgreSQL as the relational database engine and is written in PL/pgSQL.  FHIRplace, located at https://github.com/fhirbase/fhirplace, provides a server that accesses FHIRbase.  It’s written in Clojure, Node.js, and JavaScript.  And like HAPI, it supports all current FHIR resource types, operations, and both XML and JSON encodings.

There are also a surprisingly large number of Windows-based FHIR servers that I haven’t considered, due to a desire to stay on non-proprietary platforms.  Perhaps that shouldn’t be surprising, given the Windows-heavy history of EHR and other healthcare apps.


Provider network directory standards

Here’s my most recent contribution to the effort around deploying data interoperability standards for use with healthcare provider network directories.  The schema proposed for use by QHPs (Qualified Health Plans) on health insurance marketplaces can be found on GitHub: https://github.com/CMSgov/QHP-provider-formulary-APIs.  Designing an improved model for the provider directory and plan coverage standards required analysis of:

The data model now looks like this:
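
In rough terms, and assuming the Provider/Network/Plan split used in that GitHub repository, the relationships can be sketched as follows; entity and attribute names here are illustrative, not the normative schema.

```python
# Rough sketch of the proposed entities: plans reference the networks they
# use, and providers list the plans/networks they participate in. Attribute
# names are illustrative, not the normative schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Network:
    network_id: str  # e.g., an issuer-assigned network identifier
    tier: str        # e.g., "PREFERRED"

@dataclass
class Plan:
    plan_id: str
    marketing_name: str
    networks: List[Network] = field(default_factory=list)

@dataclass
class Provider:
    npi: str
    name: str
    specialties: List[str] = field(default_factory=list)
    accepting_new_patients: bool = True
    plans: List[Plan] = field(default_factory=list)  # many-to-many in practice
```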

Background info on this topic can be found in the related DDOD article.

Vision of healthcare provider network directories

Background

There are four pieces of information that U.S. consumers need to make informed choices about their healthcare insurance coverage.

  1. Directory: What are the healthcare provider demographics, including specialty, locations, hours, credentialing?
  2. Coverage: Does the provider take a particular insurance plan?
  3. Benefits: What are the benefits, copays and formularies associated with my plan?
  4. Availability: Is the provider accepting new patients for this particular insurance plan and location?

Without having these capabilities in place, consumers are likely to make uninformed decisions or delay decisions.  That in turn has significant health and financial impacts.

Problem

Healthcare provider directories have historically been supplied by the NPPES database.  But it has been lacking in accuracy and timeliness, and is not even able to represent reality faithfully.  First, the overhead of making changes is quite high, and there hasn’t been an easy way for a provider to delegate the ability to make changes.  Second, the incentives aren’t there: there are no penalties for abandoning updates, and many providers don’t realize how frequently NPPES data is downloaded and propagated to consumer-facing applications.  Third, the data model is fixed by regulation, yet it cannot accurately represent the many-to-many relationships among practitioners, groups, facilities and locations.  It also doesn’t adequately reflect the ability to manage multiple specialties and accreditations.


Incidentally, my work in the area of provider directories has been driven by the needs of DDOD.  Specifically, there were at least five DDOD use cases that directly depended on solving the provider directory problems.  But the actual problem extends well past those use cases.  An accurate and standardized “provider dimension” is needed for any type of analytics or application involving providers: anything from access to insurance coverage information to analytics on utilization, open payments, fraud and comparative effectiveness research.

Addressing consumers’ need to understand their options in terms of coverage and benefits has historically been a challenge, and it’s yet to be solved.  There are routine complaints of consumers signing up for new coverage, only to find out that their provider doesn’t take the new plan or isn’t accepting new patients under it.  These problems were the driver for Insurance Marketplaces (aka FFMs) instituting a new rule requiring QHPs (Qualified Health Plans) to publish machine-readable provider network directories that are updated at least monthly.  This rule, which takes effect with the 2015 open enrollment, and the technical challenges around it are described in detail in the related DDOD discussion on provider network directories.  (Note that although the rule refers to “provider directories”, in reality it covers all four pieces of information.)  CMS already collects all this information from QHPs during the annual qualification process.  It asks payers to submit template spreadsheets containing information about their plans, benefits and provider networks.

The seemingly simple question of whether a provider is taking new patients has been a challenge as well.  That’s because the answer is both non-binary and volatile.  It might differ depending on insurance plan, type of referral, location and even time of day.  It may also fluctuate based on patient load, vacations and many other factors.  The challenge becomes even harder when you consider that providers often don’t have the time or financial incentive to update this information with the payers.
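
As a small illustration of why a simple boolean field falls short, consider a structure where acceptance status is keyed by plan and location and carries conditions and a timestamp; the keys and values here are hypothetical.

```python
# Hypothetical illustration: "accepting new patients?" varies by plan,
# location, and circumstances, and it goes stale quickly.
from datetime import datetime

acceptance = {
    # (plan_id, location_id) -> status record
    ("PLAN-A-001", "CLINIC-MAIN"): {
        "status": "accepting",
        "conditions": "new patients by referral only",
        "as_of": datetime(2015, 11, 1),
    },
    ("PLAN-B-002", "CLINIC-MAIN"): {
        "status": "waitlist",  # neither a simple yes nor no
        "as_of": datetime(2015, 9, 15),
    },
}
```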

Approach

Aneesh Chopra and I put together an industry workgroup to help determine how best to implement the QHP rule.  The workgroup spans the full spectrum of industry participants: payers, payer-provider intermediaries, providers and consumer applications.  It should be noted that we have especially strong representation from payers and intermediaries, covering a substantial portion of the market.  While looking at the best ways to implement the rule from a technical and logistical perspective, we identified a missing leg: incentives.

Three pillars are needed to reach critical mass for a new standard to become sustainable: technology, logistics and incentives.

The QHP rule and the specified data schema provide a starting point for the technology.  Workgroup participants also suggested how to use their organizations’ existing system capabilities to fulfill the rule’s requirements.  We discussed the logistics of how data can get moved from its multiple points of origin to submission to CMS.

Through this exercise, it became quite clear that the implementation of the QHP mandate could make significant progress towards its stated goals if certain actions are taken in another area: Medicare Advantage (MA).  That’s because much of the data in the proposed standard originates with providers, rather than payers.  Such data typically includes provider demographics, credentialing, locations, and whether they’re accepting new patients.  But at this point, marketplaces are able to exert economic pressure only on payers.  MA, on the other hand, can leverage the STAR rating system to establish incentives for providers as well, which typically get propagated into provider-payer contracts.  STAR incentives are adjusted every year, so it should be well within CMS’s ability to establish the desired objectives.  CMS can also leverage the CAHPS survey to measure the progress these efforts are making towards providing the necessary decision-making tools to consumers.  At the moment, marketplaces don’t have any such metric.

It’s worth noting that Original Medicare (aka Medicare FFS, or Fee for Service) has an even stronger ability to create incentives for providers, and I’ve been talking with CMS’s CPI group about publishing PECOS data to the new provider directory standard.  PECOS enjoys much more accurate and up-to-date provider data than NPPES, due to its use for billing.  The PECOS implementation is also less challenging than its QHP counterpart, in that we’re effectively publishing coverage for only one plan, so the complexities around plan coverage and its mapping to provider networks don’t apply.  But consumers still benefit from up-to-date provider information.

Vision

If we create incentive-driven solutions in the areas of Marketplaces, Medicare Advantage, Managed Medicaid, and Original Medicare, we might be able to solve the problems plaguing NPPES without requiring new regulation or a systems overhaul.  We would cover the vast majority of practitioners across the U.S. and almost all payers, and deliver the information consumers need to make decisions about their coverage.

Finally, we are partnering with Google to leverage the timing of the QHP rule with a deployment of a compatible standard on Schema.org.  Doing so would help cement the standards around provider directories and insurance coverage even further.  It empowers healthcare providers and payers to publish their information in a decentralized manner.  Since updating information is so easy, it can happen more frequently.  Third party applications could pull this information directly from the source, rather than relying on a central body.  And the fact that search engines correctly interpret and index previously unstructured data means faster answers for consumers even outside of specialized applications.

Using DDOD to identify and index data assets

Part of implementing the Federal Government’s M-13-13 “Open Data Policy – Managing Information as an Asset” is to create and maintain an Enterprise Data Inventory (EDI).  The EDI is supposed to catalog government-wide SRDAs (Strategically Relevant Data Assets).  The challenge is that the definition of an SRDA is subjective in the context of an internal IT system, there’s not enough budget to catalog the huge number of legacy systems, and it’s hard to know when you’re done documenting the complete set.

Enter DDOD (Demand-Driven Open Data).  While it doesn’t solve these challenges directly, its practical approach to managing open data initiatives certainly improves the situation.  Every time an internal “system of record” is identified for a DDOD use case, we’re presented with a new opportunity to make sure that system is included in the EDI.  Already, DDOD has been able to identify missing assets.

DDOD helps with EDI and field-level data dictionary

But DDOD can do even better.  By working one use case at a time, we get the opportunity to catalog each data asset at a much more granular level.  The data assets on HealthData.gov and Data.gov are cataloged at the dataset level, using the W3C DCAT (Data Catalog) Vocabulary.  The goal is to catalog datasets associated with DDOD use cases down to a field-level data dictionary.  Ultimately, we’d want to attain a level of sophistication at which we’re semantically tagging fields using controlled vocabularies.
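
One way to picture the goal: extend a dataset-level catalog entry of the kind found in data.json (Project Open Data / DCAT style) with a field-level dictionary whose entries carry controlled-vocabulary tags.  The “fields” extension and the vocabulary URIs below are illustrative assumptions, not an existing HealthData.gov feature.

```python
# Sketch: a dataset-level entry extended with a hypothetical field-level
# data dictionary. The "fields" key and vocabulary URIs are illustrative.
dataset_entry = {
    "title": "Example Provider File",  # hypothetical dataset
    "description": "Monthly extract of provider records.",
    "accessLevel": "public",
    "distribution": [
        {"downloadURL": "http://example.org/providers.csv",
         "mediaType": "text/csv"},
    ],
    "fields": [
        {"name": "npi",
         "description": "National Provider Identifier",
         "semanticType": "http://example.org/vocab/NPI"},
        {"name": "specialty",
         "description": "Primary specialty code",
         "semanticType": "http://example.org/vocab/SpecialtyCode"},
    ],
}
```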

Performing field-level cataloging has a couple of important advantages.  First, it enables better indexing and more sophisticated data discovery on HealthData.gov and other HHS portals.  Second, it identifies opportunities to link datasets across different organizations and even across different domains.  The mechanics of DDOD in relation to EDI, HealthData.gov, data discoverability and linking are further explained in the Data Owners section of the DDOD website.

Note: The HHS EDI is not currently available as a stand-alone data catalog, but it’s incorporated into http://www.healthdata.gov/data.json, since that catalog includes all three access levels: public, restricted public, and non-public datasets.

Obtaining data on cost of FDA drug approval process

To follow up on the post describing the Investment Model for Pharma: we’re working on obtaining data on the cost of the FDA drug approval process via DDOD (Demand-Driven Open Data).  Use Case 34: Cost of drug approval process describes this effort.  It identifies the drivers and value of obtaining this data for informing policy.  The writeup identifies several data sources and how to go about using them.  The information provided has come from discussions with FDA’s CDER Office of Strategic Programs (OSP).

Data sources identified:

  • IND activity: Distinct count of new INDs (Investigational New Drug) received during the calendar year and previously received INDs which had an incoming document during the same period: INDs with Activity page
  • PDUFA reports: The Prescription Drug User Fee Act (PDUFA) requires FDA to submit two annual reports to the President and the Congress for each fiscal year: 1) a performance report and 2) a financial report
  • FTE reports: Statistics on number of FDA employees and grade levels
  • ClinicalTrials.gov might provide glimpses into drug approval activity, although it’s not complete (especially for Phase 1 trials) and mixes in non-IND trials.
  • Citeline has counts of active compounds under development, including breakdown by Phase

As more users come forward to identify specifics of how they need to use the data, there’s an opportunity to refine the use case and focus efforts on obtaining data not yet available.

DDOD Love from Health Datapalooza 2015


Demand-Driven Open Data (DDOD) got a lot of coverage at Health Datapalooza 2015.  I participated in four panels throughout the week and had the opportunity to explain DDOD to many constituents.

  • Developer HealthCa.mp
    Developer HealthCa.mp is a collaborative event for learning about existing and emerging APIs that can be used to develop applications that will help consumers, patients and/or beneficiaries achieve better care through access to health data, especially their own!  Areas of focus include:
    • Prototype BlueButton on FHIR API from CMS
    • Project Argonaut
    • Privacy on FHIR initiative
    • Sources of population data from CMS and elsewhere around HHS
  • Health Datapalooza DataLab
    HHS has so much data! Medicare, substance abuse and mental health, social services and disease prevention are only some of the MANY topical domains where HHS provides huge amounts of free data for public consumption. It’s all there on HealthData.gov! Don’t know how the data might be useful for you? In the DataLab you’ll meet the people who collect and curate this trove of data assets as they serve up their data for your use. But if you still want inspiration, many of the data owners will co-present with creative, insightful, innovative users of their data to truly demonstrate its alternative value for positive disruptions in health, health care, and social services.

    Moderator: Damon Davis, U.S. Department of Health & Human Services

    Panelists: Natasha Alexeeva, Caretalia; Christina Bethell, PhD, MBA, MPH, Johns Hopkins; Lily Chen, PhD, National Center for Health Statistics; Steve Cohen, Agency for Healthcare Research & Quality; Manuel Figallo, SAS; Reem Ghandour, DrPH, MPA, Maternal and Child Health Bureau; Jennifer King, U.S. Department of Health & Human Services; Jennie Larkin, PhD, National Institutes of Health; Brooklyn Lupari, Substance Abuse & Mental Health Services Administration; Rick Moser, PhD, National Cancer Institute; David Portnoy, MBA, U.S. Department of Health & Human Services; Chris Powers, PharmD, Centers for Medicare and Medicaid Services; Elizabeth Young, RowdMap

  • No, You Can’t Always Get What You Want: Getting What You Need from HHS
    While more data is better than less, pushing out any ol’ data isn’t good enough.  As the Data Liberation movement matures, the folks releasing the data face a major challenge in determining what’s the most valuable stuff to put out.  How do they move from smorgasbord to intentionally curated data releases prioritizing the highest-value data?  Folks at HHS are wrestling with this, going out of their way to make sure they understand what you want and ensure you get the yummy data goodies you’re craving.  Learn how HHS is using your requests and feedback to share data differently.  This session explores HHS’s new initiative, Demand-Driven Open Data (DDOD): a lean startup approach to public-private collaboration.  A new initiative out of the HHS IDEA Lab, DDOD is bold and ambitious, intending to change the fundamental data-sharing mindset throughout HHS agencies — from quantity of datasets published to actual value delivered.

    Moderator: Damon Davis, U.S. Department of Health & Human Services

    Panelists: Phil Bourne, National Institutes of Health (NIH); Niall Brennan, Centers for Medicare & Medicaid Services; Jim Craver, MMA, Centers for Disease Control & Prevention; Chris Dymek, EdD, U.S. Department of Health & Human Services; Taha Kass-Hout, Food & Drug Administration; Brian Lee, MPH, Centers for Disease Control & Prevention; David Portnoy, MBA, U.S. Department of Health & Human Services

  • Healthcare Entrepreneurs Boot Camp: Matching Public Health Data with Real-World Business Models
    If you’ve ever considered starting something using health data, whether a product, service, or offering in an existing business, or a start-up company to take over the world, this is something you won’t want to miss.  In this highly interactive, games-based brouhaha, we pack the room full of flat-out gurus to give you an understanding of what it takes to be a healthcare entrepreneur.  Your guides will come from finance and investment; clinical research and medical management; sales and marketing; technology and information services; operations and strategy; analytics and data science; government and policy; business, product, and line owners from payers and providers; and some successful entrepreneurs who have been there and done it for good measure.  We’ll take your idea from the back of a napkin and give you the know-how to make it a reality!

    Orchestrators: Sujata Bhatia, MD, PhD, Harvard University; Niall Brennan, Centers for Medicare & Medicaid Services; Joshua Rosenthal, PhD, RowdMap; Marshall Votta, Leverage Health Solutions

    Panelists: Michael Abate, JD, Dinsmore & Shohl LLP; Stephen Agular, Zaffre Investments; Chris Boone, PhD, Health Data Consortium; Craig Brammer, The Health Collaborative; John Burich, Passport Health Plan; Jim Chase, MHA, Minnesota Community Measurement; Arnaub Chatterjee, Merck; Henriette Coetzer, MD, RowdMap; Jim Craver, MAA, Centers for Disease Control & Prevention; Michelle De Mooy, Center for Democracy and Technology; Gregory Downing, PhD, U.S. Department of Health & Human Services; Chris Dugan, Evolent Health; Margo Edmunds, PhD, AcademyHealth; Douglas Fridsma, MD, PhD, American Medical Informatics Association; Tina Grande, MHS, Healthcare Leadership Council; Mina Hsiang, U.S. Digital Service; Jessica Kahn, Centers for Medicare & Medicaid Services; Brian Lee, MPH, Centers for Disease Control & Prevention; David Portnoy, MBA, U.S. Department of Health & Human Services; Aaron Seib, National Association for Trusted Exchange; Maksim Tsvetovat, OpenHealth; David Wennberg, MD, The Dartmouth Institute; Niam Yaraghi, PhD, Brookings Institution; Jean-Ezra Yeung, Ayasdi


There were follow-up publications as well.  Among them was “HHS on a mission to liberate health data” from GCN.

HHS found that its data owners were releasing datasets that were easy to generate and least risky to release, without much regard to what data consumers could really use. The DDOD framework lets HHS prioritize data releases based on the data’s value because every request is considered a use case.  It lets users — be they researchers, nonprofits or local governments — request data in a systematic, ongoing and transparent way and ensures there will be data consumers for information that’s released, providing immediate, quantifiable value to both the consumer and HHS.

My list of speaking engagements at Palooza is here.