It’s always the apps that get the glory, isn’t it? Looking back on the Health Refactored conference in Mountain View last week, where I gave a talk, I saw a lot of focus on the consumer and client side of the healthcare API equation, but almost nothing on how to build a scalable, compliant and secure governance layer for API traffic that might contain sensitive PHI or PII.
There were quite a lot of great talks, and one of my favorites was Rachel Kalmar’s talk, “Towards an Open Data Ecosystem.” One of the points she made concerned the semantics of health care data, especially in relation to the new category of smart devices such as the Fitbit and related fitness apps such as RunKeeper.
She made an important point on health care data/fitness interoperability: how does one measure a “step”, and how would such “steps” be interoperable among different service providers? This question is difficult for two reasons. First, each company will have its own definition, which makes this type of interoperability elusive. Second, each of these companies ultimately wants to survive, so it needs some sort of business plan, which likely amounts to “monetizing” the stored data and ensuring that it remains locked up in silos. I tweeted to her that we have JSON and XML, but these only help with structure, not semantics. Can standards help here? Maybe. It seems like one of those big “let’s get together and define an ontology” projects.
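To make the structure-versus-semantics gap concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two payloads, the field names, and the idea that two vendors report the same walk differently are all invented, not taken from any real device or API.

```python
import json

# Two hypothetical payloads describing the SAME walk by the same user.
# Vendor A and vendor B are invented; so are the field names and counts.
payload_a = json.loads('{"user": "u1", "date": "2013-05-20", "steps": 8000}')
payload_b = json.loads('{"user": "u1", "date": "2013-05-20", "steps": 9500}')

# A purely structural contract: field names and types, nothing more.
REQUIRED_FIELDS = {"user": str, "date": str, "steps": int}

def is_structurally_valid(record):
    """Check field names and types only; this says nothing about what a 'step' is."""
    return (set(record) == set(REQUIRED_FIELDS)
            and all(isinstance(record[k], t) for k, t in REQUIRED_FIELDS.items()))

# Both records pass the structural check...
both_valid = is_structurally_valid(payload_a) and is_structurally_valid(payload_b)

# ...yet they disagree on the count, because each vendor defines "step" its own way.
same_count = payload_a["steps"] == payload_b["steps"]

print(both_valid, same_count)
```

Both records satisfy the same structural contract, and nothing in JSON itself can tell you whose “step” is the right one. That part needs a shared definition, which is exactly the ontology problem.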
From an API perspective, each of these companies is collecting data from these devices and storing it using RESTful API calls. This raises two important but related questions:
1. Who owns the data?
2. What sort of privacy and compliance protections are required for fitness/health data?
Point 1 is controversial, and I would argue that defining such ownership is a difficult non-technical question. When is data my own property? As an avid user of RunKeeper, it seems logical to me that the record of my runs, time elapsed, elevation gain, average pace, best pace and frequency is all my data. After all, I generated it! Seems obvious, no?
One opposing point comes from the 17th-century philosopher John Locke. Starting from a state in which the world holds a large set of common resources, he asks a strikingly similar question about property. When does something that I obtain for free become my property?
The example he uses is acorns and apples gathered under a tree for nourishment. Precisely when does the apple become my property?
He writes: “I ask then, when did they begin to be his? When he digested? Or when he eat? Or when he boiled? Or when he brought them home? Or when he picked them up? And it is plain, if the first gathering made them not his, nothing else could.” For Locke, the distinction is whether or not labor was mixed with the resource.
If we apply this logic to a free app such as RunKeeper, it’s not like I can reach inside my body and pull out a USB drive containing my running data. The good folks at RunKeeper have mixed in considerable labor: building the app, running the infrastructure that hosts the API receiving data from the phone, and tracking the results. RunKeeper provides the labor, and the customer captures this value at no monetary cost. Sure, it was my body that did the run, but under this logic, it’s RunKeeper’s data, because they are the ones who put the work into it. I just got a free app, similar to Locke’s example of picking up an apple under the tree.
As for API building blocks, Intel’s Expressway can help protect PII, PHI, and PCI data in API calls as data is received or queried, which helps address point 2. The trouble, however, is that it’s not entirely clear what sort of protections a service provider must provide. Examples such as the “Massachusetts Attack” demonstrate that even “de-identified” data can be combined with other data sources to re-identify the individuals behind it. Does that mean we shouldn’t mitigate such attacks with tools like Expressway? I think we absolutely have to.
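For readers who haven’t seen how re-identification works, here is a minimal Python sketch of a linkage attack. It is a toy under stated assumptions: every record, name, and field below is invented, and the choice of ZIP code, birth date, and sex as the joining fields is simply a common example of quasi-identifiers. It is not a reconstruction of any real dataset, and it has nothing to do with how Expressway works internally.

```python
# All records below are invented for illustration only.
# "De-identified" health rows: names removed, quasi-identifiers left in place.
deidentified_health = [
    {"zip": "00001", "birth_date": "1970-01-01", "sex": "F", "diagnosis": "condition A"},
    {"zip": "00002", "birth_date": "1980-02-02", "sex": "M", "diagnosis": "condition B"},
]

# A hypothetical public list that carries names plus the same three fields.
public_roll = [
    {"name": "Person One", "zip": "00001", "birth_date": "1970-01-01", "sex": "F"},
    {"name": "Person Two", "zip": "00002", "birth_date": "1980-02-02", "sex": "M"},
    {"name": "Person Three", "zip": "00002", "birth_date": "1980-02-02", "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

def link(health_rows, public_rows):
    """Join the two datasets on the quasi-identifiers; keep only unique matches."""
    index = {}
    for row in public_rows:
        key = tuple(row[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(row["name"])

    matches = []
    for row in health_rows:
        key = tuple(row[q] for q in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        if len(names) == 1:  # exactly one candidate means the row is re-identified
            matches.append((names[0], row["diagnosis"]))
    return matches

reidentified = link(deidentified_health, public_roll)
print(reidentified)
```

In this toy data the first health row matches exactly one person and is re-identified, while the second matches two people and stays ambiguous. That is why stripping names alone is not enough: the protection depends on how many people share the fields you leave behind.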