A Public Infrastructure For Data Access
March 8, 2016
Larry Smarr’s major contributions to scientific progress are well known. A physicist and the founding director of the National Center for Supercomputing Applications (NCSA), he helped bring the power of computing to scientific research at a time when computers were still highly specialized instruments. Today he is the Director of the California Institute for Telecommunications and Information Technology (Calit2), one of the most innovative research institutes in the world. He’s also an avid self-tracker, using his own data to correctly self-diagnose the onset of Crohn’s disease. In preparation for our upcoming Quantified Self Public Health Symposium, I asked Larry about his idea for a large-scale, non-commercial, broadly accessible infrastructure for improving access to self-collected data for both personal and public benefit.
Gary Wolf: What’s the role of the public health sector and of the academic research community in a world where individuals and consumer-oriented tech companies are taking on increasingly complex questions of personal and population health?
Larry Smarr: The fundamental role is bridging the gap between N=1 and N=a lot. Any time in the last 30 years when I’ve seen a technical innovation that mattered, like a software tool, the first approaches weren’t ready for prime time. They were not developed with professional-level software engineering, version control, documentation and all that. Similarly, scaling up biomedical observations made by N=1 quantified individuals is going to require the professional methodologies of the public health sector.
GW: Can this be left to industry?
LS: Not entirely, although startups are doing a fabulous job of getting tracking tools into the hands of tens of millions of individuals. The problem is how to do research on the data produced by that broad population. Too often these days I see researchers from the university going to tracking companies and asking for access to the company’s raw data feeds, for instance to heart rate or exercise time series, and the company says no. They will give you the weekly or daily average, but you can’t get to the raw data. If you go to them and say, I’ve got this really great innovation that can be used to understand this data, more often than not they decline. They have an installed base and market share to protect, which naturally tends to make them conservative. I think there is a real opening for companies to make this anonymized broad population data available to academic researchers. That’s when a raft of scientific discoveries will be made from the quantified population.
GW: Those are the consumer fitness companies, but what about the healthcare IT world?
LS: Again there is a disconnect between the consumer fitness cloud-based apps for millions of individuals and the electronic health records at your healthcare provider. If you’re a doctor in a medical office, unlike a data science researcher, you don’t want all this data. What you want to know is: did my patient do 1,000 steps or 10,000 steps today, did they get aerobic exercise or not, are they getting enough sleep? So it’s not like you need a vast dumping place inside electronic health records. Again, I think pilot experiments are the way to get started.
GW: You’re arguing that the incentives aren’t there.
LS: These are currently major structural barriers. Who is going to work on the bridging we are discussing? There aren’t incentives for the commercial tracking companies to work on it. Neither are there incentives for the electronic medical record companies to work on it. NIH isn’t going to support bridging between commercial companies. It falls between the stools. You need to have the research community, and health care IT experts, the commercial tracking companies, and the individual self-trackers all come together and collaborate.
GW: You envision some kind of technical system so that individuals and health care providers and researchers could all benefit from access to data. What does your experience tell you about how long this would take to have a working prototype that would be practically useful?
LS: It’s a three-to-five-year project. I think if a major funder did a call for proposals requiring a health care provider, university research community and the self-tracking community to come together and prototype a solution, they would get some very interesting proposals.
GW: In a talk you gave in 2011, you said “science is not enough.” You pointed out that we’ve known the link between smoking and cancer for over half a century, and yet global cigarette consumption has tripled during this time. So we have all this possibility for new discoveries with self-tracking data, but how is that going to help make people healthier?
LS: Yes, just knowledge of what causes negative impacts on health is not enough. My former UC San Diego colleague Naomi Oreskes documents how economic interests slowed down the logical social reaction to smoking health threats and climate change in her Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming (2010). We are seeing similar delaying and disinformation tactics in the obesity/diabetes epidemic, which has been building for four decades. It is sobering to me to see someone as politically skilled as New York City mayor Bloomberg defeated in his efforts to ban jumbo sugary drinks. My best guess is that we face a multi-decadal battle, just as we have had with tobacco and climate change, to get our society to move to healthy eating and drinking. The bright spots are subcultures of healthy living, often empowered by tracking and social media, that are developing across the country. My hope is that these will spread and scale over the next decade.
GW: It seems you are also pointing toward activism, since that’s been so important with smoking.
LS: Activism is essential given the enormous power of the entrenched economic interests. Activism can lead to regulatory reform, which over time can make huge social changes. For example, when I grew up in the 1950s and early 60s my father didn’t smoke, but he was embarrassed that you had to have ashtrays in your house, because he said you couldn’t tell people not to smoke in your own home. Socially, you just couldn’t. About that time the Surgeon General’s report on smoking was published. Fifty years later, huge chunks of society are smoke-free, such as all the University of California campuses, restaurants, and large social gatherings. Just think of what an enormous shift that has been! We are beginning to see similar activism in getting pension fund investors to boycott carbon fuel companies in order to slow down climate change. So can we imagine a boycott against sweetened beverages and high glycemic prepared foods? I believe that there is a huge role for health-related individual and organized activism in the near future.
GW: At the last Quantified Self Public Health meeting, you suggested that this emerging field needs a new kind of journal where individuals can report their discoveries. In light of the big challenges you’ve been describing, challenges that can’t be solved by academic and research publication alone, what kind of contribution could a new journal make?
LS: Let’s go back to the issue of scaling we discussed. Imagine the journal articles are fairly short, describing how the data was generated, but the back end is a publicly available cloud of data so that you could begin growing a large dataset of N=1 projects. Then the research community could pick up on the ideas coming out of the Quantified Self community, explore the data, and take it further. That’s how things grow.
GW: You want to be on the editorial board?
LS: No, I want to submit a paper!