Quantified Self Labs is dedicated to the idea that data access matters. Moving forward, we’re going to be exploring different aspects of how data access affects our personal and public lives. Stay tuned to our QS Access channel for more news, thoughts, and insights.
On January 13th Uber, a wildly popular and often scrutinized ride share company, announced that they have entered into an agreement with the City of Boston to share anonymized data generated by users of the service. This is the first partnership between Uber and a local government body, but it points to the potential for partnerships with other cities that want to take a peek at the vast amount of data about when and where people are traveling within their municipality. Our first reaction to this was to explore whether Uber has provided any method for its own users to access and export their trip data. Surely if they are able to export and pass along data to a third party, they can pass that data to their own users?
In our exploration of the mobile and web user platforms we found that Uber currently does not offer users an easy way to access their data. As an Uber customer, you are provided with email receipts of your trips that include travel information, a route of the ride, and cost. This information is also available through their online user account page. However, it is not exportable or accessible in a way that allows individuals to store the information in a consistent and machine-readable format (such as a CSV file). In our search for methods to assist in exporting Uber ride data, I stumbled upon this data scraper on GitHub developed by Josh Hunt. It’s useful to know that Uber has a standard no-scraping clause in its Terms of Service, but individual users accessing their own data for their own reasons is probably not what these clauses are meant to protect against.
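To make "consistent and machine-readable" a little more concrete, here is a minimal Python sketch of what a per-user trip export could look like. The field names are our own assumptions, loosely based on what the email receipts contain (time, route endpoints, cost); they are not taken from any Uber API or from the scraper mentioned above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Trip:
    # Hypothetical fields, chosen for illustration only.
    started_at: str      # ISO 8601 timestamp of the pickup
    pickup: str          # free-text pickup location
    dropoff: str         # free-text dropoff location
    distance_km: float
    fare: float
    currency: str


def trips_to_csv(trips):
    """Serialize trips to CSV text with one stable header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(Trip)],
        lineterminator="\n",
    )
    writer.writeheader()
    for trip in trips:
        writer.writerow(asdict(trip))
    return buffer.getvalue()


# Made-up sample data, not a real trip.
sample = [Trip("2015-01-13T08:30:00", "Home", "Office", 5.2, 12.5, "USD")]
print(trips_to_csv(sample))
```

The point of the fixed header row is that every export has the same columns in the same order, so a spreadsheet or script written against one export keeps working on the next.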
Aside from data access issues there are of course open questions about how Uber will implement privacy protections governing sensitive user data. Of course, Uber is not without fault in this space. The now infamous blog post pointing to their ability to track one-night stands (archived here) was enough for some users to question ethical standards within Uber. In their announcement, Uber touched on this issue by stating that they will provide some privacy protections by only offering anonymized, aggregated data to third-party partners. Protecting user privacy through data aggregation and anonymization is a step in the right direction, but there remain these open issues around data access for users. Uber and the cities they partner with will learn a lot about how we travel, but the partnership between Uber and their users could be improved by helping users (myself included) understand their own data and behavior by allowing easier access to the data we contribute when we use the service.
We’re interested to hear from our readers about their experiences using the above-mentioned tool, or similar tools, to access and export their Uber trip data. Please let us know. We’ve also reached out to Uber for comment.
I reached out to Uber Support over Twitter and received the following response:
“Unfortunately this is not currently a feature, however we’re always looking to improve and I’ll pass your suggestion along! *NM” (link)
Today’s post comes to us from Dawn Nafus and Robin Barooah. Together they led an amazing breakout session at the 2014 Quantified Self Europe Conference on the topic of understanding and mapping data access. We have a longstanding interest in observing and communicating how data moves in and out of the self-tracking systems we use every day. That interest, and support from partners like Intel and the Robert Wood Johnson Foundation, has helped us start to explore different methods of describing how data flows. We’re grateful to Dawn and Robin for taking this important topic on at the conference, and to all the breakout attendees who contributed their thoughts and ideas. If mapping data access is of interest to you we suggest you join the conversation on the forum or get in touch with us directly.
Mapping Data Access
By Dawn Nafus and Robin Barooah
One of the great pleasures of the QS community is that there is no shortage of smart, engaged self-trackers who have plenty to say. The Mapping Data Access session was no different, but before we can tell you about what actually happened, we need to explain a little about how the session came into being.
Within QS, there has been a longstanding conversation about open data. Self-trackers have not been shy to raise complaints about closed systems! Some conversations take the form of “how can I get a download of my own data?” while other conversations ask us to imagine what could be done with more data interoperability, and clear ownership over one’s own data, so that people (and not just companies) can make use of it. One of the things we noticed about these conversations is that when they start from a notion of openness as a Generally Good Thing, they sometimes become constrained by their own generality. It becomes impossible not to imagine a big pot of data in the sky. It becomes impossible not to wonder about where the one single unifying standard is going to come from that would glue all this data together in a sensible way. If only the world looked something like this…
We don’t have a big pot of data in the sky, and yet data does, more or less, move around one way or another. If you ask where data comes from, the answer is “it depends.” Some data come to us via just a few noise-reducing hops away from the sensors from which they came, while others are shipped around through multiple services, making their provenance more difficult to track. Some points of data access come with terms and conditions attached, and others less so. The system we have looks less like a pot and more like this…
… a heterogeneous system where some things connect, but others don’t. Before the breakout session, QS Labs had already begun a project to map the current system of data access through APIs and data downloads. It was an experiment to see if having a more concrete sense of where data actually comes from could help improve data flows. These maps were drawn from what information was publicly available, and our own sense of the systems that self-trackers are likely to encounter.
Any map has to make choices about what to represent and what to leave out, and this was no different. The more we pursued them, the more it became clear that one map was not going to be able to answer every single question about the data ecosystem, and that the choices about what to keep in, and what to edit out, would have to reflect how people in the community would want to use the map. Hence, the breakout session: what we wanted to know was, what questions did self-trackers and toolmakers have that could be answered with a map of data access points? Given those questions, what kind of a map should it be?
Participants in the breakout session were very clear about the questions they needed answers to. Here are some of the main issues that participants thought a mapping exercise could tackle:
Tool development: If a tool developer is planning to build an app, and that app cannot generate all the data it needs on its own, it is a non-trivial task to find out where to get what kind of data, whether the frequency of data collection suits the purpose, whether the API is stable enough, etc. A map can ease this process.
Making good choices as consumers: Many people thought they could use a map to better understand whether the services they currently used cohered with their own sense of ‘fair dealings.’ This took a variety of forms. Some people wanted to know the difference between what a company might be capable of knowing about them versus the data they actually get back from the service. Others wanted a map that would explicitly highlight where companies were charging for data export, or the differences between what you can get as a developer working through an API and what you can get as an end user downloading his or her own data. Others still would have the map clustered around which services are easy/difficult to get data out of at all, for the reason that (to paraphrase one participant) “you don’t want to end up in a data roach motel. People often don’t know beforehand whether they can export their own data, or even that that’s something they should care about, and then they commit to a service. Then they find they need the export function, but can’t leave.” People also wanted the ability to see clearly the business relationships in the ecosystem so they could identify the opposite of the ‘roach motel’—“I want a list of all the third party apps that rely on a particular data source, because I want to see the range of possible places it could go.”
Locating where data is processed: Many participants care deeply about the quality of the data they rely on, and need a way of interpreting the kinds of signals they are actually getting. What does the data look like when it comes off the sensor, as opposed to what you see on the service’s dashboard, as opposed to what you see when you access it through an API or export feature? Some participants have had frustrating conversations with companies about what data could fairly be treated as ‘raw’ versus where the company had cleaned it, filtered it, or even created its own metric that they found difficult to interpret without knowing what, exactly, goes into it. While some participants did indeed want a universally-applicable ‘quality assessment,’ as conveners, we would point out that ‘quality’ is never absolute—noisy data at a high sample rate can be more useful for some purposes than, say, less noisy but infrequently collected data. We interpreted the discussion to be, at minimum, a call for greater transparency in how data is processed, so that self-trackers can have a basis on which to draw their own conclusions about what it means.
Supporting policymaking: Some participants had a sense that maps which highlighted the legal terms of data access, including the privacy policies of service use, could support the analysis of how the technology industry is handling digital rights in practice, and that such an analysis could have public policy implications. Sometimes this idea didn’t take the form of a map, but rather a chart that would make the various features of the terms of service comparable. The list mentioned earlier of which devices and services rely on which other services was important not just to be able to assess the extent of data portability, but also to assess what systems represent more risk of data leaking from one company to another without the person’s knowledge or consent.
As part of the breakout, the group drew their own maps: maps that either they would like to exist in the world even if they didn’t have all the details, or maps of what they thought happened to their own data. One person, who drew a map of where she thought her own data goes, commented (again, a paraphrase) “All I found on this map was question marks, as I tried to imagine how data moves from one place to the next. And each of those question marks appeared to me to be an opportunity for surveillance.”
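One way to picture the "which apps rely on which data source" question raised by participants is as a small directed graph. The Python sketch below is only an illustration under our own assumptions: the service names are invented placeholders, and the edges do not describe any real company's data flows.

```python
from collections import defaultdict

# Hypothetical (source, consumer) pairs: "consumer reads data from source".
DATA_FLOWS = [
    ("tracker_a", "dashboard_x"),
    ("tracker_a", "aggregator_y"),
    ("aggregator_y", "coach_z"),
    ("tracker_b", "aggregator_y"),
]


def build_map(flows):
    """Group consumers by the source they read from."""
    graph = defaultdict(set)
    for source, consumer in flows:
        graph[source].add(consumer)
    return graph


def downstream(graph, source):
    """Every service that could receive data originating at `source`,
    whether directly or through intermediaries."""
    seen = set()
    stack = [source]
    while stack:
        node = stack.pop()
        for consumer in graph.get(node, ()):
            if consumer not in seen:
                seen.add(consumer)
                stack.append(consumer)
    return seen


graph = build_map(DATA_FLOWS)
print(sorted(downstream(graph, "tracker_a")))
```

A service with an empty downstream set would be a candidate "roach motel" in the participant's sense, while a long downstream list shows both the range of places data can go and the number of hops where it could move without the person noticing.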
What next for mapping?
If you are a participant, and you drew a map, it would help continue the discussion if you talked a little more about what you drew on the breakout forum page. If you would like to get involved in the effort, please do chime in on the forum, too.
Clearly, these ecosystems are liable to change more rapidly than they can be mapped. But given the decentralized nature of the current system (which many of us see as a good thing) we left the breakout with the sense that some significant social and commercial challenges could in fact be solved with a better sense of the contours and tendencies of the data ecosystem as it works in practice.
 This work was supported by Intel Labs and the Robert Wood Johnson Foundation. One of us (Dawn) was involved in organizing support for this work, and the other (Robin) worked on the project. We are biased accordingly.
Earlier today John Wilbanks sent out this tweet:
— John Wilbanks (@wilbanks) December 11, 2013
John was lamenting the fact that he couldn’t export and store the genome interpretations that 23andMe provides (they do provide a full export of a user’s genotype). By the afternoon two developers, Beau Gunderson and Eric Jain, had submitted their projects. (You can view them here and here.)
We’ve been doing some exploration and research about QS APIs over the last two years and we’ve come to understand that data export is a key function of personal data tools. Being able to download and retain an easily decipherable copy of your personal data is important for a variety of reasons. One just needs to spend some time in our popular Zeo Shutting Down: Export Your Data thread to understand how vital this function is.
We know that some toolmakers already include data export as part of their user experience, but many do not, or provide only partial support. I’m proposing that we, as a community of people who support and value the ability to find personal meaning through personal data, work together to provide the tools and knowledge to help people access their data.
Would you help and be a part of our Personal Data Task Force*? We can work together to build a common set of resources, tools, how-to’s and guides to help people access their personal data. I’m listening for ideas and insights. Please let me know what you think and how you might want to help.
*We’re inspired by Sina Khanifar’s work on the Rapid Response Internet Task Force.
If you’re a loyal, or even infrequent, user of the Zeo sleep tracking device then you’ve probably heard the sad news that the company has shut down. This opens up a lot of questions about what it means to make consumer devices in this day and age, but rather than focus on those issues we’d like to talk a bit about data.
Zeo has unfortunately been a little quiet on the communication front and there are quite a few users out there who are wondering about what will happen to all those restless nights and sound sleeps that were captured by their device. This has been compounded by the fact that the Zeo website went down for a short time (it is up as of this writing), closing off access to user accounts and the data therein. Luckily for you, there have been quite a few enterprising and enthusiastic individuals who have taken the time to create or highlight ways to capture and store your Zeo data.
Use The Zeo Website
You can’t fault Zeo for making it hard to access your own data. As long as their website is up you can easily download your sleep data by logging into your user account at mysleep.myzeo.com. After logging into your account you will see a link on the right hand side labeled “Export Data.” Click that link and you’ll be able to download a CSV file containing all your sleep data. They’ve even provided a description of the data and formats that you can download here.
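Once you have the CSV file, it is worth checking that it opened cleanly before you rely on it as your only copy. The Python sketch below makes no assumptions about Zeo’s actual column names (see their format description for those); it just reports whatever header and row count it finds. The sample text and its column names are made-up placeholders standing in for a real export.

```python
import csv
import io


def summarize_export(csv_text):
    """Return (column_names, row_count) for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return list(reader.fieldnames or []), len(rows)


# Placeholder data, NOT the real Zeo export format.
sample_text = "night,hours\n2013-03-01,7.5\n2013-03-02,6.0\n"

columns, row_count = summarize_export(sample_text)
print("columns:", columns)
print("rows:", row_count)
```

To run it against your own download, read the file’s contents into a string (for example with `open(path, newline="").read()`, where `path` is wherever you saved the export) and pass that to `summarize_export`.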
Eric Blue’s FreeMyZeo Data Exporter
QS Los Angeles Meetup Organizer and hacker extraordinaire Eric Blue whipped up a simple data export tool using the Zeo API. The great thing about Eric’s tool is that even if the myZeo web portal goes down it should continue to work.
Download Data Directly From the Device
If you’re using a Zeo bedside device then you can continue to use it and download the data directly from the memory card without uploading it to the Zeo website. In order to do this you’ll have to read the documentation and use the Data Decoder Library. These files are hard to find as they’ve been removed from the Zeo developer website, but you can access them from our Forum thanks to our friend Dan Dascalescu. Zeo also created a viewer using this library that you can use via this SourceForge page.
If you’ve found another way to download Zeo data please let us know. You can also participate in the great forum discussion that inspired this post.