Tag Archives: data analysis
Like many people paying attention to the press around Quantified Self, self-tracking, and wearable technology, I was intrigued by the many articles that focused on a newly published research letter in the Journal of the American Medical Association. The letter, Accuracy of Smartphone Applications and Wearable Devices for Tracking Physical Activity Data, authored by Meredith A. Case et al., described a laboratory study that examined a few different smartphone applications and self-tracking devices. Specifically, they tested the accuracy of steps reported by three different apps: Moves (Galaxy S4 and iPhone 5s), Withings Health Mate (iPhone 5s), and the Fitbit app (iPhone 5s); three wrist-worn devices: Nike Fuelband, Fitbit Flex, and the Jawbone UP24; and three waist-worn devices: Fitbit One, Fitbit Zip, and the Digi-Walker SW-200. Participants walked on a treadmill at 3.0 MPH for trials of 500 steps and 1500 steps while a research assistant manually counted the actual steps taken. Here’s what they found:
As the data from this research isn’t available, we’re left to rely on the authors’ description of the data. They state that differences in observed vs. device-recorded step counts “ranged from −0.3% to 1.0% for the pedometer and accelerometers [waist], −22.7% to −1.5% for the wearable devices [wrist], and −6.7% to 6.2% for smartphone applications [phone apps].” Overall the authors concluded that devices and smartphone apps were generally accurate for measuring steps. However, much of the press around this study dipped into the realm of sensationalism or attention-grabbing headlines, for instance: Science Says FitBit Is a Joke.
Part of our work here at Quantified Self Labs is to encourage and help individuals make sense of their own data. After reading this research letter, or one of the many articles which covered it, you might be asking yourself, “I wonder if my device is accurate?” or “Should I be using a step tracking device or just my phone?” In the interest of helping people make sense of their data so that they can come to their own conclusions I decided to do a quick analysis of my own personal data.
For this analysis I examined the step data derived from my Fitbit One and the Moves app I have installed on my iPhone 5. (Important note: the iPhone 5 does not have the M7 or M8 chip present on the 5s and 6/6+, respectively, which natively tracks steps.) I had a sneaking suspicion that my data experience differed from the findings of Case and her colleagues. Specifically, I had a hypothesis that the data from everyday tracking via the Moves app would be significantly different than data from my Fitbit One.
First, I downloaded and exported my daily aggregate Fitbit data for 2014 using our Google Spreadsheets Fitbit script. I then exported my complete Moves app data via their online web portal. To create a daily aggregate step value from my Moves data I summed the steps of all activities in the summary_2014.csv file for each day. (Side note: We’ll be publishing a series of how-tos for doing simple data transformations like this soon.) This allowed me to create a file with daily aggregate step data from both Moves and my Fitbit for each day of 2014. Unfortunately I did not have my Fitbit for the first few weeks of 2014, so the data represents step counts for 342 days (1/24/14 to 12/31/14).
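For anyone who wants to try the same collapsing step on their own export, here is a minimal sketch in Python. It assumes the file has one row per activity per day with a `Date` column and a `Steps` column; those column names, the `daily_steps` helper, and the sample rows are illustrative assumptions, not the documented Moves format, so adjust them to match your file.

```python
import csv
import io
from collections import defaultdict


def daily_steps(csv_file, date_col="Date", steps_col="Steps"):
    """Sum the steps of every activity row that shares the same date.

    The column names are assumptions; change them to match your export.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        raw = (row.get(steps_col) or "").strip()
        # Activities such as cycling may have no step value; count them as 0.
        totals[row[date_col]] += int(float(raw)) if raw else 0
    return dict(totals)


# Tiny made-up sample standing in for summary_2014.csv (not real data).
sample = io.StringIO(
    "Date,Activity,Steps\n"
    "1/24/14,walking,4200\n"
    "1/24/14,running,1800\n"
    "1/24/14,cycling,\n"
    "1/25/14,walking,5100\n"
)
moves_daily = daily_steps(sample)
print(moves_daily)
```

To run it on a real file, replace the sample with `open("summary_2014.csv", newline="")`.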
I found that my Fitbit One consistently reports a higher number of total steps per day than my Moves app. Overall, for the 342 days I had 689,192 more steps reported by Fitbit than by the Moves app. The descriptive information is included in the table below:
Another way to look at this is by visualizing both data sets across the full time-frame:
There are a few interesting things to point out in this dataset. On two days I have 0 steps reported from my Moves app. One day, Moves was unable to connect with their online service because I was in an area with little to no cell signal. On the other day my phone was off, probably due to an iOS 8 release and having to reboot my phone a few times.
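The comparison described above (the total difference, a few descriptive numbers, and the days on which Moves reported 0 steps) can be sketched in a few lines of Python. This is only an illustration, not the exact procedure used for this post: the `compare_daily` helper, the dictionary layout, and the sample values are all made up for the example.

```python
from statistics import mean, median


def compare_daily(fitbit, moves):
    """Compare two {date: steps} dicts on the dates they have in common."""
    days = sorted(set(fitbit) & set(moves))
    diffs = [fitbit[d] - moves[d] for d in days]
    return {
        "days": len(days),
        "total_difference": sum(diffs),
        "mean_difference": mean(diffs),
        "median_difference": median(diffs),
        "days_fitbit_higher": sum(1 for x in diffs if x > 0),
        # Days with no Moves steps usually point to a sync or phone problem.
        "zero_moves_days": [d for d in days if moves[d] == 0],
    }


# Made-up values for illustration only (not the data from this post).
fitbit = {"2014-01-24": 9000, "2014-01-25": 7000, "2014-01-26": 8000}
moves = {"2014-01-24": 6000, "2014-01-25": 7500, "2014-01-26": 0}
summary = compare_daily(fitbit, moves)
print(summary)
```

Listing the zero-step days separately makes it easy to decide whether to leave them out before drawing conclusions.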
It is also clear to me that differences in data are related to how I wear my Fitbit and use my phone. My Fitbit is basically on my hip from the time I wake up until the time I go to bed each night. However, my phone isn’t always “on my body” throughout the day. I think this is probably the case for most people.
Since I wear my Fitbit at all times some of the data it captures erroneously is included in the total step count. For instance, for the last few months in this data set I was commuting about 10 miles per day during the week by bike. This data is accurately captured as cycling by Moves, but captured as steps by my Fitbit. Therefore some over-reporting by Fitbit is present in the data.
For my own data I found that the Fitbit reports higher steps on most, if not all, days than the Moves app on my iPhone 5. There are a few caveats with this data and analysis that are worth mentioning. First, this exploration was intended to begin a conversation around the real-world use of activity monitoring apps and devices, and the data they collect. It was not intended as a statement on truth or validity (however, I would welcome the help of a volunteer to follow me around with a manual clicker counting all my steps). Second, this analysis was undertaken in part to help you understand that scientists of all types, be they citizen or academic, have the ability to work with their own data in order to come to their own conclusions about what works or doesn’t work for them. Lastly, this analysis was completed very quickly and I am sure that other individuals may have different ideas about how to explore and analyze the data. For this reason I’m posting the daily aggregate values in an open Google Spreadsheet here.
Eric Jain stumbled upon a study published in 2013 that found that a full moon was associated with less sleep. Being an avid self-tracker and a toolmaker, he decided to find out if that was true for him as well. Eric used his tool, Zenobase, to import, aggregate, filter, and then analyze his sleep data in a few unique ways. While he found some evidence that a full moon was associated with less total sleep, he wasn’t able to find any statistically significant results. Watch his short video below, filmed at the Seattle QS meetup group, then take a look at his great screencast where he walks through all his steps to complete this analysis.
We are not the only ones curious about whether our activity level looks different when seen with different trackers. Bastian Greshake, co-founder of OpenSNP.org, has been comparing his FuelBand and his Fitbit for months. Here’s what he found.
Inspired by Ernesto’s post I wanted to take a look at how my data for the Fitbit and the FuelBand compare to each other. I started wearing the FuelBand in October of last year. Since then it has only left my wrist to recharge the battery. I was already carrying a Fitbit Ultra, which I’ve had since May 2012. I wear the FuelBand on my dominant arm. The Fitbit is usually clipped to the pocket of my jeans and I have it on my non-dominant arm while sleeping. From my day-to-day experience I have a sense that FuelBand steps are usually a good way below the Fitbit steps. But I also thought that the difference was getting smaller, probably due to firmware updates on the FuelBand.
Using the Fitbit-API (and its integration into openSNP) it’s quite easy to get a file that contains all step counts measured with the Ultra. If you have an openSNP account you can download the complete file, also including sleep data and body measurements, here. Unfortunately the Nike+ API isn’t ready yet, so one needs to manually scrape the data. As this is boring work that can’t easily be automated I only got FuelBand step data back to 2013/11/16. Still, that should be enough to get a first insight into how both devices compare.
Ian Clements has been self-tracking since 1974 – mostly exercise, weight, and general health indicators. But in 2007 he was diagnosed with terminal cancer. This set off a more comprehensive mission of self-tracking to figure out which lifestyle changes and supplements were helping him to live longer. In the video below, Ian walks through his fascinating and detailed journey in data analysis land and shares the lessons he has learned. (Filmed by the London QS Show&Tell meetup group.)
Some people may be wondering how I find all the amazing people conducting neat self-tracking experiments and creating jaw-dropping personal data visualizations. Well, for the most part I just listen. I’m constantly paying attention to what’s being said on Twitter about #QuantifiedSelf. When that doesn’t work I just use the power of Google to find people who are blogging about self-tracking, self-experimentation, or personal data. It’s great to look through the search results and see how many people are sharing their personal stories and insights. While doing some searching this morning I stumbled across a project that immediately brought a smile to my face. Hopefully you’re as excited by this as I am.
Chris Volinsky is a statistician at AT&T Research and he’s no stranger to handling large data problems. Back in 2008 he was part of the team that won the $1 Million Netflix Prize. He also has quite the impressive list of research papers that illustrate the many different uses of cellphone location data. But what is really interesting about Chris is his newest project: My Year of Data.
Back in November of 2011 Chris started off a blog entry with this:
My name is Chris. I am 40 years old. I am 5’9 1/2″ and weigh 174 pounds. I walked 9,048 steps and have consumed 1,406 calories today (so far).
Realizing that he’d been gaining weight and wasn’t at his optimal health, he decided to take a data-centric approach to improving his health. He is a statistician after all. So far, he’s found some interesting things. Take for instance his weight and dietary tracking.
As he explains in this post, Chris typically has a hard time tracking his diet consistently. This can be pretty frustrating when you hear about how important it is to eat this or not eat that to help with weight reduction. Rather than get frustrated, Chris turned to the data to see what he could learn. When he stopped looking at the data he was entering and started looking at the missing data, an interesting trend leapt out. He found that fluctuations in his weight appeared to be correlated with whether or not he was logging food. Take for instance the plot below. It appears that there is a pretty clear association between periods of weight loss and periods of actively logging his food (pink zones). The opposite also appears to be true – no food logging = weight gain.
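If you would like to check for the same pattern in your own records, one rough way is to split day-over-day weight changes by whether food was logged that day. The sketch below is an assumption about how such a check could look, not Chris’s actual method; the `weight_change_by_logging` helper and the sample numbers are invented for illustration.

```python
from datetime import date
from statistics import mean


def weight_change_by_logging(weights, logged_days):
    """Average day-over-day weight change, split by food-logging status.

    weights: {date: weight}; logged_days: set of dates with a food-log entry.
    """
    days = sorted(weights)
    logged, unlogged = [], []
    for prev, cur in zip(days, days[1:]):
        if (cur - prev).days != 1:
            continue  # skip gaps between weigh-ins
        change = weights[cur] - weights[prev]
        (logged if cur in logged_days else unlogged).append(change)
    return {
        "logged": mean(logged) if logged else None,
        "unlogged": mean(unlogged) if unlogged else None,
    }


# Made-up numbers for illustration only.
weights = {
    date(2012, 1, 1): 174.0,
    date(2012, 1, 2): 173.5,
    date(2012, 1, 3): 173.0,
    date(2012, 1, 4): 173.5,
    date(2012, 1, 5): 174.5,
}
logged_days = {date(2012, 1, 2), date(2012, 1, 3)}
result = weight_change_by_logging(weights, logged_days)
print(result)
```

A negative average on logged days and a positive one on unlogged days would match the pattern in the plot, though a simple split like this says nothing about cause.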
So this is where a typical NFATW post would stop. We have an interesting finding and a neat data visualization. But, Chris is doing something much more interesting than just talking about his weight data. He is on a long-term self-tracking and self-discovery journey and he is trying to enlist other interested parties to help him. Chris is going the extra step and posting all of his self-tracking data online for anyone to analyze, visualize, or just get inspired.
You can access all of his amazing data via a public dropbox folder that he’s set up. He even has a nice README file explaining the datasets and formats. So far he’s sharing the following:
- Fitbit: sleep and activity data
- FitLinxx: weight training data from gym activities
- Livestrong: dietary tracking data
- Runkeeper: running and other exercise activity data
- RescueTime: productivity tracking (computer/internet use)
All the data is open and available for you to play with. This should be a really interesting project to keep “track” of in the future (pun definitely intended). To help inspire some action on your part I took some time today and looked at Chris’s most recent available data to see what I could find out. I downloaded his Fitbit data and decided to look for any interesting patterns. Turns out that when taking a look at his daily patterns of activity there seems to be something going on on Thursdays that reduces his step count and activity time. Also, Saturday is by far the best day, with an average of 9,862.56 steps and 5.3 hours spent being active (data available here).
Make sure to reach out to Chris over at his blog and take a look at his data to see what interesting things you can figure out!
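A day-of-week breakdown like this one is easy to reproduce. Here is a minimal Python sketch that averages daily step counts by weekday; the `mean_steps_by_weekday` helper and the sample values are hypothetical, and it assumes you have already loaded the Fitbit export into a dictionary of dates and step counts.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_steps_by_weekday(daily):
    """Average steps per weekday name from a {date: steps} dict."""
    buckets = defaultdict(list)
    for day, steps in daily.items():
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        buckets[WEEKDAYS[day.weekday()]].append(steps)
    return {name: mean(values) for name, values in buckets.items()}


# Made-up values for illustration only (not Chris's data).
daily = {
    date(2012, 1, 5): 5000,    # Thursday
    date(2012, 1, 7): 9000,    # Saturday
    date(2012, 1, 12): 6000,   # Thursday
    date(2012, 1, 14): 11000,  # Saturday
}
averages = mean_steps_by_weekday(daily)
print(averages)
```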
Every few weeks be on the lookout for new posts profiling interesting individuals and their data. If you have an interesting story or link to share leave a comment or contact the author here.
This post and instructions are no longer up to date. For a current how-to please visit the updated post.
On February 11th FitBit released their API into the wild and let developers get to work. Since then there have been some very neat integrations. One of the best uses of the API is the open source script that enables users to download their data into Google Spreadsheets. Developed by John McLaughlin, this script gives everyone the ability to get their historical data from FitBit and play with visualizations and analytics. Even someone without any programming experience can start creating very neat dynamic charts and graphs in under 30 minutes. For example, I created the following charts in just a few minutes (click images for interactive versions):
If you already have a FitBit you might be wondering how to actually implement John’s script to grab your own data and start making fun charts and graphs. It takes about 15 minutes from start to finish to set up your FitBit developer account and then set up the script in Google Docs. The step-by-step process is outlined after the jump.