Use TwArχiv to analyze your Twitter archives
January 9, 2018
We are happy to welcome this guest post on a community tool by Bastian Greshake Tzovaras. Bastian is the director of research at the Open Humans project. He can be found online at @gedankenstuecke. -Steven
I’ve built a Twitter analysis web application that’s open to everyone to use and learn from. Often the best data for learning something about yourself are data you’ve already collected, sometimes without even being explicitly aware of collecting them. Social media activity, for example. We often send off Facebook posts or tweets with very little thought about the metadata that we generate in doing so. Where was I when I made that post? What time was it? What type of content did it contain? Did I retweet or reply to another person’s post? And, of course, what did my post actually say?
These data can be extremely powerful – for others. The language you use in your tweets can be used to predict your age as well as your income. Twitter uses the data to gather information about your likes, dislikes, and possessions – among other topics. But what if you want to learn about yourself with your own Twitter data?
The tool I created allows anybody to explore their own Twitter archive in detail. First, you’ll want to request your archive from Twitter. It will contain all the tweets you have ever sent, with not only the text but all the metadata as well. To look at these metadata, go to my small web application called TwArχiv (pronounced tw-archive), which allows you to upload your data and explore it using interactive graphs.
For instance, you can see how the nature of the tweets you send changes over time. Are you replying more to people than you used to, or is it all just retweets by now? For my own data it seems that finishing up my PhD work had quite an impact, starting in late 2016. With less procrastination I wrote fewer unprompted tweets. Instead, replying to people became more central to my Twitter experience.
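If you would rather count tweet types yourself, here is a minimal sketch of the idea. It assumes the archive provides a CSV with the columns `timestamp`, `text`, `in_reply_to_status_id` and `retweeted_status_id`; those column names, the sample rows and the helper functions are assumptions for illustration, not a description of how TwArχiv does it internally.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for an archive export. The column names
# and timestamp format are assumptions; adjust them to your own file.
SAMPLE = """timestamp,text,in_reply_to_status_id,retweeted_status_id
2016-11-02 09:15:00 +0000,Writing my thesis today,,
2016-11-05 18:40:00 +0000,@friend good point!,111,
2016-12-01 08:00:00 +0000,RT @someone: interesting paper,,222
2016-12-03 21:10:00 +0000,@friend agreed,333,
"""


def tweet_kind(row):
    """Label a tweet as 'retweet', 'reply' or 'original'."""
    if row["retweeted_status_id"]:
        return "retweet"
    if row["in_reply_to_status_id"]:
        return "reply"
    return "original"


def kinds_per_month(csv_file):
    """Count tweets per (YYYY-MM, kind) pair."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        month = row["timestamp"][:7]  # first 7 characters are 'YYYY-MM'
        counts[(month, tweet_kind(row))] += 1
    return counts


counts = kinds_per_month(io.StringIO(SAMPLE))
for (month, kind), n in sorted(counts.items()):
    print(month, kind, n)
```

To try it on real data, you would pass an opened file instead of the in-memory sample, for example `kinds_per_month(open("tweets.csv", newline="", encoding="utf-8"))`, where the filename is again an assumption.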
There is also plenty of research on gender bias in social media usage and whose voices are being amplified, with men being overwhelmingly favored. TwArχiv allows one to do some soul searching on this. It tries to predict the gender of the people you interact with based on their first names and shows you whether your reply and retweet behaviour is gender-balanced.
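To make the first-name approach concrete, here is a rough sketch. The tiny lookup table, the function names and the sample display names are all hypothetical; a real analysis would need a far larger name list, and this is not necessarily how TwArχiv implements its prediction.

```python
from collections import Counter

# Hypothetical, deliberately tiny lookup table, for illustration only.
NAME_TO_GENDER = {
    "anna": "female",
    "maria": "female",
    "peter": "male",
    "john": "male",
}


def guess_gender(display_name):
    """Guess a gender from the first word of a display name."""
    parts = display_name.strip().split()
    if not parts:
        return "unknown"
    return NAME_TO_GENDER.get(parts[0].lower(), "unknown")


def gender_balance(display_names):
    """Tally guessed genders over the accounts you interacted with."""
    return Counter(guess_gender(name) for name in display_names)


balance = gender_balance(
    ["Anna Schmidt", "Peter Pan", "john doe", "Open Humans", ""]
)
print(balance)
```

Anything the table does not recognise, such as organisation accounts or empty names, lands in an explicit "unknown" bucket instead of being forced into a category.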
My own graphs show that I had (and have) a good way to go here. The year 2010 in particular is wildly off when it comes to the gender representation in my Twitter interactions. What happened during that time? I was politically active in the German Pirate Party, which was infamous for being a “boys’ club”.
If you have geolocation enabled on your tweets, you can get an idea of where you tweet. With a fully zoomable map, TwArχiv allows you to explore the globe at all scales to see the broader picture as well as street-level tweet distributions. As a first attempt at seeing movement patterns, you can also get a time-stamped version of the map that highlights locations one tweet at a time.
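The time-stamped view boils down to pulling out the geotagged tweets and ordering them by time. Below is a small sketch of that step. It assumes each tweet is a dictionary with a `created_at` string and a GeoJSON-style `coordinates` entry (longitude first, then latitude); the field names, the timestamp format and the sample data are assumptions, so check them against your own archive.

```python
from datetime import datetime

# Hypothetical tweets; field names and timestamp format are assumptions.
TWEETS = [
    {"created_at": "2017-06-02 10:00:00 +0000",
     "coordinates": {"type": "Point", "coordinates": [13.405, 52.52]}},
    {"created_at": "2017-06-01 08:30:00 +0000",
     "coordinates": None},  # no geolocation on this tweet
    {"created_at": "2017-05-30 19:45:00 +0000",
     "coordinates": {"type": "Point", "coordinates": [8.6821, 50.1109]}},
]


def geotagged_points(tweets):
    """Return (time, latitude, longitude) tuples, oldest first."""
    points = []
    for tweet in tweets:
        geo = tweet.get("coordinates")
        if not geo:
            continue  # skip tweets without a location
        lon, lat = geo["coordinates"]  # GeoJSON order: longitude, latitude
        when = datetime.strptime(tweet["created_at"], "%Y-%m-%d %H:%M:%S %z")
        points.append((when, lat, lon))
    return sorted(points)


points = geotagged_points(TWEETS)
for when, lat, lon in points:
    print(when.isoformat(), lat, lon)
```

Stepping through `points` in order gives the one-tweet-at-a-time sequence; any mapping library can then place each latitude and longitude pair on a map.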
If you want to give it a try with your own archive, you can head to TwArχiv.org. The data storage is handled by Open Humans and by default your archive and the resulting visualizations will be private. (You can choose to make them public, though, to share them with your friends and followers – mine are here!)
A note: The Twitter archive does not contain any direct messages, only your tweets. If you have a public Twitter account, the archive therefore basically consists of your “public Twitter interactions”.
If you have ideas on how to extend the functionality of TwArχiv or you want to code your own Twitter archive analysis, you could even get funding to do so: the Open Humans mini-grants of USD 5,000 for projects that will enrich the Open Humans ecosystem are a perfect fit for this kind of data visualization and analysis.