Wednesday 17 November 2010

The Tweet Police



The visualisation below shows those tweets using Tableau Public - by time of day, by keyword group and by the content of the tweets. For instance, try typing "youth" into the search box on the viz and the time series of tweets is filtered to only those that include the word "youth". Also, try clicking on one of the hour bars to see the keyword trends for that hour - it's possible to see the most incident-packed part of Manchester for any particular hour of the dataset.





The technical bit...

I used Python and the Tweepy library to get all the tweets from the four accounts that the GMP used. The Twitter API is quite easy to interact with, and by using the minimum tweet ID retrieved so far as a filter, it was easy to page back to the older tweets. I then stored the data in a local MySQL database and used NLTK in Python to break the tweets down into individual words, which I then counted. A good thing about Tweepy and Python is that OAuth is a lot easier (it currently seems to be impossible in R).
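For anyone wanting to try something similar, here is a rough sketch of the two steps described above: paging backwards with the minimum tweet ID, then counting words. It is not the original script. The names `fetch_all_tweets`, `fetch_page` and `count_words` are made up for illustration, the tweets are assumed to be plain dicts with `id` and `text` keys, and a simple regular expression stands in for NLTK's tokenizers so that it runs with only the standard library.

```python
import re
from collections import Counter


def fetch_all_tweets(fetch_page, page_size=200):
    """Collect tweets newest-to-oldest by repeatedly lowering max_id.

    fetch_page(max_id, count) is a hypothetical callable returning a list of
    dicts with "id" and "text" keys, restricted to id <= max_id (or the most
    recent tweets when max_id is None).
    """
    tweets = []
    max_id = None
    while True:
        page = fetch_page(max_id, page_size)
        if not page:
            break  # nothing older is left
        tweets.extend(page)
        # Next request: only tweets strictly older than the oldest one so far.
        max_id = min(t["id"] for t in page) - 1
    return tweets


def count_words(texts):
    """Lowercase each text, split it into word tokens and count them."""
    counts = Counter()
    for text in texts:
        # Regex tokenizer used here as a stand-in for NLTK.
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts
```

With Tweepy, `fetch_page` could probably be a thin wrapper around `api.user_timeline(screen_name=..., count=..., max_id=...)` that turns each returned status into a dict. Subtracting one from the smallest ID stops the oldest tweet of each page being fetched twice.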



