Wednesday, 17 November 2010

The Tweet Police



The visualisation below shows those tweets using Tableau Public, broken down by time of day, by keyword group and by the content of the tweets. For instance, try typing "youth" into the search box on the viz and the time series is filtered to only those tweets that include the word "youth". Also, try clicking on one of the hour bars to see the keyword trends for that hour: it's possible to see the most incident-packed part of Manchester for any particular hour of the dataset.





The technical bit...

I used Python and the tweepy library to get all the tweets from the four accounts that the GMP used. The Twitter API is quite easy to interact with, and by using the minimum tweet ID seen so far as a filter it was easy to page back through the older tweets. I then stored the data in a local MySQL database and used NLTK in Python to break each tweet down into individual words, which I could then count up. A good thing about tweepy and Python is that OAuth is a lot easier (it currently seems to be impossible in R).
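
For anyone wanting to try something similar, here is a rough sketch of those two steps: paging backwards by tweet ID, then counting words. It is not the code I ran. The names `fetch_all`, `fetch_page` and `count_words` are made up for illustration, the timeline is a handful of invented tweets standing in for the Twitter API, and a plain regular expression stands in for NLTK's tokeniser. With tweepy, `fetch_page` would presumably wrap a call such as `api.user_timeline` with its `max_id` parameter.

```python
import re
from collections import Counter


def fetch_all(fetch_page, page_size=200):
    """Collect every tweet by walking backwards through a timeline.

    fetch_page(max_id, count) is a hypothetical callable that returns up to
    `count` tweets (dicts with "id" and "text"), newest first, whose id is
    <= max_id, or the newest tweets when max_id is None.
    """
    tweets = []
    max_id = None
    while True:
        page = fetch_page(max_id, page_size)
        if not page:
            break
        tweets.extend(page)
        # Next time, ask only for tweets strictly older than the oldest seen.
        max_id = min(t["id"] for t in page) - 1
    return tweets


def count_words(tweets):
    """Lower-case each tweet, split it into word tokens and tally them."""
    counts = Counter()
    for tweet in tweets:
        counts.update(re.findall(r"[a-z0-9']+", tweet["text"].lower()))
    return counts


# Invented tweets standing in for the real API (assumption, not real data).
FAKE_TIMELINE = [
    {"id": 5, "text": "Report of youths throwing stones"},
    {"id": 4, "text": "Abandoned call"},
    {"id": 3, "text": "Youths causing nuisance"},
    {"id": 2, "text": "Report of theft"},
    {"id": 1, "text": "Abandoned call, no reply"},
]


def fake_fetch_page(max_id, count):
    """Mimic a timeline endpoint: newest first, inclusive max_id filter."""
    older = [t for t in FAKE_TIMELINE if max_id is None or t["id"] <= max_id]
    return older[:count]


if __name__ == "__main__":
    all_tweets = fetch_all(fake_fetch_page, page_size=2)
    print(len(all_tweets), "tweets fetched")
    print(count_words(all_tweets).most_common(5))
```

Subtracting one from the smallest ID matters because the `max_id` filter is inclusive; without it the oldest tweet of each page would come back again at the top of the next one.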



