The changing nature of big data sources as a barrier to smart city implementation

Twitter is going to be a major data source for the NEXUS project, as it is for almost every other project studying social media data. Twitter is extraordinarily generous in terms of making its data available, and this generosity has sparked huge research interest.

[Graph: UK Twitter switch to places]

One problem with relying on third-party data, however, is that the provider is under no obligation to keep the service going. Social media companies change their APIs all the time, and the impact on research is a minor consideration when they do. An example is Twitter’s move away from precise co-ordinates and geolocations of tweets towards “places”, in partnership with Foursquare. This move (explained in more detail here and here) happened towards the end of April last year, and has resulted in a significant drop-off in the number of tweets with precise latitude and longitude locations (see the red line in the graph above for stats for the UK).
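To make the difference between precise co-ordinates and “places” concrete, here is a minimal Python sketch of how collected tweets might be sorted into the two kinds, with a fallback to the centre of a place's bounding box. It assumes tweets are stored as dictionaries in the v1.1-style JSON layout (a `coordinates` GeoJSON point and a `place` object with a `bounding_box`). The function names are made up for illustration and are not taken from any NEXUS code.

```python
from collections import Counter


def geo_kind(tweet):
    """Classify a tweet dict as 'point', 'place' or 'none'.

    Assumes v1.1-style fields: 'coordinates' (GeoJSON Point) and 'place'.
    """
    if tweet.get("coordinates"):
        return "point"
    if tweet.get("place"):
        return "place"
    return "none"


def approximate_location(tweet):
    """Return (lon, lat), or None if the tweet carries no location.

    Uses the exact point when present; otherwise falls back to the
    centre of the place's bounding box, which is only a rough estimate.
    """
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # GeoJSON order is [lon, lat]
        return (lon, lat)
    place = tweet.get("place")
    if place and place.get("bounding_box"):
        ring = place["bounding_box"]["coordinates"][0]
        lons = [p[0] for p in ring]
        lats = [p[1] for p in ring]
        return ((min(lons) + max(lons)) / 2, (min(lats) + max(lats)) / 2)
    return None


def share_with_points(tweets):
    """Fraction of tweets that carry a precise point location."""
    counts = Counter(geo_kind(t) for t in tweets)
    total = sum(counts.values())
    return counts["point"] / total if total else 0.0
```

Running `share_with_points` on tweets grouped by week or month would give the kind of trend line shown in the graph. Note that a bounding-box centre can be many kilometres from where the tweet was actually sent, so it is not a drop-in replacement for a precise location.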

We can work around this individual service change, but it does highlight a wider problem for the “big data” driven vision of smart cities: if big data is essentially produced as a byproduct of other activities or services, the data will evolve as the services themselves change (Google Flu Trends is another example of this happening; see the article by Lazer et al.). Hence smart cities that make use of this data will also have to change the way they use it. This, in my view, is one of the major barriers to widespread adoption of this type of data.