London tweets

Language diversity in the city of London is in the news again due to a research project by Ed Manley and James Cheshire of University College London (UCL) on posts on Twitter collected over the summer just ended. To identify the languages in their collection of tweets they used:

“the Chromium Compact Language Detector – a open-source Python library adapted from the Google Chrome algorithm to detect a website’s language – in detecting the predominant language contained within around 3.3 million geolocated tweets, captured in London over the course of this summer”

There is a zoomable map and an interesting blog post about the results by James Cheshire. The Telly has its say here.

I have previously blogged about language diversity in London, and minority languages on Twitter, but this new work nicely combines both themes. Unfortunately, it presents only a partial picture of the language diversity of London Twitter users, as it “only include[s] people who have a good location (through GPS) and those who are connected to the internet”. Nevertheless, it does show that at least 66 languages were used in the data collected by our UCL colleagues. This is, of course, just the tip of the iceberg of the hundreds of languages spoken but not tweeted in the city.

[Hat tip Mark Liberman at Language Log]

