23 Mar Twitter Metrics
Twitter has arguably been the most popular of the data sources that form the basis of so-called altmetrics. Tweets to scholarly documents have been heralded both as early indicators of citations and as measures of societal impact. This project provides an overview of Twitter activity as the basis for scholarly metrics from a critical point of view and describes both the potential and the limitations of scholarly Twitter metrics. By reviewing the literature on Twitter in scholarly communication and analyzing 24 million tweets linking to scholarly documents, it aims to provide a basic understanding of what tweets can and cannot measure in the context of research evaluation. Going beyond the limited explanatory power of low correlations between tweets and citations, the chapter considers what types of scholarly documents are popular on Twitter, and how, when and by whom they are diffused, in order to understand what tweets to scholarly documents measure.
The analysis of tweets that mention scholarly documents is based on Twitter data collected by Altmetric.com until June 2016, which contains 24.3 million tweets mentioning 3.9 million unique documents. Altmetric’s Twitter data is matched to bibliographic information from WoS using the DOI. This match between document metadata and tweets makes it possible to determine how much scholarly output does and does not get tweeted. The link to WoS data also provides access to cleaner and extended metadata of tweeted documents, such as the publication year, journal, authors and their affiliations, and a classification system of scientific disciplines. At the same time, matching the two databases excludes tweets to publications not indexed in WoS and thus comes with the known restrictions and biases of WoS coverage. This is why the following analysis describes results for two datasets: one containing all 24.3 million tweets covered by Altmetric, and one containing 3.9 million tweets mentioning documents with a DOI, covered by WoS and published in 2015.
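The DOI-based match described above can be illustrated with a minimal sketch. It is not the procedure used in the chapter: the record layout (dictionaries with the hypothetical keys `doi`, `tweet_id` and `year`) and the normalization rules are assumptions made for illustration only.

```python
# Hypothetical record layout: each tweet and each WoS record is a dict
# with a "doi" key. Neither Altmetric nor WoS necessarily uses these names.

DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi):
    """Lowercase a DOI and strip a resolver prefix so both sources compare equal."""
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def match_tweets_to_wos(tweets, wos_records):
    """Split tweets into matched and unmatched, and report the share of WoS
    records that received at least one tweet."""
    wos_by_doi = {normalize_doi(rec["doi"]): rec for rec in wos_records}
    matched, unmatched = [], []
    for tweet in tweets:
        record = wos_by_doi.get(normalize_doi(tweet["doi"]))
        if record is None:
            unmatched.append(tweet)  # publication not indexed in WoS
        else:
            matched.append({**tweet, **record})  # tweet plus WoS metadata
    tweeted_dois = {normalize_doi(t["doi"]) for t in tweets}
    covered = tweeted_dois & set(wos_by_doi)
    coverage = len(covered) / len(wos_by_doi) if wos_by_doi else 0.0
    return matched, unmatched, coverage
```

The unmatched list corresponds to the tweets that drop out of the WoS-based dataset, and the coverage value to the share of indexed output that gets tweeted at all.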
The largest number of tweeted documents were eprints from arXiv: a total of 319,411 arXiv submissions were tweeted 1.1 million times by 110,134 users. The number of tweeted documents and the number of unique users diverge, particularly for the most popular sources. Although arXiv, PLOS ONE and SSRN are the most popular platforms according to the number of tweeted documents, Nature, The Conversation and PLOS ONE are tweeted by the largest number of users.
Dividing Twitter accounts into three groups of top 1%, 9% and 90% of users (according to number of tweets) helps to distinguish lead and highly active users from less active ones. This classification provides insights into tweeting behavior of different types of users. Separating the 601,290 users tweeting a 2015 WoS article by number of tweets, 6,016 lead users, 54,535 highly active and 540,739 least active users can be identified.
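A minimal sketch of such a split is shown below. The text does not say how ties at the group boundaries were handled, so the sketch makes an assumption: it takes the tweet count found at the 1% and 10% rank positions as cutoffs and assigns every user at or above a cutoff to the more active group, which lets groups come out slightly larger than exactly 1% and 9%. The function and parameter names are hypothetical.

```python
def classify_users(tweet_counts, lead_pct=1, active_pct=10):
    """Label each user 'lead', 'highly active' or 'least active'.

    tweet_counts maps a user id to that user's number of tweets.
    Cutoffs are the tweet counts at the lead_pct and active_pct rank
    positions; users tied with a cutoff join the more active group
    (an assumption, not a rule stated in the source).
    """
    if not tweet_counts:
        return {}
    ranked = sorted(tweet_counts.values(), reverse=True)
    n = len(ranked)

    def cutoff(pct):
        # Rank position of the top pct percent, rounded up, at least 1.
        rank = max((n * pct + 99) // 100, 1)
        return ranked[rank - 1]

    lead_cutoff = cutoff(lead_pct)
    active_cutoff = cutoff(active_pct)
    groups = {}
    for user, count in tweet_counts.items():
        if count >= lead_cutoff:
            groups[user] = "lead"
        elif count >= active_cutoff:
            groups[user] = "highly active"
        else:
            groups[user] = "least active"
    return groups


def tweet_shares(tweet_counts, groups):
    """Return each group's share of all tweets."""
    total = sum(tweet_counts.values())
    shares = {}
    for user, count in tweet_counts.items():
        shares[groups[user]] = shares.get(groups[user], 0) + count
    return {group: count / total for group, count in shares.items()}
```

Applied to per-user tweet counts, `classify_users` yields the group labels and `tweet_shares` the proportion of tweets each group contributes.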
Lead users contributed between 84 and 19,973 tweets (median=149), had a median of 935 followers (mean=3,862) and tweeted the 2015 papers over an average tweet span of 598 days. Highly active users contributed between 9 and 83 tweets (median=16), had fewer followers (median=442.5; mean=2,136) and shorter tweet spans (mean=388 days), while least active users tweeted up to eight times (median=1), had 212 followers and were active for a period of 58 days. Lead (top 1% of users), highly active (9%) and least active (90%) users contributed 43%, 31% and 25% of tweets to the entire set of 2015 WoS papers, respectively.
Interestingly, these percentages differ among NSF disciplines, with least active users overrepresented among those tweeting literature from Professional Fields, Social Sciences, Psychology and Earth and Space. In contrast, lead users were overrepresented in Chemistry, Physics, Mathematics and Engineering & Technology, which were the fields exhibiting the lowest Twitter coverage, density and number of unique users. Assuming that members of the general public are among the least active users when it comes to tweeting about scholarly papers, they are more likely to engage with articles published in journals from the Professional Fields and Social Sciences and less likely to tweet Chemistry papers. The high presence of lead users in Chemistry and Physics might, at least partly, be caused by accounts promoting these papers automatically, such as @blackphysicists and @MathPaper, which were the two most active accounts in dataset A, tweeting 51 and 67 scholarly documents per day.
There are clear differences between weekdays and weekends, reflecting patterns of the work week. During the week, tweeting activity increases from Monday (14% of tweets) to a peak on Wednesday (18%) and decreases again towards the weekend. Twitter users tweet, on average, 23% more about scholarly documents on a Wednesday and 41% less on a Sunday. The figure above also shows the different magnitude of Twitter activity among years, as well as an overall increase throughout each year. While a general increase from January to December can be observed for each of the four years of tweets, the general trend also reflects the academic year: activity is higher in spring and fall and drops slightly in summer and particularly during the winter break, in the last two weeks and the first weeks of each year.
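The weekday figures can be reproduced in outline with a short sketch. The baseline is an assumption: the text does not state what "23% more" is measured against, so the sketch compares each weekday's share of tweets with an even spread of one seventh per day. The function name and the use of Python `datetime` objects for tweet timestamps are likewise illustrative.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_profile(timestamps):
    """Return {weekday: (share, deviation)} for a list of datetimes.

    share is the fraction of all tweets posted on that weekday;
    deviation is the relative difference from an even 1/7 spread
    (assumed baseline), e.g. 0.23 means 23% more than expected.
    """
    counts = Counter(ts.weekday() for ts in timestamps)  # Monday == 0
    total = sum(counts.values())
    profile = {}
    for index, name in enumerate(WEEKDAYS):
        share = counts[index] / total if total else 0.0
        deviation = share * 7 - 1 if total else 0.0
        profile[name] = (share, deviation)
    return profile
```

A positive deviation marks a day with above-average activity, a negative one a day with below-average activity.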
Haustein, S. (in press). Scholarly Twitter metrics. In W. Glänzel, H.F. Moed, U. Schmoch, & M. Thelwall (Eds.), Handbook of Quantitative Science and Technology Research. Springer.