The growing divide between higher and low impact scientific journals

Ten years ago the Public Library of Science started one big lower impact journal and a series of smaller, higher impact journals. Over the years these publication outlets diverged. The growing divide between standard and top journals might mirror wider trends in scholarly publishing.

There are roughly two kinds of journals in the Public Library of Science (PLoS): low impact (IF = 3.06) and higher impact (3.9 < IF < 13.59) journals. There is only one low impact journal, PLoS ONE, which is bigger in terms of output than all the other journals in PLoS combined. Its editorial policy is fundamentally different to the higher impact journals in that it does not require novelty or ‘strong results’. All it requires is methodological soundness.

Comparing PLoS ONE to the other PLoS journals then offers the opportunity to plot the growing divide between ‘high impact’ and ‘standard’ research papers. I will follow the hypothesis that more and more information is required for a publication (Vale, 2015). More information could be mirrored in three values: the number of references, authors, or pages.

And indeed, the higher impact PLoS journal articles have longer and longer reference sections, a rise of 24% from 46 to 57 references over the last ten years (Pearson r = .11, Spearman rho = .11). See also my previous blog post for a similar pattern in another high impact journal outside of PLoS.


The lower impact PLoS ONE journal articles, on the other hand, practically did not change in the same period (Pearson r = .01, Spearman rho = -.00).
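For readers who want to reproduce this kind of trend check, here is a minimal sketch of the two correlation coefficients reported throughout this post, plus the percentage rise. It is not the code behind the figures; the function names and the toy numbers are assumptions made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# Hypothetical toy data: publication year and reference count per article.
years = [2006, 2008, 2010, 2012, 2014, 2016]
references = [40, 52, 45, 60, 48, 57]

print("Pearson r:", pearson(years, references))
print("Spearman rho:", spearman(years, references))
print("Rise from 46 to 57 references (%):", percent_change(46, 57))
```

Spearman's rho only looks at the ordering of the values, so it is less sensitive to a handful of articles with extremely long reference lists than Pearson's r. Reporting both, as done here, guards against either one being driven by outliers.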


The diverging pattern between higher and low impact journals can also be observed with the number of authors per article. While in 2006 the average article in a higher impact PLoS journal was authored by 4.7 people, the average article in 2016 was written by 7.8 authors, a steep rise of 68% (Pearson r = .12, Spearman rho = .19).


And again, the low impact PLoS ONE articles do not exhibit the same change, remaining more or less unchanged (Pearson r = .01, Spearman rho = .02).


Finally, the number of pages per article tells the same story of runaway information density in higher impact journals and little to no change in PLoS ONE. Limiting myself to articles published until late November 2014 (after which lay-out changes complicate the comparison), the average article grew substantially in higher impact journals (Pearson r = .16, Spearman rho = .13) but not in PLoS ONE (Pearson r = .03, Spearman rho = .02).



So, overall, it is true that more and more information is required for a publication in a high impact journal. No similar rise in information density is seen in PLoS ONE. The publication landscape has changed. More effort is now needed for a high impact publication compared to ten years ago.

Wanna explore the data set yourself? I made a web-app which you can use in RStudio or in your web browser. Have fun with it and tell me what you find.

— — —
Vale, R.D. (2015). Accelerating scientific publication in biology. Proceedings of the National Academy of Sciences, 112, 13439-13446. DOI: 10.1101/022368

The slowing down of the biggest scientific journal

PLoS ONE started 11 years ago to disruptively change scholarly publishing. By now it is the biggest scientific journal out there. Why has it become so slow?

Many things changed at PLoS ONE over the years, reflecting general trends in how researchers publish their work. For one thing, PLoS ONE grew enormously. After publishing only 137 articles in its first year, the number of articles published per year peaked in 2013 at 31,522.


However, as shown in the figure above, since then they have declined by nearly a third. In 2016 only 21,655 articles were published in PLoS ONE. The decline could be due to a novel open data policy implemented in March 2014, a slight increase in the cost to publish in October 2015, or a generally more crowded market place for open access mega journals like PLoS ONE (Wakeling et al., 2016).

However, it might also be that authors are becoming annoyed with PLoS ONE for getting slower. In its first year, it took 95 days on average to get an article from submission to publication in PLoS ONE. In 2016 it took a full 172 days. This renders PLoS ONE no longer the fastest journal published by PLoS, a title it held for nine years.


The graph below shows the development of PLoS ONE in more detail by plotting each article’s review and publication speed against its publication date, i.e. each blue dot represents one of the 159,000 PLoS ONE articles.


What can explain the increasingly poor publication speed of PLoS ONE? Most people might think it is the sheer volume of manuscripts the journal has to process. Processing more articles might simply slow a journal down. However, this slowdown continued until 2015, i.e. beyond the peak in publication output in 2013. Below, I show a more thorough analysis which reiterates this point. The plot shows each article in PLoS ONE in terms of its time from submission to publication and the number of articles published around the same time (30 days before and after). There is a link, for sure (Pearson r = .13, Spearman rho = .15), but it is much weaker than I would have thought.
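The "number of articles published around the same time" measure can be sketched as follows. This is a minimal illustration, not the code behind the plot: the function name is made up, and counting the article itself as part of its own window is an assumption.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def articles_within_window(pub_dates, days=30):
    """For each article, count the articles published within +/- `days`
    of its own publication date (assumption: the article itself counts)."""
    # Work with day numbers so the window is simple integer arithmetic.
    ordinals = sorted(d.toordinal() for d in pub_dates)
    counts = []
    for d in pub_dates:
        day = d.toordinal()
        first = bisect_left(ordinals, day - days)   # first date inside window
        last = bisect_right(ordinals, day + days)   # one past the last date inside
        counts.append(last - first)
    return counts


# Hypothetical publication dates, for illustration only.
dates = [date(2013, 1, 1), date(2013, 1, 20), date(2013, 2, 10), date(2013, 6, 1)]
print(articles_within_window(dates))
```

Sorting once and using binary search keeps this fast enough for a data set of some 159,000 articles, where comparing every article against every other one would be needlessly slow. The resulting counts can then be correlated with each article's time from submission to publication.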


Moreover, when controlling for publication date via a partial correlation, the pattern above becomes much weaker (partial Pearson r = .05, partial Spearman rho = .11). This suggests that much of PLoS ONE’s slowdown is simply due to the passage of time. Perhaps, during this time scientific articles changed, requiring a longer time to evaluate whether they are suitable for the journal.
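For readers unfamiliar with partial correlations, here is a minimal sketch of the standard first-order formula, which removes the influence of a third variable z (here: publication date) from the correlation between x and y (here: publication speed and publication volume). The function name and the example values are assumptions for illustration; they are not taken from the data set.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z.

    Standard first-order formula:
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical pairwise correlations, for illustration only.
# If z is unrelated to both variables, controlling for it changes nothing:
print(partial_correlation(0.5, 0.0, 0.0))
# If x and y correlate only because both follow z, the link disappears:
print(partial_correlation(0.6, 0.8, 0.75))
```

Feeding in Pearson correlations yields the partial Pearson r. Feeding in Spearman correlations (that is, Pearson correlations of the ranks) is one common way to obtain a partial Spearman rho.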

For example, it might be that articles these days include more information which takes longer to be assessed by scientific peers. More information could be mirrored in three values: the number of authors (information contributors), the reference count (information links), and the page count (space for information). However, the number of authors per article has not changed over the years (Pearson r = .01, Spearman rho = .02). Similarly, there is no increase in the length of the reference sections over the years (r = .01; rho = -.00). Finally, while articles have indeed become longer in terms of page count (see graph below), the change is probably just due to a new lay-out in January 2015.


Perhaps, it takes longer to go through peer-review at PLoS ONE these days because modern articles are increasingly complex and interdisciplinary. A very small but reliable correlation between subject categories per article and publication date supports this possibility somewhat, see below. It is possible that PLoS ONE simply finds it increasingly difficult to look for the right experts to assess the scientific validity of an article because articles have become more difficult to pin down in terms of the expertise they require.


Having celebrated its 10-year anniversary, PLoS ONE can be proud to have revolutionized scholarly publishing. However, whether PLoS ONE itself will survive in the new publishing environment it helped to create remains to be seen. The slowing down of its publication process is certainly a sign that PLoS ONE needs to up its game in order to remain competitive.

Wanna explore the data set yourself? I made a web-app which you can use in RStudio or in your web browser. Have fun with it and tell me what you find.

— — —
Wakeling S, Willett P, Creaser C, Fry J, Pinfield S, & Spezi V (2016). Open-Access Mega-Journals: A Bibliometric Profile. PloS one, 11 (11) PMID: 27861511

Does twitter or facebook activity influence scientific impact?

Are scientists smart when they promote their work on social media? Isn’t this a waste of time, time which could otherwise be spent in the lab running experiments? Perhaps not. An analysis of all available articles published by PLoS journals suggests that it depends on the platform.

My own twitter activity might best be thought of as learning about science (in the widest sense), while what I do on facebook is really just shameless procrastination. It turns out that this pattern holds more generally and impacts on how to use social media effectively to promote science.

In order to make this claim, I downloaded the twitter and facebook activity associated with every single article published in any journal by the Public Library of Science (PLoS), using this R-script here. PLoS is the open access publisher of the biggest scientific journal PLoS ONE as well as a number of smaller, higher impact journals. The huge amount of data allows me to have a 90% chance of discovering even a small effect (r = .1) if it actually exists.
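A power statement of this kind can be checked roughly with the Fisher z approximation for correlations. The sketch below is not the calculation used for this post; the function names, the two-sided alpha of .05, and the choice of approximation are all assumptions.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def required_n(r, power=0.90, alpha=0.05):
    """Approximate sample size needed to detect a correlation of size r
    with the given power in a two-sided test (Fisher z approximation)."""
    z_alpha = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STANDARD_NORMAL.inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


def approximate_power(r, n, alpha=0.05):
    """Approximate chance of detecting a true correlation r with n
    observations (ignores the negligible opposite rejection tail)."""
    z_alpha = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    return STANDARD_NORMAL.cdf(atanh(r) * sqrt(n - 3) - z_alpha)


print("Articles needed for 90% power at r = .1:", required_n(0.1))
print("Approximate power with that many articles:",
      approximate_power(0.1, required_n(0.1)))
```

The same functions can be applied to a single journal's article count to see whether a per-journal analysis is still well powered, which matters more for the smaller PLoS journals than for PLoS ONE.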

I should add that I limited my sample to those articles published after May 2012 (which is when PLoS started tracking tweets) and before January 2015 (in order to allow for at least two years to aggregate citations). The 87,649 remaining articles published in any of the PLoS journals offer the following picture.


There is a small but non-negligible association between impact on twitter (tweets) and impact in the scientific literature (citations): Pearson r = .12, p < .001; Spearman rho = .18, p < .001. This pattern held for nearly every PLoS journal individually as well (all Pearson r ≥ .10 except for PLoS Computational Biology; all Spearman rho ≥ .12 except for PLoS Pathogens). This result is in line with Peoples et al.’s (2016) analysis of twitter activity and citations in the field of ecology.

So, twitter might indeed help a bit to promote an article. Does this hold for social media in general? A look at facebook reveals a different picture. The relationship between facebook mentions of an article and its scientific impact is so small as to be practically negligible: Pearson r = .03, p < .001; Spearman rho = .06, p < .001. This pattern of only a tiny association between facebook mentions and citations held for every single PLoS journal (Pearson r ≤ .09, Spearman rho ≤ .08).


In conclusion, twitter can be used for promoting your scientific work in an age of increased competition for scientific reading time (Renear & Palmer, 2009). Facebook, on the other hand, can be used for procrastinating.

Wanna explore the data set yourself? I made a web-app which you can use in RStudio or in your web browser. Have fun with it and tell me what you find.

— — —
Peoples BK, Midway SR, Sackett D, Lynch A, & Cooney PB (2016). Twitter Predicts Citation Rates of Ecological Research. PloS one, 11 (11) PMID: 27835703

Renear AH, & Palmer CL (2009). Strategic reading, ontologies, and the future of scientific publishing. Science (New York, N.Y.), 325 (5942), 828-32 PMID: 19679805

Did genes shape my mother tongue?

Intuitively, one is inclined to answer with a resounding ‘no’. Of course not: had I been adopted by Thai parents, I would speak Thai. But I was not. My parents and my mother tongue are German. Still, there is a growing opinion that genes do nonetheless play a role.

Before looking at this opinion, it is worth asking why genes shouldn’t play a role in language. A computational model by Andrea Baronchelli of Northeastern University presents a good case. It suggests that the great diversity of languages is due to fast language change. This in turn favours generalist language learners who are able to learn any language equally well. Why? Well, genes are slow to change. Language presents a moving target for evolutionary mechanisms. Instead of adapting to any language in particular, people who can learn any language are at an advantage.

Her genes are different to mine, as is her language. Coincidence?

It is thus crucial to look at the rate of language change: is it slow enough for genes to change in response to it? An examination of the connections between modern languages which emerged out of a common origin and separated millennia ago gives some clues as to the real rate of language change. For example, the first European visitors to India noticed curious commonalities between Indian languages such as Sanskrit (3 = ‘tráyas’) and European ones such as ancient Greek (‘treĩs’) and Latin (‘trēs’). Since the time of the split between European and Indian languages these words do not appear to have changed much.

Nowadays, this can be extended beyond mere anecdotes. In a 2007 article in Nature, Mark Pagel and colleagues showed that the more often a word is used today the more likely it is to be similar across languages with a common origin, even if this connection lies 7,500 years in the past. Using structural features, such as grammar systems and the inventory of language sounds, one can look even up to 12,000 years into the past. These numbers correspond to approximately a quarter of the time the world’s languages had in order to differentiate! So, yes, language vocabulary and structural features do indeed change quickly, but still, there are exceptions, for example among the very common words. This opens up the possibility that genes – which are quite stable – do influence at least those language features which have been found to be consistent for thousands of years.

What is missing so far is an actual example of such a gene-language link. It was found by Dan Dediu and Robert Ladd who looked at tone, a feature which is a relatively stable language characteristic. Tone refers to the use of pitch differences to differentiate words. Take this Thai tongue twister, for example: /mǎi mài mâi mái/. The same consonants and vowels get repeated with different pitches resulting in the sentence ‘Does new silk burn?’. Dediu and Ladd noticed a surprising parallel between the location of tone languages and the location of different versions of two genes in the world, as can be seen on the following map. They tested this gene-tone relation formally and it emerged that it is unusually strong among the possible combinations of genes and structural language features. Furthermore, it does not appear to be due to historical accidents or geographic patterns alone. These two genes called ASPM and Microcephalin are somehow linked to whether a language uses tone. How can that be?


Geographic distribution of A) one version of ASPM, B) one version of Microcephalin, and C) tone languages.

The most straightforward explanation would be that there is a tone gene: if you have it in one version you can learn tone, otherwise not. Dediu and Ladd reject such a direct account. My own ASPM and Microcephalin versions do not determine whether I will ever be able to learn Thai. Instead, genes could exert a subtle effect, nudging successive generations of language learners in a certain direction. Imagine a bunch of German children were dropped on a lonely island and they learnt Thai from Thai native teachers. They would probably manage very well and their teachers would be very proud. Without their teachers noticing it, however, the German children would struggle a bit with the Thai tone system. Over generations, this struggle would reduce tonality bit by bit. Were Thai teachers to discover this island again a few hundred years later, they would be astonished at what an odd version of Thai people spoke on the island. A Thai without tone.

So, because language is not a homogeneous, ever-changing system, but instead a mix of stable and less stable features, the former could potentially be influenced by genes, which are known to be stable as well. So, did genes shape my mother tongue? In a sense yes, the combined genetic background of generations of German speakers shaped German. In another sense no, my genes did not determine that German would end up being my mother tongue. Both answers are true.


Auroux, S. (2000). History of the Language Sciences. Berlin, New York: Walter de Gruyter.

Baronchelli A, Chater N, Pastor-Satorras R, & Christiansen MH (2012). The biological origin of linguistic diversity. PloS one, 7 (10) PMID: 23118922 doi:10.1371/journal.pone.0048029

Dediu D, & Cysouw M (2013). Some structural aspects of language are more stable than others: a comparison of seven methods. PloS one, 8 (1) PMID: 23383035 doi:10.1371/journal.pone.0055009

Dediu D, & Levinson SC (2012). Abstract profiles of structural stability point to universal tendencies, family-specific factors, and ancient connections between languages. PloS one, 7 (9) PMID: 23028843 doi:10.1371/journal.pone.0045198

Pagel M, Atkinson QD, & Meade A (2007). Frequency of word-use predicts rates of lexical evolution throughout Indo-European history. Nature, 449 (7163), 717-20 PMID: 17928860 doi:10.1038/nature06176




‘approximately a quarter of the time the world’s languages had in order to differentiate’

Assuming a) a common origin lying 10,000 years in the past and b) one language which existed 40,000 years ago when some of its speakers left Africa to populate the world. The latter estimate is taken from: Diamond, J.: Guns, Germs, and Steel


‘tone, a feature which is a relatively stable language characteristic’

Across language families, it is ranked the 15th most stable structural language feature among 68 investigated by Dediu and Levinson (2012). Across different ways of quantifying stability, it is ranked 19th out of 62 (Dediu & Cysouw, 2013).


‘Take this Thai tongue twister, for example: /mǎi mài mâi mái/.’

You can listen to it here (go to 4. The most difficult word and tongue twisters)



1) By YashiWong (Own work), via Wikimedia Commons

2) p. 288 in: Dediu D (2011). Are languages really independent from genes? If not, what would a genetic bias affecting language diversity look like? Human biology, 83 (2), 279-96 PMID: 21615290