Nature News reports SCIgen gibberish papers; can we rely on conventional peer-review? Or can machines help?

Richard van Noorden has an important report in Nature News:

http://www.nature.com/news/publishers-withdraw-more-than-120-gibberish-papers-1.14763

Two science publishers have withdrawn more than 120 papers after a researcher in France identified them as computer-generated. According to Nature News, 16 fraudulent papers appeared in publications from Germany-based Springer, and more than 100 were published by the New York-based Institute of Electrical and Electronics Engineers (IEEE).

It’s not clear what the motive was – academic fraud? or a Sokal/Bohannon-like demo of the frailty of peer-review? But the immediate effect is to show that a large number of “peer-reviewed” scientific papers have flaws.

This should surprise no-one who understands the process of scientific publication. I will assert that, in principle, every published article has flaws. Most will be minor – typos in references or tables, mislabelled diagrams, misdrawn chemical diagrams and countless other errors.

Consider a doctoral thesis – possibly the most intensively peer-reviewed document that a scientist produces. The thesis is written knowing that failure may be absolute – a career could depend on it. It has taken months to prepare. Almost always the student has to revise it for “minor errors”. (My own thesis had a number of them, and yet I have asked for it to be digitised at Oxford.) Errors are ubiquitous.

There are roughly three absolute reviewers of scientific material:

  • The natural and physical world. Nature (not the journal) always wins. It is fair – God does not play dice – but neither does s/he tolerate errors. This is the ultimate arbiter. One of the structures in my thesis was “wrong”. I discovered later that it was in a subgroup (Fd3) of the reported space group (Fd3m). This wasn’t trivial – it included a rare sort of twinning (which has given me minor eponymity). This is how science progresses. Science is a series of snapshots.
  • The computer. It doesn’t lie. If you don’t get the same answer as someone else then either you or they or both have to find out where the problem is. It’s interesting that most of these fake papers were in the area of Computer Science. Properly reported CS should be very difficult to fake. Unfortunately much of it is very badly reported.
  • Humans. Human judgment is variable and changes with time. A “good” paper now may be “bad” at a later stage and vice versa. An “exciting” one now may be shown to be uninteresting later or vice versa. Science often changes by paradigm shifts and many of those were rejected when first published. Moving continents? ulcerating bacteria? charged species in solution? These are examples of science that would have led to dismissal for lack of “impact”.

The rush for immediate impact is anti-scientific as is the rush for multiple publications.

I doubt this will change.

But one thing that can help to reduce noise, error, fraud, duplication etc is the use of machines.

Machines can detect fraud (I shall show how shortly). Machines can detect errors – we have already shown this. Machines can reproduce (or fail to reproduce) computational science.  This could and should be done.
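To make "machines can detect errors" concrete, here is a minimal sketch of one such check: recomputing a molar mass from a reported molecular formula and comparing it with the reported mass. It is an illustration only, not one of the tools referred to above. The function names, the small table of atomic masses and the tolerance are all assumptions, and the parser handles only flat formulae such as C6H12O6 (no brackets, charges or isotopes).

```python
import re

# Approximate standard atomic masses in g/mol for a few common elements.
# Illustrative subset only; an element outside this table raises KeyError.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "Cl": 35.45}

TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Turn a flat formula such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for symbol, digits in TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts


def molar_mass(formula):
    """Sum the atomic masses of every atom in the formula."""
    return sum(ATOMIC_MASS[s] * n for s, n in parse_formula(formula).items())


def mass_is_consistent(formula, reported_mass, tolerance=0.05):
    """True if the reported mass agrees with the formula within the tolerance."""
    return abs(molar_mass(formula) - reported_mass) <= tolerance


# Hypothetical table entries: the second has two digits transposed.
print(mass_is_consistent("C6H12O6", 180.16))
print(mass_is_consistent("C6H12O6", 108.16))
```

A check like this costs almost nothing to run on every submitted manuscript, provided the formula and the mass are supplied in a form a program can read.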

 

The problem is that it is a lot of work to set up the proper apparatus. And publishers don’t like that (I except a few shining examples such as IUCr/Acta Crystallographica). It costs money to verify and check science. That eats into profits. And while publishers get paid for the number of papers they publish (and generally not the ones they reject), why bother?

Why do chemistry publishers not insist on machine-readable spectra? It’s trivial.

Why do they not insist on machine-readable chemical structures? That’s even more trivial.
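As a hedged illustration of what a machine-readable structure makes possible, the sketch below represents a molecule as a list of atoms plus a list of bonds and flags any atom whose bond orders add up to more than its usual valence, which is one common symptom of a misdrawn diagram. The representation, the function name and the valence table are assumptions made for this example; real formats carry far more information, and legitimately charged or hypervalent species would need extra rules.

```python
# Usual maximum valences for a few neutral atoms (a deliberate simplification).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1}


def overvalent_atoms(atoms, bonds):
    """Return the indices of atoms bonded beyond their usual maximum valence.

    atoms: list of element symbols, e.g. ["C", "O", "H", "H"]
    bonds: list of (i, j, order) tuples, where i and j index into atoms
    """
    totals = [0] * len(atoms)
    for i, j, order in bonds:
        totals[i] += order
        totals[j] += order
    return [k for k, symbol in enumerate(atoms)
            if totals[k] > MAX_VALENCE[symbol]]


# Formaldehyde drawn correctly: C=O plus two C-H bonds.
good_atoms = ["C", "O", "H", "H"]
good_bonds = [(0, 1, 2), (0, 2, 1), (0, 3, 1)]

# The same carbon drawn with one hydrogen too many (a five-valent carbon).
bad_atoms = ["C", "O", "H", "H", "H"]
bad_bonds = [(0, 1, 2), (0, 2, 1), (0, 3, 1), (0, 4, 1)]

print(overvalent_atoms(good_atoms, good_bonds))
print(overvalent_atoms(bad_atoms, bad_bonds))
```

None of this is possible when the structure exists only as a picture in a PDF.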

Because it costs effort?

And worse – it means that the scientific literature becomes a semantic database. And that would never do, because it could replace the secondary databases that generate hundreds of millions of dollars in income.

I and my friends could have all the tools to create higher quality chemistry, less fraud, more value. And that goes for many other sciences.

Machines can help authors… I’ve tried that for over 10 years. No progress.

Will the culture of publication change in my lifetime??

That’s up to you.