Brian McMahon: Publishing Semantic Crystallography; EVERY science (data) publisher should watch this ALL THE WAY through

#semphyssci

I owe a huge debt to the International Union of Crystallography (IUCr) and Brian McMahon. Quite simply they are the best semantic scientific publishers of the current century. They also have the best community base for scientific publishing that I know. The Union exists for its members and not for itself; its processes are as democratic as a scholarly body allows, and it is passionate about doing science properly.

The IUCr has always had a major emphasis on data and terminology. It has run experiments on how reproducible crystallographic experiments can be. It spends much time on the basis of the science and how to describe it. For over three decades it has had initiatives in defining data representation. It’s blessed with the fact that modern instruments are highly reproducible and that crystallization is a classic method of purification. Because of that, crystal structures determined in labs A and B are likely to be in very close agreement. There are exceptions – biological macromolecules are more heterogeneous – but generally it’s a highly reproducible science.

This tradition is now central to its publication ethos. Essentially every published result must be replicable (potentially falsifiable) from the information in the publication. Even 45 years ago (when I started) we were expected to type our raw data (thousands of observations) into the pages of the journals. Now it’s electronic but the bar has risen – we now have to publish the X-ray images. There is no room for subjectivity – and if the methodology is flawed the community WILL find it out.

The Union is committed to making crystallography accessible to everyone. For this reason it has advocated for 30 years that ALL publications (not just Acta Cryst) should publish their crystallographic data. It’s moving towards Open Access and has a completely Open Access journal, Acta Cryst. E. In this journal the complete crystallographic experiment is checked, and if it’s apparently flawed it’s returned to the authors for comment. Every atom, every bond is checked.

Surely that’s enormously expensive? How much does it cost?

ONE HUNDRED AND FIFTY (150) DOLLARS. That’s all. For a paper where every inch is peer reviewed. Where the contribution by the publisher is enormous.

Contrast that with a publisher which charges TWENTY times that and adds NO value.

The reason that IUCr can do this, and why it is so highly regarded in all disciplines, is that over the years they have steadily invested in the information infrastructure (ontology) of their discipline. And it’s been a community effort. Many people (Sid Hall, Howard Flack, Herb Bernstein, John Westbrook, and 30 others http://www.iucr.org/resources/cif/comcifs/members including me) have contributed in mails, meetings, software, specifications and lots more. Progress has been steady.

And all of this has been designed, guided, glued together by Brian. And he’s done more – in the small Chester office of IUCr he and a few others have built a remarkable suite of publishing software. Fit for purpose, respected by the community of authors and readers/users alike. What other science can say that? (A very few, and I hope they’ll identify themselves here).

And for me, IUCr/CIF/Brian have been a guiding light in the development of CML.

Here he is at our Semantic Physical Science symposium, 2012-01-12 http://vimeo.com/35397924

START AT THE BEGINNING AND WATCH IT ALL THE WAY THROUGH. Then watch it again. Then point your friends at it and take a copy.

0:00 Title

0:40 International Union of Crystallography

1:08 CIF

3:36 CIF Syntax and dataTypes

4:30 Publishing with CIF

6:41 Demonstration: CheckCIF

12:02 Interactive Chemical validation

14:42 Linking data to journal article and search for novelty of data

15:08 Jmol display applet

21:03 Supplementary data

21:47 publCIF: a tool to merge data and text and annotate them

27:08 end