An informal definition of OpenScience

Over at the open-science mailing list at okfn.org, Michael Nielsen just posted a great “informal” definition of open science:

Open science is the idea that scientific knowledge of all kinds should be openly shared as early as is practical in the discovery process.

The discussion on the list has been very interesting, but that particular “informal” definition is great because it gets at why we’re struggling with established social norms in science, given the new technological methods of communicating results:

…when the journal system was developed in the 17th and 18th centuries it was an excellent example of open science.  The journals are perhaps the most open system for the dissemination of knowledge that can be constructed — if you’re working with 17th century technology.  But, of course, today we can do a lot better.

10 years of CDK

Today marks (roughly) the tenth birthday of a fantastically successful open science project called the Chemistry Development Kit (CDK). At the time the skeleton of the project was set down on my office whiteboard, I was still the lead developer of Jmol, and Egon Willighagen and Christoph Steinbeck had contributed code to the Jmol project. Christoph’s pet code was a neat 2D structure editor called JChemPaint, and Egon was working largely on the Chemical Markup Language (CML), although his code contributions were showing up nearly everywhere. Egon and Christoph were in the US for a “Chemistry and the Internet” conference and made a side trip by train to visit me so we could figure out how to unify these projects and make a more general and reusable set of chemical objects.

The CDK waterfall whiteboard

The CDK design session was a fun weekend. In retrospect, those were some of the purest days of collaborative creativity I’ve ever experienced. We spent many hours and a lot of coffee hashing out some of the basic classes of CDK. The final picture of the whiteboard shows a classic waterfall diagram of what we were going to implement.

I’m the first to admit that my contributions to CDK were minimal. Egon & Chris ran with the design, expanded and improved it, implemented all the missing pieces, and released it to the world. It has become an important piece of scientific software, particularly in the bioinformatics community. Beyond Egon & Chris, Rajarshi Guha has been one of the prime developers of the software.

CDK is, by all objective standards, a fantastic success story of open source scientific software. It has a large and vibrant user community, active developers, and a number of people (including myself) who browse the code just to see how it does something difficult. Egon has written a thoughtful piece on where CDK should go from here.

Happy Birthday CDK!

Packmol

One of the biggest issues you face when you first start doing molecular dynamics (MD) simulations is how to create an initial geometry that won’t blow up in the first few time steps. Repulsive forces are very steep if the atoms are too close to each other, and if you are trying to simulate a condensed phase (liquid, solid, or interfacial) system, it can be hard to know how to make a sensible initial structure.

Packmol is a cool program that appears to solve this problem. It creates an initial point for molecular dynamics simulations by packing molecules in defined regions of space, and the packing guarantees that short-range repulsive interactions do not disrupt the simulations. The wide variety of spatial constraints that can be applied to molecules, or to atoms within molecules, makes it easy to create ordered systems such as lamellar, spherical, or tubular lipid layers. It works with PDB and XYZ files and appears to be available under the GPL. Very, very cool!
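To make the packing idea concrete, here is a toy sketch in Python (using numpy). This is not Packmol’s actual algorithm (Packmol solves a proper constrained packing optimization problem); it is just naive rejection sampling that enforces the same minimum-distance guarantee, with all names and numbers invented for the example:

```python
import numpy as np

def naive_pack(n_points: int, box: float, tol: float, seed: int = 1,
               max_tries: int = 100_000) -> np.ndarray:
    """Place n_points in a cubic box of side `box` so that no pair is
    closer than `tol`: the same no-close-contacts guarantee Packmol
    enforces, here via naive rejection sampling."""
    rng = np.random.default_rng(seed)
    points = []
    for _ in range(max_tries):
        if len(points) == n_points:
            break
        trial = rng.uniform(0.0, box, size=3)
        # Accept the trial position only if it keeps its distance
        # from every point placed so far.
        if all(np.linalg.norm(trial - p) >= tol for p in points):
            points.append(trial)
    else:
        raise RuntimeError("box too crowded for this tolerance")
    return np.array(points)

# e.g., centers for 500 "molecules" in a 40 x 40 x 40 box, tolerance 2.0:
# coords = naive_pack(500, box=40.0, tol=2.0)
```

The real program, of course, packs whole molecules read from PDB/XYZ files and supports much richer constraint regions (boxes, spheres, shells, planes) than this toy does.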

Gwyddion – Open Source SPM analysis

We just discovered a very cool open source program for analyzing scanning probe microscopy (SPM) data files. There are a number of incompatible, proprietary file formats for surface microscopies (AFM, MFM, STM, SNOM/NSOM), and getting data out of a microscope for further processing (including baseline leveling, profile analysis, and statistical analysis) can be a difficult task. Gwyddion is a Gtk+ based package that runs on Linux, Mac OS X (with MacPorts), and Windows, and appears to do nearly everything that some expensive commercial packages (and some free closed-source packages) can do. Some of our colleagues were very happy to discover this piece of wizardry!
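As a concrete example of the simplest of those processing steps, baseline (plane) leveling just subtracts a least-squares plane from the height map. The numpy sketch below is our own minimal illustration of that idea, not Gwyddion’s code, and it assumes the scan has already been read into a 2-D array:

```python
import numpy as np

def level_plane(height: np.ndarray) -> np.ndarray:
    """Subtract the best-fit plane z = a*x + b*y + c from a 2-D SPM
    height map (basic plane leveling)."""
    ny, nx = height.shape
    y, x = np.mgrid[0:ny, 0:nx]
    # Least-squares fit of the plane coefficients (a, b, c).
    A = np.column_stack([x.ravel(), y.ravel(), np.ones(height.size)])
    coeffs, *_ = np.linalg.lstsq(A, height.ravel(), rcond=None)
    return height - (A @ coeffs).reshape(ny, nx)
```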

If you’re going to do good science, release the computer code too

A very nice article by Darrel Ince dealing with the climate-gate email theft and the quality of academic science code has just been posted over at the Guardian. An excerpt:

Computer code is also at the heart of a scientific issue. One of the key features of science is deniability: if you erect a theory and someone produces evidence that it is wrong, then it falls. This is how science works: by openness, by publishing minute details of an experiment, some mathematical equations or a simulation; by doing this you embrace deniability. This does not seem to have happened in climate research. Many researchers have refused to release their computer programs — even though they are still in existence and not subject to commercial agreements. An example is Professor Mann’s initial refusal to give up the code that was used to construct the 1999 “hockey stick” model that demonstrated that human-made global warming is a unique artefact of the last few decades. (He did finally release it in 2005.)

Being Scientific: Falsifiability, Verifiability, Empirical Tests, and Reproducibility

If you ask a scientist what makes a good experiment, you’ll get very specific answers about reproducibility and controls and methods of teasing out causal relationships between variables and observables. If human observations are involved, you may get detailed descriptions of blind and double-blind experimental designs. In contrast, if you ask the very same scientists what makes a theory or explanation scientific, you’ll often get a vague statement about falsifiability. Scientists are usually very good at designing experiments to test theories. We invent theoretical entities and explanations all the time, but very rarely are they stated in ways that are falsifiable. It is also quite rare for anything in science to be stated in the form of a deductive argument. Experiments often aren’t done to falsify theories, but to provide the weight of repeated and varied observations in support of those same theories. Sometimes we’ll even use the words verify or confirm when talking about the results of an experiment. What’s going on? Is falsifiability the standard? Or something else?

The difference between falsifiability and verifiability in science deserves a bit of elaboration. It is not always obvious (even to scientists) what principles they are using to evaluate scientific theories,[1] so we’ll start a discussion of this difference by thinking about Popper’s asymmetry.[2] Consider a scientific theory (T) that predicts an observation (O). There are two ways we could use the observation to add the weight of experiment to the theory: we could attempt to falsify it, or to verify it. Only one of these approaches (falsification) is deductively valid:

Falsification            Verification
If T, then O             If T, then O
Not-O                    O
------------             ------------
Not-T                    T

Deductively Valid        Deductively Invalid

Popper concluded that it is impossible to know that a theory is true based on observations (O); science can tell us only that the theory is false (or that it has yet to be refuted). Falsification has the valid form of modus tollens, while verification commits the fallacy of affirming the consequent. Popper therefore concluded that meaningful scientific statements must be falsifiable.

A more realistic picture of scientific theories isn’t this simple. We often base our theories on a set of auxiliary assumptions which we take as postulates. For example, a theory for liquid dynamics might depend on the whole of classical mechanics being taken as a postulate, or a theory of viral genetics might depend on the Hardy-Weinberg equilibrium. In these cases, classical mechanics (or the Hardy-Weinberg equilibrium) are the auxiliary assumptions for our specific theories.

These auxiliary assumptions can help show that science is often not a deductively valid exercise. The Quine-Duhem thesis[3] recovers the symmetry between falsification and verification when we take into account the role of the auxiliary assumptions (AA) of the theory (T):

Falsification               Verification
If (T and AA), then O       If (T and AA), then O
Not-O                       O
---------------------       ---------------------
Not-T                       T

Deductively Invalid         Deductively Invalid

That is, if the predicted observation (O) turns out to be false, we can deduce only that something is wrong with the conjunction, (T and AA); we cannot determine from the premises that it is T rather than AA that is false. In order to recover the asymmetry, we would need our assumptions (AA) to be independently verifiable:

Falsification               Verification
If (T and AA), then O       If (T and AA), then O
AA                          AA
Not-O                       O
---------------------       ---------------------
Not-T                       T

Deductively Valid           Deductively Invalid

Falsifying a theory requires that the auxiliary assumptions (AA) be demonstrably true. Auxiliary assumptions are often highly theoretical: remember, an auxiliary assumption might be a statement like the entirety of classical mechanics is correct or the Hardy-Weinberg equilibrium is valid! It is important to note that if we can’t verify AA, we will not be able to falsify T by using the valid argument above. Contrary to Popper, there really is no asymmetry between falsification and verification. If we cannot verify theoretical statements, then we cannot falsify them either.
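These argument forms are simple enough to check mechanically. As a sketch, here they are in Lean 4 (our own formalization, not part of the original discussion): the two valid falsification arguments typecheck, while verification, which would need a proof of T from (T → O) and O, does not.

```lean
-- Popper's asymmetry: falsification is modus tollens, which is valid.
example (T O : Prop) (h : T → O) (notO : ¬O) : ¬T :=
  fun t => notO (h t)

-- With auxiliary assumptions, ¬O only refutes the conjunction (T ∧ AA);
-- nothing lets us pin the blame on T alone.
example (T AA O : Prop) (h : T ∧ AA → O) (notO : ¬O) : ¬(T ∧ AA) :=
  fun taa => notO (h taa)

-- Independently verified auxiliary assumptions restore a valid
-- falsification of T itself.
example (T AA O : Prop) (h : T ∧ AA → O) (aa : AA) (notO : ¬O) : ¬T :=
  fun t => notO (h ⟨t, aa⟩)
```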

Since verifying a theoretical statement is nearly impossible, and falsification often requires verification of assumptions, where does that leave scientific theories? What is required of a statement to make it scientific?

Carl Hempel came up with one of the more useful statements about the properties of scientific theories:[4] “The statements constituting a scientific explanation must be capable of empirical test.” And this statement about what exactly it means to be scientific brings us right back to things that scientists are very good at: experimentation and experimental design. If I propose a scientific explanation for a phenomenon, it should be possible to subject that theory to an empirical test or experiment. We should also have a reasonable expectation of universality of empirical tests. That is, multiple independent (skeptical) scientists should be able to subject these theories to similar tests in different locations, on different equipment, and at different times and get similar answers. Reproducibility of scientific experiments is therefore going to be required for universality.

So to answer some of the questions we might have about reproducibility:

  • Reproducible by whom? By independent (skeptical) scientists, working elsewhere, and on different equipment, not just by the original researcher.
  • Reproducible to what degree? This would depend on how closely that independent scientist can reproduce the controllable variables, but we should have a reasonable expectation of similar results under similar conditions.
  • Wouldn’t the expense of a particular apparatus make reproducibility very difficult? Good scientific experiments must be reproducible in both a conceptual and an operational sense.[5] If a scientist publishes the results of an experiment, there should be enough of the methodology published with the results that a similarly-equipped, independent, and skeptical scientist could reproduce the results of the experiment in their own lab.

Computational science and reproducibility

If theory and experiment are the two traditional legs of science, simulation is fast becoming the “third leg”. Modern science has come to rely on computer simulations, computational models, and computational analysis of very large data sets. These methods for doing science are all reproducible in principle. For very simple systems and small data sets, this is nearly the same as reproducible in practice. As systems become more complex and the data sets become large, calculations that are reproducible in principle are no longer reproducible in practice without public access to the code (or data). If a scientist makes a claim that a skeptic can only reproduce by spending three decades writing and debugging a complex computer program that exactly replicates the workings of a commercial code, the original claim is really only reproducible in principle. If we really want to allow skeptics to test our claims, we must allow them to see the workings of the computer code that was used. It is therefore imperative for skeptical scientific inquiry that software for simulating complex systems be available in source-code form and that real access to raw data be made available to skeptics.
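One inexpensive habit that follows from this position: publish, alongside any computational result, enough machine-readable provenance that a skeptic can confirm they are re-running the same calculation on the same inputs. A minimal sketch in Python (the file names and parameters are invented for the example):

```python
import hashlib
import json
import platform
import sys

def provenance_manifest(data_path: str, params: dict, seed: int) -> dict:
    """Collect what a skeptic needs to re-run a calculation: a hash of
    the raw input data, the exact parameters, the RNG seed, and the
    software environment."""
    with open(data_path, "rb") as f:
        data_sha256 = hashlib.sha256(f.read()).hexdigest()
    return {
        "data_sha256": data_sha256,
        "parameters": params,
        "rng_seed": seed,
        "python": sys.version,
        "platform": platform.platform(),
    }

# Published next to the results, e.g.:
# manifest = provenance_manifest("raw_data.csv", {"dt": 0.001}, seed=42)
# with open("manifest.json", "w") as f:
#     json.dump(manifest, f, indent=2)
```

None of this substitutes for releasing the code itself, but it makes “reproducible in practice” checkable rather than aspirational.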

Our position on open source and open data in science was arrived at when an increasing number of papers began crossing our desks for review that could not be subjected to reproducibility tests in any meaningful way. Paper A might have used a commercial package that comes with a license that forbids people at university X from viewing the code![6] Paper B might use a code which requires parameter sets that are “trade secrets” and have never been published in the scientific literature. Our view is that it is not healthy for scientific papers to be supported by computations that cannot be reproduced except by a few employees at a commercial software developer. Should this kind of work even be considered Science? It may be research, and it may be important, but unless enough details of the experimental methodology are made available so that it can be subjected to true reproducibility tests by skeptics, it isn’t Science.


  1. This discussion closely follows a treatment of Popper’s asymmetry in: Sober, Elliott, Philosophy of Biology (Boulder: Westview Press, 2000), pp. 50-51.
  2. Popper, Karl R., The Logic of Scientific Discovery, 5th ed. (London: Hutchinson, 1959), pp. 40-41, 46.
  3. Gillies, Donald, “The Duhem Thesis and the Quine Thesis,” in Martin Curd and J.A. Cover, eds., Philosophy of Science: The Central Issues (New York: Norton, 1998), pp. 302-319.
  4. Hempel, Carl, Philosophy of Natural Science (Englewood Cliffs: Prentice-Hall, 1966), p. 49.
  5. Lett, James, Science, Reason and Anthropology: The Principles of Rational Inquiry (Oxford: Rowman & Littlefield, 1997), p. 47.
  6. See, for example, www.bannedbygaussian.org

On Reproducibility

I just got back from a fascinating one-day workshop on “Data and Code Sharing in Computational Sciences” that was organized by Victoria Stodden of the Yale Information Society Project. The workshop had a wide-ranging collection of contributors, including representatives of the computational and data-driven science communities (everything from Astronomy and Applied Math to Theoretical Chemistry and Bioinformatics), intellectual property lawyers, the publishing industry (Nature Publishing Group and Seed Media, but no society journals), foundations, funding agencies, and the open access community. The general recommendations of the workshop are going to be closely aligned with open science suggestions, as any meaningful definition of reproducibility requires public access to the code and data.

There were some fascinating debates at the workshop on foundational issues: What does reproducibility mean? How stringent a reproducibility test should be required of scientific work? Reproducible by whom? Should resolution of reproducibility problems be required for publication? What are good roles for journals and funding agencies in encouraging reproducible research? Can we agree on a set of reproducible science guidelines which we can encourage our colleagues and scientific communities to take up?

Each of the attendees was asked to prepare a thought piece on the subject, and I’ll be breaking mine down into a couple of single-topic posts in the next few days / weeks.

The topics are roughly:

  • Being Scientific: Falsifiability, Verifiability, Empirical Tests, and Reproducibility
  • Barriers to Computational Reproducibility
  • Data vs. Code vs. Papers (they aren’t the same)
  • Simple ideas to increase openness and reproducibility

Before I jump in with the first piece, I thought it would be helpful to jot down a minimal idea about science that most of us can agree on, which is “Scientific theories should be universal”. That is, multiple independent scientists should be able to subject these theories to similar tests in different locations, on different equipment, and at different times and get similar answers. Reproducibility of scientific observations is therefore going to be required for scientific universality. Once we agree on this, we can start to figure out what reproducibility really means.