# A Bechdel test for scientific workshops

After attending two recent scientific conferences, one of which was gender-balanced and one of which was so gender-imbalanced that it engendered snarky out-of-band Twitter comments, it struck me that we might need a Bechdel test for scientific workshops. The Bechdel test is a simple test for movies. To pass the test, a movie has to have:

1. at least two [named] women in it,
2. who talk to each other,
3. about something besides a man.

Seems simple, right?  You’d be amazed at just how few popular movies pass the test, including some set in universes that were originally designed for equality. (I’m talking about you, Star Trek reboot.)

Here’s an analogous test for scientific workshops or conference symposia.  Does the workshop have:

1. at least two female invited speakers,
2. who are asked questions by female audience members,

Again, this seems simple, right?  But you’d be shocked how few scientific conference symposia or workshops can live up to this standard.  I suspect this depends strongly on specific research fields.

Rigoberto Hernandez has been talking about advancing science through diversity for quite a while.  I finally got to hear him speak about the OXIDE project on this latest trip, and he’s got a lot of great things to say about how diversity can strengthen science. I think one great way to help is to point out the good conferences we attend which live up to this standard.

Rigoberto also happened to be one of the organizers of the gender-balanced conference, which was also one of the best meetings I’ve ever attended.

# OpenScience comes of age

In 1998, Open Science seemed like a pretty obvious projection of basic scientific principles into the digital age.  I didn’t think the ideas would meet much, if any, resistance from the scientific community.   And in October 1999, Brookhaven National Lab sponsored a meeting called Open Source / Open Science that, in retrospect, was a pretty utopian gathering.  There were a lot of the current OpenScience community members present at the meeting (notably Brian Glanz and Greg Wilson).   It felt like everyone would be convinced to do Open Source & Open Data science in short order.

The past 14 years have been instructive in just how long it can take to make cultural changes in the scientific community.

So, it was an amazing experience to be present when the Office of Science and Technology Policy (OSTP) announced the Champions of Change for Open Science. These are 13 incredible individuals and organizations with great stories about sharing their science. It feels like we’ve made significant progress on implementing policies that are friendly to Open Science. I should note that we’re particularly happy to see OSTP use the phrase Open Science, and not the narrower terms Open Data or Open Access. I’m hopeful that Open Source will also be part of science policy going forward.

There was a second group who got the opportunity to present at this event at a poster session later that day.  I haven’t seen the list publicized elsewhere, but these are some sharp folks who deserve recognition for their work.  I’m going to highlight some of these in the coming week.  Here’s the list of posters:

1. Richard Judson & Ann Richard from the National Center for Computational Toxicology presented on “ACToR & DSSTox: EPA Open Information Tools for Chemicals in the Environment”
2. Tom Bleier, Clark Dunson & Michael Lencioni from the QuakeFinder project presented on “Electromagnetic Earthquake Forecasting Research”
3. David C. Van Essen from WUSTL presented on the “Human Connectome Project”
4. Heather Piwowar & Jason Priem presented a poster on “ImpactStory: Open Carrots for Open Science”
5. Jean-Claude Bradley (Drexel) and Andrew Lang (Oral Roberts University) presented a poster on “Open Notebook Science”.
6. Dan Gezelter (that’s me) presented on “The OpenScience Project”.
7. John Wilbanks from Sage Bionetworks presented on “Portable Legal Consent – Let Patients Donate Data to Science”
8. Matt Martin from the National Center for Computational Toxicology presented on “ToxRefDB & ToxCastDB: High-Throughput Toxicology Resources”
9. Brian Athey and Christoph Brockel presented on “The tranSMART Platform: Accelerating Open Science, Data Analytics and Data Sharing”
10. Alexander Wait Zaranek, Ward Vandewege & Jonathan Sheffi from Clinical Future, Inc. presented on “Transparent Informatics: A Foundation for Precision Medicine”

It was an intense day, and I’m delighted that Open Science has finally come of age.

# OpenScience poster

I’m giving a poster in a few days about openscience.org, and it has been a very long time since I’ve had to make a poster.  This one turned out quite text-heavy, but I wanted to make a few arguments that seemed difficult or impossible to translate into graphics.   A PDF (9.3 MB) of the draft is available by clicking the image on the right…

Comments and suggestions, as always, are quite welcome.

# Not a kickstarter for science, a prize clearinghouse

Yesterday’s post on the reversible random number generators received some interesting reactions from my colleagues. They were uniformly impressed with the solution to what everyone thought was a hard problem, but surprisingly, most of the scientists I talked to were most excited about the fact that dangling a $500 reward for solving a hard problem generated nearly instantaneous results. Typical comments: “I wonder how much science I could get done if I similarly spent my startup…” Also, “It is amazing what $500 buys these days!”

Think how many problems we could solve if we dangled a few prizes for other knotty problems.

• The problem itself was well-framed and finite: “We need a time-reversible random number generator.” It was something that a lot of people in the field could agree was interesting when framed to them properly.
• The group offering the prize was widely respected for previous work on related problems.
• The prize and the solution were both posted on a highly visible physics site (arXiv).
• The reward was about fame and recognition by the community more than it was about money.

I’m now wondering if all of the attempts to get a Kickstarter or crowdsourced funding model for science (e.g. sciflies, petridish, scifundchallenge, fundageek) are just a bit misguided. Science is darned expensive, and for better or worse, we’re going to be wedded to federal and foundation funding for science for a long time. All funding models have an aspect of salesmanship to them – a scientist must convince the funder that the problem itself is interesting enough to need solving, and that their lab is the one to solve it. In the NSF-style funding model, scientific communities do have significant input into what the “good problems” are, but the necessary delays in funding and the scarcity of funds mean that we’re not very agile.

Perhaps we need a clearinghouse where scientific communities can agree on a tough challenge, pool some minimal award money (like $500 or $1000) and let their young colleagues have a go at winning fame by solving it.

# Why aren’t voting machines required to be Open Source?

If ever there was a need for the transparency that open source software brings, it is in the realm of voting machine technology. This story makes that point crystal clear. There may or may not be shenanigans going on in Ohio. The point is that we have no way of knowing what the patches on those Ohio voting machines actually do, and no faith in the code reading, debugging, and auditing ability of elected officials. If we want to be confident in the workings of our democracy, closed-source voting machines should be banned.

For that matter, why aren’t voting systems required to leave a physical paper trail so that we can check up on the tabulating algorithms?

# On Reproducibility

I just got back from a fascinating one-day workshop on “Data and Code Sharing in Computational Sciences” that was organized by Victoria Stodden of the Yale Information Society Project. The workshop had a wide-ranging collection of contributors including representatives of the computational and data-driven science communities (everything from Astronomy and Applied Math to Theoretical Chemistry and Bioinformatics), intellectual property lawyers, the publishing industry (Nature Publishing Group and Seed Media, but no society journals), foundations, funding agencies, and the open access community. The general recommendations of the workshop are going to be closely aligned with open science suggestions, as any meaningful definition of reproducibility requires public access to the code and data.

There were some fascinating debates at the workshop on foundational issues: What does reproducibility mean? How stringent a reproducibility test should be required of scientific work? Reproducible by whom? Should resolution of reproducibility problems be required for publication? What are good roles for journals and funding agencies in encouraging reproducible research? Can we agree on a set of reproducible science guidelines which we can encourage our colleagues and scientific communities to take up?

Each of the attendees was asked to prepare a thought piece on the subject, and I’ll be breaking mine down into a couple of single-topic posts in the next few days / weeks.

The topics are roughly:

• Being Scientific: Falsifiability, Verifiability, Empirical Tests, and Reproducibility
• Barriers to Computational Reproducibility
• Data vs. Code vs. Papers (they aren’t the same)
• Simple ideas to increase openness and reproducibility

Before I jump in with the first piece, I thought it would be helpful to jot down a minimal idea about science that most of us can agree on, which is “Scientific theories should be universal”. That is, multiple independent scientists should be able to subject these theories to similar tests in different locations, on different equipment, and at different times and get similar answers. Reproducibility of scientific observations is therefore going to be required for scientific universality. Once we agree on this, we can start to figure out what reproducibility really means.