Open Science Codefest

The National Center for Ecological Analysis and Synthesis (NCEAS) at UCSB is co-sponsoring the Open Science Codefest 2014, which aims to bring together researchers from ecology, biodiversity science, and other earth and environmental sciences with computer scientists, software engineers, and developers to collaborate on coding projects of mutual interest.

Do you have a coding project that could benefit from collaboration, or software skills you’d like to share? The codefest will be held from September 2-4 in Santa Barbara, CA.

Inspired by hack-a-thons and organized in the participant-driven, unconference style, the Open Science Codefest is for anyone with an interesting problem, solution, or idea that intersects environmental science and computer programming. This is the conference where you will actually get stuff done – whether that’s coding up a new R module, developing an ontology, working on a data repository, creating data visualizations, dreaming up an interactive eco-game, discussing an idea, or any other concrete collaborative goal that interests a group of people.

Looks like a great program!

Two Shark Studies Reveal the Old and Slow

Sharks live in the vast, deep, and dark ocean, and studying these large fish in this environment can be difficult. We may have sharks ‘tweeting’ their location, but we still know relatively little about them. Sharks have been on the planet for over 400 million years and today, there are over 400 species of sharks, but how long do they live, and how do they move? Two recent studies published in PLOS ONE have addressed some of these basic questions for two very different species of sharks: great whites and megamouths.

The authors of the first study looked at the lifespan of the great white shark. Normally, a shark’s age is estimated by counting growth bands in its vertebrae (image 1), not unlike counting rings inside a tree trunk. But unfortunately, these bands can be difficult to differentiate in great whites, so the researchers dated the radiocarbon that they found in them. You might wonder where this carbon-14 (14C) came from, but believe it or not, radiocarbon from the thermonuclear bomb tests of the ’50s and ’60s was deposited in the vertebrae of sharks living in the northwestern Atlantic Ocean. These bands therefore provide age information. Based on the ages of the sharks in the study, the researchers suggest that great whites may live much longer than previously thought. Some male great whites may even live to be over 70 years old, and this may qualify them as one of the longest-living shark species. While these new estimates are impressive, they may also help scientists understand how threats to these long-living sharks may impact the shark population.

A second shark study analyzed the structure of a megamouth shark’s pectoral fin (image 2) to understand and predict its motion through the water. Discovered in 1976, the megamouth is one of the rarest sharks in the world, and little is known about how it moves through the water. We do know that the megamouth lives deep in the ocean and is a filter feeder, moving at very slow speeds to filter out a meal with its large mouth. But swimming slowly in the water is difficult in much the same way that flying slowly in an airplane is difficult. Sharks need speed to control lift and movement.

To better understand the megamouth’s slow movement, the researchers measured the cartilage, skin histology, and skeletal structure of the pectoral fins of one female and one male megamouth shark, caught accidentally and preserved for research. The researchers found that the megamouth’s skin was highly elastic, and its cartilage was made of more ‘segments’ than any other known shark, which may provide added flexibility compared to other species. The authors also suggest that the joint structure (image 3) of the pectoral fin may allow forward and backward rotation, motions that are largely restricted in most sharks. The authors suggest that this flexibility and mobility of the pectoral fin may be specialized for controlling body posture and depth at slow swimming speeds. This is in contrast to the fins of fast-swimming sharks that are generally stiff and immobile.

In addition to the difficulties in exploring deep, dark seas, small sample sizes present challenges for many shark studies, including those described here. But whether studying the infamous great white shark or one of the rare megamouths, both studies contribute to a growing body of knowledge of these elusive fish.

Citations: Hamady LL, Natanson LJ, Skomal GB, Thorrold SR (2014) Vertebral Bomb Radiocarbon Suggests Extreme Longevity in White Sharks. PLoS ONE 9(1): e84006. doi:10.1371/journal.pone.0084006

Tomita T, Tanaka S, Sato K, Nakaya K (2014) Pectoral Fin of the Megamouth Shark: Skeletal and Muscular Systems, Skin Histology, and Functional Morphology. PLoS ONE 9(1): e86205. doi:10.1371/journal.pone.0086205

Image 1: doi:10.1371/journal.pone.0084006.g001

Image 2: doi:10.1371/journal.pone.0086205.g003

Image 3: doi:10.1371/journal.pone.0086205.g004

The post Two Shark Studies Reveal the Old and Slow appeared first on EveryONE.

On the Science “Sting”

Science magazine (a closed-access publisher) does a “sting” on crappy OA journals (and boy are there lots of these), and Michael Eisen points out how this sting is more about how crappy peer review is at catching bad science (even at Science).  Here’s the best quote from Eisen’s response:

“To suggest – as Science (though not Bohannon) are trying to do – that the problem with scientific publishing is that open access enables internet scamming is like saying that the problem with the international finance system is that it enables Nigerian wire transfer scams.

There are deep problems with science publishing. But the way to fix this is not to curtail open access publishing. It is to fix peer review.”

A Bechdel test for scientific workshops

After attending two recent scientific conferences, one which was gender balanced, and one which was so gender-imbalanced that it engendered snarky out-of-band twitter comments, it struck me that we might need a Bechdel Test for scientific workshops.  The Bechdel test is a simple test for movies.  To pass the test, a movie has to have:

  1. at least two [named] women in it,
  2. who talk to each other,
  3. about something besides a man.

Seems simple, right?  You’d be amazed at just how few popular movies pass the test, including some set in universes that were originally designed for equality. (I’m talking about you, Star Trek reboot.)

Here’s an analogous test for scientific workshops or conference symposia.  Does the workshop have:

  1. at least two female invited speakers,
  2. who are asked questions by female audience members,
  3. about their research.

Again, this seems simple, right?  But you’d be shocked how few scientific conference symposia or workshops can live up to this standard.  I suspect this depends strongly on specific research fields. 

Rigoberto Hernandez has been talking about advancing science through diversity for quite a while.  I finally got to hear him speak about the OXIDE project on this latest trip, and he’s got a lot of great things to say about how diversity can strengthen science. I think one great way to help is to point out the good conferences we attend which live up to this standard.

Rigoberto also happened to be one of the organizers of the gender-balanced conference, which was also one of the best meetings I’ve ever attended.

OpenScience comes of age

In 1998, Open Science seemed like a pretty obvious projection of basic scientific principles into the digital age.  I didn’t think the ideas would meet much, if any, resistance from the scientific community.   And in October 1999, Brookhaven National Lab sponsored a meeting called Open Source / Open Science that, in retrospect, was a pretty utopian gathering.  There were a lot of the current OpenScience community members present at the meeting (notably Brian Glanz and Greg Wilson).   It felt like everyone would be convinced to do Open Source & Open Data science in short order.

The past 14 years have been instructive in just how long it can take to make cultural changes in the scientific community.

So, it was an amazing experience to be present when the Office of Science and Technology Policy (OSTP) announced the Champions of Change for Open Science.  These are 13 incredible individuals and organizations with great stories about sharing their science.  It feels like we’ve made significant motion on implementing policies that are friendly to Open Science.   I should note that we’re particularly happy to see OSTP use the phrase Open Science, and not the more narrow terms: Open Data or Open Access.  I’m hopeful that Open Source will also be part of science policy going forward.

There was a second group who got the opportunity to present at this event at a poster session later that day.  I haven’t seen the list publicized elsewhere, but these are some sharp folks who deserve recognition for their work.  I’m going to highlight some of these in the coming week.  Here’s the list of posters:

  1. Richard Judson & Ann Richard from the National Center for Computational Toxicology presented on “ACToR & DSSTox: EPA Open Information Tools for Chemicals in the Environment”
  2. Tom Bleier, Clark Dunson & Michael Lencioni from the QuakeFinder project presented on “Electromagnetic Earthquake Forecasting Research”
  3. David C. Van Essen from WUSTL presented on the “Human Connectome Project”
  4. Heather Piwowar & Jason Priem presented a poster on “ImpactStory: Open Carrots for Open Science”
  5. Jean-Claude Bradley (Drexel) and Andrew Lang (Oral Roberts University) presented a poster on “Open Notebook Science”.
  6. Dan Gezelter (that’s me) presented on “The OpenScience Project”.
  7. John Wilbanks from Sage Bionetworks presented on “Portable Legal Consent – Let Patients Donate Data to Science”
  8. Matt Martin from the National Center for Computational Toxicology presented on “ToxRefDB & ToxCastDB: High-Throughput Toxicology Resources”
  9. Brian Athey and Christoph Brockel presented on “The tranSMART Platform: Accelerating Open Science, Data Analytics and Data Sharing”
  10. Alexander Wait Zaranek, Ward Vandewege & Jonathan Sheffi from Clinical Future, Inc. presented on “Transparent Informatics: A Foundation for Precision Medicine”

It was an intense day, and I’m delighted that Open Science has finally come of age.

OpenScience poster

I’m giving a poster in a few days about openscience.org, and it has been a very long time since I’ve had to make a poster.  This one turned out quite text-heavy, but I wanted to make a few arguments that seemed difficult or impossible to translate into graphics.   A PDF (9.3 MB) of the draft is available by clicking the image on the right…

Comments and suggestions, as always, are quite welcome.

Playing with MultiGraph

I’ve been playing around with a cool JavaScript library called MultiGraph which lets you interact with graphical data embedded in a blog post.   The data format is a simple little xml file called a “MUGL”.   Here’s a sample that took all of about 10 minutes to create:

Note that you can pan and zoom in on the data.   For those readers who are interested, this data is the Oxygen-Oxygen pair distribution function, \(g_{OO}(r)\), for liquid water that was inferred from X-ray scattering data from  G. Hura, J. M. Sorenson,  R. M. Glaeser, and  T. Head-Gordon, J. Chem. Phys. 113(20), pp. 9140-9148 (2000).

Inserting this into the blog post involved uploading two files, the javascript library itself and the MUGL file. After those were in place, there were only two lines that needed to be added to the blog post:

<script type="text/javascript" src="http://www.openscience.org/blog/wp-content/uploads/2013/05/multigraph-min.js"></script>


<div class="multigraph" data-height="300" data-src="http://www.openscience.org/blog/wp-content/uploads/2013/05/gofrmugl.xml" data-width="500"></div>

One thing that would be nice would be a way to automate the process of going from an xmgrace file directly to the MUGL format.

SimThyr – simulation software for pituitary thyroid feedback

This is a bit outside our normal area of expertise, but it looks interesting.

Thyroid hormones play an important role in metabolism, growth and differentiation. Therefore, exact regulation of thyroid hormone levels is vital for most organisms. The mechanism for the feedback control is known, but the dynamics are still a bit of a mystery.  There’s an interesting page on the different models for thyrotropic feedback control at the Medizinische Kybernetik (Medical Cybernetics) site.  SimThyr is an open source Pascal-based simulation program for the pituitary thyroid feedback control mechanism that explores these models and makes predictions for dynamics based on parameters of the feedback mechanism.

Not a kickstarter for science, a prize clearinghouse

Yesterday’s post on the reversible random number generators received some interesting reactions from my colleagues.  They were uniformly impressed with the solution to what everyone thought was a hard problem, but surprisingly, most of the scientists I talked to were most excited about the fact that dangling a $500 reward for solving a hard problem generated nearly instantaneous results.  Typical comments:

I wonder if I similarly spent my startup how much science I could get done…

Also, it is amazing what $500 buys these days!

Think how many problems we could solve if we dangled a few prizes for other knotty problems.

So what made this work?

  • The problem itself was well-framed and finite:  “We need a time-reversible random number generator.”  It was something that a lot of people in the field could agree was interesting when framed to them properly.
  • The group offering the prize was widely-respected for previous work on related problems.
  • The prize and the solution were both posted on a highly visible physics site (arXiv).
  • The reward was about fame and recognition by the community more than it was about money.

I’m now wondering if all of  the attempts to get a kickstarter or crowdsourced funding model for science (e.g. sciflies, petridish, scifundchallenge, fundageek) are just a bit misguided.  Science is darned expensive, and for better or worse, we’re going to be wedded to federal and foundation funding for science for a long time.  All funding models have an aspect of salesmanship to them – a scientist must convince the funder that the problem itself is interesting enough to need solving, and that their lab is the one to solve it.   In the NSF-style funding model, scientific communities do have significant input into what the “good problems” are, but the necessary delays in funding and the scarcity of funds means that we’re not very agile.

Perhaps we need a clearinghouse where scientific communities can agree on tough challenges, pool some minimal award money (like $500 or $1000) and let their young colleagues have a go at winning fame by solving them.

Reversible Random Number Generators

This news comes by way of John Parkhill, my new colleague here at Notre Dame.

William G. Hoover (of the Nosé-Hoover Thermostat) and Carol G. Hoover issued a $500 challenge on arXiv to generate a time-reversible random number generator.  The challenge itself would be quite remarkable news.  What’s even better is that the challenge (including the source code for an implementation) was solved in 6 days by Federico Ricci-Tersenghi.

Why is this a big deal?  Most of the equations in physics that govern time evolution of particles obey time-reversal symmetry; the same differential equations that govern molecular or planetary motion will take you back to your starting point if you suddenly reverse the time variable.  This is usually a fantastic way to check to see if you are doing the physics correctly in your simulations, and also means that collections of starting points that are related to each other behave in certain predictable ways when they evolve.
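Here is a minimal sketch of that kind of check. It assumes a single particle on a harmonic spring and the velocity Verlet integrator; the function names, the spring constant, and the step size are all illustrative choices of mine, not anything taken from the Hoovers' challenge. The idea is to run forward, flip the sign of the velocity, run the same number of steps again, and confirm that you land back where you started (up to floating-point roundoff).

```python
def spring_force(x, k=1.0):
    """Force on a particle attached to a harmonic spring (hypothetical test system)."""
    return -k * x


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of the velocity Verlet scheme."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half kick with the old force
        x = x + dt * v_half                # drift
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # half kick with the new force
    return x, v


if __name__ == "__main__":
    x0, v0 = 1.0, 0.0
    # Run forward in time.
    x1, v1 = velocity_verlet(x0, v0, spring_force, 1.0, 0.01, 1000)
    # Reverse the velocity (equivalent to reversing time) and run again.
    x2, v2 = velocity_verlet(x1, -v1, spring_force, 1.0, 0.01, 1000)
    print("returned to start:", abs(x2 - x0) < 1e-9 and abs(-v2 - v0) < 1e-9)
```

If a random force were added inside the loop using an ordinary random number generator, the second run would see different random numbers than the first, and the check would fail.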

Stochastic approaches to physical motion introduce an aspect of randomness to mimic the behavior of complex phenomena like the motion of solvent surrounding the molecule we’re interested in, or to mimic the transitions between different electronic states of a molecule.   The introduction of random numbers has meant we had to give up time-reversibility, and we’ve been willing to live with that for a long time because we can study more complicated phenomena.

If we have access to a time-reversible pseudo-random number generator, however, we get that very powerful tool back in our toolbox.
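To make the idea concrete, here is a toy sketch of a pseudo-random generator that can be stepped backward as well as forward. This is not the solution posted on arXiv; it is just a linear congruential generator, which happens to be invertible whenever the multiplier has a modular inverse. The class name, the method names, and the constants are illustrative assumptions, and an LCG like this is far too weak for production simulations.

```python
class ReversibleLCG:
    """Toy linear congruential generator that can run forward or backward.

    Illustrative only: the constants are common textbook LCG values,
    chosen here because the multiplier is odd and so invertible mod 2**31.
    """

    def __init__(self, seed, a=1103515245, c=12345, m=2**31):
        self.a, self.c, self.m = a, c, m
        self.a_inv = pow(a, -1, m)   # modular inverse (Python 3.8+)
        self.state = seed % m

    def forward(self):
        """Step to the next state and return it."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def backward(self):
        """Undo one forward step and return the state that preceded it."""
        self.state = (self.a_inv * (self.state - self.c)) % self.m
        return self.state


if __name__ == "__main__":
    gen = ReversibleLCG(seed=42)
    ahead = [gen.forward() for _ in range(5)]
    back = [gen.backward() for _ in range(5)]
    # Walking backward retraces the earlier states and ends at the seed.
    print(back == ahead[-2::-1] + [42])
```

A stochastic simulation built on a generator like this could retrace its random forces exactly when time is reversed, which is what makes the reversibility check usable again.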

Now, the Langevin equation,

\(m \frac{d^2 x}{dt^2} = F - \gamma(t) \frac{dx}{dt} + R(t)\)

 

has two things that prevent it from being time-reversible.  Besides the stochastic or random force, \(R(t)\), there’s also a drag or friction force, \(-\gamma(t) \frac{dx}{dt}\), that depends on the velocities of the particles.  There’s no solution yet to time reversibility for this piece (and I have my doubts that there ever will be a way to reverse this).  I suppose if we offer up another $500 prize for time-reversible drag, we’d make some traction on this problem…

(The comic above courtesy of xkcd).

Relax – Molecular dynamics by NMR data analysis

Edward d’Auvergne pointed out the relax program, which looks like a useful way to connect experimental NMR spectra with molecular dynamics simulations.

relax is designed for the study of molecular dynamics of organic molecules, proteins, RNA, DNA, sugars, and other biomolecules through the analysis of experimental NMR data. It supports exponential curve fitting for the calculation of the R1 and R2 relaxation rates, calculation of the NOE, reduced spectral density mapping, the Lipari and Szabo model-free analysis, study of domain motions via the N-state model or ensemble analysis and frame order dynamics theories using anisotropic NMR parameters such as RDCs and PCSs, and the investigation of stereochemistry.

The Tyranny of Pi day

March 14th is \(\pi\)-day in the US (and perhaps \(4.\overline{666}\) day in Europe). The idea of a day devoted to celebrating an important irrational number is wonderful; I’d love to see schools celebrate e-day as well, but February 71st isn’t on the calendar. Unfortunately, March 14th has also become the day on which 4th and 5th graders around the US practice for one of the most pointless exercises imaginable – a competition to recite the largest number of digits of \(\pi\).

Memorization of long digit strings is not an exercise that teaches a love of mathematics (or anything else useful about the natural world).  This is solely an exercise in recall, which is perhaps valuable for remembering phone numbers, but not for understanding transcendental constants. For all practical purposes, only the first few digits of \(\pi\) are really necessary – the first 40 digits of \(\pi\) are enough to compute the circumference of the Milky Way galaxy with an error less than the size of an atomic nucleus.
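As a rough check of that claim, take the diameter of the Milky Way to be about 100,000 light-years, or roughly \(D \approx 10^{21}\) m, and the size of a nucleus to be about \(10^{-15}\) m (both are round figures I am assuming for the estimate). Truncating \(\pi\) after 40 digits leaves an error \(\Delta\pi < 10^{-39}\), so the error in the circumference \(C = \pi D\) is

\(\Delta C = D \, \Delta\pi < 10^{21}\,\mathrm{m} \times 10^{-39} = 10^{-18}\,\mathrm{m}\)

which is about a thousand times smaller than a nucleus.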

So, because \(\pi\) is such an accessible entry to mathematics and science, I thought I’d come up with a list of other cool \(\pi\) things that could replace these pointless memory contests:

  • The earliest written approximations of \(\pi\) are found in Egypt and Babylon, and both are within 1 percent of the true value. In Babylon, a clay tablet dated 1900–1600 BC has a geometrical statement that, by implication, treats \(\pi\) as 25/8 = 3.1250. In Egypt, the Rhind Papyrus, dated around 1650 BC, but copied from a document dated to 1850 BC has a formula for the area of a circle that treats \(\pi = \left(\frac{16}{9}\right)^2 \approx 3.1605\).
  • In 220 BC, Archimedes proved that \( \frac{223}{71} < \pi < \frac{22}{7}\).  The mid-point of these fractions is about 3.14185.
  • Around 500 AD, the Chinese mathematician Zu Chongzhi  was using a rational approximation for \(\pi \approx 355/113 = 3.14159292\), which is astonishingly accurate.  For most day-to-day uses of \(\pi\) this particular approximation is still sufficient.
  • By 800 AD, the great Persian mathematician, Al-Khwarizmi, was estimating \(\pi \approx 3.1416\).
  • A good mnemonic for the decimal expansion of \(\pi\) is given by the letter count in the words of the sentences: “How I want a drink, alcoholic of course, after the heavy lectures involving quantum mechanics. All of thy geometry, Herr Planck, is fairly hard…”
  • Georges-Louis Leclerc, the Comte de Buffon, came up with one of the first “Monte Carlo” methods for computing the value of \(\pi\) in 1777.  This method involves dropping a short needle of length \(\ell\) onto lined paper where the lines are spaced a distance \(d\) apart.  The probability that the needle crosses one of the lines is given by:  \(P = \frac{2 \ell}{\pi d}\).
  • In 1901, the Italian mathematician Mario Lazzarini attempted to compute \(\pi\) using Buffon’s Needle.  Lazzarini spun around and dropped a 2.5 cm needle 3,408 times on a grid of lines spaced 3 cm apart. He got 1,808 crossings and estimated \(\pi = 3.14159292\). This is a remarkably accurate result!   There is now a fair bit of skepticism about Lazzarini’s result, because his estimate reduces to Zu Chongzhi’s rational approximation.  This controversy is covered in great detail in Mathematics Magazine 67, 83 (1994).
  • Another way to estimate \(\pi\) would be to use continued fractions.  Although \(\pi\) has a simple continued fraction, it doesn’t show any obvious pattern.  There’s a beautiful (but non-simple) continued fraction for \(\frac{4}{\pi}\):
    \(\frac{4}{\pi} = 1 + \frac{1^2}{2 + \frac{3^2}{2 + \frac{5^2}{2 + \frac{7^2}{2 + \cdots}}}}\)

    Can you spot the pattern?

  • Vi Hart, the wonderful mathemusician, has a persuasive argument that we should instead be celebrating \(\tau\) day on June 28th.   Actually, all of her videos are wonderful.  If my kids spent all day doing nothing but playing with snakes, it would be better than memorizing digits of \(\pi\).
  • Another wonderful way to compute \(\pi\) is to use nested round and square baking dishes (of the correct size) and drop marbles into them randomly from a distance.  Simply count up the number of marbles that land in the circular dish and keep track of the total number of marbles that landed in either the circle or the square. Since the area formulae for squares and circles are related, the value of \(\pi = 4 \frac{N_{circle}}{N_{total}}\).
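Several of the items in this list are easy to try out on a computer, which might make a better classroom activity than reciting digits. The sketch below is my own illustration (all function names are made up for this post): it reads digits off the mnemonic, plugs Lazzarini's reported numbers into Buffon's formula, evaluates the continued fraction for \(\frac{4}{\pi}\) to a finite depth, and simulates the marble drop with a unit circle inscribed in a square. The marble simulation assumes marbles land uniformly over the square, which real marbles may not do.

```python
import math
import random
import re
from fractions import Fraction

MNEMONIC = ("How I want a drink, alcoholic of course, after the heavy "
            "lectures involving quantum mechanics. All of thy geometry, "
            "Herr Planck, is fairly hard")


def mnemonic_digits(sentence):
    """Join the letter counts of each word into a digit string.

    Only works when every word has fewer than ten letters, as here.
    """
    words = re.findall(r"[A-Za-z]+", sentence)
    return "".join(str(len(word)) for word in words)


def lazzarini_estimate():
    """Solve Buffon's P = 2l/(pi d) for pi using Lazzarini's reported counts."""
    needle = Fraction(5, 2)          # needle length in cm
    spacing = Fraction(3)            # line spacing in cm
    drops, crossings = 3408, 1808
    return 2 * needle * drops / (spacing * crossings)


def four_over_pi_fraction(depth):
    """Evaluate the continued fraction for 4/pi, cut off after `depth` levels."""
    tail = 2.0
    for k in range(depth, 0, -1):    # work from the innermost level outward
        odd = 2 * k + 1
        tail = 2.0 + odd * odd / tail
    return 1.0 + 1.0 / tail


def marble_estimate(n_marbles, seed=0):
    """Drop marbles uniformly on a square and count those inside the inscribed circle."""
    rng = random.Random(seed)
    in_circle = 0
    for _ in range(n_marbles):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            in_circle += 1
    return 4.0 * in_circle / n_marbles


if __name__ == "__main__":
    print("mnemonic digits:   ", mnemonic_digits(MNEMONIC))
    print("Lazzarini fraction:", lazzarini_estimate())
    print("continued fraction:", 4.0 / four_over_pi_fraction(10000))
    print("marble estimate:   ", marble_estimate(200000))
    print("math.pi:           ", math.pi)
```

Lazzarini's numbers reduce exactly to 355/113, which is why his result looks suspiciously good. The continued fraction and the marbles both converge slowly, so expect only a few correct digits from either one.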

There are probably 7000 better things to do with \(\pi\) day than digit memory contests. There are lots of creative teachers out there — how are all of you going to celebrate \(\pi\)-day?

Do.abl.es

Do you want to know how you can measure DNA contour lengths using ImageJ?  Perhaps you want to stain a C. elegans embryo for imaging?  Or possibly, you might want to test whether or not you have gotten an immune response using ELISA?

Martin Fitzpatrick sends word of a cool collection of open access scientific protocols called Do.abl.es.  For the uninitiated, protocols are the recipes that scientists use to carry out experiments in a reproducible way.  The list of protocols posted to Do.abl.es to date has a number of interesting and important biochemistry and biology experiments.

There’s also a neat companion site called Install.abl.es which concentrates on many of the same things we do – the use of open source software in the sciences.