10 years of CDK

Today marks (roughly) the tenth birthday of a fantastically successful open science project called the Chemistry Development Kit (CDK). At the time the skeleton of the project was set down on my office whiteboard, I was still the lead developer of Jmol, and Egon Willighagen and Christoph Steinbeck had contributed code to the Jmol project. Christoph’s pet code was a neat 2D structure editor called JChemPaint, and Egon was working largely on the Chemical Markup Language (CML), although his code contributions were showing up nearly everywhere. Egon and Christoph were in the US for a “Chemistry and the Internet” conference and made a side trip by train to visit me so we could figure out how to unify these projects and make a more general and reusable set of chemical objects.

The CDK waterfall whiteboard

The CDK design session was a fun weekend. In retrospect, those were some of the purest days of collaborative creativity I’ve ever experienced. We spent many hours and a lot of coffee hashing out some of the basic classes of CDK. The final picture of the whiteboard shows a classic waterfall diagram of what we were going to implement.

I’m the first to admit that my contributions to CDK were minimal. Egon & Chris ran with the design, expanded and improved it, implemented all the missing pieces, and released it to the world. It has become an important piece of scientific software, particularly in the bioinformatics community. Beyond Egon & Chris, Rajarshi Guha has been one of the prime developers of the software.

CDK is, by all objective standards, a fantastic success story of open source scientific software. It has a large and vibrant user community, active developers, and a number of people (including myself) who browse the code just to see how it does something difficult. Egon has written a thoughtful piece on where CDK should go from here.

Happy Birthday CDK!