The RosettaCon 2012 Collection: Rosetta Developers Meet the Challenges in Macromodeling Head On

Reproducibility continues to be one of the major challenges facing computational biologists today. Complicated experiments, massive data sets, scantily described protocols, and constantly evolving code can make experimental documentation and replication very difficult. In addition, the need for specialized knowledge and access to large computational resources can create barriers when trying to design and model macromolecules.

Every year, the Rosetta developer community meets to discuss these challenges and recent advancements in Rosetta, a software suite that models and helps design macromolecules. In 2010, PLOS announced the RosettaCon 2010 Collection, which made the latest research on protocols used to create macromolecular models available to all. Now, the PLOS ONE RosettaCon 2012 Collection continues to tackle issues related to use, reproducibility and documentation by highlighting new scientific developments within the Rosetta community.

The RosettaCon 2012 Collection comprises 14 articles detailing the scientific advancements made by developers who use Rosetta. To address reproducibility and documentation challenges, each article within this Collection includes an archive containing links to the exact version of the code used in the paper, all input data, links to external tools, and example scripts.

This year’s Collection marks the tenth anniversary of RosettaCon and focuses on three long-term goals of the community: increase the usability of Rosetta, improve its current methods, and introduce completely new protocols.

Increasing the usability of Rosetta – Rosetta still requires specialized knowledge and large computational resources, but this Collection features two articles describing advancements that make it easier for non-experts to use its applications. These articles introduce the Rosetta Online Server that Includes Everyone (ROSIE) workflow, which allows for rapid conversion of Rosetta applications into public web servers, and the PyRosetta Toolkit, a new graphical user interface (GUI) that allows users to run standard Rosetta design tasks.

Improving current prediction methods – Several articles describe improvements to Rosetta’s structure prediction capabilities and design methodologies. Examples include improvements to loop conformational sampling and a recently developed ray-casting method (DARC) for small-molecule docking, which now enables virtual screening of large compound libraries.

Introducing new protocols – A number of articles in the Collection present new procedures and applications that debuted at the conference. Highlights include new methods for ligand docking, an approach for pre-refining scaffold proteins prior to computational design of functional sites, and new protocols to drive Rosetta de novo modeling.

The RosettaCon 2012 Collection continues to help serve the Rosetta community in an effort to ensure that newly developed protocols are as usable as more established workflows, are transparent, and are accurately documented even in an active development environment.

This post has been adapted from “The RosettaCon 2012 Special Collection: Code Writ on Water, Documentation Writ in Stone,” which serves as a more in-depth overview of the new Collection.

PLoS ONE Launches the RosettaCon 2010 Collection

Reproducing computational protocols is one of the most difficult challenges facing computational biologists today. Because macromolecular modeling protocols are so complex, it is rarely feasible to replicate the computational environment of an original work. Moreover, much of this work is new research and not focused solely on the algorithms or workflows. As a result, published results often contain method descriptions with inconsistently stated protocols and dependencies.

In the new PLoS ONE Collection: RosettaCon 2010, over 15 academic groups from Rosetta Commons have attempted to capture these protocols in a sufficiently complete and formal way. This Collection aims to make several of the latest Rosetta macromolecular modeling protocols from the 2010 Rosetta Developers Meeting accessible to all.

Three main contributions came from the meeting and are represented in the PLoS ONE Collection.

1. New Rosetta applications: Several articles describe specific applications in biology or chemistry, including de novo enzyme design, modeling classes of protein loops, design of temperature sensitive mutations, and design of peptides to inhibit large surface area protein interactions.

2. Rosetta basic science: Several of the contributions in this Collection are Rosetta basic science papers. Examples include: incorporation of non-canonical amino acids in Rosetta design, multi-state design, new Rosetta kinematics, new protein docking protocols, and anchored design. Each example fully describes the protocol that led to the incorrect prediction and provides the correct (“native”) structure; thus these protocols are key elements in defining and judging future improvement in Rosetta and other codes.

3. Rosetta code development: Multiple articles describe new code refactoring, extensions or improvements to the implementation of Rosetta. Several articles discuss the creation of multi-purpose high-level interfaces to the components of Rosetta. Examples include an XML scripting interface for Rosetta, an interactive Python interface to the Rosetta code, and an object-oriented API for generating Rosetta fragments.

The RosettaCon 2010 Collection provides the larger community direct access to the exact protocols used in each of these papers. This approach is intended to allow other community members and Rosetta users to reproduce the work, allow competing groups to validate and improve upon the work, and finally, make it rapidly accessible to new users with similar biological applications.

It should be noted that this Collection is itself a social experiment and the collaborators wrestled with how best to capture an evolving set of processes in a way that does not overly burden authors, works across a distributed community without a central authority for methods capture, is timely, and is sufficiently self-consistent that readers will invest their time in the results.

The paper, 2010 Rosetta Developers Meeting: Macromolecular Prediction and Design Meets Reproducible Publishing, by Renfrew et al. was adapted to create this post.

Collection Citation: RosettaCon 2010 (2011) PLoS Collections: http://www.ploscollections.org/RosettaCon2010