diff --git a/_posts/2026-06-15-agent-rosetta.md b/_posts/2026-06-15-agent-rosetta.md new file mode 100644 index 0000000..46898ae --- /dev/null +++ b/_posts/2026-06-15-agent-rosetta.md @@ -0,0 +1,141 @@ +--- +layout: post +title: "Protein Design with Agent Rosetta: A Case Study for Specialized Scientific Agents" +authors: Jacopo Teneggi, SM Bargeen Alam Turzo, Tanya Marwah, Alberto Bietti, P. Douglas Renfrew, Vikram Khipple Mulligan, Siavash Golkar +shorttitle: "Protein Design with Agent Rosetta" +date: 2026-06-18 12:00 +smallimage: agent-rosetta/splash.jpg +image: agent-rosetta/splash.jpg +blurb: An LLM agent to execute protein-design tasks with RosettaScripts. +shortblurb: An LLM agent to execute protein-design tasks with RosettaScripts. +splashimage: /images/blog/agent-rosetta/splash.jpg +link: https://arxiv.org/pdf/2603.15952 +github_link: https://github.com/PolymathicAI/agent-rosetta +permalink: /blog/agent-rosetta/ +--- + +When the *reasoning* power of large language models (LLMs) is combined with AI agents that can write, debug, and execute code through iterative feedback, new opportunities emerge for addressing scientific problems beyond the reach of current machine learning (ML) models. + +Many existing AI agents for science focus on high-level tasks such as literature review, hypothesis generation, or experimental planning. But can AI agents directly operate specialized scientific software and participate in real scientific workflows? + +*Protein design provides an ideal test case.* + +--- + +## **Why Protein Design Is a Challenging Task** +
+Protein design and more generally, heteropolymer design play important roles in drug development, enzyme engineering, and nanomaterials. + +Recent machine learning models such as [AlphaFold](https://deepmind.google/blog/putting-the-power-of-alphafold-into-the-worlds-hands/), [ESMFold](https://biohub.ai/esm/protein/), [RFdiffusion](https://www.bakerlab.org/2023/07/11/diffusion-model-for-protein-design/), and [ProteinMPNN](https://www.bakerlab.org/2022/09/16/proteinmpnn-excels-at-creating-new-proteins/) have made strides in the field. However, these approaches are typically specialized for narrow tasks and are largely restricted to the twenty canonical amino acids found in nature. + +Still, one of the most versatile platforms for protein design remains [Rosetta](https://rosettacommons.org/), a widely used physics-based software suite for protein and heteropolymer design, capable of modeling molecular systems beyond the scope of modern machine learning models. + +The challenge is that while Rosetta is powerful, it can be difficult to use. + +--- + +## **Introducing Agent Rosetta** +
+To bridge this gap, we developed Agent Rosetta, a large language model agent paired with a structured environment for interacting with Rosetta. Agent Rosetta iteratively refines designs to achieve user-defined objectives, combining LLM reasoning with Rosetta’s generality. + +

+
+ Figure 1: Schematics of Agent Rosetta’s multi-turn interaction protocol. +

+ +Rather than generating complete Rosetta protocols from scratch, Agent Rosetta operates through an iterative loop (Figure 1): + +- Observe the current design state +- Reason about possible improvements +- Select an action +- Execute the action through Rosetta +- Analyze the results +- Repeat + +This allows the agent to progressively refine designs while leveraging Rosetta’s underlying physics-based capabilities. + +--- + +## **Why Environment Design Matters** +
+One of the things we found most interesting while creating Agent Rosetta is that model intelligence alone is not sufficient. Although frontier language models have encountered Rosetta documentation during training, they frequently struggle to generate valid Rosetta protocols when relying on prompting alone. In many cases, the problem was not scientific reasoning but software interaction: the model understood what it needed to accomplish but failed to express that goal in Rosetta’s specialized scripting language. + +To solve this problem, we built a structured environment that abstracts Rosetta’s complexity into higher-level actions and provides targeted feedback, transforming Rosetta from a difficult scripting language into a system that the agent could reliably use. For example, Figure 2 compares the performance of various LLMs at generating semantically correct composition constraint blocks, fundamental for protein design in Rosetta ([docs](https://docs.rosettacommons.org/docs/latest/rosetta_basics/scoring/AACompositionEnergy)). Without our symplified syntax, LLMs fail at generating semantically correct constraints. + +

+
+ Figure 2: Comparison of various LLMs at generating semantically correct composition constraint blocks (docs) with and without our simplified syntax. +

+ +Agent Rosetta demonstrates that LLMs can perform competitive scientific design work when paired with a carefully engineered environment that converts complex scientific software into a sequence of interpretable observations, actions, and feedback loops, suggesting that environment design may be just as important as model intelligence for many scientific-agent applications. + +This result highlights an important lesson for scientific AI: **performance often depends as much on environment design as on model capability.** + +--- + +## **Matching Specialized Deep Learning Models** +
+First, we evaluated Agent Rosetta on fixed-backbone protein sequence design using the 20 canonical amino acids. This was important to benchmark our agent, which relies on a general-purpose LLM, against specialized deep learning models such as ProteinMPNN and [BoltzGen](https://github.com/HannesStark/boltzgen), before moving onto tasks with non-canonical amino acids. Not only did Agent Rosetta match state-of-the-art alternatives, but also it outperformed a static RosettaScripts protocol written by various Rosetta experts (Figure 3). + +

+
+ Figure 3: Comparison of Agent Rosetta with ProteinMPNN, BoltzGen, and a human-written RosettaScripts protocol on fixed-backbone protein sequence design using the 20 canonical amino acids only. RMSD to the target backbone conformation and pLDDT were computed with ESMFold. +

+ +These results suggest that scientific agents can effectively leverage existing simulation tools rather than learning every task directly from data (Figure 4 includes example designs). + +

+
+ Figure 4: Example designs by all methods on two target backbone conformations. +

+ +--- + +## **Going Beyond Current Machine Learning Models** +
+Perhaps the most compelling result involves non-canonical amino acids. Many real-world protein design applications require synthetic or modified residues that are rare or entirely absent from nature. Because modern machine learning models depend on training data, these molecules are difficult to model when little or no biological data exists. + +Rosetta’s physics-based framework does not have this limitation. By combining Rosetta with an LLM agent, we demonstrated a successful design workflow involving non-canonical residues, enabling tasks that current deep learning approaches cannot reliably perform. In particular, we studied the inclusion of a non-canonical amino acid in the core of an existing protein. We considered [N1-formyl-tryptophan (TRF)](https://www.rcsb.org/ligand/TRF), a post-translational modification of tryptophan compatible with AlphaFold 3 for validation. The challenge in this task is to position the residue such that the protein remains stable. Most off-the-shelf machine learning models for protein design would struggle with this task, so we compared our approach against a human-written RosettaScripts protocol (see Figure 5). Although simple, the task could easily be expanded to include a palette of different non-canonical amino acids at multiple positions, or even fully non-natural heteropolymers containing no canonical amino acids. + +

+
+ Figure 5: Comparison of Agent Rosetta with a human-written protocol on a design task with non-canonical amino acids. RMSD to the native conformation and pLDDT were computed with AlphaFold 3. +

+ +This highlights a key advantage of combining reasoning agents with scientific software: **the ability to operate in domains where data is scarce but physical principles remain available.** + +We validated the structural stability of Agent Rosetta's designs with three replicates of 1μs molecular dynamics (MD) simulations. MD simulations are computational proxies for the stability of a protein conformation over time, and an unphysical (and/or unfavorable) placement of TRF would likely compromise the structural stability of the fold for which the design has been optimized for. Figure 6 includes the best designs in terms of RMSD to the native structure at the end of the MD simulation on three backbones. + +

+
+ Figure 6: Example designs by Agent Rosetta on including one TRF in an existing protein. TRF is colored in purple, and RMSD to the native structure is computed at the end of the MD simulation of 1μs. +

+ +The following animation depicts a the full MD simulation for the best design on 7SQ3. TRF, the glowing residue, remains well-packed in the core of the protein without compromising the global structure. + +

+ +

+ +--- + +## **Looking Forward** +
+Agent Rosetta represents more than a protein design system. It suggests a future in which AI systems do not merely generate scientific ideas, but actively participate in scientific workflows by interacting with the same software tools used by human researchers. As scientific software continues to evolve, building the environments that connect these tools to intelligent agents may become just as important as building the models themselves. + +--- + +## **Open Source Availability** +
+We are actively preparing Agent Rosetta for release to the general public. We will also release all traces and designs generated in our experiments. Upon release, sesources will be available on the [Polymathic AI GitHub](https://github.com/PolymathicAI/agent-rosetta). + +*-- Sophie Barstein, Jacopo Teneggi, and Siavash Golkar --* + +--- + +**Acknowledgements** + +We would like to acknowledge the support of the Simons Foundation and of Schmidt Sciences. This work was supported in part by the AI2050 program at Schmidt Sciences (Grant G-25-70028). PDR and VKM are wholly funded by the Simons Foundation. We thank the Flatiron Institute's Scientific Computing Core for ongoing support. The computations reported in this paper were performed in-part using resources made available by the Flatiron Institute. The Flatiron Institute is a division of the Simons Foundation. We thank Lucy Reading-Ikkanda and Aditya Chhatrala for their contribution to the figures presented in our paper, and Emeline Haroldsen for her comments on an earlier draft of this article. + +Photo by Alexander Lunyov on Unsplash + \ No newline at end of file diff --git a/images/blog/agent-rosetta/aacomp-syntax.jpg b/images/blog/agent-rosetta/aacomp-syntax.jpg new file mode 100644 index 0000000..c7e4c3d Binary files /dev/null and b/images/blog/agent-rosetta/aacomp-syntax.jpg differ diff --git a/images/blog/agent-rosetta/fixed-backbone-sequence-design-grid.jpg b/images/blog/agent-rosetta/fixed-backbone-sequence-design-grid.jpg new file mode 100644 index 0000000..1588ad9 Binary files /dev/null and b/images/blog/agent-rosetta/fixed-backbone-sequence-design-grid.jpg differ diff --git a/images/blog/agent-rosetta/fixed-backbone-sequence-design-performance.jpg b/images/blog/agent-rosetta/fixed-backbone-sequence-design-performance.jpg new file mode 100644 index 0000000..0fcc06f Binary files /dev/null and b/images/blog/agent-rosetta/fixed-backbone-sequence-design-performance.jpg differ diff --git a/images/blog/agent-rosetta/md.png b/images/blog/agent-rosetta/md.png new file mode 100644 index 0000000..686ed2f Binary files /dev/null and b/images/blog/agent-rosetta/md.png differ diff --git a/images/blog/agent-rosetta/overview.jpg b/images/blog/agent-rosetta/overview.jpg new file mode 100644 index 0000000..aa78c2f Binary files /dev/null and b/images/blog/agent-rosetta/overview.jpg differ diff --git a/images/blog/agent-rosetta/pack-ncaa-grid.jpg b/images/blog/agent-rosetta/pack-ncaa-grid.jpg new file mode 100644 index 0000000..9ca883d Binary files /dev/null and b/images/blog/agent-rosetta/pack-ncaa-grid.jpg differ diff --git a/images/blog/agent-rosetta/pack-ncaa-performance.jpg b/images/blog/agent-rosetta/pack-ncaa-performance.jpg new file mode 100644 index 0000000..4e367b7 Binary files /dev/null and b/images/blog/agent-rosetta/pack-ncaa-performance.jpg differ diff --git a/images/blog/agent-rosetta/splash.jpg b/images/blog/agent-rosetta/splash.jpg new file mode 100644 index 0000000..e9839ef Binary files /dev/null and b/images/blog/agent-rosetta/splash.jpg differ