A Complete and Effective Move Set for Simplified Protein Folding

We have developed a new approach for solving a simplified version of the protein-folding problem. Our techniques were able to generate new best solutions for several of the largest benchmarks available in the literature. Our algorithm uses standard search techniques. Our key contribution is a novel set of transformations on protein configurations, which we call "pull moves". Theoretically, we have shown that pull moves have desirable properties for use in search algorithms. Experimentally, we have shown that a generic implementation of tabu search using pull moves performs extremely well on benchmark problems.

Background & Objective:  We have worked on the two-dimensional hydrophobic-hydrophilic (2D HP) model for protein folding, introduced by K.A. Dill in 1989. Although the model is simple, it is thought to capture many of the main features of the protein-folding problem. Thus, techniques that work well on the simple problem may be useful for the real protein-folding problem. This research has been performed in the context of the ongoing Human-Guided Search (HuGS) project on interactive optimization. While we have developed a fully-automatic system, the interactive visualizations of HuGS were critical to allowing the researchers on the project to understand the problem and develop solutions. Pull moves were originally designed to help a user quickly manipulate the current protein configuration but turned out to be very effective for computer search as well.

Technical Discussion:  Solving a 2D HP problem instance involves placing a sequence of amino acids, each labeled as either hydrophobic (a square in the example) or hydrophilic (a circle in the example). Each amino acid must be placed horizontally or vertically adjacent on the grid to its neighbors in the sequence. The goal is to find the configuration with minimal energy, which typically means maximizing the number of adjacent hydrophobic pairs.   Ãƒâ€š  Pull moves typically begin by repositioning two amino acids that are neighbors in the given sequence to new, adjacent grid locations that are currently free. This may introduce a gap between two neighbors in the given sequence, but a valid configuration can be obtained by repositioning additional amino acids into grid locations that were occupied before the pull move began. In the worst case, every amino acid might be repositioned, but often a valid configuration is reached after only repositioning a few amino acids.

Outside Collaborations:  We are collaborating with professors in Harvard University and McGill University.

Contact:  Joseph Katz

Technology Area:  Artificial Intelligence

Modification Date:  September 12, 2007