On the use of distance constraints to fold a protein

Hiroshi Wako, Harold A. Scheraga

Research output: Contribution to journal (Article)

32 Citations (Scopus)

Abstract

A simple method is presented to assess the information provided by distance constraints for pairs of residues in proteins. The probability that the distance d_ij between the Cα atoms of residues i and j lies within a given range is computed for all N(N - 1)/2 pairs in a molecule of N residues, and a quantity H is defined in terms of these probabilities; H is a measure of the ambiguity in the computed conformation of the molecule (consistent with the given distance constraints) and is related to the root-mean-square deviation of the computed conformation from the native one. The quantity H is used to determine the number, kind, and quality of the distance constraints required to define the conformation of a protein within given limits of error, using the 58-residue molecule bovine pancreatic trypsin inhibitor as an illustration. For example, to obtain a computed conformation with a root-mean-square deviation of less than 2 Å from the native conformation, the values of d_ij for more than ∼80 pairs (half of them with 5 ≤ |i - j| ≤ 20 and the other half with 21 ≤ |i - j| ≤ 57) must be known exactly, or the values for more than ∼150 pairs (split between the same two ranges of |i - j|) must be known with an error no greater than ∼2 Å. Alternatively, the same root-mean-square deviation of less than 2 Å from the native structure can be achieved if more than ∼160 pairs are chosen so that 20 Å is assigned as the lower limit for half of these d_ij's (for those pairs in the native protein that are separated by ≥20 Å) and 10 Å is assigned as the upper limit for the other half (for those pairs in the native protein that are separated by ≤10 Å). In all of the above examples, all values of d_(i,i+1) were fixed at 3.8 Å, and all values of d_(i,i+2) were confined to the range 4.5-7.2 Å (the minimum and maximum possible values for a polypeptide chain).
We also examined the kind of constraints (in terms of their distance both along the chain and through space) that are most effective in obtaining a small root-mean-square deviation. For a given number of constraints, information about pairs with large |i - j| or small d_ij is more effective in determining the conformation than is information about pairs with small |i - j| or large d_ij. It is found, however, that information that includes both small and large |i - j|, or both small and large d_ij, is the most effective.
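The pair bookkeeping described in the abstract can be made concrete with a short sketch. It only counts Cα pairs and assigns the two fixed constraints quoted above (3.8 Å for neighbours, 4.5-7.2 Å for next-nearest neighbours); it does not compute the quantity H, whose definition is not given in the abstract. The function names and the use of (0, infinity) as a placeholder for unconstrained pairs are assumptions made for illustration.

```python
from itertools import combinations

N_RESIDUES = 58  # bovine pancreatic trypsin inhibitor, as stated in the abstract


def default_bounds(n):
    """Return {(i, j): (lower, upper)} in angstroms for all pairs i < j (1-based).

    Only the two constraints quoted in the abstract are filled in; every other
    pair gets a placeholder (0, inf), which is an assumption of this sketch.
    """
    bounds = {}
    for i, j in combinations(range(1, n + 1), 2):
        separation = j - i
        if separation == 1:
            bounds[(i, j)] = (3.8, 3.8)   # d_(i,i+1) fixed at 3.8 A
        elif separation == 2:
            bounds[(i, j)] = (4.5, 7.2)   # d_(i,i+2) confined to 4.5-7.2 A
        else:
            bounds[(i, j)] = (0.0, float("inf"))  # placeholder: unconstrained
    return bounds


def count_by_separation(n, lo, hi):
    """Count pairs (i, j), i < j, whose sequence separation satisfies lo <= j - i <= hi."""
    return sum(1 for i, j in combinations(range(1, n + 1), 2) if lo <= j - i <= hi)


if __name__ == "__main__":
    bounds = default_bounds(N_RESIDUES)
    print(len(bounds))                              # 1653 = 58 * 57 / 2
    print(count_by_separation(N_RESIDUES, 5, 20))   # 728 medium-range pairs
    print(count_by_separation(N_RESIDUES, 21, 57))  # 703 long-range pairs
```

Under these counts, the ∼80 exactly known distances mentioned in the abstract correspond to roughly 5% of the 1653 pairs in the molecule.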

Original language: English
Pages (from-to): 961-969
Number of pages: 9
Journal: Macromolecules
Volume: 14
Issue number: 4
Publication status: Published - 1981
Externally published: Yes

Fingerprint

Conformations
Proteins
Molecules
Aprotinin
Polypeptides
Atoms
Peptides

ASJC Scopus subject areas

  • Materials Chemistry

Cite this

On the use of distance constraints to fold a protein. / Wako, Hiroshi; Scheraga, Harold A.

In: Macromolecules, Vol. 14, No. 4, 1981, p. 961-969.

@article{02bc28995c7f4408a8d94b4a9e022042,
title = "On the use of distance constraints to fold a protein",
abstract = "A simple method is presented to assess the information that is provided by distance constraints for pairs of residues in proteins. The probability that the distance dij between the Cα atoms of residues i and j lies within a given range is computed for all N(N - 1)/2 pairs in a molecule of N residues, and a quantity H is defined in terms of these probabilities; H is a measure of the ambiguity in the computed conformation of the molecule (consistent with the given distance constraints) and is related to the root-mean-square deviation of the computed conformation from the native one. The quantity H is used to determine the number, kind, and quality of the distance constraints required to define the conformation of a protein within given limits of error, using the 58-residue molecule bovine pancreatic trypsin inhibitor as an illustration. For example, to obtain the computed conformation with a root-mean-square deviation of less than 2 {\AA} from the native conformation, the values of dij of more than ∼80 pairs (half of them with 5 ≤ |i - j| ≤ 20 and the other half with 21 ≤ |i - j| ≤ 57) must be known exactly, or of more than ∼150 pairs (half of them with 5 ≤ |i - j| ≤ 20 and the other half with 21 ≤ |i - j| ≤ 57) must be known with an error no greater than ∼2 {\AA}; alternatively, the same root-mean-square deviation of less than 2 {\AA} from the native structure can be achieved by the computed conformation if more than ∼160 pairs are chosen so that 20 {\AA} is assigned as the lower limit for half of these dij's (for those pairs in the native protein that are separated by ≥20 {\AA}) and 10 {\AA} is assigned as the upper limit for the other half of these dij's (for those pairs in the native protein that are separated by ≤10 {\AA}). In all of the above examples, all values of di,i+1 were fixed at 3.8 {\AA}, and all values of di,i+2 were confined to the range 4.5-7.2 {\AA} (the minimum and maximum possible values for a polypeptide chain). 
We also examined the kind of constraints (in terms of their distance both along the chain and through space) that are most effective to obtain a small root-mean-square deviation. For a given number of constraints, information about pairs with large |i - j| or small dij is more effective in determining the conformation than is information about pairs with small |i - j| or large dij. It is found, however, that information that includes both small and large |i - j| or both small and large dij is the most effective.",
author = "Hiroshi Wako and Scheraga, {Harold A.}",
year = "1981",
language = "English",
volume = "14",
pages = "961--969",
journal = "Macromolecules",
issn = "0024-9297",
publisher = "American Chemical Society",
number = "4",

}

TY - JOUR

T1 - On the use of distance constraints to fold a protein

AU - Wako, Hiroshi

AU - Scheraga, Harold A.

PY - 1981

Y1 - 1981

N2 - A simple method is presented to assess the information that is provided by distance constraints for pairs of residues in proteins. The probability that the distance dij between the Cα atoms of residues i and j lies within a given range is computed for all N(N - 1)/2 pairs in a molecule of N residues, and a quantity H is defined in terms of these probabilities; H is a measure of the ambiguity in the computed conformation of the molecule (consistent with the given distance constraints) and is related to the root-mean-square deviation of the computed conformation from the native one. The quantity H is used to determine the number, kind, and quality of the distance constraints required to define the conformation of a protein within given limits of error, using the 58-residue molecule bovine pancreatic trypsin inhibitor as an illustration. For example, to obtain the computed conformation with a root-mean-square deviation of less than 2 Å from the native conformation, the values of dij of more than ∼80 pairs (half of them with 5 ≤ |i - j| ≤ 20 and the other half with 21 ≤ |i - j| ≤ 57) must be known exactly, or of more than ∼150 pairs (half of them with 5 ≤ |i - j| ≤ 20 and the other half with 21 ≤ |i - j| ≤ 57) must be known with an error no greater than ∼2 Å; alternatively, the same root-mean-square deviation of less than 2 Å from the native structure can be achieved by the computed conformation if more than ∼160 pairs are chosen so that 20 Å is assigned as the lower limit for half of these dij's (for those pairs in the native protein that are separated by ≥20 Å) and 10 Å is assigned as the upper limit for the other half of these dij's (for those pairs in the native protein that are separated by ≤10 Å). In all of the above examples, all values of di,i+1 were fixed at 3.8 Å, and all values of di,i+2 were confined to the range 4.5-7.2 Å (the minimum and maximum possible values for a polypeptide chain). 
We also examined the kind of constraints (in terms of their distance both along the chain and through space) that are most effective to obtain a small root-mean-square deviation. For a given number of constraints, information about pairs with large |i - j| or small dij is more effective in determining the conformation than is information about pairs with small |i - j| or large dij. It is found, however, that information that includes both small and large |i - j| or both small and large dij is the most effective.

UR - http://www.scopus.com/inward/record.url?scp=0013203662&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0013203662&partnerID=8YFLogxK

M3 - Article

VL - 14

SP - 961

EP - 969

JO - Macromolecules

JF - Macromolecules

SN - 0024-9297

IS - 4

ER -