Positional structural displacements are the fundamental variables that determine bioactivity differences among individual proteins

In other words, sharing a similar structural folding pattern is a necessary condition for all members of a protein family. Structural conservation is therefore more important than conservation of amino acid composition. The α-amylase protein family, with an average sequence length of 420 amino acids, is a good example. Of these 420 residues, only 8 to 10 are absolutely conserved; all the others may vary to some degree. By contrast, the proteins of the α-amylase family share a highly conserved structural region, the (β/α)8 TIM barrel, while all other structural regions may differ. The differences in biological activity among individual proteins in a family are thus determined not only by amino acid mutations but also by structural differences.

For example, all types of neuraminidase (NA) of influenza A viruses, which are the drug target of oseltamivir and zanamivir, share the same 3D folding pattern. However, small structural differences at the 150-loop among NA subtypes may cause drug resistance. At the same time, these differences at the 150-loop are the structural basis for designing effective drugs against a specific subtype of influenza virus. Previous statistical analyses of the functional evolution of protein families have focused mostly on amino acid conservation and mutation. In this study a computational approach, structural position correlation analysis (SPCA), is developed to predict mutual correlations between structural segments and positions, and to find the signal communication network in a protein family. We expect that the SPCA approach may find applications in protein engineering and in structure-based rational drug design. Structural conservation is a necessary condition for all members of a protein family, and local structural differences may be responsible for the functional differences among individual proteins. Including structural data in the statistical analysis of a protein evolutionary family can reveal useful information that methods based only on amino acid sequence and frequency cannot.
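The idea of correlating structural positions across a family can be illustrated with a small sketch. It assumes that each protein has already been aligned and superimposed, and that a displacement magnitude per aligned position is available. The text above does not give the SPCA statistic, so plain Pearson correlation is used here as a stand-in. The names `pearson`, `position_correlations` and the toy `disp` matrix are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a position that never moves carries no correlation signal
    return sxy / sqrt(sxx * syy)

def position_correlations(disp):
    """disp[i][j] is the displacement of position j in protein i (assumed input).

    Returns a square matrix whose entry [a][b] is the correlation, across
    family members, between the displacements at positions a and b.
    """
    n_pos = len(disp[0])
    columns = [[row[j] for row in disp] for j in range(n_pos)]
    return [[pearson(columns[a], columns[b]) for b in range(n_pos)]
            for a in range(n_pos)]

# Toy data (invented for illustration): 4 proteins, 4 aligned positions.
disp = [
    [0.5, 1.0, 2.0, 4.0],
    [0.5, 2.0, 4.0, 3.0],
    [0.5, 3.0, 6.0, 2.0],
    [0.5, 4.0, 8.0, 1.0],
]
corr = position_correlations(disp)
```

In this toy matrix, positions 1 and 2 move together, positions 1 and 3 move in opposition, and position 0 never moves. Pairs with large absolute correlation would be the candidates for a communication network between positions.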

The theoretical implications of the SPCA approach are summarized as follows. The standard protein P of a protein family is defined so that the coordinates of each position are the average coordinates of the corresponding residues in all proteins, and the residue at each position is the most frequent amino acid. It retains the common structural features shared by all members of the family. The most conserved positions form the structural core, and the amino acids at these positions perform the biological activity. The residues at other positions provide the physicochemical environment for the functional residues. The influence of non-functional residues on the functional residues is determined not only by their amino acid types but also by their position displacements.
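The definition of the standard protein can be made concrete with a minimal sketch. It assumes the sequences are already aligned to equal length and the coordinates (one point per position) are already superimposed in a common frame. The function names and the toy data are hypothetical, and measuring displacement as the Euclidean distance from the averaged position is an assumption made for illustration.

```python
from collections import Counter
from math import sqrt

def standard_protein(sequences, coords):
    """Build the standard protein of a family.

    sequences: aligned, equal-length strings, one per protein.
    coords: coords[i][j] is the (x, y, z) of position j in protein i,
            assumed to be superimposed beforehand.
    Returns (consensus sequence, list of mean coordinates per position).
    """
    n_prot = len(sequences)
    n_pos = len(sequences[0])
    consensus = []
    mean_coords = []
    for j in range(n_pos):
        # Most frequent amino acid at this aligned position.
        column = [seq[j] for seq in sequences]
        consensus.append(Counter(column).most_common(1)[0][0])
        # Average coordinates of the corresponding residues.
        mean_coords.append(tuple(
            sum(coords[i][j][k] for i in range(n_prot)) / n_prot
            for k in range(3)
        ))
    return "".join(consensus), mean_coords

def displacements(coords, mean_coords):
    """Distance of every residue from its position in the standard protein."""
    return [
        [sqrt(sum((p[k] - m[k]) ** 2 for k in range(3)))
         for p, m in zip(protein, mean_coords)]
        for protein in coords
    ]

# Toy family (invented for illustration): 3 proteins, 3 aligned positions.
sequences = ["ADK", "AEK", "ADR"]
coords = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
    [(0, 0, 0), (1, 3, 0), (2, 0, 0)],
    [(0, 0, 0), (1, 0, 0), (2, 0, 3)],
]
consensus, mean_coords = standard_protein(sequences, coords)
disp = displacements(coords, mean_coords)
```

Here the first position is identical in residue type and location across the family, so it behaves like a core position, while the other two positions show both amino acid variation and nonzero displacement.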
