The sequence-to-structure relation is one of the major subjects in the analysis of the protein folding problem. A pair of small proteins (domains) of similar structure (β-hairpin/α-helix/β-hairpin), generated by chains of similar length (about 60 amino acids) with very low sequence similarity (15%), is the object of a comparative analysis of 3D structure. The criterion for similarity estimation is the status of the polypeptide chain with respect to the hydrophobic core structure. The fuzzy oil drop model is applied to reveal the differentiated status of fragments of well-defined secondary structure. This analysis allows the structure to be interpreted in terms other than the geometric ones on which secondary structure classification is based. The two compared, highly similar proteins appear to be different with respect to the hydrophobic core structure.

Keywords: hydrophobicity; protein structure; protein structure prediction; secondary structure.

*Corresponding author: Irena Roterman, Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, 31-530 Krakow, Lazarza 16, Poland, E-mail: email@example.com
Mateusz Banach: Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, 31-530 Krakow, Lazarza 16, Poland
Leszek Konieczny: Medical Biochemistry, Jagiellonian University Medical College, 31-034 Krakow, Kopernika 7, Poland

Introduction

The protein folding problem has a long history of research. The Critical Assessment of Structure Prediction (CASP) project (http://predictioncenter.org/), organized to monitor the progress in this discipline, has produced results that are not regarded as satisfactory. The recently organized project WeFold, which is addressed to the top teams in this discipline, has not solved the problem to a satisfactory degree. It seems that a new model is necessary to give a new vision of the protein folding process. The fuzzy oil drop model is proposed to reveal a hidden form of order. The order is understood as a hydrophobicity distribution organized according to an idealized distribution: a high concentration in the center of the molecule, with a gradual decrease as the distance from the center increases, reaching zero level on the surface of the protein. A molecule following such an organization could be very well soluble in a water environment; however, such a molecule seems to be completely deprived of any form of activity. Local discordance of the "aim-oriented" category introduces locally higher entropy and, in consequence, a potential ability for structural changes. The problem of similar sequence versus similar structure can also be discussed as the problem of dissimilar sequence versus similar structure. A pair of proteins representing the latter case is discussed in this paper to show that the geometric interpretation of structural forms is not the only one to be considered in the context of protein structure and function.

Materials and methods

Data

Two proteins are the objects of analysis in this paper: 1PGB (immunoglobulin binding domain of streptococcal protein G) and 1HZ6 (protein binding, B1 domain of protein L from Peptostreptococcus magnus with a tyrosine substituted by tryptophan). These two proteins represent a common structural form, which is β-hairpin/α-helix/β-hairpin. The sequence similarity is very low (about 15%) [6, 7].

Fuzzy oil drop model

The fuzzy oil drop model is a modification of the oil drop model. The oil drop model, which is of a discrete form, is extended to a continuous model. The two zones in the protein molecule of the discrete model (a surface shell of hydrophilic character and a hydrophobic core in the central part of the molecule) are changed to a continuous form expressed by the 3D Gauss function. The maximum of this function is localized in the center of the ellipsoid.
The values of this function decrease with increasing distance from the center of the molecule, reaching zero level on the surface. The distance between the center and the surface is expressed by the Gauss function parameter, calculated for each direction independently. The distribution of hydrophobicity expressed by the 3D Gauss function represents the idealized hydrophobicity in an ideal hydrophobic core.

The distribution of observed hydrophobicity in a particular molecule may differ from the idealized distribution. The observed distribution is calculated according to the function proposed by Levitt. It is the effect of interresidual hydrophobic interaction and depends on the interresidual distance and the intrinsic hydrophobicity of the interacting residues. Any hydrophobicity scale can be applied to calculate the observed distribution. Each residue is represented by its "effective atom", which is the averaged position of all atoms present in the amino acid. The theoretical as well as the observed hydrophobicity is calculated for these points. After normalization (of the sum of all theoretical values and of the sum of all observed hydrophobic interactions), both distributions can be compared, revealing effective atoms of different status. For example, where high hydrophobicity is expected in the central part, a hydrophilic residue located there shows low observed hydrophobic interaction. Such differences reveal the status of each residue in the protein molecule.

The differences can be expressed quantitatively using the Kullback-Leibler entropy. It expresses the distance between the target distribution, which is the idealized distribution expressed by the 3D Gauss function, and the empirical distribution, which is the observed distribution.
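The two distributions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the effective atoms are assumed to be already translated and rotated so that the ellipsoid center sits at the origin with its axes along x, y, z, and the intrinsic hydrophobicity values are assumed to be non-negative. The text does not give the Levitt function explicitly; the polynomial and the 9 Å cutoff used below are the form commonly quoted in the fuzzy oil drop literature and should be treated as an assumption.

```python
import math

def effective_atom(atoms):
    """Averaged position of all atoms of one residue (list of (x, y, z))."""
    n = len(atoms)
    return tuple(sum(a[k] for a in atoms) / n for k in range(3))

def theoretical_hydrophobicity(points, sigmas):
    """Idealized distribution: 3D Gaussian centered at the origin.

    points -- effective atoms in the ellipsoid's own frame (assumption)
    sigmas -- one Gauss parameter per direction (sx, sy, sz)
    Returns values normalized to sum to 1.
    """
    raw = [
        math.exp(-sum(p[k] ** 2 / (2.0 * sigmas[k] ** 2) for k in range(3)))
        for p in points
    ]
    total = sum(raw)
    return [v / total for v in raw]

def levitt_weight(r, cutoff=9.0):
    """Distance-dependent weight; assumed form, zero beyond the cutoff."""
    if r > cutoff:
        return 0.0
    q = (r / cutoff) ** 2
    return 1.0 - 0.5 * (7.0 * q - 9.0 * q ** 2 + 5.0 * q ** 3 - q ** 4)

def observed_hydrophobicity(points, intrinsic, cutoff=9.0):
    """Observed distribution from pairwise interactions of effective atoms.

    intrinsic -- per-residue hydrophobicity from any scale (non-negative here)
    The self-interaction term is skipped (an assumption of this sketch).
    Returns values normalized to sum to 1.
    """
    raw = []
    for i, pi in enumerate(points):
        total_i = 0.0
        for j, pj in enumerate(points):
            if i != j:
                weight = levitt_weight(math.dist(pi, pj), cutoff)
                total_i += (intrinsic[i] + intrinsic[j]) * weight
        raw.append(total_i)
    total = sum(raw)
    return [v / total for v in raw]
```

Both functions return lists of the same length, one value per residue, so the theoretical and observed profiles can be compared position by position.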
Because the value of entropy cannot be interpreted by itself, the distance is also calculated for a second target distribution, which is opposite to the 3D Gauss function in the sense of a complete lack of hydrophobicity concentration. This second target distribution is constant: each residue is assigned the same value, equal to 1/N, where N is the number of amino acids in the protein. Accordingly, O|T is introduced to express the distance between the "observed" and the "theoretical" distribution, and O|R to express the distance between the observed distribution and the constant one. To avoid dealing with two quantities, we introduce the parameter RD, which is the relative distance of O|T versus O|R. Domains with RD<0.5 are interpreted as domains with a well-defined hydrophobic core, whereas domains with RD>0.5 are interpreted as structures deprived of a high concentration of hydrophobicity in the central part of the molecule. A detailed description of the fuzzy oil drop model is available elsewhere.

The overlapped open reading frame (OORF) calculation is similar to the ORF technique applied in nucleic acid analysis. The reading frame, which is a fragment of the polypeptide chain, is also called the "window". The window is of a fixed size (usually 10 or 20 amino acids, depending on the length of the polypeptide chain). The window is moved along the chain starting from positions 1–10, with a step of 1 amino acid, so that consecutive windows overlap. The RD value can be calculated for each window, revealing the status of sequential fragments of 10 amino acids independently of the secondary structure of a particular fragment of the chain.

Results

The RD values calculated for the two compared proteins (domains) reveal a similar status, expressed by RD<0.5. This means that both domains contain a high concentration of hydrophobicity in the center of the molecule and a hydrophilic shell on the surface.
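The RD parameter and the OORF window scan can be sketched as follows. This is an illustration under assumptions, not the authors' code: all function names are hypothetical, RD is taken as O|T / (O|T + O|R), which is consistent with the 0.5 threshold described in the text, the logarithm base is 2, and each window is renormalized before its RD is computed. The input distributions are assumed to be strictly positive where needed.

```python
import math

def normalize(values):
    """Scale a list of non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def kl_divergence(p, q):
    """Kullback-Leibler distance of p from q (terms with p_i == 0 skipped)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def relative_distance(observed, theoretical):
    """RD = O|T / (O|T + O|R); R is the constant distribution 1/N."""
    n = len(observed)
    uniform = [1.0 / n] * n
    o_t = kl_divergence(observed, theoretical)
    o_r = kl_divergence(observed, uniform)
    if o_t + o_r == 0.0:
        return float("nan")  # observed matches both references
    return o_t / (o_t + o_r)

def oorf_profile(observed, theoretical, window=10):
    """RD for every overlapping window, moved with a step of 1 residue."""
    profile = []
    for start in range(len(observed) - window + 1):
        o = normalize(observed[start:start + window])
        t = normalize(theoretical[start:start + window])
        profile.append(relative_distance(o, t))
    return profile

def discordant_windows(profile, threshold=0.5):
    """1-based start positions of windows whose RD exceeds the threshold."""
    return [i + 1 for i, rd in enumerate(profile) if rd > threshold]
```

For a chain of N residues and a window of 10, `oorf_profile` returns N - 9 values; `discordant_windows` then lists the starting positions of the windows with RD>0.5, which is how fragments are reported in the Results.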
However, the fragments of defined secondary structure reveal a different status. In 1HZ6, two β-fragments represent the status of RD>0.5. In 1PGB, one helical fragment represents the status above 0.5. The β-sheet in both proteins represents the status expressed by RD<0.5. This means that the β-sheets as a whole represent the distribution of hydrophobicity in accordance with the expectations.

A further comparative analysis uses the OORF method; a detailed description can be found elsewhere. The results of this analysis are shown in Figure 1. The profile shows the RD values for fragments of 10 amino acids, starting with positions 1–10 in each chain. O|T and O|R, and from them the RD value, are calculated for each 10-amino acid window. The fragments with RD>0.5 are distinguished by colors according to the profiles. The fragments of RD>0.5 for 1PGB are 12–23 and 40–43. The fragment of RD>0.5 in 1HZ6 is 4–16. One fragment (12–16) represents a similar status, expressed by RD>0.5, in both proteins. It should be underlined that each position in these ranges denotes a window, that is, the given position and the nine residues following it.

The profiles given in Figure 1 visualize the different status of certain fragments of the polypeptide chain in the compared domains. The discordance of the recognized fragments (RD>0.5) may be interpreted as lower stability. The local disorder versus the ordered hydrophobic core suggests locally higher flexibility. Taking into account the hydrophobic core as the factor responsible for tertiary stabilization, local discordance may suggest lower stabilization, at least from the point of view of the hydrophobic core.

Figure 1: RD profiles for 1PGB (blue) and 1HZ6 (red) in the OORF system (window of 10 amino acids; horizontal axis: windows, vertical axis: RD). The fragments of RD>0.5 are distinguished using colors matching the colors of the profiles. The blue surface distinguishes the fragment in 1PGB characterized by RD>0.5 in the OORF system. The red surface identifies the discordant (RD>0.5) fragment in 1HZ6 (OORF system). The fragment distinguished as turquoise is common to both proteins (domains). This fragment is characterized in both domains as disordered and in consequence less stable (taking the structure of the hydrophobic core as the criterion for tertiary structure stability).

The 3D representation of fragments of discordant status (versus the idealized distribution) shown in Figure 2 reveals different areas of lower stability in the two compared molecules. Figure 3 visualizes the fragments of lower stability recognized using the OORF calculation system. A comparison of the 3D graphics makes it clear that different areas are "coded" to be more or less stable. It can be interpreted that the final product (the 3D structure), despite a similar geometric presentation, codes a different potential for local structural flexibility, which is critical for activity requiring specific local dynamics.

Figure 2: 3D structure of (A) 1HZ6 and (B) 1PGB with secondary fragments characterized by RD>0.5 given in red. The fragments distinguished as red visualize the fragments of lower stability.

Figure 3: 3D structure of (A) 1HZ6 and (B) 1PGB with fragments distinguished as red identified by RD>0.5 calculated in the OORF system.

Table 1: RD values for the two compared domains and RD values for fragments distinguished according to secondary forms.
             1HZ6                  1PGB
Row          Fragment   RD         Fragment   RD
Domain                  0.362                 0.438
Fragment     4–12       0.554*     1–9        0.466
Fragment     16–24      0.603*     12–20      0.455
Fragment     25–40      0.277      22–37      0.501*
Fragment     41–45      0.269      38–41      0.180
Fragment     46–52      0.491      42–46      0.236
Fragment     53–55      0.104      47–49      0.634*
Fragment     56–63      0.346      50–56      0.126
β-sheet                 0.474                 0.339

Fragments of α- and β-forms are given for both proteins. The secondary fragments are taken according to the PDBSum database. The values marked with an asterisk distinguish the fragments with RD>0.5.

In particular, the loops (Table 1) reveal a possible conformational change in the area of the β-hairpin in 1PGB, whereas in 1HZ6 the potential elasticity can be expected in the loop linking the helix and one β-fragment, namely the one closest to the helix, which is the border β-strand of the β-sheet.

Discussion and conclusions

The two proteins whose structure is discussed in this paper are further examples showing that our traditional classification and interpretation of protein structure is not sufficient. Geometry seems to be the dominant factor in structure classification. All databases oriented toward structure recognition take as the criterion the geometry of ordered fragments (secondary structure) or the mutual relation between secondary structures (supersecondary structure). The fuzzy oil drop model delivers another criterion of structural order. The specificity of particular fragments is related not only to the placement of atoms (residues) in space; it is also expressed as their status versus the global minimum. The fuzzy oil drop model expresses such a global minimum for hydrophobic interaction. It reveals the presence of differentiated status. In consequence, the potential for local elasticity and dynamic forms appears different, expressing disorder in the hydrophobic core organization.
Conversely, the intrinsically disordered fragments (recognized on the basis of secondary and supersecondary structure organization) appear to be ordered according to the hydrophobic core organization. The two discussed proteins are members of the large group of proteins recognized as differentiated with respect to the hydrophobic core. The best example is the immunoglobulin family, together with the β-sandwich form in other proteins (not necessarily limited to immunoglobulins), which appears to represent a significantly different status versus the hydrophobic core. A similar situation is observed for the flavodoxin fold.

The structure of the hydrophobic core, treated as responsible for tertiary stabilization, may be discussed in the context of SS-bonds, which are also treated as significantly influencing the structure and stability of proteins. The fragments limited by the Cys positions generating SS-bonds define fragments whose status is in accordance with the expected hydrophobic core in some cases and discordant in others. However, this problem is not the focus of this paper.

One may conclude that the pseudoscientific arguments of intelligent design in proteins are of high level. The activity of proteins requires the presence of dynamic forms that must be of high specificity. Thus, the conditions for potential dynamic forms must be controlled. The SS-bonds together with hydrophobic core stability seem to generate a set of conditions, creating in this way the controlled conditions for specific activity. The influence of the water environment, expressed in the force fields traditionally used for protein structure prediction in the form of pairwise interactions (atoms of protein and atoms of water), seems insufficient to express the presence of water. This presence should be active, influencing the entire molecule and directing the folding process toward the generation of the hydrophobic core.

Acknowledgements: Many thanks to Anna Śmietańska for technical support.
Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission. Research funding: The work was financially supported by Collegium Medicum grant system K/ZDS/006363. Employment or leadership: None declared. Honorarium: None declared. Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.
Bio-Algorithms and Med-Systems – de Gruyter
Published: Sep 1, 2016