β-sheet Topology Prediction Using Probability-based Integer Programming

Document Type: Bioinformatics-Naghibzadeh

Authors

Ferdowsi University of Mashhad

Abstract

β-sheet topology prediction is a major unresolved problem in computational biology and a challenging intermediate step toward protein tertiary structure prediction. Several methods have been proposed for determining β-sheet topology. Here, two ab initio probability-based methods, "BetaProbe1" and "BetaProbe2", are used to determine the β-sheet topology. Both methods take into account the stability and frequency of β-strand pairwise interactions and of β-sheet conformations. To favor the more frequently observed interactions between β-strand pairs, the score of each interaction combines the pairwise alignment probability with the probability that the β-strand pairwise interaction occurs. The pairwise alignment probability is determined more accurately by means of a dynamic programming approach. Integer programming optimization is then combined with the probabilities of the β-strand pairwise interactions to determine the β-sheet topology, and the β-sheet conformation probability is included so that more frequently observed conformations are more likely to be selected. Experimental results show that BetaProbe1 and BetaProbe2 significantly outperform the most recent integer programming-based method in β-sheet topology prediction.
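As a rough illustration of the optimization step described in the abstract, the following minimal sketch scores each β-strand pair by combining an alignment probability with an interaction probability, and then selects the set of pairings with the highest total score under a partner-count constraint. It is not the BetaProbe formulation: the product rule in `pair_score`, the log-odds objective, the limit of two partners per strand, the function names, and the toy probabilities are all assumptions made for this example. The 0/1 selection problem is solved here by exhaustive enumeration rather than by an integer programming solver, which is only feasible for a handful of strands.

```python
from itertools import combinations
from math import log


def pair_score(p_align, p_interact):
    """Assumed combination rule: product of the two probabilities."""
    return p_align * p_interact


def select_pairs(n_strands, scores, max_partners=2):
    """Pick the strand pairings that maximize the summed log-odds.

    Every pair is a 0/1 decision; a strand may take part in at most
    `max_partners` pairings (assumed constraint). Scores must lie
    strictly between 0 and 1. Brute force stands in for an integer
    programming solver in this sketch.
    """
    pairs = sorted(scores)
    best_value, best_subset = 0.0, ()
    for k in range(1, len(pairs) + 1):
        for subset in combinations(pairs, k):
            degree = [0] * n_strands
            for i, j in subset:
                degree[i] += 1
                degree[j] += 1
            if max(degree) > max_partners:
                continue  # violates the partner-count constraint
            value = sum(log(scores[p] / (1.0 - scores[p])) for p in subset)
            if value > best_value:
                best_value, best_subset = value, subset
    return list(best_subset)


# Toy input (hypothetical): (alignment probability, interaction probability)
toy = {
    (0, 1): (0.95, 0.9),
    (1, 2): (0.9, 0.8),
    (2, 3): (0.8, 0.8),
    (0, 3): (0.5, 0.3),
    (0, 2): (0.4, 0.2),
    (1, 3): (0.3, 0.2),
}
toy_scores = {pair: pair_score(a, b) for pair, (a, b) in toy.items()}
print(select_pairs(4, toy_scores))
```

Using log-odds in the objective means that a pair whose combined score is below 0.5 contributes negatively, so it is selected only if the constraints force it; with raw probabilities the objective would always favor adding more pairs.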

Keywords

