The treatment of influenza virus-infected patients is mainly based on increasing resistance against currently approved

We used compositional features, including amino acid composition (AAC), split amino acid composition (SAAC), dipeptide composition (DPC) and gapped dipeptide composition, as well as evolutionary information from the Position-Specific Scoring Matrix (PSSM) profiles obtained from the Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST), to train the classifiers. Given the immense biomedical merit and therapeutic potential of cyclin-dependent kinase inhibitors (CDKIs), we anticipate that this will be a useful tool for the research community.

Three iterations of PSI-BLAST were carried out at an E-value threshold of 0.001. Each sequence was used once as the query while the remaining sequences served as the reference database, and this was repeated for every sequence. Ten sequences returned no significant hit, which indicates that general similarity-based searches do not reliably identify CDKIs and that a method specific to these proteins is needed. We therefore set out to explore machine-learning methods based on various protein features for the prediction of CDKI proteins.

PSSM-based SVM classifiers have been employed for many classification problems in biology and are known to perform well on highly diverse protein families such as lipocalins and nucleic acid binding proteins. Apart from capturing residue composition, PSSM profiles encapsulate useful information about the conservation of residues at crucial positions within the protein sequence, because amino acid residues with similar physico-chemical properties tend to be highly conserved in evolution due to selective pressure. This is the first report of a machine-learning-based method for the identification of CDKI protein sequences. Previously, such approaches have been applied to the computational identification of other components of the cell cycle, including cyclins and CDK phosphorylation substrates.
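The compositional features named above can be illustrated with a short Python sketch. This is not the authors' code: the function names, the normalisation by standard residues only, and the 25-residue terminal windows in `split_aac` are assumptions made for illustration, since the text does not state how the sequences were split. PSSM-derived features are left out because they require PSI-BLAST output.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs
DIPEPTIDE_INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    standard = [r for r in seq.upper() if r in AMINO_ACIDS]
    n = len(standard)
    if n == 0:
        return [0.0] * 20
    return [standard.count(a) / n for a in AMINO_ACIDS]


def split_aac(seq, n_term=25, c_term=25):
    """Split AAC: separate compositions for the N-terminal, middle and
    C-terminal parts, concatenated into 60 values.

    The window sizes are hypothetical defaults. For sequences shorter than
    n_term + c_term the terminal windows overlap and the middle is empty.
    """
    head = seq[:n_term]
    middle = seq[n_term:len(seq) - c_term]
    tail = seq[-c_term:]
    return aac(head) + aac(middle) + aac(tail)


def gapped_dpc(seq, gap=0):
    """Dipeptide composition for residue pairs separated by `gap` positions
    (400 values). gap=0 gives the ordinary dipeptide composition."""
    seq = seq.upper()
    counts = [0] * 400
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in DIPEPTIDE_INDEX:  # skip pairs with non-standard residues
            counts[DIPEPTIDE_INDEX[pair]] += 1
            total += 1
    if total == 0:
        return [0.0] * 400
    return [c / total for c in counts]
```

Each function returns a fixed-length vector regardless of sequence length, which is what allows proteins of different sizes to be fed to the same SVM or ANN classifier.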
Our tool is complementary to signature-based detection of CDK inhibitors, since we show that Pfam signatures miss a significant number of the already known CDK inhibitors from the non-redundant set. However, such machine-learning methods do come at the cost of some false positive predictions, which should be kept to a minimum. We tested this on an independent dataset comprising randomly selected non-CDKIs as well as kinases and phosphatases, which are the most likely candidates for false positive predictions, and indeed obtained a low false positive rate. In this study, we observed that SVM-based methods are more efficient than ANN in discriminating CDKI from non-CDKI sequences, despite the imbalance in size between the positive and negative training datasets. This was observed with all feature types. For an experimenter, a judicious approach would be to reduce the number of candidate CDKIs to be characterized by raising the SVM score threshold, so that only the topmost candidates are taken forward for further work. Supplementing these predictions with complementary evidence such as domain knowledge and sub-cellular localization may provide inroads to the discovery of novel CDKIs and further our understanding of cell cycle regulation and other cellular phenomena. In future, the availability of more sequences and the inclusion of more features may further enhance the prediction accuracy.

While the H1N1 influenza pandemic was ongoing in 2010, efforts were made to develop new antiviral agents for influenza treatment with an improved spectrum of activity or better pharmacologic profiles than current treatments.
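Returning to the advice on raising the SVM score threshold for the CDKI classifier, the trade-off can be sketched in a few lines of Python. The scores, labels, protein names and function names below are invented for illustration; they are not results from the study. The sketch assumes that a higher score means a more confident CDKI prediction.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (sensitivity, false_positive_rate) when every sequence with
    score >= threshold is predicted to be a CDKI. Labels: 1 = CDKI, 0 = non-CDKI."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return sensitivity, false_positive_rate


def top_candidates(scored_proteins, threshold):
    """Keep (name, score) pairs at or above the threshold, best score first."""
    kept = [(name, s) for name, s in scored_proteins if s >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical SVM decision scores and true labels for five sequences.
scores = [1.2, 0.4, -0.3, 0.1, -1.0]
labels = [1, 1, 0, 0, 0]

for t in (0.0, 0.5):
    sens, fpr = rates_at_threshold(scores, labels, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} false_positive_rate={fpr:.2f}")
```

In this toy data, moving the threshold from 0.0 to 0.5 removes the single false positive but also loses one true CDKI, which is the cost an experimenter accepts when shortlisting only the top candidates.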
