The accurate annotation of an unknown protein sequence depends on extant data of template sequences. These data may be empirical or sets of reference sequences, and provide an exhaustive pool of probable functions. Individual methods of predicting dominant function have shortcomings such as varying degrees of inter-sequence redundancy, arbitrary domain inclusion thresholds, heterogeneous parameterization protocols, and ill-conditioned input channels. Here, I present a rigorous theoretical derivation of the steps of a generic algorithm that integrates several statistical methods to predict the dominant function of unknown protein sequences. The accompanying mathematical proofs, interval definitions, analysis, and numerical computations are meant to offer insights into the specificity and accuracy of predictions, and to provide details of the operational mechanisms involved in the integration and its ensuing rigor. The algorithm uses numerically modified raw hidden Markov model scores of well-defined sets of training sequences and clusters them on the basis of known function. The results are then fed into an artificial neural network, the predictions of which can be refined using the available data. This pipeline is trained recursively and can be used to discern the dominant principal function and thereby annotate an unknown protein sequence. Whilst the approach is complex, the specificity of the final predictions can help laboratory workers design their experiments with greater confidence.
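The pipeline outlined above (modified HMM scores, grouped by known function, passed to a neural network) can be illustrated with a minimal sketch. It is not the article's algorithm: the score values are invented, the "numerical modification" is assumed to be a simple z-score, grouping by function is represented only by the class labels, the network is a single softmax layer, and the recursive retraining step is omitted. All function and variable names are hypothetical.

```python
import math
import random

# Hypothetical raw HMM scores: each row is one training sequence scored
# against three profile HMMs (one per functional class). Labels are the
# known functions. All numbers are invented for illustration.
TRAIN_SCORES = [
    [120.0, 15.0, 8.0], [110.0, 22.0, 5.0], [98.0, 30.0, 12.0],
    [14.0, 130.0, 9.0], [20.0, 115.0, 18.0], [25.0, 101.0, 7.0],
    [9.0, 12.0, 140.0], [16.0, 25.0, 122.0], [11.0, 19.0, 105.0],
]
TRAIN_LABELS = [0, 0, 0, 1, 1, 1, 2, 2, 2]


def column_stats(rows):
    """Mean and standard deviation of each score column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n) or 1.0
        for col, m in zip(zip(*rows), means)
    ]
    return means, stds


def standardize(row, means, stds):
    """Assumed 'numerical modification': z-score each raw HMM score."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def softmax(zs):
    """Numerically stable softmax over a list of pre-activations."""
    top = max(zs)
    exps = [math.exp(z - top) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, biases, x):
    """Single-layer network: one softmax output unit per function."""
    return softmax([
        sum(w * xi for w, xi in zip(ws, x)) + b
        for ws, b in zip(weights, biases)
    ])


def train(xs, ys, n_classes, epochs=500, rate=0.1, seed=0):
    """Fit the layer by stochastic gradient descent on cross-entropy."""
    rng = random.Random(seed)
    n_features = len(xs[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            probs = forward(weights, biases, x)
            for k in range(n_classes):
                # Gradient of the loss with respect to pre-activation k.
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j in range(n_features):
                    weights[k][j] -= rate * grad * x[j]
                biases[k] -= rate * grad
    return weights, biases


def predict(weights, biases, x):
    """Index of the most probable (dominant) function."""
    probs = forward(weights, biases, x)
    return max(range(len(probs)), key=probs.__getitem__)


MEANS, STDS = column_stats(TRAIN_SCORES)
X = [standardize(r, MEANS, STDS) for r in TRAIN_SCORES]
W, B = train(X, TRAIN_LABELS, n_classes=3)

# An unknown sequence whose (hypothetical) raw scores favour the second HMM.
unknown = standardize([18.0, 125.0, 10.0], MEANS, STDS)
dominant = predict(W, B, unknown)
```

Standardizing each score column before training keeps the inputs on a comparable scale, which is one plausible way to address the "ill-conditioned input channels" mentioned above. In practice the raw scores would come from a profile HMM tool rather than a hard-coded table.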
Acta Biotheoretica – Springer Journals
Published: Apr 26, 2018