
Expert, Journal, and Automatic Classification of Full Texts and Annotations of Scientific Articles

Publisher
Springer Journals
Copyright
© Allerton Press, Inc., 2021. ISSN 0005-1055, Automatic Documentation and Mathematical Linguistics, 2021, Vol. 55, No. 4, pp. 178–189. Russian Text © The Author(s), 2021, published in Nauchno-Tekhnicheskaya Informatsiya, Seriya 2: Informatsionnye Protsessy i Sistemy, 2021, No. 8, pp. 15–27.
ISSN
0005-1055
eISSN
1934-8371
DOI
10.3103/s0005105521040075

Abstract

This article considers a fundamentally new information-theoretic approach to classifying scientific texts, based on compression algorithms. A comparative classification of full-text documents from arXiv.org and short annotations from Scopus showed that the proposed method reaches an accuracy of 87–92% and is, in general, not inferior to existing methods. An expert assessment confirmed these conclusions.
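
The abstract does not say which compressor or decision rule the authors use, so the following is only a minimal sketch of the general idea behind compression-based classification. It assumes zlib as the compressor and assigns a document to the class whose reference text it extends by the fewest compressed bytes. The function names and the toy corpora are hypothetical and are not taken from the article.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def classify(document: str, class_corpora: dict[str, str]) -> str:
    """Return the label whose corpus the document compresses best against.

    Assumed rule (not confirmed by the source): a document that shares
    vocabulary with a class corpus adds fewer bytes when compressed
    together with that corpus than with an unrelated one.
    """
    def extra_bytes(label: str) -> int:
        corpus = class_corpora[label]
        return compressed_size(corpus + " " + document) - compressed_size(corpus)

    return min(class_corpora, key=extra_bytes)


# Toy reference texts, invented purely for illustration.
corpora = {
    "physics": (
        "quantum field theory describes particle interactions; "
        "the gauge symmetry constrains the lagrangian; "
        "scattering amplitudes are computed with perturbation theory"
    ),
    "biology": (
        "gene expression is regulated by transcription factors; "
        "protein folding determines enzyme function; "
        "cell membranes contain lipid bilayers and receptors"
    ),
}

label = classify(
    "scattering amplitudes in quantum field theory are computed "
    "with perturbation theory",
    corpora,
)
print(label)
```

In practice the reference text for each class would be far larger than these toy strings, and zlib's 32 KB window would limit how much of it the compressor can exploit, so a different compressor may be a better fit.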

Journal

Automatic Documentation and Mathematical Linguistics, Springer Journals

Published: Jul 1, 2021

Keywords: text classification methods; data compression algorithms; scientific texts; arXiv.org; Scopus; k-nearest neighbors; logistic regression; random forests; naive Bayesian classification; support vector machines
