Classification by compression: Application of information-theory methods for the identification of themes of scientific texts

Publisher
Springer Journals
Copyright
Copyright © 2017 by Allerton Press, Inc.
Subject
Computer Science; Information Storage and Retrieval
ISSN
0005-1055
eISSN
1934-8371
DOI
10.3103/S0005105517030116

Abstract

A method for automatic classification of scientific texts based on data compression is proposed. The method was implemented and evaluated on data from the arXiv.org archive of scientific texts and from the CyberLeninka scientific electronic library (CyberLeninka.ru). In the experiments, the method correctly identified the themes of scientific texts in 75–95% of cases; its accuracy depends on the quality of the original data.
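
The abstract does not specify the compressor or the decision rule, so the following is only a minimal sketch of the general compression-based classification idea, not the authors' implementation. It assumes one reference corpus per theme and assigns a text to the theme whose corpus lets the text compress best, that is, the theme for which appending the text increases the compressed size the least. The choice of `zlib` and the names `compressed_size` and `classify` are illustrative assumptions.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed data (level 9)."""
    return len(zlib.compress(data, 9))


def classify(text: str, corpora: Mapping[str, str]) -> str:
    """Pick the theme whose reference corpus best 'explains' the text.

    Assumption: `corpora` maps a theme label to a reference corpus for
    that theme. The cost of a theme is the number of extra compressed
    bytes needed when the text is appended to that theme's corpus.
    """
    if not corpora:
        raise ValueError("at least one reference corpus is required")

    doc = text.encode("utf-8")
    best_label = None
    best_cost = None
    for label, corpus in corpora.items():
        base = corpus.encode("utf-8")
        # Extra bytes needed to encode the text given the corpus as context.
        cost = compressed_size(base + b"\n" + doc) - compressed_size(base)
        if best_cost is None or cost < best_cost:
            best_label, best_cost = label, cost
    return best_label
```

In this sketch a call such as `classify(abstract_text, {"physics": physics_corpus, "biology": biology_corpus})` returns one of the supplied labels. Note that zlib only finds repetitions within a 32 KB window, so with large reference corpora a compressor with a longer context would be a more suitable choice.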

Journal

Automatic Documentation and Mathematical Linguistics, Springer Journals

Published: Aug 19, 2017
