As the number of digital text documents grows daily, text classification is becoming increasingly challenging. Each document contains a large number of words (features), which reduces the efficiency of a classification algorithm. This article presents an optimized feature selection algorithm that reduces the number of features in order to improve the accuracy of text classification. The proposed algorithm uses noun-based filtering and a word ranking to enhance classification performance. Experiments are carried out on three benchmark datasets, and the results show that the proposed algorithm achieves the highest accuracy among the methods compared: Term Frequency-Inverse Document Frequency, Balanced Accuracy Measure, GINI Index, Information Gain, and Chi-Square.
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) – Association for Computing Machinery
Published: May 6, 2021
Keywords: Feature selection
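To make the idea of filter-style feature selection concrete, the following is a minimal sketch of Chi-Square term scoring, one of the baselines named in the abstract. It is not the article's proposed algorithm: the noun-based filtering and word-ranking steps are not specified here and are not reproduced. The function names `chi_square_scores` and `select_top_k`, the tokenized-document input format, and the one-vs-rest treatment of the target class are all assumptions made for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each term by the chi-square statistic against class `target`.

    docs   : list of token lists (assumed to be pre-tokenized)
    labels : list of class labels, one per document
    target : the class treated as "positive" (one-vs-rest, an assumption)
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos

    # Document frequency of each term inside and outside the target class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        counts = df_pos if y == target else df_neg
        counts.update(set(tokens))  # count each term once per document

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # target docs containing the term
        b = df_neg[term]   # other docs containing the term
        c = n_pos - a      # target docs without the term
        d = n_neg - b      # other docs without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

Terms that occur almost exclusively inside or outside the target class receive high scores, while terms spread evenly across classes score near zero. Keeping only the top-ranked terms shrinks the feature space before a classifier is trained.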