Social networking sites such as Facebook or Twitter attract millions of users, who post an enormous amount of content every day in the form of tweets, comments and posts. Because social network texts are usually short, learning tasks have to deal with a very high-dimensional and sparse feature space in which most features have low frequencies. Extracting useful knowledge from such noisy data is therefore challenging, which makes large-scale short-text learning in social environments one of the most relevant problems in machine learning and data mining. Feature selection is one of the best-known and most commonly used techniques for reducing the impact of a high-dimensional feature space in text learning. The literature offers a wide variety of feature selection techniques for traditional long texts and document collections. However, short texts from the social Web pose new challenges to this well-studied problem: their brevity offers too little context to extract sufficient statistical evidence about word relations (e.g. correlation), and instances usually arrive in continuous streams (e.g. the Twitter timeline), so the number of features and instances is unknown in advance, among other problems. This paper surveys feature selection techniques for short texts in both offline and online settings. It then discusses open issues and research opportunities for online feature selection over social media data.
Artificial Intelligence Review – Springer Journals
Published: Nov 15, 2016