Figure 6: Global Feature Expansion
Article 55.
Short and sparse texts such as tweets, search engine snippets, product reviews, and chat messages are abundant on the Web. Classifying such short texts into a predefined set of categories is a common problem that arises in various contexts, such as sentiment classification, spam detection, and information recommendation. The fundamental problem in short-text classification is feature sparseness: the lack of feature overlap between a trained model and a test instance to be classified. We propose ClassiNet, a network of classifiers trained to predict missing features in a given instance, to overcome the feature sparseness problem. Using a set of unlabeled training instances, we first learn binary classifiers as feature predictors, each predicting whether a particular feature occurs in a given instance. Next, each feature predictor is represented as a vertex v_i in the ClassiNet, with a one-to-one correspondence between feature predictors and vertices. The weight of the directed edge e_ij connecting a vertex v_i to a vertex v_j represents the conditional probability that, given v_i exists in an instance, v_j also exists in the same instance. We show that ClassiNets generalize word co-occurrence graphs by considering implicit co-occurrences between features. We extract numerous features from the trained ClassiNet to overcome feature sparseness. In particular, for a given instance x, we find related features in the ClassiNet that did not appear in x and append those features to the representation of x. Moreover, we propose a method based on graph propagation to find features that are indirectly related to a given short text. We evaluate ClassiNets on several benchmark datasets for short-text classification. Our experimental results show that using ClassiNet yields statistically significant improvements in short-text classification accuracy, without requiring any external resources, such as thesauri, for finding related features.
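The construction described in the abstract (vertices as feature predictors, directed edges weighted by conditional probabilities, and expansion of a sparse instance) can be sketched roughly as follows. This is an illustrative approximation, not the authors' code: edge weights are estimated directly from observed co-occurrence counts rather than from trained binary feature predictors, and all names (`build_classinet`, `expand`, `threshold`) are my own.

```python
from collections import defaultdict

def build_classinet(instances):
    """Estimate directed edge weights e_ij = P(v_j present | v_i present).

    In the full method each vertex is a trained binary feature predictor;
    here, as a simplification, edge weights come straight from observed
    feature co-occurrence counts (which the abstract notes ClassiNets
    generalize by also capturing implicit co-occurrences).
    """
    count = defaultdict(int)   # count[f]: number of instances containing f
    cooc = defaultdict(int)    # cooc[(f, g)]: instances containing both f and g
    for feats in instances:
        fs = set(feats)
        for f in fs:
            count[f] += 1
            for g in fs:
                if g != f:
                    cooc[(f, g)] += 1
    return {(f, g): c / count[f] for (f, g), c in cooc.items()}

def expand(instance, edges, threshold=0.6):
    """Append features strongly implied by the observed ones, countering
    the feature sparseness of a short test instance."""
    feats = set(instance)
    implied = {g for (f, g), w in edges.items()
               if f in feats and g not in feats and w >= threshold}
    return feats | implied
```

For example, given `edges = build_classinet([["a", "b"], ["a", "b"], ["a", "c"]])`, the edge (a, b) receives weight 2/3, and `expand(["a"], edges)` returns `{"a", "b"}`. The graph-propagation variant mentioned in the abstract would additionally follow multi-hop paths to reach indirectly related features; that step is omitted here.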
ACM Transactions on Knowledge Discovery from Data (TKDD) – Association for Computing Machinery
Published: Jun 27, 2018
Keywords: Classifier networks