Classifying dialogue in high-dimensional space

Jose P. Gonzalez-Brenes and Jack Mostow, Carnegie Mellon University

Publisher
Association for Computing Machinery
Copyright
Copyright © 2011 by ACM Inc.
ISSN
1550-4875
DOI
10.1145/1966407.1966413

Abstract

The richness of multimodal dialogue makes the space of possible features required to describe it very large relative to the amount of training data. However, conventional classifier learners require large amounts of data to avoid overfitting, or do not generalize well to unseen examples. To learn dialogue classifiers using a rich feature set and fewer data points than features, we apply a recent technique, ℓ1-regularized logistic regression. We demonstrate this approach empirically on real data from Project LISTEN's Reading Tutor, which displays a story on a computer screen and listens to a child read aloud. We train a classifier to predict task completion (i.e., whether the student will finish reading the story) with 71% accuracy on a balanced, unseen test set. To characterize differences in the behavior of children when they choose the story they read, we likewise train and test a classifier that with 73.6% accuracy infers who chose the story based on the ensuing dialogue. Both classifiers significantly outperform baselines and reveal relevant features of the dialogue.

Categories and Subject Descriptors: I.2.m [Arti
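To make the central idea concrete, the following is a minimal sketch of ℓ1-regularized logistic regression in a setting with fewer examples than features. It is not the authors' implementation: the abstract does not name a solver, feature set, or hyperparameters, so the proximal gradient (soft-thresholding) solver, the synthetic data, and every name and value below (`fit_l1_logistic`, `penalty`, `step`, `epochs`) are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, penalty=0.1, step=0.1, epochs=300):
    """Minimize mean log-loss + penalty * ||w||_1 by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - step * gj, step * penalty)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic stand-in for dialogue data (hypothetical, not Project LISTEN data):
# 30 examples, 60 features, and only feature 0 determines the label.
rng = random.Random(0)
n_examples, n_features = 30, 60
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_examples)]
y = [1 if row[0] > 0 else 0 for row in X]

weights, bias = fit_l1_logistic(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(f"{len(selected)} of {n_features} features kept; "
      f"weight on feature 0 = {weights[0]:.3f}")
```

The soft-thresholding step is what sets many weights to exactly zero, which is why this kind of penalty can both limit overfitting when features outnumber examples and indicate which features the classifier relies on.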

Journal

ACM Transactions on Speech and Language Processing (TSLP), Association for Computing Machinery

Published: May 1, 2011
