Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Querying Discriminative and Representative Samples for Batch Mode Active Learning

Querying Discriminative and Representative Samples for Batch Mode Active Learning Querying Discriminative and Representative Samples for Batch Mode Active Learning ZHENG WANG and JIEPING YE, Arizona State University Empirical risk minimization (ERM) provides a principled guideline for many machine learning and data mining algorithms. Under the ERM principle, one minimizes an upper bound of the true risk, which is approximated by the summation of empirical risk and the complexity of the candidate classifier class. To guarantee a satisfactory learning performance, ERM requires that the training data are i.i.d. sampled from the unknown source distribution. However, this may not be the case in active learning, where one selects the most informative samples to label, and these data may not follow the source distribution. In this article, we generalize the ERM principle to the active learning setting. We derive a novel form of upper bound for the true risk in the active learning setting; by minimizing this upper bound, we develop a practical batch mode active learning method. The proposed formulation involves a nonconvex integer programming optimization problem. We solve it efficiently by an alternating optimization method. Our method is shown to query the most informative samples while preserving the source distribution as much as possible, thus identifying the most http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png ACM Transactions on Knowledge Discovery from Data (TKDD) Association for Computing Machinery

Querying Discriminative and Representative Samples for Batch Mode Active Learning

Loading next page...
 
/lp/association-for-computing-machinery/querying-discriminative-and-representative-samples-for-batch-mode-145SV0VJgR

References

References for this paper are not available at this time. We will be adding them shortly, thank you for your patience.

Publisher
Association for Computing Machinery
Copyright
Copyright © 2015 by ACM Inc.
ISSN
1556-4681
DOI
10.1145/2700408
Publisher site
See Article on Publisher Site

Abstract

Querying Discriminative and Representative Samples for Batch Mode Active Learning ZHENG WANG and JIEPING YE, Arizona State University Empirical risk minimization (ERM) provides a principled guideline for many machine learning and data mining algorithms. Under the ERM principle, one minimizes an upper bound of the true risk, which is approximated by the summation of empirical risk and the complexity of the candidate classifier class. To guarantee a satisfactory learning performance, ERM requires that the training data are i.i.d. sampled from the unknown source distribution. However, this may not be the case in active learning, where one selects the most informative samples to label, and these data may not follow the source distribution. In this article, we generalize the ERM principle to the active learning setting. We derive a novel form of upper bound for the true risk in the active learning setting; by minimizing this upper bound, we develop a practical batch mode active learning method. The proposed formulation involves a nonconvex integer programming optimization problem. We solve it efficiently by an alternating optimization method. Our method is shown to query the most informative samples while preserving the source distribution as much as possible, thus identifying the most

Journal

ACM Transactions on Knowledge Discovery from Data (TKDD)Association for Computing Machinery

Published: Feb 17, 2015

There are no references for this article.