Algorithm portfolio selection as a bandit problem with unbounded losses

Publisher: Springer Journals
Copyright: © 2011 by Springer Science+Business Media B.V.
Subject: Computer Science; Computer Science, general; Artificial Intelligence (incl. Robotics); Statistical Physics, Dynamical Systems and Complexity; Mathematics, general
ISSN: 1012-2443
eISSN: 1573-7470
DOI: 10.1007/s10472-011-9228-z

Abstract

We propose a method that learns to allocate computation time to a given set of algorithms, of unknown performance, with the aim of solving a given sequence of problem instances in a minimum time. Analogous meta-learning techniques are typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. We adopt instead an online approach, named GAMBLETA, in which algorithm performance models are iteratively updated, and used to guide allocation on a sequence of problem instances. GAMBLETA is a general method for selecting among two or more alternative algorithm portfolios. Each portfolio has its own way of allocating computation time to the available algorithms, possibly based on performance models, in which case its performance is expected to improve over time, as more runtime data becomes available. The resulting exploration-exploitation trade-off is represented as a bandit problem. In our previous work, the algorithms corresponded to the arms of the bandit, and allocations evaluated by the different portfolios were mixed, using a solver for the bandit problem with expert advice, but this required the setting of an arbitrary bound on algorithm runtimes, invalidating the optimal regret of the solver. In this paper, we propose a simpler version of GAMBLETA, in which the allocators correspond to the arms, such that a single portfolio is selected for each instance. The selection is represented as a bandit problem with partial information, and an unknown bound on losses. We devise a solver for this game, proving a bound on its expected regret. We present experiments based on results from several solver competitions, in various domains, comparing GAMBLETA with another online method.
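
To make the setting in the abstract concrete, the following is a minimal sketch of a bandit with partial information whose arms are time allocators and whose loss is the runtime spent on each instance. It is not the solver devised in the paper: it is a generic Exp3-style rule, and the unknown bound on losses is handled with a simple doubling-and-restart heuristic chosen here for illustration. The class name, the parameters `gamma` and `eta`, and the simulated runtimes are all assumptions.

```python
import math
import random


class Exp3UnknownBound:
    """Exp3-style selector over arms whose loss bound is not known in advance.

    Illustrative only: the doubling-and-restart rule is an assumption,
    not the algorithm analysed in the paper.
    """

    def __init__(self, n_arms, gamma=0.1, eta=0.1, seed=0):
        self.n_arms = n_arms
        self.gamma = gamma          # uniform exploration rate
        self.eta = eta              # learning rate on normalised losses
        self.rng = random.Random(seed)
        self.bound = 1.0            # current guess of the largest loss
        self.cum_est = [0.0] * n_arms  # importance-weighted loss estimates

    def probabilities(self):
        # Subtract the minimum so every exponent is <= 0 (numerical stability).
        m = min(self.cum_est)
        w = [math.exp(-self.eta * (c - m)) for c in self.cum_est]
        total = sum(w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.n_arms
                for wi in w]

    def select(self):
        p = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=p)[0]
        return arm, p[arm]

    def update(self, arm, prob, loss):
        # If the observed loss exceeds the guessed bound, double the guess
        # and restart the estimates, since they were scaled by the old bound.
        while loss > self.bound:
            self.bound *= 2
            self.cum_est = [0.0] * self.n_arms
        # Only the chosen arm's loss is observed (partial information).
        self.cum_est[arm] += (loss / self.bound) / prob


# Hypothetical runtimes (seconds) of two allocators on each instance.
sim = random.Random(1)
runtime = [lambda: sim.uniform(1.0, 3.0), lambda: sim.uniform(20.0, 30.0)]

bandit = Exp3UnknownBound(n_arms=2)
for _ in range(2000):
    arm, prob = bandit.select()
    bandit.update(arm, prob, runtime[arm]())

final_p = bandit.probabilities()
print("selection probabilities:", final_p, "loss bound guess:", bandit.bound)
```

In this toy run the selector only ever sees the runtime of the allocator it picked, and it has to revise its guess of the largest possible runtime as slower runs are observed. How that revision interacts with the regret guarantee is the question the paper addresses; the restart used here carries no such guarantee.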

Journal

Annals of Mathematics and Artificial Intelligence, Springer Journals

Published: Apr 1, 2011