A Bandit-Based Ensemble Framework for Exploration/Exploitation of Diverse Recommendation Components

Publisher
Association for Computing Machinery
Copyright
Copyright © 2019 ACM
ISSN
2160-6455
eISSN
2160-6463
DOI
10.1145/3237187

Abstract

This work presents an extension of the Thompson Sampling bandit policy for orchestrating a collection of base recommendation algorithms for e-commerce. We focus on the problem of item-to-item recommendations, for which multiple behavioral and attribute-based predictors are provided to an ensemble learner. In addition, we detail the construction of a personalized predictor based on k-Nearest Neighbors (kNN), with temporal decay capabilities and event weighting. We show how to adapt Thompson Sampling to realistic situations in which neither action availability nor reward stationarity is guaranteed. Furthermore, we investigate the effects of priming the sampler with pre-set parameters of reward probability distributions, derived from the product catalog and/or event history when such information is available. We report experimental results from the analysis of three real-world e-commerce datasets.
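The ideas named in the abstract (restricting the choice to available actions, coping with non-stationary rewards, and priming the sampler with pre-set distribution parameters) can be illustrated with a minimal Beta-Bernoulli Thompson Sampling sketch. This is not the authors' implementation: the class name, the `discount` parameter, and the choice to decay evidence back toward the prior are all assumptions made for illustration only.

```python
import random


class DiscountedThompsonSampler:
    """Hypothetical Beta-Bernoulli Thompson Sampling over named components."""

    def __init__(self, arms, priors=None, discount=0.99, seed=None):
        # priors: optional {arm: (alpha, beta)} used to "prime" an arm;
        # arms without an entry start from the uniform Beta(1, 1).
        priors = priors or {}
        self.priors = {arm: tuple(priors.get(arm, (1.0, 1.0))) for arm in arms}
        self.params = {arm: list(p) for arm, p in self.priors.items()}
        self.discount = discount  # assumption: 1.0 means stationary rewards
        self.rng = random.Random(seed)

    def select(self, available):
        # Sample only among arms that can produce a recommendation right now.
        candidates = [arm for arm in available if arm in self.params]
        if not candidates:
            raise ValueError("no available arms")
        samples = {
            arm: self.rng.betavariate(*self.params[arm]) for arm in candidates
        }
        return max(samples, key=samples.get)

    def update(self, arm, reward):
        # Shrink accumulated evidence of every arm toward its prior so that
        # old observations count less (one simple way to handle drift).
        for name, (a, b) in self.params.items():
            a0, b0 = self.priors[name]
            self.params[name] = [
                a0 + self.discount * (a - a0),
                b0 + self.discount * (b - b0),
            ]
        # Record the new binary outcome (1 = success, 0 = failure).
        self.params[arm][0] += reward
        self.params[arm][1] += 1 - reward


if __name__ == "__main__":
    sampler = DiscountedThompsonSampler(
        ["co_views", "co_purchases", "attributes"],
        priors={"co_purchases": (5.0, 1.0)},  # hypothetical primed arm
        seed=42,
    )
    chosen = sampler.select(["co_views", "attributes"])  # one arm unavailable
    sampler.update(chosen, reward=1)
    print(chosen, sampler.params[chosen])
```

Decaying toward the prior rather than toward zero keeps both Beta parameters strictly positive, which `random.betavariate` requires; other forgetting schemes, such as sliding windows, would serve the same illustrative purpose.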

Journal

ACM Transactions on Interactive Intelligent Systems (TiiS), Association for Computing Machinery

Published: Aug 9, 2019

Keywords: E-commerce recommender systems
