Master's Dissertation
Performance Prediction for Enhancing Ensemble Learning
Date
2018-08-31
Author
Gustavo Penha
Institution
Abstract
Ensembling machine-learned models has been shown to be a useful technique for improving effectiveness in tasks such as classification, ad-hoc retrieval, and recommendation. Stacking, for instance, learns to weight and combine the predictions of base models to produce improved predictions. One limitation of stacking in its standard formulation is that it has no information about the context in which one model performs better than the others, so it weights the base models purely on their overall performance. In this dissertation, inspired by work on performance prediction, we propose to use auxiliary models capable of predicting the performance of each base model in the ensemble for new learning instances. Current approaches handcraft meta-features for predicting the performance of base models and use them as additional features for the stacking layer, which is left with the burden of understanding when each model outperforms the others. Unlike them, our novel approaches in both search and recommendation ease the stacking layer's job by supplying it with a discriminative set of features. For ad-hoc retrieval, we demonstrate through simulations that there is a prediction accuracy bar that must be cleared for query performance prediction to become useful. Moreover, we show that machine-learned query performance predictors for each base model are able to clear this bar when leveraged as meta-features for stacking individual ranking models via learning to rank. For recommendation, we propose to directly estimate the performance of base models for a user given the user's historical ratings, instead of handcrafting discriminative features for predicting it. Experiments on real-world datasets from multiple domains demonstrate that using performance estimates as additional features can significantly improve the accuracy of current ensemblers based on pointwise learning. Moreover, when used with pairwise and listwise ensemblers, exploiting performance estimates achieves state-of-the-art recommendation effectiveness.
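The general idea of feeding per-instance performance estimates to a stacking layer can be illustrated with a small sketch. This is not the dissertation's implementation: the synthetic data, the linear least-squares stacker, the linear auxiliary performance predictors, and all names (`fit_linear`, `stack_features`, `context`, `perf_estimates`) are assumptions made only for illustration, and the setting is a toy regression problem rather than ranking or recommendation.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares weights for y ~ X (X must already contain a bias column)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def mse(X, w, y):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


def stack_features(base_preds, perf_estimates=None):
    """Build the input matrix of the stacking layer.

    base_preds:     (n, m) predictions of the m base models.
    perf_estimates: optional (n, m) estimated performance of each base model
                    on each instance.
    """
    n = base_preds.shape[0]
    cols = [np.ones((n, 1)), base_preds]
    if perf_estimates is not None:
        # Performance estimates as additional features, plus their products
        # with the predictions so that a linear stacker can reweight the
        # base models per instance (an illustrative choice).
        cols += [perf_estimates, base_preds * perf_estimates]
    return np.hstack(cols)


# Synthetic data (assumption): base model A is accurate when context > 0,
# base model B is accurate when context <= 0.
rng = np.random.default_rng(0)
n = 2000
context = rng.uniform(-1.0, 1.0, size=n)
target = rng.normal(size=n)
noise_a = np.where(context > 0, 0.1, 1.0)
noise_b = np.where(context > 0, 1.0, 0.1)
base_preds = np.column_stack([
    target + noise_a * rng.normal(size=n),
    target + noise_b * rng.normal(size=n),
])

train, test = slice(0, n // 2), slice(n // 2, n)

# Auxiliary performance predictors: one linear model per base model that
# estimates its negative absolute error from the instance context.
ctx = np.column_stack([np.ones(n), context])
perf_estimates = np.column_stack([
    ctx @ fit_linear(ctx[train], -np.abs(base_preds[train, j] - target[train]))
    for j in range(base_preds.shape[1])
])

# Standard stacking: global weights over the base predictions only.
X_plain = stack_features(base_preds)
w_plain = fit_linear(X_plain[train], target[train])

# Performance-aware stacking: same stacker, augmented feature set.
X_aug = stack_features(base_preds, perf_estimates)
w_aug = fit_linear(X_aug[train], target[train])

mse_train_plain = mse(X_plain[train], w_plain, target[train])
mse_train_aug = mse(X_aug[train], w_aug, target[train])
mse_test_plain = mse(X_plain[test], w_plain, target[test])
mse_test_aug = mse(X_aug[test], w_aug, target[test])

print(f"plain stacking      train={mse_train_plain:.4f} test={mse_test_plain:.4f}")
print(f"with perf estimates train={mse_train_aug:.4f} test={mse_test_aug:.4f}")
```

Because the augmented feature set contains the plain one, the training error of the augmented least-squares stacker cannot exceed that of the plain stacker. Whether the held-out error improves depends on how accurate the performance estimates are, which is the kind of accuracy bar the abstract refers to.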