info:eu-repo/semantics/article
Learning budget assignment policies for autoscaling scientific workflows in the cloud
Date
2019-02
Record in:
Garí Núñez, Yisel; Monge Bosdari, David Antonio; Mateos Diaz, Cristian Maximiliano; Garcia Garino, Carlos Gabriel; Learning budget assignment policies for autoscaling scientific workflows in the cloud; Springer; Cluster Computing-the Journal Of Networks Software Tools And Applications; 23; 1; 2-2019; 87-105
1386-7857
CONICET Digital
CONICET
Authors
Garí Núñez, Yisel
Monge Bosdari, David Antonio
Mateos Diaz, Cristian Maximiliano
Garcia Garino, Carlos Gabriel
Abstract
Autoscalers exploit cloud-computing elasticity to cope with the dynamic computational demands of scientific workflows. They continuously acquire and terminate virtual machines (VMs) on the fly to execute workflows while minimizing makespan and economic cost. One key problem of workflow autoscaling under budget constraints (i.e., with a maximum limit on cost) is determining the right proportion between (a) expensive but reliable VMs, called on-demand instances, and (b) cheaper but failure-prone VMs, called spot instances. Spot instances can potentially provide massive parallelism at low cost, but they must be used wisely because they can fail unexpectedly, which hurts makespan. Given the unpredictability of failures and the inherent performance variability of clouds, designing a policy for assigning the budget to each kind of instance is not a trivial task. For this reason, we formalize the described problem as a Markov decision process, which allows us to learn near-optimal policies from the experience of other baseline policies. Experiments on four well-known scientific workflows demonstrate that the learned policies outperform the baseline policies with respect to the aggregated relative percentage difference of makespan and execution cost. These promising results encourage the future study of new strategies for finding optimal budget policies for the execution of workflows in the cloud.
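To make the budget assignment decision concrete, the following minimal Python sketch shows what a single policy action could look like: a fraction of the budget is reserved for spot instances and the rest goes to on-demand instances. It is only an illustration under stated assumptions and not the authors' formulation. The names `split_budget`, `BudgetSplit`, and `spot_fraction`, as well as the prices used in the example, are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetSplit:
    """Result of dividing a budget between the two instance kinds."""
    on_demand_vms: int
    spot_vms: int
    leftover: float


def split_budget(budget: float, spot_fraction: float,
                 on_demand_price: float, spot_price: float) -> BudgetSplit:
    """Divide a budget between spot and on-demand VMs.

    Hypothetical helper: spot_fraction plays the role of the action chosen
    by a budget assignment policy. Prices are per VM for one billing period.
    """
    if not 0.0 <= spot_fraction <= 1.0:
        raise ValueError("spot_fraction must be in [0, 1]")
    if budget < 0 or on_demand_price <= 0 or spot_price <= 0:
        raise ValueError("budget must be >= 0 and prices must be > 0")

    # Reserve the chosen share for spot instances; the rest is for on-demand.
    spot_budget = budget * spot_fraction
    on_demand_budget = budget - spot_budget

    # Only whole VMs can be acquired, so round down within each share.
    spot_vms = int(spot_budget // spot_price)
    on_demand_vms = int(on_demand_budget // on_demand_price)

    leftover = (budget
                - spot_vms * spot_price
                - on_demand_vms * on_demand_price)
    return BudgetSplit(on_demand_vms, spot_vms, leftover)


# Example with made-up prices: half of a budget of 10.0 goes to spot VMs.
example = split_budget(budget=10.0, spot_fraction=0.5,
                       on_demand_price=1.0, spot_price=0.25)
```

In this toy setting, a higher `spot_fraction` buys more parallelism for the same budget but exposes more of the workflow to spot failures, which is the trade-off a learned policy would have to balance.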