Dynamic load-balancing : a new strategy for weather forecast models
Rodrigues, Eduardo Rocha
Weather forecasting models are computationally intensive applications and traditionally they are executed in parallel machines. However, some issues prevent these models from fully exploiting the available computing power. One of such issues is load imbalance, i.e., the uneven distribution of load across the processors of the parallel machine. Since weather models are typically synchronous applications, that is, all tasks synchronize at every time-step, the execution time is determined by the slowest task. The causes of such imbalance are either static (e.g. topography) or dynamic (e.g. shortwave radiation, moving thunderstorms). Various techniques, often embedded in the application’s source code, have been used to address both sources. However, these techniques are inflexible and hard to use in legacy codes. In this thesis, we explore the concept of processor virtualization for dynamically balancing the load in weather models. This means that the domain is over-decomposed in more tasks than the available processors. Assuming that many tasks can be safely executed in a single processor, each processor is put in charge of a set of tasks. In addition, the system can migrate some of them from overloaded processors to underloaded ones when it detects load imbalance. This approach has the advantage of decoupling the application from the load balancing strategy. Our objective is to show that processor virtualization can be applied to weather models as long as an appropriate strategy for migrations is used. Our proposal takes into account the communication pattern of the application in addition to the load of each processor. In this text, we present the techniques used to minimize the amount of change needed in order to apply processor virtualization to a real-world application. Furthermore, we analyze the effects caused by the frequency at which the load balancer is invoked and a threshold that activates rebalancing. We propose an automatic strategy to find an optimal threshold to trigger load balancing. These strategies are centralized and work well for moderately large machines. For larger machines, we present a fully distributed algorithm and analyze its performance. As a study case, we demonstrate the effectiveness of our approach for dynamically balancing the load in Brams, a mesoscale weather forecasting model based on MPI parallelization. We choose this model because it presents a considerable load imbalance caused by localized thunderstorms. In addition, we analyze how other effects of processor virtualization can improve performance.