Journal articles
An online data access prediction and optimization approach for distributed systems
Date
2012
Registration in:
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, LOS ALAMITOS, v. 23, n. 6, p. 1017-1029, JUN, 2012
1045-9219
10.1109/TPDS.2011.256
Author
Ishii, Renato Porfirio
Mello, Rodrigo Fernandes de
Abstract
Current scientific applications produce large amounts of data. Processing, handling, and analyzing such data requires large-scale computing infrastructures such as clusters and grids. In this area, studies aim to improve the performance of data-intensive applications by optimizing data accesses. To achieve this goal, distributed storage systems have considered techniques such as data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take application behavior into account when optimizing data access. This limitation motivated this paper, which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. To accomplish this, the approach organizes application behaviors as time series and then analyzes and classifies those series according to their properties. Based on those properties, the approach selects modeling techniques to represent the series and perform predictions, which are later used to optimize data access operations. The approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm that the approach reduces application execution time by about 50 percent, especially when handling large amounts of data.
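The classify-then-predict pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract does not name the series properties or the modeling techniques, so the property used here (lag-1 autocorrelation), the two candidate models (a mean predictor and an AR(1)-style predictor), the threshold of 0.5, and all function names are assumptions chosen only to make the idea concrete.

```python
from statistics import mean


def lag1_autocorrelation(series):
    """Return the lag-1 autocorrelation of a series (0.0 if it is constant)."""
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0
    num = sum(
        (series[i] - mu) * (series[i + 1] - mu) for i in range(len(series) - 1)
    )
    return num / denom


def classify(series, threshold=0.5):
    """Label a series by one property; the threshold is an assumed value."""
    if abs(lag1_autocorrelation(series)) >= threshold:
        return "autocorrelated"
    return "noise-like"


def predict_next(series, threshold=0.5):
    """Pick a model from the series class and predict the next observation."""
    mu = mean(series)
    if classify(series, threshold) == "autocorrelated":
        # AR(1)-style step: pull the last deviation toward the mean.
        phi = lag1_autocorrelation(series)
        return mu + phi * (series[-1] - mu)
    # Noise-like series: the mean is the simplest reasonable forecast.
    return mu


def most_demanded(history_by_file):
    """Return the file whose predicted next access volume is largest.

    A hypothetical optimizer could replicate or migrate that file first.
    """
    return max(history_by_file, key=lambda name: predict_next(history_by_file[name]))


if __name__ == "__main__":
    # Hypothetical per-interval access counts observed online for two files.
    history = {
        "file_a": [1, 2, 3, 4, 5, 6, 7, 8],
        "file_b": [3, 3, 3, 3],
    }
    for name, series in history.items():
        print(name, classify(series), predict_next(series))
    print("replicate first:", most_demanded(history))
```

The sketch only uses observations gathered during the current run, which mirrors the abstract's point that no information on past executions is required; a real system would draw on a richer set of series properties and models.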