Dynamic Load Balancing for Parallel and Distributed Systems
Abstract
Adaptive Mesh Refinement (AMR) is a type of multiscale algorithm that achieves high resolution in localized regions of dynamic, multidimensional numerical simulations. A key issue for AMR is dynamic load balancing (DLB), which allows large-scale adaptive applications to run efficiently on parallel and distributed systems. In this talk, I will first provide a detailed characterization of the adaptive behavior of structured AMR (SAMR) applications. I will then propose two DLB schemes that efficiently redistribute the workload of SAMR applications, one for parallel systems and one for distributed systems. For parallel systems, the proposed scheme integrates a grid-splitting technique with direct grid movements. For distributed systems, the proposed scheme addresses both the heterogeneity and the dynamic behavior of such systems, and uses a heuristic method to evaluate the computational gain and the redistribution cost of a global redistribution. Experiments show that the proposed DLB schemes can reduce execution time by up to 57% and improve the quality of load balancing by a factor of four.