Please use this identifier to cite or link to this item:
http://repositorio.ufla.br/jspui/handle/1/48110
Title: | An evaluation of MPI and OpenMP paradigms in finite-difference explicit methods for PDEs on shared-memory multi- and manycore systems |
Keywords: | High-performance computing; Multicore architectures; Parallelism; Parallel processing |
Issue Date: | 25-Oct-2020 |
Publisher: | Wiley |
Citation: | CABRAL, F. L. et al. An evaluation of MPI and OpenMP paradigms in finite-difference explicit methods for PDEs on shared-memory multi- and manycore systems. Concurrency and Computation: Practice and Experience, Chichester, v. 32, n. 20, e5642, 25 Oct. 2020. Special Issue. DOI: 10.1002/cpe.5642. |
Abstract: | This paper focuses on parallel implementations of three two-dimensional explicit numerical methods on the Intel® Xeon® Scalable Processor and the Knights Landing coprocessor. The study compares the performance of a hybrid implementation that combines the Message Passing Interface (MPI) with Open Multi-Processing (OpenMP), and of a pure MPI implementation run with two thread binding policies, against an improved OpenMP-based implementation, for three explicit finite-difference methods that solve partial differential equations on shared-memory multicore and manycore systems. Specifically, the improved OpenMP-based version synchronizes adjacent threads and eliminates the implicit barriers of a naïve OpenMP-based implementation. The experiments show that the most suitable approach depends on several characteristics related to the nonuniform memory access (NUMA) effect and load balancing, such as the size of the MPI domain and the number of synchronization points used in the parallel implementation. For the algorithms that use four and five synchronization points, the hybrid MPI/OpenMP approaches yielded better speedups than the other versions on both systems. For the method that employs only one synchronization point, however, the pure MPI-based strategy achieved better results than the other proposed approaches. |
URI: | https://doi.org/10.1002/cpe.5642 http://repositorio.ufla.br/jspui/handle/1/48110 |
Appears in Collections: | DCC - Artigos publicados em periódicos |
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.