Using Mint to Explore Performance Portability Strategies in HPCMP CREATE Kestrel

Zagaris, George (Air Force Research Laboratory)

Holst, Kevin
Tyson, William
Starr, Robert
McNally, Robert P.
Lamberson, Steve
Bond, Ryan
Sitaraman, Jayanarayanan

Incorporation of GPUs/Accelerators into Physics-based Codes

High-Performance Computing (HPC) architecture design is trending towards heterogeneous systems as a way to boost application performance. Heterogeneous systems combine conventional multi-core CPUs with specialized accelerators, e.g., GPUs, that collectively expose thousands of cores to the application within a single node. The advent of heterogeneous HPC clusters as the dominant computing platform for advanced simulation and computing has enabled high-fidelity simulations at unprecedented scale and resolution, leading to new insights and enabling solutions to problems that were previously intractable.

However, harnessing the massive parallelism of heterogeneous systems in a portable manner across a diverse set of architectures presents significant software engineering challenges. Large-scale multi-physics codes are a foundational pillar of core missions in the DoD. These codes are employed to address questions of national and global interest and are required to achieve high performance across a diverse set of architectures, ranging from laptops and high-end engineering workstations to conventional clusters with commodity multi-processors and current heterogeneous petascale and exascale systems.

Due to the sheer size and complexity of a multi-physics code, and the high inertia involved in developing and validating it, it is desirable to maintain a single-source codebase that is readily parallel and portable across different architectures. Moreover, the code needs to be easily maintainable over time and across multiple platform generations. The impetus for a single-source codebase has prompted the development of libraries and abstraction layers that insulate application developers from the underlying architecture.

One such development effort is Mint, which provides a comprehensive mesh data model and a mesh-aware, fine-grain parallel execution model that insulates application developers from the details of the programming model and architecture. This enables the implementation of computational kernels that are born parallel and portable, without tying the kernel directly to any specific architecture. In this talk, we present Mint and its use in a mini-app that solves the Reynolds-Averaged Navier-Stokes (RANS) equations on an unstructured mesh. The mini-app will help inform the evolution of CREATE-AV Kestrel's flow solvers.
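
The idea of a kernel that is written once and executed under different policies can be illustrated with a minimal C++ sketch. This is not Mint's API: the names Mesh, SequentialPolicy, BlockedPolicy, for_all_cells, and scale_cell_field are hypothetical, and both policies run on the host. BlockedPolicy merely stands in for a backend that partitions the cell range into independent chunks, as an accelerator backend might.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical execution-policy tags (assumptions for illustration only).
// A production layer would map such tags onto real backends; here both
// policies execute on the host.
struct SequentialPolicy {};
struct BlockedPolicy {
  std::size_t block_size = 4;
};

// Minimal stand-in for a mesh: only the cell count is needed for traversal.
struct Mesh {
  std::size_t num_cells;
};

// Visit every cell in index order.
template <typename Kernel>
void for_all_cells(SequentialPolicy, const Mesh& mesh, Kernel kernel) {
  for (std::size_t cell = 0; cell < mesh.num_cells; ++cell) {
    kernel(cell);
  }
}

// Visit every cell block by block, mimicking a backend that splits the
// iteration space into independent chunks.
template <typename Kernel>
void for_all_cells(BlockedPolicy policy, const Mesh& mesh, Kernel kernel) {
  assert(policy.block_size > 0);
  for (std::size_t begin = 0; begin < mesh.num_cells;
       begin += policy.block_size) {
    std::size_t end = begin + policy.block_size;
    if (end > mesh.num_cells) end = mesh.num_cells;
    for (std::size_t cell = begin; cell < end; ++cell) {
      kernel(cell);
    }
  }
}

// The kernel body is written once, against cell indices only; the policy
// argument selects how the traversal is executed.
template <typename Policy>
std::vector<double> scale_cell_field(Policy policy, const Mesh& mesh,
                                     const std::vector<double>& field,
                                     double factor) {
  assert(field.size() == mesh.num_cells);
  std::vector<double> result(mesh.num_cells);
  for_all_cells(policy, mesh, [&](std::size_t cell) {
    result[cell] = factor * field[cell];
  });
  return result;
}
```

In this sketch, calling scale_cell_field with SequentialPolicy{} or with a BlockedPolicy yields the same result vector, because the kernel touches each cell independently. That independence is the property that lets a traversal be retargeted without changing the kernel source.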