LEADER, GRAISON, and the Multi-Modal Connection Engine

The High-Performance Computing Modernization Program (HPCMP) portal offers many benefits to standard users, yet navigating its complexity can be daunting. This presentation shows users how to get the most out of the HPCMP portal and addresses drawbacks that stem from misconceptions or gaps in knowledge. The presentation is also a free-to-use package that technology officers can use to encourage greater HPCMP use in their organizations.

One advantage we discuss is HPCMP’s access to the NVIDIA A100, a cutting-edge Graphics Processing Unit (GPU) that offers up to 249 times the speed of a single-core Central Processing Unit (CPU). Our paper explains in more depth how this speed, together with the parallelization that GPUs allow, accelerates programs and thereby speeds up the training and evaluation of machine learning models.

We also explain how the Version Control Systems (VCS) integrated into the HPCMP infrastructure further augment users’ productivity and workflow efficiency. They enable programmers to easily debug by reverting to previous code versions, save time through collaborative code merging, and mitigate the risk of data loss via redundant storage mechanisms.

One of HPCMP’s greatest strengths is its enormous processing power. This can lead to the common user misconception that doubling the number of CPU cores halves simulation time. Our paper explains that, in practice, the achievable speedup levels off as the number of processors increases and approaches an asymptote. To work around this limitation, we show how users can leverage HPCMP’s Parallel Computing Toolbox to implement parallel operations and further accelerate simulations instead of simply adding more cores.
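The leveling-off described above is commonly modeled with Amdahl's law. The abstract does not name a model, so treating Amdahl's law as the relevant one is an assumption here, as are the function names and the 95% parallel fraction used for illustration. A minimal Python sketch under those assumptions:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when `parallel_fraction` of the
    runtime can be parallelized (Amdahl's law, assumed model)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def speedup_limit(parallel_fraction: float) -> float:
    """Asymptote of the speedup as the core count grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    p = 0.95  # hypothetical: 95% of the runtime parallelizes
    for n in (1, 2, 4, 8, 16, 64, 256, 1024):
        print(f"{n:5d} cores -> speedup {amdahl_speedup(p, n):6.2f}")
    print(f"limit as cores grow: {speedup_limit(p):.2f}")
```

Under this model, each doubling of the core count yields less than a twofold gain whenever any serial work remains, and the speedup never exceeds the reciprocal of the serial fraction, however many cores are added.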

Computational complexity and software architecture pose further challenges, because increasing the number of tasks often increases computation time. To navigate these limitations, we discuss why users must carefully define requirements, select appropriate design strategies, and consider how their choices affect code parallelization and overall performance.

PRESENTER

Lee, Andrew
andrew.m.lee37.civ@us.navy.mil
347-585-5484

NIWC LANT

CO-AUTHORS

Stefkovich, Ryan
ryan.j.stefkovich.mil@us.navy.mil

Coffey, Neil
neil.m.coffey.civ@us.navy.mil

CATEGORY

Other: Education and HPCMP Adoption

SYSTEMS USED

Narwhal

SECRET

No