LEADER, GRAISON, and the Multi-Modal Connection Engine
The High-Performance Computing Modernization Program (HPCMP) portal offers many benefits for standard users, yet navigating its complexities can be daunting. This presentation educates users on the advantages of using the HPCMP portal to its fullest extent and addresses the drawbacks that stem from misconceptions or lack of knowledge. Additionally, this presentation is a free-to-use package that technology officers can use to encourage greater HPCMP use in their organizations.
One such advantage is HPCMP’s access to the NVIDIA A100, a cutting-edge Graphics Processing Unit (GPU) offering up to 249 times the speed of a single-core Central Processing Unit (CPU). Our paper goes into further depth about how this speed, together with the parallelization that GPUs allow, accelerates programs and thereby speeds up the training and evaluation of machine learning models.
We also explain how the Version Control Systems (VCS) integrated into the HPCMP infrastructure further improve users’ productivity and workflow efficiency. They let programmers debug more easily by reverting to previous code versions, save time through collaborative code merging, and reduce the risk of data loss through redundant storage.
One of HPCMP’s greatest strengths is its enormous processing power. This can lead to the common user misconception that doubling the number of CPU cores halves simulation time. Our paper explains that, in practice, the speedup gained from each additional processor shrinks as the processor count grows, so the total speedup approaches an asymptote. To work around this limit, we show how users can leverage HPCMP’s Parallel Computing Toolbox to implement parallel operations and further accelerate simulations instead of relying on more cores.
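The asymptote described above can be illustrated with Amdahl's law, a standard model of parallel speedup. This is a minimal sketch and not necessarily the model used in our paper: it assumes a fixed fraction of the runtime is perfectly parallelizable and ignores communication overhead. The function names and the 0.95 parallel fraction are hypothetical choices made for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given core count.

    parallel_fraction is the (assumed) share of single-core runtime
    that can be spread evenly across cores; the rest stays serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def speedup_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the core count grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    fraction = 0.95  # hypothetical: 95% of the runtime parallelizes
    for cores in (1, 2, 4, 8, 16, 64, 256, 1024):
        print(f"{cores:5d} cores -> speedup {amdahl_speedup(fraction, cores):6.2f}")
    print(f"limit with unlimited cores: {speedup_limit(fraction):.2f}")
```

Under these assumptions, going from one core to two gives a speedup below 2, and with a parallel fraction of 0.95 the speedup cannot exceed 20 no matter how many cores are added, because the remaining serial 5% dominates.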
Additionally, computational complexity and software architecture pose challenges, as increasing the number of tasks often increases computation time. To navigate these limitations, we discuss why users must carefully define requirements, select appropriate design strategies, and consider how their choices affect code parallelization and overall performance.
PRESENTER
Lee, Andrew
andrew.m.lee37.civ@us.navy.mil
347-585-5484
NIWC LANT
CO-AUTHORS
Stefkovich, Ryan
ryan.j.stefkovich.mil@us.navy.mil
Coffey, Neil
neil.m.coffey.civ@us.navy.mil
CATEGORY
Other: Education and HPCMP Adoption
SYSTEMS USED
Narwhal
SECRET
No