Implementation of 4DWX Using Hybrid Computing Environments Featuring High-Priority Queue and the Persistent Services Framework

Sheu, Rong-Shyang (National Center for Atmospheric Research)

Halvorson, Scott
Frauenhoffer, Michael
Shaw, Justin
Knievel, Jason

Climate/Weather/Ocean Modeling

The Four-Dimensional Weather System (4DWX), developed at the National Center for Atmospheric Research (NCAR), has been used by the US Army Test and Evaluation Command (ATEC) to provide real-time weather diagnostics and forecasts from numerical weather prediction (NWP) models in support of outdoor testing at various Army Test Centers. The required computing hardware and environments have moved from dedicated high-performance computing (HPC) clusters at Dugway Proving Ground (DPG) to shared resources sponsored by the Department of Defense (DoD) at two DoD Supercomputing Resource Centers (DSRCs), the Army Research Laboratory (ARL) and the Navy, in the form of Dedicated Support Partitions (DSPs). In lieu of the DSP at ARL, ATEC is exploring a hybrid approach to hosting 4DWX that still provides critical weather analysis and forecasting in real time. This approach runs the Weather Research and Forecasting (WRF) model, the core NWP engine of 4DWX, through the high-priority queue on Centennial and performs the various model post-processing tasks on virtual machines (VMs) managed by Open Virtualization under the persistent services framework (PSF), before pushing a suite of model products to the Army Test Centers for visualization and other downstream applications. The choice of the high-priority queue is critical because it reduces the likelihood that a time-sensitive step in the weather forecasting cycle is delayed or even aborted when the required computing resources are unavailable. For this hybrid approach to work, the model output files that 4DWX writes to the Lustre file system on Centennial need to be accessible to the VMs under the PSF. Several options were explored, and the final solution was to export a volume of the Lustre file system through a common network to the PSF VMs.
Compared to the DSP approach, implementing this hybrid approach requires additional work from NCAR software engineers, mainly on the PSF side, including configuring the required VM resources (e.g., number of processors, memory, local storage, and networking), managing groups and accounts, and installing and managing the required software and library packages. An initial instance of 4DWX using this hybrid approach has been built. The VM used by this instance will serve as a template for creating several more instances to meet the operational requirements of the Army Test Centers.
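
As an illustration of how a post-processing VM might consume model output from the exported Lustre volume, the following minimal Python sketch selects output files that appear to be fully written before handing them to downstream tasks. It is not the 4DWX implementation: the function name, the `settle_seconds` heuristic, and the mount path in the usage note are hypothetical, and the `wrfout_d*` pattern assumes WRF's default output file naming.

```python
import time
from pathlib import Path


def find_stable_outputs(output_dir, pattern="wrfout_d*", settle_seconds=60.0, now=None):
    """Return files matching `pattern` that were last modified at least
    `settle_seconds` ago, i.e. files the model has probably finished writing.

    This is a heuristic (assumption): a file that has not changed for a
    while on the shared volume is treated as complete.
    """
    now = time.time() if now is None else now
    stable = []
    for path in sorted(Path(output_dir).glob(pattern)):
        # Skip directories and anything modified too recently.
        if path.is_file() and now - path.stat().st_mtime >= settle_seconds:
            stable.append(path)
    return stable
```

A post-processing job on a VM could call, for example, `find_stable_outputs("/mnt/centennial_export/cycle")` (a hypothetical mount point) at regular intervals and process only the files returned, so that a partially written file on the shared volume is not picked up.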