Analysis of the NEC Vector Engine for Legacy Hypersonic Codes

Khine, Yu Yu (US Naval Research Laboratory)

Obenschain, Keith
Rosenberg, Robert
Mathur, Raghunandan
Patnaik, Gopal
Stehley, Talya
Garmann, Daniel


Many production codes trace their origins to software developed during the vector supercomputing era of the 1970s to 1990s, and they still carry vector-friendly constructs in their codebases. The codes that currently employ GPUs do not cover many of the aspects required for hypersonic modeling. The recently released NEC Vector Engine (VE) provides an opportunity to exploit this vector heritage: it can potentially deliver state-of-the-art performance without a complete rewrite of a well-validated codebase. Given the time and cost required to port or rewrite codes, this is an attractive solution. The added performance would give these well-validated codes a much-needed boost and increase their impact on hypersonic vehicle design by reducing simulation turnaround time, increasing resolution, and allowing higher-fidelity representation of the physics.

NEC has been a major provider in the supercomputing domain for over 37 years, having developed a product line of vector computers starting in the early 1980s. The Earth Simulator, built on NEC's SX-6 architecture, held the top position on the TOP500 list of supercomputers from 2002 to 2004. NEC has since evolved this vector technology into an accelerator that fits on a PCIe card, known as the NEC Vector Engine [1].

The legacy CFD solver for hypersonic applications used to evaluate the performance of the VE is FDL3DI, which was developed at the Air Force Research Laboratory in the late 1990s [2, 3]. It was originally vectorized and optimized for efficient operation on vector processing machines. FDL3DI is an extensively validated high-order flow solver in which the compressible Navier-Stokes equations, written in a curvilinear coordinate system, are discretized through a high-order compact-differencing approach [4].
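
For readers unfamiliar with compact differencing, the standard tridiagonal (Padé-type) form of such a scheme is sketched below. It is given only as an illustration of the general technique. The symbols (a generic scalar \phi, uniform spacing h in computational space, coefficients \alpha, a, b) are assumptions introduced here, and the abstract does not state which coefficients or boundary closures FDL3DI uses.

```latex
% Generic tridiagonal compact scheme for the first derivative \phi' at node i
% (illustrative; symbols are assumptions, not taken from the abstract)
\alpha\,\phi'_{i-1} + \phi'_{i} + \alpha\,\phi'_{i+1}
  = a\,\frac{\phi_{i+1} - \phi_{i-1}}{2h}
  + b\,\frac{\phi_{i+2} - \phi_{i-2}}{4h}

% Well-known coefficient choices:
%   fourth order: \alpha = 1/4,  a = 3/2,   b = 0
%   sixth order:  \alpha = 1/3,  a = 14/9,  b = 1/9
```

Because the derivative values are coupled implicitly, each grid line requires the solution of a tridiagonal system.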

The NEC Vector Engine's architecture, high memory bandwidth, and ability to compile Fortran were the primary motivations for this evaluation. In this presentation, a supersonic test case will be discussed, and the performance and ease of use of the NEC Vector Engine will be compared with those of existing CPU architectures at HPCMP DSRCs, such as processors from AMD and Intel. In addition, scaling results, roofline analysis, and power and energy efficiency studies will be presented [5].
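
As background on the roofline analysis mentioned above, the basic roofline bound can be sketched in a few lines of Python. This is a minimal illustration of the general model, not the analysis performed in this work. The function names and the numerical values are hypothetical placeholders, not measured or vendor-specified figures for the VE or any CPU.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Basic roofline bound.

    Attainable performance is limited either by peak compute (GFLOP/s)
    or by memory bandwidth (GB/s) times arithmetic intensity (FLOP/byte).
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT measurements of any real device.
    peak = 2000.0       # GFLOP/s
    bandwidth = 1000.0  # GB/s

    print("ridge point (FLOP/byte):", ridge_point(peak, bandwidth))
    for ai in (0.25, 1.0, 4.0):
        print(ai, attainable_gflops(peak, bandwidth, ai))
```

A kernel whose arithmetic intensity lies below the ridge point is bandwidth-bound under this model, which is why high memory bandwidth matters for stencil-style solvers.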

References are listed in the Comments section.