HPAI: Intersecting HPC and AI

As elsewhere in research, scientists at CERN, the largest nuclear research centre in Europe, are investigating the use of Artificial Intelligence (AI) and Machine Learning (ML) to identify particles generated by the Compact Linear Collider (CLIC), as a potential replacement for the Monte Carlo methods currently employed.

The example shows how Artificial Intelligence (AI) enriches supercomputing. Data analytics has long helped to prepare the growing amounts of data generated by sensors, experiments and High Performance Computing (HPC) simulations. Now AI algorithms are beginning to replace parts of computationally intensive simulations, such as the three-body problem. “Mathematics combines computer-driven modeling and data-driven analytics from AI,” explains David Brayford, senior HPC and AI scientist at the Leibniz Supercomputing Center (LRZ) in Garching near Munich. “Both technologies are based on the same calculation methods, but differ in their matrices.”
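
A minimal sketch, assuming nothing about the actual CERN or LRZ codes, of what such a replacement looks like in practice: a brute-force Monte Carlo estimate is swapped for a single call to a surrogate model. The toy_shower_energy integrand and the surrogate function below are invented for illustration; in a real workflow the surrogate would be a trained neural network loaded from disk.

import numpy as np

def toy_shower_energy(incident_energy, rng, n_samples=100_000):
    # Hypothetical Monte Carlo estimate: average energy deposited by a toy
    # "shower", obtained by sampling many random interaction depths.
    depths = rng.exponential(scale=1.0, size=n_samples)
    deposits = incident_energy * np.exp(-depths)   # toy physics model
    return deposits.mean()

def surrogate(incident_energy):
    # Stand-in for a trained network that maps incident energy directly to
    # the expected deposit (here simply the closed-form answer, 0.5 * E).
    return 0.5 * incident_energy

rng = np.random.default_rng(42)
mc_result = toy_shower_energy(100.0, rng)   # expensive: 100,000 samples
ai_result = surrogate(100.0)                # cheap: one inference call
print(f"Monte Carlo: {mc_result:.2f}   surrogate: {ai_result:.2f}")

The speed-up comes from replacing many random samples with a single forward pass through a model that has already learned the mapping.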

Different platforms increase effort

Experiences at CERN confirm this: Monte Carlo simulations took around 17,000 milliseconds to analyze an electron shower, while the AI approach needed only seven. “The power of the processors is exhausted,” concludes Brayford. “The next developments in HPC will be achieved by integrating AI algorithms into existing scientific simulations and enabling a simple transition of AI workflows from the desktop to the supercomputer.” The portability of algorithms and applications is therefore gaining in importance, and cloud and container techniques are playing a growing role in HPC.

Typically, however, the data is not located on the HPC system. “This can cause the first problems during integration on the HPC systems,” observes Brayford. “Actually, the computation should run on the system where the data is located.” This would ensure better performance, as data movement is typically the most expensive operation, but it is hardly possible in research. In addition, not all simulations of a project can be calculated on a particular high-performance computer. Moving and exchanging terabytes or petabytes of data, however, takes a lot of time and effort.
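
A small, hypothetical sketch of the workaround many sites fall back on: staging the input data onto node-local scratch once, so the expensive movement happens a single time before the computation starts. The paths and the reliance on a TMPDIR environment variable are assumptions for illustration, not a description of the LRZ setup.

import os
import shutil
import time

def stage_to_local_scratch(src, scratch_dir=None):
    # Copy the input data to node-local scratch before the run, so the job
    # reads from fast local storage instead of a remote or shared filesystem.
    scratch_dir = scratch_dir or os.environ.get("TMPDIR", "/tmp")
    dst = os.path.join(scratch_dir, os.path.basename(src))
    start = time.time()
    shutil.copy(src, dst)
    print(f"staged {src} -> {dst} in {time.time() - start:.1f} s")
    return dst

# Example with a hypothetical input file:
# local_path = stage_to_local_scratch("/project/experiment/shower_data.h5")
# ... run the analysis against local_path instead of the remote copy ...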

Containers increase flexibility

The situation is made more difficult by the fact that typical data scientists are not familiar with the unique requirements and characteristics of HPC environments. They usually develop their applications with high-level scripting languages or frameworks such as TensorFlow, and the installation processes often require connections to external systems to download open-source software during the build. HPC environments, on the other hand, are often based on closed-source applications that incorporate parallel and distributed computing APIs such as MPI and OpenMP, while users have restricted administrator privileges and face security restrictions such as having no access to external systems.
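
To make the contrast concrete, the sketch below shows one common way of bridging the two worlds: a script-level inference step wrapped in MPI so it can run across the nodes of an HPC system. It assumes the mpi4py package is available, and the predict function is only a placeholder for a real model call (for example a TensorFlow model); neither is taken from the article.

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

def predict(batch):
    # Placeholder for a trained model's inference call.
    return batch.sum(axis=1)

# Each rank works on its own shard of a synthetic data set.
full_data = np.arange(8000.0).reshape(1000, 8)
shard = np.array_split(full_data, size)[rank]
local_results = predict(shard)

# Collect the partial results on rank 0.
results = comm.gather(local_results, root=0)
if rank == 0:
    print("total predictions:", sum(len(r) for r in results))

Launched with, for example, mpiexec -n 4 python infer.py, the same script scales from a laptop (one rank) to many nodes, which is exactly the kind of transition the container approach described below is meant to simplify.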

Container technologies help to build the necessary bridges between AI and HPC: containers provide a mechanism to easily transition typical AI workloads from the laptop to the supercomputer without compromising security on the HPC systems, while ensuring that the AI software packages take full advantage of the hardware and optimized libraries available on the system. According to initial experience, containers at the LRZ ease the transition of AI workflows from the desktop to the supercomputer by significantly reducing the time and effort required to migrate applications and workflows to new systems.