Poster Abstract

P6.5 Henk-Jan Zilverberg (University of Groningen)

Theme: Local and global cloud infrastructure for processing and storage

HPC and 24/7 mission critical workloads together on one platform with vGPU

The demand for ever larger and more capable infrastructures to support modern astronomy projects is insatiable. To maximise the return on investment, the University of Groningen has developed a solution that combines our server virtualisation infrastructure with HPC.

The most common research infrastructures are large, dedicated, expensive HPC solutions. But is this the only solution? At the University of Groningen we supplement our HPC cluster (Peregrine) by combining our HPC GPU platform with our server virtualisation platform to create one universal platform, thus dissolving traditional boundaries. The result is a platform with many GPUs spread across all three of our data centers, made possible by using today's technology in a new and innovative way.

Because the workload characteristics of server virtualisation and HPC differ a great deal, combining them into one platform uses the hardware much more efficiently. Our mission-critical platform is sized to tolerate the outage of any one of our three data centers. In our case this means that, by design, at least a third of our hardware is doing nothing.
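
The "at least a third" figure follows from simple capacity arithmetic. A minimal sketch, assuming the data centers are equally sized (an assumption made here for illustration, not stated above) and using a hypothetical helper name, reserved_fraction: if the full mission-critical load must still fit after losing one of three sites, no more than two thirds of the total capacity can be committed to it.

```python
from fractions import Fraction


def reserved_fraction(sites: int, tolerated_outages: int = 1) -> Fraction:
    """Fraction of total capacity that must stay free so the full load
    still fits after `tolerated_outages` sites are lost.

    Assumes all sites have equal capacity (illustrative assumption).
    """
    if not 0 <= tolerated_outages < sites:
        raise ValueError("tolerated_outages must be in [0, sites)")
    # Surviving capacity is (sites - tolerated_outages) / sites of the total,
    # so the remainder has to be held in reserve.
    return Fraction(tolerated_outages, sites)


if __name__ == "__main__":
    print(reserved_fraction(3))  # three sites, one outage tolerated
```

With three sites and one tolerated outage this gives 1/3, the share of hardware that would otherwise sit idle and that HPC workloads can use.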

By virtualising HPC GPUs we are able to make better use of the resources and gain flexibility. For example, besides serving AI and machine learning workloads, we can temporarily give a virtual research workspace the power of GPU acceleration, which enables our researchers to work interactively.