Performance under different NUMA node counts on 2nd Generation AMD EPYC CPUs
According to this, there are recommended Nodes Per Socket (NPS) settings for different workload types. I am particularly interested in the following sentence:
“Highly parallel workloads like many HPC use cases might benefit from NPS4”
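Before comparing NPS settings, it helps to confirm that the BIOS change actually took effect. The sketch below assumes a Linux host, where the kernel exposes one `node<N>` directory per NUMA node under `/sys/devices/system/node`. The function name `count_numa_nodes` is my own, not part of any library. With NPS4 a single-socket system would normally report 4 nodes, although other BIOS options can change that count.

```python
import re
from pathlib import Path


def count_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Count NUMA nodes by listing node<N> directories under sysfs.

    Assumes the Linux sysfs layout; returns 0 if the directory is absent
    (for example on a non-Linux system).
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return 0
    # Only directories named node0, node1, ... count as NUMA nodes.
    return sum(
        1
        for entry in root.iterdir()
        if entry.is_dir() and re.fullmatch(r"node\d+", entry.name)
    )


if __name__ == "__main__":
    print("NUMA nodes visible to the OS:", count_numa_nodes())
```

Running this after each BIOS change gives a quick sanity check that the benchmark is really running under the intended NPS value.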
The picture shows the performance impact of different NPS settings for different mesh sizes of the same model. The results show that setting NPS to 4 speeds up the computation when the mesh is large and 32 cores are used. However, in a few cases where the mesh is smaller and 32 cores are used, the elapsed time increases slightly.
Performance impact of different NPS settings for different mesh sizes of the same model
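The comparison in the figure comes down to a relative change in elapsed time between two NPS settings. A minimal sketch of that arithmetic follows. The function name `percent_change` is my own, and the numbers in the usage example are placeholders, not the measured results.

```python
def percent_change(elapsed_baseline, elapsed_candidate):
    """Relative change in elapsed time, in percent.

    Negative means the candidate setting (e.g. NPS4) is faster than the
    baseline (e.g. NPS1); positive means it is slower.
    """
    if elapsed_baseline <= 0:
        raise ValueError("baseline elapsed time must be positive")
    return (elapsed_candidate - elapsed_baseline) / elapsed_baseline * 100.0


if __name__ == "__main__":
    # Placeholder timings in seconds, for illustration only (not measured data).
    print(percent_change(100.0, 90.0))
```

A negative value corresponds to the large-mesh cases where NPS4 helps. A small positive value corresponds to the few small-mesh cases where the elapsed time grows slightly.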
Summary: setting NPS to 4 is the better choice overall, especially for large mesh sizes.