I recently worked on some hashtable lookup code that could benefit from SIMD optimizations and from microbenchmarking its modulus and hash functions to improve the code quality. However, modern CPUs are complex, and several factors cause fluctuations during benchmarks, such as core design, access times across the CPU cache hierarchy, and CPU frequency adjustments for thermal balancing.

To get more stable benchmarking results on my CPU, an AMD 7950X3D with two types of cores, I looked into the LLVM Benchmarking Tips page, which has excellent information. It suggests using cset shield to isolate the CPU cores for exclusive benchmark runs, but unfortunately, it relies on cgroup v1 while my system (Ubuntu 22.04) uses systemd 249 with cgroup v2.

I created a helper script to set up an isolated CPU partition and stabilize execution fluctuations as much as possible for more reliable benchmark runs. It assumes a CPU with a core layout similar to the AMD 7950X3D, which has 16 cores (32 SMT threads): eight cores have a large cache and the other eight support higher frequencies. The script isolates physical cores 6, 7 and 8, 9 for benchmarks and disables their SMT siblings 22-25, so no sibling thread competes for a core during benchmark runs. That gives me two cores of each kind to run benchmarks on.
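To make the SMT part more concrete, here is a minimal sketch of how sibling threads can be taken offline on Linux. It is not the linked script. It assumes the standard sysfs CPU hotplug files, and the names SIBLINGS, APPLY and set_smt_siblings are made up for this illustration. By default it only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch only, not the author's script: print (or, with APPLY=1 and root
# privileges, perform) the sysfs writes that take SMT sibling threads offline.
# SIBLINGS, APPLY and set_smt_siblings are hypothetical names.
SIBLINGS="22 23 24 25"   # SMT siblings of cores 6-9, as described in the text

set_smt_siblings() {
  local state=$1         # 0 = take offline, 1 = bring back online
  local cpu node
  for cpu in $SIBLINGS; do
    node="/sys/devices/system/cpu/cpu${cpu}/online"
    if [ "${APPLY:-0}" = 1 ]; then
      echo "$state" > "$node"        # real write, requires root
    else
      echo "echo $state > $node"     # dry run: only show the command
    fi
  done
}

set_smt_siblings 0
```

Calling set_smt_siblings 1 shows the reverse operation. The dry-run default is deliberate, so the numbers can be checked against the actual CPU topology before anything is changed.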

The script does the following:

When called without an argument, the script undoes the above settings. For benchmarking, it is invoked with a process id, normally a shell PID, to move the shell onto the isolated cores. Benchmark runs started from within this shell will inherit the CPU partition. Using the taskset utility, benchmarks can be forced to run on a particular core, e.g.:

bash> echo $$
67207
bash> ./benchmarking.sh 67207
[...core.isolation...]
bash> taskset -c 9 perf stat -d -r 3 ./my-benchmark
[...stats...]

This script is not a silver bullet for perfect benchmarking, but it can help minimize execution noise and get more stable results on contemporary CPUs. The number of cores and SMT siblings is easy to adapt, with a bit of help from lscpu or similar tools. Feel free to let me know if you have any suggestions or improvements to this script:

benchmarking.sh
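When adapting the core numbers to a different CPU, the SMT pairing can also be read directly from sysfs instead of lscpu. The following is a small sketch, assuming a Linux system that exposes topology/thread_siblings_list for each CPU. CPU_SYSFS and list_smt_siblings are hypothetical names used only here.

```shell
#!/usr/bin/env bash
# Sketch: list the SMT sibling threads of every CPU as reported by Linux sysfs.
# CPU_SYSFS is a hypothetical override; by default the real sysfs path is used.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

list_smt_siblings() {
  local dir f
  for dir in "$CPU_SYSFS"/cpu[0-9]*; do
    f="$dir/topology/thread_siblings_list"
    [ -r "$f" ] || continue      # skip CPUs without readable topology info
    printf '%s: %s\n' "${dir##*/}" "$(cat "$f")"
  done
}

list_smt_siblings
```

On a layout like the one described above, the line for cpu6 should pair it with thread 22. If the pairing looks different, the core and sibling numbers need adjusting. Note that a thread that is currently offline may not show up in this listing.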

For more information about benchmarking on Linux, please refer to the following resources:
