Benchmarks: Integrate lmbench/lat_mem_rd to fastpath (!40)

Aishwarya Rambhadran requested to merge aisram01/latmemrd_implemnt into main Nov 24, 2025

Add the files needed to integrate the lmbench/lat_mem_rd workload into the Fastpath benchmarks library. This benchmark measures memory read latency as a function of working-set size. It traverses a linked list of pointers spread across a region of memory and reports the average time per dereference.
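The pointer-chasing idea can be illustrated with a minimal Python sketch. It is not the lmbench implementation (which is written in C); the names `build_chain` and `chase` are hypothetical, and interpreter overhead dominates the timing here, so the numbers only show the structure of the measurement, not real memory latency.

```python
import random
import time


def build_chain(n_slots: int, seed: int = 0) -> list:
    """Return a list where chain[i] is the index of the next slot to visit.

    The slots form one cycle in shuffled order, so every load depends on
    the result of the previous one (the core of a pointer-chase benchmark).
    """
    order = list(range(n_slots))
    random.Random(seed).shuffle(order)
    chain = [0] * n_slots
    # Link each slot to its successor in the shuffled order, wrapping around.
    for here, nxt in zip(order, order[1:] + order[:1]):
        chain[here] = nxt
    return chain


def chase(chain: list, hops: int) -> tuple:
    """Follow the chain for `hops` dependent loads.

    Returns (final index, average nanoseconds per hop).
    """
    idx = 0
    start = time.perf_counter_ns()
    for _ in range(hops):
        idx = chain[idx]
    elapsed = time.perf_counter_ns() - start
    return idx, elapsed / hops
```

Growing `n_slots` corresponds to growing the working set: once the chain no longer fits in a cache level, the average time per hop in a native implementation rises.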

Update the benchmarks library with a lat-mem-rd.yaml plan that runs the basic lmbench latency tests. The plan exposes options for working-set size (MB), stride(s) in bytes, parallel process count, number of repeats per test, and the true memory read (random access, -t) flag. Supported params are the working-set size sweep (start, end, multiplicative step), parallel, repeat, work-set-stride and work-set-random, so users can tune the lat-mem-rd tests as needed. The parallel, repeat, work-set-size -start and work-set-random params are mandatory.
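A working-set size sweep with a multiplicative step could be expanded roughly as follows. This is a sketch under assumptions: the function name `sweep_sizes_mb` is hypothetical, the end value is treated as inclusive, and a plan that gives only the start value is assumed to yield a single size. The actual exec.py may differ on each of these points.

```python
from typing import List, Optional


def sweep_sizes_mb(start: int, end: Optional[int] = None, step: int = 2) -> List[int]:
    """Expand (start, end, multiplicative step) into working-set sizes in MB.

    Assumptions: `end` is inclusive, and omitting it yields only `start`.
    """
    if start <= 0:
        raise ValueError("start must be positive")
    if end is None:
        return [start]
    if step <= 1:
        raise ValueError("multiplicative step must be greater than 1")
    sizes = []
    size = start
    while size <= end:
        sizes.append(size)
        size *= step
    return sizes
```

For example, `sweep_sizes_mb(1, 16, 2)` yields the sizes 1, 2, 4, 8 and 16 MB.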

Add a Dockerfile that builds a containerized environment for running the lmbench/lat_mem_rd benchmark. The Dockerfile installs dependencies, clones the lmbench repository, and sets the exec.py script as the container entry point.

Implement the exec.py script to handle benchmark execution and result processing. Generate lat-mem-rd test cases from the working-set size (MB) sweep values, the parallel process count for each working-set size, repeats, stride(s) in bytes, and the true memory read (random access) flag. For each working-set size, generate tests according to the random access flag given in the plan. If the flag is 'both', generate separate test cases with the flag enabled and disabled. Cap the working-set size sweep at 50% of the total RAM on the selected SUT.

Store the output of each lat_mem_rd test in a text file named after the configuration used for the test. Each output file contains the results for every stride as a separate section. Parse the memory read latency (ns) for each stride from the output files of the tested working-set sizes. Split the test case results into Fastpath result classes based on the random flag(s) configured for the test, then process and save them. Name the result class sequential_read_latency for tests with the flag disabled and random_read_latency for tests with the flag enabled.
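The test case expansion and output parsing described above might look roughly like this. It is a sketch, not the exec.py in this merge request. All function names are hypothetical. It assumes that sizes above the 50% RAM cap are dropped rather than clamped, that the random flag takes the string values 'true', 'false' or 'both', and that lat_mem_rd prints each stride section as a `"stride=N` header followed by `size_MB latency_ns` lines.

```python
import re
from itertools import product

# Result class names taken from the description above.
RESULT_CLASS = {False: "sequential_read_latency", True: "random_read_latency"}

# Assumed section header format; the leading quote is treated as optional.
STRIDE_RE = re.compile(r'^"?stride=(\d+)')


def cap_sizes(sizes_mb, total_ram_mb, fraction=0.5):
    """Keep only sizes within the RAM cap (assumption: larger sizes are dropped)."""
    limit = total_ram_mb * fraction
    return [s for s in sizes_mb if s <= limit]


def random_modes(flag):
    """Map the plan's random access flag to the -t settings to test."""
    modes = {"false": [False], "true": [True], "both": [False, True]}
    return modes[flag]


def expand_cases(sizes_mb, random_flag):
    """One test case per (working-set size, random access mode) pair."""
    return [
        {"size_mb": size, "random": rnd, "result_class": RESULT_CLASS[rnd]}
        for size, rnd in product(sizes_mb, random_modes(random_flag))
    ]


def parse_output(text):
    """Return {stride_bytes: [(size_mb, latency_ns), ...]} from one output file."""
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = STRIDE_RE.match(line)
        if match:
            # A new stride section starts here.
            current = int(match.group(1))
            results[current] = []
            continue
        parts = line.split()
        if current is None or len(parts) != 2:
            continue
        try:
            results[current].append((float(parts[0]), float(parts[1])))
        except ValueError:
            # Skip lines that are not numeric data points.
            continue
    return results
```

Keeping the result class on each test case means the parsed latencies can be routed to sequential_read_latency or random_read_latency without re-reading the plan.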

Signed-off-by: Aishwarya Rambhadran <aishwarya.rambhadran@arm.com>

