| perf-bench(1) |
| ============= |
| |
| NAME |
| ---- |
| perf-bench - General framework for benchmark suites |
| |
| SYNOPSIS |
| -------- |
| [verse] |
| 'perf bench' [<common options>] <subsystem> <suite> [<options>] |
| |
| DESCRIPTION |
| ----------- |
| This 'perf bench' command is a general framework for benchmark suites. |
| |
| COMMON OPTIONS |
| -------------- |
| -r:: |
| --repeat=:: |
| Specify number of times to repeat the run (default 10). |
| |
| -f:: |
| --format=:: |
| Specify format style. |
| Current available format styles are: |
| |
| 'default':: |
| Default style. This is mainly for human reading. |
| --------------------- |
| % perf bench sched pipe # with no style specified |
| (executing 1000000 pipe operations between two tasks) |
| Total time:5.855 sec |
| 5.855061 usecs/op |
| 170792 ops/sec |
| --------------------- |
| |
| 'simple':: |
| This simple style is friendly for automated |
| processing by scripts. |
| --------------------- |
| % perf bench --format=simple sched pipe # specified simple |
| 5.988 |
| --------------------- |
| |
| SUBSYSTEM |
| --------- |
| |
| 'sched':: |
| Scheduler and IPC mechanisms. |
| |
| 'syscall':: |
| System call performance (throughput). |
| |
| 'mem':: |
| Memory access performance. |
| |
| 'numa':: |
| NUMA scheduling and MM benchmarks. |
| |
| 'futex':: |
| Futex stressing benchmarks. |
| |
| 'epoll':: |
| Eventpoll (epoll) stressing benchmarks. |
| |
| 'internals':: |
| Benchmark internal perf functionality. |
| |
| 'all':: |
| All benchmark subsystems. |
| |
| SUITES FOR 'sched' |
| ~~~~~~~~~~~~~~~~~~ |
| *messaging*:: |
| Suite for evaluating performance of scheduler and IPC mechanisms. |
| Based on hackbench by Rusty Russell. |
| |
| Options of *messaging* |
| ^^^^^^^^^^^^^^^^^^^^^^ |
| -p:: |
| --pipe:: |
| Use pipe() instead of socketpair() |
| |
| -t:: |
| --thread:: |
| Be multi thread instead of multi process |
| |
| -g:: |
| --group=:: |
| Specify number of groups |
| |
| -l:: |
| --nr_loops=:: |
| Specify number of loops |
| |
| Example of *messaging* |
| ^^^^^^^^^^^^^^^^^^^^^^ |
| |
| --------------------- |
| % perf bench sched messaging # run with default |
| options (20 sender and receiver processes per group) |
| (10 groups == 400 processes run) |
| |
| Total time:0.308 sec |
| |
| % perf bench sched messaging -t -g 20 # be multi-thread, with 20 groups |
| (20 sender and receiver threads per group) |
| (20 groups == 800 threads run) |
| |
| Total time:0.582 sec |
| --------------------- |
| |
| *pipe*:: |
| Suite for pipe() system call. |
| Based on pipe-test-1m.c by Ingo Molnar. |
| |
| Options of *pipe* |
| ^^^^^^^^^^^^^^^^^ |
| -l:: |
| --loop=:: |
| Specify number of loops. |
| |
| Example of *pipe* |
| ^^^^^^^^^^^^^^^^^ |
| |
| --------------------- |
| % perf bench sched pipe |
| (executing 1000000 pipe operations between two tasks) |
| |
| Total time:8.091 sec |
| 8.091833 usecs/op |
| 123581 ops/sec |
| |
| % perf bench sched pipe -l 1000 # loop 1000 |
| (executing 1000 pipe operations between two tasks) |
| |
| Total time:0.016 sec |
| 16.948000 usecs/op |
| 59004 ops/sec |
| --------------------- |
| |
| SUITES FOR 'syscall' |
| ~~~~~~~~~~~~~~~~~~ |
| *basic*:: |
| Suite for evaluating performance of core system call throughput (both usecs/op and ops/sec metrics). |
| This uses a single thread simply doing getppid(2), which is a simple syscall where the result is not |
| cached by glibc. |
| |
| |
| SUITES FOR 'mem' |
| ~~~~~~~~~~~~~~~~ |
| *memcpy*:: |
| Suite for evaluating performance of simple memory copy in various ways. |
| |
| Options of *memcpy* |
| ^^^^^^^^^^^^^^^^^^^ |
| -l:: |
| --size:: |
| Specify size of memory to copy (default: 1MB). |
| Available units are B, KB, MB, GB and TB (case insensitive). |
| |
| -f:: |
| --function:: |
| Specify function to copy (default: default). |
| Available functions are depend on the architecture. |
| On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported. |
| |
| -l:: |
| --nr_loops:: |
| Repeat memcpy invocation this number of times. |
| |
| -c:: |
| --cycles:: |
| Use perf's cpu-cycles event instead of gettimeofday syscall. |
| |
| *memset*:: |
| Suite for evaluating performance of simple memory set in various ways. |
| |
| Options of *memset* |
| ^^^^^^^^^^^^^^^^^^^ |
| -l:: |
| --size:: |
| Specify size of memory to set (default: 1MB). |
| Available units are B, KB, MB, GB and TB (case insensitive). |
| |
| -f:: |
| --function:: |
| Specify function to set (default: default). |
| Available functions are depend on the architecture. |
| On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported. |
| |
| -l:: |
| --nr_loops:: |
| Repeat memset invocation this number of times. |
| |
| -c:: |
| --cycles:: |
| Use perf's cpu-cycles event instead of gettimeofday syscall. |
| |
| SUITES FOR 'numa' |
| ~~~~~~~~~~~~~~~~~ |
| *mem*:: |
| Suite for evaluating NUMA workloads. |
| |
| SUITES FOR 'futex' |
| ~~~~~~~~~~~~~~~~~~ |
| *hash*:: |
| Suite for evaluating hash tables. |
| |
| *wake*:: |
| Suite for evaluating wake calls. |
| |
| *wake-parallel*:: |
| Suite for evaluating parallel wake calls. |
| |
| *requeue*:: |
| Suite for evaluating requeue calls. |
| |
| *lock-pi*:: |
| Suite for evaluating futex lock_pi calls. |
| |
| SUITES FOR 'epoll' |
| ~~~~~~~~~~~~~~~~~~ |
| *wait*:: |
| Suite for evaluating concurrent epoll_wait calls. |
| |
| *ctl*:: |
| Suite for evaluating multiple epoll_ctl calls. |
| |
| SUITES FOR 'internals' |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| *synthesize*:: |
| Suite for evaluating perf's event synthesis performance. |
| |
| SEE ALSO |
| -------- |
| linkperf:perf[1] |