| perf-mem(1) |
| =========== |
| |
| NAME |
| ---- |
| perf-mem - Profile memory accesses |
| |
| SYNOPSIS |
| -------- |
| [verse] |
| 'perf mem' [<options>] (record [<command>] | report) |
| |
| DESCRIPTION |
| ----------- |
| "perf mem record" runs a command and gathers memory operation data |
| from it into perf.data. perf record options are accepted and passed through. |
| |
| "perf mem report" displays the result. It invokes perf report with the |
| right set of options to display a memory access profile. By default, loads |
| and stores are sampled. Use the -t option to limit to loads or stores. |
| |
| Note that on Intel systems the memory latency reported is the use-latency, |
| not the pure load (or store) latency. Use-latency includes any pipeline |
| queuing delays in addition to the memory subsystem latency. |
| |
| On Arm64 this uses SPE to sample load and store operations, therefore hardware |
| and kernel support is required. See linkperf:perf-arm-spe[1] for a setup guide. |
| Due to the statistical nature of SPE sampling, not every memory operation will |
| be sampled. |
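| |
| The record/report workflow described above can be sketched as follows. This is |
| a minimal illustration, not a definitive recipe: ./my_app is a placeholder |
| workload, the run() helper is hypothetical and only prints each command unless |
| RUN=1 is set, and placing -t before the subcommand is an assumption taken from |
| the synopsis. |
| |

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command line; execute it only if RUN=1.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Sample both loads and stores (the default) while running the workload.
# ./my_app is a placeholder, not a real program.
run perf mem record ./my_app

# Alternatively, limit sampling to loads with the -t option.
run perf mem -t load record ./my_app

# Display the memory access profile from perf.data.
run perf mem report
```

| |
| Run with RUN=1 on a system with the required hardware and kernel support to |
| execute the commands instead of only printing them. |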
| |
| COMMON OPTIONS |
| -------------- |
| -f:: |
| --force:: |
| Don't do ownership validation |
| |
| -t:: |
| --type=<type>:: |
| Select the memory operation type: load or store (default: load,store) |
| |
| -v:: |
| --verbose:: |
| Be more verbose (show counter open errors, etc.) |
| |
| -p:: |
| --phys-data:: |
| Record/Report sample physical addresses |
| |
| --data-page-size:: |
| Record/Report sample data address page size |
| |
| RECORD OPTIONS |
| -------------- |
| <command>...:: |
| Any command you can specify in a shell. |
| |
| -e:: |
| --event <event>:: |
| Event selector. Use 'perf mem record -e list' to list available events. |
| |
| -K:: |
| --all-kernel:: |
| Configure all used events to run in kernel space. |
| |
| -U:: |
| --all-user:: |
| Configure all used events to run in user space. |
| |
| --ldlat <n>:: |
| Specify the desired latency for load events. Supported on Intel and Arm64 |
| processors only. Ignored on other architectures. |
| |
| REPORT OPTIONS |
| -------------- |
| -i:: |
| --input=<file>:: |
| Input file name. |
| |
| -C:: |
| --cpu=<cpu>:: |
| Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a |
| comma-separated list with no space: 0,1. Ranges of CPUs are specified with - |
| like 0-2. Default is to monitor all CPUs. |
| |
| -D:: |
| --dump-raw-samples:: |
| Dump the raw decoded samples on the screen in a format that is easy to parse with |
| one sample per line. |
| |
| -s:: |
| --sort=<key>:: |
| Group results by the given key(s); multiple keys can be specified |
| in CSV format. The keys specific to memory samples are: |
| symbol_daddr, symbol_iaddr, dso_daddr, locked, tlb, mem, snoop, |
| dcacheline, phys_daddr, data_page_size, blocked. |
| |
| - symbol_daddr: name of data symbol being accessed at the time of sample |
| - symbol_iaddr: name of code symbol being executed at the time of sample |
| - dso_daddr: name of library or module containing the data being accessed |
| at the time of the sample |
| - locked: whether the bus was locked at the time of the sample |
| - tlb: type of tlb access for the data at the time of the sample |
| - mem: type of memory access for the data at the time of the sample |
| - snoop: type of snoop (if any) for the data at the time of the sample |
| - dcacheline: the cacheline the data address is on at the time of the sample |
| - phys_daddr: physical address of data being accessed at the time of sample |
| - data_page_size: the page size of the data being accessed at the time of sample |
| - blocked: reason for blocked load access for the data at the time of the sample |
| |
| The default sort keys are changed to local_weight, mem, sym, dso, |
| symbol_daddr, dso_daddr, snoop, tlb, locked, blocked, local_ins_lat. |
| |
| -T:: |
| --type-profile:: |
| Show the data-type profile instead of code symbols. This requires |
| debug information and changes the default sort keys to: |
| mem, snoop, tlb, type. |
| |
| -U:: |
| --hide-unresolved:: |
| Only display entries resolved to a symbol. |
| |
| -x:: |
| --field-separator=<separator>:: |
| Specify the field separator used when dumping raw samples (-D option). By default, |
| the separator is the space character. |
| |
| In addition, all perf report options are valid for report, and all perf record |
| options are valid for record. |
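| |
| To illustrate the CPU list syntax accepted by -C (comma-separated entries with |
| no space, ranges written with -), the sketch below expands such a list into |
| individual CPU numbers. The expand_cpus function is a hypothetical helper for |
| illustration only and is not how perf itself parses the option. |
| |

```shell
#!/bin/sh
# Hypothetical helper: expand a CPU list such as "0,2-4" into "0 2 3 4".
expand_cpus() {
    # Split the argument on commas into positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $1
    IFS=$old_ifs

    out=""
    for part in "$@"; do
        case $part in
            *-*)
                # A range "lo-hi": emit every CPU number from lo to hi inclusive.
                lo=${part%-*}
                hi=${part#*-}
                i=$lo
                while [ "$i" -le "$hi" ]; do
                    out="$out $i"
                    i=$((i + 1))
                done
                ;;
            *)
                # A single CPU number.
                out="$out $part"
                ;;
        esac
    done
    # Strip the leading space before printing.
    echo "${out# }"
}

expand_cpus 0,2-4   # prints: 0 2 3 4
```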
| |
| SEE ALSO |
| -------- |
| linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1] |