| perf-script(1) |
| ============= |
| |
| NAME |
| ---- |
| perf-script - Read perf.data (created by perf record) and display trace output |
| |
| SYNOPSIS |
| -------- |
| [verse] |
| 'perf script' [<options>] |
| 'perf script' [<options>] record <script> [<record-options>] <command> |
| 'perf script' [<options>] report <script> [script-args] |
| 'perf script' [<options>] <script> <required-script-args> [<record-options>] <command> |
| 'perf script' [<options>] <top-script> [script-args] |
| |
| DESCRIPTION |
| ----------- |
| This command reads the input file and displays the trace recorded. |
| |
| There are several variants of perf script: |
| |
| 'perf script' to see a detailed trace of the workload that was |
| recorded. |
| |
| You can also run a set of pre-canned scripts that aggregate and |
| summarize the raw trace data in various ways (the list of scripts is |
| available via 'perf script -l'). The following variants allow you to |
| record and run those scripts: |
| |
| 'perf script record <script> <command>' to record the events required |
| for 'perf script report'. <script> is the name displayed in the |
| output of 'perf script --list' i.e. the actual script name minus any |
| language extension. If <command> is not specified, the events are |
| recorded using the -a (system-wide) 'perf record' option. |
| |
| 'perf script report <script> [args]' to run and display the results |
| of <script>. <script> is the name displayed in the output of 'perf |
| script --list' i.e. the actual script name minus any language |
| extension. The perf.data output from a previous run of 'perf script |
| record <script>' is used and should be present for this command to |
| succeed. [args] refers to the (mainly optional) args expected by |
| the script. |
| |
| 'perf script <script> <required-script-args> <command>' to both |
| record the events required for <script> and to run the <script> |
| using 'live-mode' i.e. without writing anything to disk. <script> |
| is the name displayed in the output of 'perf script --list' i.e. the |
| actual script name minus any language extension. If <command> is |
| not specified, the events are recorded using the -a (system-wide) |
| 'perf record' option. If <script> has any required args, they |
| should be specified before <command>. This mode doesn't allow for |
| optional script args to be specified; if optional script args are |
| desired, they can be specified using separate 'perf script record' |
| and 'perf script report' commands, with the stdout of the record step |
| piped to the stdin of the report script, using the '-o -' and '-i -' |
| options of the corresponding commands. |
| |
| 'perf script <top-script>' to both record the events required for |
| <top-script> and to run the <top-script> using 'live-mode' |
| i.e. without writing anything to disk. <top-script> is the name |
| displayed in the output of 'perf script --list' i.e. the actual |
| script name minus any language extension; a <top-script> is defined |
| as any script name ending with the string 'top'. |
| |
| [<record-options>] can be passed to the record steps of 'perf script |
| record' and 'live-mode' variants; this isn't possible however for |
| <top-script> 'live-mode' or 'perf script report' variants. |
| |
| See the 'SEE ALSO' section for links to language-specific |
| information on how to write and run your own trace scripts. |
| |
| OPTIONS |
| ------- |
| <command>...:: |
| Any command you can specify in a shell. |
| |
| -D:: |
| --dump-raw-trace=:: |
| Display verbose dump of the trace data. |
| |
| --dump-unsorted-raw-trace=:: |
| Same as --dump-raw-trace but not sorted in time order. |
| |
| -L:: |
| --Latency=:: |
| Show latency attributes (irqs/preemption disabled, etc). |
| |
| -l:: |
| --list=:: |
| Display a list of available trace scripts. |
| |
| -s ['lang']:: |
| --script=:: |
| Process trace data with the given script ([lang]:script[.ext]). |
| If the string 'lang' is specified in place of a script name, a |
| list of supported languages will be displayed instead. |
| |
| -g:: |
| --gen-script=:: |
| Generate perf-script.[ext] starter script for given language, |
| using current perf.data. |
| |
| --dlfilter=<file>:: |
| Filter sample events using the given shared object file. |
| Refer to linkperf:perf-dlfilter[1]. |
| |
| --dlarg=<arg>:: |
| Pass 'arg' as an argument to the dlfilter. --dlarg may be repeated |
| to add more arguments. |
| |
| --list-dlfilters:: |
| Display a list of available dlfilters. Use with option -v (must come |
| before option --list-dlfilters) to show long descriptions. |
| |
| -a:: |
| Force system-wide collection. Scripts run without a <command> |
| normally use -a by default, while scripts run with a <command> |
| normally don't - this option allows the latter to be run in |
| system-wide mode. |
| |
| -i:: |
| --input=:: |
| Input file name. (default: perf.data unless stdin is a fifo) |
| |
| -d:: |
| --debug-mode:: |
| Do various checks like samples ordering and lost events. |
| |
| -F:: |
| --fields:: |
| Comma separated list of fields to print. Options are: |
| comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff, |
| srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, |
| brstackinsn, brstackinsnlen, brstackoff, callindent, insn, insnlen, synth, |
| phys_addr, metric, misc, srccode, ipc, data_page_size, code_page_size, ins_lat, |
| machine_pid, vcpu, cgroup, retire_lat. |
| Field list can be prepended with the type, trace, sw or hw, |
| to indicate to which event type the field list applies. |
| e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace |
| |
| perf script -F <fields> |
| |
| is equivalent to: |
| |
| perf script -F trace:<fields> -F sw:<fields> -F hw:<fields> |
| |
| i.e., the specified fields apply to all event types if the type string |
| is not given. |
| |
| In addition to overriding fields, it is also possible to add or remove |
| fields from the defaults. For example |
| |
| -F -cpu,+insn |
| |
| removes the cpu field and adds the insn field. Adding/removing fields |
| cannot be mixed with normal overriding. |
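| |
| The override and add/remove rules above can be sketched in a few lines of |
| Python. This is an illustration of the described behaviour only, not perf's |
| implementation; the function name and the EXAMPLE_DEFAULTS list are |
| hypothetical and do not reflect the real default fields of any event type. |

```python
# Hypothetical default field list, used only for this illustration.
EXAMPLE_DEFAULTS = ["comm", "tid", "cpu", "time", "event"]


def apply_field_spec(defaults, spec):
    """Return the field list that results from applying one -F spec."""
    items = [f for f in spec.split(",") if f]
    modifiers = [f for f in items if f[0] in "+-"]
    # Per the text above, +/- entries cannot be mixed with plain overriding.
    if modifiers and len(modifiers) != len(items):
        raise ValueError("adding/removing fields cannot be mixed with overriding")
    if not modifiers:
        return items  # a plain list replaces the defaults
    fields = list(defaults)
    for item in items:
        name = item[1:]
        if item[0] == "+" and name not in fields:
            fields.append(name)
        elif item[0] == "-" and name in fields:
            fields.remove(name)
    return fields


print(apply_field_spec(EXAMPLE_DEFAULTS, "-cpu,+insn"))
```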
| |
| The arguments are processed in the order received. A later usage can |
| reset a prior request. e.g.: |
| |
| -F trace: -F comm,tid,time,ip,sym |
| |
| The first -F suppresses trace events (field list is ""), but then the |
| second invocation sets the fields to comm,tid,time,ip,sym. In this case a |
| warning is given to the user: |
| |
| "Overriding previous field request for all events." |
| |
| Alternatively, consider the order: |
| |
| -F comm,tid,time,ip,sym -F trace: |
| |
| The first -F sets the fields for all events and the second -F |
| suppresses trace events. The user is given a warning message about |
| the override, and the result of the above is that only S/W and H/W |
| events are displayed with the given fields. |
| |
| It's possible to add/remove fields only for a specific event type: |
| |
| -Fsw:-cpu,-period |
| |
| removes cpu and period from software events. |
| |
| For the 'wildcard' form (no type given), if a user-selected field is invalid |
| for an event type, a message is displayed to the user that the field is |
| ignored for that type. For example: |
| |
| $ perf script -F comm,tid,trace |
| 'trace' not valid for hardware events. Ignoring. |
| 'trace' not valid for software events. Ignoring. |
| |
| Alternatively, if the type is given and an invalid field is specified, it |
| is an error. For example: |
| |
| perf script -v -F sw:comm,tid,trace |
| 'trace' not valid for software events. |
| |
| At this point usage is displayed, and perf-script exits. |
| |
| The flags field is synthesized and may have a value when Instruction |
| Trace decoding. The flags are "bcrosyiABExghDt" which stand for branch, |
| call, return, conditional, system, asynchronous, interrupt, |
| transaction abort, trace begin, trace end, in transaction, VM-Entry, |
| VM-Exit, interrupt disabled and interrupt disable toggle respectively. |
| Known combinations of flags are printed more nicely e.g. |
| "call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b", |
| "int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs", |
| "async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB", |
| "tr end" for "bE", "vmentry" for "bcg", "vmexit" for "bch". |
| However the "x", "D" and "t" flags will be displayed separately in those |
| cases e.g. "jcc (xD)" for a conditional branch within a transaction |
| with interrupts disabled. Note, interrupts becoming disabled is "t", |
| whereas interrupts becoming enabled is "Dt". |
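| |
| As a rough illustration of the naming rule above, the following Python |
| sketch maps a raw flag string to its printed form. The table is copied from |
| the text; the function name and the fallback of returning the raw letters |
| for an unlisted combination are assumptions, not perf's actual code. |

```python
# Known combinations, copied from the description above.
KNOWN_COMBINATIONS = {
    "bc": "call", "br": "return", "bo": "jcc", "b": "jmp",
    "bci": "int", "bri": "iret", "bcs": "syscall", "brs": "sysret",
    "by": "async", "bcyi": "hw int", "bA": "tx abrt", "bB": "tr strt",
    "bE": "tr end", "bcg": "vmentry", "bch": "vmexit",
}
# These flags are shown separately, in parentheses.
SEPARATE_FLAGS = "xDt"


def pretty_flags(flags):
    """Return a readable name for a flag string such as 'boxD'."""
    core = "".join(c for c in flags if c not in SEPARATE_FLAGS)
    extra = "".join(c for c in flags if c in SEPARATE_FLAGS)
    name = KNOWN_COMBINATIONS.get(core)
    if name is None:
        return flags  # assumption: unknown combinations stay as raw letters
    return f"{name} ({extra})" if extra else name


print(pretty_flags("boxD"))
```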
| |
| The callindent field is synthesized and may have a value when |
| Instruction Trace decoding. For calls and returns, it will display the |
| name of the symbol indented with spaces to reflect the stack depth. |
| |
| When doing instruction trace decoding insn and insnlen give the |
| instruction bytes and the instruction length of the current |
| instruction. |
| |
| The synth field is used by synthesized events which may be created when |
| Instruction Trace decoding. |
| |
| The ipc (instructions per cycle) field is synthesized and may have a value when |
| Instruction Trace decoding. |
| |
| The machine_pid and vcpu fields are derived from data resulting from using |
| perf inject to insert a perf.data file recorded inside a virtual machine into |
| a perf.data file recorded on the host at the same time. |
| |
| The cgroup field requires samples to carry the cgroup id, which is saved |
| when the "--all-cgroups" option is passed to 'perf record'. |
| |
| Finally, a user may not set fields to none for all event types. |
| i.e., -F "" is not allowed. |
| |
| The brstack output includes branch related information with raw addresses using the |
| /v/v/v/v/cycles syntax in the following order: |
| FROM: branch source instruction |
| TO : branch target instruction |
| M/P/-: M=branch target mispredicted or branch direction was mispredicted, P=target predicted or direction predicted, -=not supported |
| X/- : X=branch inside a transactional region, -=not in transaction region or not supported |
| A/- : A=TSX abort entry, -=not aborted region or not supported |
| cycles |
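| |
| A post-processing script could split one brstack entry along the field |
| order listed above. The Python sketch below assumes the six slash-separated |
| fields described here; the sample entry is made up, and any fields that a |
| given perf version appends after cycles are simply ignored. |

```python
from typing import NamedTuple


class BranchEntry(NamedTuple):
    from_addr: int     # FROM: branch source instruction
    to_addr: int       # TO: branch target instruction
    prediction: str    # M, P or -
    in_tx: str         # X or -
    abort: str         # A or -
    cycles: int


def parse_brstack_entry(text):
    """Parse one 'FROM/TO/M/X/A/cycles' entry into a BranchEntry."""
    parts = text.split("/")
    if len(parts) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(parts)}")
    frm, to, prediction, in_tx, abort, cycles = parts[:6]
    return BranchEntry(int(frm, 16), int(to, 16), prediction, in_tx, abort, int(cycles))


# Hypothetical entry, for illustration only.
print(parse_brstack_entry("0x4007a0/0x4007b4/P/-/-/3"))
```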
| |
| The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible. |
| |
| When brstackinsn is specified the full assembler sequences of branch sequences for each sample |
| are printed. This is the full execution path leading to the sample. This is only supported when the |
| sample was recorded with perf record -b or -j any. |
| |
| Use brstackinsnlen to print the length of each instruction shown by brstackinsn. |
| For example, you can't know the next sequential instruction after an |
| unconditional branch unless you calculate it from the length of the branch instruction. |
| |
| The brstackoff field will print an offset into a specific dso/binary. |
| |
| With the metric option perf script can compute metrics for |
| sampling periods, similar to perf stat. This requires |
| specifying a group with multiple events defining metrics with the :S option |
| for perf record. perf will sample on the first event, and |
| print computed metrics for all the events in the group. Please note |
| that the metric computed is averaged over the whole sampling |
| period (since the last sample), not just for the sample point. |
| |
| For sample events it's possible to display the misc field with the -F +misc option. |
| The following letters are displayed for each bit: |
| |
| PERF_RECORD_MISC_KERNEL K |
| PERF_RECORD_MISC_USER U |
| PERF_RECORD_MISC_HYPERVISOR H |
| PERF_RECORD_MISC_GUEST_KERNEL G |
| PERF_RECORD_MISC_GUEST_USER g |
| PERF_RECORD_MISC_MMAP_DATA* M |
| PERF_RECORD_MISC_COMM_EXEC E |
| PERF_RECORD_MISC_SWITCH_OUT S |
| PERF_RECORD_MISC_SWITCH_OUT_PREEMPT Sp |
| |
| $ perf script -F +misc ... |
| sched-messaging 1414 K 28690.636582: 4590 cycles ... |
| sched-messaging 1407 U 28690.636600: 325620 cycles ... |
| sched-messaging 1414 K 28690.636608: 19473 cycles ... |
| misc field ___________/ |
| |
| -k:: |
| --vmlinux=<file>:: |
| vmlinux pathname |
| |
| --kallsyms=<file>:: |
| kallsyms pathname |
| |
| --symfs=<directory>:: |
| Look for files with symbols relative to this directory. |
| |
| -G:: |
| --hide-call-graph:: |
| When printing symbols do not display call chain. |
| |
| --stop-bt:: |
| Stop display of callgraph at these symbols |
| |
| -C:: |
| --cpu:: |
| Only report samples for the list of CPUs provided. Multiple CPUs can |
| be provided as a comma-separated list with no space: 0,1. Ranges of |
| CPUs are specified with -: 0-2. Default is to report samples on all |
| CPUs. |
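| |
| The CPU list syntax can be illustrated with a short Python sketch that |
| expands a list such as 0,1 or 0-2 into individual CPU numbers. The function |
| name is hypothetical and the code only mirrors the syntax described above. |

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '0,1' or '0-2' into sorted CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_cpu_list("0-2,5"))
```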
| |
| -c:: |
| --comms=:: |
| Only display events for these comms. CSV that understands |
| file://filename entries. |
| |
| --pid=:: |
| Only show events for given process ID (comma separated list). |
| |
| --tid=:: |
| Only show events for given thread ID (comma separated list). |
| |
| -I:: |
| --show-info:: |
| Display extended information about the perf.data file. This adds |
| information which may be very large and thus may clutter the display. |
| It currently includes: cpu and numa topology of the host system. |
| It can only be used with the perf script report mode. |
| |
| --show-kernel-path:: |
| Try to resolve the path of [kernel.kallsyms] |
| |
| --show-task-events:: |
| Display task related events (e.g. FORK, COMM, EXIT). |
| |
| --show-mmap-events:: |
| Display mmap related events (e.g. MMAP, MMAP2). |
| |
| --show-namespace-events:: |
| Display namespace events i.e. events of type PERF_RECORD_NAMESPACES. |
| |
| --show-switch-events:: |
| Display context switch events i.e. events of type PERF_RECORD_SWITCH or |
| PERF_RECORD_SWITCH_CPU_WIDE. |
| |
| --show-lost-events:: |
| Display lost events i.e. events of type PERF_RECORD_LOST. |
| |
| --show-round-events:: |
| Display finished round events i.e. events of type PERF_RECORD_FINISHED_ROUND. |
| |
| --show-bpf-events:: |
| Display bpf events i.e. events of type PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT. |
| |
| --show-cgroup-events:: |
| Display cgroup events i.e. events of type PERF_RECORD_CGROUP. |
| |
| --show-text-poke-events:: |
| Display text poke events i.e. events of type PERF_RECORD_TEXT_POKE and |
| PERF_RECORD_KSYMBOL. |
| |
| --demangle:: |
| Demangle symbol names to human readable form. It's enabled by default, |
| disable with --no-demangle. |
| |
| --demangle-kernel:: |
| Demangle kernel symbol names to human readable form (for C++ kernels). |
| |
| --header:: |
| Show perf.data header. |
| |
| --header-only:: |
| Show only perf.data header. |
| |
| --itrace:: |
| Options for decoding instruction tracing data. The options are: |
| |
| include::itrace.txt[] |
| |
| To disable decoding entirely, use --no-itrace. |
| |
| --full-source-path:: |
| Show the full path for source files for srcline output. |
| |
| --max-stack:: |
| Set the stack depth limit when parsing the callchain, anything |
| beyond the specified depth will be ignored. This is a trade-off |
| between information loss and faster processing especially for |
| workloads that can have a very long callchain stack. |
| Note that when using the --itrace option the synthesized callchain size |
| will override this value if the synthesized callchain size is bigger. |
| |
| Default: 127 |
| |
| --ns:: |
| Use 9 decimal places when displaying time (i.e. show the nanoseconds) |
| |
| -f:: |
| --force:: |
| Don't do ownership validation. |
| |
| --time:: |
| Only analyze samples within given time window: <start>,<stop>. Times |
| have the format seconds.nanoseconds. If start is not given (i.e. time |
| string is ',x.y') then analysis starts at the beginning of the file. If |
| stop time is not given (i.e. time string is 'x.y,') then analysis goes |
| to end of file. Multiple ranges can be separated by spaces, which |
| requires the argument to be quoted e.g. --time "1234.567,1234.789 1235," |
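| |
| The window syntax above can be read as in the following Python sketch, which |
| turns a time string into (start, stop) pairs and uses None for an open end. |
| The function name is hypothetical; floats are used for brevity, so full |
| nanosecond precision is not preserved for large timestamps. |

```python
def parse_time_ranges(spec):
    """Parse '<start>,<stop>' windows separated by spaces."""
    ranges = []
    for window in spec.split():
        start, comma, stop = window.partition(",")
        if not comma:
            raise ValueError(f"missing ',' in time window: {window!r}")
        ranges.append((
            float(start) if start else None,  # None: from beginning of file
            float(stop) if stop else None,    # None: to end of file
        ))
    return ranges


print(parse_time_ranges("1234.567,1234.789 1235,"))
```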
| |
| Time percentages with multiple time ranges are also supported. The time string is |
| 'a%/n,b%/m,...' or 'a%-b%,c%-d%,...'. |
| |
| For example: |
| Select the second 10% time slice: |
| perf script --time 10%/2 |
| |
| Select from 0% to 10% time slice: |
| perf script --time 0%-10% |
| |
| Select the first and second 10% time slices: |
| perf script --time 10%/1,10%/2 |
| |
| Select from 0% to 10% and 30% to 40% slices: |
| perf script --time 0%-10%,30%-40% |
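| |
| To make the percent forms concrete, the Python sketch below converts a |
| percent string into absolute windows for a trace spanning first..last. It |
| assumes 'a%/n' means the n-th slice of width a%, as in the examples above; |
| the function name and the sample trace bounds are hypothetical. |

```python
def percent_windows(spec, first, last):
    """Convert 'a%/n' or 'a%-b%' items into (start, stop) windows."""
    span = last - first
    windows = []
    for item in spec.split(","):
        if "/" in item:
            percent, index = item.split("/")
            size = span * float(percent.rstrip("%")) / 100
            begin = first + (int(index) - 1) * size  # slices are numbered from 1
            windows.append((begin, begin + size))
        else:
            low, high = item.split("-")
            windows.append((
                first + span * float(low.rstrip("%")) / 100,
                first + span * float(high.rstrip("%")) / 100,
            ))
    return windows


# Hypothetical trace running from t=1000 to t=2000.
print(percent_windows("10%/2", 1000, 2000))
```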
| |
| --max-blocks:: |
| Set the maximum number of program blocks to print with brstackinsn for |
| each sample. |
| |
| --reltime:: |
| Print time stamps relative to trace start. |
| |
| --deltatime:: |
| Print time stamps relative to previous event. |
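| |
| The difference between --reltime and --deltatime can be shown with a small |
| Python sketch over integer nanosecond timestamps. The function names are |
| hypothetical, and showing 0 as the delta of the first event is an |
| assumption made for this illustration. |

```python
def relative_times(stamps):
    """Timestamps relative to the first event (the --reltime idea)."""
    return [t - stamps[0] for t in stamps]


def delta_times(stamps):
    """Timestamps relative to the previous event (the --deltatime idea)."""
    # Assumption: the first event has no predecessor, so its delta is 0.
    return [0 if i == 0 else t - stamps[i - 1] for i, t in enumerate(stamps)]


stamps = [1000, 1250, 1900]  # made-up timestamps in nanoseconds
print(relative_times(stamps))
print(delta_times(stamps))
```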
| |
| --per-event-dump:: |
| Create per event files with a "perf.data.EVENT.dump" name instead of |
| printing to stdout, useful, for instance, for generating flamegraphs. |
| |
| --inline:: |
| If a callgraph address belongs to an inlined function, the inline stack |
| will be printed. Each entry has function name and file/line. Enabled by |
| default, disable with --no-inline. |
| |
| --insn-trace:: |
| Show instruction stream for intel_pt traces. Combine with --xed to |
| show disassembly. |
| |
| --xed:: |
| Run xed disassembler on output. Requires installing the xed disassembler. |
| |
| -S:: |
| --symbols=symbol[,symbol...]:: |
| Only consider the listed symbols. Symbols are typically a name |
| but they may also be a hexadecimal address. |
| |
| The hexadecimal address may be the start address of a symbol or |
| any other address to filter the trace records. |
| |
| For example, to select the symbol noploop or the address 0x4007a0: |
| perf script --symbols=noploop,0x4007a0 |
| |
| Trace records can be filtered by symbol name, start address of |
| symbol, any hexadecimal address and address range. |
| |
| The comparison order is: |
| |
| 1. symbol name comparison |
| 2. symbol start address comparison. |
| 3. any hexadecimal address comparison. |
| 4. address range comparison (see --addr-range). |
| |
| --addr-range:: |
| Use with -S or --symbols to list traced records within address range. |
| |
| For example, to list the traced records within the address range |
| [0x4007a0, 0x4007a9]: |
| perf script -S 0x4007a0 --addr-range 10 |
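| |
| The arithmetic behind this example is that a start address plus a size of |
| 10 covers the ten addresses up to and including start + 9. The Python |
| sketch below checks that reading; the function names are hypothetical. |

```python
def address_window(start, size):
    """Return the inclusive (first, last) addresses covered by start + size."""
    return start, start + size - 1


def in_window(addr, start, size):
    """True if addr falls inside the window of `size` addresses at `start`."""
    return start <= addr < start + size


first, last = address_window(0x4007A0, 10)
print(hex(first), hex(last))
```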
| |
| --dsos=:: |
| Only consider symbols in these DSOs. |
| |
| --call-trace:: |
| Show call stream for intel_pt traces. The CPUs are interleaved, but |
| can be filtered with -C. |
| |
| --call-ret-trace:: |
| Show call and return stream for intel_pt traces. |
| |
| --graph-function:: |
| For itrace, only show the specified functions and their callees. |
| Multiple functions can be separated by comma. |
| |
| --switch-on EVENT_NAME:: |
| Only consider events after this event is found. |
| |
| --switch-off EVENT_NAME:: |
| Stop considering events after this event is found. |
| |
| --show-on-off-events:: |
| Show the --switch-on/off events too. |
| |
| --stitch-lbr:: |
| Show callgraph with stitched LBRs, which may have more complete |
| callgraph. The perf.data file must have been obtained using |
| perf record --call-graph lbr. |
| Disabled by default. In common cases with call stack overflows, |
| it can recreate better call stacks than the default lbr call stack |
| output. But this approach is not foolproof. There can be cases |
| where it creates incorrect call stacks from incorrect matches. |
| The known limitations include exception handling such as |
| setjmp/longjmp, where calls and returns will not match. |
| |
| :GMEXAMPLECMD: script |
| :GMEXAMPLESUBCMD: |
| include::guest-files.txt[] |
| |
| SEE ALSO |
| -------- |
| linkperf:perf-record[1], linkperf:perf-script-perl[1], |
| linkperf:perf-script-python[1], linkperf:perf-intel-pt[1], |
| linkperf:perf-dlfilter[1] |