GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
{description}
Copyright (C) {year} {fullname}
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
{signature of Ty Coon}, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. perf-tools-unstable-1.0.1~20200130+git49b8cdf/README.md 0000664 0000000 0000000 00000027640 13614503575 0021570 0 ustar 00root root 0000000 0000000 perf-tools
==========
A miscellaneous collection of in-development and unsupported performance analysis tools for Linux ftrace and perf_events (aka the "perf" command). Both ftrace and perf are core Linux tracing tools, included in the kernel source. Your system probably has ftrace already, and perf is often just a package add (see Prerequisites).
These tools are designed to be easy to install (fewest dependencies), provide advanced performance observability, and be simple to use: do one thing and do it well. This collection was created by Brendan Gregg (author of the DTraceToolkit).
Many of these tools employ workarounds so that functionality is possible on existing Linux kernels. Because of this, many tools have caveats (see man pages), and their implementation should be considered a placeholder until future kernel features, or new tracing subsystems, are added.
These are intended for Linux 3.2 and newer kernels. For Linux 2.6.x, see Warnings.
## Presentation
These tools were introduced in the USENIX LISA 2014 presentation: Linux Performance Analysis: New Tools and Old Secrets
- slides: http://www.slideshare.net/brendangregg/linux-performance-analysis-new-tools-and-old-secrets
- video: https://www.usenix.org/conference/lisa14/conference-program/presentation/gregg
## Contents
Using ftrace:
- [iosnoop](iosnoop): trace disk I/O with details including latency. [Examples](examples/iosnoop_example.txt).
- [iolatency](iolatency): summarize disk I/O latency as a histogram. [Examples](examples/iolatency_example.txt).
- [execsnoop](execsnoop): trace process exec() with command line argument details. [Examples](examples/execsnoop_example.txt).
- [opensnoop](opensnoop): trace open() syscalls showing filenames. [Examples](examples/opensnoop_example.txt).
- [killsnoop](killsnoop): trace kill() signals showing process and signal details. [Examples](examples/killsnoop_example.txt).
- fs/[cachestat](fs/cachestat): basic cache hit/miss statistics for the Linux page cache. [Examples](examples/cachestat_example.txt).
- net/[tcpretrans](net/tcpretrans): show TCP retransmits, with address and other details. [Examples](examples/tcpretrans_example.txt).
- system/[tpoint](system/tpoint): trace a given tracepoint. [Examples](examples/tpoint_example.txt).
- kernel/[funccount](kernel/funccount): count kernel function calls, matching a string with wildcards. [Examples](examples/funccount_example.txt).
- kernel/[functrace](kernel/functrace): trace kernel function calls, matching a string with wildcards. [Examples](examples/functrace_example.txt).
- kernel/[funcslower](kernel/funcslower): trace kernel functions slower than a threshold. [Examples](examples/funcslower_example.txt).
- kernel/[funcgraph](kernel/funcgraph): trace a graph of kernel function calls, showing children and times. [Examples](examples/funcgraph_example.txt).
- kernel/[kprobe](kernel/kprobe): dynamically trace a kernel function call or its return, with variables. [Examples](examples/kprobe_example.txt).
- user/[uprobe](user/uprobe): dynamically trace a user-level function call or its return, with variables. [Examples](examples/uprobe_example.txt).
- tools/[reset-ftrace](tools/reset-ftrace): reset ftrace state if needed. [Examples](examples/reset-ftrace_example.txt).
Using perf_events:
- misc/[perf-stat-hist](misc/perf-stat-hist): power-of aggregations for tracepoint variables. [Examples](examples/perf-stat-hist_example.txt).
- [syscount](syscount): count syscalls by syscall or process. [Examples](examples/syscount_example.txt).
- disk/[bitesize](disk/bitesize): histogram summary of disk I/O size. [Examples](examples/bitesize_example.txt).
Using eBPF:
- As a preview of things to come, see the bcc tracing [Tools section](https://github.com/iovisor/bcc/blob/master/README.md#tracing). These use [bcc](https://github.com/iovisor/bcc), a front end for using [eBPF](http://www.brendangregg.com/blog/2015-05-15/ebpf-one-small-step.html). bcc+eBPF will allow some of these tools to be rewritten and improved, and additional tools to be created.
## Screenshots
Showing new processes and arguments:
```
# ./execsnoop
Tracing exec()s. Ctrl-C to end.
  PID   PPID ARGS
22898  22004 man ls
22905  22898 preconv -e UTF-8
22908  22898 pager -s
22907  22898 nroff -mandoc -rLL=164n -rLT=164n -Tutf8
22906  22898 tbl
22911  22910 locale charmap
22912  22907 groff -mtty-char -Tutf8 -mandoc -rLL=164n -rLT=164n
22913  22912 troff -mtty-char -mandoc -rLL=164n -rLT=164n -Tutf8
22914  22912 grotty
```
Measuring block device I/O latency from queue insert to completion:
```
# ./iolatency -Q
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.

  >=(ms) .. <(ms)   : I/O      |Distribution                          |
       0 -> 1       : 1913     |######################################|
       1 -> 2       : 438      |#########                             |
       2 -> 4       : 100      |##                                    |
       4 -> 8       : 145      |###                                   |
       8 -> 16      : 43       |#                                     |
      16 -> 32      : 43       |#                                     |
      32 -> 64      : 1        |#                                     |
[...]
```
Tracing the block:block_rq_insert tracepoint, with kernel stack traces, and only for reads:
```
# ./tpoint -s block:block_rq_insert 'rwbs ~ "*R*"'
     cksum-11908 [000] d... 7269839.919098: block_rq_insert: 202,1 R 0 () 736560 + 136 [cksum]
     cksum-11908 [000] d... 7269839.919107: <stack trace>
 => __elv_add_request
 => blk_flush_plug_list
 => blk_finish_plug
 => __do_page_cache_readahead
 => ondemand_readahead
 => page_cache_async_readahead
 => generic_file_read_iter
 => new_sync_read
 => vfs_read
 => SyS_read
 => system_call_fastpath
[...]
```
Count kernel function calls beginning with "bio_", summarize every second:
```
# ./funccount -i 1 'bio_*'
Tracing "bio_*"... Ctrl-C to end.

FUNC                              COUNT
bio_attempt_back_merge               26
bio_get_nr_vecs                     361
bio_alloc                           536
bio_alloc_bioset                    536
bio_endio                           536
bio_free                            536
bio_fs_destructor                   536
bio_init                            536
bio_integrity_enabled               536
bio_put                             729
bio_add_page                       1004
[...]
```
There are many more examples in the [examples](examples) directory. Also see the [man pages](man/man8).
## Prerequisites
The intent is as few as possible. Eg, a Linux 3.2 server without debuginfo. See the tool man page for specifics.
### ftrace
FTRACE configured in the kernel. You may already have this configured and available in your kernel version, as FTRACE was first added in 2.6.27. This requires CONFIG_FTRACE and other FTRACE options depending on the tool. Some tools (eg, funccount) require CONFIG_FUNCTION_PROFILER.
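A quick runtime check (a sketch, not part of the tools themselves): once debugfs is mounted, FTRACE support can be confirmed by reading the tracing control files.

```shell
# Print the tracers this kernel supports, or a hint if ftrace is not
# reachable (requires a mounted debugfs; the path below is the common one).
tracing=/sys/kernel/debug/tracing
if [ -r "$tracing/available_tracers" ]; then
    cat "$tracing/available_tracers"
else
    echo "ftrace not available (check CONFIG_FTRACE and the debugfs mount)"
fi
```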
### perf_events
Requires the "perf" command to be installed. This is in the linux-tools-common package. After installing that, perf may tell you to install an additional linux-tools package (linux-tools-_kernel_version_). perf can also be built under tools/perf in the kernel source. See [perf_events Prerequisites](http://www.brendangregg.com/perf.html#Prerequisites) for more details about getting perf_events to work fully.
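For example, on Debian/Ubuntu the kernel-matched package name can be derived from `uname -r` (the package naming is an assumption; check your distribution):

```shell
# Build the kernel-version-specific tools package name, then install it
# alongside linux-tools-common (install line shown as a comment).
pkg="linux-tools-$(uname -r)"
echo "$pkg"
# sudo apt-get install linux-tools-common "$pkg"
```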
### debugfs
Requires a kernel with CONFIG_DEBUG_FS option enabled. As with FTRACE, this may already be enabled (debugfs was added in 2.6.10-rc3). The debugfs also needs to be mounted:
```
# mount -t debugfs none /sys/kernel/debug
```
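To avoid mounting it twice, the mount table can be checked first (a small sketch; `/proc/mounts` exists on any Linux system):

```shell
# Mount debugfs only if it is not already in the mount table.
if grep -qs debugfs /proc/mounts; then
    echo "debugfs already mounted"
else
    echo "mounting debugfs (requires root)"
    # mount -t debugfs none /sys/kernel/debug
fi
```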
### awk
Many of these scripts use awk, and will try to use either mawk or gawk depending on the desired behavior: mawk for buffered output (because of its speed), and gawk for synchronous output (as fflush() works, allowing more efficient grouping of writes).
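The interpreter selection can be sketched as follows (a simplified illustration; the scripts' actual logic may differ):

```shell
# Prefer mawk (fast, buffered), fall back to gawk (working fflush()),
# then to whatever awk is on the PATH.
if command -v mawk >/dev/null 2>&1; then
    AWK=mawk
elif command -v gawk >/dev/null 2>&1; then
    AWK=gawk
else
    AWK=awk
fi
echo "using: $AWK"
```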
## Install
These are just scripts. Either grab everything:
```
git clone --depth 1 https://github.com/brendangregg/perf-tools
```
Or use the raw links on github to download individual scripts. Eg:
```
wget https://raw.githubusercontent.com/brendangregg/perf-tools/master/iosnoop
```
This preserves tabs (which copy-n-paste can mess up).
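Note that wget does not preserve the execute bit, so restore it after downloading (shown here with a placeholder file name rather than a real script):

```shell
# Restore the execute bit on a freshly downloaded script.
touch ./iosnoop-download            # placeholder for the wget output file
chmod +x ./iosnoop-download
test -x ./iosnoop-download && echo "executable"
rm -f ./iosnoop-download            # clean up the placeholder
```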
## Warnings
Ftrace was first added to Linux 2.6.27, and perf_events to Linux 2.6.31. These early versions had kernel bugs, and lockups and panics have been reported on 2.6.32 series kernels. This includes CentOS 6.x. If you must analyze older kernels, these tools may only be useful in a fault-tolerant environment, such as a lab with simulated issues. These tools have been primarily developed on Linux 3.2 and later kernels.
Depending on the tool, there may also be overhead incurred. See the next section.
## Internals and Overhead
perf_events is evolving. This collection began development circa Linux 3.16, with Linux 3.2 servers as the main target, at a time when perf_events lacked certain programmatic capabilities (eg, custom in-kernel aggregations). It's possible these will be added in a forthcoming kernel release. Until then, many of these tools employ workarounds, tricks, and hacks in order to work. Some of these tools pass event data to user space for post-processing, which incurs much higher overhead than in-kernel aggregations. The overhead of each tool is described in its man page.
__WARNING__: In _extreme_ cases, your target application may run 5x slower when using these tools. Depending on the tool and kernel version, there may also be the risk of kernel panics. Read the program header for warnings, and test before use.
If the overhead is a problem, these tools can be improved. If a tool doesn't already, it could be rewritten in C to use perf_event_open() and mmap() for the trace buffer. It could also implement frequency counts in C, and operate on the mmap() buffer directly, rather than using awk/Perl/Python. Additional improvements are possible for ftrace-based tools, such as the use of snapshots and per-instance buffers.
Some of these tools are intended as short-term workarounds until more kernel capabilities exist, at which point they can be substantially rewritten. Older versions of these tools will be kept in this repository, for older kernel versions.
As my main target is a fleet of Linux 3.2 servers that do not have debuginfo, these tools try not to require it. At times, this makes the tool more brittle than it needs to be, as I'm employing workarounds (that may be kernel version and platform specific) instead of using debuginfo information (which can be generic). See the man page for detailed prerequisites for each tool.
I've tried to use perf_events ("perf") where possible, since that interface has been developed for multi-user use. For various reasons I've often needed to use ftrace instead. ftrace is surprisingly powerful (thanks Steven Rostedt!), and not all of its features are exposed via perf, or in common usage. This tool collection is in some ways a demonstration of hidden Linux features using ftrace.
Since things are changing, it's very possible you may find some tools don't work on your Linux kernel version. Some expertise and assembly will be required to fix them.
## Links
A case study and summary:
- 13 Aug 2014: http://lwn.net/Articles/608497 Ftrace: The hidden light switch
Related articles:
- 28 Jun 2015: http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html
- 31 Dec 2014: http://www.brendangregg.com/blog/2014-12-31/linux-page-cache-hit-ratio.html
- 06 Sep 2014: http://www.brendangregg.com/blog/2014-09-06/linux-ftrace-tcp-retransmit-tracing.html
- 28 Jul 2014: http://www.brendangregg.com/blog/2014-07-28/execsnoop-for-linux.html
- 25 Jul 2014: http://www.brendangregg.com/blog/2014-07-25/opensnoop-for-linux.html
- 23 Jul 2014: http://www.brendangregg.com/blog/2014-07-23/linux-iosnoop-latency-heat-maps.html
- 16 Jul 2014: http://www.brendangregg.com/blog/2014-07-16/iosnoop-for-linux.html
- 10 Jul 2014: http://www.brendangregg.com/blog/2014-07-10/perf-hacktogram.html
perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/ 0000775 0000000 0000000 00000000000 13614503575 0021050 5 ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/bitesize 0000777 0000000 0000000 00000000000 13614503575 0025512 2../disk/bitesize ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/cachestat 0000777 0000000 0000000 00000000000 13614503575 0025432 2../fs/cachestat ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/execsnoop 0000777 0000000 0000000 00000000000 13614503575 0025132 2../execsnoop ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/funccount 0000777 0000000 0000000 00000000000 13614503575 0026414 2../kernel/funccount ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/funcgraph 0000777 0000000 0000000 00000000000 13614503575 0026336 2../kernel/funcgraph ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/funcslower 0000777 0000000 0000000 00000000000 13614503575 0026762 2../kernel/funcslower ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/functrace 0000777 0000000 0000000 00000000000 13614503575 0026330 2../kernel/functrace ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/iolatency 0000777 0000000 0000000 00000000000 13614503575 0025102 2../iolatency ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/iosnoop 0000777 0000000 0000000 00000000000 13614503575 0024300 2../iosnoop ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/killsnoop 0000777 0000000 0000000 00000000000 13614503575 0025150 2../killsnoop ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/kprobe 0000777 0000000 0000000 00000000000 13614503575 0025150 2../kernel/kprobe ustar 00root root 0000000 0000000 
perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/opensnoop 0000777 0000000 0000000 00000000000 13614503575 0025164 2../opensnoop ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/perf-stat-hist 0000777 0000000 0000000 00000000000 13614503575 0027603 2../misc/perf-stat-hist ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/reset-ftrace 0000777 0000000 0000000 00000000000 13614503575 0027234 2../tools/reset-ftrace ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/syscount 0000777 0000000 0000000 00000000000 13614503575 0024702 2../syscount ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/tcpretrans 0000777 0000000 0000000 00000000000 13614503575 0026264 2../net/tcpretrans ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/tpoint 0000777 0000000 0000000 00000000000 13614503575 0025302 2../system/tpoint ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/bin/uprobe 0000777 0000000 0000000 00000000000 13614503575 0024672 2../user/uprobe ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/deprecated/ 0000775 0000000 0000000 00000000000 13614503575 0022400 5 ustar 00root root 0000000 0000000 perf-tools-unstable-1.0.1~20200130+git49b8cdf/deprecated/README.md 0000664 0000000 0000000 00000000036 13614503575 0023656 0 ustar 00root root 0000000 0000000 Deprecated versions of tools.
perf-tools-unstable-1.0.1~20200130+git49b8cdf/deprecated/execsnoop-proc 0000664 0000000 0000000 00000011021 13614503575 0025262 0 ustar 00root root 0000000 0000000 #!/usr/bin/perl
#
# execsnoop - trace process exec() with arguments. /proc version.
# Written using Linux ftrace.
#
# This shows the execution of new processes, especially short-lived ones that
# can be missed by sampling tools such as top(1).
#
# USAGE: ./execsnoop [-h] [-n name]
#
# REQUIREMENTS: FTRACE CONFIG, sched:sched_process_exec tracepoint (you may
# already have these on recent kernels), and Perl.
#
# This traces exec() from the fork()->exec() sequence, which means it won't
# catch new processes that only fork(), and, it will catch processes that
# re-exec. This instruments sched:sched_process_exec without buffering, and then
# in user-space (this program) reads PPID and process arguments asynchronously
# from /proc.
#
# If the process traced is very short-lived, this program may miss reading
# arguments and PPID details. In that case, ">" and "?" will be printed
# respectively. This program is best-effort, and should be improved in the
# future when other kernel capabilities are made available. If you need a
# more reliable tool now, then consider other tracing alternatives (eg,
# SystemTap). This tool is really a proof of concept to see what ftrace can
# currently do.
#
# From perf-tools: https://github.com/brendangregg/perf-tools
#
# See the execsnoop(8) man page (in perf-tools) for more info.
#
# COPYRIGHT: Copyright (c) 2014 Brendan Gregg.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# (http://www.gnu.org/copyleft/gpl.html)
#
# 07-Jul-2014 Brendan Gregg Created this.
use strict;
use warnings;
use POSIX qw(strftime);
use Getopt::Long;
my $tracing = "/sys/kernel/debug/tracing";
my $flock = "/var/tmp/.ftrace-lock";
my $tpdir = "sched/sched_process_exec";
my $tptext = $tpdir; $tptext =~ s/\//:/;
local $SIG{INT} = \&cleanup;
local $SIG{QUIT} = \&cleanup;
local $SIG{TERM} = \&cleanup;
local $SIG{PIPE} = \&cleanup;
local $SIG{HUP} = \&cleanup;
$| = 1;
### options
my ($name, $help);
GetOptions("name=s" => \$name,
"help" => \$help)
or usage();
usage() if $help;
sub usage {
print STDERR "USAGE: execsnoop [-h] [-n name]\n";
print STDERR " eg,\n";
print STDERR " execsnoop -n ls # show \"ls\" cmds only.\n";
exit;
}
sub ldie {
unlink $flock;
die @_;
}
sub writeto {
my ($string, $file) = @_;
open FILE, ">$file" or return 0;
print FILE $string or return 0;
close FILE or return 0;
}
### check permissions
chdir "$tracing" or ldie "ERROR: accessing tracing. Root? Kernel has FTRACE?" .
"\ndebugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)";
### ftrace lock
if (-e $flock) {
open FLOCK, $flock; my $fpid = <FLOCK>; chomp $fpid; close FLOCK;
die "ERROR: ftrace may be in use by PID $fpid ($flock)";
}
writeto "$$", $flock or die "ERROR: unable to write $flock.";
### setup and begin tracing
writeto "nop", "current_tracer" or ldie "ERROR: disabling current_tracer.";
writeto "1", "events/$tpdir/enable" or ldie "ERROR: enabling tracepoint " .
"\"$tptext\" (tracepoint missing in this kernel version?)";
open TPIPE, "trace_pipe" or warn "ERROR: opening trace_pipe.";
printf "%-8s %6s %6s %s\n", "TIME", "PID", "PPID", "ARGS";
while (<TPIPE>) {
my ($taskpid, $rest) = split;
my ($task, $pid) = $taskpid =~ /(.*)-(\d+)/;
next if (defined $name and $name ne $task);
my $args = "$task >";
if (open CMDLINE, "/proc/$pid/cmdline") {
my $arglist = <CMDLINE>;
if (defined $arglist) {
$arglist =~ s/\000/ /g;
$args = $arglist;
}
close CMDLINE;
}
my $ppid = "?";
if (open STAT, "/proc/$pid/stat") {
my $fields = <STAT>;
if (defined $fields) {
$ppid = (split ' ', $fields)[3];
}
close STAT;
}
my $now = strftime "%H:%M:%S", localtime;
printf "%-8s %6s %6s %s\n", $now, $pid, $ppid, $args;
}
### end tracing
cleanup();
sub cleanup {
print "\nEnding tracing...\n";
close TPIPE;
writeto "0", "events/$tpdir/enable" or
ldie "ERROR: disabling \"$tptext\"";
writeto "", "trace";
unlink $flock;
exit;
}
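The asynchronous argument read described in the header comments relies on the layout of /proc/PID/cmdline, which stores argv as NUL-separated strings. A minimal standalone sketch (simulated with printf so it runs without a live PID):

```shell
# /proc/PID/cmdline stores argv as NUL-separated strings; translating
# the NULs to spaces reconstructs the command line, as execsnoop-proc
# does in Perl. Simulated input stands in for /proc/$pid/cmdline.
printf 'ls\0-l\0/tmp\0' | tr '\000' ' '
```

On a real system the input would be `tr '\000' ' ' < /proc/$pid/cmdline`.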
deprecated/execsnoop-proc.8:
.TH execsnoop\-proc 8 "2014-07-07" "USER COMMANDS"
.SH NAME
execsnoop\-proc \- trace process exec() with arguments. Uses Linux ftrace. /proc version.
.SH SYNOPSIS
.B execsnoop\-proc
[\-h] [\-n name]
.SH DESCRIPTION
execsnoop\-proc traces process execution, showing PID, PPID, and argument details
if possible.
This traces exec() from the fork()->exec() sequence, which means it won't
catch new processes that only fork(), and, it will catch processes that
re-exec. This instruments sched:sched_process_exec without buffering, and then
in user-space (this program) reads PPID and process arguments asynchronously
from /proc.
If the process traced is very short-lived, this program may miss reading
arguments and PPID details. In that case, ">" and "?" will be printed
respectively.
This program is best-effort (a hack), and should be improved in the future when
other kernel capabilities are made available. It may be useful in the meantime.
If you need a more reliable tool now, consider other tracing alternatives (eg,
SystemTap). This tool is really a proof of concept to see what ftrace can
currently do.
See execsnoop(8) for another version that reads arguments from registers
instead of /proc.
Since this uses ftrace, only the root user can use this tool.
.SH REQUIREMENTS
FTRACE CONFIG and the sched:sched_process_exec tracepoint, which you may already
have enabled and available on recent kernels, and Perl.
.SH OPTIONS
\-n name
Only show processes that match this name. This is filtered in user space.
.TP
\-h
Print usage message.
.SH EXAMPLES
.TP
Trace all new processes and arguments (if possible):
.B execsnoop\-proc
.TP
Trace all new processes with process name "sed":
.B execsnoop\-proc -n sed
.SH FIELDS
.TP
TIME
Time of process exec(): HH:MM:SS.
.TP
PID
Process ID.
.TP
PPID
Parent process ID, if this was able to be read (may be missed for short-lived
processes). If it is unable to be read, "?" is printed.
.TP
ARGS
Command line arguments, if these were able to be read in time (may be missed
for short-lived processes). If they are unable to be read, ">" is printed.
.SH OVERHEAD
This reads and processes exec() events in user space as they occur. Since the
rate of exec() is expected to be low (< 500/s), the overhead is expected to
be small or negligible.
.SH SOURCE
This is from the perf-tools collection.
.IP
https://github.com/brendangregg/perf-tools
.PP
Also look under the examples directory for a text file containing example
usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Brendan Gregg
.SH SEE ALSO
execsnoop(8), top(1)
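The PPID described above comes from field 4 of /proc/PID/stat. A simple whitespace split (as the Perl tool uses) breaks when the comm field, shown in parentheses, itself contains spaces; a more careful parse strips through the closing paren first. A sketch using a hypothetical sample line:

```shell
# Parse PPID from a /proc/PID/stat line. The comm field "(my prog)"
# can contain spaces, so strip everything up to the final ") " first;
# PPID is then the 2nd remaining field (process state is the 1st).
stat='1234 (my prog) S 4321 1234 4321 0 -1 4194304'
ppid=$(printf '%s\n' "$stat" | sed 's/.*) //' | awk '{print $2}')
echo "$ppid"
```

A real read would substitute `stat=$(cat /proc/$pid/stat)`.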
deprecated/execsnoop-proc_example.txt:
Demonstrations of execsnoop-proc, the Linux ftrace version.
Here's execsnoop showing what's really executed by "man ls":
# ./execsnoop
TIME PID PPID ARGS
17:52:37 22406 25781 man ls
17:52:37 22413 22406 preconv -e UTF-8
17:52:37 22416 22406 pager -s
17:52:37 22415 22406 /bin/sh /usr/bin/nroff -mandoc -rLL=162n -rLT=162n -Tutf8
17:52:37 22414 22406 tbl
17:52:37 22419 22418 locale charmap
17:52:37 22420 22415 groff -mtty-char -Tutf8 -mandoc -rLL=162n -rLT=162n
17:52:37 22421 22420 troff -mtty-char -mandoc -rLL=162n -rLT=162n -Tutf8
17:52:37 22422 22420 grotty
These are short-lived processes, where the argument and PPID details are often
missed by execsnoop:
# ./execsnoop
TIME PID PPID ARGS
18:00:33 26750 1961 multilog >
18:00:33 26749 1972 multilog >
18:00:33 26749 1972 multilog >
18:00:33 26751 ? mkdir >
18:00:33 26749 1972 multilog >
18:00:33 26752 ? chown >
18:00:33 26750 1961 multilog >
18:00:33 26750 1961 multilog >
18:00:34 26753 1961 multilog >
18:00:34 26754 1972 multilog >
[...]
This will be fixed in a later version, but likely requires some kernel or
tracer changes first (fetching cmdline as the probe fires).
The previous examples were on Linux 3.14 and 3.16 kernels. Here's a 3.2 system
I'm running:
# ./execsnoop
ERROR: enabling tracepoint "sched:sched_process_exec" (tracepoint missing in this kernel version?) at ./execsnoop line 78.
This kernel version is missing the sched_process_exec probe, which is pretty
annoying.
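The missing-tracepoint failure above can be checked for up front: a tracepoint is available when its directory exists under the tracing events tree. A minimal sketch — the helper name and demo paths are illustrative; the real path to test is events/sched/sched_process_exec under /sys/kernel/debug/tracing:

```shell
# Report whether a tracepoint directory exists before attempting to
# trace. On a real system the argument would be, eg:
#   /sys/kernel/debug/tracing/events/sched/sched_process_exec
check_tp() {
    [ -d "$1" ] && echo "tracepoint available" || echo "tracepoint missing"
}
check_tp /tmp                      # a directory that exists
check_tp /no/such/tracepoint/dir   # a directory that does not
```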
disk/bitesize:
#!/bin/bash
#
# bitesize - show disk I/O size as a histogram.
# Written using Linux perf_events (aka "perf").
#
# This can be used to characterize the distribution of block device I/O
# sizes. To study I/O in more detail, see iosnoop(8).
#
# USAGE: bitesize [-h] [-b buckets] [seconds]
# eg,
# ./bitesize 10
#
# Run "bitesize -h" for full usage.
#
# REQUIREMENTS: perf_events and block:block_rq_issue tracepoint, which you may
# already have on recent kernels.
#
# This uses multiple counting tracepoints with different filters, one for each
# histogram bucket. While this is summarized in-kernel, the use of multiple
# tracepoints does add additional overhead, which is more evident if you add
# more buckets. In the future this functionality will be available in an
# efficient way in the kernel, and this tool can be rewritten.
#
# From perf-tools: https://github.com/brendangregg/perf-tools
#
# COPYRIGHT: Copyright (c) 2014 Brendan Gregg.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# (http://www.gnu.org/copyleft/gpl.html)
#
# 22-Jul-2014 Brendan Gregg Created this.
duration=0
buckets=(1 8 64 128)
secsz=512
trap ':' INT QUIT TERM PIPE HUP
function usage {
cat <<-END >&2
USAGE: bitesize [-h] [-b buckets] [seconds]
-b buckets # specify histogram buckets (Kbytes)
-h # this usage message
eg,
bitesize # trace I/O size until Ctrl-C
bitesize 10 # trace I/O size for 10 seconds
bitesize -b "8 16 32" # specify custom bucket points
END
exit
}
function die {
echo >&2 "$@"
exit 1
}
### process options
while getopts b:h opt
do
case $opt in
b) buckets=($OPTARG) ;;
h|?) usage ;;
esac
done
shift $(( $OPTIND - 1 ))
tpoint=block:block_rq_issue
var=nr_sector
duration=$1
### convert buckets (Kbytes) to disk sectors
i=0
sectors=(${buckets[*]})
((max_i = ${#buckets[*]} - 1))
while (( i <= max_i )); do
(( sectors[$i] = ${sectors[$i]} * 1024 / $secsz ))
# avoid negative array index errors for old version bash
if (( i > 0 ));then
if (( ${sectors[$i]} <= ${sectors[$i - 1]} )); then
die "ERROR: bucket list must increase in size."
fi
fi
(( i++ ))
done
### build list of tracepoints and filters for each histogram bucket
max_b=${buckets[$max_i]}
max_s=${sectors[$max_i]}
tpoints="-e $tpoint --filter \"$var < ${sectors[0]}\""
awkarray=
i=0
while (( i < max_i )); do
tpoints="$tpoints -e $tpoint --filter \"$var >= ${sectors[$i]} && "
tpoints="$tpoints $var < ${sectors[$i + 1]}\""
awkarray="$awkarray buckets[$i]=${buckets[$i]};"
(( i++ ))
done
awkarray="$awkarray buckets[$max_i]=${buckets[$max_i]};"
tpoints="$tpoints -e $tpoint --filter \"$var >= ${sectors[$max_i]}\""
### prepare to run
if (( duration )); then
etext="for $duration seconds"
cmd="sleep $duration"
else
etext="until Ctrl-C"
cmd="sleep 999999"
fi
echo "Tracing block I/O size (bytes), $etext..."
### run perf
out="-o /dev/stdout" # a workaround needed in linux 3.2; not by 3.4.15
stat=$(eval perf stat $tpoints -a $out $cmd 2>&1)
if (( $? != 0 )); then
echo >&2 "ERROR running perf:"
echo >&2 "$stat"
exit
fi
### find max value for ASCII histogram
most=$(echo "$stat" | awk -v tpoint=$tpoint '
$2 == tpoint { gsub(/,/, ""); if ($1 > m) { m = $1 } }
END { print m }'
)
### process output
echo
echo "$stat" | awk -v tpoint=$tpoint -v max_i=$max_i -v most=$most '
function star(sval, smax, swidth) {
stars = ""
# using int could avoid error on gawk
if (int(smax) == 0) return ""
for (si = 0; si < (swidth * sval / smax); si++) {
stars = stars "#"
}
return stars
}
BEGIN {
'"$awkarray"'
printf(" %-15s: %-8s %s\n", "Kbytes", "I/O",
"Distribution")
}
/Performance counter stats/ { i = -1 }
# reverse order of rule set is important
{ ok = 0 }
$2 == tpoint { num = $1; gsub(/,/, "", num); ok = 1 }
ok && i >= max_i {
printf(" %10.1f -> %-10s: %-8s |%-38s|\n",
buckets[i], "", num, star(num, most, 38))
next
}
ok && i >= 0 && i < max_i {
printf(" %10.1f -> %-10.1f: %-8s |%-38s|\n",
buckets[i], buckets[i+1] - 0.1, num,
star(num, most, 38))
i++
next
}
ok && i == -1 {
printf(" %10s -> %-10.1f: %-8s |%-38s|\n", "",
buckets[0] - 0.1, num, star(num, most, 38))
i++
}
'
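The Kbytes-to-sectors conversion the script performs before building its tracepoint filters can be sketched standalone (bucket values are the script's defaults; 512-byte sectors):

```shell
# Convert histogram bucket points (Kbytes) to 512-byte disk sectors,
# as bitesize does before filtering on nr_sector.
secsz=512
buckets=(1 8 64 128)
for b in "${buckets[@]}"; do
    echo "${b} Kbytes = $(( b * 1024 / secsz )) sectors"
done
```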
examples/bitesize_example.txt:
Demonstrations of bitesize, the Linux perf_events version.
bitesize traces block I/O issued, and reports a histogram of I/O size. By
default five buckets are used to gather statistics on common I/O sizes:
# ./bitesize
Tracing block I/O size (bytes), until Ctrl-C...
^C
Kbytes : I/O Distribution
-> 0.9 : 0 | |
1.0 -> 7.9 : 38 |# |
8.0 -> 63.9 : 10108 |######################################|
64.0 -> 127.9 : 13 |# |
128.0 -> : 1 |# |
In this case, most of the I/O was between 8 and 63.9 Kbytes. The "63.9"
really means "less than 64".
Specifying custom buckets to examine the I/O size in more detail:
# ./bitesize -b "8 16 24 32"
Tracing block I/O size (bytes), until Ctrl-C...
^C
Kbytes : I/O Distribution
-> 7.9 : 89 |# |
8.0 -> 15.9 : 14665 |######################################|
16.0 -> 23.9 : 657 |## |
24.0 -> 31.9 : 661 |## |
32.0 -> : 376 |# |
The I/O is mostly between 8 and 15.9 Kbytes.
It's probably 8 Kbytes. Checking:
# ./bitesize -b "8 9"
Tracing block I/O size (bytes), until Ctrl-C...
^C
Kbytes : I/O Distribution
-> 7.9 : 62 |# |
8.0 -> 8.9 : 11719 |######################################|
9.0 -> : 1358 |##### |
It is.
The overhead of this tool scales with the number of buckets used, so only
use as many buckets as necessary.
To study this I/O in more detail, I can use iosnoop(8) and capture it to a file
for post-processing.
Use -h to print the USAGE message:
# ./bitesize -h
USAGE: bitesize [-h] [-b buckets] [seconds]
-b buckets # specify histogram buckets (Kbytes)
-h # this usage message
eg,
bitesize # trace I/O size until Ctrl-C
bitesize 10 # trace I/O size for 10 seconds
bitesize -b "8 16 32" # specify custom bucket points
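The ASCII distribution bars in the output above are scaled by the star() helper in bitesize's awk post-processor; its behavior can be sketched in isolation:

```shell
# Scale a value to a bar of hashes relative to the maximum bucket
# count, mirroring the star() function in bitesize's awk code:
# star(5, 10, 38) fills 5/10 of a 38-column bar, ie 19 hashes.
awk 'function star(sval, smax, swidth,   s, i) {
    if (int(smax) == 0) return ""
    s = ""
    for (i = 0; i < (swidth * sval / smax); i++) s = s "#"
    return s
}
BEGIN { print star(5, 10, 38) }'
```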
examples/cachestat_example.txt:
Demonstrations of cachestat, the Linux ftrace version.
Here is some sample output showing file system cache statistics, followed by
the workload that caused it:
# ./cachestat -t
Counting cache functions... Output every 1 seconds.
TIME HITS MISSES DIRTIES RATIO BUFFERS_MB CACHE_MB
08:28:57 415 0 0 100.0% 1 191
08:28:58 411 0 0 100.0% 1 191
08:28:59 362 97 0 78.9% 0 8
08:29:00 411 0 0 100.0% 0 9
08:29:01 775 20489 0 3.6% 0 89
08:29:02 411 0 0 100.0% 0 89
08:29:03 6069 0 0 100.0% 0 89
08:29:04 15249 0 0 100.0% 0 89
08:29:05 411 0 0 100.0% 0 89
08:29:06 411 0 0 100.0% 0 89
08:29:07 411 0 3 100.0% 0 89
[...]
I used the -t option to include the TIME column, to make describing the output
easier.
The workload was:
# echo 1 > /proc/sys/vm/drop_caches; sleep 2; cksum 80m; sleep 2; cksum 80m
At 8:28:58, the page cache was dropped by the first command, which can be seen
by the drop in size for "CACHE_MB" (page cache size) from 191 Mbytes to 8.
After a 2 second sleep, a cksum command was issued at 8:29:01, for an 80 Mbyte
file (called "80m"), which caused a total of ~20,400 misses ("MISSES" column),
and the page cache size to grow by 80 Mbytes. The hit ratio during this dropped
to 3.6%. Finally, after another 2 second sleep, at 8:29:03 the cksum command
was run a second time, this time hitting entirely from cache.
Instrumenting all file system cache accesses does cost some overhead, and this
tool might slow your target system by 2% or so. Test before use if this is a
concern.
This tool also uses dynamic tracing, and is tied to Linux kernel implementation
details. If it doesn't work for you, it probably needs fixing.
Use -h to print the USAGE message:
# ./cachestat -h
USAGE: cachestat [-Dht] [interval]
-D # print debug counters
-h # this usage message
-t # include timestamp
interval # output interval in secs (default 1)
eg,
cachestat # show stats every second
cachestat 5 # show stats every 5 seconds
See the man page and example file for more info.
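The RATIO column above is hits / (hits + misses); the 3.6% figure at 08:29:01 can be reproduced from the sample's HITS and MISSES values (a sketch using those numbers):

```shell
# Recompute the cache hit ratio from the 08:29:01 sample:
# 775 hits, 20489 misses -> 775 / (775 + 20489) = 3.6%
awk 'BEGIN { h = 775; m = 20489; printf "%.1f%%\n", 100 * h / (h + m) }'
```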
examples/execsnoop_example.txt:
Demonstrations of execsnoop, the Linux ftrace version.
Here's execsnoop showing what's really executed by "man ls":
# ./execsnoop
Tracing exec()s. Ctrl-C to end.
PID PPID ARGS
22898 22004 man ls
22905 22898 preconv -e UTF-8
22908 22898 pager -s
22907 22898 nroff -mandoc -rLL=164n -rLT=164n -Tutf8
22906 22898 tbl
22911 22910 locale charmap
22912 22907 groff -mtty-char -Tutf8 -mandoc -rLL=164n -rLT=164n
22913 22912 troff -mtty-char -mandoc -rLL=164n -rLT=164n -Tutf8
22914 22912 grotty
Many commands. This is particularly useful for understanding application
startup.
Another use for execsnoop is identifying short-lived processes. Eg, with the -t
option to see timestamps:
# ./execsnoop -t
Tracing exec()s. Ctrl-C to end.
TIMEs PID PPID ARGS
7419756.154031 8185 8181 mawk -W interactive -v o=1 -v opt_name=0 -v name= [...]
7419756.154131 8186 8184 cat -v trace_pipe
7419756.245264 8188 1698 ./run
7419756.245691 8189 1696 ./run
7419756.246212 8187 1689 ./run
7419756.278993 8190 1693 ./run
7419756.278996 8191 1692 ./run
7419756.288430 8192 1695 ./run
7419756.290115 8193 1691 ./run
7419756.292406 8194 1699 ./run
7419756.293986 8195 1690 ./run
7419756.294149 8196 1686 ./run
7419756.296527 8197 1687 ./run
7419756.296973 8198 1697 ./run
7419756.298356 8200 1685 ./run
7419756.298683 8199 1688 ./run
7419757.269883 8201 1696 ./run
[...]
So we're running many "run" commands every second. The PPID is included, so I
can debug this further (they are "supervise" processes).
Short-lived processes can consume CPU and not be visible from top(1), and can
be the source of hidden performance issues.
Here's another example: I noticed CPU usage was high in top(1), but couldn't
see the responsible process:
$ top
top - 00:04:32 up 78 days, 15:41, 3 users, load average: 0.85, 0.29, 0.14
Tasks: 123 total, 1 running, 121 sleeping, 0 stopped, 1 zombie
Cpu(s): 15.7%us, 34.9%sy, 0.0%ni, 49.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.2%st
Mem: 7629464k total, 7537216k used, 92248k free, 1376492k buffers
Swap: 0k total, 0k used, 0k free, 5432356k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7225 bgregg-t 20 0 29480 6196 2128 S 3 0.1 0:02.64 ec2rotatelogs
1 root 20 0 24320 2256 1340 S 0 0.0 0:01.23 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0 0.0 1:19.61 ksoftirqd/0
4 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/0:0
5 root 20 0 0 0 0 S 0 0.0 0:00.01 kworker/u:0
6 root RT 0 0 0 0 S 0 0.0 0:16.00 migration/0
7 root RT 0 0 0 0 S 0 0.0 0:17.29 watchdog/0
8 root RT 0 0 0 0 S 0 0.0 0:15.85 migration/1
9 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/1:0
[...]
See the line starting with "Cpu(s):". So there's about 50% CPU utilized (this
is a two CPU server, so that's equivalent to one full CPU), but this CPU usage
isn't visible from the process listing.
vmstat agreed, showing the same average CPU usage statistics:
# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 0 0 92816 1376476 5432188 0 0 0 3 2 1 0 1 99 0
1 0 0 92676 1376484 5432264 0 0 0 24 6573 6130 12 38 49 0
1 0 0 91964 1376484 5432272 0 0 0 0 6529 6097 16 35 49 0
1 0 0 92692 1376484 5432272 0 0 0 0 6192 5775 17 35 49 0
1 0 0 92692 1376484 5432272 0 0 0 0 6554 6121 14 36 50 0
1 0 0 91940 1376484 5432272 0 0 0 12 6546 6101 13 38 49 0
1 0 0 92560 1376484 5432272 0 0 0 0 6201 5769 15 35 49 0
1 0 0 92676 1376484 5432272 0 0 0 0 6524 6123 17 34 49 0
1 0 0 91932 1376484 5432272 0 0 0 0 6546 6107 10 40 49 0
1 0 0 92832 1376484 5432272 0 0 0 0 6057 5710 13 38 49 0
1 0 0 92248 1376484 5432272 0 0 84 28 6592 6183 16 36 48 1
1 0 0 91504 1376492 5432348 0 0 0 12 6540 6098 18 33 49 1
[...]
So this could be caused by short-lived processes, who vanish before they are
seen by top(1). Do I have my execsnoop handy? Yes:
# ~/perf-tools/bin/execsnoop
Tracing exec()s. Ctrl-C to end.
PID PPID ARGS
10239 10229 gawk -v o=0 -v opt_name=0 -v name= -v opt_duration=0 [...]
10240 10238 cat -v trace_pipe
10242 7225 sh [?]
10243 10242 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.201201.3122.txt
10245 7225 sh [?]
10246 10245 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.202201.3122.txt
10248 7225 sh [?]
10249 10248 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.203201.3122.txt
10251 7225 sh [?]
10252 10251 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.204201.3122.txt
10254 7225 sh [?]
10255 10254 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.205201.3122.txt
10257 7225 sh [?]
10258 10257 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.210201.3122.txt
10260 7225 sh [?]
10261 10260 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.211201.3122.txt
10263 7225 sh [?]
10264 10263 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.212201.3122.txt
10266 7225 sh [?]
10267 10266 /usr/sbin/lsof -X /logs/tomcat/cores/threaddump.20141215.213201.3122.txt
[...]
The output scrolled quickly, showing that many shell and lsof processes were
being launched. If you check the PID and PPID columns carefully, you can see that
these are ultimately all from PID 7225. We saw that earlier in the top output:
ec2rotatelogs, at 3% CPU. I now know the culprit.
I should have used "-t" to show the timestamps with this example.
Run -h to print the USAGE message:
# ./execsnoop -h
USAGE: execsnoop [-hrt] [-a argc] [-d secs] [name]
-d seconds # trace duration, and use buffers
-a argc # max args to show (default 8)
-r # include re-execs
-t # include time (seconds)
-h # this usage message
name # process name to match (REs allowed)
eg,
execsnoop # watch exec()s live (unbuffered)
execsnoop -d 1 # trace 1 sec (buffered)
execsnoop grep # trace process names containing grep
execsnoop 'log$' # filenames ending in "log"
See the man page and example file for more info.
examples/funccount_example.txt:
Demonstrations of funccount, the Linux ftrace version.
Tracing all kernel functions that start with "bio_" (which would be block
interface functions), and counting how many times they were executed until
Ctrl-C is hit:
# ./funccount 'bio_*'
Tracing "bio_*"... Ctrl-C to end.
^C
FUNC COUNT
bio_attempt_back_merge 26
bio_get_nr_vecs 361
bio_alloc 536
bio_alloc_bioset 536
bio_endio 536
bio_free 536
bio_fs_destructor 536
bio_init 536
bio_integrity_enabled 536
bio_put 729
bio_add_page 1004
Note that these counts are performed in-kernel context, using the ftrace
function profiler, which means this is a (relatively) low overhead technique.
Test yourself to quantify overhead.
As was demonstrated here, wildcards can be used. Individual functions can also
be specified. For example, all of the following are valid arguments:
bio_init
bio_*
*init
*bio*
A "*" within a string (eg, "bio*init") is not supported.
The full list of what can be traced is in:
/sys/kernel/debug/tracing/available_filter_functions, which can be grep'd to
check what is there. Note that grep uses regular expressions, whereas
funccount uses globbing for wildcards.
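The glob-versus-regex distinction matters when grepping available_filter_functions: funccount's glob 'bio_*' corresponds to the anchored grep regex '^bio_'. A sketch with a hypothetical function list standing in for the real file:

```shell
# funccount's glob "bio_*" matches names starting with "bio_"; the
# equivalent grep regex anchors at the start of the line, so
# "submit_bio" is correctly excluded.
printf 'bio_init\nbio_alloc\nbrd_probe\nsubmit_bio\n' | grep '^bio_'
```

On a live system, the pipe's input would be /sys/kernel/debug/tracing/available_filter_functions.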
Counting all "tcp_" kernel functions, and printing a summary every one second:
# ./funccount -i 1 -t 5 'tcp_*'
Tracing "tcp_*". Top 5 only... Ctrl-C to end.
FUNC COUNT
tcp_cleanup_rbuf 386
tcp_service_net_dma 386
tcp_established_options 549
tcp_v4_md5_lookup 560
tcp_v4_md5_do_lookup 890
FUNC COUNT
tcp_service_net_dma 498
tcp_cleanup_rbuf 499
tcp_established_options 664
tcp_v4_md5_lookup 672
tcp_v4_md5_do_lookup 1071
[...]
Neat.
Tracing all "ext4*" kernel functions for 10 seconds, and printing the top 25:
# ./funccount -t 25 -d 10 'ext4*'
Tracing "ext4*" for 10 seconds. Top 25 only...
FUNC COUNT
ext4_inode_bitmap 840
ext4_meta_trans_blocks 840
ext4_ext_drop_refs 843
ext4_find_entry 845
ext4_discard_preallocations 1008
ext4_free_inodes_count 1120
ext4_group_desc_csum 1120
ext4_group_desc_csum_set 1120
ext4_getblk 1128
ext4_es_free_extent 1328
ext4_map_blocks 1471
ext4_es_lookup_extent 1751
ext4_mb_check_limits 1873
ext4_es_lru_add 2031
ext4_data_block_valid 2312
ext4_journal_check_start 3080
ext4_mark_inode_dirty 5320
ext4_get_inode_flags 5955
ext4_get_inode_loc 5955
ext4_mark_iloc_dirty 5955
ext4_reserve_inode_write 5955
ext4_inode_table 7076
ext4_get_group_desc 8476
ext4_has_inline_data 9492
ext4_inode_touch_time_cmp 38980
Ending tracing...
So ext4_inode_touch_time_cmp() was called the most frequently, at 38,980 times.
This may be normal, this may not. The purpose of this tool is to give you one
view of how one or many kernel functions are executed. Previously I had little
idea what ext4 was doing internally. Now I know the top 25 functions, and their
rate, and can begin researching them from the source code.
Use -h to print the USAGE message:
# ./funccount -h
USAGE: funccount [-hT] [-i secs] [-d secs] [-t top] funcstring
-d seconds # total duration of trace
-h # this usage message
-i seconds # interval summary
-t top # show top num entries only
-T # include timestamp (for -i)
eg,
funccount 'vfs*' # trace all funcs that match "vfs*"
funccount -d 5 'tcp*' # trace "tcp*" funcs for 5 seconds
funccount -t 10 'ext3*' # show top 10 "ext3*" funcs
funccount -i 1 'ext3*' # summary every 1 second
funccount -i 1 -d 5 'ext3*' # 5 x 1 second summaries
See the man page and example file for more info.
examples/funcgraph_example.txt:
Demonstrations of funcgraph, the Linux ftrace version.
I'll start by showing do_nanosleep(), since it's usually a low frequency
function that can be easily triggered (run "vmstat 1"):
# ./funcgraph do_nanosleep
Tracing "do_nanosleep"... Ctrl-C to end.
0) | do_nanosleep() {
0) | hrtimer_start_range_ns() {
0) | __hrtimer_start_range_ns() {
0) | lock_hrtimer_base.isra.24() {
0) 0.198 us | _raw_spin_lock_irqsave();
0) 0.908 us | }
0) 0.061 us | idle_cpu();
0) 0.117 us | ktime_get();
0) 0.371 us | enqueue_hrtimer();
0) 0.075 us | _raw_spin_unlock_irqrestore();
0) 3.447 us | }
0) 3.998 us | }
0) | schedule() {
0) | __schedule() {
0) 0.050 us | rcu_note_context_switch();
0) 0.055 us | _raw_spin_lock_irq();
0) | deactivate_task() {
0) | dequeue_task() {
0) 0.142 us | update_rq_clock();
0) | dequeue_task_fair() {
0) | dequeue_entity() {
0) | update_curr() {
0) 0.086 us | cpuacct_charge();
0) 0.757 us | }
0) 0.052 us | clear_buddies();
0) 0.103 us | update_cfs_load();
0) | update_cfs_shares() {
0) | reweight_entity() {
0) 0.077 us | update_curr();
0) 0.438 us | }
0) 0.794 us | }
0) 3.067 us | }
0) 0.064 us | set_next_buddy();
0) 0.066 us | update_cfs_load();
0) 0.085 us | update_cfs_shares();
0) | hrtick_update() {
0) 0.063 us | hrtick_start_fair();
0) 0.367 us | }
0) 5.188 us | }
0) 5.923 us | }
0) 6.228 us | }
0) | put_prev_task_fair() {
0) 0.078 us | put_prev_entity();
0) | put_prev_entity() {
0) 0.070 us | update_curr();
0) 0.074 us | __enqueue_entity();
0) 0.737 us | }
0) 1.367 us | }
0) | pick_next_task_fair() {
0) | pick_next_entity() {
0) 0.052 us | wakeup_preempt_entity.isra.95();
0) 0.070 us | clear_buddies();
0) 0.676 us | }
0) | set_next_entity() {
0) 0.052 us | update_stats_wait_end();
0) 0.435 us | }
0) | pick_next_entity() {
0) 0.065 us | clear_buddies();
0) 0.376 us | }
0) | set_next_entity() {
0) 0.067 us | update_stats_wait_end();
0) 0.374 us | }
0) 0.051 us | hrtick_start_fair();
0) 3.879 us | }
0) 0.057 us | paravirt_start_context_switch();
0) | xen_load_sp0() {
0) 0.050 us | paravirt_get_lazy_mode();
0) 0.057 us | __xen_mc_entry();
0) 0.056 us | paravirt_get_lazy_mode();
0) 1.441 us | }
0) | xen_load_tls() {
0) 0.049 us | paravirt_get_lazy_mode();
0) 0.051 us | paravirt_get_lazy_mode();
0) | load_TLS_descriptor() {
0) | arbitrary_virt_to_machine() {
0) 0.081 us | __virt_addr_valid();
0) 0.052 us | __phys_addr();
0) 0.084 us | get_phys_to_machine();
0) 1.115 us | }
0) 0.053 us | __xen_mc_entry();
0) 1.744 us | }
0) | load_TLS_descriptor() {
0) | arbitrary_virt_to_machine() {
0) 0.053 us | __virt_addr_valid();
0) 0.056 us | __phys_addr();
0) 0.057 us | get_phys_to_machine();
0) 0.990 us | }
0) 0.053 us | __xen_mc_entry();
0) 1.583 us | } /* load_TLS_descriptor */
0) | load_TLS_descriptor() {
0) | arbitrary_virt_to_machine() {
0) 0.057 us | __virt_addr_valid();
0) 0.051 us | __phys_addr();
0) 0.053 us | get_phys_to_machine();
0) 0.978 us | }
0) 0.052 us | __xen_mc_entry();
0) 1.586 us | }
0) 0.052 us | paravirt_get_lazy_mode();
0) 6.630 us | }
0) | xen_end_context_switch() {
0) 0.666 us | xen_mc_flush();
0) 0.050 us | paravirt_end_context_switch();
0) 1.286 us | }
0) 0.172 us | xen_write_msr_safe();
------------------------------------------
0) platfor-3210 => vmstat-2854
------------------------------------------
0) | do_nanosleep() {
0) | hrtimer_start_range_ns() {
0) | __hrtimer_start_range_ns() {
0) | lock_hrtimer_base.isra.24() {
0) 0.217 us | _raw_spin_lock_irqsave();
0) 0.831 us | }
0) 0.066 us | idle_cpu();
0) 0.123 us | ktime_get();
0) 1.172 us | enqueue_hrtimer();
0) 0.089 us | _raw_spin_unlock_irqrestore();
0) 4.050 us | }
0) 4.523 us | }
[...]
The default output shows the function call graph, including all child kernel
functions, along with the function duration times. These times are printed on
either the return line for the function ("}"), or for leaf functions, on the
same line.
The format of this output is documented in the function graph section of the
kernel source file Documentation/trace/ftrace.txt.
This particular example shows the workings of do_nanosleep, in the first dozen
lines, and then schedule() is called to sleep this thread and run another. The
inner workings of schedule() is included in the output.
This output is great for determining the behavior of a certain kernel function,
and for identifying functions that can be studied in more detail using other,
lower-overhead tools (eg, funccount(8), functrace(8), kprobe(8)). The overheads
of funcgraph are moderate, since every kernel function is instrumented, and
each one that executes is included in the output.
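Under the covers, funcgraph drives the kernel's function_graph tracer through
the tracing filesystem. As a rough sketch of the equivalent manual steps
(assumptions: root privileges, and tracefs mounted at
/sys/kernel/debug/tracing, which varies by distribution):

```shell
# a sketch of the manual equivalent of funcgraph, for illustration only
# (assumes root and tracefs at /sys/kernel/debug/tracing)
funcgraph_by_hand() {
    cd /sys/kernel/debug/tracing || return 1
    echo do_nanosleep > set_graph_function   # graph this function only
    echo function_graph > current_tracer     # enable the graph tracer
    cat trace_pipe                           # stream output; Ctrl-C to stop
    echo nop > current_tracer                # disable tracing afterwards
    echo > set_graph_function                # clear the function filter
}
```

funcgraph adds option handling and careful cleanup on top of steps like these.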
Now, if you want to start understanding the general behavior of the kernel,
without a particular kernel function in mind, you may be better off beginning
with CPU stack profiling using perf and generating a flame graph. Such an approach
has low overhead, as you are in control of the frequency of event collection
(eg, gathering CPU stacks at 99 Hertz). For instructions, see:
http://www.brendangregg.com/perf.html#FlameGraphs
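As a sketch of that workflow (the stackcollapse-perf.pl and flamegraph.pl
scripts are from the separate FlameGraph repository, not part of perf-tools;
their location in the current directory is an assumption):

```shell
# a sketch of CPU flame graph generation (assumes perf is installed, root
# privileges, and the FlameGraph scripts are in the current directory)
make_flamegraph() {
    perf record -F 99 -a -g -- sleep 30   # sample all CPUs at 99 Hertz for 30s
    perf script |
        ./stackcollapse-perf.pl |
        ./flamegraph.pl > cpu.svg         # open cpu.svg in a browser
}
```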
On newer Linux kernels, you can use the -m option to limit the function
depth. Eg, 3 levels only:
# ./funcgraph -m 3 do_nanosleep
Tracing "do_nanosleep"... Ctrl-C to end.
1) | do_nanosleep() {
1) | hrtimer_start_range_ns() {
1) 1.115 us | __hrtimer_start_range_ns();
1) 1.919 us | }
1) | schedule() {
1) | __schedule() {
1) 1000131 us | }
1) 11.006 us | xen_evtchn_do_upcall();
1) 1000149 us | }
1) | hrtimer_cancel() {
1) 0.212 us | hrtimer_try_to_cancel();
1) 0.699 us | }
1) 1000154 us | }
Neat.
Now do_sys_open() to 3 levels:
0) | do_sys_open() {
0) | getname() {
0) 0.296 us | getname_flags();
0) 0.768 us | }
0) | get_unused_fd_flags() {
0) 0.397 us | __alloc_fd();
0) 0.827 us | }
0) | do_filp_open() {
0) 4.166 us | path_openat();
0) 4.617 us | }
0) | __fsnotify_parent() {
0) 0.083 us | dget_parent();
0) 0.063 us | dput();
0) 0.883 us | }
0) 0.058 us | fsnotify();
0) | fd_install() {
0) 0.133 us | __fd_install();
0) 0.525 us | }
0) | putname() {
0) 0.198 us | final_putname();
0) 0.512 us | }
0) 10.777 us | }
[...]
I can then pick the highest-latency child function, and run funcgraph again
using it as the target.
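One way to pick that child mechanically is to sort a saved trace by the
duration column. This is a hypothetical helper, not part of funcgraph; the
column layout is assumed from the output above, and only leaf lines (those
carrying a duration and a function name) are counted:

```shell
# hypothetical helper: rank leaf functions in a saved funcgraph trace by
# duration, to choose the next tracing target (close-brace totals skipped)
sort_children() {
    awk -F'|' '/ us / { d = $1; sub(/^[^)]*\) */, "", d); sub(/ *us.*/, "", d)
                        n = $2; gsub(/[ ;(){}]/, "", n)
                        if (n != "") printf "%10.3f  %s\n", d, n }' "$1" |
        sort -rn
}
# usage: ./funcgraph -m 3 do_sys_open > trace.out; sort_children trace.out
```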
Without durations (the -D option elides them):
# ./funcgraph -Dm 3 do_sys_open
Tracing "do_sys_open"... Ctrl-C to end.
1) do_sys_open() {
1) getname() {
1) getname_flags();
1) }
1) get_unused_fd_flags() {
1) __alloc_fd();
1) }
1) do_filp_open() {
1) path_openat();
1) }
1) __fsnotify_parent();
1) fsnotify();
1) fd_install() {
1) __fd_install();
1) }
1) putname() {
1) final_putname();
1) }
1) }
Beautiful.
I could elide the CPU column as well, but I want to leave it: if it changes
half-way through some output, you know the CPU buffer has switched, and the
output may be shuffled.
For this example, I trace vfs_read() calls by process ID 5363, which is a bash
shell. I also include headers (-H) and absolute timestamps (-t). While
tracing, in that bash shell, I typed the word "hello":
# ./funcgraph -Htp 5363 vfs_read
Tracing "vfs_read" for PID 5363... Ctrl-C to end.
# tracer: function_graph
#
# TIME CPU DURATION FUNCTION CALLS
# | | | | | | | |
7238523.638008 | 0) | finish_task_switch() {
7238523.638012 | 0) | xen_evtchn_do_upcall() {
7238523.638012 | 0) | irq_enter() {
7238523.638013 | 0) 0.153 us | rcu_irq_enter();
7238523.638014 | 0) 1.144 us | }
7238523.638014 | 0) 0.056 us | exit_idle();
7238523.638014 | 0) | __xen_evtchn_do_upcall() {
7238523.638015 | 0) | evtchn_2l_handle_events() {
7238523.638015 | 0) 0.057 us | irq_from_virq();
7238523.638015 | 0) | evtchn_from_irq() {
7238523.638015 | 0) | irq_get_irq_data() {
7238523.638016 | 0) 0.058 us | irq_to_desc();
7238523.638016 | 0) 0.565 us | }
7238523.638016 | 0) 0.966 us | }
7238523.638016 | 0) | get_evtchn_to_irq() {
7238523.638017 | 0) 0.050 us | evtchn_2l_max_channels();
7238523.638017 | 0) 0.386 us | }
7238523.638017 | 0) | generic_handle_irq() {
7238523.638017 | 0) 0.058 us | irq_to_desc();
7238523.638018 | 0) | handle_percpu_irq() {
7238523.638018 | 0) | ack_dynirq() {
7238523.638018 | 0) | evtchn_from_irq() {
7238523.638018 | 0) | irq_get_irq_data() {
7238523.638019 | 0) 0.049 us | irq_to_desc();
7238523.638019 | 0) 0.441 us | }
7238523.638019 | 0) 0.772 us | }
7238523.638019 | 0) 0.049 us | irq_move_irq();
7238523.638020 | 0) 0.060 us | evtchn_2l_clear_pending();
7238523.638020 | 0) 1.810 us | }
7238523.638020 | 0) | handle_irq_event_percpu() {
7238523.638020 | 0) | xen_irq_work_interrupt() {
7238523.638021 | 0) | irq_enter() {
7238523.638021 | 0) 0.056 us | rcu_irq_enter();
7238523.638021 | 0) 0.384 us | }
7238523.638021 | 0) | __wake_up() {
7238523.638022 | 0) 0.059 us | _raw_spin_lock_irqsave();
7238523.638022 | 0) | __wake_up_common() {
7238523.638022 | 0) | autoremove_wake_function() {
7238523.638023 | 0) | default_wake_function() {
7238523.638023 | 0) | try_to_wake_up() {
7238523.638023 | 0) 0.220 us | _raw_spin_lock_irqsave();
7238523.638024 | 0) 0.270 us | task_waking_fair();
7238523.638024 | 0) | select_task_rq_fair() {
7238523.638025 | 0) 0.055 us | source_load();
7238523.638025 | 0) 0.056 us | target_load();
7238523.638025 | 0) 0.060 us | idle_cpu();
7238523.638026 | 0) 0.054 us | cpus_share_cache();
7238523.638026 | 0) 0.083 us | idle_cpu();
7238523.638026 | 0) 2.060 us | }
7238523.638027 | 0) 0.051 us | _raw_spin_lock();
7238523.638027 | 0) | ttwu_do_activate.constprop.124() {
7238523.638027 | 0) | activate_task() {
7238523.638027 | 0) | enqueue_task() {
7238523.638028 | 0) 0.120 us | update_rq_clock();
7238523.638028 | 0) | enqueue_task_fair() {
7238523.638028 | 0) | enqueue_entity() {
7238523.638028 | 0) 0.147 us | update_curr();
7238523.638029 | 0) 0.055 us | __compute_runnable_contrib.part.51();
7238523.638029 | 0) 0.066 us | __update_entity_load_avg_contrib();
7238523.638029 | 0) 0.141 us | update_cfs_rq_blocked_load();
7238523.638030 | 0) 0.068 us | account_entity_enqueue();
7238523.638030 | 0) 0.351 us | update_cfs_shares();
7238523.638031 | 0) 0.053 us | place_entity();
7238523.638031 | 0) 0.082 us | __enqueue_entity();
7238523.638032 | 0) 0.050 us | update_cfs_rq_blocked_load();
7238523.638032 | 0) 3.922 us | }
7238523.638032 | 0) | enqueue_entity() {
7238523.638033 | 0) 0.058 us | update_curr();
7238523.638033 | 0) 0.056 us | __compute_runnable_contrib.part.51();
7238523.638033 | 0) 0.078 us | __update_entity_load_avg_contrib();
7238523.638034 | 0) 0.055 us | update_cfs_rq_blocked_load();
7238523.638034 | 0) 0.064 us | account_entity_enqueue();
7238523.638034 | 0) 0.059 us | update_cfs_shares();
7238523.638035 | 0) 0.050 us | place_entity();
7238523.638036 | 0) 0.057 us | __enqueue_entity();
7238523.638036 | 0) 3.829 us | }
7238523.638037 | 0) 0.057 us | hrtick_update();
7238523.638037 | 0) 8.876 us | }
7238523.638037 | 0) 9.698 us | }
7238523.638037 | 0) 10.113 us | }
7238523.638038 | 0) | ttwu_do_wakeup() {
7238523.638038 | 0) | check_preempt_curr() {
7238523.638038 | 0) | resched_task() {
7238523.638038 | 0) | xen_smp_send_reschedule() {
7238523.638038 | 0) | xen_send_IPI_one() {
7238523.638039 | 0) | notify_remote_via_irq() {
7238523.638039 | 0) | evtchn_from_irq() {
7238523.638039 | 0) | irq_get_irq_data() {
7238523.638039 | 0) 0.051 us | irq_to_desc();
7238523.638039 | 0) 0.518 us | }
7238523.638040 | 0) 0.955 us | }
7238523.638041 | 0) 2.001 us | }
7238523.638041 | 0) 2.391 us | }
7238523.638041 | 0) 2.745 us | }
7238523.638041 | 0) 3.183 us | }
7238523.638042 | 0) 3.663 us | }
7238523.638042 | 0) 4.621 us | }
7238523.638043 | 0) 15.443 us | }
7238523.638043 | 0) 0.067 us | _raw_spin_unlock();
7238523.638043 | 0) 0.167 us | ttwu_stat();
7238523.638044 | 0) 0.087 us | _raw_spin_unlock_irqrestore();
7238523.638044 | 0) 21.447 us | }
7238523.638045 | 0) 21.940 us | }
7238523.638045 | 0) 22.406 us | }
7238523.638045 | 0) 23.071 us | }
7238523.638045 | 0) 0.073 us | _raw_spin_unlock_irqrestore();
7238523.638046 | 0) 24.382 us | }
7238523.638046 | 0) | irq_exit() {
7238523.638047 | 0) 0.085 us | idle_cpu();
7238523.638047 | 0) 0.093 us | rcu_irq_exit();
7238523.638048 | 0) 1.242 us | }
7238523.638048 | 0) 27.410 us | }
7238523.638049 | 0) 0.139 us | add_interrupt_randomness();
7238523.638049 | 0) 0.089 us | note_interrupt();
7238523.638050 | 0) 29.582 us | }
7238523.638050 | 0) 32.112 us | }
7238523.638050 | 0) 32.951 us | }
7238523.638051 | 0) 35.765 us | }
7238523.638051 | 0) 36.170 us | }
7238523.638051 | 0) | irq_exit() {
7238523.638051 | 0) 0.082 us | idle_cpu();
7238523.638052 | 0) 0.071 us | rcu_irq_exit();
7238523.638053 | 0) 1.328 us | }
7238523.638053 | 0) 40.563 us | }
7238523.638054 | 0) | __mmdrop() {
7238523.638054 | 0) | pgd_free() {
7238523.638055 | 0) 0.151 us | _raw_spin_lock();
7238523.638055 | 0) 0.069 us | _raw_spin_unlock();
7238523.638056 | 0) | xen_pgd_free() {
7238523.638056 | 0) 0.067 us | xen_get_user_pgd();
7238523.638057 | 0) | free_pages() {
7238523.638057 | 0) | __free_pages() {
7238523.638057 | 0) | free_hot_cold_page() {
7238523.638058 | 0) 0.080 us | free_pages_prepare();
7238523.638058 | 0) 0.363 us | get_pfnblock_flags_mask();
7238523.638059 | 0) 1.626 us | }
7238523.638059 | 0) 2.317 us | }
7238523.638060 | 0) 2.847 us | }
7238523.638060 | 0) 3.908 us | }
7238523.638060 | 0) | free_pages() {
7238523.638060 | 0) | __free_pages() {
7238523.638061 | 0) | free_hot_cold_page() {
7238523.638061 | 0) 0.083 us | free_pages_prepare();
7238523.638061 | 0) 0.139 us | get_pfnblock_flags_mask();
7238523.638062 | 0) 1.062 us | }
7238523.638062 | 0) 1.534 us | }
7238523.638062 | 0) 2.038 us | }
7238523.638063 | 0) 8.268 us | }
7238523.638064 | 0) 0.160 us | destroy_context();
7238523.638065 | 0) 0.384 us | kmem_cache_free();
7238523.638066 | 0) 11.433 us | }
7238523.638066 | 0) 54.448 us | }
7238523.638066 | 0) 19354026 us | } /* __schedule */
7238523.638067 | 0) 19354026 us | } /* schedule */
7238523.638067 | 0) 19354027 us | } /* schedule_timeout */
7238523.638067 | 0) 0.121 us | down_read();
7238523.638068 | 0) | copy_from_read_buf() {
7238523.638069 | 0) | tty_audit_add_data() {
7238523.638070 | 0) 0.220 us | _raw_spin_lock_irqsave();
7238523.638071 | 0) 0.097 us | _raw_spin_unlock_irqrestore();
7238523.638071 | 0) 0.078 us | _raw_spin_lock_irqsave();
7238523.638072 | 0) 0.077 us | _raw_spin_unlock_irqrestore();
7238523.638072 | 0) 2.795 us | }
7238523.638073 | 0) 4.183 us | }
7238523.638073 | 0) 0.084 us | copy_from_read_buf();
7238523.638074 | 0) 0.078 us | n_tty_set_room();
7238523.638074 | 0) 0.082 us | n_tty_write_wakeup();
7238523.638075 | 0) | __wake_up() {
7238523.638075 | 0) 0.084 us | _raw_spin_lock_irqsave();
7238523.638076 | 0) | __wake_up_common() {
7238523.638076 | 0) 0.095 us | pollwake();
7238523.638077 | 0) 0.819 us | }
7238523.638077 | 0) 0.074 us | _raw_spin_unlock_irqrestore();
7238523.638078 | 0) 2.463 us | }
7238523.638078 | 0) 0.071 us | n_tty_set_room();
7238523.638078 | 0) 0.082 us | up_read();
7238523.638079 | 0) | remove_wait_queue() {
7238523.638079 | 0) 0.082 us | _raw_spin_lock_irqsave();
7238523.638080 | 0) 0.086 us | _raw_spin_unlock_irqrestore();
7238523.638080 | 0) 1.239 us | }
7238523.638081 | 0) 0.142 us | mutex_unlock();
7238523.638081 | 0) 19354047 us | } /* n_tty_read */
7238523.638082 | 0) | tty_ldisc_deref() {
7238523.638082 | 0) 0.064 us | ldsem_up_read();
7238523.638082 | 0) 0.554 us | }
7238523.638083 | 0) 0.074 us | get_seconds();
7238523.638083 | 0) 19354052 us | } /* tty_read */
7238523.638084 | 0) 0.352 us | __fsnotify_parent();
7238523.638085 | 0) 0.178 us | fsnotify();
7238523.638085 | 0) 19354058 us | } /* vfs_read */
7238523.638156 | 0) | vfs_read() {
7238523.638157 | 0) | rw_verify_area() {
7238523.638157 | 0) | security_file_permission() {
7238523.638158 | 0) | apparmor_file_permission() {
7238523.638158 | 0) 0.183 us | common_file_perm();
7238523.638159 | 0) 0.778 us | }
7238523.638159 | 0) 0.081 us | __fsnotify_parent();
7238523.638160 | 0) 0.104 us | fsnotify();
7238523.638160 | 0) 2.662 us | }
7238523.638161 | 0) 3.337 us | }
7238523.638161 | 0) | tty_read() {
7238523.638161 | 0) 0.067 us | tty_paranoia_check();
7238523.638162 | 0) | tty_ldisc_ref_wait() {
7238523.638162 | 0) 0.080 us | } /* ldsem_down_read */
7238523.638163 | 0) 0.637 us | }
7238523.638163 | 0) | n_tty_read() {
7238523.638164 | 0) 0.078 us | _raw_spin_lock_irq();
7238523.638164 | 0) 0.090 us | mutex_lock_interruptible();
7238523.638165 | 0) 0.078 us | down_read();
7238523.638165 | 0) | add_wait_queue() {
7238523.638166 | 0) 0.070 us | _raw_spin_lock_irqsave();
7238523.638166 | 0) 0.084 us | _raw_spin_unlock_irqrestore();
7238523.638167 | 0) 1.111 us | }
7238523.638167 | 0) 0.083 us | tty_hung_up_p();
7238523.638168 | 0) 0.080 us | n_tty_set_room();
7238523.638169 | 0) 0.068 us | up_read();
7238523.638169 | 0) | schedule_timeout() {
7238523.638170 | 0) | schedule() {
7238523.638170 | 0) | __schedule() {
7238523.638171 | 0) 0.078 us | rcu_note_context_switch();
7238523.638171 | 0) 0.081 us | _raw_spin_lock_irq();
7238523.638172 | 0) | deactivate_task() {
7238523.638172 | 0) | dequeue_task() {
7238523.638172 | 0) 0.181 us | update_rq_clock();
7238523.638173 | 0) | dequeue_task_fair() {
7238523.638174 | 0) | dequeue_entity() {
7238523.638174 | 0) | update_curr() {
7238523.638174 | 0) 0.257 us | cpuacct_charge();
7238523.638175 | 0) 0.982 us | }
7238523.638175 | 0) 0.079 us | update_cfs_rq_blocked_load();
7238523.638176 | 0) 0.080 us | clear_buddies();
7238523.638177 | 0) 0.096 us | account_entity_dequeue();
7238523.638177 | 0) | update_cfs_shares() {
7238523.638178 | 0) 0.113 us | update_curr();
7238523.638178 | 0) 0.087 us | account_entity_dequeue();
7238523.638179 | 0) 0.073 us | account_entity_enqueue();
7238523.638179 | 0) 1.948 us | }
7238523.638180 | 0) 5.913 us | }
7238523.638180 | 0) | dequeue_entity() {
7238523.638180 | 0) 0.086 us | update_curr();
7238523.638181 | 0) 0.079 us | update_cfs_rq_blocked_load();
7238523.638182 | 0) 0.076 us | clear_buddies();
7238523.638182 | 0) 0.076 us | account_entity_dequeue();
7238523.638183 | 0) 0.104 us | update_cfs_shares();
7238523.638183 | 0) 3.171 us | }
7238523.638184 | 0) 0.076 us | hrtick_update();
7238523.638184 | 0) 10.785 us | }
7238523.638184 | 0) 12.057 us | }
7238523.638185 | 0) 12.704 us | }
7238523.638185 | 0) | pick_next_task_fair() {
7238523.638185 | 0) 0.074 us | check_cfs_rq_runtime();
7238523.638186 | 0) | pick_next_entity() {
7238523.638186 | 0) 0.067 us | clear_buddies();
7238523.638187 | 0) 0.544 us | }
7238523.638187 | 0) | put_prev_entity() {
7238523.638187 | 0) 0.079 us | check_cfs_rq_runtime();
7238523.638188 | 0) 0.612 us | }
7238523.638188 | 0) | put_prev_entity() {
7238523.638188 | 0) 0.076 us | check_cfs_rq_runtime();
7238523.638189 | 0) 0.618 us | }
7238523.638189 | 0) | set_next_entity() {
7238523.638190 | 0) 0.078 us | update_stats_wait_end();
7238523.638190 | 0) 0.712 us | }
7238523.638190 | 0) 5.023 us | }
7238523.638191 | 0) 0.086 us | paravirt_start_context_switch();
7238523.638192 | 0) 0.070 us | xen_read_cr0();
7238523.638193 | 0) | xen_write_cr0() {
7238523.638193 | 0) 0.085 us | paravirt_get_lazy_mode();
7238523.638194 | 0) 0.085 us | __xen_mc_entry();
7238523.638194 | 0) 0.077 us | paravirt_get_lazy_mode();
7238523.638195 | 0) 1.822 us | }
7238523.638195 | 0) | xen_load_sp0() {
7238523.638195 | 0) 0.074 us | paravirt_get_lazy_mode();
7238523.638196 | 0) 0.085 us | __xen_mc_entry();
7238523.638196 | 0) 0.078 us | paravirt_get_lazy_mode();
7238523.638197 | 0) 1.754 us | }
7238523.638197 | 0) | xen_load_tls() {
7238523.638198 | 0) 0.069 us | paravirt_get_lazy_mode();
7238523.638198 | 0) 0.082 us | paravirt_get_lazy_mode();
7238523.638199 | 0) 0.127 us | load_TLS_descriptor();
7238523.638199 | 0) 0.080 us | load_TLS_descriptor();
7238523.638200 | 0) 0.094 us | load_TLS_descriptor();
7238523.638201 | 0) 0.081 us | paravirt_get_lazy_mode();
7238523.638202 | 0) 4.155 us | }
7238523.638202 | 0) | xen_end_context_switch() {
7238523.638202 | 0) 0.699 us | xen_mc_flush();
7238523.638204 | 0) 0.089 us | paravirt_end_context_switch();
7238523.638204 | 0) 1.915 us | }
7238523.797630 | 0) | finish_task_switch() {
7238523.797634 | 0) | xen_evtchn_do_upcall() {
7238523.797634 | 0) | irq_enter() {
7238523.797634 | 0) 0.134 us | rcu_irq_enter();
7238523.797635 | 0) 0.688 us | }
7238523.797635 | 0) 0.055 us | exit_idle();
7238523.797635 | 0) | __xen_evtchn_do_upcall() {
7238523.797636 | 0) | evtchn_2l_handle_events() {
7238523.797636 | 0) 0.048 us | irq_from_virq();
7238523.797636 | 0) | evtchn_from_irq() {
7238523.797636 | 0) | irq_get_irq_data() {
7238523.797637 | 0) 0.061 us | irq_to_desc();
7238523.797637 | 0) 0.564 us | }
7238523.797637 | 0) 0.954 us | }
7238523.797638 | 0) | get_evtchn_to_irq() {
7238523.797638 | 0) 0.057 us | evtchn_2l_max_channels();
7238523.797638 | 0) 0.409 us | }
7238523.797638 | 0) | generic_handle_irq() {
7238523.797638 | 0) 0.052 us | irq_to_desc();
7238523.797639 | 0) | handle_percpu_irq() {
7238523.797639 | 0) | ack_dynirq() {
7238523.797639 | 0) | evtchn_from_irq() {
7238523.797639 | 0) | irq_get_irq_data() {
7238523.797640 | 0) 0.057 us | irq_to_desc();
7238523.797640 | 0) 0.440 us | }
7238523.797640 | 0) 0.746 us | }
7238523.797640 | 0) 0.056 us | irq_move_irq();
7238523.797641 | 0) 0.058 us | evtchn_2l_clear_pending();
7238523.797641 | 0) 1.729 us | }
7238523.797641 | 0) | handle_irq_event_percpu() {
7238523.797641 | 0) | xen_irq_work_interrupt() {
7238523.797642 | 0) | irq_enter() {
7238523.797642 | 0) 0.053 us | rcu_irq_enter();
7238523.797642 | 0) 0.396 us | }
7238523.797642 | 0) | __wake_up() {
7238523.797643 | 0) 0.053 us | _raw_spin_lock_irqsave();
7238523.797643 | 0) | __wake_up_common() {
7238523.797643 | 0) | autoremove_wake_function() {
7238523.797644 | 0) | default_wake_function() {
7238523.797644 | 0) | try_to_wake_up() {
7238523.797644 | 0) 0.228 us | _raw_spin_lock_irqsave();
7238523.797645 | 0) 0.194 us | task_waking_fair();
7238523.797645 | 0) | select_task_rq_fair() {
7238523.797645 | 0) 0.051 us | source_load();
7238523.797646 | 0) 0.050 us | target_load();
7238523.797646 | 0) 0.067 us | idle_cpu();
7238523.797647 | 0) 0.050 us | cpus_share_cache();
7238523.797647 | 0) 0.068 us | idle_cpu();
7238523.797647 | 0) 1.983 us | }
7238523.797648 | 0) 0.051 us | _raw_spin_lock();
7238523.797648 | 0) | ttwu_do_activate.constprop.124() {
7238523.797648 | 0) | activate_task() {
7238523.797648 | 0) | enqueue_task() {
7238523.797648 | 0) 0.135 us | update_rq_clock();
7238523.797649 | 0) | enqueue_task_fair() {
7238523.797649 | 0) | enqueue_entity() {
7238523.797649 | 0) 0.059 us | update_curr();
7238523.797650 | 0) 0.073 us | __compute_runnable_contrib.part.51();
7238523.797650 | 0) 0.066 us | __update_entity_load_avg_contrib();
7238523.797650 | 0) 0.059 us | update_cfs_rq_blocked_load();
7238523.797651 | 0) 0.064 us | account_entity_enqueue();
7238523.797651 | 0) 0.137 us | update_cfs_shares();
7238523.797651 | 0) 0.054 us | place_entity();
7238523.797652 | 0) 0.074 us | __enqueue_entity();
7238523.797652 | 0) 3.085 us | }
7238523.797652 | 0) | enqueue_entity() {
7238523.797653 | 0) 0.058 us | update_curr();
7238523.797654 | 0) 0.049 us | update_cfs_rq_blocked_load();
7238523.797654 | 0) 0.057 us | account_entity_enqueue();
7238523.797655 | 0) 0.066 us | update_cfs_shares();
7238523.797655 | 0) 0.049 us | place_entity();
7238523.797655 | 0) 0.051 us | __enqueue_entity();
7238523.797656 | 0) 3.432 us | }
7238523.797656 | 0) 0.049 us | hrtick_update();
7238523.797657 | 0) 7.552 us | }
7238523.797657 | 0) 8.414 us | }
7238523.797657 | 0) 8.753 us | }
7238523.797657 | 0) | ttwu_do_wakeup() {
7238523.797657 | 0) | check_preempt_curr() {
7238523.797657 | 0) | resched_task() {
7238523.797658 | 0) | xen_smp_send_reschedule() {
7238523.797658 | 0) | xen_send_IPI_one() {
7238523.797658 | 0) | notify_remote_via_irq() {
7238523.797658 | 0) | evtchn_from_irq() {
7238523.797658 | 0) | irq_get_irq_data() {
7238523.797659 | 0) 0.069 us | irq_to_desc();
7238523.797659 | 0) 0.504 us | }
7238523.797659 | 0) 0.869 us | }
7238523.797660 | 0) 1.940 us | } /* notify_remote_via_irq */
7238523.797660 | 0) 2.319 us | }
7238523.797660 | 0) 2.712 us | }
7238523.797661 | 0) 3.147 us | }
7238523.797661 | 0) 3.625 us | }
7238523.797662 | 0) 4.525 us | }
7238523.797662 | 0) 13.961 us | }
7238523.797662 | 0) 0.069 us | _raw_spin_unlock();
7238523.797663 | 0) 0.168 us | ttwu_stat();
7238523.797663 | 0) 0.076 us | _raw_spin_unlock_irqrestore();
7238523.797664 | 0) 19.821 us | }
7238523.797664 | 0) 20.301 us | }
7238523.797664 | 0) 20.796 us | }
7238523.797664 | 0) 21.367 us | }
7238523.797665 | 0) 0.071 us | _raw_spin_unlock_irqrestore();
7238523.797665 | 0) 22.621 us | }
7238523.797666 | 0) | irq_exit() {
7238523.797666 | 0) 0.085 us | idle_cpu();
7238523.797666 | 0) 0.106 us | rcu_irq_exit();
7238523.797667 | 0) 1.220 us | }
7238523.797667 | 0) 25.712 us | }
7238523.797668 | 0) 0.138 us | add_interrupt_randomness();
7238523.797668 | 0) 0.092 us | note_interrupt();
7238523.797669 | 0) 27.713 us | }
7238523.797669 | 0) 30.163 us | }
7238523.797669 | 0) 31.017 us | }
7238523.797670 | 0) 33.953 us | }
7238523.797670 | 0) 34.384 us | }
7238523.797670 | 0) | irq_exit() {
7238523.797671 | 0) 0.079 us | idle_cpu();
7238523.797671 | 0) 0.072 us | rcu_irq_exit();
7238523.797672 | 0) 1.023 us | }
7238523.797672 | 0) 37.789 us | }
7238523.797672 | 0) 39.298 us | }
7238523.797673 | 0) 159502.1 us | }
7238523.797673 | 0) 159502.8 us | }
7238523.797673 | 0) 159503.5 us | }
7238523.797674 | 0) 0.112 us | down_read();
7238523.797675 | 0) | copy_from_read_buf() {
7238523.797676 | 0) | tty_audit_add_data() {
7238523.797676 | 0) 0.226 us | _raw_spin_lock_irqsave();
7238523.797677 | 0) 0.075 us | _raw_spin_unlock_irqrestore();
7238523.797677 | 0) 0.101 us | _raw_spin_lock_irqsave();
7238523.797678 | 0) 0.068 us | _raw_spin_unlock_irqrestore();
7238523.797679 | 0) 2.656 us | }
7238523.797679 | 0) 3.762 us | }
7238523.797679 | 0) 0.145 us | copy_from_read_buf();
7238523.797680 | 0) 0.068 us | n_tty_set_room();
7238523.797680 | 0) 0.058 us | n_tty_write_wakeup();
7238523.797681 | 0) | __wake_up() {
7238523.797682 | 0) 0.060 us | _raw_spin_lock_irqsave();
7238523.797682 | 0) | __wake_up_common() {
7238523.797683 | 0) 0.083 us | pollwake();
7238523.797683 | 0) 0.739 us | }
7238523.797683 | 0) 0.069 us | _raw_spin_unlock_irqrestore();
7238523.797684 | 0) 2.745 us | }
7238523.797684 | 0) 0.061 us | n_tty_set_room();
7238523.797685 | 0) 0.074 us | up_read();
7238523.797685 | 0) | remove_wait_queue() {
7238523.797685 | 0) 0.075 us | _raw_spin_lock_irqsave();
7238523.797686 | 0) 0.070 us | _raw_spin_unlock_irqrestore();
7238523.797686 | 0) 1.110 us | }
7238523.797687 | 0) 0.146 us | mutex_unlock();
7238523.797687 | 0) 159524.0 us | }
7238523.797688 | 0) | tty_ldisc_deref() {
7238523.797688 | 0) 0.070 us | ldsem_up_read();
7238523.797689 | 0) 0.739 us | }
7238523.797689 | 0) 0.066 us | get_seconds();
7238523.797690 | 0) 159528.3 us | }
7238523.797690 | 0) 0.298 us | __fsnotify_parent();
7238523.797691 | 0) 0.179 us | fsnotify();
7238523.797692 | 0) 159534.6 us | }
7238523.797762 | 0) | vfs_read() {
7238523.797763 | 0) | rw_verify_area() {
7238523.797763 | 0) | security_file_permission() {
7238523.797764 | 0) | apparmor_file_permission() {
7238523.797764 | 0) 0.165 us | common_file_perm();
7238523.797765 | 0) 0.732 us | }
7238523.797765 | 0) 0.081 us | __fsnotify_parent();
7238523.797766 | 0) 0.094 us | fsnotify();
7238523.797766 | 0) 2.711 us | }
7238523.797767 | 0) 3.386 us | }
7238523.797767 | 0) | tty_read() {
7238523.797767 | 0) 0.077 us | tty_paranoia_check();
7238523.797768 | 0) | tty_ldisc_ref_wait() {
7238523.797768 | 0) 0.083 us | ldsem_down_read();
7238523.797769 | 0) 0.686 us | }
7238523.797769 | 0) | n_tty_read() {
7238523.797770 | 0) 0.071 us | _raw_spin_lock_irq();
7238523.797770 | 0) 0.111 us | mutex_lock_interruptible();
7238523.797771 | 0) 0.072 us | down_read();
7238523.797771 | 0) | add_wait_queue() {
7238523.797772 | 0) 0.083 us | _raw_spin_lock_irqsave();
7238523.797772 | 0) 0.085 us | _raw_spin_unlock_irqrestore();
7238523.797773 | 0) 1.124 us | }
7238523.797773 | 0) 0.066 us | tty_hung_up_p();
7238523.797774 | 0) 0.090 us | n_tty_set_room();
7238523.797774 | 0) 0.064 us | up_read();
7238523.797775 | 0) | schedule_timeout() {
7238523.797775 | 0) | schedule() {
7238523.797775 | 0) | __schedule() {
7238523.797776 | 0) 0.083 us | rcu_note_context_switch();
7238523.797776 | 0) 0.078 us | _raw_spin_lock_irq();
7238523.797777 | 0) | deactivate_task() {
7238523.797777 | 0) | dequeue_task() {
7238523.797777 | 0) 0.191 us | update_rq_clock();
7238523.797778 | 0) | dequeue_task_fair() {
7238523.797778 | 0) | dequeue_entity() {
7238523.797779 | 0) | update_curr() {
7238523.797779 | 0) 0.179 us | cpuacct_charge();
7238523.797780 | 0) 0.902 us | }
7238523.797780 | 0) 0.070 us | __update_entity_load_avg_contrib();
7238523.797781 | 0) 0.152 us | update_cfs_rq_blocked_load();
7238523.797781 | 0) 0.073 us | clear_buddies();
7238523.797782 | 0) 0.074 us | account_entity_dequeue();
7238523.797783 | 0) | update_cfs_shares() {
7238523.797783 | 0) 0.111 us | update_curr();
7238523.797783 | 0) 0.082 us | account_entity_dequeue();
7238523.797784 | 0) 0.081 us | account_entity_enqueue();
7238523.797785 | 0) 2.330 us | }
7238523.797785 | 0) 6.633 us | } /* dequeue_entity */
7238523.797786 | 0) | dequeue_entity() {
7238523.797786 | 0) 0.078 us | update_curr();
7238523.797787 | 0) 0.086 us | update_cfs_rq_blocked_load();
7238523.797787 | 0) 0.076 us | clear_buddies();
7238523.797788 | 0) 0.079 us | account_entity_dequeue();
7238523.797789 | 0) 0.074 us | update_cfs_shares();
7238523.797789 | 0) 3.287 us | }
7238523.797789 | 0) 0.074 us | hrtick_update();
7238523.797790 | 0) 11.606 us | }
7238523.797790 | 0) 12.879 us | }
7238523.797790 | 0) 13.406 us | }
7238523.797791 | 0) | pick_next_task_fair() {
7238523.797791 | 0) 0.073 us | check_cfs_rq_runtime();
7238523.797792 | 0) | pick_next_entity() {
7238523.797792 | 0) 0.076 us | clear_buddies();
7238523.797793 | 0) 0.663 us | }
7238523.797793 | 0) | put_prev_entity() {
7238523.797793 | 0) 0.076 us | check_cfs_rq_runtime();
7238523.797794 | 0) 0.598 us | }
7238523.797794 | 0) | put_prev_entity() {
7238523.797794 | 0) 0.078 us | check_cfs_rq_runtime();
7238523.797795 | 0) 0.618 us | }
7238523.797795 | 0) | set_next_entity() {
7238523.797795 | 0) 0.096 us | update_stats_wait_end();
7238523.797796 | 0) 0.738 us | }
7238523.797796 | 0) 5.222 us | }
7238523.797797 | 0) 0.078 us | paravirt_start_context_switch();
7238523.797798 | 0) 0.071 us | xen_read_cr0();
7238523.797799 | 0) | xen_write_cr0() {
7238523.797799 | 0) 0.078 us | paravirt_get_lazy_mode();
7238523.797800 | 0) 0.084 us | __xen_mc_entry();
7238523.797800 | 0) 0.076 us | paravirt_get_lazy_mode();
7238523.797801 | 0) 1.798 us | }
7238523.797801 | 0) | xen_load_sp0() {
7238523.797801 | 0) 0.080 us | paravirt_get_lazy_mode();
7238523.797802 | 0) 0.076 us | __xen_mc_entry();
7238523.797802 | 0) 0.073 us | paravirt_get_lazy_mode();
7238523.797803 | 0) 1.623 us | }
7238523.797803 | 0) | xen_load_tls() {
7238523.797803 | 0) 0.082 us | paravirt_get_lazy_mode();
7238523.797804 | 0) 0.084 us | paravirt_get_lazy_mode();
7238523.797804 | 0) 0.136 us | load_TLS_descriptor();
7238523.797805 | 0) 0.072 us | load_TLS_descriptor();
7238523.797806 | 0) 0.080 us | load_TLS_descriptor();
7238523.797806 | 0) 0.088 us | paravirt_get_lazy_mode();
7238523.797807 | 0) 3.360 us | }
7238523.797807 | 0) | xen_end_context_switch() {
7238523.797807 | 0) 0.601 us | xen_mc_flush();
7238523.797808 | 0) 0.098 us | paravirt_end_context_switch();
7238523.797809 | 0) 1.902 us | }
7238524.005649 | 0) | finish_task_switch() {
7238524.005653 | 0) | xen_evtchn_do_upcall() {
7238524.005653 | 0) | irq_enter() {
7238524.005653 | 0) 0.138 us | rcu_irq_enter();
7238524.005654 | 0) 0.753 us | }
7238524.005654 | 0) 0.056 us | exit_idle();
7238524.005655 | 0) | __xen_evtchn_do_upcall() {
7238524.005655 | 0) | evtchn_2l_handle_events() {
7238524.005655 | 0) 0.057 us | irq_from_virq();
7238524.005656 | 0) | evtchn_from_irq() {
7238524.005656 | 0) | irq_get_irq_data() {
7238524.005656 | 0) 0.050 us | irq_to_desc();
7238524.005656 | 0) 0.499 us | }
7238524.005657 | 0) 0.958 us | }
7238524.005657 | 0) | get_evtchn_to_irq() {
7238524.005657 | 0) 0.057 us | evtchn_2l_max_channels();
7238524.005658 | 0) 0.400 us | }
7238524.005659 | 0) | generic_handle_irq() {
7238524.005659 | 0) 0.052 us | irq_to_desc();
7238524.005659 | 0) | handle_percpu_irq() {
7238524.005659 | 0) | ack_dynirq() {
7238524.005659 | 0) | evtchn_from_irq() {
7238524.005660 | 0) | irq_get_irq_data() {
7238524.005660 | 0) 0.056 us | irq_to_desc();
7238524.005660 | 0) 0.439 us | }
7238524.005660 | 0) 0.739 us | }
7238524.005661 | 0) 0.051 us | irq_move_irq();
7238524.005661 | 0) 0.051 us | evtchn_2l_clear_pending();
7238524.005661 | 0) 1.963 us | }
7238524.005662 | 0) | handle_irq_event_percpu() {
7238524.005662 | 0) | xen_irq_work_interrupt() {
7238524.005662 | 0) | irq_enter() {
7238524.005662 | 0) 0.053 us | rcu_irq_enter();
7238524.005663 | 0) 0.392 us | }
7238524.005663 | 0) | __wake_up() {
7238524.005663 | 0) 0.058 us | _raw_spin_lock_irqsave();
7238524.005664 | 0) | __wake_up_common() {
7238524.005664 | 0) | autoremove_wake_function() {
7238524.005664 | 0) | default_wake_function() {
7238524.005665 | 0) | try_to_wake_up() {
7238524.005665 | 0) 0.226 us | _raw_spin_lock_irqsave();
7238524.005665 | 0) 0.392 us | task_waking_fair();
7238524.005666 | 0) | select_task_rq_fair() {
7238524.005666 | 0) 0.067 us | source_load();
7238524.005667 | 0) 0.057 us | target_load();
7238524.005667 | 0) 0.065 us | idle_cpu();
7238524.005668 | 0) 0.050 us | cpus_share_cache();
7238524.005668 | 0) 0.080 us | idle_cpu();
7238524.005668 | 0) 2.053 us | }
7238524.005669 | 0) 0.051 us | _raw_spin_lock();
7238524.005669 | 0) | ttwu_do_activate.constprop.124() {
7238524.005669 | 0) | activate_task() {
7238524.005669 | 0) | enqueue_task() {
7238524.005669 | 0) 0.165 us | update_rq_clock();
7238524.005670 | 0) | enqueue_task_fair() {
7238524.005670 | 0) | enqueue_entity() {
7238524.005670 | 0) 0.065 us | update_curr();
7238524.005671 | 0) 0.078 us | __compute_runnable_contrib.part.51();
7238524.005671 | 0) 0.070 us | __update_entity_load_avg_contrib();
7238524.005671 | 0) 0.051 us | update_cfs_rq_blocked_load();
7238524.005672 | 0) 0.069 us | account_entity_enqueue();
7238524.005672 | 0) 0.132 us | update_cfs_shares();
7238524.005673 | 0) 0.054 us | place_entity();
7238524.005673 | 0) 0.081 us | __enqueue_entity();
7238524.005673 | 0) 3.111 us | }
7238524.005673 | 0) | enqueue_entity() {
7238524.005674 | 0) 0.059 us | update_curr();
7238524.005674 | 0) 0.057 us | update_cfs_rq_blocked_load();
7238524.005674 | 0) 0.067 us | account_entity_enqueue();
7238524.005675 | 0) 0.082 us | update_cfs_shares();
7238524.005675 | 0) 0.120 us | place_entity();
7238524.005675 | 0) 0.051 us | __enqueue_entity();
7238524.005676 | 0) 2.075 us | }
7238524.005676 | 0) 0.049 us | hrtick_update();
7238524.005676 | 0) 6.167 us | }
7238524.005676 | 0) 6.979 us | }
7238524.005676 | 0) 7.317 us | }
7238524.005677 | 0) | ttwu_do_wakeup() {
7238524.005677 | 0) | check_preempt_curr() {
7238524.005677 | 0) | resched_task() {
7238524.005677 | 0) | xen_smp_send_reschedule() {
7238524.005677 | 0) | xen_send_IPI_one() {
7238524.005678 | 0) | notify_remote_via_irq() {
7238524.005678 | 0) | evtchn_from_irq() {
7238524.005678 | 0) | irq_get_irq_data() {
7238524.005678 | 0) 0.051 us | irq_to_desc();
7238524.005679 | 0) 0.545 us | }
7238524.005679 | 0) 0.910 us | }
7238524.005680 | 0) 1.962 us | } /* notify_remote_via_irq */
7238524.005680 | 0) 2.332 us | }
7238524.005680 | 0) 2.684 us | }
7238524.005681 | 0) 3.606 us | }
7238524.005681 | 0) 4.064 us | }
7238524.005682 | 0) 5.129 us | }
7238524.005682 | 0) 13.194 us | }
7238524.005683 | 0) 0.066 us | _raw_spin_unlock();
7238524.005683 | 0) 0.165 us | ttwu_stat();
7238524.005684 | 0) 0.070 us | _raw_spin_unlock_irqrestore();
7238524.005684 | 0) 19.634 us | }
7238524.005685 | 0) 20.080 us | }
7238524.005685 | 0) 20.608 us | }
7238524.005685 | 0) 21.348 us | }
7238524.005685 | 0) 0.084 us | _raw_spin_unlock_irqrestore();
7238524.005686 | 0) 22.728 us | }
7238524.005686 | 0) | irq_exit() {
7238524.005687 | 0) 0.077 us | idle_cpu();
7238524.005687 | 0) 0.093 us | rcu_irq_exit();
7238524.005688 | 0) 1.101 us | }
7238524.005688 | 0) 25.644 us | }
7238524.005688 | 0) 0.138 us | add_interrupt_randomness();
7238524.005689 | 0) 0.083 us | note_interrupt();
7238524.005689 | 0) 27.672 us | }
7238524.005690 | 0) 30.410 us | }
7238524.005690 | 0) 31.458 us | }
7238524.005690 | 0) 35.276 us | }
7238524.005691 | 0) 35.797 us | }
7238524.005691 | 0) | irq_exit() {
7238524.005691 | 0) 0.066 us | idle_cpu();
7238524.005692 | 0) 0.080 us | rcu_irq_exit();
7238524.005692 | 0) 1.110 us | }
7238524.005693 | 0) 39.440 us | }
7238524.005693 | 0) 41.142 us | }
7238524.005694 | 0) 207918.1 us | }
7238524.005694 | 0) 207918.7 us | }
7238524.005694 | 0) 207919.4 us | }
7238524.005695 | 0) 0.068 us | down_read();
7238524.005696 | 0) | copy_from_read_buf() {
7238524.005697 | 0) | tty_audit_add_data() {
7238524.005697 | 0) 0.233 us | _raw_spin_lock_irqsave();
7238524.005698 | 0) 0.076 us | _raw_spin_unlock_irqrestore();
7238524.005699 | 0) 0.076 us | _raw_spin_lock_irqsave();
7238524.005699 | 0) 0.078 us | _raw_spin_unlock_irqrestore();
7238524.005700 | 0) 2.696 us | }
7238524.005700 | 0) 4.335 us | }
7238524.005701 | 0) 0.086 us | copy_from_read_buf();
7238524.005701 | 0) 0.074 us | n_tty_set_room();
7238524.005702 | 0) 0.085 us | n_tty_write_wakeup();
7238524.005702 | 0) | __wake_up() {
7238524.005703 | 0) 0.061 us | _raw_spin_lock_irqsave();
7238524.005703 | 0) | __wake_up_common() {
7238524.005703 | 0) 0.080 us | pollwake();
7238524.005704 | 0) 0.687 us | }
7238524.005704 | 0) 0.063 us | _raw_spin_unlock_irqrestore();
7238524.005705 | 0) 2.040 us | }
7238524.005705 | 0) 0.071 us | n_tty_set_room();
7238524.005706 | 0) 0.074 us | up_read();
7238524.005706 | 0) | remove_wait_queue() {
7238524.005706 | 0) 0.074 us | _raw_spin_lock_irqsave();
7238524.005707 | 0) 0.069 us | _raw_spin_unlock_irqrestore();
7238524.005707 | 0) 1.076 us | }
7238524.005708 | 0) 0.139 us | mutex_unlock();
7238524.005708 | 0) 207939.0 us | }
7238524.005709 | 0) | tty_ldisc_deref() {
7238524.005709 | 0) 0.077 us | ldsem_up_read();
7238524.005710 | 0) 0.702 us | }
7238524.005710 | 0) 0.068 us | get_seconds();
7238524.005711 | 0) 207943.4 us | }
7238524.005712 | 0) 0.301 us | __fsnotify_parent();
7238524.005713 | 0) 0.157 us | fsnotify();
7238524.005713 | 0) 207950.3 us | }
7238524.005783 | 0) | vfs_read() {
7238524.005784 | 0) | rw_verify_area() {
7238524.005784 | 0) | security_file_permission() {
7238524.005785 | 0) | apparmor_file_permission() {
7238524.005785 | 0) 0.164 us | common_file_perm();
7238524.005786 | 0) 0.790 us | }
7238524.005786 | 0) 0.080 us | __fsnotify_parent();
7238524.005787 | 0) 0.094 us | fsnotify();
7238524.005787 | 0) 2.683 us | }
7238524.005788 | 0) 3.313 us | }
7238524.005788 | 0) | tty_read() {
7238524.005788 | 0) 0.087 us | tty_paranoia_check();
7238524.005789 | 0) | tty_ldisc_ref_wait() {
7238524.005789 | 0) 0.080 us | ldsem_down_read();
7238524.005790 | 0) 0.683 us | }
7238524.005790 | 0) | n_tty_read() {
7238524.005791 | 0) 0.080 us | _raw_spin_lock_irq();
7238524.005791 | 0) 0.104 us | mutex_lock_interruptible();
7238524.005792 | 0) 0.070 us | down_read();
7238524.005792 | 0) | add_wait_queue() {
7238524.005793 | 0) 0.079 us | _raw_spin_lock_irqsave();
7238524.005793 | 0) 0.087 us | _raw_spin_unlock_irqrestore();
7238524.005794 | 0) 1.147 us | }
7238524.005794 | 0) 0.078 us | tty_hung_up_p();
7238524.005795 | 0) 0.071 us | n_tty_set_room();
7238524.005795 | 0) 0.077 us | up_read();
7238524.005796 | 0) | schedule_timeout() {
7238524.005796 | 0) | schedule() {
7238524.005796 | 0) | __schedule() {
7238524.005797 | 0) 0.087 us | rcu_note_context_switch();
7238524.005797 | 0) 0.075 us | _raw_spin_lock_irq();
7238524.005798 | 0) | deactivate_task() {
7238524.005798 | 0) | dequeue_task() {
7238524.005798 | 0) 0.177 us | update_rq_clock();
7238524.005799 | 0) | dequeue_task_fair() {
7238524.005799 | 0) | dequeue_entity() {
7238524.005800 | 0) | update_curr() {
7238524.005800 | 0) 0.334 us | cpuacct_charge();
7238524.005801 | 0) 1.199 us | }
7238524.005802 | 0) 0.081 us | __update_entity_load_avg_contrib();
7238524.005802 | 0) 0.064 us | update_cfs_rq_blocked_load();
7238524.005803 | 0) 0.076 us | clear_buddies();
7238524.005803 | 0) 0.079 us | account_entity_dequeue();
7238524.005804 | 0) | update_cfs_shares() {
7238524.005804 | 0) 0.108 us | update_curr();
7238524.005805 | 0) 0.083 us | account_entity_dequeue();
7238524.005805 | 0) 0.084 us | account_entity_enqueue();
7238524.005806 | 0) 1.869 us | }
7238524.005806 | 0) 6.530 us | } /* dequeue_entity */
7238524.005807 | 0) | dequeue_entity() {
7238524.005807 | 0) 0.104 us | update_curr();
7238524.005808 | 0) 0.115 us | __update_entity_load_avg_contrib();
7238524.005808 | 0) 0.069 us | update_cfs_rq_blocked_load();
7238524.005809 | 0) 0.066 us | clear_buddies();
7238524.005809 | 0) 0.086 us | account_entity_dequeue();
7238524.005810 | 0) 0.102 us | update_cfs_shares();
7238524.005811 | 0) 3.907 us | }
7238524.005811 | 0) 0.071 us | hrtick_update();
7238524.005812 | 0) 12.301 us | }
7238524.005812 | 0) 13.546 us | }
7238524.005812 | 0) 14.105 us | }
7238524.005812 | 0) | pick_next_task_fair() {
7238524.005813 | 0) 0.078 us | check_cfs_rq_runtime();
7238524.005813 | 0) | pick_next_entity() {
7238524.005814 | 0) 0.071 us | clear_buddies();
7238524.005815 | 0) 0.585 us | }
7238524.005815 | 0) | put_prev_entity() {
7238524.005815 | 0) 0.073 us | check_cfs_rq_runtime();
7238524.005816 | 0) 0.717 us | }
7238524.005816 | 0) | put_prev_entity() {
7238524.005817 | 0) 0.080 us | check_cfs_rq_runtime();
7238524.005817 | 0) 0.687 us | }
7238524.005817 | 0) | set_next_entity() {
7238524.005818 | 0) 0.091 us | update_stats_wait_end();
7238524.005818 | 0) 0.786 us | }
7238524.005819 | 0) 6.135 us | }
7238524.005820 | 0) 0.091 us | paravirt_start_context_switch();
7238524.005821 | 0) 0.089 us | xen_read_cr0();
7238524.005821 | 0) | xen_write_cr0() {
7238524.005821 | 0) 0.078 us | paravirt_get_lazy_mode();
7238524.005822 | 0) 0.083 us | __xen_mc_entry();
7238524.005823 | 0) 0.074 us | paravirt_get_lazy_mode();
7238524.005823 | 0) 1.657 us | }
7238524.005823 | 0) | xen_load_sp0() {
7238524.005824 | 0) 0.074 us | paravirt_get_lazy_mode();
7238524.005824 | 0) 0.083 us | __xen_mc_entry();
7238524.005825 | 0) 0.087 us | paravirt_get_lazy_mode();
7238524.005825 | 0) 1.764 us | }
7238524.005826 | 0) | xen_load_tls() {
7238524.005826 | 0) 0.077 us | paravirt_get_lazy_mode();
7238524.005826 | 0) 0.084 us | paravirt_get_lazy_mode();
7238524.005827 | 0) 0.150 us | load_TLS_descriptor();
7238524.005828 | 0) 0.082 us | load_TLS_descriptor();
7238524.005828 | 0) 0.084 us | load_TLS_descriptor();
7238524.005829 | 0) 0.076 us | paravirt_get_lazy_mode();
7238524.005829 | 0) 3.388 us | }
7238524.005829 | 0) | xen_end_context_switch() {
7238524.005830 | 0) 0.731 us | xen_mc_flush();
7238524.005831 | 0) 0.093 us | paravirt_end_context_switch();
7238524.005831 | 0) 1.836 us | }
7238524.141853 | 0) | finish_task_switch() {
7238524.141857 | 0) | xen_evtchn_do_upcall() {
7238524.141858 | 0) | irq_enter() {
7238524.141858 | 0) 0.133 us | rcu_irq_enter();
7238524.141859 | 0) 0.766 us | }
7238524.141859 | 0) 0.056 us | exit_idle();
7238524.141859 | 0) | __xen_evtchn_do_upcall() {
7238524.141859 | 0) | evtchn_2l_handle_events() {
7238524.141860 | 0) 0.049 us | irq_from_virq();
7238524.141860 | 0) | evtchn_from_irq() {
7238524.141860 | 0) | irq_get_irq_data() {
7238524.141860 | 0) 0.058 us | irq_to_desc();
7238524.141861 | 0) 0.498 us | }
7238524.141861 | 0) 0.897 us | }
7238524.141861 | 0) | get_evtchn_to_irq() {
7238524.141861 | 0) 0.049 us | evtchn_2l_max_channels();
7238524.141862 | 0) 0.392 us | }
7238524.141862 | 0) | generic_handle_irq() {
7238524.141862 | 0) 0.061 us | irq_to_desc();
7238524.141862 | 0) | handle_percpu_irq() {
7238524.141863 | 0) | ack_dynirq() {
7238524.141863 | 0) | evtchn_from_irq() {
7238524.141863 | 0) | irq_get_irq_data() {
7238524.141863 | 0) 0.051 us | irq_to_desc();
7238524.141863 | 0) 0.439 us | }
7238524.141864 | 0) 0.745 us | }
7238524.141864 | 0) 0.049 us | irq_move_irq();
7238524.141864 | 0) 0.060 us | evtchn_2l_clear_pending();
7238524.141864 | 0) 1.714 us | }
7238524.141865 | 0) | handle_irq_event_percpu() {
7238524.141865 | 0) | xen_irq_work_interrupt() {
7238524.141865 | 0) | irq_enter() {
7238524.141865 | 0) 0.053 us | rcu_irq_enter();
7238524.141866 | 0) 0.371 us | }
7238524.141866 | 0) | __wake_up() {
7238524.141866 | 0) 0.051 us | _raw_spin_lock_irqsave();
7238524.141867 | 0) | __wake_up_common() {
7238524.141867 | 0) | autoremove_wake_function() {
7238524.141867 | 0) | default_wake_function() {
7238524.141867 | 0) | try_to_wake_up() {
7238524.141868 | 0) 0.213 us | _raw_spin_lock_irqsave();
7238524.141869 | 0) 0.196 us | task_waking_fair();
7238524.141870 | 0) | select_task_rq_fair() {
7238524.141870 | 0) 0.051 us | source_load();
7238524.141870 | 0) 0.049 us | target_load();
7238524.141871 | 0) 0.065 us | idle_cpu();
7238524.141871 | 0) 0.051 us | cpus_share_cache();
7238524.141872 | 0) 0.078 us | idle_cpu();
7238524.141872 | 0) 2.427 us | }
7238524.141873 | 0) 0.050 us | _raw_spin_lock();
7238524.141873 | 0) | ttwu_do_activate.constprop.124() {
7238524.141873 | 0) | activate_task() {
7238524.141873 | 0) | enqueue_task() {
7238524.141873 | 0) 0.170 us | update_rq_clock();
7238524.141874 | 0) | enqueue_task_fair() {
7238524.141874 | 0) | enqueue_entity() {
7238524.141874 | 0) 0.058 us | update_curr();
7238524.141875 | 0) 0.076 us | __compute_runnable_contrib.part.51();
7238524.141875 | 0) 0.059 us | __update_entity_load_avg_contrib();
7238524.141875 | 0) 0.060 us | update_cfs_rq_blocked_load();
7238524.141876 | 0) 0.064 us | account_entity_enqueue();
7238524.141876 | 0) 0.123 us | update_cfs_shares();
7238524.141876 | 0) 0.055 us | place_entity();
7238524.141877 | 0) 0.078 us | __enqueue_entity();
7238524.141877 | 0) 3.039 us | }
7238524.141877 | 0) | enqueue_entity() {
7238524.141877 | 0) 0.065 us | update_curr();
7238524.141878 | 0) 0.050 us | update_cfs_rq_blocked_load();
7238524.141878 | 0) 0.049 us | account_entity_enqueue();
7238524.141878 | 0) 0.081 us | update_cfs_shares();
7238524.141879 | 0) 0.049 us | place_entity();
7238524.141879 | 0) 0.051 us | __enqueue_entity();
7238524.141879 | 0) 2.021 us | }
7238524.141880 | 0) 0.049 us | hrtick_update();
7238524.141880 | 0) 6.040 us | }
7238524.141880 | 0) 6.874 us | }
7238524.141880 | 0) 7.212 us | }
7238524.141880 | 0) | ttwu_do_wakeup() {
7238524.141881 | 0) | check_preempt_curr() {
7238524.141881 | 0) | resched_task() {
7238524.141881 | 0) | xen_smp_send_reschedule() {
7238524.141881 | 0) | xen_send_IPI_one() {
7238524.141881 | 0) | notify_remote_via_irq() {
7238524.141881 | 0) | evtchn_from_irq() {
7238524.141882 | 0) | irq_get_irq_data() {
7238524.141882 | 0) 0.049 us | irq_to_desc();
7238524.141882 | 0) 0.497 us | }
7238524.141882 | 0) 0.860 us | }
7238524.141883 | 0) 1.882 us | } /* notify_remote_via_irq */
7238524.141884 | 0) 2.257 us | }
7238524.141884 | 0) 2.619 us | }
7238524.141884 | 0) 3.079 us | }
7238524.141884 | 0) 3.526 us | }
7238524.141885 | 0) 4.485 us | }
7238524.141885 | 0) 12.333 us | }
7238524.141885 | 0) 0.062 us | _raw_spin_unlock();
7238524.141886 | 0) 0.169 us | ttwu_stat();
7238524.141887 | 0) 0.076 us | _raw_spin_unlock_irqrestore();
7238524.141887 | 0) 19.434 us | }
7238524.141887 | 0) 19.909 us | }
7238524.141888 | 0) 20.377 us | }
7238524.141888 | 0) 21.020 us | }
7238524.141888 | 0) 0.075 us | _raw_spin_unlock_irqrestore();
7238524.141888 | 0) 22.268 us | }
7238524.141889 | 0) | irq_exit() {
7238524.141889 | 0) 0.087 us | idle_cpu();
7238524.141889 | 0) 0.101 us | rcu_irq_exit();
7238524.141890 | 0) 1.127 us | }
7238524.141890 | 0) 25.163 us | }
7238524.141891 | 0) 0.133 us | add_interrupt_randomness();
7238524.141892 | 0) 0.083 us | note_interrupt();
7238524.141892 | 0) 27.453 us | }
7238524.141893 | 0) 30.024 us | }
7238524.141893 | 0) 30.898 us | }
7238524.141893 | 0) 33.683 us | }
7238524.141893 | 0) 34.097 us | }
7238524.141894 | 0) | irq_exit() {
7238524.141894 | 0) 0.065 us | idle_cpu();
7238524.141895 | 0) 0.076 us | rcu_irq_exit();
7238524.141895 | 0) 1.135 us | }
7238524.141895 | 0) 37.746 us | }
7238524.141896 | 0) 39.634 us | }
7238524.141897 | 0) 136100.0 us | }
7238524.141897 | 0) 136100.6 us | }
7238524.141897 | 0) 136101.4 us | }
7238524.141898 | 0) 0.093 us | down_read();
7238524.141899 | 0) | copy_from_read_buf() {
7238524.141900 | 0) | tty_audit_add_data() {
7238524.141900 | 0) 0.238 us | _raw_spin_lock_irqsave();
7238524.141901 | 0) 0.069 us | _raw_spin_unlock_irqrestore();
7238524.141901 | 0) 0.090 us | _raw_spin_lock_irqsave();
7238524.141902 | 0) 0.077 us | _raw_spin_unlock_irqrestore();
7238524.141902 | 0) 2.513 us | }
7238524.141903 | 0) 3.632 us | }
7238524.141903 | 0) 0.085 us | copy_from_read_buf();
7238524.141904 | 0) 0.066 us | n_tty_set_room();
7238524.141905 | 0) 0.067 us | n_tty_write_wakeup();
7238524.141905 | 0) | __wake_up() {
7238524.141906 | 0) 0.070 us | _raw_spin_lock_irqsave();
7238524.141906 | 0) | __wake_up_common() {
7238524.141906 | 0) 0.086 us | pollwake();
7238524.141907 | 0) 0.620 us | }
7238524.141907 | 0) 0.064 us | _raw_spin_unlock_irqrestore();
7238524.141907 | 0) 1.980 us | }
7238524.141908 | 0) 0.059 us | n_tty_set_room();
7238524.141909 | 0) 0.071 us | up_read();
7238524.141909 | 0) | remove_wait_queue() {
7238524.141909 | 0) 0.079 us | _raw_spin_lock_irqsave();
7238524.141910 | 0) 0.082 us | _raw_spin_unlock_irqrestore();
7238524.141910 | 0) 1.164 us | }
7238524.141910 | 0) 0.142 us | mutex_unlock();
7238524.141911 | 0) 136120.9 us | }
7238524.141911 | 0) | tty_ldisc_deref() {
7238524.141912 | 0) 0.062 us | ldsem_up_read();
7238524.141912 | 0) 0.593 us | }
7238524.141912 | 0) 0.079 us | get_seconds();
7238524.141913 | 0) 136125.1 us | }
7238524.141914 | 0) 0.280 us | __fsnotify_parent();
7238524.141915 | 0) 0.187 us | fsnotify();
7238524.141915 | 0) 136131.2 us | }
7238524.141988 | 0) | vfs_read() {
7238524.141989 | 0) | rw_verify_area() {
7238524.141989 | 0) | security_file_permission() {
7238524.141989 | 0) | apparmor_file_permission() {
7238524.141990 | 0) 0.149 us | common_file_perm();
7238524.141990 | 0) 0.774 us | }
7238524.141991 | 0) 0.079 us | __fsnotify_parent();
7238524.141991 | 0) 0.095 us | fsnotify();
7238524.141992 | 0) 2.558 us | }
7238524.141992 | 0) 3.300 us | }
7238524.141993 | 0) | tty_read() {
7238524.141993 | 0) 0.076 us | tty_paranoia_check();
7238524.141994 | 0) | tty_ldisc_ref_wait() {
7238524.141994 | 0) 0.081 us | ldsem_down_read();
7238524.141995 | 0) 0.689 us | }
7238524.141995 | 0) | n_tty_read() {
7238524.141995 | 0) 0.073 us | _raw_spin_lock_irq();
7238524.141996 | 0) 0.110 us | mutex_lock_interruptible();
7238524.141997 | 0) 0.069 us | down_read();
7238524.141998 | 0) | add_wait_queue() {
7238524.141998 | 0) 0.079 us | _raw_spin_lock_irqsave();
7238524.141999 | 0) 0.078 us | _raw_spin_unlock_irqrestore();
7238524.141999 | 0) 1.201 us | }
7238524.142000 | 0) 0.067 us | tty_hung_up_p();
7238524.142000 | 0) 0.078 us | n_tty_set_room();
7238524.142001 | 0) 0.079 us | up_read();
7238524.142001 | 0) | schedule_timeout() {
7238524.142002 | 0) | schedule() {
7238524.142002 | 0) | __schedule() {
7238524.142002 | 0) 0.076 us | rcu_note_context_switch();
7238524.142003 | 0) 0.080 us | _raw_spin_lock_irq();
7238524.142004 | 0) | deactivate_task() {
7238524.142004 | 0) | dequeue_task() {
7238524.142004 | 0) 0.178 us | update_rq_clock();
7238524.142005 | 0) | dequeue_task_fair() {
7238524.142005 | 0) | dequeue_entity() {
7238524.142005 | 0) | update_curr() {
7238524.142006 | 0) 0.263 us | cpuacct_charge();
7238524.142007 | 0) 0.965 us | }
7238524.142007 | 0) 0.075 us | update_cfs_rq_blocked_load();
7238524.142008 | 0) 0.065 us | clear_buddies();
7238524.142008 | 0) 0.084 us | account_entity_dequeue();
7238524.142009 | 0) | update_cfs_shares() {
7238524.142009 | 0) 0.115 us | update_curr();
7238524.142010 | 0) 0.084 us | account_entity_dequeue();
7238524.142010 | 0) 0.068 us | account_entity_enqueue();
7238524.142011 | 0) 1.754 us | }
7238524.142011 | 0) 5.580 us | }
7238524.142012 | 0) | dequeue_entity() {
7238524.142012 | 0) 0.089 us | update_curr();
7238524.142012 | 0) 0.101 us | update_cfs_rq_blocked_load();
7238524.142013 | 0) 0.076 us | clear_buddies();
7238524.142013 | 0) 0.078 us | account_entity_dequeue();
7238524.142014 | 0) 0.076 us | update_cfs_shares();
7238524.142015 | 0) 3.071 us | }
7238524.142015 | 0) 0.078 us | hrtick_update();
7238524.142016 | 0) 10.525 us | }
7238524.142016 | 0) 11.803 us | }
7238524.142016 | 0) 12.447 us | }
7238524.142017 | 0) | pick_next_task_fair() {
7238524.142017 | 0) 0.069 us | check_cfs_rq_runtime();
7238524.142017 | 0) | pick_next_entity() {
7238524.142018 | 0) 0.061 us | clear_buddies();
7238524.142018 | 0) 0.601 us | }
7238524.142019 | 0) | put_prev_entity() {
7238524.142019 | 0) 0.069 us | check_cfs_rq_runtime();
7238524.142019 | 0) 0.605 us | }
7238524.142020 | 0) | put_prev_entity() {
7238524.142020 | 0) 0.076 us | check_cfs_rq_runtime();
7238524.142020 | 0) 0.609 us | }
7238524.142021 | 0) | set_next_entity() {
7238524.142021 | 0) 0.088 us | update_stats_wait_end();
7238524.142022 | 0) 0.768 us | }
7238524.142022 | 0) 5.183 us | }
7238524.142023 | 0) 0.080 us | paravirt_start_context_switch();
7238524.142024 | 0) 0.076 us | xen_read_cr0();
7238524.142024 | 0) | xen_write_cr0() {
7238524.142025 | 0) 0.088 us | paravirt_get_lazy_mode();
7238524.142025 | 0) 0.096 us | __xen_mc_entry();
7238524.142026 | 0) 0.083 us | paravirt_get_lazy_mode();
7238524.142026 | 0) 1.802 us | }
7238524.142026 | 0) | xen_load_sp0() {
7238524.142027 | 0) 0.074 us | paravirt_get_lazy_mode();
7238524.142027 | 0) 0.098 us | __xen_mc_entry();
7238524.142028 | 0) 0.073 us | paravirt_get_lazy_mode();
7238524.142029 | 0) 2.289 us | }
7238524.142029 | 0) | xen_load_tls() {
7238524.142029 | 0) 0.073 us | paravirt_get_lazy_mode();
7238524.142030 | 0) 0.079 us | paravirt_get_lazy_mode();
7238524.142031 | 0) 0.135 us | load_TLS_descriptor();
7238524.142031 | 0) 0.082 us | load_TLS_descriptor();
7238524.142032 | 0) 0.091 us | load_TLS_descriptor();
7238524.142032 | 0) 0.081 us | paravirt_get_lazy_mode();
7238524.142033 | 0) 3.306 us | }
7238524.142033 | 0) | xen_end_context_switch() {
7238524.142033 | 0) 0.697 us | xen_mc_flush();
7238524.142034 | 0) 0.083 us | paravirt_end_context_switch();
7238524.142035 | 0) 1.876 us | }
7238524.269404 | 0) | finish_task_switch() {
7238524.269408 | 0) | xen_evtchn_do_upcall() {
7238524.269408 | 0) | irq_enter() {
7238524.269408 | 0) 0.132 us | rcu_irq_enter();
7238524.269409 | 0) 0.948 us | }
7238524.269409 | 0) 0.063 us | exit_idle();
7238524.269410 | 0) | __xen_evtchn_do_upcall() {
7238524.269410 | 0) | evtchn_2l_handle_events() {
7238524.269410 | 0) 0.057 us | irq_from_virq();
7238524.269411 | 0) | evtchn_from_irq() {
7238524.269411 | 0) | irq_get_irq_data() {
7238524.269411 | 0) 0.058 us | irq_to_desc();
7238524.269412 | 0) 0.579 us | }
7238524.269412 | 0) 0.898 us | }
7238524.269412 | 0) | get_evtchn_to_irq() {
7238524.269412 | 0) 0.049 us | evtchn_2l_max_channels();
7238524.269412 | 0) 0.390 us | }
7238524.269413 | 0) | generic_handle_irq() {
7238524.269413 | 0) 0.051 us | irq_to_desc();
7238524.269413 | 0) | handle_percpu_irq() {
7238524.269413 | 0) | ack_dynirq() {
7238524.269413 | 0) | evtchn_from_irq() {
7238524.269414 | 0) | irq_get_irq_data() {
7238524.269414 | 0) 0.057 us | irq_to_desc();
7238524.269414 | 0) 0.446 us | }
7238524.269414 | 0) 0.754 us | }
7238524.269414 | 0) 0.057 us | irq_move_irq();
7238524.269415 | 0) 0.057 us | evtchn_2l_clear_pending();
7238524.269415 | 0) 1.718 us | }
7238524.269415 | 0) | handle_irq_event_percpu() {
7238524.269416 | 0) | xen_irq_work_interrupt() {
7238524.269416 | 0) | irq_enter() {
7238524.269416 | 0) 0.059 us | rcu_irq_enter();
7238524.269416 | 0) 0.380 us | }
7238524.269417 | 0) | __wake_up() {
7238524.269417 | 0) 0.051 us | _raw_spin_lock_irqsave();
7238524.269417 | 0) | __wake_up_common() {
7238524.269417 | 0) | autoremove_wake_function() {
7238524.269418 | 0) | default_wake_function() {
7238524.269418 | 0) | try_to_wake_up() {
7238524.269418 | 0) 0.230 us | _raw_spin_lock_irqsave();
7238524.269419 | 0) 0.197 us | task_waking_fair();
7238524.269419 | 0) | select_task_rq_fair() {
7238524.269419 | 0) 0.050 us | source_load();
7238524.269420 | 0) 0.057 us | target_load();
7238524.269420 | 0) 0.065 us | idle_cpu();
7238524.269421 | 0) 0.055 us | cpus_share_cache();
7238524.269421 | 0) 0.076 us | idle_cpu();
7238524.269421 | 0) 2.041 us | }
7238524.269422 | 0) 0.050 us | _raw_spin_lock();
7238524.269422 | 0) | ttwu_do_activate.constprop.124() {
7238524.269422 | 0) | activate_task() {
7238524.269422 | 0) | enqueue_task() {
7238524.269422 | 0) 0.175 us | update_rq_clock();
7238524.269423 | 0) | enqueue_task_fair() {
7238524.269423 | 0) | enqueue_entity() {
7238524.269423 | 0) 0.065 us | update_curr();
7238524.269424 | 0) 0.070 us | __compute_runnable_contrib.part.51();
7238524.269424 | 0) 0.052 us | __update_entity_load_avg_contrib();
7238524.269424 | 0) 0.050 us | update_cfs_rq_blocked_load();
7238524.269425 | 0) 0.059 us | account_entity_enqueue();
7238524.269426 | 0) 0.134 us | update_cfs_shares();
7238524.269426 | 0) 0.055 us | place_entity();
7238524.269427 | 0) 0.083 us | __enqueue_entity();
7238524.269427 | 0) 4.026 us | }
7238524.269427 | 0) | enqueue_entity() {
7238524.269428 | 0) 0.065 us | update_curr();
7238524.269428 | 0) 0.051 us | update_cfs_rq_blocked_load();
7238524.269428 | 0) 0.058 us | account_entity_enqueue();
7238524.269429 | 0) 0.082 us | update_cfs_shares();
7238524.269429 | 0) 0.105 us | place_entity();
7238524.269429 | 0) 0.049 us | __enqueue_entity();
7238524.269430 | 0) 2.247 us | }
7238524.269430 | 0) 0.050 us | hrtick_update();
7238524.269430 | 0) 7.310 us | }
7238524.269430 | 0) 8.101 us | }
7238524.269431 | 0) 8.449 us | }
7238524.269431 | 0) | ttwu_do_wakeup() {
7238524.269431 | 0) | check_preempt_curr() {
7238524.269431 | 0) | resched_task() {
7238524.269431 | 0) | xen_smp_send_reschedule() {
7238524.269432 | 0) | xen_send_IPI_one() {
7238524.269432 | 0) | notify_remote_via_irq() {
7238524.269432 | 0) | evtchn_from_irq() {
7238524.269432 | 0) | irq_get_irq_data() {
7238524.269432 | 0) 0.051 us | irq_to_desc();
7238524.269433 | 0) 0.493 us | }
7238524.269433 | 0) 0.857 us | }
7238524.269434 | 0) 1.909 us | } /* notify_remote_via_irq */
7238524.269434 | 0) 2.288 us | }
7238524.269434 | 0) 2.655 us | }
7238524.269434 | 0) 3.127 us | }
7238524.269435 | 0) 3.590 us | }
7238524.269435 | 0) 4.506 us | }
7238524.269436 | 0) 13.594 us | }
7238524.269436 | 0) 0.070 us | _raw_spin_unlock();
7238524.269436 | 0) 0.163 us | ttwu_stat();
7238524.269437 | 0) 0.080 us | _raw_spin_unlock_irqrestore();
7238524.269438 | 0) 19.508 us | }
7238524.269438 | 0) 19.991 us | }
7238524.269438 | 0) 20.486 us | }
7238524.269438 | 0) 21.024 us | }
7238524.269438 | 0) 0.076 us | _raw_spin_unlock_irqrestore();
7238524.269439 | 0) 22.247 us | }
7238524.269439 | 0) | irq_exit() {
7238524.269439 | 0) 0.101 us | idle_cpu();
7238524.269440 | 0) 0.099 us | rcu_irq_exit();
7238524.269441 | 0) 1.207 us | }
7238524.269441 | 0) 25.035 us | }
7238524.269441 | 0) 0.131 us | add_interrupt_randomness();
7238524.269442 | 0) 0.076 us | note_interrupt();
7238524.269442 | 0) 26.909 us | }
7238524.269443 | 0) 29.377 us | }
7238524.269443 | 0) 30.139 us | }
7238524.269443 | 0) 32.759 us | }
7238524.269443 | 0) 33.204 us | }
7238524.269444 | 0) | irq_exit() {
7238524.269444 | 0) | __do_softirq() {
7238524.269444 | 0) 0.068 us | msecs_to_jiffies();
7238524.269445 | 0) | rcu_process_callbacks() {
7238524.269445 | 0) 0.070 us | note_gp_changes();
7238524.269445 | 0) 0.064 us | _raw_spin_lock_irqsave();
7238524.269446 | 0) 0.135 us | rcu_accelerate_cbs();
7238524.269447 | 0) | rcu_report_qs_rnp() {
7238524.269447 | 0) 0.061 us | _raw_spin_unlock_irqrestore();
7238524.269448 | 0) 0.779 us | }
7238524.269448 | 0) 0.081 us | cpu_needs_another_gp();
7238524.269449 | 0) | file_free_rcu() {
7238524.269449 | 0) 0.291 us | kmem_cache_free();
7238524.269450 | 0) 1.139 us | }
7238524.269451 | 0) | put_cred_rcu() {
7238524.269451 | 0) | security_cred_free() {
7238524.269452 | 0) | apparmor_cred_free() {
7238524.269453 | 0) | aa_free_task_context() {
7238524.269453 | 0) | kzfree() {
7238524.269454 | 0) 0.380 us | ksize();
7238524.269455 | 0) 0.147 us | kfree();
7238524.269455 | 0) 1.602 us | }
7238524.269455 | 0) 2.631 us | }
7238524.269456 | 0) 3.611 us | } /* apparmor_cred_free */
7238524.269456 | 0) 4.927 us | }
7238524.269457 | 0) 0.071 us | key_put();
7238524.269457 | 0) 0.071 us | key_put();
7238524.269458 | 0) 0.065 us | key_put();
7238524.269458 | 0) 0.066 us | key_put();
7238524.269459 | 0) 0.390 us | free_uid();
7238524.269460 | 0) 0.178 us | kmem_cache_free();
7238524.269460 | 0) 9.429 us | }
7238524.269461 | 0) 0.099 us | note_gp_changes();
7238524.269461 | 0) 0.080 us | cpu_needs_another_gp();
7238524.269462 | 0) 16.796 us | }
7238524.269462 | 0) 0.068 us | rcu_bh_qs();
7238524.269462 | 0) 0.066 us | __local_bh_enable();
7238524.269463 | 0) 18.770 us | }
7238524.269463 | 0) 0.073 us | idle_cpu();
7238524.269464 | 0) 0.088 us | rcu_irq_exit();
7238524.269464 | 0) 20.487 us | }
7238524.269465 | 0) 56.365 us | }
7238524.269465 | 0) 58.028 us | }
7238524.269466 | 0) 127463.5 us | }
7238524.269466 | 0) 127464.2 us | }
7238524.269467 | 0) 127465.0 us | }
7238524.269467 | 0) 0.095 us | down_read();
7238524.269468 | 0) | copy_from_read_buf() {
7238524.269469 | 0) | tty_audit_add_data() {
7238524.269469 | 0) 0.228 us | _raw_spin_lock_irqsave();
7238524.269470 | 0) 0.070 us | _raw_spin_unlock_irqrestore();
7238524.269471 | 0) 0.074 us | _raw_spin_lock_irqsave();
7238524.269471 | 0) 0.079 us | _raw_spin_unlock_irqrestore();
7238524.269472 | 0) 2.616 us | }
7238524.269472 | 0) 3.878 us | }
7238524.269473 | 0) 0.104 us | copy_from_read_buf();
7238524.269473 | 0) 0.074 us | n_tty_set_room();
7238524.269474 | 0) 0.067 us | n_tty_write_wakeup();
7238524.269474 | 0) | __wake_up() {
7238524.269475 | 0) 0.077 us | _raw_spin_lock_irqsave();
7238524.269475 | 0) | __wake_up_common() {
7238524.269476 | 0) 0.095 us | pollwake();
7238524.269476 | 0) 0.694 us | }
7238524.269476 | 0) 0.064 us | _raw_spin_unlock_irqrestore();
7238524.269477 | 0) 2.128 us | }
7238524.269477 | 0) 0.062 us | n_tty_set_room();
7238524.269477 | 0) 0.066 us | up_read();
7238524.269478 | 0) | remove_wait_queue() {
7238524.269478 | 0) 0.080 us | _raw_spin_lock_irqsave();
7238524.269479 | 0) 0.081 us | _raw_spin_unlock_irqrestore();
7238524.269480 | 0) 1.225 us | }
7238524.269480 | 0) 0.152 us | mutex_unlock();
7238524.269480 | 0) 127485.3 us | }
7238524.269481 | 0) | tty_ldisc_deref() {
7238524.269481 | 0) 0.081 us | ldsem_up_read();
7238524.269482 | 0) 0.655 us | }
7238524.269482 | 0) 0.089 us | get_seconds();
7238524.269483 | 0) 127490.1 us | }
7238524.269484 | 0) 0.287 us | __fsnotify_parent();
7238524.269484 | 0) 0.183 us | fsnotify();
7238524.269485 | 0) 127496.2 us | }
7238524.269559 | 0) | vfs_read() {
7238524.269559 | 0) | rw_verify_area() {
7238524.269560 | 0) | security_file_permission() {
7238524.269560 | 0) | apparmor_file_permission() {
7238524.269561 | 0) 0.164 us | common_file_perm();
7238524.269561 | 0) 0.831 us | }
7238524.269562 | 0) 0.078 us | __fsnotify_parent();
7238524.269562 | 0) 0.080 us | fsnotify();
7238524.269563 | 0) 2.765 us | }
7238524.269563 | 0) 3.490 us | }
7238524.269564 | 0) | tty_read() {
7238524.269564 | 0) 0.066 us | tty_paranoia_check();
7238524.269564 | 0) | tty_ldisc_ref_wait() {
7238524.269565 | 0) 0.085 us | ldsem_down_read();
7238524.269565 | 0) 0.656 us | }
7238524.269566 | 0) | n_tty_read() {
7238524.269566 | 0) 0.078 us | _raw_spin_lock_irq();
7238524.269567 | 0) 0.118 us | mutex_lock_interruptible();
7238524.269567 | 0) 0.078 us | down_read();
7238524.269568 | 0) | add_wait_queue() {
7238524.269568 | 0) 0.089 us | _raw_spin_lock_irqsave();
7238524.269569 | 0) 0.082 us | _raw_spin_unlock_irqrestore();
7238524.269569 | 0) 1.164 us | }
7238524.269570 | 0) 0.073 us | tty_hung_up_p();
7238524.269570 | 0) 0.076 us | n_tty_set_room();
7238524.269571 | 0) 0.078 us | up_read();
7238524.269571 | 0) | schedule_timeout() {
7238524.269572 | 0) | schedule() {
7238524.269572 | 0) | __schedule() {
7238524.269572 | 0) 0.078 us | rcu_note_context_switch();
7238524.269573 | 0) 0.085 us | _raw_spin_lock_irq();
7238524.269574 | 0) | deactivate_task() {
7238524.269574 | 0) | dequeue_task() {
7238524.269574 | 0) 0.185 us | update_rq_clock();
7238524.269575 | 0) | dequeue_task_fair() {
7238524.269575 | 0) | dequeue_entity() {
7238524.269575 | 0) | update_curr() {
7238524.269576 | 0) 0.206 us | cpuacct_charge();
7238524.269577 | 0) 0.937 us | }
7238524.269577 | 0) 0.084 us | __update_entity_load_avg_contrib();
7238524.269577 | 0) 0.077 us | update_cfs_rq_blocked_load();
7238524.269578 | 0) 0.075 us | clear_buddies();
7238524.269579 | 0) 0.096 us | account_entity_dequeue();
7238524.269579 | 0) | update_cfs_shares() {
7238524.269580 | 0) 0.095 us | update_curr();
7238524.269580 | 0) 0.104 us | account_entity_dequeue();
7238524.269581 | 0) 0.076 us | account_entity_enqueue();
7238524.269581 | 0) 1.898 us | }
7238524.269582 | 0) 6.120 us | }
7238524.269582 | 0) | dequeue_entity() {
7238524.269582 | 0) 0.093 us | update_curr();
7238524.269583 | 0) 0.116 us | __update_entity_load_avg_contrib();
7238524.269583 | 0) 0.085 us | update_cfs_rq_blocked_load();
7238524.269584 | 0) 0.067 us | clear_buddies();
7238524.269585 | 0) 0.082 us | account_entity_dequeue();
7238524.269585 | 0) 0.097 us | update_cfs_shares();
7238524.269586 | 0) 3.833 us | }
7238524.269586 | 0) 0.070 us | hrtick_update();
7238524.269587 | 0) 11.677 us | }
7238524.269587 | 0) 13.001 us | }
7238524.269587 | 0) 13.516 us | }
7238524.269588 | 0) | pick_next_task_fair() {
7238524.269588 | 0) 0.072 us | check_cfs_rq_runtime();
7238524.269588 | 0) | pick_next_entity() {
7238524.269589 | 0) 0.080 us | clear_buddies();
7238524.269589 | 0) 0.675 us | }
7238524.269590 | 0) | put_prev_entity() {
7238524.269590 | 0) 0.071 us | check_cfs_rq_runtime();
7238524.269591 | 0) 0.543 us | }
7238524.269591 | 0) | put_prev_entity() {
7238524.269591 | 0) 0.066 us | check_cfs_rq_runtime();
7238524.269592 | 0) 0.658 us | }
7238524.269592 | 0) | set_next_entity() {
7238524.269593 | 0) 0.082 us | update_stats_wait_end();
7238524.269593 | 0) 0.844 us | }
7238524.269594 | 0) 5.970 us | }
7238524.269594 | 0) 0.076 us | paravirt_start_context_switch();
7238524.269595 | 0) 0.074 us | xen_read_cr0();
7238524.269596 | 0) | xen_write_cr0() {
7238524.269597 | 0) 0.081 us | paravirt_get_lazy_mode();
7238524.269597 | 0) 0.086 us | __xen_mc_entry();
7238524.269598 | 0) 0.070 us | paravirt_get_lazy_mode();
7238524.269598 | 0) 1.739 us | }
7238524.269598 | 0) | xen_load_sp0() {
7238524.269599 | 0) 0.078 us | paravirt_get_lazy_mode();
7238524.269599 | 0) 0.078 us | __xen_mc_entry();
7238524.269600 | 0) 0.069 us | paravirt_get_lazy_mode();
7238524.269600 | 0) 1.568 us | }
7238524.269601 | 0) | xen_load_tls() {
7238524.269601 | 0) 0.068 us | paravirt_get_lazy_mode();
7238524.269601 | 0) 0.068 us | paravirt_get_lazy_mode();
7238524.269602 | 0) 0.078 us | load_TLS_descriptor();
7238524.269602 | 0) 0.071 us | load_TLS_descriptor();
7238524.269603 | 0) 0.073 us | load_TLS_descriptor();
7238524.269603 | 0) 0.063 us | paravirt_get_lazy_mode();
7238524.269604 | 0) 3.025 us | }
7238524.269604 | 0) | xen_end_context_switch() {
7238524.269604 | 0) 0.646 us | xen_mc_flush();
7238524.269605 | 0) 0.087 us | paravirt_end_context_switch();
7238524.269606 | 0) 1.604 us | }
^C
Ending tracing...
If you read through the durations carefully, you can see that the shell begins
by completing a 19 second read (the time between commands), then performs a
series of roughly 100 to 200 ms reads (inter-keystroke latency).
The function times printed are inclusive of their children.
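The slow reads stand out in the output above because durations are printed in
microseconds. As a rough post-processing sketch (assuming the trace was saved
to a file, here called trace.txt, which is a hypothetical name), the slow
returns can be pulled out with awk by locating the "us" token on each line and
thresholding the field just before it:

```shell
# Print trace lines whose duration exceeds 100 ms (durations are in "us").
# Works by locating the "us" token and reading the preceding field.
awk '{ for (i = 1; i <= NF; i++)
         if ($i == "us") { if ($(i-1) + 0 > 100000) print; break } }' trace.txt
```

On the output above this selects the 100 ms-plus n_tty_read() and tty_read()
returns while skipping the sub-microsecond leaf calls.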
The -C option prints on-CPU times only, excluding time the functions spent
sleeping or blocked. Eg:
# ./funcgraph -Ctp 25285 vfs_read
Tracing "vfs_read" for PID 25285... Ctrl-C to end.
7338520.591816 | 0) | finish_task_switch() {
7338520.591820 | 0) | xen_evtchn_do_upcall() {
7338520.591821 | 0) | irq_enter() {
7338520.591821 | 0) 0.134 us | rcu_irq_enter();
7338520.591822 | 0) 0.823 us | }
7338520.591822 | 0) 0.055 us | exit_idle();
7338520.591822 | 0) | __xen_evtchn_do_upcall() {
7338520.591823 | 0) | evtchn_2l_handle_events() {
7338520.591823 | 0) 0.051 us | irq_from_virq();
7338520.591823 | 0) | evtchn_from_irq() {
7338520.591823 | 0) | irq_get_irq_data() {
7338520.591824 | 0) 0.064 us | irq_to_desc();
7338520.591824 | 0) 0.572 us | }
7338520.591824 | 0) 0.973 us | }
7338520.591825 | 0) | get_evtchn_to_irq() {
7338520.591825 | 0) 0.049 us | evtchn_2l_max_channels();
7338520.591825 | 0) 0.386 us | }
7338520.591825 | 0) | generic_handle_irq() {
7338520.591825 | 0) 0.061 us | irq_to_desc();
7338520.591826 | 0) | handle_percpu_irq() {
7338520.591826 | 0) | ack_dynirq() {
7338520.591826 | 0) | evtchn_from_irq() {
7338520.591826 | 0) | irq_get_irq_data() {
7338520.591827 | 0) 0.050 us | irq_to_desc();
7338520.591827 | 0) 0.441 us | }
7338520.591827 | 0) 0.748 us | }
7338520.591827 | 0) 0.048 us | irq_move_irq();
7338520.591828 | 0) 0.053 us | evtchn_2l_clear_pending();
7338520.591828 | 0) 1.810 us | }
7338520.591828 | 0) | handle_irq_event_percpu() {
7338520.591828 | 0) | xen_irq_work_interrupt() {
7338520.591829 | 0) | irq_enter() {
7338520.591829 | 0) 0.069 us | rcu_irq_enter();
7338520.591829 | 0) 0.386 us | }
7338520.591830 | 0) | __wake_up() {
7338520.591830 | 0) 0.060 us | _raw_spin_lock_irqsave();
7338520.591830 | 0) | __wake_up_common() {
7338520.591830 | 0) | autoremove_wake_function() {
7338520.591831 | 0) | default_wake_function() {
7338520.591831 | 0) | try_to_wake_up() {
7338520.591831 | 0) 0.223 us | _raw_spin_lock_irqsave();
7338520.591832 | 0) 0.243 us | task_waking_fair();
7338520.591832 | 0) | select_task_rq_fair() {
7338520.591833 | 0) 0.063 us | source_load();
7338520.591833 | 0) 0.059 us | target_load();
7338520.591834 | 0) 0.060 us | idle_cpu();
7338520.591834 | 0) 0.059 us | cpus_share_cache();
7338520.591834 | 0) 0.085 us | idle_cpu();
7338520.591835 | 0) 2.176 us | }
7338520.591835 | 0) 0.050 us | _raw_spin_lock();
7338520.591835 | 0) | ttwu_do_activate.constprop.124() {
7338520.591835 | 0) | activate_task() {
7338520.591836 | 0) | enqueue_task() {
7338520.591836 | 0) 0.197 us | update_rq_clock();
7338520.591836 | 0) | enqueue_task_fair() {
7338520.591836 | 0) | enqueue_entity() {
7338520.591837 | 0) 0.118 us | update_curr();
7338520.591837 | 0) 0.060 us | __compute_runnable_contrib.part.51();
7338520.591838 | 0) 0.052 us | __update_entity_load_avg_contrib();
7338520.591838 | 0) 0.132 us | update_cfs_rq_blocked_load();
7338520.591838 | 0) 0.068 us | account_entity_enqueue();
7338520.591839 | 0) 0.327 us | update_cfs_shares();
7338520.591839 | 0) 0.055 us | place_entity();
7338520.591840 | 0) 0.086 us | __enqueue_entity();
7338520.591840 | 0) 0.069 us | update_cfs_rq_blocked_load();
7338520.591840 | 0) 3.870 us | }
7338520.591841 | 0) | enqueue_entity() {
7338520.591841 | 0) 0.050 us | update_curr();
7338520.591841 | 0) 0.048 us | __compute_runnable_contrib.part.51();
7338520.591842 | 0) 0.079 us | __update_entity_load_avg_contrib();
7338520.591842 | 0) 0.068 us | update_cfs_rq_blocked_load();
7338520.591842 | 0) 0.072 us | account_entity_enqueue();
7338520.591843 | 0) 0.068 us | update_cfs_shares();
7338520.591844 | 0) 0.123 us | place_entity();
7338520.591844 | 0) 0.051 us | __enqueue_entity();
7338520.591845 | 0) 3.919 us | }
7338520.591845 | 0) 0.059 us | hrtick_update();
7338520.591845 | 0) 8.895 us | }
7338520.591846 | 0) 9.770 us | }
7338520.591846 | 0) 10.197 us | }
7338520.591846 | 0) | ttwu_do_wakeup() {
7338520.591846 | 0) | check_preempt_curr() {
7338520.591846 | 0) | resched_task() {
7338520.591847 | 0) | xen_smp_send_reschedule() {
7338520.591847 | 0) | xen_send_IPI_one() {
7338520.591847 | 0) | notify_remote_via_irq() {
7338520.591847 | 0) | evtchn_from_irq() {
7338520.591848 | 0) | irq_get_irq_data() {
7338520.591848 | 0) 0.051 us | irq_to_desc();
7338520.591848 | 0) 0.503 us | }
7338520.591848 | 0) 1.031 us | }
7338520.591849 | 0) 2.112 us | }
7338520.591849 | 0) 2.484 us | }
7338520.591850 | 0) 2.851 us | }
7338520.591850 | 0) 3.311 us | }
7338520.591850 | 0) 3.828 us | }
7338520.591851 | 0) 4.788 us | }
7338520.591851 | 0) 15.731 us | }
7338520.591851 | 0) 0.074 us | _raw_spin_unlock();
7338520.591852 | 0) 0.156 us | ttwu_stat();
7338520.591852 | 0) 0.080 us | _raw_spin_unlock_irqrestore();
7338520.591853 | 0) 21.807 us | }
7338520.591853 | 0) 22.286 us | }
7338520.591853 | 0) 22.738 us | }
7338520.591854 | 0) 23.387 us | }
7338520.591854 | 0) 0.105 us | _raw_spin_unlock_irqrestore();
7338520.591854 | 0) 24.698 us | }
7338520.591855 | 0) | irq_exit() {
7338520.591855 | 0) 0.086 us | idle_cpu();
7338520.591856 | 0) 0.105 us | rcu_irq_exit();
7338520.591856 | 0) 1.272 us | }
7338520.591856 | 0) 27.818 us | }
7338520.591857 | 0) 0.140 us | add_interrupt_randomness();
7338520.591857 | 0) 0.084 us | note_interrupt();
7338520.591858 | 0) 29.866 us | }
7338520.591858 | 0) 32.390 us | }
7338520.591859 | 0) 33.204 us | }
7338520.591859 | 0) 36.137 us | }
7338520.591859 | 0) 36.574 us | }
7338520.591859 | 0) | irq_exit() {
7338520.591860 | 0) 0.073 us | idle_cpu();
7338520.591860 | 0) 0.076 us | rcu_irq_exit();
7338520.591861 | 0) 1.091 us | }
7338520.591861 | 0) 40.156 us | }
7338520.591862 | 0) 41.874 us | }
7338520.591862 | 0) 75.633 us | } /* __schedule */
7338520.591862 | 0) 76.182 us | } /* schedule */
7338520.591863 | 0) 76.965 us | } /* schedule_timeout */
7338520.591863 | 0) 0.070 us | down_read();
7338520.591864 | 0) | copy_from_read_buf() {
7338520.591865 | 0) | tty_audit_add_data() {
7338520.591865 | 0) 0.232 us | _raw_spin_lock_irqsave();
7338520.591866 | 0) 0.079 us | _raw_spin_unlock_irqrestore();
7338520.591867 | 0) 0.122 us | _raw_spin_lock_irqsave();
7338520.591867 | 0) 0.066 us | _raw_spin_unlock_irqrestore();
7338520.591868 | 0) 2.642 us | }
7338520.591868 | 0) 3.886 us | }
7338520.591868 | 0) 0.149 us | copy_from_read_buf();
7338520.591869 | 0) 0.072 us | n_tty_set_room();
7338520.591870 | 0) 0.071 us | n_tty_write_wakeup();
7338520.591870 | 0) | __wake_up() {
7338520.591871 | 0) 0.071 us | _raw_spin_lock_irqsave();
7338520.591872 | 0) | __wake_up_common() {
7338520.591872 | 0) 0.097 us | pollwake();
7338520.591873 | 0) 0.739 us | }
7338520.591873 | 0) 0.066 us | _raw_spin_unlock_irqrestore();
7338520.591874 | 0) 3.043 us | }
7338520.591874 | 0) 0.075 us | n_tty_set_room();
7338520.591875 | 0) 0.106 us | up_read();
7338520.591875 | 0) | remove_wait_queue() {
7338520.591875 | 0) 0.078 us | _raw_spin_lock_irqsave();
7338520.591876 | 0) 0.075 us | _raw_spin_unlock_irqrestore();
7338520.591877 | 0) 1.165 us | }
7338520.591877 | 0) 0.137 us | mutex_unlock();
7338520.591877 | 0) 98.321 us | } /* n_tty_read */
7338520.591878 | 0) | tty_ldisc_deref() {
7338520.591878 | 0) 0.072 us | ldsem_up_read();
7338520.591879 | 0) 0.561 us | }
7338520.591879 | 0) 0.090 us | get_seconds();
7338520.591880 | 0) 102.599 us | } /* tty_read */
7338520.591880 | 0) 0.362 us | __fsnotify_parent();
7338520.591881 | 0) 0.171 us | fsnotify();
7338520.591882 | 0) 109.640 us | } /* vfs_read */
7338520.591951 | 0) | vfs_read() {
7338520.591951 | 0) | rw_verify_area() {
7338520.591952 | 0) | security_file_permission() {
7338520.591952 | 0) | apparmor_file_permission() {
7338520.591952 | 0) 0.174 us | common_file_perm();
7338520.591953 | 0) 0.762 us | }
7338520.591953 | 0) 0.126 us | __fsnotify_parent();
7338520.591954 | 0) 0.088 us | fsnotify();
7338520.591954 | 0) 2.609 us | }
7338520.591955 | 0) 3.351 us | }
7338520.591955 | 0) | tty_read() {
7338520.591956 | 0) 0.081 us | tty_paranoia_check();
7338520.591956 | 0) | tty_ldisc_ref_wait() {
7338520.591956 | 0) 0.090 us | ldsem_down_read();
7338520.591957 | 0) 0.633 us | }
7338520.591957 | 0) | n_tty_read() {
7338520.591958 | 0) 0.073 us | _raw_spin_lock_irq();
7338520.591958 | 0) 0.089 us | mutex_lock_interruptible();
7338520.591959 | 0) 0.080 us | down_read();
7338520.591960 | 0) | add_wait_queue() {
7338520.591960 | 0) 0.084 us | _raw_spin_lock_irqsave();
7338520.591960 | 0) 0.087 us | _raw_spin_unlock_irqrestore();
7338520.591961 | 0) 1.215 us | }
7338520.591961 | 0) 0.078 us | tty_hung_up_p();
7338520.591962 | 0) 0.084 us | n_tty_set_room();
7338520.591962 | 0) 0.072 us | up_read();
7338520.591963 | 0) | schedule_timeout() {
7338520.591963 | 0) | schedule() {
7338520.591964 | 0) | __schedule() {
7338520.591964 | 0) 0.084 us | rcu_note_context_switch();
7338520.591965 | 0) 0.086 us | _raw_spin_lock_irq();
7338520.591965 | 0) | deactivate_task() {
7338520.591966 | 0) | dequeue_task() {
7338520.591966 | 0) 0.171 us | update_rq_clock();
7338520.591966 | 0) | dequeue_task_fair() {
7338520.591967 | 0) | dequeue_entity() {
7338520.591967 | 0) | update_curr() {
7338520.591967 | 0) 0.248 us | cpuacct_charge();
7338520.591968 | 0) 0.974 us | }
7338520.591969 | 0) 0.074 us | update_cfs_rq_blocked_load();
7338520.591969 | 0) 0.081 us | clear_buddies();
7338520.591970 | 0) 0.094 us | account_entity_dequeue();
7338520.591971 | 0) | update_cfs_shares() {
7338520.591971 | 0) 0.096 us | update_curr();
7338520.591971 | 0) 0.093 us | account_entity_dequeue();
7338520.591972 | 0) 0.079 us | account_entity_enqueue();
7338520.591972 | 0) 1.743 us | }
7338520.591972 | 0) 5.515 us | }
7338520.591973 | 0) | dequeue_entity() {
7338520.591973 | 0) 0.088 us | update_curr();
7338520.591974 | 0) 0.106 us | update_cfs_rq_blocked_load();
7338520.591975 | 0) 0.078 us | clear_buddies();
7338520.591975 | 0) 0.088 us | account_entity_dequeue();
7338520.591976 | 0) 0.091 us | update_cfs_shares();
7338520.591977 | 0) 3.639 us | }
7338520.591977 | 0) 0.078 us | hrtick_update();
7338520.591978 | 0) 10.851 us | }
7338520.591978 | 0) 11.992 us | }
7338520.591978 | 0) 12.496 us | }
7338520.591978 | 0) | pick_next_task_fair() {
7338520.591979 | 0) 0.079 us | check_cfs_rq_runtime();
7338520.591979 | 0) | pick_next_entity() {
7338520.591979 | 0) 0.080 us | clear_buddies();
7338520.591980 | 0) 0.594 us | }
7338520.591980 | 0) | put_prev_entity() {
7338520.591980 | 0) 0.078 us | check_cfs_rq_runtime();
7338520.591981 | 0) 0.641 us | }
7338520.591981 | 0) | put_prev_entity() {
7338520.591982 | 0) 0.076 us | check_cfs_rq_runtime();
7338520.591982 | 0) 0.610 us | }
7338520.591982 | 0) | set_next_entity() {
7338520.591983 | 0) 0.097 us | update_stats_wait_end();
7338520.591983 | 0) 0.744 us | }
7338520.591984 | 0) 5.115 us | }
7338520.591984 | 0) 0.076 us | paravirt_start_context_switch();
7338520.591985 | 0) 0.086 us | xen_read_cr0();
7338520.591986 | 0) | xen_write_cr0() {
7338520.591986 | 0) 0.078 us | paravirt_get_lazy_mode();
7338520.591987 | 0) 0.086 us | __xen_mc_entry();
7338520.591987 | 0) 0.078 us | paravirt_get_lazy_mode();
7338520.591988 | 0) 1.698 us | }
7338520.591988 | 0) | xen_load_sp0() {
7338520.591988 | 0) 0.074 us | paravirt_get_lazy_mode();
7338520.591989 | 0) 0.084 us | __xen_mc_entry();
7338520.591989 | 0) 0.084 us | paravirt_get_lazy_mode();
7338520.591990 | 0) 1.724 us | }
7338520.591990 | 0) | xen_load_tls() {
7338520.591991 | 0) 0.080 us | paravirt_get_lazy_mode();
7338520.591991 | 0) 0.088 us | paravirt_get_lazy_mode();
7338520.591992 | 0) 0.140 us | load_TLS_descriptor();
7338520.591992 | 0) 0.079 us | load_TLS_descriptor();
7338520.591993 | 0) 0.087 us | load_TLS_descriptor();
7338520.591994 | 0) 0.078 us | paravirt_get_lazy_mode();
7338520.591994 | 0) 3.666 us | }
7338520.591995 | 0) | xen_end_context_switch() {
7338520.591995 | 0) 0.644 us | xen_mc_flush();
7338520.591996 | 0) 0.080 us | paravirt_end_context_switch();
7338520.591997 | 0) 1.813 us | }
7338520.855105 | 0) | finish_task_switch() {
7338520.855110 | 0) | xen_evtchn_do_upcall() {
7338520.855110 | 0) | irq_enter() {
7338520.855110 | 0) 0.137 us | rcu_irq_enter();
7338520.855111 | 0) 0.673 us | }
7338520.855111 | 0) 0.063 us | exit_idle();
7338520.855111 | 0) | __xen_evtchn_do_upcall() {
7338520.855112 | 0) | evtchn_2l_handle_events() {
7338520.855112 | 0) 0.050 us | irq_from_virq();
7338520.855112 | 0) | evtchn_from_irq() {
7338520.855112 | 0) | irq_get_irq_data() {
7338520.855113 | 0) 0.050 us | irq_to_desc();
7338520.855113 | 0) 0.568 us | }
7338520.855113 | 0) 0.895 us | }
7338520.855114 | 0) | get_evtchn_to_irq() {
7338520.855114 | 0) 0.048 us | evtchn_2l_max_channels();
7338520.855114 | 0) 0.386 us | }
7338520.855114 | 0) | generic_handle_irq() {
7338520.855114 | 0) 0.051 us | irq_to_desc();
7338520.855115 | 0) | handle_percpu_irq() {
7338520.855115 | 0) | ack_dynirq() {
7338520.855115 | 0) | evtchn_from_irq() {
7338520.855115 | 0) | irq_get_irq_data() {
7338520.855116 | 0) 0.058 us | irq_to_desc();
7338520.855117 | 0) 1.264 us | }
7338520.855117 | 0) 1.644 us | }
7338520.855117 | 0) 0.048 us | irq_move_irq();
7338520.855118 | 0) 0.050 us | evtchn_2l_clear_pending();
7338520.855118 | 0) 2.876 us | }
7338520.855118 | 0) | handle_irq_event_percpu() {
7338520.855119 | 0) | xen_irq_work_interrupt() {
7338520.855119 | 0) | irq_enter() {
7338520.855119 | 0) 0.055 us | rcu_irq_enter();
7338520.855119 | 0) 0.460 us | }
7338520.855120 | 0) | __wake_up() {
7338520.855120 | 0) 0.057 us | _raw_spin_lock_irqsave();
7338520.855120 | 0) | __wake_up_common() {
7338520.855121 | 0) | autoremove_wake_function() {
7338520.855121 | 0) | default_wake_function() {
7338520.855121 | 0) | try_to_wake_up() {
7338520.855121 | 0) 0.203 us | _raw_spin_lock_irqsave();
7338520.855122 | 0) 0.179 us | task_waking_fair();
7338520.855123 | 0) | select_task_rq_fair() {
7338520.855123 | 0) 0.048 us | source_load();
7338520.855123 | 0) 0.059 us | target_load();
7338520.855124 | 0) 0.059 us | idle_cpu();
7338520.855124 | 0) 0.058 us | cpus_share_cache();
7338520.855124 | 0) 0.058 us | idle_cpu();
7338520.855125 | 0) 1.940 us | }
7338520.855125 | 0) 0.057 us | _raw_spin_lock();
7338520.855125 | 0) | ttwu_do_activate.constprop.124() {
7338520.855125 | 0) | activate_task() {
7338520.855126 | 0) | enqueue_task() {
7338520.855126 | 0) 0.171 us | update_rq_clock();
7338520.855126 | 0) | enqueue_task_fair() {
7338520.855126 | 0) | enqueue_entity() {
7338520.855127 | 0) 0.063 us | update_curr();
7338520.855127 | 0) 0.078 us | __compute_runnable_contrib.part.51();
7338520.855127 | 0) 0.066 us | __update_entity_load_avg_contrib();
7338520.855128 | 0) 0.061 us | update_cfs_rq_blocked_load();
7338520.855128 | 0) 0.072 us | account_entity_enqueue();
7338520.855128 | 0) 0.116 us | update_cfs_shares();
7338520.855129 | 0) 0.062 us | place_entity();
7338520.855129 | 0) 0.087 us | __enqueue_entity();
7338520.855129 | 0) 2.950 us | }
7338520.855130 | 0) | enqueue_entity() {
7338520.855130 | 0) 0.065 us | update_curr();
7338520.855130 | 0) 0.065 us | update_cfs_rq_blocked_load();
7338520.855130 | 0) 0.067 us | account_entity_enqueue();
7338520.855131 | 0) 0.084 us | update_cfs_shares();
7338520.855131 | 0) 0.112 us | place_entity();
7338520.855131 | 0) 0.051 us | __enqueue_entity();
7338520.855132 | 0) 2.074 us | }
7338520.855132 | 0) 0.055 us | hrtick_update();
7338520.855132 | 0) 5.983 us | }
7338520.855133 | 0) 6.790 us | }
7338520.855133 | 0) 7.138 us | }
7338520.855133 | 0) | ttwu_do_wakeup() {
7338520.855133 | 0) | check_preempt_curr() {
7338520.855133 | 0) | resched_task() {
7338520.855133 | 0) | xen_smp_send_reschedule() {
7338520.855134 | 0) | xen_send_IPI_one() {
7338520.855134 | 0) | notify_remote_via_irq() {
7338520.855134 | 0) | evtchn_from_irq() {
7338520.855134 | 0) | irq_get_irq_data() {
7338520.855134 | 0) 0.057 us | irq_to_desc();
7338520.855135 | 0) 0.502 us | }
7338520.855135 | 0) 0.865 us | }
7338520.855136 | 0) 1.975 us | } /* notify_remote_via_irq */
7338520.855136 | 0) 2.350 us | }
7338520.855136 | 0) 2.723 us | }
7338520.855136 | 0) 3.175 us | }
7338520.855137 | 0) 3.620 us | }
7338520.855138 | 0) 4.642 us | }
7338520.855138 | 0) 12.409 us | }
7338520.855138 | 0) 0.059 us | _raw_spin_unlock();
7338520.855139 | 0) 0.108 us | ttwu_stat();
7338520.855140 | 0) 0.073 us | _raw_spin_unlock_irqrestore();
7338520.855140 | 0) 18.857 us | }
7338520.855141 | 0) 19.415 us | }
7338520.855141 | 0) 19.993 us | }
7338520.855141 | 0) 20.587 us | }
7338520.855141 | 0) 0.070 us | _raw_spin_unlock_irqrestore();
7338520.855142 | 0) 21.858 us | }
7338520.855142 | 0) | irq_exit() {
7338520.855143 | 0) 0.084 us | idle_cpu();
7338520.855143 | 0) 0.082 us | rcu_irq_exit();
7338520.855144 | 0) 1.235 us | }
7338520.855144 | 0) 25.109 us | }
7338520.855144 | 0) 0.126 us | add_interrupt_randomness();
7338520.855145 | 0) 0.091 us | note_interrupt();
7338520.855145 | 0) 26.935 us | }
7338520.855146 | 0) 30.693 us | }
7338520.855146 | 0) 31.575 us | }
7338520.855146 | 0) 34.424 us | }
7338520.855147 | 0) 34.841 us | }
7338520.855147 | 0) | irq_exit() {
7338520.855147 | 0) 0.083 us | idle_cpu();
7338520.855148 | 0) 0.069 us | rcu_irq_exit();
7338520.855148 | 0) 1.056 us | }
7338520.855148 | 0) 38.284 us | }
7338520.855149 | 0) 39.892 us | }
7338520.855150 | 0) 72.181 us | }
7338520.855150 | 0) 72.925 us | }
7338520.855150 | 0) 73.638 us | }
7338520.855151 | 0) 0.078 us | down_read();
7338520.855152 | 0) | copy_from_read_buf() {
7338520.855153 | 0) | tty_audit_add_data() {
7338520.855153 | 0) 0.272 us | _raw_spin_lock_irqsave();
7338520.855154 | 0) 0.063 us | _raw_spin_unlock_irqrestore();
7338520.855155 | 0) 0.086 us | _raw_spin_lock_irqsave();
7338520.855155 | 0) 0.067 us | _raw_spin_unlock_irqrestore();
7338520.855156 | 0) 2.808 us | }
7338520.855156 | 0) 4.330 us | }
7338520.855156 | 0) 0.083 us | copy_from_read_buf();
7338520.855157 | 0) 0.062 us | n_tty_set_room();
7338520.855158 | 0) 0.079 us | n_tty_write_wakeup();
7338520.855158 | 0) | __wake_up() {
7338520.855158 | 0) 0.068 us | _raw_spin_lock_irqsave();
7338520.855159 | 0) | __wake_up_common() {
7338520.855159 | 0) 0.092 us | pollwake();
7338520.855160 | 0) 0.643 us | }
7338520.855160 | 0) 0.074 us | _raw_spin_unlock_irqrestore();
7338520.855160 | 0) 2.040 us | }
7338520.855161 | 0) 0.074 us | n_tty_set_room();
7338520.855162 | 0) 0.073 us | up_read();
7338520.855162 | 0) | remove_wait_queue() {
7338520.855162 | 0) 0.084 us | _raw_spin_lock_irqsave();
7338520.855163 | 0) 0.078 us | _raw_spin_unlock_irqrestore();
7338520.855163 | 0) 1.166 us | }
7338520.855164 | 0) 0.140 us | mutex_unlock();
7338520.855164 | 0) 93.360 us | }
7338520.855165 | 0) | tty_ldisc_deref() {
7338520.855165 | 0) 0.070 us | ldsem_up_read();
7338520.855166 | 0) 0.746 us | }
7338520.855166 | 0) 0.071 us | get_seconds();
7338520.855167 | 0) 97.713 us | }
7338520.855167 | 0) 0.283 us | __fsnotify_parent();
7338520.855168 | 0) 0.172 us | fsnotify();
7338520.855168 | 0) 103.847 us | }
7338520.855238 | 0) | vfs_read() {
7338520.855239 | 0) | rw_verify_area() {
7338520.855240 | 0) | security_file_permission() {
7338520.855240 | 0) | apparmor_file_permission() {
7338520.855240 | 0) 0.160 us | common_file_perm();
7338520.855241 | 0) 0.770 us | }
7338520.855241 | 0) 0.078 us | __fsnotify_parent();
7338520.855242 | 0) 0.087 us | fsnotify();
7338520.855243 | 0) 2.595 us | }
7338520.855243 | 0) 4.148 us | }
7338520.855243 | 0) | tty_read() {
7338520.855244 | 0) 0.078 us | tty_paranoia_check();
7338520.855244 | 0) | tty_ldisc_ref_wait() {
7338520.855244 | 0) 0.084 us | ldsem_down_read();
7338520.855245 | 0) 0.643 us | }
7338520.855245 | 0) | n_tty_read() {
7338520.855246 | 0) 0.079 us | _raw_spin_lock_irq();
7338520.855247 | 0) 0.171 us | mutex_lock_interruptible();
7338520.855247 | 0) 0.064 us | down_read();
7338520.855248 | 0) | add_wait_queue() {
7338520.855248 | 0) 0.078 us | _raw_spin_lock_irqsave();
7338520.855249 | 0) 0.082 us | _raw_spin_unlock_irqrestore();
7338520.855249 | 0) 1.076 us | }
7338520.855250 | 0) 0.075 us | tty_hung_up_p();
7338520.855250 | 0) 0.079 us | n_tty_set_room();
7338520.855251 | 0) 0.075 us | up_read();
7338520.855251 | 0) | schedule_timeout() {
7338520.855252 | 0) | schedule() {
7338520.855252 | 0) | __schedule() {
7338520.855252 | 0) 0.084 us | rcu_note_context_switch();
7338520.855253 | 0) 0.079 us | _raw_spin_lock_irq();
7338520.855254 | 0) | deactivate_task() {
7338520.855254 | 0) | dequeue_task() {
7338520.855254 | 0) 0.219 us | update_rq_clock();
7338520.855255 | 0) | dequeue_task_fair() {
7338520.855255 | 0) | dequeue_entity() {
7338520.855255 | 0) | update_curr() {
7338520.855256 | 0) 0.186 us | cpuacct_charge();
7338520.855257 | 0) 0.924 us | }
7338520.855257 | 0) 0.078 us | update_cfs_rq_blocked_load();
7338520.855258 | 0) 0.078 us | clear_buddies();
7338520.855258 | 0) 0.083 us | account_entity_dequeue();
7338520.855259 | 0) | update_cfs_shares() {
7338520.855259 | 0) 0.105 us | update_curr();
7338520.855260 | 0) 0.093 us | account_entity_dequeue();
7338520.855260 | 0) 0.098 us | account_entity_enqueue();
7338520.855261 | 0) 1.825 us | }
7338520.855261 | 0) 5.574 us | }
7338520.855261 | 0) | dequeue_entity() {
7338520.855261 | 0) 0.086 us | update_curr();
7338520.855262 | 0) 0.127 us | __update_entity_load_avg_contrib();
7338520.855263 | 0) 0.070 us | update_cfs_rq_blocked_load();
7338520.855263 | 0) 0.066 us | clear_buddies();
7338520.855264 | 0) 0.082 us | account_entity_dequeue();
7338520.855264 | 0) 0.104 us | update_cfs_shares();
7338520.855265 | 0) 3.439 us | }
7338520.855265 | 0) 0.078 us | hrtick_update();
7338520.855266 | 0) 10.741 us | }
7338520.855266 | 0) 11.990 us | }
7338520.855266 | 0) 12.580 us | }
7338520.855267 | 0) | pick_next_task_fair() {
7338520.855267 | 0) 0.074 us | check_cfs_rq_runtime();
7338520.855268 | 0) | pick_next_entity() {
7338520.855268 | 0) 0.078 us | clear_buddies();
7338520.855269 | 0) 0.696 us | }
7338520.855269 | 0) | put_prev_entity() {
7338520.855269 | 0) 0.084 us | check_cfs_rq_runtime();
7338520.855270 | 0) 0.628 us | }
7338520.855270 | 0) | put_prev_entity() {
7338520.855270 | 0) 0.074 us | check_cfs_rq_runtime();
7338520.855271 | 0) 0.575 us | }
7338520.855271 | 0) | set_next_entity() {
7338520.855272 | 0) 0.104 us | update_stats_wait_end();
7338520.855273 | 0) 0.834 us | }
7338520.855273 | 0) 5.872 us | }
7338520.855274 | 0) 0.079 us | paravirt_start_context_switch();
7338520.855275 | 0) 0.080 us | xen_read_cr0();
7338520.855276 | 0) | xen_write_cr0() {
7338520.855276 | 0) 0.091 us | paravirt_get_lazy_mode();
7338520.855277 | 0) 0.087 us | __xen_mc_entry();
7338520.855277 | 0) 0.076 us | paravirt_get_lazy_mode();
7338520.855278 | 0) 1.986 us | }
7338520.855278 | 0) | xen_load_sp0() {
7338520.855278 | 0) 0.066 us | paravirt_get_lazy_mode();
7338520.855279 | 0) 0.083 us | __xen_mc_entry();
7338520.855280 | 0) 0.082 us | paravirt_get_lazy_mode();
7338520.855280 | 0) 1.925 us | }
7338520.855281 | 0) | xen_load_tls() {
7338520.855281 | 0) 0.082 us | paravirt_get_lazy_mode();
7338520.855281 | 0) 0.080 us | paravirt_get_lazy_mode();
7338520.855282 | 0) 0.137 us | load_TLS_descriptor();
7338520.855283 | 0) 0.090 us | load_TLS_descriptor();
7338520.855283 | 0) 0.081 us | load_TLS_descriptor();
7338520.855284 | 0) 0.081 us | paravirt_get_lazy_mode();
7338520.855284 | 0) 3.397 us | }
7338520.855284 | 0) | xen_end_context_switch() {
7338520.855285 | 0) 0.618 us | xen_mc_flush();
7338520.855286 | 0) 0.086 us | paravirt_end_context_switch();
7338520.855286 | 0) 1.708 us | }
^C
Ending tracing...
Understanding whether the time is on-CPU or blocked off-CPU directs the
performance investigation.
Use -h to print the USAGE message:
# ./funcgraph -h
USAGE: funcgraph [-aCDhHPtT] [-m maxdepth] [-p PID] [-L TID] [-d secs] funcstring
-a # all info (same as -HPt)
-C # measure on-CPU time only
-d seconds # trace duration, and use buffers
-D # do not show function duration
-h # this usage message
-H # include column headers
-m maxdepth # max stack depth to show
-p PID # trace when this pid is on-CPU
-L TID # trace when this thread is on-CPU
-P # show process names & PIDs
-t # show timestamps
-T # comment function tails
eg,
funcgraph do_nanosleep # trace do_nanosleep() and children
funcgraph -m 3 do_sys_open # trace do_sys_open() to 3 levels only
funcgraph -a do_sys_open # include timestamps and process name
funcgraph -p 198 do_sys_open # trace do_sys_open() for PID 198 only
funcgraph -d 1 do_sys_open >out # trace 1 sec, then write to file
See the man page and example file for more info.
Demonstrations of funcslower, the Linux ftrace version.
Show me ext3_readpages() calls slower than 1000 microseconds (1 ms):
# ./funcslower ext3_readpages 1000
Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
0) ! 8147.120 us | } /* ext3_readpages */
0) ! 8135.067 us | } /* ext3_readpages */
0) ! 12202.93 us | } /* ext3_readpages */
0) ! 12201.84 us | } /* ext3_readpages */
0) ! 8142.667 us | } /* ext3_readpages */
0) ! 12194.14 us | } /* ext3_readpages */
^C
Ending tracing...
Neat. So this confirms that there are ext3_readpages() calls that are taking
over 8000 us (8 ms).
funcslower uses the ftrace function graph profiler to dynamically instrument
the given kernel function, time it in-kernel, and emit only those events that
exceed the given latency threshold. Since the filtering also happens in
kernel context, the overheads are relatively low (compared to post-processing
every event in user space).
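The in-kernel thresholding maps onto a few tracefs writes. As a rough sketch
of what funcslower drives under the hood (paths assume tracefs is mounted at
/sys/kernel/debug/tracing and root privileges; the real script adds
housekeeping such as buffer sizing, PID filtering, and cleanup on interrupt):

```shell
cd /sys/kernel/debug/tracing
echo ext3_readpages > set_graph_function   # graph only this function
echo 1000 > tracing_thresh                 # report durations over 1000 us only
echo function_graph > current_tracer       # enable the graph profiler
cat trace_pipe                             # stream slow events; Ctrl-C to stop
echo nop > current_tracer                  # then reset
echo 0 > tracing_thresh
echo > set_graph_function
```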
Now include the process name and PID (-P) of the process that is on-CPU, and
the absolute timestamp (-t) of the event:
# ./funcslower -Pt ext3_readpages 1000
Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
2678112.003180 | 0) cksum-26695 | ! 8145.268 us | } /* ext3_readpages */
2678113.538763 | 0) cksum-26695 | ! 8139.086 us | } /* ext3_readpages */
2678113.704901 | 0) cksum-26695 | ! 8147.549 us | } /* ext3_readpages */
2678113.721102 | 0) cksum-26695 | ! 8142.530 us | } /* ext3_readpages */
2678113.810269 | 0) cksum-26695 | ! 12234.70 us | } /* ext3_readpages */
2678113.996625 | 0) cksum-26695 | ! 8146.129 us | } /* ext3_readpages */
2678114.012832 | 0) cksum-26695 | ! 8148.153 us | } /* ext3_readpages */
^C
Ending tracing...
Great! Now I can see the process name, which in this case is the responsible
process. The timestamps also let me determine the rate of these slow events.
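As a hypothetical post-processing step, the interval between slow events can
be computed from the -Pt timestamp column (field 1) with awk; the two sample
lines are taken from the trace above:

```shell
# Print the gap between consecutive events using the timestamp in field 1.
awk '{ t = $1 + 0
       if (last) printf "%.6f s since previous event\n", t - last
       last = t }' <<'EOF'
2678112.003180 | 0)  cksum-26695  | ! 8145.268 us | } /* ext3_readpages */
2678113.538763 | 0)  cksum-26695  | ! 8139.086 us | } /* ext3_readpages */
EOF
```

For these two lines it prints "1.535583 s since previous event".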
Now measure time differently: excluding time spent sleeping, so that we only
see on-CPU time:
# ./funcslower -Pct ext3_readpages 1000
Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
^C
Ending tracing...
I believe the workload hasn't changed, so these ext3_readpages() calls are
still happening; however, their on-CPU time doesn't exceed 1 ms. Compared to
the earlier output, this tells me that the latency in this function is due to
time spent blocked off-CPU, not on-CPU. That makes sense: this function is
ultimately blocking on disk I/O.
Were the function duration times to be similar with and without -C, that would
tell us that the high latency is due to time spent on-CPU executing code.
This traces the sys_nanosleep() kernel function, and shows calls taking over
100 us:
# ./funcslower sys_nanosleep 100
Tracing "sys_nanosleep" slower than 100 us... Ctrl-C to end.
0) ! 2000147 us | } /* sys_nanosleep */
------------------------------------------
0) registe-27414 => vmstat-27419
------------------------------------------
0) ! 1000143 us | } /* sys_nanosleep */
0) ! 1000154 us | } /* sys_nanosleep */
------------------------------------------
0) vmstat-27419 => registe-27414
------------------------------------------
0) ! 2000183 us | } /* sys_nanosleep */
------------------------------------------
0) registe-27414 => vmstat-27419
------------------------------------------
0) ! 1000141 us | } /* sys_nanosleep */
^C
Ending tracing...
This is an example where I did not use -P, but ftrace has included process
information anyway. Look for the lines containing "=>", which indicate a process
switch on the given CPU.
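A hypothetical awk one-liner can extract these switches mechanically, keying
on "=>" as the third field (two sample lines are inlined here):

```shell
# Report process switches: a switch line has "=>" as its 3rd field,
# with the previous and next task names either side of it.
awk '$3 == "=>" { cpu = $1; sub(/\)/, "", cpu)
                  printf "CPU %s: %s -> %s\n", cpu, $2, $4 }' <<'EOF'
 0)   registe-27414  =>   vmstat-27419
 0) ! 1000143 us  |  } /* sys_nanosleep */
EOF
```

This prints "CPU 0: registe-27414 -> vmstat-27419" and ignores the
duration line.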
Use -h to print the USAGE message:
# ./funcslower -h
USAGE: funcslower [-aChHPt] [-p PID] [-L TID] [-d secs] funcstring latency_us
-a # all info (same as -HPt)
-C # measure on-CPU time only
-d seconds # trace duration, and use buffers
-h # this usage message
-H # include column headers
-p PID # trace when this pid is on-CPU
-L TID # trace when this thread is on-CPU
-P # show process names & PIDs
-t # show timestamps
eg,
funcslower vfs_read 10000 # trace vfs_read() slower than 10 ms
See the man page and example file for more info.
Demonstrations of functrace, the Linux ftrace version.
A (usually) good example to start with is do_nanosleep(), since it is not called
frequently, and easily triggered. Here's tracing it using functrace:
# ./functrace 'do_nanosleep'
Tracing "do_nanosleep"... Ctrl-C to end.
svscan-1678 [000] .... 6412438.703521: do_nanosleep <-hrtimer_nanosleep
svscan-1678 [000] .... 6412443.703678: do_nanosleep <-hrtimer_nanosleep
svscan-1678 [000] .... 6412448.703865: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412453.216241: do_nanosleep <-hrtimer_nanosleep
svscan-1678 [000] .... 6412453.704049: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412454.216524: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412455.216816: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412456.217093: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412457.217378: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412458.217660: do_nanosleep <-hrtimer_nanosleep
^C
Ending tracing...
While tracing, I ran a "vmstat 1" in another window. vmstat and its process ID
can be seen in the 1st column, and its one-second intervals can be seen in the
timestamps in the 4th column.
These are the basic details: who was on-CPU (process name and PID), flags, the
timestamp, and the calling function. Treat this as the next step after
funccount for getting a little more information on kernel function execution,
before using more capable tools to dig further.
This is Linux 3.16, and the output is the ftrace text buffer format, which has
changed slightly between kernel versions.
To see the column headers, use -H. This is Linux 3.16:
# ./functrace -H do_nanosleep
Tracing "do_nanosleep"... Ctrl-C to end.
# tracer: function
#
# entries-in-buffer/entries-written: 0/0 #P:2
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
svscan-1678 [001] .... 6413283.729520: do_nanosleep <-hrtimer_nanosleep
svscan-1678 [001] .... 6413288.729679: do_nanosleep <-hrtimer_nanosleep
For comparison, here's Linux 3.2:
# ./functrace -H do_nanosleep
Tracing "do_nanosleep"... Ctrl-C to end.
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
vmstat-11789 [000] 1763207.021204: do_nanosleep <-hrtimer_nanosleep
vmstat-11789 [000] 1763208.022970: do_nanosleep <-hrtimer_nanosleep
vmstat-11789 [000] 1763209.023267: do_nanosleep <-hrtimer_nanosleep
For documentation on the exact format, see the Linux kernel source under
Documentation/trace/ftrace.txt.
This error:
# ./functrace 'ext4_z*'
Tracing "ext4_z*"... Ctrl-C to end.
./functrace: line 136: echo: write error: Invalid argument
ERROR: enabling "ext4_z*". Exiting.
is because there were no functions beginning with "ext4_z". You can check the
available functions in the /sys/kernel/debug/tracing/available_filter_functions
file.
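To catch this before enabling tracing, the wildcard can be tested against that
file first. Below is a hedged sketch (check_pattern is a hypothetical helper;
real usage would feed it /sys/kernel/debug/tracing/available_filter_functions,
but a heredoc sample stands in so it runs without root):

```shell
# Check an ftrace wildcard against a list of function names on stdin.
check_pattern() {
    # convert the ftrace glob (e.g. "ext4_z*") into an anchored regex
    pattern="^$(printf '%s' "$1" | sed 's/\*/.*/g')\$"
    if grep -q "$pattern"; then
        echo "ok: functions match $1"
    else
        echo "ERROR: nothing matches $1"
    fi
}

check_pattern 'ext4_z*' <<'EOF'
ext3_readpages
ext4_readdir
ext4_readpage
EOF
```

For this sample list it prints "ERROR: nothing matches ext4_z*"; a pattern
like 'ext4_*' would print the "ok" line instead.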
You might want to use funccount to check the frequency of events before using
functrace. For example, counting ext3 events on a system:
# ./funccount -d 10 'ext3*'
Tracing "ext3*" for 10 seconds...
FUNC COUNT
ext3_journal_dirty_data 1
ext3_ordered_write_end 1
ext3_write_begin 1
ext3_writepage_trans_blocks 1
ext3_dirty_inode 2
ext3_do_update_inode 2
ext3_get_group_desc 2
ext3_get_inode_block.isra.20 2
ext3_get_inode_flags 2
ext3_get_inode_loc 2
ext3_mark_iloc_dirty 2
ext3_mark_inode_dirty 2
ext3_reserve_inode_write 2
ext3_journal_start_sb 3
ext3_block_to_path.isra.22 6
ext3_bmap 6
ext3_get_block 6
ext3_get_blocks_handle 6
ext3_get_branch 6
ext3_discard_reservation 11
ext3_ioctl 11
ext3_release_file 11
Ending tracing...
During 10 seconds, there weren't many ext3 calls. I might consider tracing
them all (the usual warnings about dynamically tracing many kernel functions
apply: test before use, as in the past there have been bugs causing panics).
# ./functrace 'ext3_*'
Tracing "ext3_*"... Ctrl-C to end.
register_start.-17008 [000] 1763557.577985: ext3_release_file <-__fput
register_start.-17008 [000] 1763557.577987: ext3_discard_reservation <-ext3_release_file
register_start.-17026 [000] 1763558.163620: ext3_ioctl <-file_ioctl
register_start.-17026 [000] 1763558.481081: ext3_release_file <-__fput
register_start.-17026 [000] 1763558.481083: ext3_discard_reservation <-ext3_release_file
register_start.-17041 [000] 1763559.186984: ext3_ioctl <-file_ioctl
register_start.-17041 [000] 1763559.511267: ext3_release_file <-__fput
[...]
For comparison, here's a different system and ext4:
# ./funccount -d 10 'ext4*'
Tracing "ext4*" for 10 seconds...
FUNC COUNT
ext4_journal_commit_callback 2
ext4_htree_fill_tree 6
ext4_htree_free_dir_info 6
ext4_release_dir 6
ext4_readdir 12
ext4fs_dirhash 29
ext4_htree_store_dirent 29
ext4_follow_link 36
ext4_file_mmap 42
ext4_free_data_callback 44
ext4_getattr 45
ext4_bmap 62
ext4_get_block 62
ext4_add_entry 280
ext4_add_nondir 280
ext4_alloc_da_blocks 280
ext4_alloc_inode 280
ext4_bio_write_page 280
ext4_can_truncate 280
ext4_claim_free_clusters 280
ext4_clear_inode 280
ext4_create 280
ext4_da_get_block_prep 280
ext4_da_invalidatepage 280
ext4_da_update_reserve_space 280
ext4_da_write_begin 280
ext4_da_write_end 280
ext4_dec_count.isra.22 280
ext4_delete_entry 280
ext4_destroy_inode 280
ext4_drop_inode 280
ext4_end_bio 280
ext4_es_init_tree 280
ext4_es_lru_del 280
ext4_evict_inode 280
ext4_ext_calc_metadata_amount 280
ext4_ext_correct_indexes 280
ext4_ext_find_goal 280
ext4_ext_insert_extent 280
ext4_ext_remove_space 280
ext4_ext_tree_init 280
ext4_ext_truncate 280
ext4_ext_truncate_extend_resta 280
ext4_ext_try_to_merge 280
ext4_ext_try_to_merge_right 280
ext4_file_write_iter 280
ext4_find_dest_de 280
ext4_finish_bio 280
ext4_free_blocks 280
ext4_free_inode 280
ext4_generic_delete_entry 280
ext4_has_free_clusters 280
ext4_i_callback 280
ext4_init_acl 280
ext4_init_security 280
ext4_inode_attach_jinode 280
ext4_inode_to_goal_block 280
ext4_insert_dentry 280
ext4_invalidatepage 280
ext4_io_submit_init 280
ext4_itable_unused_count 280
ext4_lookup 280
ext4_mb_complex_scan_group 280
ext4_mb_find_by_goal 280
ext4_mb_free_metadata 280
ext4_mb_initialize_context 280
ext4_mb_mark_diskspace_used 280
ext4_mb_new_blocks 280
ext4_mb_normalize_request 280
ext4_mb_regular_allocator 280
ext4_mb_release_context 280
ext4_mb_use_best_found 280
ext4_mb_use_preallocated 280
ext4_nonda_switch 280
ext4_orphan_del 280
ext4_put_io_end_defer 280
ext4_releasepage 280
ext4_rename 280
ext4_set_aops 280
ext4_setent 280
ext4_set_inode_flags 280
ext4_truncate 280
ext4_writepages 280
ext4_writepage_trans_blocks 280
ext4_xattr_delete_inode 280
ext4_xattr_get 285
ext4_xattr_ibody_get 285
ext4_xattr_security_get 285
ext4_bread 286
ext4_release_file 288
ext4_file_open 305
ext4_superblock_csum_set 494
ext4_block_bitmap_csum_set 560
ext4_es_free_extent 560
ext4_es_insert_extent 560
ext4_es_remove_extent 560
ext4_ext_find_extent 560
ext4_ext_map_blocks 560
ext4_free_group_clusters_set 560
ext4_free_inodes_set 560
ext4_get_group_no_and_offset 560
ext4_get_reserved_space 560
ext4_init_io_end 560
ext4_inode_bitmap_csum_set 560
ext4_io_submit 560
ext4_mb_good_group 560
ext4_orphan_add 560
ext4_put_io_end 560
ext4_read_block_bitmap 560
ext4_read_block_bitmap_nowait 560
ext4_read_inode_bitmap 560
ext4_release_io_end 560
ext4_set_bits 560
ext4_validate_block_bitmap 560
ext4_wait_block_bitmap 560
ext4_mb_load_buddy 604
ext4_mb_unload_buddy.isra.24 604
ext4_block_bitmap 840
ext4_discard_preallocations 840
ext4_ext_drop_refs 840
ext4_ext_get_access.isra.30 840
ext4_ext_index_trans_blocks 840
ext4_find_entry 840
ext4_free_group_clusters 840
ext4_handle_dirty_dirent_node 840
ext4_inode_bitmap 840
ext4_meta_trans_blocks 840
ext4_dirty_inode 845
ext4_free_inodes_count 1120
ext4_group_desc_csum 1120
ext4_group_desc_csum_set 1120
ext4_getblk 1126
ext4_map_blocks 1468
ext4_es_lookup_extent 1748
ext4_mb_check_limits 1875
ext4_es_lru_add 2028
ext4_data_block_valid 2308
ext4_journal_check_start 3085
ext4_mark_inode_dirty 5325
ext4_get_inode_flags 5951
ext4_get_inode_loc 5951
ext4_mark_iloc_dirty 5951
ext4_reserve_inode_write 5951
ext4_inode_table 7071
ext4_get_group_desc 8471
ext4_has_inline_data 9486
Ending tracing...
There are many functions called frequently. Tracing them all may cost
significant performance overhead. I may read through this list and look for
the most interesting functions to trace, reducing overheads by only selecting
a few.
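One way to shortlist candidates is to filter the funccount output by call
count. This is a sketch using a few sample lines from the output above; the
threshold of 100 is arbitrary.

```shell
#!/bin/sh
# Sketch: from funccount output, keep only functions called fewer than
# 100 times, as cheaper candidates for functrace.
candidates=$(
awk '$2 ~ /^[0-9]+$/ && $2 < 100 { print $1 }' <<'EOF'
ext4_journal_commit_callback 2
ext4_readdir 12
ext4_add_entry 280
ext4_get_group_desc 8471
EOF
)
echo "$candidates"
```

On a real capture, pipe the funccount output file through the same awk.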
For example, ext4_create() looks interesting:
# ./functrace ext4_create
Tracing "ext4_create"... Ctrl-C to end.
supervise-1681 [000] .... 6414396.700163: ext4_create <-vfs_create
supervise-1684 [001] .... 6414396.700287: ext4_create <-vfs_create
supervise-1681 [000] .... 6414396.700598: ext4_create <-vfs_create
supervise-1684 [001] .... 6414396.700636: ext4_create <-vfs_create
supervise-1687 [001] .... 6414396.701577: ext4_create <-vfs_create
supervise-1688 [000] .... 6414396.702590: ext4_create <-vfs_create
supervise-1693 [001] .... 6414396.702829: ext4_create <-vfs_create
supervise-1693 [001] .... 6414396.703592: ext4_create <-vfs_create
supervise-1688 [000] .... 6414396.703598: ext4_create <-vfs_create
supervise-1687 [001] .... 6414396.703988: ext4_create <-vfs_create
supervise-1685 [001] .... 6414396.704126: ext4_create <-vfs_create
supervise-1685 [001] .... 6414396.704458: ext4_create <-vfs_create
supervise-1682 [001] .... 6414396.704577: ext4_create <-vfs_create
supervise-1683 [000] .... 6414396.704984: ext4_create <-vfs_create
supervise-1682 [001] .... 6414396.704985: ext4_create <-vfs_create
[...]
Now I know that different PIDs of the supervise program are calling ext4_create
at around the same time, via vfs_create().
The duration mode uses buffering, instead of printing events as they occur.
This greatly reduces overheads. For example:
# ./functrace -d 10 ext4_create > out.ext4_create
# wc out.ext4_create
283 1687 21059 out.ext4_create
Note that the buffer has a limited size. If the timestamp range in the output
is shorter than your requested duration, that's a clue that the buffer was
exhausted and events were missed.
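That timestamp check can be scripted. This is a sketch using a few inline
sample lines from the ext4_create trace above; normally it would read the
out.ext4_create file.

```shell
#!/bin/sh
# Sketch: measure the traced time span from functrace output, to spot a
# truncated buffer (a span much shorter than -d's duration suggests drops).
span=$(awk '{ t = $4; sub(/:$/, "", t);
              if (NR == 1) { min = t; max = t }
              if (t + 0 < min + 0) min = t;
              if (t + 0 > max + 0) max = t }
            END { printf "%.3f", max - min }' <<'EOF'
supervise-1681 [000] .... 6414396.700163: ext4_create <-vfs_create
supervise-1684 [001] .... 6414396.700287: ext4_create <-vfs_create
supervise-1682 [001] .... 6414398.704577: ext4_create <-vfs_create
EOF
)
echo "trace spans $span seconds"
```

Field 4 is the ftrace timestamp; the trailing colon is stripped before the
arithmetic.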
Use -h to print the USAGE message:
# ./functrace -h
USAGE: functrace [-hH] [-p PID] [-L TID] [-d secs] funcstring
-d seconds # trace duration, and use buffers
-h # this usage message
-H # include column headers
-p PID # trace when this pid is on-CPU
-L TID # trace when this thread is on-CPU
eg,
functrace do_nanosleep # trace the do_nanosleep() function
functrace '*sleep' # trace functions ending in "sleep"
functrace -p 198 'vfs*' # trace "vfs*" funcs for PID 198
functrace 'tcp*' > out # trace all "tcp*" funcs to out file
functrace -d 1 'tcp*' > out # trace 1 sec, then write out file
See the man page and example file for more info.
Demonstrations of iolatency, the Linux ftrace version.
Here's a busy system doing over 4k disk IOPS:
# ./iolatency
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 4381 |######################################|
1 -> 2 : 9 |# |
2 -> 4 : 5 |# |
4 -> 8 : 0 | |
8 -> 16 : 1 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 4053 |######################################|
1 -> 2 : 18 |# |
2 -> 4 : 9 |# |
4 -> 8 : 2 |# |
8 -> 16 : 1 |# |
16 -> 32 : 1 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 4658 |######################################|
1 -> 2 : 9 |# |
2 -> 4 : 2 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 4298 |######################################|
1 -> 2 : 17 |# |
2 -> 4 : 10 |# |
4 -> 8 : 1 |# |
8 -> 16 : 1 |# |
^C
Ending tracing...
Disk I/O latency is usually between 0 and 1 milliseconds, as this system uses
SSDs. There are occasional outliers, up to the 16->32 ms range.
Identifying outliers like these is difficult from iostat(1) alone, which at
the same time reported:
# iostat 1
[...]
avg-cpu: %user %nice %system %iowait %steal %idle
0.53 0.00 1.05 46.84 0.53 51.05
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 0.00 0.00 28.00 0.00 112.00 8.00 0.02 0.71 0.00 0.71 0.29 0.80
xvdb 0.00 0.00 2134.00 0.00 18768.00 0.00 17.59 0.51 0.24 0.24 0.00 0.23 50.00
xvdc 0.00 0.00 2088.00 0.00 18504.00 0.00 17.72 0.47 0.22 0.22 0.00 0.22 46.40
md0 0.00 0.00 4222.00 0.00 37256.00 0.00 17.65 0.00 0.00 0.00 0.00 0.00 0.00
I/O latency ("await") averages 0.24 and 0.22 ms for our busy disks, but this
output doesn't show that it is occasionally much higher.
To get more information on these I/O, try the iosnoop(8) tool.
The -Q option includes the block I/O queued time, by tracing based on
block_rq_insert instead of block_rq_issue:
# ./iolatency -Q
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1913 |######################################|
1 -> 2 : 438 |######### |
2 -> 4 : 100 |## |
4 -> 8 : 145 |### |
8 -> 16 : 43 |# |
16 -> 32 : 43 |# |
32 -> 64 : 1 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 2360 |######################################|
1 -> 2 : 132 |### |
2 -> 4 : 72 |## |
4 -> 8 : 14 |# |
8 -> 16 : 1 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 2138 |######################################|
1 -> 2 : 496 |######### |
2 -> 4 : 81 |## |
4 -> 8 : 40 |# |
8 -> 16 : 1 |# |
16 -> 32 : 2 |# |
^C
Ending tracing...
I use this along with the default mode to identify problems of load (queueing)
vs problems of the device, which is shown by default.
Here's a more interesting system. This is doing a mixed read/write workload,
and has a pretty awful latency distribution:
# ./iolatency 5 3
Tracing block I/O. Output every 5 seconds.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 2809 |######################################|
1 -> 2 : 32 |# |
2 -> 4 : 14 |# |
4 -> 8 : 6 |# |
8 -> 16 : 7 |# |
16 -> 32 : 14 |# |
32 -> 64 : 39 |# |
64 -> 128 : 1556 |###################### |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 3027 |######################################|
1 -> 2 : 19 |# |
2 -> 4 : 6 |# |
4 -> 8 : 5 |# |
8 -> 16 : 3 |# |
16 -> 32 : 7 |# |
32 -> 64 : 14 |# |
64 -> 128 : 540 |####### |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 2939 |######################################|
1 -> 2 : 25 |# |
2 -> 4 : 15 |# |
4 -> 8 : 2 |# |
8 -> 16 : 3 |# |
16 -> 32 : 7 |# |
32 -> 64 : 17 |# |
64 -> 128 : 936 |############# |
Ending tracing...
It's multi-modal, with most I/O taking 0 to 1 milliseconds, then many between
64 and 128 milliseconds. This is how it looks in iostat:
# iostat -x 1
avg-cpu: %user %nice %system %iowait %steal %idle
0.52 0.00 12.37 32.99 0.00 54.12
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.00 12.00 0.00 156.00 0.00 19968.00 256.00 52.17 184.38 0.00 184.38 2.33 36.40
xvdb 0.00 0.00 298.00 0.00 2732.00 0.00 18.34 0.04 0.12 0.12 0.00 0.11 3.20
xvdc 0.00 0.00 297.00 0.00 2712.00 0.00 18.26 0.08 0.27 0.27 0.00 0.24 7.20
md0 0.00 0.00 595.00 0.00 5444.00 0.00 18.30 0.00 0.00 0.00 0.00 0.00 0.00
Fortunately, it turns out that the high latency is to xvdap1, which is for files
from a low priority application (processing and writing log files). A high
priority application is reading from the other disks, xvdb and xvdc.
Examining xvdap1 only:
# ./iolatency -d 202,1 5
Tracing block I/O. Output every 5 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 38 |## |
1 -> 2 : 18 |# |
2 -> 4 : 0 | |
4 -> 8 : 0 | |
8 -> 16 : 5 |# |
16 -> 32 : 11 |# |
32 -> 64 : 26 |## |
64 -> 128 : 894 |######################################|
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 75 |### |
1 -> 2 : 11 |# |
2 -> 4 : 0 | |
4 -> 8 : 4 |# |
8 -> 16 : 4 |# |
16 -> 32 : 7 |# |
32 -> 64 : 13 |# |
64 -> 128 : 1141 |######################################|
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 61 |######## |
1 -> 2 : 21 |### |
2 -> 4 : 5 |# |
4 -> 8 : 1 |# |
8 -> 16 : 5 |# |
16 -> 32 : 7 |# |
32 -> 64 : 19 |### |
64 -> 128 : 324 |######################################|
128 -> 256 : 7 |# |
256 -> 512 : 26 |#### |
^C
Ending tracing...
And now xvdb:
# ./iolatency -d 202,16 5
Tracing block I/O. Output every 5 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1427 |######################################|
1 -> 2 : 5 |# |
2 -> 4 : 3 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1409 |######################################|
1 -> 2 : 6 |# |
2 -> 4 : 1 |# |
4 -> 8 : 1 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1478 |######################################|
1 -> 2 : 6 |# |
2 -> 4 : 5 |# |
4 -> 8 : 0 | |
8 -> 16 : 2 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1437 |######################################|
1 -> 2 : 5 |# |
2 -> 4 : 7 |# |
4 -> 8 : 0 | |
8 -> 16 : 1 |# |
[...]
While that's much better, it is reaching the 8 - 16 millisecond range,
and these are SSDs with a light workload (~1500 IOPS).
I already know from iosnoop(8) analysis the reason for these high latency
outliers: they are queued behind writes. However, these writes are to a
different disk -- somewhere in this virtualized guest (Xen) there may be a
shared I/O queue.
One way to explore this is to reduce the queue length for the low priority disk,
so that it is less likely to pollute any shared queue. (There are other ways to
investigate and fix this too.) Here I reduce the disk queue length from its
default of 128 to 4:
# echo 4 > /sys/block/xvda1/queue/nr_requests
The overall distribution looks much better:
# ./iolatency 5
Tracing block I/O. Output every 5 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 3005 |######################################|
1 -> 2 : 19 |# |
2 -> 4 : 9 |# |
4 -> 8 : 45 |# |
8 -> 16 : 859 |########### |
16 -> 32 : 16 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 2959 |######################################|
1 -> 2 : 43 |# |
2 -> 4 : 16 |# |
4 -> 8 : 39 |# |
8 -> 16 : 1009 |############# |
16 -> 32 : 76 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 3031 |######################################|
1 -> 2 : 27 |# |
2 -> 4 : 9 |# |
4 -> 8 : 24 |# |
8 -> 16 : 422 |###### |
16 -> 32 : 5 |# |
^C
Ending tracing...
Latency only reaching 32 ms.
Our important disk didn't appear to change much -- maybe a slight improvement
to the outliers:
# ./iolatency -d 202,16 5
Tracing block I/O. Output every 5 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1449 |######################################|
1 -> 2 : 6 |# |
2 -> 4 : 5 |# |
4 -> 8 : 1 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1519 |######################################|
1 -> 2 : 12 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1466 |######################################|
1 -> 2 : 2 |# |
2 -> 4 : 3 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1460 |######################################|
1 -> 2 : 4 |# |
2 -> 4 : 7 |# |
[...]
And here's the other disk after the queue length change:
# ./iolatency -d 202,1 5
Tracing block I/O. Output every 5 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 85 |### |
1 -> 2 : 12 |# |
2 -> 4 : 21 |# |
4 -> 8 : 76 |## |
8 -> 16 : 1539 |######################################|
16 -> 32 : 10 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 123 |################## |
1 -> 2 : 8 |## |
2 -> 4 : 6 |# |
4 -> 8 : 17 |### |
8 -> 16 : 270 |######################################|
16 -> 32 : 2 |# |
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 91 |### |
1 -> 2 : 23 |# |
2 -> 4 : 8 |# |
4 -> 8 : 71 |### |
8 -> 16 : 1223 |######################################|
16 -> 32 : 12 |# |
^C
Ending tracing...
Much better looking distribution.
Use -h to print the USAGE message:
# ./iolatency -h
USAGE: iolatency [-hQT] [-d device] [-i iotype] [interval [count]]
-d device # device string (eg, "202,1")
-i iotype # match type (eg, '*R*' for all reads)
-Q # use queue insert as start time
-T # timestamp on output
-h # this usage message
interval # summary interval, seconds (default 1)
count # number of summaries
eg,
iolatency # summarize latency every second
iolatency -Q # include block I/O queue time
iolatency 5 2 # 2 x 5 second summaries
iolatency -i '*R*' # trace reads
iolatency -d 202,1 # trace device 202,1 only
See the man page and example file for more info.
Demonstrations of iosnoop, the Linux ftrace version.
Here's Linux 3.16, tracing tar archiving a filesystem:
# ./iosnoop
Tracing block I/O... Ctrl-C to end.
COMM PID TYPE DEV BLOCK BYTES LATms
supervise 1809 W 202,1 17039968 4096 1.32
supervise 1809 W 202,1 17039976 4096 1.30
tar 14794 RM 202,1 8457608 4096 7.53
tar 14794 RM 202,1 8470336 4096 14.90
tar 14794 RM 202,1 8470368 4096 0.27
tar 14794 RM 202,1 8470784 4096 7.74
tar 14794 RM 202,1 8470360 4096 0.25
tar 14794 RM 202,1 8469968 4096 0.24
tar 14794 RM 202,1 8470240 4096 0.24
tar 14794 RM 202,1 8470392 4096 0.23
tar 14794 RM 202,1 8470544 4096 5.96
tar 14794 RM 202,1 8470552 4096 0.27
tar 14794 RM 202,1 8470384 4096 0.24
[...]
The "tar" I/O looks like it is slightly random (based on BLOCK) and 4 Kbytes
in size (BYTES). One returned in 14.9 milliseconds, but the rest were fast,
so fast (0.24 ms) some may be returning from some level of cache (disk or
controller).
The "RM" TYPE means Read of Metadata. The start of the trace shows a
couple of Writes by supervise PID 1809.
Here's a deliberate random I/O workload:
# ./iosnoop
Tracing block I/O. Ctrl-C to end.
COMM PID TYPE DEV BLOCK BYTES LATms
randread 9182 R 202,32 30835224 8192 0.18
randread 9182 R 202,32 21466088 8192 0.15
randread 9182 R 202,32 13529496 8192 0.16
randread 9182 R 202,16 21250648 8192 0.18
randread 9182 R 202,16 1536776 32768 0.30
randread 9182 R 202,32 17157560 24576 0.23
randread 9182 R 202,32 21313320 8192 0.16
randread 9182 R 202,32 862184 8192 0.18
randread 9182 R 202,16 25496872 8192 0.21
randread 9182 R 202,32 31471768 8192 0.18
randread 9182 R 202,16 27571336 8192 0.20
randread 9182 R 202,16 30783448 8192 0.16
randread 9182 R 202,16 21435224 8192 1.28
randread 9182 R 202,16 970616 8192 0.15
randread 9182 R 202,32 13855608 8192 0.16
randread 9182 R 202,32 17549960 8192 0.15
randread 9182 R 202,32 30938232 8192 0.14
[...]
Note the changing offsets. The resulting latencies are very good in this case,
because the storage devices are flash memory-based solid state disks (SSDs).
For rotational disks, I'd expect these latencies to be roughly 10 ms.
Here's an idle Linux 3.2 system:
# ./iosnoop
Tracing block I/O. Ctrl-C to end.
COMM PID TYPE DEV BLOCK BYTES LATms
supervise 3055 W 202,1 12852496 4096 0.64
supervise 3055 W 202,1 12852504 4096 1.32
supervise 3055 W 202,1 12852800 4096 0.55
supervise 3055 W 202,1 12852808 4096 0.52
jbd2/xvda1-212 212 WS 202,1 1066720 45056 41.52
jbd2/xvda1-212 212 WS 202,1 1066808 12288 41.52
jbd2/xvda1-212 212 WS 202,1 1066832 4096 32.37
supervise 3055 W 202,1 12852800 4096 14.28
supervise 3055 W 202,1 12855920 4096 14.07
supervise 3055 W 202,1 12855960 4096 0.67
supervise 3055 W 202,1 12858208 4096 1.00
flush:1-409 409 W 202,1 12939640 12288 18.00
[...]
This shows supervise doing various writes from PID 3055. The highest latency
was from jbd2/xvda1-212, the journaling block device driver, doing
synchronous writes (TYPE = WS).
Options can be added to show the start time (-s) and end time (-t):
# ./iosnoop -ts
Tracing block I/O. Ctrl-C to end.
STARTs ENDs COMM PID TYPE DEV BLOCK BYTES LATms
5982800.302061 5982800.302679 supervise 1809 W 202,1 17039600 4096 0.62
5982800.302423 5982800.302842 supervise 1809 W 202,1 17039608 4096 0.42
5982800.304962 5982800.305446 supervise 1801 W 202,1 17039616 4096 0.48
5982800.305250 5982800.305676 supervise 1801 W 202,1 17039624 4096 0.43
5982800.308849 5982800.309452 supervise 1810 W 202,1 12862464 4096 0.60
5982800.308856 5982800.309470 supervise 1806 W 202,1 17039632 4096 0.61
5982800.309206 5982800.309740 supervise 1806 W 202,1 17039640 4096 0.53
5982800.309211 5982800.309805 supervise 1810 W 202,1 12862472 4096 0.59
5982800.309332 5982800.309953 supervise 1812 W 202,1 17039648 4096 0.62
5982800.309676 5982800.310283 supervise 1812 W 202,1 17039656 4096 0.61
[...]
This is useful when gathering I/O event data for post-processing.
Now for matching on a single PID:
# ./iosnoop -p 1805
Tracing block I/O issued by PID 1805. Ctrl-C to end.
COMM PID TYPE DEV BLOCK BYTES LATms
supervise 1805 W 202,1 17039648 4096 0.68
supervise 1805 W 202,1 17039672 4096 0.60
supervise 1805 W 202,1 17040040 4096 0.62
supervise 1805 W 202,1 17040056 4096 0.47
supervise 1805 W 202,1 17040624 4096 0.49
supervise 1805 W 202,1 17040632 4096 0.44
^C
Ending tracing...
This option works by using an in-kernel filter for that PID on I/O issue. There
is also a "-n" option to match on process names, however, that currently does so
in user space, so is less efficient.
I would say that this will generally identify the origin process, but there will
be an error margin. Depending on the file system, block I/O queueing, and I/O
subsystem, this could miss events that aren't issued in this PID context but are
related to this PID (eg, triggering a read readahead on the completion of
previous I/O. Again, whether this happens is up to the file system and storage
subsystem). You can try the -Q option for more reliable process identification.
The -Q option begins tracing on block I/O queue insert, instead of issue.
Here's before and after, while dd(1) writes a large file:
# ./iosnoop
Tracing block I/O. Ctrl-C to end.
COMM PID TYPE DEV BLOCK BYTES LATms
dd 26983 WS 202,16 4064416 45056 16.70
dd 26983 WS 202,16 4064504 45056 16.72
dd 26983 WS 202,16 4064592 45056 16.74
dd 26983 WS 202,16 4064680 45056 16.75
cat 27031 WS 202,16 4064768 45056 16.56
cat 27031 WS 202,16 4064856 45056 16.46
cat 27031 WS 202,16 4064944 45056 16.40
gawk 27030 WS 202,16 4065032 45056 0.88
gawk 27030 WS 202,16 4065120 45056 1.01
gawk 27030 WS 202,16 4065208 45056 16.15
gawk 27030 WS 202,16 4065296 45056 16.16
gawk 27030 WS 202,16 4065384 45056 16.16
[...]
The output here shows the block I/O time from issue to completion (LATms),
which is largely representative of the device.
The process names and PIDs identify dd, cat, and gawk. By default iosnoop shows
who is on-CPU at time of block I/O issue, but these may not be the processes
that originated the I/O. In this case (having debugged it), the reason is that
processes such as cat and gawk are making hypervisor calls (this is a Xen
guest instance), eg, for memory operations, and during hypervisor processing a
queue of pending work is checked and dispatched. So cat and gawk were on-CPU
when the block device I/O was issued, but they didn't originate it.
Now the -Q option is used:
# ./iosnoop -Q
Tracing block I/O. Ctrl-C to end.
COMM PID TYPE DEV BLOCK BYTES LATms
kjournald 1217 WS 202,16 6132200 45056 141.12
kjournald 1217 WS 202,16 6132288 45056 141.10
kjournald 1217 WS 202,16 6132376 45056 141.10
kjournald 1217 WS 202,16 6132464 45056 141.11
kjournald 1217 WS 202,16 6132552 40960 141.11
dd 27718 WS 202,16 6132624 4096 0.18
flush:16-1279 1279 W 202,16 6132632 20480 0.52
flush:16-1279 1279 W 202,16 5940856 4096 0.50
flush:16-1279 1279 W 202,16 5949056 4096 0.52
flush:16-1279 1279 W 202,16 5957256 4096 0.54
flush:16-1279 1279 W 202,16 5965456 4096 0.56
flush:16-1279 1279 W 202,16 5973656 4096 0.58
flush:16-1279 1279 W 202,16 5981856 4096 0.60
flush:16-1279 1279 W 202,16 5990056 4096 0.63
[...]
This uses the block_rq_insert tracepoint as the starting point of I/O, instead
of block_rq_issue. This makes the following differences to columns and options:
- COMM: more likely to show the originating process.
- PID: more likely to show the originating process.
- LATms: shows the I/O time, including time spent on the block I/O queue.
- STARTs (not shown above): shows the time of queue insert, not I/O issue.
- -p PID: more likely to match the originating process.
- -n name: more likely to match the originating process.
The reason that this ftrace-based iosnoop does not just instrument both insert
and issue tracepoints is one of overhead. Even with buffering, iosnoop can
have difficulty under high load.
If I want to capture events for post-processing, I use the duration mode, which
not only lets me set the duration, but also uses buffering, which reduces the
overheads of tracing.
Capturing 5 seconds, with both start timestamps (-s) and end timestamps (-t):
# time ./iosnoop -ts 5 > out
real 0m5.566s
user 0m0.336s
sys 0m0.140s
# wc out
27010 243072 2619744 out
This server is doing over 5,000 disk IOPS. Even with buffering, this did
consume a measurable amount of CPU to capture: 0.48 seconds of CPU time in
total. Note that the run took 5.57 seconds: this is 5 seconds for the capture,
followed by the CPU time for iosnoop to fetch and process the buffer.
Now tracing for 30 seconds:
# time ./iosnoop -ts 30 > out
real 0m31.207s
user 0m0.884s
sys 0m0.472s
# wc out
64259 578313 6232898 out
Since it's the same server and workload, this should have over 150k events,
but only has 64k. The tracing buffer has overflowed, and events have been
dropped. If I really must capture this many events, I can either increase
the trace buffer size (it's the bufsize_kb setting in the script), or, use
a different tracer (perf_events, SystemTap, ktap, etc.). If the IOPS rate is low
(eg, less than 5k), then unbuffered (no duration), despite the higher overheads,
may be sufficient, and will keep capturing events until Ctrl-C.
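A back-of-the-envelope check can tell you whether a planned capture is likely
to overflow. This sketch assumes ~97 bytes per event (derived from the wc
output above: 6232898 bytes / 64259 lines); the buffer size argument is
whatever bufsize_kb is set to in the script.

```shell
#!/bin/sh
# Sketch: estimate whether a capture fits the trace buffer, assuming
# ~97 bytes per event (rough figure from the wc output above).
fits() {  # fits IOPS SECONDS BUFSIZE_KB
    events=$(( $1 * $2 ))
    capacity=$(( $3 * 1024 / 97 ))
    if [ "$events" -gt "$capacity" ]; then
        echo "likely overflow: $events events vs ~$capacity buffer capacity"
    else
        echo "fits: $events events, ~$capacity buffer capacity"
    fi
}
fits 5000 30 4096
```

For the 30-second capture above at 5k IOPS, this predicts an overflow, which
matches the dropped events observed.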
Here's an example of digging into the sequence of I/O to explain an outlier.
My randread program on an SSD server (which is an AWS EC2 instance) usually
experiences about 0.15 ms I/O latency, but there are some outliers as high as
20 milliseconds. Here's an excerpt:
# ./iosnoop -ts > out
# more out
Tracing block I/O. Ctrl-C to end.
STARTs ENDs COMM PID TYPE DEV BLOCK BYTES LATms
6037559.121523 6037559.121685 randread 22341 R 202,32 29295416 8192 0.16
6037559.121719 6037559.121874 randread 22341 R 202,16 27515304 8192 0.16
[...]
6037595.999508 6037596.000051 supervise 1692 W 202,1 12862968 4096 0.54
6037595.999513 6037596.000144 supervise 1687 W 202,1 17040160 4096 0.63
6037595.999634 6037596.000309 supervise 1693 W 202,1 17040168 4096 0.68
6037595.999937 6037596.000440 supervise 1693 W 202,1 17040176 4096 0.50
6037596.000579 6037596.001192 supervise 1689 W 202,1 17040184 4096 0.61
6037596.000826 6037596.001360 supervise 1689 W 202,1 17040192 4096 0.53
6037595.998302 6037596.018133 randread 22341 R 202,32 954168 8192 20.03
6037595.998303 6037596.018150 randread 22341 R 202,32 954200 8192 20.05
6037596.018182 6037596.018347 randread 22341 R 202,32 18836600 8192 0.16
[...]
It's important to sort on the I/O completion time (ENDs). In this case it's
already in the correct order.
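When it isn't already ordered, sorting on ENDs (the 2nd column) is
straightforward. This sketch uses a few inline sample lines from the excerpt
above; on the real file it would just be "sort -n -k2,2 out" (after skipping
the header lines).

```shell
#!/bin/sh
# Sketch: order iosnoop -ts events by completion time (ENDs, column 2).
sorted=$(sort -n -k2,2 <<'EOF'
6037595.998302 6037596.018133 randread 22341 R 202,32 954168 8192 20.03
6037595.999508 6037596.000051 supervise 1692 W 202,1 12862968 4096 0.54
6037596.018182 6037596.018347 randread 22341 R 202,32 18836600 8192 0.16
EOF
)
echo "$sorted"
```

After sorting, the supervise write completes first, followed by the two
randread completions.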
So my 20 ms reads happened after a large group of supervise writes were
completed (I truncated dozens of supervise write lines to keep this example
short). Other latency outliers in this output file showed the same sequence:
slow reads after a batch of writes.
Note the I/O request timestamp (STARTs), which shows that these 20 ms reads were
issued before the supervise writes -- so they had been sitting on a queue. I've
debugged this type of issue many times before, but this one is different: those
writes were to a different device (202,1), so I would have assumed they would be
on different queues, and wouldn't interfere with each other. Somewhere in this
system (Xen guest) it looks like there is a shared queue. (Having just
discovered this using iosnoop, I can't yet tell you which queue, but I'd hope
that after identifying it there would be a way to tune its queueing behavior,
so that we can eliminate or reduce the severity of these outliers.)
Use -h to print the USAGE message:
# ./iosnoop -h
USAGE: iosnoop [-hQst] [-d device] [-i iotype] [-p PID] [-n name]
[duration]
-d device # device string (eg, "202,1")
-i iotype # match type (eg, '*R*' for all reads)
-n name # process name to match on I/O issue
-p PID # PID to match on I/O issue
-Q # use queue insert as start time
-s # include start time of I/O (s)
-t # include completion time of I/O (s)
-h # this usage message
duration # duration seconds, and use buffers
eg,
iosnoop # watch block I/O live (unbuffered)
iosnoop 1 # trace 1 sec (buffered)
iosnoop -Q # include queueing time in LATms
iosnoop -ts # include start and end timestamps
iosnoop -i '*R*' # trace reads
iosnoop -p 91 # show I/O issued when PID 91 is on-CPU
iosnoop -Qp 91 # show I/O queued by PID 91, queue time
See the man page and example file for more info.
Demonstrations of killsnoop, the Linux ftrace version.
What signals are happening on my system?
# ./killsnoop
Tracing kill()s. Ctrl-C to end.
COMM PID TPID SIGNAL RETURN
postgres 2209 2148 10 0
postgres 5416 2209 12 0
postgres 5416 2209 12 0
supervise 2135 5465 15 0
supervise 2135 5465 18 0
^C
Ending tracing...
The first line of output shows that PID 2209, process name "postgres", has
sent a signal 10 (SIGUSR1) to target PID 2148. This signal returned success (0).
killsnoop traces the kill() syscall, which is used to send signals to other
processes. These signals can include SIGKILL and SIGTERM, both of which
ultimately kill the target process (in different fashions), but the signals
may also include other operations, including checking if a process still
exists (signal 0). To read more about signals, see "man -s7 signal".
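The signal-0 existence check can be done from the shell with kill -0. A small
sketch, using the current shell's own PID so the check is guaranteed to
succeed:

```shell
#!/bin/sh
# Sketch of the "signal 0" check described above: kill -0 delivers no
# signal; it only reports whether the target process can be signaled.
target=$$          # our own shell, so the check will succeed
if kill -0 "$target" 2>/dev/null; then
    result="exists"
else
    result="gone"
fi
echo "PID $target $result"
```

Note that kill -0 can also fail with EPERM for a live process owned by
another user, so "gone" really means "not signalable by us".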
killsnoop can be useful to identify why some processes are abruptly and
unexpectedly ending (also check for the OOM killer in dmesg).
The -s option can be used to print signal names instead of numbers:
# ./killsnoop -s
Tracing kill()s. Ctrl-C to end.
COMM PID KILLED SIGNAL RETURN
postgres 2209 2148 SIGUSR1 0
postgres 5665 2209 SIGUSR2 0
postgres 5665 2209 SIGUSR2 0
supervise 2135 5711 SIGTERM 0
supervise 2135 5711 SIGCONT 0
bash 27450 27450 0 0
[...]
On the last line: there wasn't a nice signal name for signal 0, so just numeric
0 is printed. You'll see signal 0's used to check if processes still exist.
Use -h to print the USAGE message:
# ./killsnoop -h
USAGE: killsnoop [-ht] [-d secs] [-p PID] [-n name] [filename]
-d seconds # trace duration, and use buffers
-n name # process name to match
-p PID # PID to match on kill issue
-t # include time (seconds)
-s # human readable signal names
-h # this usage message
eg,
killsnoop # watch kill()s live (unbuffered)
killsnoop -d 1 # trace 1 sec (buffered)
killsnoop -p 181 # trace kill()s issued to PID 181 only
See the man page and example file for more info.
Demonstrations of kprobe, the Linux ftrace version.
This traces the kernel do_sys_open() function, when it is called:
# ./kprobe p:do_sys_open
Tracing kprobe do_sys_open. Ctrl-C to end.
kprobe-26042 [001] d... 6910441.001452: do_sys_open: (do_sys_open+0x0/0x220)
kprobe-26042 [001] d... 6910441.001475: do_sys_open: (do_sys_open+0x0/0x220)
kprobe-26042 [001] d... 6910441.001866: do_sys_open: (do_sys_open+0x0/0x220)
kprobe-26042 [001] d... 6910441.001966: do_sys_open: (do_sys_open+0x0/0x220)
supervise-1689 [000] d... 6910441.083302: do_sys_open: (do_sys_open+0x0/0x220)
supervise-1693 [001] d... 6910441.083530: do_sys_open: (do_sys_open+0x0/0x220)
supervise-1689 [000] d... 6910441.083759: do_sys_open: (do_sys_open+0x0/0x220)
supervise-1693 [001] d... 6910441.083877: do_sys_open: (do_sys_open+0x0/0x220)
[...]
The "p:" is for creating a probe. Use "r:" to probe the return of the function:
# ./kprobe r:do_sys_open
Tracing kprobe do_sys_open. Ctrl-C to end.
kprobe-29475 [001] d... 6910688.229777: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
<...>-29476 [001] d... 6910688.231101: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
<...>-29476 [001] d... 6910688.231123: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
<...>-29476 [001] d... 6910688.231530: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
<...>-29476 [001] d... 6910688.231624: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
supervise-1685 [001] d... 6910688.328776: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
supervise-1689 [000] d... 6910688.328780: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
[...]
This output includes the function that the traced function is returning to.
The trace output can be a little different between kernel versions. Use -H to
print the header:
# ./kprobe -H p:do_sys_open
Tracing kprobe do_sys_open. Ctrl-C to end.
# tracer: nop
#
# entries-in-buffer/entries-written: 4/4 #P:2
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
kprobe-27952 [001] d... 6910580.008086: do_sys_open: (do_sys_open+0x0/0x220)
kprobe-27952 [001] d... 6910580.008109: do_sys_open: (do_sys_open+0x0/0x220)
kprobe-27952 [001] d... 6910580.008483: do_sys_open: (do_sys_open+0x0/0x220)
[...]
These columns are explained in the kernel source under Documentation/trace/ftrace.txt.
This traces do_sys_open() returns, using a probe alias "myopen", and showing
the return value ($retval):
# ./kprobe 'r:myopen do_sys_open $retval'
Tracing kprobe myopen. Ctrl-C to end.
kprobe-26386 [001] d... 6593278.858754: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
<...>-26387 [001] d... 6593278.860043: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
<...>-26387 [001] d... 6593278.860064: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
<...>-26387 [001] d... 6593278.860433: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
<...>-26387 [001] d... 6593278.860521: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
supervise-1685 [001] d... 6593279.178806: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1689 [001] d... 6593279.228756: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1689 [001] d... 6593279.229106: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1688 [000] d... 6593279.229501: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1695 [000] d... 6593279.229944: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1685 [001] d... 6593279.230104: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1687 [001] d... 6593279.230293: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1699 [000] d... 6593279.230381: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1692 [000] d... 6593279.230825: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1698 [000] d... 6593279.230915: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1698 [000] d... 6593279.231277: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
supervise-1690 [000] d... 6593279.231703: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
^C
Ending tracing...
The string specified, 'r:myopen do_sys_open $retval', is a kprobe definition,
and is the same as those documented in the Linux kernel source under
Documentation/trace/kprobetrace.txt, which can be written to the
/sys/kernel/debug/tracing/kprobe_events file.
Apart from probe name aliases, you can also provide arbitrary names for
arguments. Eg, instead of the "arg1" default, calling it "rval":
# ./kprobe 'r:myopen do_sys_open rval=$retval'
Tracing kprobe myopen. Ctrl-C to end.
kprobe-27454 [001] d... 6593356.250019: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
<...>-27455 [001] d... 6593356.251280: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
<...>-27455 [001] d... 6593356.251301: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
<...>-27455 [001] d... 6593356.251672: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
<...>-27455 [001] d... 6593356.251769: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
supervise-1689 [000] d... 6593356.859758: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
supervise-1689 [000] d... 6593356.860143: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
supervise-1696 [000] d... 6593356.862682: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
supervise-1685 [001] d... 6593356.862684: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
[...]
That's a bit better.
Tracing the open() mode:
# ./kprobe 'p:myopen do_sys_open mode=%cx:u16'
Tracing kprobe myopen. Ctrl-C to end.
kprobe-29572 [001] d... 6593503.353923: myopen: (do_sys_open+0x0/0x220) mode=0x1
kprobe-29572 [001] d... 6593503.353945: myopen: (do_sys_open+0x0/0x220) mode=0x0
kprobe-29572 [001] d... 6593503.354307: myopen: (do_sys_open+0x0/0x220) mode=0x5c00
kprobe-29572 [001] d... 6593503.354401: myopen: (do_sys_open+0x0/0x220) mode=0x0
supervise-1689 [000] d... 6593503.944125: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
supervise-1688 [001] d... 6593503.944125: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
supervise-1688 [001] d... 6593503.944606: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
supervise-1689 [000] d... 6593503.944606: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
supervise-1698 [000] d... 6593503.944728: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
supervise-1698 [000] d... 6593503.945077: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
[...]
Here I guessed that the mode was in register %cx, and cast it as a 16-bit
unsigned integer (":u16"). Your platform and kernel may be different, and the
mode may be in a different register. If fiddling with such registers becomes too
painful or unreliable for you, consider installing kernel debuginfo and using
the named variables with perf_events "perf probe".
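The hex modes above are easier to read as octal permissions. For example, the
mode=0x1a4 lines are the familiar 0644 (this conversion is plain arithmetic,
nothing specific to kprobe):

```shell
# 0x1a4 == 0644: rw-r--r--, the usual mode for supervise/status.new
printf '%o\n' 0x1a4    # prints 644
```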
Tracing the open() filename:
# ./kprobe 'p:myopen do_sys_open filename=+0(%si):string'
Tracing kprobe myopen. Ctrl-C to end.
kprobe-32369 [001] d... 6593706.999728: myopen: (do_sys_open+0x0/0x220) filename="/etc/ld.so.cache"
kprobe-32369 [001] d... 6593706.999748: myopen: (do_sys_open+0x0/0x220) filename="/lib/x86_64-linux-gnu/libc.so.6"
kprobe-32369 [001] d... 6593707.000092: myopen: (do_sys_open+0x0/0x220) filename="/usr/lib/locale/locale-archive"
kprobe-32369 [001] d... 6593707.000176: myopen: (do_sys_open+0x0/0x220) filename="trace_pipe"
supervise-1699 [000] d... 6593707.254970: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1689 [001] d... 6593707.254970: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1689 [001] d... 6593707.255432: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1699 [000] d... 6593707.255432: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1695 [001] d... 6593707.258805: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
[...]
As mentioned previously, the %si register may be different on your platform.
In this example, I cast it as a string.
Specifying a duration will buffer in-kernel (reducing overhead), and write at
the end. Here's tracing for 10 seconds, and writing to the "out" file:
# ./kprobe -d 10 'p:myopen do_sys_open filename=+0(%si):string' > out
You can match on a single PID only:
# ./kprobe -p 1696 'p:myopen do_sys_open filename=+0(%si):string'
Tracing kprobe myopen. Ctrl-C to end.
supervise-1696 [001] d... 6593773.677033: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1696 [001] d... 6593773.677332: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1696 [001] d... 6593774.697144: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1696 [001] d... 6593774.697675: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1696 [001] d... 6593775.717986: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
supervise-1696 [001] d... 6593775.718499: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
^C
Ending tracing...
This will only show events when that PID is on-CPU.
The -v option will show you the available variables you can use in custom
filters:
# ./kprobe -v 'p:myopen do_sys_open filename=+0(%si):string'
name: myopen
ID: 1443
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:unsigned long __probe_ip; offset:8; size:8; signed:0;
field:__data_loc char[] filename; offset:16; size:4; signed:1;
print fmt: "(%lx) filename=\"%s\"", REC->__probe_ip, __get_str(filename)
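To pull out just the usable variable names from output like the above, a little
awk over the "field:" lines works. This is a sketch run against an inline copy
of the format text; on a real system you would read the event's format file
under /sys/kernel/debug/tracing (root required):

```shell
# Extract the variable names (the last word of each "field:" declaration):
fmt='field:unsigned short common_type; offset:0; size:2; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:__data_loc char[] filename; offset:16; size:4; signed:1;'

printf '%s\n' "$fmt" | awk -F'[;:]' '/field:/ { n = $2; sub(/.* /, "", n); print n }'
# prints: common_type, common_pid, filename (one per line)
```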
Tracing filenames that end in "stat", by adding a filter:
# ./kprobe 'p:myopen do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'
Tracing kprobe myopen. Ctrl-C to end.
postgres-1172 [000] d... 6594028.787166: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797410: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797467: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-4443 [001] d... 6594028.800908: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-4443 [000] d... 6594028.811237: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-4443 [000] d... 6594028.811290: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
^C
Ending tracing...
This filtering is done in-kernel context.
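The "~" operator does glob matching, with the same semantics as a shell "case"
pattern. A quick way to sanity-check a filter pattern before tracing with it
(the match function here is just for illustration):

```shell
# Shell "case" globs behave like the in-kernel "~" filter operator:
match() { case "$1" in *stat) echo "match: $1" ;; *) echo "no match: $1" ;; esac; }
match "pg_stat_tmp/pgstat.stat"    # prints match: pg_stat_tmp/pgstat.stat
match "/etc/ld.so.cache"           # prints no match: /etc/ld.so.cache
```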
As an example of tracing a deeper kernel function, let's trace bio_alloc() and
its entry registers:
# ./kprobe 'p:myprobe bio_alloc %ax %bx %cx %dx'
Tracing kprobe myprobe. Ctrl-C to end.
supervise-3055 [000] 2172148.728250: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
supervise-3055 [000] 2172148.728527: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
jbd2/xvda1-8-212 [000] 2172149.749474: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800ad1f87b8 arg3=ffff8800ba22c06c arg4=8
jbd2/xvda1-8-212 [000] 2172149.749485: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d053a8 arg3=10f16c5bb arg4=0
jbd2/xvda1-8-212 [000] 2172149.749487: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05958 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749488: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05b60 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749489: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05820 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749489: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d055b0 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749490: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88006ff22ea0 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749491: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d1f000 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749492: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d1f138 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749493: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d267138 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749494: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d267680 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.749495: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d2675b0 arg3=5 arg4=0
jbd2/xvda1-8-212 [000] 2172149.751044: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800cc241ea0 arg3=445f0300 arg4=ffff8800effba000
supervise-3055 [000] 2172149.751095: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
supervise-3055 [000] 2172149.751341: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
supervise-3055 [000] 2172150.772033: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
supervise-3055 [000] 2172150.772305: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
flush-202:1-409 [000] 2172151.087815: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800da51d6e8 arg3=16afd arg4=1
flush-202:1-409 [000] 2172151.087829: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7537f08 arg3=16afd arg4=2
flush-202:1-409 [000] 2172151.087844: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7519af8 arg3=16afd arg4=3
flush-202:1-409 [000] 2172151.087846: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7511478 arg3=16afd arg4=4
flush-202:1-409 [000] 2172151.087849: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e75e6a90 arg3=16afd arg4=5
flush-202:1-409 [000] 2172151.087851: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7512bc8 arg3=16afd arg4=6
flush-202:1-409 [000] 2172151.087853: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800eb3bf410 arg3=16afd arg4=7
^C
The output includes who is on-CPU, high resolution timestamps, and the arguments
we requested (registers %ax to %dx). These registers are platform dependent,
and are mapped by the compiler to the entry arguments of the function.
How are these useful? If you are debugging this kernel function, you'll know. :)
Note that you can add qualifiers, eg, if I knew %ax was a uint32:
# ./kprobe 'p:myprobe bio_alloc %ax:u32'
Tracing kprobe myprobe. Ctrl-C to end.
supervise-3055 [000] 2172389.734606: myprobe: (bio_alloc+0x0/0x30) arg1=64acf948
supervise-3055 [000] 2172389.734865: myprobe: (bio_alloc+0x0/0x30) arg1=64acc8d0
supervise-3055 [000] 2172390.772391: myprobe: (bio_alloc+0x0/0x30) arg1=64acf948
supervise-3055 [000] 2172390.772676: myprobe: (bio_alloc+0x0/0x30) arg1=64acc8d0
^C
Ending tracing...
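The ":u32" qualifier keeps only the low 32 bits of the register, which is why
arg1 went from values like ffff880064acf948 earlier to 64acf948 here. The
truncation is just a mask (sketch in bash arithmetic):

```shell
# :u32 truncates the 64-bit register value to its low 32 bits:
printf '%x\n' $(( 0xffff880064acf948 & 0xffffffff ))    # prints 64acf948
```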
You can give them aliases too, instead of the default arg1..N:
# ./kprobe 'p:myprobe bio_alloc ax=%ax'
Tracing kprobe myprobe. Ctrl-C to end.
supervise-3055 [000] 2172420.451663: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
supervise-3055 [000] 2172420.451938: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acf948
flush-202:1-409 [000] 2172421.163462: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
supervise-3055 [000] 2172421.500994: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
supervise-3055 [000] 2172421.501307: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acf948
^C
Ending tracing...
Now for the return of bio_alloc():
# ./kprobe 'r:myprobe bio_alloc $retval'
Tracing kprobe myprobe. Ctrl-C to end.
supervise-3055 [000] 2172164.145533: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e55843c0
supervise-3055 [000] 2172164.145829: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e5584840
jbd2/xvda1-8-212 [000] 2172165.166453: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57596c0
jbd2/xvda1-8-212 [000] 2172165.166493: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759c00
jbd2/xvda1-8-212 [000] 2172165.166496: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759600
jbd2/xvda1-8-212 [000] 2172165.166497: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759e40
jbd2/xvda1-8-212 [000] 2172165.166498: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57590c0
jbd2/xvda1-8-212 [000] 2172165.166500: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57599c0
jbd2/xvda1-8-212 [000] 2172165.166500: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759a80
jbd2/xvda1-8-212 [000] 2172165.166502: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759f00
jbd2/xvda1-8-212 [000] 2172165.166503: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759540
jbd2/xvda1-8-212 [000] 2172165.166504: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759180
jbd2/xvda1-8-212 [000] 2172165.166504: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759900
jbd2/xvda1-8-212 [000] 2172165.166505: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759000
jbd2/xvda1-8-212 [000] 2172165.166506: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759480
<...>-212 [000] 2172165.176261: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759480
supervise-3055 [000] 2172165.176317: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e57596c0
supervise-3055 [000] 2172165.176586: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e5759900
^C
Ending tracing...
Great. This output includes the function we are returning to, in most cases,
submit_bh().
Note that this mode (without a duration) prints events as they happen,
so the overheads can be high for frequent events. You could try the -d mode,
which buffers in-kernel.
The -s option will print the kernel stack trace after the event:
# ./kprobe -s 'p:mytcp tcp_init_cwnd'
Tracing kprobe mytcp. Ctrl-C to end.
sshd-5121 [000] d... 6897275.911301: mytcp: (tcp_init_cwnd+0x0/0x40)
sshd-5121 [000] d... 6897275.911309:
=> tcp_write_xmit
=> __tcp_push_pending_frames
=> tcp_push
=> tcp_sendmsg
=> inet_sendmsg
=> sock_aio_write
=> do_sync_write
=> vfs_write
=> SyS_write
=> system_call_fastpath
sshd-32219 [000] d... 6897275.911467: mytcp: (tcp_init_cwnd+0x0/0x40)
sshd-32219 [000] d... 6897275.911471:
=> tcp_write_xmit
=> __tcp_push_pending_frames
=> tcp_push
=> tcp_sendmsg
=> inet_sendmsg
=> sock_aio_write
=> do_sync_write
=> vfs_write
=> SyS_write
=> system_call_fastpath
sshd-5121 [000] d... 6897277.878794: mytcp: (tcp_init_cwnd+0x0/0x40)
sshd-5121 [000] d... 6897277.878801:
=> tcp_write_xmit
=> __tcp_push_pending_frames
=> tcp_push
=> tcp_sendmsg
=> inet_sendmsg
=> sock_aio_write
=> do_sync_write
=> vfs_write
=> SyS_write
=> system_call_fastpath
This makes use of the kernel options/stacktrace feature.
Use -h to print the USAGE message:
# ./kprobe -h
USAGE: kprobe [-FhHsv] [-d secs] [-p PID] [-L TID] kprobe_definition [filter]
-F # force. trace despite warnings.
-d seconds # trace duration, and use buffers
-p PID # PID to match on events
-L TID # thread id to match on events
-v # view format file (don't trace)
-H # include column headers
-s # show kernel stack traces
-h # this usage message
Note that these examples may need modification to match your kernel
version's function names and platform's register usage.
eg,
kprobe p:do_sys_open
# trace open() entry
kprobe r:do_sys_open
# trace open() return
kprobe 'r:do_sys_open $retval'
# trace open() return value
kprobe 'r:myopen do_sys_open $retval'
# use a custom probe name
kprobe 'p:myopen do_sys_open mode=%cx:u16'
# trace open() file mode
kprobe 'p:myopen do_sys_open filename=+0(%si):string'
# trace open() with filename
kprobe -s 'p:myprobe tcp_retransmit_skb'
# show kernel stacks
kprobe 'p:do_sys_open file=+0(%si):string' 'file ~ "*stat"'
# opened files ending in "stat"
See the man page and example file for more info.
Demonstrations of opensnoop, the Linux ftrace version.
# ./opensnoop
Tracing open()s. Ctrl-C to end.
COMM PID FD FILE
opensnoop 5334 0x3
<...> 5343 0x3 /etc/ld.so.cache
opensnoop 5342 0x3 /etc/ld.so.cache
<...> 5343 0x3 /lib/x86_64-linux-gnu/libc.so.6
opensnoop 5342 0x3 /lib/x86_64-linux-gnu/libm.so.6
opensnoop 5342 0x3 /lib/x86_64-linux-gnu/libc.so.6
<...> 5343 0x3 /usr/lib/locale/locale-archive
<...> 5343 0x3 trace_pipe
supervise 1684 0x9 supervise/status.new
supervise 1684 0x9 supervise/status.new
supervise 1688 0x9 supervise/status.new
supervise 1688 0x9 supervise/status.new
supervise 1686 0x9 supervise/status.new
supervise 1685 0x9 supervise/status.new
supervise 1685 0x9 supervise/status.new
supervise 1686 0x9 supervise/status.new
[...]
The first several lines show opensnoop catching itself initializing.
Use -h to print the USAGE message:
# ./opensnoop -h
USAGE: opensnoop [-htx] [-d secs] [-p PID] [-L TID] [-n name] [filename]
-d seconds # trace duration, and use buffers
-n name # process name to match on open
-p PID # PID to match on open
-L TID # thread id to match on open
-t # include time (seconds)
-x # only show failed opens
-h # this usage message
filename # match filename (partials, REs, ok)
eg,
opensnoop # watch open()s live (unbuffered)
opensnoop -d 1 # trace 1 sec (buffered)
opensnoop -p 181 # trace I/O issued by PID 181 only
opensnoop conf # trace filenames containing "conf"
opensnoop 'log$' # filenames ending in "log"
See the man page and example file for more info.
Demonstrations of perf-stat-hist, the Linux perf_events version.
Tracing the net:net_dev_xmit tracepoint, and building a power-of-4 histogram
for the "len" variable, for 10 seconds:
# ./perf-stat-hist net:net_dev_xmit len 10
Tracing net:net_dev_xmit, power-of-4, max 1048576, for 10 seconds...
Range : Count Distribution
0 : 0 | |
1 -> 3 : 0 | |
4 -> 15 : 0 | |
16 -> 63 : 2 |# |
64 -> 255 : 30 |### |
256 -> 1023 : 3 |# |
1024 -> 4095 : 446 |######################################|
4096 -> 16383 : 0 | |
16384 -> 65535 : 0 | |
65536 -> 262143 : 0 | |
262144 -> 1048575 : 0 | |
1048576 -> : 0 | |
This showed that most of the network transmits were between 1024 and 4095 bytes,
with a handful between 64 and 255 bytes.
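The Distribution column is just each bucket's count scaled against the largest
count. A rough user-space sketch of that rendering (not perf-stat-hist's actual
code):

```shell
# Scale counts to 38-column hash bars; the largest count gets a full bar:
printf '2\n30\n3\n446\n' | awk '
  { c[NR] = $1; if ($1 > max) max = $1 }
  END {
    for (i = 1; i <= NR; i++) {
      n = max ? int(c[i] * 38 / max) : 0
      bar = ""; while (length(bar) < n) bar = bar "#"
      printf "%6d |%-38s|\n", c[i], bar
    }
  }'
```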
Cat the format file for the tracepoint to see what other variables are available
to trace. Eg:
# cat /sys/kernel/debug/tracing/events/net/net_dev_xmit/format
name: net_dev_xmit
ID: 1078
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:void * skbaddr; offset:8; size:8; signed:0;
field:unsigned int len; offset:16; size:4; signed:0;
field:int rc; offset:20; size:4; signed:1;
field:__data_loc char[] name; offset:24; size:4; signed:1;
print fmt: "dev=%s skbaddr=%p len=%u rc=%d", __get_str(name), REC->skbaddr, REC->len, REC->rc
That's where "len" came from.
This works by creating a series of tracepoint and filter pairs for each
histogram bucket, and doing in-kernel counts. The overhead should in many cases
be better than user space post-processing, however, this approach is still
not ideal. I've called it a "perf hacktogram". The overhead is relative to
the frequency of events, multiplied by the number of buckets. You can modify
the script to use power-of-2 instead, or whatever you like, but the overhead
for more buckets will be higher.
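The bucket boundaries themselves are just successive powers of the base up to
the max. For the default power-of-4 and max 1048576, the boundaries can be
generated like this (a sketch, not the tool's code):

```shell
# Power-of-4 bucket boundaries, one filter pair per bucket:
b=1; max=1048576
while [ "$b" -le "$max" ]; do printf '%s ' "$b"; b=$((b * 4)); done; echo
# prints: 1 4 16 64 256 1024 4096 16384 65536 262144 1048576
```

These match the Range column above: each bucket spans one boundary up to one
less than the next.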
Histogram of the returned read() syscall sizes:
# ./perf-stat-hist syscalls:sys_exit_read ret 10
Tracing syscalls:sys_exit_read, power-of-4, max 1048576, for 10 seconds...
Range : Count Distribution
0 : 90 |# |
1 -> 3 : 9587 |######################################|
4 -> 15 : 69 |# |
16 -> 63 : 590 |### |
64 -> 255 : 250 |# |
256 -> 1023 : 389 |## |
1024 -> 4095 : 296 |## |
4096 -> 16383 : 183 |# |
16384 -> 65535 : 12 |# |
65536 -> 262143 : 0 | |
262144 -> 1048575 : 0 | |
1048576 -> : 0 | |
Most of our read()s were tiny, between 1 and 3 bytes.
Using power-of-2, and a max of 1024:
# ./perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret
Tracing syscalls:sys_exit_read, power-of-2, max 1024, until Ctrl-C...
^C
Range : Count Distribution
-> -1 : 29 |## |
0 -> 0 : 1 |# |
1 -> 1 : 959 |######################################|
2 -> 3 : 1 |# |
4 -> 7 : 0 | |
8 -> 15 : 2 |# |
16 -> 31 : 14 |# |
32 -> 63 : 1 |# |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 1 |# |
1024 -> : 1 |# |
Specifying custom bucket sizes:
# ./perf-stat-hist -b "10 50 100 5000" syscalls:sys_exit_read ret
Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
^C
Range : Count Distribution
-> 9 : 989 |######################################|
10 -> 49 : 5 |# |
50 -> 99 : 0 | |
100 -> 4999 : 2 |# |
5000 -> : 0 | |
Specifying a single value to bifurcate statistics:
# ./perf-stat-hist -b 10 syscalls:sys_exit_read ret
Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
^C
Range : Count Distribution
-> 9 : 2959 |######################################|
10 -> : 7 |# |
This has the lowest overhead for collection, since only two tracepoint
filter pairs are used.
Use -h to print the USAGE message:
# ./perf-stat-hist -h
USAGE: perf-stat-hist [-h] [-b buckets|-P power] [-m max] tracepoint
variable [seconds]
-b buckets # specify histogram bucket points
-P power # power-of (default is 4)
-m max # max value for power-of
-h # this usage message
eg,
perf-stat-hist syscalls:sys_enter_read count 5
# read() request histogram, 5 seconds
perf-stat-hist syscalls:sys_exit_read ret 5
# read() return histogram, 5 seconds
perf-stat-hist -P 10 syscalls:sys_exit_read ret 5
# ... use power-of-10
perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret 5
# ... use power-of-2, max 1024
perf-stat-hist -b "10 50 100 500" syscalls:sys_exit_read ret 5
# ... histogram based on these bucket ranges
perf-stat-hist -b 10 syscalls:sys_exit_read ret 5
# ... bifurcate by the value 10 (lowest overhead)
See the man page and example file for more info.
Demonstrations of reset-ftrace, the Linux ftrace tool.
You will probably never need this tool. If you kill -9 an ftrace-based tool,
leaving the kernel in a tracing enabled state, you could try using this tool
to reset ftrace and disable tracing. Make sure no other ftrace sessions are
in use on your system, or it will kill those.
Here's an example:
# ./iosnoop
ERROR: ftrace may be in use by PID 2197 /var/tmp/.ftrace-lock
I tried to run iosnoop, but there's a lock file for PID 2197. Checking if it
exists:
# ps -fp 2197
UID PID PPID C STIME TTY TIME CMD
#
No.
I also know that no one is using ftrace on this system. So I'll use reset-ftrace
to clean up this lock file and ftrace state:
# ./reset-ftrace
ERROR: ftrace lock (/var/tmp/.ftrace-lock) exists. It shows ftrace may be in use by PID 2197.
Double check to see if that PID is still active. If not, consider using -f to force a reset. Exiting.
... except it's complaining about the lock file too. I'm already sure that this
PID doesn't exist, so I'll add the -f option:
# ./reset-ftrace -f
Reseting ftrace state...
current_tracer, before:
1 nop
current_tracer, after:
1 nop
set_ftrace_filter, before:
1 #### all functions enabled ####
set_ftrace_filter, after:
1 #### all functions enabled ####
set_ftrace_pid, before:
1 no pid
set_ftrace_pid, after:
1 no pid
kprobe_events, before:
kprobe_events, after:
Done.
The output shows what has been reset, including the before and after state of
these files.
Now I can try iosnoop again:
# ./iosnoop
Tracing block I/O. Ctrl-C to end.
COMM PID TYPE DEV BLOCK BYTES LATms
supervise 1689 W 202,1 17039664 4096 0.58
supervise 1689 W 202,1 17039672 4096 0.47
supervise 1694 W 202,1 17039744 4096 0.98
supervise 1694 W 202,1 17039752 4096 0.74
supervise 1684 W 202,1 17039760 4096 0.63
[...]
Fixed.
Note that reset-ftrace currently only resets a few methods of enabling
tracing, such as set_ftrace_filter and kprobe_events. Static tracepoints could
be enabled individually, and this script currently doesn't find and disable
those.
Use -h to print the USAGE message:
# ./reset-ftrace -h
USAGE: reset-ftrace [-fhq]
-f # force: delete ftrace lock file
-q # quiet: reset, but say nothing
-h # this usage message
eg,
reset-ftrace # disable active ftrace session
Demonstrations of syscount, the Linux perf_events version.
The first mode I use is "-c", where it behaves like "strace -c", but for the
entire system (all processes) and with much lower overhead:
# ./syscount -c
Tracing... Ctrl-C to end.
^Csleep: Interrupt
SYSCALL COUNT
accept 1
getsockopt 1
setsid 1
chdir 2
getcwd 2
getpeername 2
getsockname 2
setgid 2
setgroups 2
setpgid 2
setuid 2
getpgrp 4
getpid 4
rename 4
setitimer 4
setrlimit 4
setsockopt 4
statfs 4
set_tid_address 5
readlink 6
set_robust_list 6
nanosleep 7
newuname 7
faccessat 8
futex 10
clock_gettime 16
newlstat 20
pipe 20
epoll_wait 24
getrlimit 25
socket 27
connect 29
exit_group 30
getppid 31
dup2 34
wait4 51
fcntl 58
getegid 72
getgid 72
getuid 72
geteuid 75
perf_event_open 100
munmap 121
gettimeofday 216
access 266
ioctl 340
poll 348
sendto 374
mprotect 414
brk 597
rt_sigaction 632
recvfrom 664
lseek 749
newfstatat 2922
openat 2925
newfstat 3229
newstat 4334
open 4534
fchdir 5845
getdents 5854
read 7673
close 7728
select 9633
rt_sigprocmask 19886
write 34581
While tracing, the write() syscall was executed 34,581 times.
This mode uses "perf stat" to count the syscalls:* tracepoints in-kernel.
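Counting in-kernel avoids moving every event to user space first. Conceptually,
the counting step is the same as this familiar user-space pipeline (shown only
as an analogy, not how syscount is implemented):

```shell
# User-space equivalent of per-name counting, sorted ascending;
# syscount -c gets the same per-syscall totals in-kernel instead:
printf 'write\nread\nwrite\nopen\nwrite\nread\n' | sort | uniq -c | sort -n
# prints one line per name with its count, e.g. "3 write" last
```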
You can add a duration (-d) and limit the number shown (-t):
# ./syscount -cd 5 -t 10
Tracing for 5 seconds. Top 10 only...
SYSCALL COUNT
gettimeofday 1009
write 3583
read 8174
openat 21550
newfstat 21558
open 21824
fchdir 43098
getdents 43106
close 43694
newfstatat 110936
While tracing for 5 seconds, the newfstatat() syscall was executed 110,936
times.
Without the -c, syscount shows syscalls by process name:
# ./syscount -d 5 -t 10
Tracing for 5 seconds. Top 10 only...
[ perf record: Woken up 66 times to write data ]
[ perf record: Captured and wrote 16.513 MB perf.data (~721455 samples) ]
COMM COUNT
stat 450
perl 537
catalina.sh 1700
postgres 2094
run 2362
:6946 4764
ps 5961
sshd 45796
find 61039
So processes named "find" called 61,039 syscalls during the 5 seconds of
tracing.
Note that this mode writes a perf.data file. This is higher overhead for a
few reasons:
- all data is passed from kernel to user space, which eats CPU for the memory
copy. Note that it is buffered in an efficient way by perf_events, which
wakes up and context switches only a small number of times: 66 in this case,
to hand 16 Mbytes of trace data to user space.
- data is post-processed in user space, eating more CPU.
- data is stored on the file system in the perf.data file, consuming available
storage.
This will be improved in future kernels, but right now, with existing
functionality in older kernels, the trip via perf.data is necessary. For
example, using a pipe to "perf script" instead of writing perf.data can have
issues with feedback loops, where perf traces itself. This syscount version
goes to lengths to avoid tracing its own perf.
Running without options shows syscalls by process name until Ctrl-C:
# ./syscount
Tracing... Ctrl-C to end.
^C[ perf record: Woken up 39 times to write data ]
[ perf record: Captured and wrote 9.644 MB perf.data (~421335 samples) ]
COMM COUNT
apache2 8
apacheLogParser 13
platformservice 16
snmpd 16
ntpd 21
multilog 66
supervise 84
dirname 102
echo 102
svstat 108
cut 111
bash 113
grep 132
xargs 132
redis-server 190
sed 192
setuidgid 294
stat 450
perl 537
catalina.sh 1275
postgres 1736
run 2352
:7396 4527
ps 5925
sshd 20154
find 28700
Note again it is writing a perf.data file to do this.
The -v option adds process IDs:
# ./syscount -v
Tracing... Ctrl-C to end.
^C[ perf record: Woken up 48 times to write data ]
[ perf record: Captured and wrote 12.114 MB perf.data (~529276 samples) ]
PID COMM COUNT
3599 apacheLogParser 3
7977 xargs 3
7982 supervise 3
7993 xargs 3
3575 apache2 4
1311 ntpd 6
3135 postgres 6
3600 apacheLogParser 6
3210 platformservice 8
6503 sshd 9
7978 :7978 9
7994 run 9
7968 :7968 11
7984 run 11
1451 snmpd 16
3040 svscan 17
3066 postgres 17
3133 postgres 24
3134 postgres 24
3136 postgres 24
3061 multilog 29
3055 supervise 30
7979 bash 31
7977 echo 34
7981 dirname 34
7993 echo 34
7968 svstat 36
7984 svstat 36
7975 cut 37
7991 cut 37
9857 bash 37
7967 :7967 40
7983 run 40
7972 :7972 41
7976 xargs 41
7988 run 41
7992 xargs 41
7969 :7969 42
7976 :7976 42
7985 run 42
7992 run 42
7973 :7973 43
7974 :7974 43
7989 run 43
7990 run 43
7973 grep 44
7989 grep 44
7975 :7975 45
7991 run 45
7970 :7970 51
7986 run 51
7981 catalina.sh 52
7974 sed 64
7990 sed 64
3455 postgres 66
7971 :7971 66
7987 run 66
7966 :7966 96
7966 setuidgid 98
3064 redis-server 110
7970 stat 150
7986 stat 150
7969 perl 179
7985 perl 179
7982 run 341
7966 catalina.sh 373
7980 postgres 432
7972 ps 1971
7988 ps 1983
9832 sshd 37511
7979 find 51040
Once you've found a process ID of interest, you can use "-c" and "-p PID" to
show syscall names. This also switches to "perf stat" mode for in-kernel
counts, and lower overhead:
# ./syscount -cp 7979
Tracing PID 7979... Ctrl-C to end.
^CSYSCALL COUNT
brk 10
newfstat 2171
open 2171
newfstatat 2175
openat 2175
close 4346
fchdir 4346
getdents 4351
write 25482
So the most frequent syscall by PID 7979 was write().
Use -h to print the USAGE message:
# ./syscount -h
USAGE: syscount [-chv] [-t top] {-p PID|-d seconds|command}
syscount # count by process name
-c # show counts by syscall name
-h # this usage message
-v # verbose: shows PID
-p PID # trace this PID only
-d seconds # duration of trace
-t num # show top number only
command # run and trace this command
eg,
syscount # syscalls by process name
syscount -c # syscalls by syscall name
syscount -d 5 # trace for 5 seconds
syscount -cp 923 # syscall names for PID 923
syscount -c ls # syscall names for "ls"
See the man page and example file for more info.
perf-tools-unstable-1.0.1~20200130+git49b8cdf/examples/tcpretrans_example.txt 0000664 0000000 0000000 00000006677 13614503575 0026577 0 ustar 00root root 0000000 0000000 Demonstrations of tcpretrans, the Linux ftrace version.
Tracing TCP retransmits on a busy server:
# ./tcpretrans
TIME PID LADDR:LPORT -- RADDR:RPORT STATE
05:16:44 3375 10.150.18.225:53874 R> 10.105.152.3:6001 ESTABLISHED
05:16:44 3375 10.150.18.225:53874 R> 10.105.152.3:6001 ESTABLISHED
05:16:54 4028 10.150.18.225:6002 R> 10.150.30.249:1710 ESTABLISHED
05:16:54 4028 10.150.18.225:6002 R> 10.150.30.249:1710 ESTABLISHED
05:16:54 4028 10.150.18.225:6002 R> 10.150.30.249:1710 ESTABLISHED
05:16:54 4028 10.150.18.225:6002 R> 10.150.30.249:1710 ESTABLISHED
05:16:54 4028 10.150.18.225:6002 R> 10.150.30.249:1710 ESTABLISHED
05:16:54 4028 10.150.18.225:6002 R> 10.150.30.249:1710 ESTABLISHED
05:16:54 4028 10.150.18.225:6002 R> 10.150.30.249:1710 ESTABLISHED
05:16:55 0 10.150.18.225:47115 R> 10.71.171.158:6001 ESTABLISHED
05:16:58 0 10.150.18.225:44388 R> 10.103.130.120:6001 ESTABLISHED
05:16:58 0 10.150.18.225:44388 R> 10.103.130.120:6001 ESTABLISHED
05:16:58 0 10.150.18.225:44388 R> 10.103.130.120:6001 ESTABLISHED
05:16:59 0 10.150.18.225:56086 R> 10.150.32.107:6001 ESTABLISHED
05:16:59 0 10.150.18.225:56086 R> 10.150.32.107:6001 ESTABLISHED
^C
Ending tracing...
This shows TCP retransmits by dynamically tracing the kernel function that does
the retransmit. This is a low overhead approach.
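Under the hood this is plain ftrace dynamic tracing. Here is a sketch of the kprobe lifecycle such a tool drives, printed rather than executed since writing kprobe_events requires root, and the traced function name (tcp_retransmit_skb) is kernel-version dependent:

```shell
# Print the kprobe sequence: create, enable, read, disable, remove.
tracing=/sys/kernel/debug/tracing
kname=tcpretrans_probe
cat <<EOF
echo 'p:$kname tcp_retransmit_skb' >> $tracing/kprobe_events
echo 1 > $tracing/events/kprobes/$kname/enable
cat $tracing/trace_pipe
echo 0 > $tracing/events/kprobes/$kname/enable
echo '-:$kname' >> $tracing/kprobe_events
EOF
```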
The PID may or may not make sense: it's showing the PID that was on-CPU;
however, retransmits are often timer-based, where it's the kernel that is
on-CPU.
The STATE column shows the TCP state for the socket performing the retransmit.
The "--" column is the packet type. "R>" for retransmit.
Kernel stack traces can be included with -s, which may show the type of
retransmit:
# ./tcpretrans -s
TIME PID LADDR:LPORT -- RADDR:RPORT STATE
06:21:10 19516 10.144.107.151:22 R> 10.13.106.251:32167 ESTABLISHED
=> tcp_fastretrans_alert
=> tcp_ack
=> tcp_rcv_established
=> tcp_v4_do_rcv
=> tcp_v4_rcv
=> ip_local_deliver_finish
=> ip_local_deliver
=> ip_rcv_finish
=> ip_rcv
=> __netif_receive_skb
=> netif_receive_skb
=> handle_incoming_queue
=> xennet_poll
=> net_rx_action
=> __do_softirq
=> call_softirq
=> do_softirq
=> irq_exit
=> xen_evtchn_do_upcall
=> xen_do_hypervisor_callback
This looks like a fast retransmit (inclusion of tcp_fastretrans_alert(), and
being based on receiving an ACK, rather than a timer).
The -l option will include TCP tail loss probe events (TLP; see
http://lwn.net/Articles/542642/). Eg:
# ./tcpretrans -l
TIME PID LADDR:LPORT -- RADDR:RPORT STATE
21:56:06 0 10.100.155.200:22 R> 10.10.237.72:18554 LAST_ACK
21:56:08 0 10.100.155.200:22 R> 10.10.237.72:18554 LAST_ACK
21:56:10 16452 10.100.155.200:22 R> 10.10.237.72:18554 LAST_ACK
21:56:10 0 10.100.155.200:22 L> 10.10.237.72:46408 LAST_ACK
21:56:10 0 10.100.155.200:22 R> 10.10.237.72:46408 LAST_ACK
21:56:12 0 10.100.155.200:22 R> 10.10.237.72:46408 LAST_ACK
21:56:13 0 10.100.155.200:22 R> 10.10.237.72:46408 LAST_ACK
^C
Ending tracing...
Look for "L>" in the type column ("--") for TLP events.
Use -h to print the USAGE message:
# ./tcpretrans -h
USAGE: tcpretrans [-hs]
-h # help message
-s # print stack traces
eg,
tcpretrans # trace TCP retransmits
perf-tools-unstable-1.0.1~20200130+git49b8cdf/examples/tpoint_example.txt 0000664 0000000 0000000 00000017415 13614503575 0025717 0 ustar 00root root 0000000 0000000 Demonstrations of tpoint, the Linux ftrace version.
Let's trace block:block_rq_issue, to see block device (disk) I/O requests:
# ./tpoint block:block_rq_issue
Tracing block:block_rq_issue. Ctrl-C to end.
supervise-1692 [001] d... 7269912.982162: block_rq_issue: 202,1 W 0 () 17039656 + 8 [supervise]
supervise-1696 [000] d... 7269912.982243: block_rq_issue: 202,1 W 0 () 12862264 + 8 [supervise]
cksum-12994 [000] d... 7269913.317924: block_rq_issue: 202,1 R 0 () 9357056 + 72 [cksum]
cksum-12994 [000] d... 7269913.319013: block_rq_issue: 202,1 R 0 () 2977536 + 144 [cksum]
cksum-12994 [000] d... 7269913.320217: block_rq_issue: 202,1 R 0 () 2986240 + 216 [cksum]
cksum-12994 [000] d... 7269913.321677: block_rq_issue: 202,1 R 0 () 620344 + 56 [cksum]
cksum-12994 [001] d... 7269913.329309: block_rq_issue: 202,1 R 0 () 9107912 + 88 [cksum]
cksum-12994 [001] d... 7269913.340133: block_rq_issue: 202,1 R 0 () 3147008 + 248 [cksum]
cksum-12994 [001] d... 7269913.354551: block_rq_issue: 202,1 R 0 () 11583488 + 256 [cksum]
cksum-12994 [001] d... 7269913.379904: block_rq_issue: 202,1 R 0 () 11583744 + 256 [cksum]
[...]
^C
Ending tracing...
Great, that was easy!
perf_events can do this as well, and is better in many ways, including a more
efficient buffering strategy, and multi-user access. It's not that easy to do
this one-liner in perf_events, however. An equivalent for recent kernels is:
perf record --no-buffer -e block:block_rq_issue -a -o - | PAGER=cat stdbuf -oL perf script -i -
On older kernels, use -D instead of --no-buffer. Even better is to set the buffer
page size to a sufficient grouping (using -m), to minimize overheads, at the
expense of liveliness of updates. Note that stack traces (-g) don't work on
my systems with this perf one-liner; however, they do work with tpoint -s.
Column headings can be printed using -H:
# ./tpoint -H block:block_rq_issue
Tracing block:block_rq_issue. Ctrl-C to end.
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0 #P:2
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
supervise-1697 [000] d... 7270545.340856: block_rq_issue: 202,1 W 0 () 12862464 + 8 [supervise]
supervise-1697 [000] d... 7270545.341256: block_rq_issue: 202,1 W 0 () 12862472 + 8 [supervise]
supervise-1690 [000] d... 7270545.342363: block_rq_issue: 202,1 W 0 () 17040368 + 8 [supervise]
[...]
They are also documented in the Linux kernel source under:
Documentation/trace/ftrace.txt.
How about stacks traces for those block_rq_issue events? Adding -s:
# ./tpoint -s block:block_rq_issue
Tracing block:block_rq_issue. Ctrl-C to end.
supervise-1691 [000] d... 7269511.079179: block_rq_issue: 202,1 W 0 () 17040232 + 8 [supervise]
supervise-1691 [000] d... 7269511.079188:
=> blk_peek_request
=> do_blkif_request
=> __blk_run_queue
=> queue_unplugged
=> blk_flush_plug_list
=> blk_finish_plug
=> ext4_writepages
=> do_writepages
=> __filemap_fdatawrite_range
=> filemap_flush
=> ext4_alloc_da_blocks
=> ext4_rename
=> vfs_rename
=> SYSC_renameat2
=> SyS_renameat2
=> SyS_rename
=> system_call_fastpath
cksum-7428 [000] d... 7269511.331778: block_rq_issue: 202,1 R 0 () 9006848 + 208 [cksum]
cksum-7428 [000] d... 7269511.331784:
=> blk_peek_request
=> do_blkif_request
=> __blk_run_queue
=> queue_unplugged
=> blk_flush_plug_list
=> blk_finish_plug
=> __do_page_cache_readahead
=> ondemand_readahead
=> page_cache_async_readahead
=> generic_file_read_iter
=> new_sync_read
=> vfs_read
=> SyS_read
=> system_call_fastpath
cksum-7428 [000] d... 7269511.332631: block_rq_issue: 202,1 R 0 () 620992 + 200 [cksum]
cksum-7428 [000] d... 7269511.332639:
=> blk_peek_request
=> do_blkif_request
=> __blk_run_queue
=> queue_unplugged
=> blk_flush_plug_list
=> blk_finish_plug
=> __do_page_cache_readahead
=> ondemand_readahead
=> page_cache_sync_readahead
=> generic_file_read_iter
=> new_sync_read
=> vfs_read
=> SyS_read
=> system_call_fastpath
^C
Ending tracing...
Easy. Now I can read the ancestry to understand what actually led to issuing
a block device (disk) I/O.
Here's insertion onto the block I/O queue (better matches processes):
# ./tpoint -s block:block_rq_insert
Tracing block:block_rq_insert. Ctrl-C to end.
cksum-11908 [000] d... 7269834.882517: block_rq_insert: 202,1 R 0 () 736304 + 256 [cksum]
cksum-11908 [000] d... 7269834.882528:
=> __elv_add_request
=> blk_flush_plug_list
=> blk_finish_plug
=> __do_page_cache_readahead
=> ondemand_readahead
=> page_cache_sync_readahead
=> generic_file_read_iter
=> new_sync_read
=> vfs_read
=> SyS_read
=> system_call_fastpath
[...]
You can also add tracepoint filters. To see what variables you can use, use -v:
# ./tpoint -v block:block_rq_issue
name: block_rq_issue
ID: 942
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:dev_t dev; offset:8; size:4; signed:0;
field:sector_t sector; offset:16; size:8; signed:0;
field:unsigned int nr_sector; offset:24; size:4; signed:0;
field:unsigned int bytes; offset:28; size:4; signed:0;
field:char rwbs[8]; offset:32; size:8; signed:1;
field:char comm[16]; offset:40; size:16; signed:1;
field:__data_loc char[] cmd; offset:56; size:4; signed:1;
print fmt: "%d,%d %s %u (%s) %llu + %u [%s]", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), REC->rwbs, REC->bytes, __get_str(cmd), (unsigned long long)REC->sector, REC->nr_sector, REC->comm
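The print fmt also documents how the dev field packs major and minor device numbers: major = dev >> 20, minor = dev & ((1 << 20) - 1). As a quick shell check, packing and unpacking the 202,1 device from the output above:

```shell
# Pack then unpack a dev_t the way this tracepoint's print fmt does.
dev=$(( (202 << 20) | 1 ))
echo "$(( dev >> 20 )),$(( dev & ((1 << 20) - 1) ))"
```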
Now I'll add a filter to check that the rwbs field (I/O type) includes an "R",
making it a read:
# ./tpoint -s block:block_rq_insert 'rwbs ~ "*R*"'
cksum-11908 [000] d... 7269839.919098: block_rq_insert: 202,1 R 0 () 736560 + 136 [cksum]
cksum-11908 [000] d... 7269839.919107:
=> __elv_add_request
=> blk_flush_plug_list
=> blk_finish_plug
=> __do_page_cache_readahead
=> ondemand_readahead
=> page_cache_async_readahead
=> generic_file_read_iter
=> new_sync_read
=> vfs_read
=> SyS_read
=> system_call_fastpath
[...]
Use -h to print the USAGE message:
# ./tpoint -h
USAGE: tpoint [-hHsv] [-d secs] [-p PID] [-L TID] tracepoint [filter]
tpoint -l
-d seconds # trace duration, and use buffers
-p PID # PID to match on events
-L TID # thread id to match on events
-v # view format file (don't trace)
-H # include column headers
-l # list all tracepoints
-s # show kernel stack traces
-h # this usage message
Note that these examples may need modification to match your kernel
version's function names and platform's register usage.
eg,
tpoint -l | grep open
# find tracepoints containing "open"
tpoint syscalls:sys_enter_open
# trace open() syscall entry
tpoint block:block_rq_issue
# trace block I/O issue
tpoint -s block:block_rq_issue
# show kernel stacks
See the man page and example file for more info.
perf-tools-unstable-1.0.1~20200130+git49b8cdf/examples/uprobe_example.txt 0000664 0000000 0000000 00000034766 13614503575 0025706 0 ustar 00root root 0000000 0000000 Demonstrations of uprobe, the Linux ftrace version.
Trace the readline() function from all processes named "bash":
# ./uprobe p:bash:readline
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
bash-11886 [003] d... 19601233.618462: readline: (0x48db60)
bash-11886 [003] d... 19601235.152067: readline: (0x48db60)
bash-11915 [003] d... 19601238.976244: readline: (0x48db60)
^C
Ending tracing...
readline() is the bash shell's function for reading interactive input, and
a line is printed each time I entered commands in separate bash shells.
The line contains default ftrace columns: the process name, "-", and PID;
the CPU, flags, a timestamp (in units of seconds), the probe name, then
other arguments. These columns are documented in the kernel source, under
Documentation/trace/ftrace.txt.
The first line of output is informational, and shows what uprobe is really
doing: it turned "bash" into "/bin/bash", using a $PATH lookup (via which(1)).
It then turned the "readline" symbol into 0x8db60, using objdump(1) for
symbol lookups.
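That lookup can be reproduced by hand: objdump prints the symbol table, and the address column gives the offset. A sketch using a sample objdump-style line (the sample line is fabricated for illustration; real output varies by bash build):

```shell
# Pull a symbol's offset from objdump -tT style output, as uprobe does.
sample='000000000008db60 g    DF .text  0000000000000030  Base        readline'
echo "$sample" | awk '$NF == "readline" { print "0x" $1 }'
```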
Note that this traces _all_ bash processes simultaneously.
Tracing PID 11886 only:
# ./uprobe -p 11886 p:bash:readline
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
bash-11886 [002] d... 19601657.753893: readline: (0x48db60)
bash-11886 [002] d... 19601658.246613: readline: (0x48db60)
bash-11886 [002] d... 19601658.386666: readline: (0x48db60)
bash-11886 [002] d... 19601661.415952: readline: (0x48db60)
^C
Ending tracing...
This may be important if you are tracing shared library functions, and only care
about one target process.
You can specify the full path to a binary to trace:
# ./uprobe p:/bin/bash:readline
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
bash-11886 [002] d... 19601746.902461: readline: (0x48db60)
bash-11886 [002] d... 19601749.543485: readline: (0x48db60)
bash-11886 [001] d... 19601749.702369: readline: (0x48db60)
^C
Ending tracing...
This might be useful if uprobe picked the wrong binary to trace, as shown by
the informational line, and you wanted to specify it directly. It is also useful
for tracing binaries not in the $PATH, which uprobe can't otherwise find.
Use -l to list symbols available to trace; eg, searching for functions
containing "readline" in bash:
# ./uprobe -l bash | grep readline
initialize_readline
pcomp_set_readline_variables
posix_readline_initialize
readline
readline_internal_char
readline_internal_setup
readline_internal_teardown
Tracing the return of readline() with return value as a string:
# ./uprobe 'r:bash:readline +0($retval):string'
Tracing uprobe readline (r:readline /bin/bash:0x8db60 +0($retval):string). Ctrl-C to end.
bash-11886 [003] d... 19601837.001935: readline: (0x41e876 <- 0x48db60) arg1="ls -l"
bash-11886 [002] d... 19601851.008409: readline: (0x41e876 <- 0x48db60) arg1="echo "hello world""
bash-11886 [002] d... 19601854.099730: readline: (0x41e876 <- 0x48db60) arg1="df -h"
bash-11886 [002] d... 19601858.805740: readline: (0x41e876 <- 0x48db60) arg1="cd .."
bash-11886 [003] d... 19601898.378753: readline: (0x41e876 <- 0x48db60) arg1="foo bar"
^C
Ending tracing...
Now I can see the commands entered. Note that this traces what bash reads in,
even if the command eventually fails. Eg, the last command "foo bar" didn't
work (No command 'foo' found).
Note that this invocation now uses "r:" at the start of the probe description,
instead of "p:". r is for return probes, p for entry probes.
Tracing sleep() calls in all running libc shared libraries:
# ./uprobe p:libc:sleep
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130). Ctrl-C to end.
svscan-2134 [000] d... 19602402.959904: sleep: (0x7f2dba562130)
cron-923 [000] d... 19602404.640507: sleep: (0x7f3e26d9e130)
cron-923 [002] d... 19602404.655232: sleep: (0x7f3e26d9e130)
cron-923 [002] d... 19602405.189271: sleep: (0x7f3e26d9e130)
svscan-2134 [000] d... 19602407.959947: sleep: (0x7f2dba562130)
[...]
This shows different programs calling sleep -- likely threads waiting for work.
I ran a "sleep 1" command in a bash shell, which wasn't seen above: probably
using a different sleep library call, which I'd need to trace separately.
Including headers (-H):
# ./uprobe -H p:libc:sleep
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130). Ctrl-C to end.
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
svscan-2134 [000] d... 19603052.976770: sleep: (0x7f2dba562130)
svscan-2134 [002] d... 19603057.976927: sleep: (0x7f2dba562130)
[...]
These are documented in Documentation/trace/ftrace.txt.
Tracing sleep() with its argument (seconds):
# ./uprobe 'p:libc:sleep %di'
Tracing uprobe sleep (p:sleep /lib/x86_64-linux-gnu/libc-2.15.so:0xbf130 %di). Ctrl-C to end.
svscan-2134 [002] d... 19602517.962925: sleep: (0x7f2dba562130) arg1=0x5
svscan-2134 [002] d... 19602522.963082: sleep: (0x7f2dba562130) arg1=0x5
cron-923 [002] d... 19602524.187733: sleep: (0x7f3e26d9e130) arg1=0x3c
svscan-2134 [002] d... 19602527.963267: sleep: (0x7f2dba562130) arg1=0x5
[...]
So svcan was sleeping for 5 seconds, and cron for 60 seconds (0x3c = 60).
The argument is specified by its register, %di. This is platform dependent: %di
may only be meaningful on x86. If you're on a different architecture (eg, ARM),
you will probably need to use something else.
If working with registers is not for you, then consider tracing this using
perf_events with debuginfo installed: in which case you can use the variable
names. Or consider a different tracer.
Here is an example of the optional filter expression, to only trace the return
of fopen() when it failed and returned NULL (0):
# ./uprobe 'r:libc:fopen file=$retval' 'file == 0'
Tracing uprobe fopen (r:fopen /lib/x86_64-linux-gnu/libc-2.15.so:0x6e540 file=$retval). Ctrl-C to end.
prog1-23982 [000] d... 19602894.346872: fopen: (0x40051e <- 0x7f637867f540) file=0x0
^C
Ending tracing...
The argument $retval was given a vanity name "file", which was then tested in
the filter expression "file == 0".
Here's an example of tracing the MySQL server dispatch_command() function, along
with the query string (note: the %dx register is only valid for this
architecture and this software build):
# ./uprobe 'p:dispatch_command /opt/mysql/bin/mysqld:_Z16dispatch_command19enum_server_commandP3THDPcj +0(%dx):string'
Tracing uprobe dispatch_command (p:dispatch_command /opt/mysql/bin/mysqld:0x2dbd40 +0(%dx):string). Ctrl-C to end.
mysqld-2855 [001] d... 19956674.509085: dispatch_command: (0x6dbd40) arg1="show tables"
mysqld-2855 [001] d... 19956675.541155: dispatch_command: (0x6dbd40) arg1="SELECT * FROM numbers where number > 32000"
^C
Ending tracing...
The function name, "_Z16dispatch_command19enum_server_commandP3THDPcj", is the
C++ mangled symbol.
I can name the query string argument "cmd" then test it in a filter; eg, to only
match queries that begin with "SELECT":
# ./uprobe 'p:dispatch_command /opt/mysql/bin/mysqld:_Z16dispatch_command19enum_server_commandP3THDPcj cmd=+0(%dx):string' 'cmd ~ "SELECT*"'
Tracing uprobe dispatch_command (p:dispatch_command /opt/mysql/bin/mysqld:0x2dbd40 cmd=+0(%dx):string). Ctrl-C to end.
mysqld-2855 [001] d... 19956754.619958: dispatch_command: (0x6dbd40) cmd="SELECT * FROM numbers where number > 32000"
mysqld-2855 [001] d... 19956755.060125: dispatch_command: (0x6dbd40) cmd="SELECT * FROM numbers where number > 32000"
^C
Ending tracing...
Overhead is relative to the rate of events: a higher rate of traced events
means higher uprobe overhead. If you are unsure of the rate of events,
you can capture a set number only, or trace for a limited duration only (covered
in the next example). To trace a set number only, you can pipe into head, eg:
# ./uprobe -p 11982 p:bash:sh_malloc | head -15
Tracing uprobe sh_malloc (p:sh_malloc /bin/bash:0xaafa0). Ctrl-C to end.
bash-11982 [001] d... 19643121.529484: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529493: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529506: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529510: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529519: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529521: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529523: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529525: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529531: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529533: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529536: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529541: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529546: sh_malloc: (0x4aafa0)
bash-11982 [001] d... 19643121.529549: sh_malloc: (0x4aafa0)
uprobe traps SIGPIPE, so that it properly exits and cleans up probes when used
in this fashion.
Note the timestamps: by examining the rate they are increasing, you can have
some estimation for the rate of events. In this case, the 15 events all
happened within the same millisecond (the timestamp column is in units of
seconds), which suggests these are frequent events.
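That back-of-envelope rate estimate can be scripted; a sketch using two fabricated trace lines as input (field positions assume the default ftrace layout shown above):

```shell
# Estimate events/sec from ftrace output: count / (last - first timestamp).
# Two sample lines, one second apart, stand in for a real trace capture.
printf '%s\n' \
  'bash-11982 [001] d... 19643121.529484: sh_malloc: (0x4aafa0)' \
  'bash-11982 [001] d... 19643122.529484: sh_malloc: (0x4aafa0)' |
awk '{ sub(/:$/, "", $4); t = $4; if (t0 == "") t0 = t; t1 = t; n++ }
     END { printf "%.0f events/sec\n", n / (t1 - t0) }'
```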
The -d option can be used to specify a duration for tracing, which also causes
uprobe to perform in-kernel buffering, which reduces the overhead of tracing:
# ./uprobe -d 5 p:libc:gettimeofday
Tracing uprobe gettimeofday for 5 seconds (buffered)...
sleep-12743 [001] d... 19642858.943440: gettimeofday: (0x7f400138ac10)
rotatelog-12744 [000] d... 19642858.955665: gettimeofday: (0x7f0ba34ebc10)
rotatelog-12745 [003] d... 19642858.956425: gettimeofday: (0x7f1e6db20c10)
rotatelog-12744 [000] d... 19642858.956924: gettimeofday: (0x7f0ba34ebc10)
rotatelog-12745 [003] d... 19642858.957608: gettimeofday: (0x7f1e6db20c10)
rotatelog-12744 [001] d... 19642858.958005: gettimeofday: (0x7fd8a1d64c10)
rotatelog-12744 [003] d... 19642858.959496: gettimeofday: (0x7f9531acdc10)
mkdir-12746 [002] d... 19642858.959542: gettimeofday: (0x7fd539474c10)
chown-12747 [001] d... 19642858.961455: gettimeofday: (0x7ff5646afc10)
rotatelog-12745 [000] d... 19642858.963065: gettimeofday: (0x7f406aca7c10)
rotatelog-12745 [001] d... 19642858.964280: gettimeofday: (0x7f6548debc10)
rotatelog-12749 [000] d... 19642859.977462: gettimeofday: (0x7fecaf7e1c10)
rotatelog-12750 [003] d... 19642859.977697: gettimeofday: (0x7f821eb3cc10)
rotatelog-12749 [000] d... 19642859.978707: gettimeofday: (0x7fecaf7e1c10)
[...]
You will not see live output during the -d mode, as it is being buffered
in-kernel.
Tracing func_abc() in my test program, and including user-level stacks:
# ./uprobe -s p:/root/func_abc:func_c
Tracing uprobe func_c (p:func_c /root/func_abc:0x4f4). Ctrl-C to end.
func_abc-25394 [000] d... 19603250.054040: func_c: (0x4004f4)
func_abc-25394 [000] d... 19603250.054056:
=> <00000000004004f4>
=> <0000000000400527>
=> <0000000000400537>
=> <00007fca9f0e376d>
func_abc-25394 [000] d... 19603251.054250: func_c: (0x4004f4)
func_abc-25394 [000] d... 19603251.054266:
=> <00000000004004f4>
=> <0000000000400527>
=> <0000000000400537>
=> <00007fca9f0e376d>
^C
Ending tracing...
The output has the raw hex addresses. If this is too much of a nuisance, then
try tracing this using perf_events which should automate the translation.
It can get worse, eg:
# ./uprobe -s p:bash:readline
Tracing uprobe readline (p:readline /bin/bash:0x8db60). Ctrl-C to end.
bash-11886 [002] d... 19603434.397818: readline: (0x48db60)
bash-11886 [002] d... 19603434.397832:
=> <000000000048db60>
bash-11886 [002] d... 19603434.592500: readline: (0x48db60)
bash-11886 [002] d... 19603434.592510:
=> <000000000048db60>
^C
Ending tracing...
Here the stack trace is missing (0x48db60 is the traced function, transposed
from the base load address). This is due to compiler optimizations. It can be
fixed by recompiling with -fno-omit-frame-pointer, or, using perf_events and
a different method of stack walking.
Use -h to print the USAGE message:
# ./uprobe -h
USAGE: uprobe [-FhHsv] [-d secs] [-p PID] [-L TID] {-l target |
uprobe_definition [filter]}
-F # force. trace despite warnings.
-d seconds # trace duration, and use buffers
-l target # list functions from this executable
-p PID # PID to match on events
-L TID # thread id to match on events
-v # view format file (don't trace)
-H # include column headers
-s # show user stack traces
-h # this usage message
Note that these examples may need modification to match your kernel
version's function names and platform's register usage.
eg,
# trace readline() calls in all running "bash" executables:
uprobe p:bash:readline
# trace readline() with explicit executable path:
uprobe p:/bin/bash:readline
# trace the return of readline() with return value as a string:
uprobe 'r:bash:readline +0($retval):string'
# trace sleep() calls in all running libc shared libraries:
uprobe p:libc:sleep
# trace sleep() with register %di (x86):
uprobe 'p:libc:sleep %di'
# trace this address (use caution: must be instruction aligned):
uprobe p:libc:0xbf130
# trace gettimeofday() for PID 1182 only:
uprobe -p 1182 p:libc:gettimeofday
# trace the return of fopen() only when it returns NULL:
uprobe 'r:libc:fopen file=$retval' 'file == 0'
See the man page and example file for more info.
perf-tools-unstable-1.0.1~20200130+git49b8cdf/execsnoop 0000775 0000000 0000000 00000020761 13614503575 0022237 0 ustar 00root root 0000000 0000000 #!/bin/bash
#
# execsnoop - trace process exec() with arguments.
# Written using Linux ftrace.
#
# This shows the execution of new processes, especially short-lived ones that
# can be missed by sampling tools such as top(1).
#
# USAGE: ./execsnoop [-hrt] [-a argc] [-d secs] [name]
#
# REQUIREMENTS: FTRACE and KPROBE CONFIG, sched:sched_process_fork tracepoint,
# and either the sys_execve, stub_execve or do_execve kernel function. You may
# already have these on recent kernels. And awk.
#
# This traces exec() from the fork()->exec() sequence, which means it won't
# catch new processes that only fork(). With the -r option, it will also catch
# processes that re-exec. It makes a best-effort attempt to retrieve the program
# arguments and PPID; if these are unavailable, 0 and "[?]" are printed
# respectively. There is also a limit to the number of arguments printed (by
# default, 8), which can be increased using -a.
#
# This implementation is designed to work on older kernel versions, and without
# kernel debuginfo. It works by dynamic tracing an execve kernel function to
# read the arguments from the %si register. The sys_execve function is tried
# first, then stub_execve and do_execve. The sched:sched_process_fork
# tracepoint is used to get the PPID. This program is a workaround that should be
# improved in the future when other kernel capabilities are made available. If
# you need a more reliable tool now, then consider other tracing alternatives
# (eg, SystemTap). This tool is really a proof of concept to see what ftrace can
# currently do.
#
# From perf-tools: https://github.com/brendangregg/perf-tools
#
# See the execsnoop(8) man page (in perf-tools) for more info.
#
# COPYRIGHT: Copyright (c) 2014 Brendan Gregg.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# (http://www.gnu.org/copyleft/gpl.html)
#
# 07-Jul-2014 Brendan Gregg Created this.
### default variables
tracing=/sys/kernel/debug/tracing
flock=/var/tmp/.ftrace-lock; wroteflock=0
opt_duration=0; duration=; opt_name=0; name=; opt_time=0; opt_reexec=0
opt_argc=0; argc=8; max_argc=16; ftext=
trap ':' INT QUIT TERM PIPE HUP # sends execution to end tracing section
function usage {
cat <<-END >&2
USAGE: execsnoop [-hrt] [-a argc] [-d secs] [name]
-d seconds # trace duration, and use buffers
-a argc # max args to show (default 8)
-r # include re-execs
-t # include time (seconds)
-h # this usage message
name # process name to match (REs allowed)
eg,
execsnoop # watch exec()s live (unbuffered)
execsnoop -d 1 # trace 1 sec (buffered)
execsnoop grep # trace process names containing grep
execsnoop 'udevd$' # process names ending in "udevd"
See the man page and example file for more info.
END
exit
}
function warn {
if ! eval "$@"; then
echo >&2 "WARNING: command failed \"$@\""
fi
}
function end {
# disable tracing
echo 2>/dev/null
echo "Ending tracing..." 2>/dev/null
cd $tracing
warn "echo 0 > events/kprobes/$kname/enable"
warn "echo 0 > events/sched/sched_process_fork/enable"
warn "echo -:$kname >> kprobe_events"
warn "echo > trace"
(( wroteflock )) && warn "rm $flock"
}
function die {
echo >&2 "$@"
exit 1
}
function edie {
# die with a quiet end()
echo >&2 "$@"
exec >/dev/null 2>&1
end
exit 1
}
### process options
while getopts a:d:hrt opt
do
case $opt in
a) opt_argc=1; argc=$OPTARG ;;
d) opt_duration=1; duration=$OPTARG ;;
r) opt_reexec=1 ;;
t) opt_time=1 ;;
h|?) usage ;;
esac
done
shift $(( $OPTIND - 1 ))
if (( $# )); then
opt_name=1
name=$1
shift
fi
(( $# )) && usage
### option logic
(( opt_pid && opt_name )) && die "ERROR: use either -p or -n."
(( opt_pid )) && ftext=" issued by PID $pid"
(( opt_name )) && ftext=" issued by process name \"$name\""
(( opt_file )) && ftext="$ftext for filenames containing \"$file\""
(( opt_argc && argc > max_argc )) && die "ERROR: max -a argc is $max_argc."
if (( opt_duration )); then
echo "Tracing exec()s$ftext for $duration seconds (buffered)..."
else
echo "Tracing exec()s$ftext. Ctrl-C to end."
fi
### select awk
if (( opt_duration )); then
[[ -x /usr/bin/mawk ]] && awk=mawk || awk=awk
else
# workarounds for mawk/gawk fflush behavior
if [[ -x /usr/bin/gawk ]]; then
awk=gawk
elif [[ -x /usr/bin/mawk ]]; then
awk="mawk -W interactive"
else
awk=awk
fi
fi
### check permissions
cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"
### ftrace lock
[[ -e $flock ]] && die "ERROR: ftrace may be in use by PID $(cat $flock) $flock"
echo $$ > $flock || die "ERROR: unable to write $flock."
wroteflock=1
### build probe
if [[ -x /usr/bin/getconf ]]; then
bits=$(getconf LONG_BIT)
else
bits=64
[[ $(uname -m) == i* ]] && bits=32
fi
(( offset = bits / 8 ))
function makeprobe {
func=$1
kname=execsnoop_$func
kprobe="p:$kname $func"
i=0
while (( i < argc + 1 )); do
# p:kname do_execve +0(+0(%si)):string +0(+8(%si)):string ...
kprobe="$kprobe +0(+$(( i * offset ))(%si)):string"
(( i++ ))
done
}
# try in this order: sys_execve, stub_execve, do_execve
makeprobe sys_execve
### setup and begin tracing
echo nop > current_tracer
if ! echo $kprobe >> kprobe_events 2>/dev/null; then
makeprobe stub_execve
if ! echo $kprobe >> kprobe_events 2>/dev/null; then
makeprobe do_execve
if ! echo $kprobe >> kprobe_events 2>/dev/null; then
edie "ERROR: adding a kprobe for execve. Exiting."
fi
fi
fi
if ! echo 1 > events/kprobes/$kname/enable; then
edie "ERROR: enabling kprobe for execve. Exiting."
fi
if ! echo 1 > events/sched/sched_process_fork/enable; then
edie "ERROR: enabling sched:sched_process_fork tracepoint. Exiting."
fi
echo "Instrumenting $func"
(( opt_time )) && printf "%-16s " "TIMEs"
printf "%6s %6s %s\n" "PID" "PPID" "ARGS"
#
# Determine output format. It may be one of the following (newest first):
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# TASK-PID CPU# TIMESTAMP FUNCTION
# To differentiate between them, the number of header fields is counted,
# and an offset set, to skip the extra column when needed.
#
offset=$($awk 'BEGIN { o = 0; }
$1 == "#" && $2 ~ /TASK/ && NF == 6 { o = 1; }
$2 ~ /TASK/ { print o; exit }' trace)
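The header detection above can be sanity-checked standalone. This hedged sketch (not part of the tool) feeds the same awk program a fabricated newer-style header; with six header fields it should report an offset of 1:

```shell
# Fabricated newer-style trace header (six fields after "#"): the awk
# program used above should detect the extra column and print 1.
printf '# tracer: nop\n#  TASK-PID  CPU#  ||||  TIMESTAMP  FUNCTION\n' | awk '
    BEGIN { o = 0 }
    $1 == "#" && $2 ~ /TASK/ && NF == 6 { o = 1 }
    $2 ~ /TASK/ { print o; exit }'
```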
### print trace buffer
warn "echo > trace"
( if (( opt_duration )); then
# wait then dump buffer
sleep $duration
cat -v trace
else
# print buffer live
cat -v trace_pipe
fi ) | $awk -v o=$offset -v opt_name=$opt_name -v name=$name \
-v opt_duration=$opt_duration -v opt_time=$opt_time -v kname=$kname \
-v opt_reexec=$opt_reexec '
# common fields
$1 != "#" {
# task name can contain dashes
comm = pid = $1
sub(/-[0-9][0-9]*/, "", comm)
sub(/.*-/, "", pid)
}
$1 != "#" && $(4+o) ~ /sched_process_fork/ {
cpid=$0
sub(/.* child_pid=/, "", cpid)
sub(/ .*/, "", cpid)
getppid[cpid] = pid
delete seen[pid]
}
$1 != "#" && $(4+o) ~ kname {
if (seen[pid])
next
if (opt_name && comm !~ name)
next
#
# examples:
# ... arg1="/bin/echo" arg2="1" arg3="2" arg4="3" ...
# ... arg1="sleep" arg2="2" arg3=(fault) arg4="" ...
# ... arg1="" arg2=(fault) arg3="" arg4="" ...
# the last example is uncommon, and may be a race.
#
if ($0 ~ /arg1=""/) {
args = comm " [?]"
} else {
args=$0
sub(/ arg[0-9]*=\(fault\).*/, "", args)
sub(/.*arg1="/, "", args)
gsub(/" arg[0-9]*="/, " ", args)
sub(/"$/, "", args)
if ($0 !~ /\(fault\)/)
args = args " [...]"
}
if (opt_time) {
time = $(3+o); sub(":", "", time)
printf "%-16s ", time
}
printf "%6s %6d %s\n", pid, getppid[pid], args
if (!opt_duration)
fflush()
if (!opt_reexec) {
seen[pid] = 1
delete getppid[pid]
}
}
$0 ~ /LOST.*EVENT[S]/ { print "WARNING: " $0 > "/dev/stderr" }
'
### end tracing
end
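The awk argument parsing in the trace pipeline above can be exercised on its own. This hedged sketch (not part of the tool) runs one fabricated kprobe output line, with a faulted arg3, through the same substitutions:

```shell
# Fabricated kprobe output line; the substitutions mirror the awk block in
# execsnoop: drop faulted (unread) args, strip the arg1=" prefix, then
# join the remaining quoted args with spaces.
echo 'sleep-1234 [000] 1.00: execsnoop_sys_execve: arg1="sleep" arg2="2" arg3=(fault)' |
awk '{
    args = $0
    sub(/ arg[0-9]*=\(fault\).*/, "", args)   # remove faulted args onward
    sub(/.*arg1="/, "", args)                 # strip everything before arg1
    gsub(/" arg[0-9]*="/, " ", args)          # join args with spaces
    sub(/"$/, "", args)                       # drop the trailing quote
    print args                                # -> sleep 2
}'
```

Since $0 contained "(fault)", the real script knows the argument list was fully captured and does not append " [...]".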
perf-tools-unstable-1.0.1~20200130+git49b8cdf/fs/cachestat
#!/bin/bash
#
# cachestat - show Linux page cache hit/miss statistics.
# Uses Linux ftrace.
#
# This is a proof of concept using Linux ftrace capabilities on older kernels,
# and works by using function profiling for in-kernel counters. Specifically,
# four kernel functions are traced:
#
# mark_page_accessed() for measuring cache accesses
# mark_buffer_dirty() for measuring cache writes
# add_to_page_cache_lru() for measuring page additions
# account_page_dirtied() for measuring page dirties
#
# It is possible that these functions have been renamed (or are different
# logically) for your kernel version, and this script will not work as-is.
# This script was written on Linux 3.13. This script is a sandcastle: the
# kernel may wash some away, and you'll need to rebuild.
#
# USAGE: cachestat [-Dht] [interval]
# eg,
# cachestat 5 # show stats every 5 seconds
#
# Run "cachestat -h" for full usage.
#
# WARNING: This uses dynamic tracing of kernel functions, and could cause
# kernel panics or freezes. Test, and know what you are doing, before use.
# It also traces cache activity, which can be frequent, and cost some overhead.
# The statistics should be treated as best-effort: there may be some error
# margin depending on unusual workload types.
#
# REQUIREMENTS: CONFIG_FUNCTION_PROFILER, awk.
#
# From perf-tools: https://github.com/brendangregg/perf-tools
#
# COPYRIGHT: Copyright (c) 2014 Brendan Gregg.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# (http://www.gnu.org/copyleft/gpl.html)
#
# 28-Dec-2014 Brendan Gregg Created this.
### default variables
tracing=/sys/kernel/debug/tracing
interval=1; opt_timestamp=0; opt_debug=0
opt_duration=0; duration=0	# not settable via options; initialized for the loop test below
trap 'quit=1' INT QUIT TERM PIPE HUP # sends execution to end tracing section
function usage {
cat <<-END >&2
USAGE: cachestat [-Dht] [interval]
-D # print debug counters
-h # this usage message
-t # include timestamp
interval # output interval in secs (default 1)
eg,
cachestat # show stats every second
cachestat 5 # show stats every 5 seconds
See the man page and example file for more info.
END
exit
}
function warn {
if ! eval "$@"; then
echo >&2 "WARNING: command failed \"$@\""
fi
}
function die {
echo >&2 "$@"
exit 1
}
### process options
while getopts Dht opt
do
case $opt in
D) opt_debug=1 ;;
t) opt_timestamp=1 ;;
h|?) usage ;;
esac
done
shift $(( $OPTIND - 1 ))
### option logic
if (( $# )); then
interval=$1
fi
echo "Counting cache functions... Output every $interval seconds."
### check permissions
cd $tracing || die "ERROR: accessing tracing. Root user? Kernel has FTRACE?
debugfs mounted? (mount -t debugfs debugfs /sys/kernel/debug)"
### enable tracing
sysctl -q kernel.ftrace_enabled=1 # doesn't set exit status
printf "mark_page_accessed\nmark_buffer_dirty\nadd_to_page_cache_lru\naccount_page_dirtied\n" > set_ftrace_filter || \
die "ERROR: tracing these four kernel functions: mark_page_accessed," \
"mark_buffer_dirty, add_to_page_cache_lru and account_page_dirtied (unknown kernel version?). Exiting."
warn "echo nop > current_tracer"
if ! echo 1 > function_profile_enabled; then
echo > set_ftrace_filter
die "ERROR: enabling function profiling. Have CONFIG_FUNCTION_PROFILER? Exiting."
fi
(( opt_timestamp )) && printf "%-8s " TIME
printf "%8s %8s %8s %8s %12s %10s" HITS MISSES DIRTIES RATIO "BUFFERS_MB" "CACHE_MB"
(( opt_debug )) && printf " DEBUG"
echo
### summarize
quit=0; secs=0
while (( !quit && (!opt_duration || secs < duration) )); do
(( secs += interval ))
echo 0 > function_profile_enabled	# disable then re-enable to reset the counters
echo 1 > function_profile_enabled
sleep $interval
(( opt_timestamp )) && printf "%(%H:%M:%S)T " -1
# cat both meminfo and trace stats, and let awk pick them apart
cat /proc/meminfo trace_stat/function* | awk -v debug=$opt_debug '
# match meminfo stats:
$1 == "Buffers:" && $3 == "kB" { buffers_mb = $2 / 1024 }
$1 == "Cached:" && $3 == "kB" { cached_mb = $2 / 1024 }
# identify and save trace counts:
$2 ~ /[0-9]/ && $3 != "kB" { a[$1] += $2 }
END {
mpa = a["mark_page_accessed"]
mbd = a["mark_buffer_dirty"]
apcl = a["add_to_page_cache_lru"]
apd = a["account_page_dirtied"]
total = mpa - mbd
misses = apcl - apd
if (misses < 0)
misses = 0
hits = total - misses
# guard against a zero total (no cache activity traced this interval)
ratio = total ? 100 * hits / total : 0
printf "%8d %8d %8d %7.1f%% %12.0f %10.0f", hits, misses, mbd,
ratio, buffers_mb, cached_mb
if (debug)
printf " (%d %d %d %d)", mpa, mbd, apcl, apd
printf "\n"
}'
done
### end tracing
echo 2>/dev/null
echo "Ending tracing..." 2>/dev/null
warn "echo 0 > function_profile_enabled"
warn "echo > set_ftrace_filter"
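The hit/miss arithmetic in the awk END block lends itself to a worked example. The counter values below are made up for illustration; the formulas are the ones the script applies to the trace_stat counts:

```shell
# Made-up counter values run through cachestat's formulas:
#   total (reads) = mark_page_accessed - mark_buffer_dirty
#   misses        = add_to_page_cache_lru - account_page_dirtied
awk 'BEGIN {
    mpa = 1000; mbd = 100; apcl = 300; apd = 50
    total = mpa - mbd              # 900 cache reads
    misses = apcl - apd            # 250 pages added to satisfy reads
    if (misses < 0) misses = 0
    hits = total - misses          # 650
    printf "hits=%d misses=%d ratio=%.1f%%\n", hits, misses, 100 * hits / total
}'
```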
perf-tools-unstable-1.0.1~20200130+git49b8cdf/images/perf-tools_2016.png
[binary PNG image data omitted]