      1 <?xml version="1.0" encoding='ISO-8859-1'?>
      2 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
      3 
      4 <book id="oprofile-guide">
      5 <bookinfo>
      6 	<title>OProfile manual</title>
      7  
      8 	<authorgroup>
      9 		<author>
     10 			<firstname>John</firstname>
     11 			<surname>Levon</surname>
     12 			<affiliation>
     13 				<address><email>levon@movementarian.org</email></address>
     14 			</affiliation>
     15 		</author>
     16 	</authorgroup>
     17 
     18 	<copyright>
     19 		<year>2000-2004</year>
     20 		<holder>Victoria University of Manchester, John Levon and others</holder>
     21 	</copyright>
     22 </bookinfo>
     23 
     24 <toc></toc>
     25 
     26 <chapter id="introduction">
     27 <title>Introduction</title>
     28 
     29 <para>
     30 This manual applies to OProfile version <oprofileversion />.
     31 OProfile is a profiling system for Linux 2.2/2.4/2.6 systems on a number of architectures. It is capable of profiling
     32 all parts of a running system, from the kernel (including modules and interrupt handlers) to shared libraries
     33 to binaries. It runs transparently in the background collecting information at a low overhead. These
     34 features make it well suited to profiling entire systems and finding bottlenecks under real-world conditions.
     35 </para>
     36 <para>
     37 Many CPUs provide "performance counters": hardware registers that can count "events" such as
     38 cache misses or CPU cycles. OProfile builds profiles of code from the number of these events that occur:
     39 each time a certain (configurable) number of events has occurred, the PC (program counter) value is recorded.
     40 This information is aggregated into profiles for each binary image.</para>
     41 <para>
     42 Some hardware setups do not allow OProfile to use performance counters: in these cases, no
     43 events are available, and OProfile operates in timer/RTC mode, as described in later chapters.
     44 </para>
     45 <sect1 id="applications">
     46 <title>Applications of OProfile</title>
     47 <para>
     48 OProfile is useful in a number of situations. You might want to use OProfile when you :
     49 </para>
     50 <itemizedlist>
     51 <listitem><para>need low overhead</para></listitem>
     52 <listitem><para>cannot use highly intrusive profiling methods</para></listitem>
     53 <listitem><para>need to profile interrupt handlers</para></listitem>
     54 <listitem><para>need to profile an application and its shared libraries</para></listitem>
     55 <listitem><para>need to profile dynamically compiled code of supported virtual machines (see <xref linkend="jitsupport"/>)</para></listitem>
     56 <listitem><para>need to capture the performance behaviour of an entire system</para></listitem>
     57 <listitem><para>want to examine hardware effects such as cache misses</para></listitem>
     58 <listitem><para>want detailed source annotation</para></listitem>
     59 <listitem><para>want instruction-level profiles</para></listitem>
     60 <listitem><para>want call-graph profiles</para></listitem>
     61 </itemizedlist>
     62 <para>
     63 OProfile is not a panacea. OProfile might not be a complete solution when you :
     64 </para>
     65 <itemizedlist>
     66 <listitem><para>require call graph profiles on platforms other than 2.6/x86</para></listitem>
     67 <listitem><para>don't have root permissions</para></listitem>
     68 <listitem><para>require 100% instruction-accurate profiles</para></listitem>
     69 <listitem><para>need function call counts or an interstitial profiling API</para></listitem>
     70 <listitem><para>cannot tolerate any disturbance to the system whatsoever</para></listitem>
     71 <listitem><para>need to profile interpreted or dynamically compiled code of non-supported virtual machines</para></listitem>
     72 </itemizedlist>
     73 <sect2 id="jitsupport">
     74 <title>Support for dynamically compiled (JIT) code</title>
     75 <para>
     76 Older versions of OProfile were not capable of attributing samples to symbols from dynamically
     77 compiled code, i.e. "just-in-time (JIT) code". Typical JIT compilers load the JIT code into
     78 anonymous memory regions. OProfile reported the samples from such code, but the attribution
     79 provided was simply:
     80         <screen>"anon: &lt;tgid&gt;&lt;address range&gt;" </screen>
     81 Due to this limitation, it wasn't possible to profile applications executed by virtual machines (VMs)
     82 like the Java Virtual Machine. OProfile now contains an infrastructure to support JITed code.
     83 A development library is provided to allow developers
     84 to add support for any VM that produces dynamically compiled code (see the <emphasis>OProfile JIT agent
     85 developer guide</emphasis>).
     86 In addition, built-in support is included for the following:</para>
     87 <itemizedlist><listitem><para>JVMTI agent library for Java (1.5 and higher)</para></listitem>
     88 <listitem><para>JVMPI agent library for Java (1.5 and lower)</para></listitem>
     89 </itemizedlist>
     90 <para>
     91 For information on how to use OProfile's JIT support, see <xref linkend="setup-jit"/>.
     92 </para>
     93 </sect2>
     94 </sect1>
     95 
     96 <sect1 id="requirements">
     97 <title>System requirements</title>
     98 
     99 <variablelist>
    100 	<varlistentry>
    101 		<term>Linux kernel 2.2/2.4/2.6</term>
    102 		<listitem><para>
    103 			OProfile uses a kernel module that can be compiled for
    104 			2.2.11 or later and for 2.4. Kernel 2.4.10 or above is required if you use the 
    105 			boot-time kernel option <option>nosmp</option>.  2.6 kernels are supported with the in-kernel
    106 			OProfile driver. Note that only 32-bit x86 and IA64 are supported on 2.2/2.4 kernels.
    107 			</para>
    108 
    109 			<para>
    110 			2.6 kernels are strongly recommended. Under 2.4, OProfile may cause system crashes if power
    111 			management is used, or the BIOS does not correctly deal with local APICs.
    112 			</para>
    113 
    114 			<para>
    115 			To use OProfile's JIT support, kernel version 2.6.13 or later is required.
    116 			Earlier kernel versions do not report anonymous memory regions to OProfile, which results
    117 			in profiling reports without any samples in these regions.
    118 			</para>
    119 
    120 			<para>
    121 			PPC64 processors (Power4/Power5/PPC970, etc.) require a recent (&gt; 2.6.5) kernel with the line 
    122 			<constant>#define PV_970</constant> present in <filename>include/asm-ppc64/processor.h</filename>.
    123 <!-- FIXME: do we require always gte 2.4.10 for nosmp ? -->
    124                        </para>
    125                        <para>
    126                        Profiling the Cell Broadband Engine PowerPC Processing Element (PPE) requires a kernel version
    127                        of 2.6.18 or more recent.
    128                        Profiling the Cell Broadband Engine Synergistic Processing Element (SPE) requires a kernel version
    129                        of 2.6.22 or more recent.  Additionally, full support of SPE profiling requires a BFD library
    130                        from binutils code dated January 2007 or later.  To ensure the proper BFD support exists, run
    131                        the <code>configure</code> utility with <code>--with-target=cell-be</code>.
    132 
    133 		       Profiling the Cell Broadband Engine using SPU events requires a kernel version of 2.6.29-rc1
    134 		       or  more recent.
    135 
    136                        <note><para>Attempting to profile SPEs with kernel versions older than 2.6.22 may cause the
    137                        system to crash.</para></note>
    138                        </para>
    139 		
    140 			<para>
    141 			Instruction-Based Sampling (IBS) profiling on AMD family10h processors requires 
    142 			kernel version 2.6.28-rc2 or later.
    143 			</para>
    144 		</listitem>
    145 	</varlistentry>
    146 	<varlistentry>
    147 		<term>modutils 2.4.6 or above</term>
    148 		<listitem><para>
    149 			You should have modutils 2.4.6 or higher installed (in practice, earlier versions work well in almost all
    150 			cases).
    151 		</para></listitem>
    152 	</varlistentry>
    153 	<varlistentry>
    154 		<term>Supported architecture</term>
    155 		<listitem><para>
    156 			For Intel IA32, a CPU with either a P6 generation or Pentium 4 core is
    157 			required. In marketing terms this translates to anything
    158 			between an Intel Pentium Pro (not Pentium Classics) and
    159 			a Pentium 4 / Xeon, including all Celerons.  The AMD
    160 			Athlon, Opteron, Phenom, and Turion CPUs are also supported.  Other IA32
    161 			CPU types only support the RTC mode of OProfile; please
    162 			see later in this manual for details.  Hyper-threaded Pentium 4 CPUs
    163 			are not supported in 2.4. For 2.4 kernels, the Intel
    164 			IA-64 CPUs are also supported. For 2.6 kernels, there is additionally
    165 			support for Alpha processors, MIPS, ARM, x86-64, sparc64, ppc64, AVR32, and,
    166 			in timer mode, PA-RISC and s390.
    167 		</para></listitem>
    168 	</varlistentry>
    169 	<varlistentry>
    170 		<term>Uniprocessor or SMP</term>
    171 		<listitem><para>
    172 			SMP machines are fully supported.
    173 		</para></listitem>
    174 	</varlistentry>
    175 	<varlistentry>
    176 		<term>Required libraries</term>
    177 		<listitem><para>
    178 			These libraries are required : <filename>popt</filename>, <filename>bfd</filename>,
    179 			<filename>liberty</filename> (Debian users: libiberty is provided in the binutils-dev package), <filename>dl</filename>,
    180 			plus the standard C++ libraries.
    181 		</para></listitem>
    182 	</varlistentry>
    183 	<varlistentry>
    184 		<term>Required user account</term>
    185 		<listitem><para>
    186 			For secure processing of sample data from JIT virtual machines (e.g., Java),
    187 			the special user account "oprofile" must exist on the system.  The 'configure'
    188 			and 'make install' operations will print warning messages if this
    189 			account is not found.  If you intend to profile JITed code, you must create
    190 			a group account named 'oprofile' and then create the 'oprofile' user account,
    191 			setting the default group to 'oprofile'.  A runtime error message is printed to
    192 			the oprofile daemon log when processing JIT samples if this special user
    193 			account cannot be found.
    194 		</para></listitem>
    195 	</varlistentry>
    196 	<varlistentry>
    197 		<term>OProfile GUI</term>
    198 		<listitem><para>
    199 			The use of the GUI to start the profiler requires the <filename>Qt 2</filename> library. <filename>Qt 3</filename> should
    200 			also work.
    201 		</para></listitem>
    202 	</varlistentry>
    203 	<varlistentry>
    204  		<term><acronym>ELF</acronym></term>
    205 		<listitem><para>
    206 			Probably not too strenuous a requirement, but older <acronym>A.OUT</acronym> binaries/libraries are not supported.
    207 		</para></listitem>
    208 	</varlistentry>
    209 	<varlistentry>
    210 		<term>K&amp;R coding style</term>
    211 		<listitem><para>
    212 			OK, so it's not really a requirement, but I wish it was...
    213 		</para></listitem>
    214 	</varlistentry>
    215 </variablelist>
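<para>
As a minimal sketch of the "Required user account" step above, the group and user could be created
as root with the standard <command>groupadd</command> and <command>useradd</command> utilities.
The exact commands and any further options (home directory, login shell, and so on) vary between
distributions, so treat this as an assumption to adapt rather than a required procedure:
</para>
<screen>
# groupadd oprofile
# useradd -g oprofile oprofile
</screen>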
    216 
    217 
    218 </sect1>
    219 
    220 <sect1 id="resources">
    221 <title>Internet resources</title>
    222 
    223 <variablelist>
    224 	<varlistentry>
    225 		<term>Web page</term>
    226 		<listitem><para>
    227 			There is a web page (which you may be reading now) at
    228 			<ulink url="http://oprofile.sf.net/">http://oprofile.sf.net/</ulink>.
    229 		</para></listitem>
    230 	</varlistentry>
    231 	<varlistentry>
    232 		<term>Download</term>
    233 		<listitem><para>
    234 			You can download a source tarball or use anonymous CVS access at the SourceForge page,
    235 			<ulink url="http://sf.net/projects/oprofile/">http://sf.net/projects/oprofile/</ulink>.
    236 		</para></listitem>
    237 	</varlistentry>
    238 	<varlistentry>
    239 		<term>Mailing list</term>
    240 		<listitem><para>
    241 			There is a low-traffic OProfile-specific mailing list, details at
    242 			<ulink url="http://sf.net/mail/?group_id=16191">http://sf.net/mail/?group_id=16191</ulink>.
    243 		</para></listitem>
    244 	</varlistentry>
    245 	<varlistentry>
    246 		<term>Bug tracker</term>
    247 		<listitem><para>
    248 			There is a bug tracker for OProfile at SourceForge,
    249 			<ulink url="http://sf.net/tracker/?group_id=16191&amp;atid=116191">http://sf.net/tracker/?group_id=16191&amp;atid=116191</ulink>.
    250 		</para></listitem>
    251 	</varlistentry>
    252 	<varlistentry>
    253 		<term>IRC channel</term>
    254 		<listitem><para>
    255 			Several OProfile developers and users sometimes hang out on channel <command>#oprofile</command>
    256 			on the <ulink url="http://oftc.net">OFTC</ulink> network. 
    257 		</para></listitem>
    258 	</varlistentry>
    259 </variablelist>
    260 
    261 </sect1>
    262 
    263 <sect1 id="install">
    264 <title>Installation</title>
    265 
    266 <para>
    267 First you need to build OProfile and install it. <command>./configure</command>, <command>make</command>, <command>make install</command>
    268 is often all you need, but note these arguments to <command>./configure</command> :
    269 </para>
    270 <variablelist>
    271 	<varlistentry>
    272 		<term><option>--with-linux</option></term>
    273 		<listitem><para>
    274 			Use this option to specify the location of the kernel source tree you wish
    275 			to compile against. The kernel module is built against this source and
    276 			will only work with a running kernel built from the same source with
    277 			exactly the same options, so it is important to specify this option if you need
    278 			to.
    279 		</para></listitem>
    280 	</varlistentry>
    281 	<varlistentry>
    282 		<term><option>--with-java</option></term>
    283 		<listitem>
    284 			<para>
    285 			Use this option if you need to profile Java applications.  Also, see
    286 			<xref linkend="requirements"/>, "Required user account".  This option
    287 			is used to specify the location of the Java Development Kit (JDK)
    288 			source tree you wish to use. This is necessary to get the interface description
    289 			of the JVMPI (or JVMTI) interface to compile the JIT support code successfully.
    290 			</para>
    291 			<note>
    292 				<para>
    293 				The Java Runtime Environment (JRE) does not include the development
    294 				files that are required to compile the JIT support code, so the full
    295 				JDK must be installed in order to use this option.
    296 				</para>
    297 			</note>
    298 			<para>
    299 			By default, the OProfile JIT support libraries will be installed in
    300 			<filename>&lt;oprof_install_dir&gt;/lib/oprofile</filename>.  To build
    301 			and install OProfile and the JIT support libraries as 64-bit, you can
    302 			do something like the following:
    303 			<screen>
    304 			# CFLAGS="-m64" CXXFLAGS="-m64" ./configure \
    305 			--with-kernel-support --with-java={my_jdk_installdir} \
    306 			--libdir=/usr/local/lib64
    307 			</screen>
    308 			</para>
    309 			<note>
    310 				<para>
    311 				If you encounter errors building 64-bit, you should
    312 				install libtool 1.5.26 or later since that release of
    313 				libtool fixes known problems for certain platforms.
    314 				If you install libtool into a non-standard location,
    315 				you'll need to edit the invocation of 'aclocal' in
    316 				OProfile's autogen.sh as follows (assume an install
    317 				location of /usr/local):
    318 				</para>
    319 				<para>
    320 				<code>aclocal -I m4 -I /usr/local/share/aclocal</code>
    321 				</para>
    322 			</note> 
    323 		</listitem>
    324 	</varlistentry>
    325 	<varlistentry>
    326 		<term><option>--with-kernel-support</option></term>
    327 		<listitem><para>
    328 			Use this option with kernel 2.6 and above to indicate that the
    329 			kernel provides the OProfile device driver.
    330 		</para></listitem>
    331 	</varlistentry>
    332 	<varlistentry>
    333 		<term><option>--with-qt-dir/includes/libraries</option></term>
    334 		<listitem><para>
    335 			Specify the location of Qt headers and libraries. It defaults to searching in
    336 			<constant>$QTDIR</constant> if these are not specified.
    337 		</para></listitem>
    338 	</varlistentry>
    339 	<varlistentry id="disable-werror">
    340 		<term><option>--disable-werror</option></term>
    341 		<listitem><para>
    342 			Development versions of OProfile build by
    343 			default with <option>-Werror</option>. This option turns
    344 			<option>-Werror</option> off.
    345 		</para></listitem>
    346 	</varlistentry>
    347 	<varlistentry id="disable-optimization">
    348 		<term><option>--disable-optimization</option></term>
    349 		<listitem><para>
    350 			Disable the <option>-O2</option> compiler flag
    351 			(useful if you discover an OProfile bug and want to give a useful
    352 			back-trace etc.)
    353 		</para></listitem>
    354 	</varlistentry>
    355 </variablelist>
    356 <para>
    357 You'll need a configured kernel source tree for the current kernel
    358 to build the module for 2.4 kernels.  Since every distribution ships a different kernel, the running kernel is unlikely to match the configured source
    359 you installed. The safest approach is to compile your own kernel, boot it, and then compile OProfile against it. If you have a
    360 uniprocessor machine, it is also recommended that you enable local APIC / IO_APIC support in
    361 your kernel (this is enabled automatically for SMP kernels). With many BIOSes, on a uniprocessor (UP) kernel &gt;= 2.6.9, enabling local APIC support is not sufficient: you must also turn it on explicitly at boot time by passing the "lapic" option to the kernel. On
    362 machines with power management, such as laptops, power management
    363 must be turned off when using OProfile with 2.4 kernels, because the power management software
    364 in the BIOS cannot handle the non-maskable interrupts (NMIs) used by
    365 OProfile for data collection. If you use the NMI watchdog, be aware that
    366 the watchdog is disabled when profiling starts, and not re-enabled until the
    367 OProfile module is removed (or, in 2.6, when OProfile is not running). If you compile OProfile for
    368 a 2.2 kernel, you must be root to compile the module. If you are using
    369 2.6 kernels or higher, you do not need kernel source, as long as the
    370 OProfile driver is enabled; additionally, you should not need to disable
    371 power management.
    372 </para>
    373 <para>
    374 Please note that you must save or have available the <filename>vmlinux</filename> file
    375 generated during a kernel compile, as OProfile needs it (you can use
    376 <option>--no-vmlinux</option>, but this will prevent kernel profiling).
    377 </para>
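<para>
Putting the steps in this section together, a build might look like the following sketch. The
kernel source path <filename>/usr/src/linux</filename> is only an illustrative assumption; substitute
the tree your running kernel was built from. For a 2.4 kernel:
</para>
<screen>
$ ./configure --with-linux=/usr/src/linux
$ make
# make install
</screen>
<para>
For a 2.6 kernel with the in-kernel OProfile driver enabled, no kernel source is needed:
</para>
<screen>
$ ./configure --with-kernel-support
$ make
# make install
</screen>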
    378 
    379 </sect1>
    380 
    381 <sect1 id="uninstall">
    382 <title>Uninstalling OProfile</title>
    383 <para>
    384 You must have the source tree available to uninstall OProfile; a <command>make uninstall</command> will
    385 remove all installed files except your configuration file in the directory <filename>~/.oprofile</filename>.
    386 </para>
    387 </sect1>
    388 
    389 </chapter>
    390 
    391 <chapter id="overview"> 
    392 <title>Overview</title>
    393 
    394 <sect1 id="getting-started">
    395 <title>Getting started</title>
    396 <para>
    397 Before you can use OProfile, you must set it up. The minimum setup required for this
    398 is to tell OProfile where the <filename>vmlinux</filename> file corresponding to the
    399 running kernel is, for example :
    400 </para>
    401 <screen>opcontrol --vmlinux=/boot/vmlinux-`uname -r`</screen>
    402 <para>
    403 If you don't want to profile the kernel itself,
    404 you can tell OProfile you don't have a <filename>vmlinux</filename> file :
    405 </para>
    406 <screen>opcontrol --no-vmlinux</screen>
    407 <para>
    408 Now we are ready to start the daemon (<command>oprofiled</command>) which collects
    409 the profile data :
    410 </para>
    411 <screen>opcontrol --start</screen>
    412 <para>
    413 When you want to stop profiling, you can do so with :
    414 </para>
    415 <screen>opcontrol --shutdown</screen>
    416 <para>
    417 Note that unlike <command>gprof</command>, no instrumentation (<option>-pg</option>
    418 and <option>-a</option> options to <command>gcc</command>)
    419 is necessary.
    420 </para>
    421 <para>
    422 Periodically (or on <command>opcontrol --shutdown</command> or <command>opcontrol --dump</command>)
    423 the profile data is written out into the $SESSION_DIR/samples directory (by default at <filename>/var/lib/oprofile/samples</filename>).
    424 These profile files cover shared libraries, applications, the kernel (vmlinux), and kernel modules.
    425 You can clear the profile data (at any time) with <command>opcontrol --reset</command>.
    426 </para>
    427 <para>
    428 To place these sample database files in a specific directory instead of the default location (<filename>/var/lib/oprofile</filename>), use the <option>--session-dir=dir</option> option. You must also pass <option>--session-dir</option> to subsequent commands so that the tools continue to use this directory (a future version may allow this to be specified in an environment variable) :
    429 </para>
    430 <screen>opcontrol --no-vmlinux --session-dir=/home/me/tmpsession</screen>
    431 <screen>opcontrol --start --session-dir=/home/me/tmpsession</screen>
    432 <para>
    433 You can get summaries of this data in a number of ways at any time. To get a summary of
    434 data across the entire system for all of these profiles, you can do :
    435 </para>
    436 <screen>opreport [--session-dir=dir]</screen>
    437 <para>
    438 Or to get a more detailed summary, for a particular image, you can do something like :
    439 </para>
    440 <screen>opreport -l /boot/vmlinux-`uname -r`</screen>
    441 <para>
    442 There are also a number of other ways of presenting the data, as described later in this manual.
    443 Note that OProfile will choose a default profiling setup for you. However, there are a number
    444 of options you can pass to <command>opcontrol</command> if you need to change something,
    445 also detailed later.
    446 </para>
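<para>
The commands in this section can be combined into one short session. The following is only a sketch:
<command>./my_workload</command> is a hypothetical placeholder for whatever you want to profile, and
the location of the <filename>vmlinux</filename> file depends on your distribution or kernel build.
</para>
<screen>
opcontrol --vmlinux=/boot/vmlinux-`uname -r`
opcontrol --reset
opcontrol --start
./my_workload
opcontrol --dump
opreport
opreport -l /boot/vmlinux-`uname -r`
opcontrol --shutdown
</screen>
<para>
Running <command>opcontrol --dump</command> before <command>opreport</command> makes sure the samples
collected so far have been written out before the summary is generated.
</para>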
    447 
    448 </sect1>
    449 
    450 <sect1 id="tools-overview">
    451 <title>Tools summary</title>
    452 <para>
    453 This section gives a brief description of the available OProfile utilities and their purpose.
    454 </para>
    455 <variablelist>
    456 <varlistentry>
    457 	<term><filename>ophelp</filename></term>
    458 	<listitem><para>
    459 		This utility lists the available events and short descriptions.
    460 	</para></listitem>
    461 </varlistentry>
    462 	
    463 <varlistentry>
    464 	<term><filename>opcontrol</filename></term>
    465 	<listitem><para>
    466 		Used for controlling the OProfile data collection, discussed in <xref linkend="controlling" />.
    467 	</para></listitem>
    468 </varlistentry>
    469 
    470 <varlistentry>
    471 	<term><filename>agent libraries</filename></term>
    472 	<listitem><para>
    473 			Used by virtual machines (like the Java VM) to record information about JITed code being profiled. See <xref linkend="setup-jit" />.
    474 		</para></listitem>
    475 </varlistentry>
    476 
    477 <varlistentry>
    478 	<term><filename>opreport</filename></term>
    479 	<listitem><para>
    480 		This is the main tool for retrieving useful profile data, described in
    481 		<xref linkend="opreport" />.
    482 	</para></listitem>
    483 </varlistentry>
    484 
    485 <varlistentry>
    486 	<term><filename>opannotate</filename></term>
    487 	<listitem><para>
    488 		This utility can be used to produce annotated source, assembly or mixed source/assembly.
    489 		Source level annotation is available only if the application was compiled with 
    490 		debugging symbols. See <xref linkend="opannotate" />.
    491 	</para></listitem>
    492 </varlistentry>
    493 
    494 <varlistentry>
    495 	<term><filename>opgprof</filename></term>
    496 	<listitem><para>
    497 		This utility can output gprof-style data files for a binary, for use with
    498 		<command>gprof -p</command>. See <xref linkend="opgprof" />.
    499 	</para></listitem>
    500 </varlistentry>
    501 
    502 <varlistentry>
    503 	<term><filename>oparchive</filename></term>
    504 	<listitem><para>
    505 		This utility can be used to collect executables, debuginfo,
    506 		and sample files and copy the files into an archive.
    507 		The archive is self-contained and can be moved to another
    508 		machine for further analysis.
    509 		See <xref linkend="oparchive" />.
    510 	</para></listitem>
    511 </varlistentry>
    512 
    513 <varlistentry>
    514 	<term><filename>opimport</filename></term>
    515 	<listitem><para>
    516 		This utility converts sample database files from a foreign binary format (ABI) to
    517 		the native format. This is useful only when moving sample files between hosts,
    518 		for analysis on platforms other than the one used for collection.
    519 		See <xref linkend="opimport" />.
    520 	</para></listitem>
    521 </varlistentry>
    522 
    523 </variablelist>
    524 </sect1>
    525 	
    526 </chapter>
    527  
    528 <chapter id="controlling">
    529 <title>Controlling the profiler</title>
    530 
    531 <sect1 id="controlling-daemon">
    532 <title>Using <command>opcontrol</command></title>
    533 <para>
    534 In this section we describe the configuration and control of the profiling system
    535 with opcontrol in more depth.
    536 The <command>opcontrol</command> script has a default setup, but you
    537 can alter this with the options given below. In particular,
    538 if your hardware supports performance counters, you can configure them.
    539 There are a number of counters (for example, counter 0 and counter 1
    540 on the Pentium III). Each of these counters can be programmed with
    541 an event to count, such as cache misses or MMX operations. The event
    542 chosen for each counter is reflected in the profile data collected
    543 by OProfile: functions and binaries at the top of the profiles reflect
    544 that most of the chosen events happened within that code.
    545 </para>
    546 <para>
    547 Additionally, each counter has a "count" value: this corresponds to how
    548 detailed the profile is. The lower the value, the more frequently profile
    549 samples are taken. A counter can choose to sample only kernel code, user-space code,
    550 or both (both is the default). Finally, some events have a "unit mask"
    551 - this is a value that further restricts the types of event that are counted. 
    552 The event types and unit masks for your CPU are listed by <command>opcontrol
    553 --list-events</command>.
    554 </para>
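<para>
As an illustration of how an event, its count, its unit mask, and the kernel/user selection fit together, the
following sketch first lists the events for the current CPU and then selects one. The event name
<constant>CPU_CLK_UNHALTED</constant>, the count of 100000, and the unit mask of 0 are assumptions chosen
for illustration; use values that <command>opcontrol --list-events</command> reports as valid for your
hardware. The fields after the event name are the count, the unit mask, and two flags selecting kernel
and user-space profiling; the exact syntax is described in <xref linkend="eventspec" />.
</para>
<screen>
opcontrol --list-events
opcontrol --event=CPU_CLK_UNHALTED:100000:0:1:1
</screen>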
    555 <para>
    556 The <command>opcontrol</command> script provides the following actions :
    557 </para>
    558 <variablelist>
    559 	<varlistentry>
    560 		<term><option>--init</option></term>
    561 		<listitem><para>
    562 		Loads the OProfile module if required and makes the OProfile driver
    563 		interface available.
    564 		</para></listitem>
    565 	</varlistentry>
    566 	<varlistentry>
    567 		<term><option>--setup</option></term>
    568 		<listitem><para>
    569 		    Followed by a list of setup arguments for profiling. The arguments are
    570 		    saved in <filename>/root/.oprofile/daemonrc</filename>.
    571 		    Giving this option is not necessary; you can pass any
    572 		    of the setup options directly, e.g. <command>opcontrol --no-vmlinux</command>.
    573 		  </para></listitem>
    574 	</varlistentry>
    575 	<varlistentry>
    576 		<term><option>--status</option></term>
    577 		<listitem><para>
    578 		Show configuration information.
    579 		</para></listitem>
    580 	</varlistentry>
    581 	<varlistentry>
    582 		<term><option>--start-daemon</option></term>
    583 		<listitem><para>
    584 		    Start the oprofile daemon without starting actual profiling. The profiling
    585 		can then be started using <option>--start</option>. This is useful for avoiding
    586 		measuring the cost of daemon startup, as <option>--start</option> is a simple
    587 		write to a file in oprofilefs. Not available in 2.2/2.4 kernels.
    588 		</para></listitem>
    589 	</varlistentry>
    590 	<varlistentry>
    591 		<term><option>--start</option></term>
    592 		<listitem><para>
    593 		    Start data collection with either arguments provided by <option>--setup</option>
    594 		or information saved in <filename>/root/.oprofile/daemonrc</filename>. Additionally specifying
    595 		<option>--verbose</option> makes the daemon generate a large amount of debug data
    596 		while it is running.
    597 		</para></listitem>
    598 	</varlistentry>
    599 	<varlistentry>
    600 		<term><option>--dump</option></term>
    601 		<listitem><para>
    602 		    Force a flush of the collected profiling data to the daemon.
    603 		</para></listitem>
    604 	</varlistentry>
    605 	<varlistentry>
    606 		<term><option>--stop</option></term>
    607 		<listitem><para>
    608 		    Stop data collection (this separate step is not possible with 2.2 or 2.4 kernels).
    609 		</para></listitem>
    610 	</varlistentry>
    611 	<varlistentry>
    612 		<term><option>--shutdown</option></term>
    613 		<listitem><para>
    614 		    Stop data collection and kill the daemon.
    615 		</para></listitem>
    616 	</varlistentry>
    617 	<varlistentry>
    618 		<term><option>--reset</option></term>
    619 		<listitem><para>
    620 		    Clears out data from current session, but leaves saved sessions.
    621 		</para></listitem>
    622 	</varlistentry>
    623 	<varlistentry>
    624 		<term><option>--save=</option>session_name</term>
    625 		<listitem><para>
    626 		    Save data from current session to session_name.
    627 		</para></listitem>
    628 	</varlistentry>
    629 	<varlistentry>
    630 		<term><option>--deinit</option></term>
    631 		<listitem><para>
    632                 Shut down the daemon, then unload the OProfile module and oprofilefs.
    633 		</para></listitem>
    634 	</varlistentry>
    635 	<varlistentry>
    636 		<term><option>--list-events</option></term>
    637 		<listitem><para>
    638 		    List event types and unit masks.
    639 		</para></listitem>
    640 	</varlistentry>
    641 	<varlistentry>
    642 		<term><option>--help</option></term>
    643 		<listitem><para>
    644 		    Generate usage messages.
    645 		</para></listitem>
    646 	</varlistentry>
    647 </variablelist>
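<para>
The actions above might be sequenced as follows on a 2.6 kernel. This is a sketch rather than a
required order; <command>./my_workload</command> and the session name <literal>baseline</literal>
are hypothetical placeholders.
</para>
<screen>
opcontrol --init
opcontrol --no-vmlinux
opcontrol --start-daemon
opcontrol --start
./my_workload
opcontrol --stop
opcontrol --dump
opcontrol --save=baseline
opcontrol --shutdown
opcontrol --deinit
</screen>
<para>
Starting the daemon separately with <option>--start-daemon</option> keeps the cost of daemon startup
out of the measured interval, since the later <option>--start</option> is then only a write to a file
in oprofilefs.
</para>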
    648 
    649 <para>
    650 There are a number of possible settings, of which only
    651 <option>--vmlinux</option> (or <option>--no-vmlinux</option>)
    652 is required. These settings are stored in <filename>~/.oprofile/daemonrc</filename>.
    653 </para>
    654 <variablelist>
    655 	<varlistentry>
    656 		<term><option>--buffer-size=</option>num</term>
    657 		<listitem><para>
    658 		Number of samples in the kernel buffer. When using a 2.6 kernel, the
    659 		buffer watershed needs to be adjusted when changing this value.
    660 		</para></listitem>
    661 	</varlistentry>
    662 	<varlistentry>
    663 		<term><option>--buffer-watershed=</option>num</term>
    664 		<listitem><para>
		Set the kernel buffer watershed to num samples (2.6 only). When only
		buffer-size - buffer-watershed free entries remain in the kernel buffer, the data is
		flushed to the daemon. The most useful values are in the range [0.25 - 0.5] * buffer-size.
    668 		</para></listitem>
    669 	</varlistentry>
    670 	<varlistentry>
    671 		<term><option>--cpu-buffer-size=</option>num</term>
    672 		<listitem><para>
		Number of samples in the kernel per-CPU buffer (2.6 only). If you
		profile at a high rate, increasing this value can help when the log
		file shows an excessive count of samples lost to CPU buffer overflow.
    676 		</para></listitem>
    677 	</varlistentry>
    678 	<varlistentry>
    679 		<term><option>--event=</option>[eventspec]</term>
    680 		<listitem><para>
    681 		Use the given performance counter event to profile.
    682 		See <xref linkend="eventspec" /> below.
    683 		</para></listitem>
    684 	</varlistentry>
    685 	<varlistentry>
    686 		<term><option>--session-dir=</option>dir_path</term>
    687 		<listitem><para>
		Create or use the sample database in directory <filename>dir_path</filename> instead of
		the default location (<filename>/var/lib/oprofile</filename>).
    690 		</para></listitem>
    691 	</varlistentry>
    692 	<varlistentry>
    693 		<term><option>--separate=</option>[none,lib,kernel,thread,cpu,all]</term>
    694 		<listitem><para>
		By default, every profile is stored in a single file. Thus, for example,
		samples in the C library are all credited to the <filename>/lib/libc.o</filename>
		profile. However, you can choose to create separate sample files by specifying
		one of the options below.
    699 		</para>
    700 		<informaltable frame="all">
    701 		<tgroup cols='2'> 
    702 		<tbody>
    703 		<row><entry><option>none</option></entry><entry>No profile separation (default)</entry></row>
    704 		<row><entry><option>lib</option></entry><entry>Create per-application profiles for libraries</entry></row>
    705 		<row><entry><option>kernel</option></entry><entry>Create per-application profiles for the kernel and kernel modules</entry></row>
    706 		<row><entry><option>thread</option></entry><entry>Create profiles for each thread and each task</entry></row>
    707 		<row><entry><option>cpu</option></entry><entry>Create profiles for each CPU</entry></row>
    708 		<row><entry><option>all</option></entry><entry>All of the above options</entry></row>
    709 		</tbody>
    710 		</tgroup>
    711 		</informaltable>
    712 		<para>
		Note that <option>--separate=kernel</option> also turns on <option>--separate=lib</option>.
    714 		<!-- FIXME: update if this change -->
    715 		When using <option>--separate=kernel</option>, samples in hardware interrupts, soft-irqs, or other
    716 		asynchronous kernel contexts are credited to the task currently running. This means you will see
    717 		seemingly nonsense profiles such as <filename>/bin/bash</filename> showing samples for the PPP modules,
    718 		etc.
    719 		</para>
    720 		<para>
    721 		On 2.2/2.4 only kernel threads already started when profiling begins are correctly profiled;
    722 		newly started kernel thread samples are credited to the vmlinux (kernel) profile.
    723 		</para>
    724 		<para>
    725 		Using <option>--separate=thread</option> creates a lot
    726 		of sample files if you leave OProfile running for a while; it's most
    727 		useful when used for short sessions, or when using image filtering.
    728 		</para>
    729 		</listitem>
    730 	</varlistentry>
    731 	<varlistentry>
    732 		<term><option>--callgraph=</option>#depth</term>
    733 		<listitem><para>
		Enable call-graph sample collection with a maximum depth. Use 0 to disable
		callgraph profiling. NOTE: Callgraph support is available on a limited
		number of platforms at this time; for example:
		<itemizedlist>
		<listitem><para>x86 with recent 2.6 kernel</para></listitem>
		<listitem><para>ARM with recent 2.6 kernel</para></listitem>
		<listitem><para>PowerPC with 2.6.17 kernel</para></listitem>
		</itemizedlist>
		</para></listitem>
    745 	</varlistentry>
    746 	<varlistentry>
    747 		<term><option>--image=</option>image,[images]|"all"</term>
    748 		<listitem><para>
    749 		Image filtering. If you specify one or more absolute
    750 		paths to binaries, OProfile will only produce profile results for those
    751 		binary images. This is useful for restricting the sometimes voluminous
    752 		output you may get otherwise, especially with
    753 		<option>--separate=thread</option>. Note that if you are using
    754 		<option>--separate=lib</option> or
		<option>--separate=kernel</option>, then if you specify an
    756 		application binary, the shared libraries and kernel code
    757 		<emphasis>are</emphasis> included. Specify the value
    758 		"all" to profile everything (the default).
    759 		</para></listitem>
    760 	</varlistentry>
    761 	<varlistentry>
    762 		<term><option>--vmlinux=</option>file</term>
    763 		<listitem><para>
    764 		vmlinux kernel image.
    765 		</para></listitem>
    766 	</varlistentry>
    767 	<varlistentry>
    768 		<term><option>--no-vmlinux</option></term>
    769 		<listitem><para>
    770 		Use this when you don't have a kernel vmlinux file, and you don't want
    771 		to profile the kernel. This still counts the total number of kernel samples,
    772 		but can't give symbol-based results for the kernel or any modules.
    773 		</para></listitem>
    774 	</varlistentry>
    775 </variablelist>
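<para>
As a sketch of how the buffer settings above fit together, the following sets a kernel
buffer size together with a watershed of one quarter of that size, the lower end of the
suggested range. The specific sizes are illustrative assumptions, not recommended values,
and the watershed option applies to 2.6 kernels only.
</para>
<screen>
# opcontrol --buffer-size=65536 --buffer-watershed=16384
</screen>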
    776 
    777 <sect2 id="opcontrolexamples">
    778 <title>Examples</title>
    779 
    780 <sect3 id="examplesperfctr">
    781 <title>Intel performance counter setup</title>
    782 <para>
    783 Here, we have a Pentium III running at 800MHz, and we want to look at where data memory
    784 references are happening most, and also get results for CPU time.
    785 </para>
    786 <screen>
    787 # opcontrol --event=CPU_CLK_UNHALTED:400000 --event=DATA_MEM_REFS:10000
    788 # opcontrol --vmlinux=/boot/2.6.0/vmlinux
    789 # opcontrol --start
    790 </screen>
    791 </sect3>
    792 
    793 <sect3 id="examplesrtc">
    794 <title>RTC mode</title>
    795 <para>
    796 Here, we have an Intel laptop without support for performance counters, running on 2.4 kernels.
    797 </para>
    798 <screen>
    799 # ophelp -r
    800 CPU with RTC device
    801 # opcontrol --vmlinux=/boot/2.4.13/vmlinux --event=RTC_INTERRUPTS:1024
    802 # opcontrol --start
    803 </screen>
    804 </sect3>
    805 
    806 <sect3 id="examplesstartdaemon">
    807 <title>Starting the daemon separately</title>
    808 <para>
    809 If we're running 2.6 kernels, we can use <option>--start-daemon</option> to avoid
    810 the profiler startup affecting results.
    811 </para>
    812 <screen>
    813 # opcontrol --vmlinux=/boot/2.6.0/vmlinux
    814 # opcontrol --start-daemon
    815 # my_favourite_benchmark --init
    816 # opcontrol --start ; my_favourite_benchmark --run ; opcontrol --stop
    817 </screen>
    818 </sect3>
    819 
    820 <sect3 id="exampleseparate">
    821 <title>Separate profiles for libraries and the kernel</title>
    822 <para>
    823 Here, we want to see a profile of the OProfile daemon itself, including when
    824 it was running inside the kernel driver, and its use of shared libraries.
    825 </para>
    826 <screen>
    827 # opcontrol --separate=kernel --vmlinux=/boot/2.6.0/vmlinux
    828 # opcontrol --start
    829 # my_favourite_stress_test --run
    830 # opreport -l -p /lib/modules/2.6.0/kernel /usr/local/bin/oprofiled
    831 </screen>
    832 </sect3>
    833 
    834 <sect3 id="examplessessions">
    835 <title>Profiling sessions</title>
    836 <para>
    837 It can often be useful to split up profiling data into several different
    838 time periods. For example, you may want to collect data on an application's
    839 startup separately from the normal runtime data. You can use the simple
command <command>opcontrol --save</command> to do this. For example:
    841 </para>
    842 <screen>
    843 # opcontrol --save=blah
    844 </screen>
    845 <para>
    846 will create a sub-directory in <filename>$SESSION_DIR/samples</filename> containing the samples
    847 up to that point (the current session's sample files are moved into this
    848 directory). You can then pass this session name as a parameter to the post-profiling
    849 analysis tools, to only get data up to the point you named the
    850 session. If you do not want to save a session, you can do
    851 <command>rm -rf $SESSION_DIR/samples/sessionname</command> or, for the
    852 current session, <command>opcontrol --reset</command>.
    853 </para>
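<para>
As a sketch of passing a saved session to an analysis tool, assuming the session was
saved as <filename>blah</filename> as shown above, <command>opreport</command> can be
restricted to it with a <option>session:</option> profile specification:
</para>
<screen>
# opreport session:blah
</screen>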
    854 </sect3>
    855 </sect2> 
    856 
    857 <sect2 id="eventspec">
    858 <title>Specifying performance counter events</title>
    859 <para>
The <option>--event</option> option to <command>opcontrol</command>
takes a specification that indicates how each
hardware performance counter should be set up. If you want to
revert to OProfile's default setting (<option>--event</option>
is strictly optional), use <option>--event=default</option>. Use of this
option overrides all previous event selections.
    866 </para>
    867 <para>
    868 You can pass multiple event specifications. OProfile will allocate
    869 hardware counters as necessary. Note that some combinations are not
    870 allowed by the CPU; running <command>opcontrol --list-events</command> gives the details
    871 of each event. The event specification is a colon-separated string
    872 of the form <option><emphasis>name</emphasis>:<emphasis>count</emphasis>:<emphasis>unitmask</emphasis>:<emphasis>kernel</emphasis>:<emphasis>user</emphasis></option> as described in this table:
    873 </para>
    874 <informaltable frame="all">
    875 <tgroup cols='2'> 
    876 <tbody>
    877 <row><entry><option>name</option></entry><entry>The symbolic event name, e.g. <constant>CPU_CLK_UNHALTED</constant></entry></row>
    878 <row><entry><option>count</option></entry><entry>The counter reset value, e.g. 100000</entry></row>
    879 <row><entry><option>unitmask</option></entry><entry>The unit mask, as given in the events list, e.g. 0x0f</entry></row>
    880 <row><entry><option>kernel</option></entry><entry>Whether to profile kernel code</entry></row>
    881 <row><entry><option>user</option></entry><entry>Whether to profile userspace code</entry></row>
    882 </tbody>
    883 </tgroup>
    884 </informaltable>
    885 <para>
The last three values are optional; if you omit them (e.g. <option>--event=DATA_MEM_REFS:30000</option>),
    887 they will be set to the default values (a unit mask of 0, and profiling both kernel and
    888 userspace code). Note that some events require a unit mask.
    889 </para>
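<para>
As an illustration of the full five-field form, the following sketch counts
<constant>CPU_CLK_UNHALTED</constant> with a reset value of 100000 and a unit mask of 0,
with kernel profiling turned off and userspace profiling turned on. It assumes a CPU that
provides this event, and that the kernel and user fields take 0 or 1 as in the default
event specifications listed later in this section.
</para>
<screen>
# opcontrol --event=CPU_CLK_UNHALTED:100000:0:0:1
</screen>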
    890 <note><para>
    891 For the PowerPC platforms, all events specified must be in the same group; i.e., the group number
    892 appended to the event name (e.g. <constant>&lt;<emphasis>some-event-name</emphasis>&gt;_GRP9</constant>) must be the same.
    893 </para></note>
    894 <para>
    895 If OProfile is using RTC mode, and you want to alter the default counter value,
    896 you can use something like <option>--event=RTC_INTERRUPTS:2048</option>. Note the last
    897 three values here are ignored.
    898 If OProfile is using timer-interrupt mode, there is no configuration possible.
    899 </para>
    900 <para>
    901 The table below lists the events selected by default
    902 (<option>--event=default</option>) for the various computer architectures:
    903 </para>
    904 <informaltable frame="all">
    905 <tgroup cols='3'> 
<thead>
<row><entry>Processor</entry><entry>cpu_type</entry><entry>Default event</entry></row>
</thead>
<tbody>
    908 <row><entry>Alpha EV4</entry><entry>alpha/ev4</entry><entry>CYCLES:100000:0:1:1</entry></row>
    909 <row><entry>Alpha EV5</entry><entry>alpha/ev5</entry><entry>CYCLES:100000:0:1:1</entry></row>
    910 <row><entry>Alpha PCA56</entry><entry>alpha/pca56</entry><entry>CYCLES:100000:0:1:1</entry></row>
    911 <row><entry>Alpha EV6</entry><entry>alpha/ev6</entry><entry>CYCLES:100000:0:1:1</entry></row>
    912 <row><entry>Alpha EV67</entry><entry>alpha/ev67</entry><entry>CYCLES:100000:0:1:1</entry></row>
    913 <row><entry>ARM/XScale PMU1</entry><entry>arm/xscale1</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
    914 <row><entry>ARM/XScale PMU2</entry><entry>arm/xscale2</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
    915 <row><entry>ARM/MPCore</entry><entry>arm/mpcore</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
    916 <row><entry>AVR32</entry><entry>avr32</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
    917 <row><entry>Athlon</entry><entry>i386/athlon</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    918 <row><entry>Pentium Pro</entry><entry>i386/ppro</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    919 <row><entry>Pentium II</entry><entry>i386/pii</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    920 <row><entry>Pentium III</entry><entry>i386/piii</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    921 <row><entry>Pentium M (P6 core)</entry><entry>i386/p6_mobile</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    922 <row><entry>Pentium 4 (non-HT)</entry><entry>i386/p4</entry><entry>GLOBAL_POWER_EVENTS:100000:1:1:1</entry></row>
    923 <row><entry>Pentium 4 (HT)</entry><entry>i386/p4-ht</entry><entry>GLOBAL_POWER_EVENTS:100000:1:1:1</entry></row>
    924 <row><entry>Hammer</entry><entry>x86-64/hammer</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    925 <row><entry>Family10h</entry><entry>x86-64/family10</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    926 <row><entry>Family11h</entry><entry>x86-64/family11h</entry><entry>CPU_CLK_UNHALTED:100000:0:1:1</entry></row>
    927 <row><entry>Itanium</entry><entry>ia64/itanium</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
    928 <row><entry>Itanium 2</entry><entry>ia64/itanium2</entry><entry>CPU_CYCLES:100000:0:1:1</entry></row>
    929 <row><entry>TIMER_INT</entry><entry>timer</entry><entry>None selectable</entry></row>
    930 <row><entry>IBM iseries</entry><entry>PowerPC 4/5/970</entry><entry>CYCLES:10000:0:1:1</entry></row>
    931 <row><entry>IBM pseries</entry><entry>PowerPC 4/5/970/Cell</entry><entry>CYCLES:10000:0:1:1</entry></row>
    932 <row><entry>IBM s390</entry><entry>timer</entry><entry>None selectable</entry></row>
    933 <row><entry>IBM s390x</entry><entry>timer</entry><entry>None selectable</entry></row>
    934 </tbody>
    935 </tgroup>
    936 </informaltable>
    937 
    938 </sect2>
    939 
    940 </sect1>
    941  
    942 <sect1 id="setup-jit">
    943 	<title>Setting up the JIT profiling feature</title>
    944 	<para>
		To gather information about JITed code from a virtual machine,
		the virtual machine must be instrumented with an agent library. We use the
		agent libraries for Java in the following example. To use the
		Java profiling feature, you must build OProfile with the "--with-java" option
		(<xref linkend="install" />).
    950 
    951 	</para>
    952 
    953 	<sect2 id="setup-jit-jvm">
    954 		<title>JVM instrumentation</title>
    955 		<para>
    956 			Add this to the startup parameters of the JVM (for JVMTI):
    957 
    958 			<screen><option>-agentpath:&lt;libdir&gt;/libjvmti_oprofile.so[=&lt;options&gt;]</option> </screen>
    959 			or
    960 			<screen><option>-agentlib:jvmti_oprofile[=&lt;options&gt;]</option> </screen>
    961 		</para>
    962 		<para>
    963 			The JVMPI agent implementation is enabled with the command line option
    964 			<screen><option>-Xrunjvmpi_oprofile[:&lt;options&gt;]</option> </screen>
    965 		</para>
    966 		<para>
    967 			Currently, there is just one option available -- <option>debug</option>. For JVMPI,
    968 			the convention for specifying an option is <option>option_name=[yes|no]</option>.
    969 			For JVMTI, the option specification is simply the option name, implying
    970 			"yes"; no option specified implies "no".
    971 		</para>
    972                 <para>
			The agent library (installed in <filename>&lt;oprof_install_dir&gt;/lib/oprofile</filename>)
			needs to be in the library search path (e.g. add the library directory
			to <constant>LD_LIBRARY_PATH</constant>). If the command line of
			the JVM is not directly accessible (for example, because it is buried within
			shell scripts or a launcher program), it may be possible to set an
			environment variable to add the instrumentation.
			For Sun JVMs this is <constant>JAVA_TOOL_OPTIONS</constant>. Please check
			your JVM documentation for
			further information on the agent startup options.
    982                 </para>
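		<para>
			A minimal sketch of the environment-variable approach for the JVMTI agent follows.
			It assumes a Sun JVM, an OProfile installation under <filename>/usr/local</filename>,
			and a hypothetical application class named <filename>MyApplication</filename>; adjust
			all three for your system.
		</para>
		<screen>
$ export LD_LIBRARY_PATH=/usr/local/lib/oprofile:$LD_LIBRARY_PATH
$ export JAVA_TOOL_OPTIONS="-agentlib:jvmti_oprofile"
$ java MyApplication
		</screen>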
    983 
    984 	</sect2>
    985 </sect1>
    986 
    987 <sect1 id="oprofile-gui">
    988 <title>Using <command>oprof_start</command></title>
    989 <para>
    990 The <command>oprof_start</command> application provides a convenient way to start the profiler.
    991 Note that <command>oprof_start</command> is just a wrapper around the <command>opcontrol</command> script,
    992 so it does not provide more services than the script itself.
    993 </para>
    994 <para>
    995 After <command>oprof_start</command> is started you can select the event type for each counter;
    996 the sampling rate and other related parameters are explained in <xref linkend="controlling-daemon" />.
    997 The "Configuration" section allows you to set general parameters such as the buffer size, kernel filename
    998 etc. The counter setup interface should be self-explanatory; <xref linkend="hardware-counters" /> and related 
    999 links contain information on using unit masks.
   1000 </para>
   1001 <para>
A status line shows the current status of the profiler: how long it has been running, the average
number of interrupts received per second, and the total number of interrupts, over all processors.
   1004 Note that quitting <command>oprof_start</command> does not stop the profiler.
   1005 </para>
   1006 <para>
   1007 Your configuration is saved in the same file as <command>opcontrol</command> uses; that is,
   1008 <filename>~/.oprofile/daemonrc</filename>.
   1009 </para>
   1010 
   1011 </sect1>
   1012 
   1013 <sect1 id="detailed-parameters">
   1014 <title>Configuration details</title>
   1015 
   1016 <sect2 id="hardware-counters">
   1017 <title>Hardware performance counters</title>
   1018 <note>
   1019 <para>
   1020 Your CPU type may not include the requisite support for hardware performance counters, in which case
   1021 you must use OProfile in RTC mode in 2.4 (see <xref linkend="rtc" />), or timer mode in 2.6 (see <xref linkend="timer" />). 
   1022 You do not really need to read this section unless you are interested in using 
   1023 events other than the default event chosen by OProfile.
   1024 </para>
   1025 </note>
   1026 <para>
   1027 The Intel hardware performance counters are detailed in the Intel IA-32 Architecture Manual, Volume 3, available
   1028 from <ulink url="http://developer.intel.com/">http://developer.intel.com/</ulink>. 
   1029 The AMD Athlon/Opteron/Phenom/Turion implementation is detailed in <ulink
   1030 url="http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf">
   1031 http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/22007.pdf</ulink>.
   1032 For PowerPC64 processors in IBM iSeries, pSeries, and blade server systems, processor documentation
   1033 is available at <ulink url="http://www-01.ibm.com/chips/techlib/techlib.nsf/productfamilies/PowerPC/">
   1034 http://www-01.ibm.com/chips/techlib/techlib.nsf/productfamilies/PowerPC</ulink>.  (For example, the
   1035 specific publication containing information on the performance monitor unit for the PowerPC970 is 
   1036 "IBM PowerPC 970FX RISC Microprocessor User's Manual.")
   1037 These processors are capable of delivering an interrupt when a counter overflows.
   1038 This is the basic mechanism on which OProfile is based. The delivery mode is <acronym>NMI</acronym>,
   1039 so blocking interrupts in the kernel does not prevent profiling. When the interrupt handler is called,
   1040 the current <acronym>PC</acronym> value and the current task are recorded into the profiling structure.
   1041 This allows the overflow event to be attached to a specific assembly instruction in a binary image.
   1042 The daemon receives this data from the kernel, and writes it to the sample files.
   1043 </para>
   1044 <para>
   1045 If we use an event such as <constant>CPU_CLK_UNHALTED</constant> or <constant>INST_RETIRED</constant>
   1046 (<constant>GLOBAL_POWER_EVENTS</constant> or <constant>INSTR_RETIRED</constant>, respectively, on the Pentium 4), we can
   1047 use the overflow counts as an estimate of actual time spent in each part of code. Alternatively we can profile interesting
   1048 data such as the cache behaviour of routines with the other available counters.
   1049 </para>
   1050 <para>
However, there are several caveats. First, there are those issues listed in the Intel manual. There is a delay
between the counter overflow and the interrupt delivery that can skew results on a small scale; this means
you cannot rely on the profiles at the instruction level as being perfectly accurate.
If you are using an "event-mode" counter such as the cache counters, a count registered against an instruction doesn't mean
that the instruction is responsible for that event. However, it implies that the counter overflowed in the dynamic
   1056 vicinity of that instruction, to within a few instructions. Further details on this problem can be found in 
   1057 <xref linkend="interpreting" /> and also in the Digital paper "ProfileMe: A Hardware Performance Counter".
   1058 </para>
   1059 <para>
   1060 Each counter has several configuration parameters.
   1061 First, there is the unit mask: this simply further specifies what to count.
Second, there is the counter value, discussed below. Third, there is a parameter controlling whether to increment counts
   1063 whilst in kernel or user space. You can configure these separately for each counter.
   1064 </para>
   1065 <para>
The counter value sets the sampling interval: after each overflow event, the counter is re-initialized
such that another overflow occurs after this many events have been counted. Thus, higher
values mean less-detailed profiling, and lower values mean more detail, but higher overhead.
   1069 Picking a good value for this
   1070 parameter is, unfortunately, somewhat of a black art. It is of course dependent on the event
   1071 you have chosen.
   1072 Specifying too large a value will mean not enough interrupts are generated
   1073 to give a realistic profile (though this problem can be ameliorated by profiling for <emphasis>longer</emphasis>).
   1074 Specifying too small a value can lead to higher performance overhead.
   1075 </para>
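<para>
As a rough worked example of this trade-off, consider an 800MHz CPU that is never halted,
so that <constant>CPU_CLK_UNHALTED</constant> occurs about 800,000,000 times per second.
These figures are illustrative assumptions only. Dividing the event rate by the counter value
gives the approximate number of samples per second on that CPU:
</para>
<screen>
$ echo $((800000000 / 400000))
2000
$ echo $((800000000 / 100000))
8000
</screen>
<para>
A counter value of 400000 therefore gives about 2000 samples per second, while lowering it to 100000
gives about 8000 samples per second, with correspondingly more detail and more overhead.
</para>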
   1076 
   1077 </sect2>
   1078 
   1079 <sect2 id="rtc">
   1080 <title>OProfile in RTC mode</title>
   1081 <note><para>
   1082 This section applies to 2.2/2.4 kernels only.
   1083 </para></note>
   1084 <para>
   1085 Some CPU types do not provide the needed hardware support to use the hardware performance counters. This includes
   1086 some laptops, classic Pentiums, and other CPU types not yet supported by OProfile (such as Cyrix). 
   1087 On these machines, OProfile falls
   1088 back to using the real-time clock interrupt to collect samples. This interrupt is also used by the <command>rtc</command>
module: you cannot have both the OProfile and rtc modules loaded, nor can rtc support be compiled into the kernel.
   1090 </para>
   1091 <para>
   1092 RTC mode is less capable than the hardware counters mode; in particular, it is unable to profile sections of
   1093 the kernel where interrupts are disabled. There is just one available event, "RTC interrupts", and its value 
   1094 corresponds to the number of interrupts generated per second (that is, a higher number means a better profiling
   1095 resolution, and higher overhead). The current implementation of the real-time clock supports only power-of-two
   1096 sampling rates from 2 to 4096 per second.  Other values within this range are rounded to the nearest power of
   1097 two.
   1098 </para>
   1099 <para>
   1100 You can force use of the RTC interrupt with the <option>force_rtc=1</option> module parameter.
   1101 </para>
   1102 <para>
   1103 Setting the value from the GUI should be straightforward. On the command line, you need to specify the
event to <command>opcontrol</command>, e.g.:
   1105 </para>
   1106 <para><command>opcontrol --event=RTC_INTERRUPTS:256</command></para>
   1107 </sect2>
   1108 
   1109 <sect2 id="timer">
   1110 <title>OProfile in timer interrupt mode</title>
   1111 <note><para>
   1112 This section applies to 2.6 kernels and above only.
   1113 </para></note>
   1114 <para>
   1115 In 2.6 kernels on CPUs without OProfile support for the hardware performance counters, the driver
   1116 falls back to using the timer interrupt for profiling. Like the RTC mode in 2.4 kernels, this is not able to
   1117 profile code that has interrupts disabled. Note that there are no configuration parameters for
   1118 setting this, unlike the RTC and hardware performance counter setup.
   1119 </para>
   1120 <para>
   1121 You can force use of the timer interrupt by using the <option>timer=1</option> module
   1122 parameter (or <option>oprofile.timer=1</option> on the boot command line if OProfile is
   1123 built-in).
   1124 </para>
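<para>
A sketch of forcing timer mode when OProfile is built as a module follows. It assumes the
module is named <filename>oprofile</filename> and that any previously loaded instance is
first unloaded with <command>opcontrol --deinit</command>.
</para>
<screen>
# opcontrol --deinit
# modprobe oprofile timer=1
</screen>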
   1125 </sect2>
   1126 
   1127 <sect2 id="p4">
   1128 <title>Pentium 4 support</title>
   1129 <para>
   1130 The Pentium 4 / Xeon performance counters are organized around 3 types of model specific registers (MSRs): 45 event
   1131 selection control registers (ESCRs), 18 counter configuration control registers (CCCRs) and 18 counters. ESCRs describe a
   1132 particular set of events which are to be recorded, and CCCRs bind ESCRs to counters and configure their
   1133 operation. Unfortunately the relationship between these registers is quite complex; they cannot all be used with one
   1134 another at any time. There is, however, a subset of 8 counters, 8 ESCRs, and 8 CCCRs which can be used independently of
   1135 one another, so OProfile only accesses those registers, treating them as a bank of 8 "normal" counters, similar
   1136 to those in the P6 or Athlon/Opteron/Phenom/Turion families of CPU.
   1137 </para>
   1138 <para>
   1139 There is currently no support for Precision Event-Based Sampling (PEBS), nor any advanced uses of the Debug Store
   1140 (DS). Current support is limited to the conservative extension of OProfile's existing interrupt-based model described
above.  Performance monitoring hardware on Pentium 4 / Xeon processors with hyper-threading enabled (multiple logical
processors on a single die) is not supported in 2.4 kernels (you can use OProfile if you disable hyper-threading,
though).
   1144 </para>
   1145 </sect2>
   1146 
   1147 <sect2 id="ia64">
   1148 <title>Intel Itanium 2 support</title>
   1149 <para>
   1150 The Itanium 2 performance monitoring unit (PMU) organizes the counters as four
   1151 pairs of performance event monitoring registers. Each pair is composed of a
   1152 Performance Monitoring Configuration (PMC) register and Performance Monitoring
   1153 Data (PMD) register.  The PMC selects the performance event being monitored and
   1154 the PMD determines the sampling interval. The IA64 Performance Monitoring Unit
   1155 (PMU) triggers sampling with maskable interrupts. Thus, samples will not occur
   1156 in sections of the IA64 kernel where interrupts are disabled.
   1157 </para>
   1158 <para>
None of the advanced features of the Itanium 2 performance monitoring unit
   1160 such as opcode matching, address range matching, or precise event sampling are
   1161 supported by this version of OProfile.  The Itanium 2 support only maps OProfile's
   1162 existing interrupt-based model to the PMU hardware.
   1163 </para>
   1164 </sect2>
   1165 
   1166 <sect2 id="ppc64">
   1167 <title>PowerPC64 support</title>
   1168 <para>
   1169 The performance monitoring unit (PMU) for the IBM PowerPC 64-bit processors 
   1170 consists of between 4 and 8 counters (depending on the model), plus three
   1171 special purpose registers used for programming the counters -- MMCR0, MMCR1,
   1172 and MMCRA.  Advanced features such as instruction matching and thresholding are
   1173 not supported by this version of OProfile.
</para>
<note><para>
Later versions of the IBM POWER5+ processor (beginning with revision 3.0)
run the performance monitor unit in POWER6 mode, effectively removing OProfile's
access to counters 5 and 6.  These two counters are dedicated to counting
instructions completed and cycles, respectively.  In POWER6 mode, however, the
counters do not generate an interrupt on overflow and so are unusable by
OProfile.  Kernel versions 2.6.23 and higher will recognize this mode
and export "ppc64/power5++" as the cpu_type to the oprofilefs pseudo filesystem.
OProfile userspace responds to this cpu_type by removing these counters from
the list of potential events to count.  Without this kernel support, attempts
to profile using an event from one of these counters will yield incorrect
results -- typically, zero (or near zero) samples in the generated report.
</para></note>
   1187 
   1188 </sect2>
   1189 
   1190 <sect2 id="cell-be">
   1191 <title>Cell Broadband Engine support</title>
   1192 <para>
   1193 The Cell Broadband Engine (CBE) processor core consists of a PowerPC Processing
   1194 Element (PPE) and 8 Synergistic Processing Elements (SPE).  PPEs and SPEs each
   1195 consist of a processing unit (PPU and SPU, respectively) and other hardware
   1196 components, such as memory controllers.
   1197 </para>
   1198 <para>
   1199 A PPU has two hardware threads (aka "virtual CPUs").  The performance monitor
   1200 unit of the CBE collects event information on one hardware thread at a time.
   1201 Therefore, when profiling PPE events,
   1202 OProfile collects the profile based on the selected events by time slicing the
   1203 performance counter hardware between the two threads.   The user must ensure the
   1204 collection interval is long enough so that the time spent collecting data for
   1205 each PPU is sufficient to obtain a good profile.
   1206 </para>
   1207 <para>
To profile an SPU application, the user should specify the <constant>SPU_CYCLES</constant> event.
When starting OProfile with <constant>SPU_CYCLES</constant>, the <command>opcontrol</command> script enforces certain
separation parameters (<option>separate=cpu,lib</option>) to ensure that sufficient information
is collected in the sample data in order to generate a complete report.  The
<option>--merge=cpu</option> option can be used to obtain a more readable report if analyzing
the performance of each separate SPU is not necessary.
   1214 </para>
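<para>
As a rough sketch of the workflow described above, an SPU profiling run might look
like the following. The count of 100000 is an assumed value,
<filename>./my_spu_app</filename> is a hypothetical application, and the daemon is
assumed to have been configured already.
</para>
<screen>
# opcontrol --event=SPU_CYCLES:100000
# opcontrol --start
# ./my_spu_app
# opcontrol --dump
# opcontrol --shutdown
# opreport -l --merge=cpu
</screen>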
   1215 <para>
   1216 Profiling with an SPU event (events 4100 through 4163) is not compatible with any other
event.  Furthermore, only one SPU event can be specified at a time.  The hardware only
   1218 supports profiling on one SPU per node at a time.  The OProfile kernel code time slices
   1219 between the eight SPUs to collect data on all SPUs.
   1220 </para>
   1221 <para>
   1222 SPU profile reports have some unique characteristics compared to reports for
   1223 standard architectures:
   1224 </para>
   1225 <itemizedlist>
<listitem><para>Typically no "app name" column.  This is standard OProfile behavior
when the report contains samples for just a single application, which is
commonly the case when profiling SPUs.</para></listitem>
<listitem><para>"CPU" equates to "SPU".</para></listitem>
<listitem><para>Specifying <option>--long-filenames</option> on the <command>opreport</command> command does not always result
in long filenames.  This happens when the SPU application code is embedded in
the PPE executable or shared library.  The embedded SPU ELF data contains only the
short filename (i.e., no path information) for the SPU binary file that was used as
the source for embedding.  Only the short filename is used because
the original SPU binary file may not exist or be accessible at runtime.  The performance
analyst must have sufficient knowledge of the application to be able to correlate the
SPU binary image names found in the report to the application's source files.</para>
<note><para>
Compile the application with <option>-g</option> and generate the OProfile report
with <option>-g</option> to facilitate finding the right source file(s) on which to focus.
</para></note>
</listitem>
   1243 </itemizedlist>
   1244 
   1245 </sect2>
   1246 
   1247 <sect2 id="amd-ibs-support">
   1248 <title>AMD64 (x86_64) Instruction-Based Sampling (IBS) support</title>
   1249 
   1250 <para>
   1251 Instruction-Based Sampling (IBS) is a new performance measurement technique
   1252 available on AMD Family 10h processors. Traditional performance counter
   1253 sampling is not precise enough to isolate performance issues to individual
   1254 instructions. IBS, however, precisely identifies instructions which are not
   1255 making the best use of the processor pipeline and memory hierarchy.
For more information, please refer to the paper "Instruction-Based Sampling:
A New Performance Analysis Technique for AMD Family 10h Processors" (
<ulink url="http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf">
http://developer.amd.com/assets/AMD_IBS_paper_EN.pdf</ulink>).
There are two IBS profile types, described in the following sections.
   1261 </para>
   1262 
   1263 <sect3 id="ibs-fetch">
   1264 <title>IBS Fetch</title>
   1265 
   1266 <para>
   1267 IBS fetch sampling is a statistical sampling method which counts completed
   1268 fetch operations. When the number of completed fetch operations reaches the
   1269 maximum fetch count (the sampling period), IBS tags the fetch operation and
   1270 monitors that operation until it either completes or aborts. When a tagged
   1271 fetch completes or aborts, a sampling interrupt is generated and an IBS fetch
   1272 sample is taken. An IBS fetch sample contains a timestamp, the identifier of
   1273 the interrupted process, the virtual fetch address, and several event flags
   1274 and values that describe what happened during the fetch operation. 
   1275 </para>
   1276 
   1277 </sect3>
   1278 
   1279 <sect3 id="ibs-op">
   1280 <title>IBS Op</title>
   1281 
   1282 <para>
   1283 IBS op sampling selects, tags, and monitors macro-ops as issued from AMD64
   1284 instructions. Two options are available for selecting ops for sampling:
   1285 </para>
   1286 
   1287 <itemizedlist>
   1288 <listitem>
   1289 Cycles-based selection counts CPU clock cycles. The op is tagged and monitored
   1290 when the count reaches a threshold (the sampling period) and a valid op is
   1291 available. 
   1292 </listitem>
   1293 
   1294 <listitem>
   1295 Dispatched op-based selection counts dispatched macro-ops.
   1296 When the count reaches a threshold, the next valid op is tagged and monitored. 
   1297 </listitem>
   1298 </itemizedlist>
   1299 
   1300 <para>
   1301 In both cases, an IBS sample is generated only if the tagged op retires.
   1302 Thus, IBS op event information does not measure speculative execution activity.
   1303 The execution stages of the pipeline monitor the tagged macro-op. When the
   1304 tagged macro-op retires, a sampling interrupt is generated and an IBS op
   1305 sample is taken. An IBS op sample contains a timestamp, the identifier of
   1306 the interrupted process, the virtual address of the AMD64 instruction from
   1307 which the op was issued, and several event flags and values that describe
   1308 what happened when the macro-op executed.
   1309 </para>
   1310 
   1311 </sect3>
   1312 
   1313 <para>
IBS profiling is enabled by specifying IBS performance events
with the <option>--event=</option> option. These events are listed in the output of
<command>opcontrol --list-events</command>.
   1317 </para>
   1318 
   1319 <screen>
   1320 opcontrol --event=IBS_FETCH_XXX:&lt;count&gt;:&lt;um&gt;:&lt;kernel&gt;:&lt;user&gt;
   1321 opcontrol --event=IBS_OP_XXX:&lt;count&gt;:&lt;um&gt;:&lt;kernel&gt;:&lt;user&gt;
   1322 
Note: All IBS fetch events must use the same event count and unit mask,
      as must all IBS op events.
   1325 </screen>
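<para>
As an illustration only, the templates above might be filled in as shown here, where
each line is an independent alternative. The event names
<constant>IBS_FETCH_ALL</constant> and <constant>IBS_OP_ALL</constant>, the count of
250000 and the unit mask of 0 are assumptions; check the output of
<command>opcontrol --list-events</command> on the target machine for the events,
unit masks and minimum counts that are actually available.
</para>
<screen>
# opcontrol --event=IBS_FETCH_ALL:250000:0:1:1
# opcontrol --event=IBS_OP_ALL:250000:0:1:1
</screen>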
   1326 
   1327 </sect2>
   1328 
   1329 
   1330 <sect2 id="misuse">
   1331 <title>Dangerous counter settings</title>
   1332 <para>
OProfile is a low-level profiler which allows continuous profiling at low overhead.
If too low a count reset value is set for a counter, the system can become overloaded with counter
interrupts and appear to have frozen. Whilst some validation is done, it
is not foolproof.
   1337 </para>
   1338 <note><para>
   1339 This can happen as follows: When the profiler count
   1340 reaches zero an NMI handler is called which stores the sample values in an internal buffer, then resets the counter
   1341 to its original value. If the count is very low, a pending NMI can be sent before the NMI handler has
   1342 completed. Due to the priority of the NMI, the local APIC delivers the pending interrupt immediately after
   1343 completion of the previous interrupt handler, and control never returns to other parts of the system.
   1344 In this way the system seems to be frozen.
   1345 </para></note>
<para>If this happens, it will be impossible to bring the system back to a workable state.
There is no real protection against this happening, other than making sure to use a reasonable value
for the counter reset. For example, setting the <constant>CPU_CLK_UNHALTED</constant> event with a ridiculously low reset count (e.g. 500)
is likely to freeze the system.
</para>
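<para>
To get a feel for the numbers, the interrupt rate is roughly the event rate divided by
the reset count. The shell sketch below assumes a CPU that is never halted and runs at
863195000 Hz (a clock speed chosen purely for illustration); the helper name
<function>nmi_rate</function> is hypothetical. It prints the approximate number of
NMIs per second for a given count:
</para>
<screen>
nmi_rate() {
        # $1 = events per second (here, the CPU clock in Hz)
        # $2 = counter reset value
        echo $(( $1 / $2 ))
}

nmi_rate 863195000 500
nmi_rate 863195000 23150
</screen>
<para>
With a count of 500 this comes to well over a million interrupts per second, whereas a
count of 23150 gives a few tens of thousands.
</para>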
   1351 <para>
In short: <command>Don't try a foolish sample count value</command>. Unfortunately the definition of a foolish value
depends on the event type - if ever in doubt, e-mail </para>
<address><email>oprofile-list@lists.sf.net</email>.</address>
   1355 </sect2>
   1356 
   1357 </sect1>
   1358  
   1359 </chapter>
   1360 
   1361 <chapter id="results">
   1362 <title>Obtaining results</title>
   1363 <para>
   1364 OK, so the profiler has been running, but it's not much use unless we can get some data out. Fairly often,
   1365 OProfile does a little <emphasis>too</emphasis> good a job of keeping overhead low, and no data reaches
the profiler. This can happen on lightly-loaded machines. Remember you can force a dump at any time with:
   1367 </para>
   1368 <para><command>opcontrol --dump</command></para>
<para>Remember to do this before complaining there is no profiling data!
   1370 Now that we've got some data, it has to be processed. That's the job of <command>opreport</command>,
   1371 <command>opannotate</command>, or <command>opgprof</command>.
   1372 </para>
   1373 
   1374 <sect1 id="profile-spec">
   1375 <title>Profile specifications</title>
   1376 
   1377 <para>
   1378 All of the analysis tools take a <emphasis>profile specification</emphasis>.
   1379 This is a set of definitions that describe which actual profiles should be
   1380 examined. The simplest profile specification is empty: this will match all
   1381 the available profile files for the current session (this is what happens
   1382 when you do <command>opreport</command>).
   1383 </para>
   1384 <para>
   1385 Specification parameters are of the form <option>name:value[,value]</option>.
   1386 For example, if I wanted to get a combined symbol summary for
   1387 <filename>/bin/myprog</filename> and <filename>/bin/myprog2</filename>,
   1388 I could do <command>opreport -l image:/bin/myprog,/bin/myprog2</command>.
   1389 As a special case, you don't actually need to specify the <option>image:</option>
   1390 part here: anything left on the command line is assumed to be an
   1391 <option>image:</option> name. Similarly, if no <option>session:</option>
   1392 is specified, then <option>session:current</option> is assumed ("current"
   1393 is a special name of the current / last profiling session).
   1394 </para>
   1395 <para>
   1396 In addition to the comma-separated list shown above, some of the 
   1397 specification parameters can take <command>glob</command>-style
   1398 values. For example, if I want to see image summaries for all
   1399 binaries profiled in <filename>/usr/bin/</filename>, I could do
   1400 <command>opreport image:/usr/bin/\*</command>. Note the necessity
   1401 to escape the special character from the shell.
   1402 </para>
   1403 <para>
   1404 For <command>opreport</command>, profile specifications can be used to
   1405 define two profiles, giving differential output. This is done by
   1406 enclosing each of the two specifications within curly braces, as shown
   1407 in the examples below. Any specifications outside of curly braces are
   1408 shared across both.
   1409 </para>
   1410 
   1411 <sect2 id="profile-spec-examples">
   1412 <title>Examples</title>
   1413 
   1414 <para>
   1415 Image summaries for all profiles with <constant>DATA_MEM_REFS</constant>
samples in the saved session called "stresstest":
   1417 </para>
   1418 <screen>
   1419 # opreport session:stresstest event:DATA_MEM_REFS
   1420 </screen>
   1421 
   1422 <para>
   1423 Symbol summary for the application called "test_sym53c8xx,9xx". Note the
   1424 escaping is necessary as <option>image:</option> takes a comma-separated list.
   1425 </para>
   1426 <screen>
   1427 # opreport -l ./test/test_sym53c8xx\,9xx
   1428 </screen>
   1429 
   1430 <para>
   1431 Image summaries for all binaries in the <filename>test</filename> directory,
excepting <filename>boring-test</filename>:
   1433 </para>
   1434 <screen>
   1435 # opreport image:./test/\* image-exclude:./test/boring-test
   1436 </screen>
   1437 
   1438 <para>
Differential profile of a binary stored in two archives:
   1440 </para>
   1441 <screen>
   1442 # opreport -l /bin/bash { archive:./orig } { archive:./new }
   1443 </screen>
   1444 
   1445 <para>
Differential profile of an archived binary with the current session:
   1447 </para>
   1448 <screen>
   1449 # opreport -l /bin/bash { archive:./orig } { }
   1450 </screen>
   1451 
   1452 </sect2> <!-- profile spec examples -->
   1453 
   1454 <sect2 id="profile-spec-details">
   1455 <title>Profile specification parameters</title>
   1456 
   1457 <variablelist>
   1458 	<varlistentry>
   1459 		<term><option>archive:</option><emphasis>archivepath</emphasis></term>
   1460 		<listitem><para>
   1461 		A path to an archive made with <command>oparchive</command>.
   1462 		Absence of this tag, unlike others, means "the current system",
   1463 		equivalent to specifying "archive:".
   1464 		</para></listitem>
   1465 	</varlistentry>
   1466 	<varlistentry>
   1467 		<term><option>session:</option><emphasis>sessionlist</emphasis></term>
   1468 		<listitem><para>
   1469 		A comma-separated list of session names to resolve in. Absence of this
   1470 		tag, unlike others, means "the current session", equivalent to
   1471 		specifying "session:current".
   1472 		</para></listitem>
   1473 	</varlistentry>
   1474 	<varlistentry>
   1475 		<term><option>session-exclude:</option><emphasis>sessionlist</emphasis></term>
   1476 		<listitem><para>
   1477                 A comma-separated list of sessions to exclude.
   1478 		</para></listitem>
   1479 	</varlistentry>
   1480 	<varlistentry>
   1481 		<term><option>image:</option><emphasis>imagelist</emphasis></term>
   1482 		<listitem><para>
                A comma-separated list of image names to resolve. Each entry may be a relative
                path, a <command>glob</command>-style name, or a full path, e.g.</para>
   1485 		<screen>opreport 'image:/usr/bin/oprofiled,*op*,./opreport'</screen>
   1486 		</listitem>
   1487 	</varlistentry>
   1488 
   1489 	<varlistentry>
   1490 		<term><option>image-exclude:</option><emphasis>imagelist</emphasis></term>
   1491 		<listitem><para>
   1492 		Same as <option>image:</option>, but the matching images are excluded.
   1493 		</para></listitem>
   1494 	</varlistentry>
   1495 
   1496 	<varlistentry>
   1497 		<term><option>lib-image:</option><emphasis>imagelist</emphasis></term>
   1498 		<listitem><para>
		Same as <option>image:</option>, but matches only images that belong to
		a particular primary binary image (namely, an application). This only
		makes sense if you're using <option>--separate</option>.
		It includes kernel modules and the kernel when using
		<option>--separate=kernel</option>.
   1504 		</para></listitem>
   1505 	</varlistentry>
   1506 
   1507 	<varlistentry>
   1508 		<term><option>lib-image-exclude:</option><emphasis>imagelist</emphasis></term>
   1509 		<listitem><para>
   1510 		Same as <option>lib-image:</option>, but the matching images
   1511 		are excluded.
   1512 		</para></listitem>
   1513 	</varlistentry>
   1514 
   1515 	<varlistentry>
   1516 		<term><option>event:</option><emphasis>eventlist</emphasis></term>
   1517 		<listitem><para>
   1518 		The symbolic event name to match on, e.g. <option>event:DATA_MEM_REFS</option>.
   1519 		You can pass a list of events for side-by-side comparison with <command>opreport</command>.
   1520 		When using the timer interrupt, the event is always "TIMER".
   1521 		</para></listitem>
   1522 	</varlistentry>
   1523 
   1524 	<varlistentry>
   1525 		<term><option>count:</option><emphasis>eventcountlist</emphasis></term>
   1526 		<listitem><para>
   1527 		The event count to match on, e.g. <option>event:DATA_MEM_REFS count:30000</option>.
   1528 		Note that this value refers to the setting used for <command>opcontrol</command>
   1529 		only, and has nothing to do with the sample counts in the profile data
   1530 		itself.
		You can pass a list of counts for side-by-side comparison with <command>opreport</command>.
   1532 		When using the timer interrupt, the count is always 0 (indicating it cannot be set).
   1533 		</para></listitem>
   1534 	</varlistentry>
   1535 
   1536 	<varlistentry>
   1537 		<term><option>unit-mask:</option><emphasis>masklist</emphasis></term>
   1538 		<listitem><para>
   1539 		The unit mask value of the event to match on, e.g. <option>unit-mask:1</option>.
		You can pass a list of unit masks for side-by-side comparison with <command>opreport</command>.
   1541 		</para></listitem>
   1542 	</varlistentry>
   1543 
   1544 	<varlistentry>
   1545 		<term><option>cpu:</option><emphasis>cpulist</emphasis></term>
   1546 		<listitem><para>
   1547 		Only consider profiles for the given numbered CPU (starting from zero).
   1548 		This is only useful when using CPU profile separation.
   1549 		</para></listitem>
   1550 	</varlistentry>
   1551 
   1552 	<varlistentry>
   1553 		<term><option>tgid:</option><emphasis>pidlist</emphasis></term>
   1554 		<listitem><para>
   1555 		Only consider profiles for the given task groups. Unless some program
   1556 		is using threads, the task group ID of a process is the same
   1557 		as its process ID. This option corresponds to the POSIX
   1558 		notion of a thread group.
   1559 		This is only useful when using per-process profile separation.
   1560 		</para></listitem>
   1561 	</varlistentry>
   1562 
   1563 	<varlistentry>
   1564 		<term><option>tid:</option><emphasis>tidlist</emphasis></term>
   1565 		<listitem><para>
   1566 		Only consider profiles for the given threads. When using
   1567 		recent thread libraries, all threads in a process share the
   1568 		same task group ID, but have different thread IDs. You can
   1569 		use this option in combination with <option>tgid:</option> to
   1570 		restrict the results to particular threads within a process.
   1571 		This is only useful when using per-process profile separation.
   1572 		</para></listitem>
   1573 	</varlistentry>
   1574 </variablelist>
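<para>
As a sketch of how these parameters combine, consider the commands below. The CPU
numbers, the task group and thread IDs, and <filename>/bin/myprog</filename> are
assumed for illustration, and the profiles are assumed to have been collected with the
matching <option>--separate=</option> settings.
</para>
<screen>
# opreport -l cpu:0,1 event:CPU_CLK_UNHALTED
# opreport -l tgid:3433 tid:3434 /bin/myprog
# opreport session:stresstest image-exclude:/bin/bash
</screen>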
   1575 
   1576 </sect2>
   1577 
   1578 <sect2 id="locating-and-managing-binary-images">
   1579 <title>Locating and managing binary images</title>
   1580 <para>
   1581 Each session's sample files can be found in the $SESSION_DIR/samples/ directory (default: <filename>/var/lib/oprofile/samples/</filename>).
   1582 These are used, along with the binary image files, to produce human-readable data.
   1583 In some circumstances (kernel modules in an initrd, or modules on 2.6 kernels), OProfile
   1584 will not be able to find the binary images. All the tools have an <option>--image-path</option>
   1585 option to which you can pass a comma-separated list of alternate paths to search. For example,
   1586 I can let OProfile find my 2.6 modules by using <command>--image-path /lib/modules/2.6.0/kernel/</command>.
   1587 It is your responsibility to ensure that the correct images are found when using this
   1588 option.
   1589 </para>
   1590 <para>
   1591 Note that if a binary image changes after the sample file was created, you won't be able to get useful
   1592 symbol-based data out. This situation is detected for you. If you replace a binary, you should
   1593 make sure to save the old binary if you need to do comparative profiles.
   1594 </para>
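<para>
One way to keep the old binary available is to archive the session with
<command>oparchive</command> before replacing the binary. The sketch below assumes
the binary <filename>/bin/bash</filename> and a hypothetical archive directory
<filename>./orig</filename>; the final command compares the archived profile with the
current session, in the same form as the differential examples earlier in this section.
</para>
<screen>
# oparchive -o ./orig /bin/bash
  ... replace the binary and profile it again ...
# opreport -l /bin/bash { archive:./orig } { }
</screen>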
   1595 
   1596 </sect2>
   1597 
   1598 <sect2 id="no-results">
   1599 <title>What to do when you don't get any results</title>
   1600 <para>
When attempting to get output, you may see the error:
   1602 </para>
   1603 <screen>
   1604 error: no sample files found: profile specification too strict ?
   1605 </screen>
   1606 <para>
   1607 What this is saying is that the profile specification you passed in,
   1608 when matched against the available sample files, resulted in no matches.
   1609 There are a number of reasons this might happen:
   1610 </para>
   1611 <variablelist>
   1612 <varlistentry><term>spelling</term><listitem><para>
You specified a binary name, but misspelled it. Check your spelling!
   1614 </para></listitem></varlistentry>
   1615 <varlistentry><term>profiler wasn't running</term><listitem><para>
   1616 Make very sure that OProfile was actually up and running when you ran
   1617 the binary.
   1618 </para></listitem></varlistentry>
   1619 <varlistentry><term>binary didn't run long enough</term><listitem><para>
Remember OProfile is a statistical profiler - you're not guaranteed to
get samples for short-running programs. You can improve your chances by using a
lower count for the performance counter, so that more samples
are taken per second.
   1624 </para></listitem></varlistentry>
   1625 <varlistentry><term>binary spent most of its time in libraries</term><listitem><para>
   1626 Similarly, if the binary spends little time in the main binary image
   1627 itself, with most of it spent in shared libraries it uses, you might
   1628 not see any samples for the binary image itself. You can check this
   1629 by using <command>opcontrol --separate=lib</command> before the
   1630 profiling session, so <command>opreport</command> and friends show
   1631 the library profiles on a per-application basis.
   1632 </para></listitem></varlistentry>
   1633 <varlistentry><term>specification was really too strict</term><listitem><para>
   1634 For example, you specified something like <option>tgid:3433</option>,
   1635 but no task with that group ID ever ran the code.
   1636 </para></listitem></varlistentry>
   1637 <varlistentry><term>binary didn't generate any events</term><listitem><para>
   1638 If you're using a particular event counter, for example counting MMX
   1639 operations, the code might simply have not generated any events in the
   1640 first place. Verify the code you're profiling does what you expect it
   1641 to.
   1642 </para></listitem></varlistentry>
   1643 <varlistentry><term>you didn't specify kernel module name correctly</term><listitem><para>
   1644 If you're using 2.6 kernels, and trying to get reports for a kernel
   1645 module, make sure to use the <option>-p</option> option, and specify the
   1646 module name <emphasis>with</emphasis> the <filename>.ko</filename>
   1647 extension. Check if the module is one loaded from initrd.
   1648 </para></listitem></varlistentry>
   1649 </variablelist>
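<para>
Several of these checks can be run directly from the shell. This is only a sketch:
<filename>mymodule.ko</filename> is a hypothetical module name, and the module path
is an assumption about the system layout. The commands check that the daemon is
running, force a dump, list everything in the current session with an empty profile
specification, and then request a symbol summary for a kernel module.
</para>
<screen>
# opcontrol --status
# opcontrol --dump
# opreport
# opreport -l -p /lib/modules/`uname -r`/kernel mymodule.ko
</screen>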
   1650 
   1651 </sect2>
   1652 
   1653 </sect1> <!-- profile-spec -->
   1654 
   1655 <sect1 id="opreport">
   1656 <title>Image summaries and symbol summaries (<command>opreport</command>)</title>
   1657 <para>
   1658 The <command>opreport</command> utility is the primary utility you will use for 
   1659 getting formatted data out of OProfile. It produces two types of data: image summaries
   1660 and symbol summaries. An image summary lists the number of samples for individual
   1661 binary images such as libraries or applications. Symbol summaries provide per-symbol
   1662 profile data. In the following example, we're getting an image summary for the whole
   1663 system:
   1664 </para>
   1665 <screen>
   1666 $ opreport --long-filenames
   1667 CPU: PIII, speed 863.195 MHz (estimated)
   1668 Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 23150
   1669    905898 59.7415 /usr/lib/gcc-lib/i386-redhat-linux/3.2/cc1plus
   1670    214320 14.1338 /boot/2.6.0/vmlinux
   1671    103450  6.8222 /lib/i686/libc-2.3.2.so
   1672     60160  3.9674 /usr/local/bin/madplay
   1673     31769  2.0951 /usr/local/oprofile-pp/bin/oprofiled
   1674     26550  1.7509 /usr/lib/libartsflow.so.1.0.0
   1675     23906  1.5765 /usr/bin/as
   1676     18770  1.2378 /oprofile
   1677     15528  1.0240 /usr/lib/qt-3.0.5/lib/libqt-mt.so.3.0.5
   1678     11979  0.7900 /usr/X11R6/bin/XFree86
   1679     11328  0.7471 /bin/bash
   1680     ...
   1681 </screen>
   1682 <para>
   1683 If we had specified <option>--symbols</option> in the previous command, we would have
   1684 gotten a symbol summary of all the images across the entire system. We can restrict this to only
   1685 part of the system profile; for example,
   1686 below is a symbol summary of the OProfile daemon. Note that as we used
   1687 <command>opcontrol --separate=kernel</command>, symbols from images that <command>oprofiled</command>
   1688 has used are also shown.
   1689 </para>
   1690 <screen>
   1691 $ opreport -l `which oprofiled` 2>/dev/null | more
   1692 CPU: PIII, speed 863.195 MHz (estimated)
   1693 Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 23150
   1694 vma      samples  %           image name               symbol name
   1695 0804be10 14971    28.1993     oprofiled                odb_insert
   1696 0804afdc 7144     13.4564     oprofiled                pop_buffer_value
   1697 c01daea0 6113     11.5144     vmlinux                  __copy_to_user_ll
   1698 0804b060 2816      5.3042     oprofiled                opd_put_sample
   1699 0804b4a0 2147      4.0441     oprofiled                opd_process_samples
   1700 0804acf4 1855      3.4941     oprofiled                opd_put_image_sample
   1701 0804ad84 1766      3.3264     oprofiled                opd_find_image
   1702 0804a5ec 1084      2.0418     oprofiled                opd_find_module
   1703 0804ba5c 741       1.3957     oprofiled                odb_hash_add_node
   1704 ...
   1705 </screen>
   1706 
   1707 <para>
These are the two basic forms of output you are most likely to use regularly, but <command>opreport</command>
can do a lot more than that, as described below.
   1710 </para>
   1711 
   1712 <sect2 id="opreport-merging">
   1713 <title>Merging separate profiles</title>
   1714 
<para>
If you have used one of the <option>--separate=</option> options
whilst profiling, there can be several separate profiles for
a single binary image within a session. Normally the output
will keep these images separated (so, for example, the image summary
output shows library image summaries on a per-application basis,
when using <option>--separate=lib</option>).
Sometimes it can be useful to merge these results back together
before getting results. The <option>--merge</option> option allows
you to do that.
</para>
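<para>
For instance, assuming a session profiled with <option>--separate=lib</option>, the
per-application library profiles could be merged back together with the first command
below. The second command is a similar sketch for a session separated by task group.
</para>
<screen>
# opreport --merge=lib
# opreport -l --merge=tgid
</screen>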
   1724 </sect2>
   1725 
   1726 <sect2 id="opreport-comparison">
   1727 <title>Side-by-side multiple results</title>
<para>
If you have used multiple events when profiling, by default you get
side-by-side results of each event's sample values from <command>opreport</command>.
You can restrict which events to list by appropriate use of the
<option>event:</option> profile specifications, etc.
</para>
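<para>
A sketch of such a session follows, assuming a CPU that provides both events and
reusing counts that appear elsewhere in this manual. The first
<command>opreport</command> command shows both events side by side; the second
restricts the listing to one of them.
</para>
<screen>
# opcontrol --event=CPU_CLK_UNHALTED:23150 --event=DATA_MEM_REFS:30000
# opcontrol --start
  ... run the workload ...
# opcontrol --dump
# opreport -l
# opreport -l event:DATA_MEM_REFS
</screen>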
   1732 </sect2>
   1733 
   1734 <sect2 id="opreport-callgraph">
   1735 <title>Callgraph output</title>
   1736 <para>
   1737 This section provides details on how to use the OProfile callgraph feature.
   1738 </para>
   1739 <sect3 id="op-cg1">
   1740 <title>Callgraph details</title>
   1741 <para>
   1742 When using the <option>opcontrol --callgraph</option> option, you can see what
   1743 functions are calling other functions in the output. Consider the
   1744 following program:
   1745 </para>
   1746 <screen>
#define _GNU_SOURCE             /* for strfry(), a GNU extension */
#include &lt;string.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;

#define SIZE 500000

/* qsort() passes pointers to the array elements, i.e. char ** */
static int compare(const void *s1, const void *s2)
{
        return strcmp(*(char * const *)s1, *(char * const *)s2);
}

static void repeat(void)
{
        int i;
        char *strings[SIZE];
        char str[] = "abcdefghijklmnopqrstuvwxyz";

        /* create SIZE randomly shuffled copies of the alphabet */
        for (i = 0; i &lt; SIZE; ++i) {
                strings[i] = strdup(str);
                strfry(strings[i]);
        }

        qsort(strings, SIZE, sizeof(char *), compare);

        /* release the copies so repeated calls do not leak memory */
        for (i = 0; i &lt; SIZE; ++i)
                free(strings[i]);
}

int main(void)
{
        /* loop forever so there is plenty of time to collect samples */
        while (1)
                repeat();
}
   1777 </screen>
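<para>
To try this out, one might build and profile the program roughly as follows. The
source file name <filename>cg.c</filename> and the call-graph depth of 6 are
assumptions; the binary is named <filename>cg</filename> to match the image name in
the sample output in this section.
</para>
<screen>
$ gcc -g -o cg cg.c
# opcontrol --callgraph=6
# opcontrol --start
$ ./cg &amp;
  ... let it run for a while, then stop it ...
# opcontrol --dump
# opreport --callgraph ./cg
</screen>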
   1778 <para>
   1779 When running with the call-graph option, OProfile will
   1780 record the function stack every time it takes a sample.
   1781 <command>opreport --callgraph</command> outputs an entry for each
   1782 function, where each entry looks similar to:
   1783 </para>
   1784 <screen>
   1785 samples  %        image name               symbol name
   1786   197       0.1548  cg                       main
   1787   127036   99.8452  cg                       repeat
   1788 84590    42.5084  libc-2.3.2.so            strfry
   1789   84590    66.4838  libc-2.3.2.so            strfry [self]
   1790   39169    30.7850  libc-2.3.2.so            random_r
   1791   3475      2.7312  libc-2.3.2.so            __i686.get_pc_thunk.bx
   1792 -------------------------------------------------------------------------------
   1793 </screen>
   1794 <para>
   1795 Here the non-indented line is the function we're focussing upon
   1796 (<function>strfry()</function>). This
   1797 line is the same as you'd get from a normal <command>opreport</command>
   1798 output.
   1799 </para>
   1800 <para>
   1801 Above the non-indented line we find the functions that called this
   1802 function (for example, <function>repeat()</function> calls
   1803 <function>strfry()</function>). The samples and percentage values here
   1804 refer to the number of times we took a sample where this call was found
   1805 in the stack; the percentage is relative to all other callers of the
   1806 function we're focussing on. Note that these values are
   1807 <emphasis>not</emphasis> call counts; they only reflect the call stack
   1808 every time a sample is taken; that is, if a call is found in the stack
   1809 at the time of a sample, it is recorded in this count.
   1810 </para>
   1811 <para>
   1812 Below the line are functions that are called by
   1813 <function>strfry()</function> (called <emphasis>callees</emphasis>).
   1814 It's clear here that <function>strfry()</function> calls
   1815 <function>random_r()</function>. We also see a special entry with a
   1816 "[self]" marker. This records the normal samples for the function, but
the percentage becomes relative to all callees. This allows you to
compare time spent in the function itself with time spent in the functions it
calls. Note that if a function calls itself, then it will appear in the
   1820 list of callees of itself, but without the "[self]" marker; so recursive
   1821 calls are still clearly separable.
   1822 </para>
   1823 <para>
   1824 You may have noticed that the output lists <function>main()</function>
   1825 as calling <function>strfry()</function>, but it's clear from the source
   1826 that this doesn't actually happen. See <xref
   1827 linkend="interpreting-callgraph" /> for an explanation.
   1828 </para>
   1829 </sect3>
   1830 <sect3 id="cg-with-jitsupport">
   1831 <title>Callgraph and JIT support</title>
   1832 <para>
   1833 Callgraph output where anonymously mapped code is in the callstack can sometimes be misleading.
   1834 For all such code, the samples for the anonymously mapped code are stored in a samples subdirectory
   1835 named <filename>{anon:anon}/&lt;tgid&gt;.&lt;begin_addr&gt;.&lt;end_addr&gt;</filename>.
   1836 As stated earlier, if this anonymously mapped code is JITed code from a supported VM like Java,
   1837 OProfile creates an ELF file to provide a (somewhat) permanent backing file for the code.
However, when viewing callgraph output, any anonymously mapped code in the callstack
will be attributed to <filename>anon (tgid:&lt;tgid&gt; range:&lt;begin_addr&gt;-&lt;end_addr&gt;)</filename>,
even if a <filename>.jo</filename> ELF file had been created for it.  See the example below.
   1841 </para>
   1842 <screen>
   1843 -------------------------------------------------------------------------------
   1844   1         2.2727  libj9ute23.so            java.bin                 traceV
   1845   2         4.5455  libj9ute23.so            java.bin                 utsTraceV
   1846   4         9.0909  libj9trc23.so            java.bin                 fillInUTInterfaces
   1847   37       84.0909  libj9trc23.so            java.bin                 twGetSequenceCounter
   1848 8         0.0154  libj9prt23.so            java.bin                 j9time_hires_clock
   1849   27       61.3636  anon (tgid:10014 range:0x100000-0x103000) java.bin                 (no symbols)
   1850   9        20.4545  libc-2.4.so              java.bin                 gettimeofday
   1851   8        18.1818  libj9prt23.so            java.bin                 j9time_hires_clock [self]
   1852 -------------------------------------------------------------------------------
   1853 </screen>
   1854 <para>
   1855 The output shows that "anon (tgid:10014 range:0x100000-0x103000)" was a callee of
   1856 <code>j9time_hires_clock</code>, even though the ELF file <filename>10014.jo</filename> was
   1857 created for this profile run.  Unfortunately, there is currently no way to correlate
   1858 that anonymous callgraph entry with its corresponding <filename>.jo</filename> file.
   1859 </para>
   1860 </sect3>
   1861 
   1862 
   1863 </sect2> <!-- opreport-callgraph -->
   1864 
   1865 <sect2 id="opreport-diff">
   1866 <title>Differential profiles with <command>opreport</command></title>
   1867 
<para>
Often, we'd like to be able to compare two profiles. For example, when
analysing the performance of an application, we'd like to make code
changes and examine the effect of the change. This is supported in
<command>opreport</command> by giving a profile specification that
identifies two different profiles. The general form is:
</para>
   1875 <screen>
   1876 $ opreport &lt;shared-spec&gt; { &lt;first-profile&gt; } { &lt;second-profile&gt; }
   1877 </screen>
   1878 <note><para>
   1879 We lost our Dragon book down the back of the sofa, so you have to be
   1880 careful to have spaces around those braces, or things will get
   1881 hopelessly confused. We can only apologise.
   1882 </para></note>
   1883 <para>
   1884 For each of the profiles, the shared section is prefixed, and then the
   1885 specification is analysed. The usual parameters work both within the
   1886 shared section, and in the sub-specification within the curly braces.
   1887 </para>
   1888 <para>
   1889 A typical way to use this feature is with archives created with
   1890 <command>oparchive</command>. Let's look at an example:
   1891 </para>
   1892 <screen>
   1893 $ ./a
   1894 $ oparchive -o orig ./a
   1895 $ opcontrol --reset
   1896   # edit and recompile a
   1897 $ ./a
   1898   # now compare the current profile of a with the archived profile
   1899 $ opreport -xl ./a { archive:./orig } { }
   1900 CPU: PIII, speed 863.233 MHz (estimated)
   1901 Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a
   1902 unit mask of 0x00 (No unit mask) count 100000
   1903 samples  %        diff %    symbol name
   1904 92435    48.5366  +0.4999   a
   1905 54226    ---      ---       c
   1906 49222    25.8459  +++       d
   1907 48787    25.6175  -2.2e-01  b
   1908 </screen>
   1909 <para>
   1910 Note that we specified an empty second profile in the curly braces, as
   1911 we wanted to use the current session; alternatively, we could
   1912 have specified another archive, or a tgid etc. We specified the binary
   1913 <command>a</command> in the shared section, so we matched that in both
   1914 the profiles we're diffing.
   1915 </para>
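<para>
As a sketch of the archive-against-archive case, the second profile could be
archived as well and both archives named in the braces. The directory name
<filename>new</filename> is only an illustrative choice:
</para>
<screen>
$ oparchive -o orig ./a
  # edit and recompile a, then reset and profile again as above
$ oparchive -o new ./a
$ opreport -xl ./a { archive:./orig } { archive:./new }
</screen>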
   1916 <para>
   1917 As in the normal output, the results are sorted by the number of
   1918 samples, and the percentage field represents the relative percentage of
   1919 the symbol's samples in the second profile.
   1920 </para>
   1921 <para>
   1922 Notice the new column in the output. This value represents the
   1923 percentage change of the relative percent between the first and the
   1924 second profile: roughly, "how much more important this symbol is".
   1925 Looking at the symbol <function>a()</function>, we can see that it took
   1926 roughly the same amount of the total profile in both the first and the
   1927 second profile. The function <function>c()</function> was not in the new
   1928 profile, so has been marked with <function>---</function>. Note that the
   1929 sample value is the number of samples in the first profile; since we're
   1930 displaying results for the second profile, we don't list a percentage
   1931 value for it, as it would be meaningless. <function>d()</function> is
   1932 new in the second profile, and consequently marked with
   1933 <function>+++</function>.
   1934 </para>
   1935 <para>
   1936 When comparing profiles between different binaries, it should be clear
   1937 that functions can change in terms of VMA and size. To avoid this
   1938 problem, <command>opreport</command> considers a symbol to be the same
   1939 if the symbol name, image name, and owning application name all match;
   1940 any other factors are ignored. Note that the check for application name
   1941 means that trying to compare library profiles between two different
   1942 applications will not work as you might expect: each symbol will be
   1943 considered different.
   1944 </para>
   1945 
   1946 </sect2> <!-- opreport-diff -->
   1947 
   1948 <sect2 id="opreport-anon">
   1949 <title>Anonymous executable mappings</title>
   1950 <para>
   1951 Many applications, typically ones involving dynamic compilation into
   1952 machine code (just-in-time, or "JIT", compilation), have executable mappings that
   1953 are not backed by an ELF file. <command>opreport</command> has basic support for showing the
   1954 samples taken in these regions; for example:
   1955 <screen>
   1956 $ opreport /usr/bin/mono -l
   1957 CPU: ppc64 POWER5, speed 1654.34 MHz (estimated)
   1958 Counted CYCLES events (Processor Cycles using continuous sampling) with a unit mask of 0x00 (No unit mask) count 100000
   1959 samples  %        image name    		                symbol name
   1960 47       58.7500  mono                     			(no symbols)
   1961 14       17.5000  anon (tgid:3189 range:0xf72aa000-0xf72fa000)  (no symbols)
   1962 9        11.2500  anon (tgid:3189 range:0xf6cca000-0xf6dd9000)  (no symbols)
   1963 .	 .	  .						.
   1964 </screen>
   1965 </para>
   1966 <para>
   1967 Note that, since such mappings are dependent upon individual invocations of
   1968 a binary, these mappings are always listed as a dependent image,
   1969 even when using <option>--separate=none</option>.
   1970 Equally, the results are not affected by the <option>--merge</option>
   1971 option.
   1972 </para>
   1973 <para>
   1974 As shown in the opreport output above, OProfile is unable to attribute the samples to any
   1975 symbol(s) because there is no ELF file for this code.
   1976 Enhanced support for JITed code is now available for some virtual machines; 
   1977 e.g., the Java Virtual Machine.  For details about OProfile output for
   1978 JITed code, see <xref linkend="getting-jit-reports" />.
   1979 </para>
   1980 <para>For more information about JIT support in OProfile, see <xref linkend="jitsupport"/>.
   1981 </para>
   1982 </sect2> <!-- opreport-anon -->
   1983 
   1984 <sect2 id="opreport-xml">
   1985 <title>XML formatted output</title>
<para>
The <option>--xml</option> option can be used to generate XML instead of the usual
text format.  This allows <command>opreport</command> to eliminate some of the constraints
dictated by the two-dimensional text format.  For example, it is possible
to separate the sample data across multiple events, CPUs and threads.  The XML
schema implemented by <command>opreport</command> is found in <filename>doc/opreport.xsd</filename>. It contains
more detailed comments about the structure of the XML generated by <command>opreport</command>.
</para>
<para>
Since XML is consumed by a client program rather than a user, its structure
is fairly static.  In particular, the <option>--sort</option> option is incompatible with the
<option>--xml</option> option.  Percentages are not displayed in the XML, so the options related
to percentages will have no effect.  Full pathnames are always displayed in
the XML, so <option>--long-filenames</option> is not necessary.  The <option>--details</option> option will cause
all of the individual sample data to be included in the XML, as well as the
instruction byte stream for each symbol (for doing disassembly), and can result
in very large XML files.
</para>
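<para>
A minimal sketch of generating a report for a client program, using the
<option>--xml</option> and <option>--output-file</option> options listed below. The file name
<filename>profile.xml</filename> is arbitrary. The validation step assumes that the
separate <command>xmllint</command> tool from libxml2 is installed and that the command is run
from the top of the OProfile source tree:
</para>
<screen>
$ opreport --xml --output-file=profile.xml
$ xmllint --noout --schema doc/opreport.xsd profile.xml
</screen>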
   2004 </sect2> <!-- opreport-xml -->
   2005 
   2006 <sect2 id="opreport-options">
   2007 <title>Options for <command>opreport</command></title>
   2008 
   2009 <variablelist>
   2010 <varlistentry><term><option>--accumulated / -a</option></term><listitem><para>
   2011 Accumulate sample and percentage counts in the symbol list.
   2012 </para></listitem></varlistentry>
   2013 <varlistentry><term><option>--callgraph / -c</option></term><listitem><para>
   2014 Show callgraph information.
   2015 </para></listitem></varlistentry>
   2016 <varlistentry><term><option>--debug-info / -g</option></term><listitem><para>
   2017 Show source file and line for each symbol.
   2018 </para></listitem></varlistentry>
   2019 <varlistentry><term><option>--demangle / -D none|normal|smart</option></term><listitem><para>
none: no demangling. normal: use default demangler (default). smart: use
pattern-matching to make C++ symbol demangling more readable.
   2022 </para></listitem></varlistentry>
   2023 <varlistentry><term><option>--details / -d</option></term><listitem><para>
   2024 Show per-instruction details for all selected symbols. Note that, for
   2025 binaries without symbol information, the VMA values shown are raw file
   2026 offsets for the image binary.
   2027 </para></listitem></varlistentry>
   2028 <varlistentry><term><option>--exclude-dependent / -x</option></term><listitem><para>
   2029 Do not include application-specific images for libraries, kernel modules
   2030 and the kernel. This option only makes sense if the profile session
   2031 used --separate.
   2032 </para></listitem></varlistentry>
   2033 <varlistentry><term><option>--exclude-symbols / -e [symbols]</option></term><listitem><para>
   2034 Exclude all the symbols in the given comma-separated list.
   2035 </para></listitem></varlistentry>
   2036 <varlistentry><term><option>--global-percent / -%</option></term><listitem><para>
   2037 Make all percentages relative to the whole profile.
   2038 </para></listitem></varlistentry>
   2039 <varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
   2040 Show help message.
   2041 </para></listitem></varlistentry>
   2042 <varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
   2043 Comma-separated list of additional paths to search for binaries.
   2044 This is needed to find modules in kernels 2.6 and upwards.
   2045 </para></listitem></varlistentry>
   2046 <varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
   2047 A path to a filesystem to search for additional binaries.
   2048 </para></listitem></varlistentry>
   2049 <varlistentry><term><option>--include-symbols / -i [symbols]</option></term><listitem><para>
   2050 Only include symbols in the given comma-separated list.
   2051 </para></listitem></varlistentry>
   2052 <varlistentry><term><option>--long-filenames / -f</option></term><listitem><para>
   2053 Output full paths instead of basenames.
   2054 </para></listitem></varlistentry>
   2055 <varlistentry><term><option>--merge / -m [lib,cpu,tid,tgid,unitmask,all]</option></term><listitem><para>
   2056 Merge any profiles separated in a --separate session.
   2057 </para></listitem></varlistentry>
   2058 <varlistentry><term><option>--no-header</option></term><listitem><para>
   2059 Don't output a header detailing profiling parameters.
   2060 </para></listitem></varlistentry>
   2061 <varlistentry><term><option>--output-file / -o [file]</option></term><listitem><para>
   2062 Output to the given file instead of stdout.
   2063 </para></listitem></varlistentry>
   2064 <varlistentry><term><option>--reverse-sort / -r</option></term><listitem><para>
   2065 Reverse the sort from the default.
   2066 </para></listitem></varlistentry>
   2067 <varlistentry><term><option>--session-dir=</option>dir_path</term><listitem><para>
   2068 Use sample database out of directory <filename>dir_path</filename> 
   2069 instead of the default location (/var/lib/oprofile).
   2070 </para></listitem></varlistentry>
   2071 <varlistentry><term><option>--show-address / -w</option></term><listitem><para>
   2072 Show the VMA address of each symbol (off by default).
   2073 </para></listitem></varlistentry>
   2074 <varlistentry><term><option>--sort / -s [vma,sample,symbol,debug,image]</option></term><listitem><para>
   2075 Sort the list of symbols by, respectively, symbol address,
   2076 number of samples, symbol name, debug filename and line number,
   2077 binary image filename.
   2078 </para></listitem></varlistentry>
   2079 <varlistentry><term><option>--symbols / -l</option></term><listitem><para>
   2080 List per-symbol information instead of a binary image summary.
   2081 </para></listitem></varlistentry>
   2082 <varlistentry><term><option>--threshold / -t [percentage]</option></term><listitem><para>
   2083 Only output data for symbols that have more than the given percentage
   2084 of total samples.
   2085 </para></listitem></varlistentry>
   2086 <varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
   2087 Give verbose debugging output.
   2088 </para></listitem></varlistentry>
   2089 <varlistentry><term><option>--version / -v</option></term><listitem><para>
   2090 Show version.
   2091 </para></listitem></varlistentry>
   2092 <varlistentry><term><option>--xml / -X</option></term><listitem><para>
   2093 Generate XML output.
   2094 </para></listitem></varlistentry>
   2095 </variablelist>
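<para>
These options can be combined. As an illustration only (the binary
<filename>./a</filename> and the threshold value are arbitrary), the following would
list per-symbol results sorted by symbol name, drop symbols at or below 1% of the
total samples, and omit the header:
</para>
<screen>
$ opreport --symbols --sort=symbol --threshold=1 --no-header ./a
</screen>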
   2096 
   2097 </sect2>
   2098 
   2099 </sect1> <!-- opreport -->
   2100 
   2101 <sect1 id="opannotate">
   2102 <title>Outputting annotated source (<command>opannotate</command>)</title>
   2103 <para>
   2104 The <command>opannotate</command> utility generates annotated source files or assembly listings, optionally
   2105 mixed with source.
   2106 If you want to see the source file, the profiled application needs to have debug information, and the source
   2107 must be available through this debug information. For GCC, you must use the <option>-g</option> option
   2108 when you are compiling.
   2109 If the binary doesn't contain sufficient debug information, you can still
   2110 use <command>opannotate <option>--assembly</option></command> to get annotated assembly.
   2111 </para>
   2112 <para>
   2113 Note that for the reason explained in <xref linkend="hardware-counters" /> the results can be
   2114 inaccurate. The debug information itself can add other problems; for example, the line number for a symbol can be
   2115 incorrect. Assembly instructions can be re-ordered and moved by the compiler, and this can lead to
   2116 crediting source lines with samples not really "owned" by this line. Also see
   2117 <xref linkend="interpreting" />.
   2118 </para>
<para>
You can output the annotation to a single file containing all the source found by using the
<option>--source</option> option. You can use this in conjunction with <option>--assembly</option>
to get combined source/assembly output.
</para>
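<para>
For example, assuming the annotation is written to standard output when no
output directory is given, a combined listing could be captured like this (the
output file name is arbitrary):
</para>
<screen>
$ opannotate --source --assembly /usr/local/oprofile-pp/bin/oprofiled &gt; oprofiled.annotated
</screen>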
   2124 <para>
   2125 You can also output a directory of annotated source files that maintains the structure of
   2126 the original sources. Each line in the annotated source is prepended with the samples
   2127 for that line. Additionally, each symbol is annotated giving details for the symbol
   2128 as a whole. An example:
   2129 </para>
   2130 <screen>
   2131 $ opannotate --source --output-dir=annotated /usr/local/oprofile-pp/bin/oprofiled
   2132 $ ls annotated/home/moz/src/oprofile-pp/daemon/
   2133 opd_cookie.h  opd_image.c  opd_kernel.c  opd_sample_files.c  oprofiled.c
   2134 </screen>
<para>
Line numbers are maintained in the source files, but each file has
a footer appended describing the profiling details. The actual annotation
looks something like this:
</para>
   2140 <screen>
   2141 ...
   2142                :static uint64_t pop_buffer_value(struct transient * trans)
   2143  11510  1.9661 :{ /* pop_buffer_value total:  89901 15.3566 */
   2144                :        uint64_t val;
   2145                :
   2146  10227  1.7469 :        if (!trans->remaining) {
   2147                :                fprintf(stderr, "BUG: popping empty buffer !\n");
   2148                :                exit(EXIT_FAILURE);
   2149                :        }
   2150                :
   2151                :        val = get_buffer_value(trans->buffer, 0);
   2152   2281  0.3896 :        trans->remaining--;
   2153   2296  0.3922 :        trans->buffer += kernel_pointer_size;
   2154                :        return val;
   2155  10454  1.7857 :}
   2156 ...
   2157 </screen>
   2158 
   2159 <para>
   2160 The first number on each line is the number of samples, whilst the second is
   2161 the relative percentage of total samples.
   2162 </para>
   2163 
   2164 <sect2 id="opannotate-finding-source">
   2165 <title>Locating source files</title>
   2166 <para>
   2167 Of course, <command>opannotate</command> needs to be able to locate the source files
   2168 for the binary image(s) in order to produce output. Some binary images have debug
   2169 information where the given source file paths are relative, not absolute. You can
   2170 specify search paths to look for these files (similar to <command>gdb</command>'s
   2171 <option>dir</option> command) with the <option>--search-dirs</option> option.
   2172 </para>
<para>
Sometimes you may have a binary image which gives absolute paths for the source files,
but you have the actual sources elsewhere (commonly, you've installed an SRPM for
a binary on your system and you want annotation from an existing profile). You can
use the <option>--base-dirs</option> option to redirect OProfile to look somewhere
else for source files. For example, imagine you have a binary generated from a source
file that is given in the debug information as <filename>/tmp/build/libfoo/foo.c</filename>,
and you have the source tree matching that binary installed in <filename>/home/user/libfoo/</filename>.
You can redirect OProfile to find <filename>foo.c</filename> correctly like this:
</para>
   2183 <screen>
   2184 $ opannotate --source --base-dirs=/tmp/build/libfoo/ --search-dirs=/home/user/libfoo/ --output-dir=annotated/ /lib/libfoo.so
   2185 </screen>
   2186 <para>
   2187 You can specify multiple (comma-separated) paths to both options.
   2188 </para>
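<para>
As a sketch with two prefixes, where <filename>libbar</filename> is a hypothetical
second source tree built and installed alongside <filename>libfoo</filename>:
</para>
<screen>
$ opannotate --source --base-dirs=/tmp/build/libfoo/,/tmp/build/libbar/ --search-dirs=/home/user/libfoo/,/home/user/libbar/ --output-dir=annotated/ /lib/libfoo.so
</screen>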
   2189 </sect2>
   2190 
   2191 <sect2 id="opannotate-details">
   2192 <title>Usage of <command>opannotate</command></title>
   2193 
   2194 <variablelist>
   2195 <varlistentry><term><option>--assembly / -a</option></term><listitem><para>
   2196 Output annotated assembly. If this is combined with --source, then mixed
   2197 source / assembly annotations are output.
   2198 </para></listitem></varlistentry>
<varlistentry><term><option>--base-dirs / -b [paths]</option></term><listitem><para>
Comma-separated list of path prefixes. This can be used to point OProfile to a
different location for source files when the debug information specifies an
absolute source path that does not exist on your system. The prefix
is stripped from the debug source file paths, and the remainder is then searched for in the
directories specified by <option>--search-dirs</option>.
</para></listitem></varlistentry>
   2206 <varlistentry><term><option>--demangle / -D none|normal|smart</option></term><listitem><para>
none: no demangling. normal: use default demangler (default). smart: use
pattern-matching to make C++ symbol demangling more readable.
   2209 </para></listitem></varlistentry>
   2210 <varlistentry><term><option>--exclude-dependent / -x</option></term><listitem><para>
   2211 Do not include application-specific images for libraries, kernel modules
   2212 and the kernel. This option only makes sense if the profile session
   2213 used --separate.
   2214 </para></listitem></varlistentry>
   2215 <varlistentry><term><option>--exclude-file [files]</option></term><listitem><para>
   2216 Exclude all files in the given comma-separated list of glob patterns.
   2217 </para></listitem></varlistentry>
   2218 <varlistentry><term><option>--exclude-symbols / -e [symbols]</option></term><listitem><para>
   2219 Exclude all the symbols in the given comma-separated list.
   2220 </para></listitem></varlistentry>
   2221 <varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
   2222 Show help message.
   2223 </para></listitem></varlistentry>
   2224 <varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
   2225 Comma-separated list of additional paths to search for binaries.
   2226 This is needed to find modules in kernels 2.6 and upwards.
   2227 </para></listitem></varlistentry>
   2228 <varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
   2229 A path to a filesystem to search for additional binaries.
   2230 </para></listitem></varlistentry>
   2231 <varlistentry><term><option>--include-file [files]</option></term><listitem><para>
   2232 Only include files in the given comma-separated list of glob patterns.
   2233 </para></listitem></varlistentry>
   2234 <varlistentry><term><option>--include-symbols / -i [symbols]</option></term><listitem><para>
   2235 Only include symbols in the given comma-separated list.
   2236 </para></listitem></varlistentry>
   2237 <varlistentry><term><option>--objdump-params [params]</option></term><listitem><para>
   2238 Pass the given parameters as extra values when calling objdump.
   2239 </para></listitem></varlistentry>
   2240 <varlistentry><term><option>--output-dir / -o [dir]</option></term><listitem><para>
   2241 Output directory. This makes opannotate output one annotated file for each
   2242 source file. This option can't be used in conjunction with --assembly.
   2243 </para></listitem></varlistentry>
   2244 <varlistentry><term><option>--search-dirs / -d [paths]</option></term><listitem><para>
   2245 Comma-separated list of paths to search for source files. This is useful to find
   2246 source files when the debug information only contains relative paths.
   2247 </para></listitem></varlistentry>
   2248 <varlistentry><term><option>--source / -s</option></term><listitem><para>
   2249 Output annotated source. This requires debugging information to be available
   2250 for the binaries.
   2251 </para></listitem></varlistentry>
   2252 <varlistentry><term><option>--threshold / -t [percentage]</option></term><listitem><para>
   2253 Only output data for symbols that have more than the given percentage
   2254 of total samples.
   2255 </para></listitem></varlistentry>
   2256 <varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
   2257 Give verbose debugging output.
   2258 </para></listitem></varlistentry>
   2259 <varlistentry><term><option>--version / -v</option></term><listitem><para>
   2260 Show version.
   2261 </para></listitem></varlistentry>
   2262 </variablelist>
   2263 
   2264 
   2265 </sect2> <!-- opannotate-details -->
   2266 
   2267 </sect1> <!-- opannotate -->
   2268 
   2269 <sect1 id="getting-jit-reports">
   2270 	<title>OProfile results with JIT samples</title>
	<para>
		After profiling a Java (or other supported VM) application, the command
		<screen><command>opcontrol --dump</command></screen>
		flushes the sample buffers and creates ELF binaries from the
		intermediate files that were written by the agent library.
		The ELF binaries are named <filename>&lt;tgid&gt;.jo</filename>.
		With the symbol information stored in these ELF files, it is
		possible to map samples to the appropriate symbols.
	</para>
   2280 	<para>
   2281 		The usual analysis tools (<command>opreport</command> and/or 
   2282 		<command>opannotate</command>) can now be used
   2283 		to get symbols and assembly code for the instrumented VM processes.
   2284 	</para>
   2285 <para>
   2286 Below is an example of a profile report of a Java application that has been
   2287 instrumented with the provided agent library.
   2288 <screen>
   2289 $ opreport -l /usr/lib/jvm/jre-1.5.0-ibm/bin/java
   2290 CPU: Core Solo / Duo, speed 2167 MHz (estimated)
   2291 Counted CPU_CLK_UNHALTED events (Unhalted clock cycles) with a unit mask of 0x00 (Unhalted core cycles) count 100000
   2292 samples  %        image name               symbol name
   2293 186020   50.0523  no-vmlinux               no-vmlinux               (no symbols)
   2294 34333     9.2380  7635.jo                  java                     void test.f1()
   2295 19022     5.1182  libc-2.5.so              libc-2.5.so              _IO_file_xsputn@@GLIBC_2.1
   2296 18762     5.0483  libc-2.5.so              libc-2.5.so              vfprintf
   2297 16408     4.4149  7635.jo                  java                     void test$HelloThread.run()
   2298 16250     4.3724  7635.jo                  java                     void test$test_1.f2(int)
   2299 15303     4.1176  7635.jo                  java                     void test.f2(int, int)
   2300 13252     3.5657  7635.jo                  java                     void test.f2(int)
   2301 5165      1.3897  7635.jo                  java                     void test.f4()
   2302 955       0.2570  7635.jo                  java                     void test$HelloThread.run()~
   2303 
   2304 </screen>
   2305 </para>
<note><para>
	  Depending on the JVM that is used, certain options of <command>opreport</command> and <command>opannotate</command>
	  do not work, since they rely on debug information (e.g. source code line numbers)
	  that is not always available. The Sun JVM does provide the necessary debug
	  information via the JVMTI[PI] interface,
	  but other JVMs do not.
  </para></note>
	<para>
		As you can see in the opreport output, the JIT support agent for Java
		generates symbols that include the class and method signature.
		A symbol with the suffix &tilde;&lt;n&gt; (e.g.
		<code>void test$HelloThread.run()&tilde;1</code>) means that this is
		the &lt;n&gt;th occurrence of the identical name. This happens if a method is re-JITed.
		A symbol with the suffix %&lt;n&gt; means that the address space of this symbol
		was reused during the sample session (see <xref linkend="overlapping-symbols" />).
		The value &lt;n&gt; is the percentage of time that this symbol/code was present in
		relation to the total lifetime of all other overlapping symbols. A symbol of the form
		<code>&lt;return_val&gt; &lt;class_name&gt;$&lt;method_sig&gt;</code> denotes an
		inner class.
	</para>
   2326 </sect1>
   2327 
   2328 <sect1 id="opgprof">
   2329 <title><command>gprof</command>-compatible output (<command>opgprof</command>)</title>
   2330 <para>
   2331 If you're familiar with the output produced by <command>GNU gprof</command>,
   2332 you may find <command>opgprof</command> useful. It takes a single binary
   2333 as an argument, and produces a <filename>gmon.out</filename> file for use
   2334 with <command>gprof -p</command>. If call-graph profiling is enabled,
   2335 then this is also included.
   2336 </para>
   2337 <screen>
   2338 $ opgprof `which oprofiled` # generates gmon.out file
   2339 $ gprof -p `which oprofiled` | head
   2340 Flat profile:
   2341 
   2342 Each sample counts as 1 samples.
   2343   %   cumulative   self              self     total
   2344  time   samples   samples    calls  T1/call  T1/call  name
   2345  33.13 206237.00 206237.00                             odb_insert
   2346  22.67 347386.00 141149.00                             pop_buffer_value
   2347   9.56 406881.00 59495.00                             opd_put_sample
   2348   7.34 452599.00 45718.00                             opd_find_image
   2349   7.19 497327.00 44728.00                             opd_process_samples
   2350 </screen>
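<para>
If the default file name is not wanted, the <option>--output-filename</option> option
described below can be used, with the resulting file named explicitly on the
<command>gprof</command> command line. The name <filename>oprofiled.gmon</filename> is
only an example:
</para>
<screen>
$ opgprof --output-filename=oprofiled.gmon `which oprofiled`
$ gprof -p `which oprofiled` oprofiled.gmon | head
</screen>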
   2351 
   2352 <sect2 id="opgprof-details">
   2353 <title>Usage of <command>opgprof</command></title>
   2354 
   2355 <variablelist>
   2356 <varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
   2357 Show help message.
   2358 </para></listitem></varlistentry>
   2359 <varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
   2360 Comma-separated list of additional paths to search for binaries.
   2361 This is needed to find modules in kernels 2.6 and upwards.
   2362 </para></listitem></varlistentry>
   2363 <varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
   2364 A path to a filesystem to search for additional binaries.
   2365 </para></listitem></varlistentry>
   2366 <varlistentry><term><option>--output-filename / -o [file]</option></term><listitem><para>
Output to the given file instead of the default, <filename>gmon.out</filename>.
   2368 </para></listitem></varlistentry>
   2369 <varlistentry><term><option>--threshold / -t [percentage]</option></term><listitem><para>
   2370 Only output data for symbols that have more than the given percentage
   2371 of total samples.
   2372 </para></listitem></varlistentry>
   2373 <varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
   2374 Give verbose debugging output.
   2375 </para></listitem></varlistentry>
   2376 <varlistentry><term><option>--version / -v</option></term><listitem><para>
   2377 Show version.
   2378 </para></listitem></varlistentry>
   2379 </variablelist>
   2380 
   2381 </sect2> <!-- opgprof-details -->
   2382 
   2383 </sect1> <!-- opgprof -->
   2384 
   2385 <sect1 id="oparchive">
   2386 <title>Archiving measurements (<command>oparchive</command>)</title>
   2387 <para>
   2388 	The <command>oparchive</command> utility generates a directory populated
   2389 	with executable, debug, and oprofile sample files. This directory can be
   2390 	moved to another machine via <command>tar</command> and analyzed without
   2391 	further use of the data collection machine.
   2392 </para>
   2393 
   2394 <para>
   2395 	The following command would collect the sample files, the executables
   2396 	associated with the sample files, and the debuginfo files associated
   2397 	with the executables and copy them into
   2398 	<filename>/tmp/current_data</filename>:
   2399 </para>
   2400 
   2401 <screen>
   2402 # oparchive -o /tmp/current_data
   2403 </screen>
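<para>
One possible way to move the archive, sketched under the assumption that the
analysis machine also has OProfile installed, is to pack it with
<command>tar</command>, unpack it on the other machine, and point
<command>opreport</command> at it with an <option>archive:</option> specification:
</para>
<screen>
# oparchive -o /tmp/current_data
# tar -C /tmp -czf current_data.tar.gz current_data
  # copy current_data.tar.gz to the analysis machine, then:
$ tar -xzf current_data.tar.gz
$ opreport archive:./current_data
</screen>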
   2404 
   2405 <sect2 id="oparchive-details">
   2406 <title>Usage of <command>oparchive</command></title>
   2407 
   2408 <variablelist>
   2409 <varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
   2410 Show help message.
   2411 </para></listitem></varlistentry>
   2412 <varlistentry><term><option>--exclude-dependent / -x</option></term><listitem><para>
   2413 Do not include application-specific images for libraries, kernel modules
   2414 and the kernel. This option only makes sense if the profile session
   2415 used --separate.
   2416 </para></listitem></varlistentry>
   2417 <varlistentry><term><option>--image-path / -p [paths]</option></term><listitem><para>
   2418 Comma-separated list of additional paths to search for binaries.
   2419 This is needed to find modules in kernels 2.6 and upwards.
   2420 </para></listitem></varlistentry>
   2421 <varlistentry><term><option>--root / -R [path]</option></term><listitem><para>
   2422 A path to a filesystem to search for additional binaries.
   2423 </para></listitem></varlistentry>
   2424 <varlistentry><term><option>--output-directory / -o [directory]</option></term><listitem><para>
   2425 Output to the given directory. There is no default. This must be specified.
   2426 </para></listitem></varlistentry>
   2427 <varlistentry><term><option>--list-files / -l</option></term><listitem><para>
Only list the files that would be archived; do not copy them.
   2429 </para></listitem></varlistentry>
   2430 <varlistentry><term><option>--verbose / -V [options]</option></term><listitem><para>
   2431 Give verbose debugging output.
   2432 </para></listitem></varlistentry>
   2433 <varlistentry><term><option>--version / -v</option></term><listitem><para>
   2434 Show version.
   2435 </para></listitem></varlistentry>
   2436 </variablelist>
   2437 
   2438 </sect2> <!-- oparchive-details -->
   2439 
   2440 </sect1> <!-- oparchive -->
   2441 
   2442 <sect1 id="opimport">
   2443 <title>Converting sample database files (<command>opimport</command>)</title>
   2444 <para>
	This utility converts sample database files from a foreign binary format (ABI) to
	the native format. This is useful only when moving sample files between hosts,
	for analysis on platforms other than the one used for collection. The ABI format
	of the file to be imported is described in a text file located in <filename>$SESSION_DIR/abi</filename>.
   2449 </para>
   2450 
   2451 <para>
	The following command would convert the input sample files to the
	output sample files, using the given ABI file as a binary description
	of the input file and the current platform ABI as a binary description
	of the output file:
   2456 </para>
   2457 
   2458 <screen>
   2459 # opimport -a /var/lib/oprofile/abi -o /tmp/current/.../GLOBAL_POWER_EVENTS.200000.1.all.all.all /var/lib/.../mprime/GLOBAL_POWER_EVENTS.200000.1.all.all.all
   2460 </screen>
   2461 
   2462 <sect2 id="opimport-details">
   2463 <title>Usage of <command>opimport</command></title>
   2464 
   2465 <variablelist>
   2466 <varlistentry><term><option>--help / -? / --usage</option></term><listitem><para>
   2467 Show help message.
   2468 </para></listitem></varlistentry>
   2469 <varlistentry><term><option>--abi / -a [filename]</option></term><listitem><para>
Location of the ABI description file for the input.
</para></listitem></varlistentry>
<varlistentry><term><option>--force / -f</option></term><listitem><para>
Force conversion even if the input and output ABI are identical.
   2474 </para></listitem></varlistentry>
   2475 <varlistentry><term><option>--output / -o [filename]</option></term><listitem><para>
Specify the output filename. If the output file already exists, it is
not overwritten; the new data are accumulated into it. Sample filenames are
meaningful to the post-profiling tools and must be kept identical: in other
words, the pathname from the first path component containing a '{' must be
kept as is in the output filename.
   2481 </para></listitem></varlistentry>
   2482 <varlistentry><term><option>--verbose / -V</option></term><listitem><para>
   2483 Give verbose debugging output.
   2484 </para></listitem></varlistentry>
   2485 <varlistentry><term><option>--version / -v</option></term><listitem><para>
   2486 Show version.
   2487 </para></listitem></varlistentry>
   2488 </variablelist>
   2489 
   2490 </sect2> <!-- opimport-details -->
   2491 
   2492 </sect1> <!-- opimport -->
   2493 
   2494 </chapter>
   2495 
   2496 <chapter id="interpreting">
   2497 <title>Interpreting profiling results</title>
   2498 <para>
   2499 The standard caveats of profiling apply in interpreting the results from OProfile:
   2500 profile realistic situations, profile different scenarios, profile
for as long a time as possible, avoid system-specific artifacts, don't trust
   2502 the profile data too much. Also bear in mind the comments on the performance
   2503 counters above - you <emphasis>cannot</emphasis> rely on totally accurate
   2504 instruction-level profiling.  However, for almost all circumstances the data
   2505 can be useful. Ideally a utility such as Intel's VTUNE would be available to
   2506 allow careful instruction-level analysis; go hassle Intel for this, not me ;)
   2507 </para>
   2508 <sect1 id="irq-latency">
   2509 <title>Profiling interrupt latency</title>
   2510 <para>
   2511 This is an example of how the latency of delivery of profiling interrupts
   2512 can impact the reliability of the profiling data. This is pretty much a 
   2513 worst-case-scenario example: these problems are fairly rare.
   2514 </para>
   2515 <screen>
   2516 double fun(double a, double b, double c)
   2517 {
   2518  double result = 0;
   2519  for (int i = 0 ; i &lt; 10000; ++i) {
   2520   result += a;
   2521   result *= b;
   2522   result /= c;
   2523  }
   2524  return result;
   2525 }
   2526 </screen>
   2527 <para>
Here the last instruction of the loop is very costly, and you would expect the result
to reflect that. But (showing only the instructions inside the loop):
   2530 </para>
   2531 <screen>
   2532 $ opannotate -a -t 10 ./a.out
   2533 
   2534      88 15.38% : 8048337:       fadd   %st(3),%st
   2535      48 8.391% : 8048339:       fmul   %st(2),%st
   2536      68 11.88% : 804833b:       fdiv   %st(1),%st
   2537     368 64.33% : 804833d:       inc    %eax
   2538                : 804833e:       cmp    $0x270f,%eax
   2539                : 8048343:       jle    8048337
   2540 </screen>
   2541 <para>
The problem comes from the x86 hardware: when the counter overflows, the IRQ
is asserted, but the hardware has features that can delay the NMI interrupt.
x86 hardware is synchronous (i.e. it cannot interrupt during an instruction);
there is also a latency when the IRQ is asserted; and the multiple
execution units and the out-of-order model of modern x86 CPUs also cause
problems. This is the same function, with source annotation:
   2548 </para>
   2549 <screen>
   2550 $ opannotate -s -t 10 ./a.out
   2551 
   2552                :double fun(double a, double b, double c)
   2553                :{ /* _Z3funddd total:     572 100.0% */
   2554                : double result = 0;
   2555     368 64.33% : for (int i = 0 ; i &lt; 10000; ++i) {
   2556      88 15.38% :  result += a;
   2557      48 8.391% :  result *= b;
   2558      68 11.88% :  result /= c;
   2559                : }
   2560                : return result;
   2561                :}
   2562 </screen>
   2563 <para>
The conclusion: don't trust samples coming at the end of a loop,
particularly if the last instruction generated by the compiler is costly. This
case can also occur for branches. Always bear in mind that a sample
can be delayed by a few cycles from its real position. That's a hardware
problem and OProfile can do nothing about it.
   2569 </para>
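<para>
To reproduce measurements like those above, the function needs a driver that
keeps it running long enough to collect samples. The following is only a
minimal sketch: <function>run_fun()</function> and the argument values are
hypothetical and are not taken from the original test program.
</para>
<screen>
// fun() as shown above; run_fun() is a hypothetical driver.
double fun(double a, double b, double c)
{
 double result = 0;
 for (int i = 0 ; i &lt; 10000; ++i) {
  result += a;
  result *= b;
  result /= c;
 }
 return result;
}

// Call fun() repeatedly and accumulate the results, so that the
// return values are actually used by the caller.
double run_fun(int calls)
{
 double total = 0;
 for (int n = 0; n &lt; calls; ++n)
  total += fun(1.0, 2.0, 4.0);
 return total;
}
</screen>
<para>
Calling <function>run_fun()</function> with a large count while the profiler
is running should give enough samples to annotate; the exact distribution
of samples will depend on the CPU and the compiler options used.
</para>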
   2570 </sect1>
   2571 <sect1 id="kernel-profiling">
   2572 <title>Kernel profiling</title>
   2573 <sect2 id="irq-masking">
   2574 <title>Interrupt masking</title>
   2575 <para>
OProfile uses non-maskable interrupts (NMI) on the P6 generation, Pentium 4,
Athlon, Opteron, Phenom, and Turion processors. These interrupts can occur even in sections of the
Linux kernel where interrupts are disabled, allowing collection of samples in virtually
all executable code.  The RTC, timer interrupt mode, and Itanium 2 collection mechanisms
use maskable interrupts. Thus, the RTC and Itanium 2 data collection mechanisms have "sample
shadows", or blind spots: regions where no samples will be collected. Typically, the samples
will be attributed to the code immediately after the interrupts are re-enabled.
   2583 </para>
   2584 </sect2>
   2585 <sect2 id="idle">
   2586 <title>Idle time</title>
   2587 <para>
   2588 Your kernel is likely to support halting the processor when a CPU is idle. As
   2589 the typical hardware events like <constant>CPU_CLK_UNHALTED</constant> do not
   2590 count when the CPU is halted, the kernel profile will not reflect the actual
   2591 amount of time spent idle. You can change this behaviour by booting with
   2592 the <option>idle=poll</option> option, which uses a different idle routine. This
   2593 will appear as <function>poll_idle()</function> in your kernel profile.
   2594 </para>
   2595 </sect2>
   2596 <sect2 id="kernel-modules">
   2597 <title>Profiling kernel modules</title>
   2598 <para>
   2599 OProfile profiles kernel modules by default. However, there are a couple of problems
   2600 you may have when trying to get results. First, you may have booted via an initrd;
   2601 this means that the actual path for the module binaries cannot be determined automatically.
   2602 To get around this, you can use the <option>-p</option> option to the profiling tools
   2603 to specify where to look for the kernel modules.
   2604 </para>
   2605 <para>
   2606 In 2.6, the information on where kernel module binaries are located has been removed.
   2607 This means OProfile needs guiding with the <option>-p</option> option to find your
   2608 modules. Normally, you can just use your standard module top-level directory for this.
   2609 Note that due to this problem, OProfile cannot check that the modification times match;
   2610 it is your responsibility to make sure you do not modify a binary after a profile
   2611 has been created.
   2612 </para>
   2613 <para>
   2614 If you have run <command>insmod</command> or <command>modprobe</command> to insert a module
   2615 in a particular directory, it is important that you specify this directory with the 
<option>-p</option> option first, so that it overrides an older module binary that might
   2617 exist in other directories you've specified with <option>-p</option>. It is up to you
   2618 to make sure that these values are correct: 2.6 kernels simply do not provide enough
   2619 information for OProfile to get this information.
   2620 </para>
   2621 </sect2>
   2622 </sect1>
   2623 
   2624 <sect1 id="interpreting-callgraph">
   2625 <title>Interpreting call-graph profiles</title>
   2626 <para>
   2627 Sometimes the results from call-graph profiles may be different to what
   2628 you expect to see. The first thing to check is whether the target
binaries were compiled with frame pointers enabled (if the binary was
   2630 compiled using <command>gcc</command>'s
   2631 <option>-fomit-frame-pointer</option> option, you will not get
   2632 meaningful results). Note that as of this writing, the GCC developers
   2633 plan to disable frame pointers by default. The Linux kernel is built
   2634 without frame pointers by default; there is a configuration option you
   2635 can use to turn it on under the "Kernel Hacking" menu.
   2636 </para>
   2637 <para>
   2638 Often you may see a caller of a function that does not actually directly
   2639 call the function you're looking at (e.g. if <function>a()</function>
   2640 calls <function>b()</function>, which in turn calls
   2641 <function>c()</function>, you may see an entry for
   2642 <function>a()->c()</function>).  What's actually occurring is that we
   2643 are taking samples at the very start (or the very end) of
   2644 <function>c()</function>; at these few instructions, we haven't yet
   2645 created the new function's frame, so it appears as if
   2646 <function>a()</function> is calling directly into
   2647 <function>c()</function>. Be careful not to be misled by these
   2648 entries.
   2649 </para>
   2650 <para>
   2651 Like the rest of OProfile, call-graph profiling uses a statistical
   2652 approach; this means that sometimes a backtrace sample is truncated, or
   2653 even partially wrong. Bear this in mind when examining results.
   2654 </para>
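<para>
A small test case can make these effects easier to recognise. The sketch
below sets up a call chain like the one described above; the function
bodies are hypothetical. It assumes <command>gcc</command>: the
<constant>noinline</constant> attribute is a gcc extension used here so that
the calls are not inlined away, and the program is assumed to be built with
<option>-fno-omit-frame-pointer</option>.
</para>
<screen>
// Hypothetical call chain for call-graph experiments: a() calls b(),
// which calls c(). Assumes gcc and -fno-omit-frame-pointer.

__attribute__((noinline)) long c(long n)
{
 long sum = 0;
 for (long i = 0; i &lt; n; ++i)
  sum += i % 7;
 return sum;
}

__attribute__((noinline)) long b(long n)
{
 // The addition after the call means this is not a tail call,
 // so b() keeps its own frame while c() runs.
 return c(n) + 1;
}

__attribute__((noinline)) long a(long n)
{
 return b(n) + 1;
}
</screen>
<para>
Profiling a program that calls <function>a()</function> with a large argument
should attribute most samples to <function>c()</function> with
<function>b()</function> as its caller; any entries showing
<function>a()</function> as the direct caller of <function>c()</function>
are the artifact described above.
</para>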
   2655 <!--  FIXME: what do we need here ? -->
   2656 </sect1>
   2657 
   2658 <sect1 id="debug-info">
   2659 <title>Inaccuracies in annotated source</title>
   2660 <sect2 id="effect-of-optimizations">
   2661 <title>Side effects of optimizations</title>
   2662 <para>
The compiler can introduce some pitfalls in the annotated source output.
The optimizer can move pieces of code in such a manner that two lines of code
are interleaved (instruction scheduling). Also, debug info generated by the compiler
can show strange behavior. This is especially true for complex expressions, e.g. inside
an if statement:
   2668 </para>
   2669 <screen>
   2670 	if (a &amp;&amp; ..
   2671 	    b &amp;&amp; ..
   2672 	    c &amp;&amp;)
   2673 </screen>
   2674 <para>
Here the problem comes from the position of the line number. The available debug
info does not give enough detail for the if condition, so all samples are
accumulated at the position of the right brace of the expression. Using
<command>opannotate <option>-a</option></command> can help to show the real
samples at an assembly level.
   2680 </para>
   2681 </sect2>
   2682 <sect2 id="prologues">
   2683 <title>Prologues and epilogues</title>
   2684 <para>
   2685 The compiler generally needs to generate "glue" code across function calls, dependent
   2686 on the particular function call conventions used. Additionally other things
   2687 need to happen, like stack pointer adjustment for the local variables; this
   2688 code is known as the function prologue. Similar code is needed at function return,
   2689 and is known as the function epilogue. This will show up in annotations as
   2690 samples at the very start and end of a function, where there is no apparent
   2691 executable code in the source.
   2692 </para>
   2693 </sect2>
   2694 <sect2 id="inlined-function">
   2695 <title>Inlined functions</title>
   2696 <para>
   2697 You may see that a function is credited with a certain number of samples, but
the listing does not add up to the correct total. To pick a real example:
   2699 </para>
   2700 <screen>
   2701                :internal_sk_buff_alloc_security(struct sk_buff *skb)
   2702  353 2.342%    :{ /* internal_sk_buff_alloc_security total: 1882 12.48% */
   2703                :
   2704                :        sk_buff_security_t *sksec;
   2705   15 0.0995%   :        int rc = 0;
   2706                :
   2707   10 0.06633%  :        sksec = skb-&gt;lsm_security;
   2708  468 3.104%    :        if (sksec &amp;&amp; sksec-&gt;magic == DSI_MAGIC) {
   2709                :                goto out;
   2710                :        }
   2711                :
   2712                :        sksec = (sk_buff_security_t *) get_sk_buff_memory(skb);
   2713    3 0.0199%   :        if (!sksec) {
   2714   38 0.2521%   :                rc = -ENOMEM;
   2715                :                goto out;
   2716   10 0.06633%  :        }
   2717                :        memset(sksec, 0, sizeof (sk_buff_security_t));
   2718   44 0.2919%   :        sksec-&gt;magic = DSI_MAGIC;
   2719   32 0.2123%   :        sksec-&gt;skb = skb;
   2720   45 0.2985%   :        sksec-&gt;sid = DSI_SID_NORMAL;
   2721   31 0.2056%   :        skb-&gt;lsm_security = sksec;
   2722                :
   2723                :      out:
   2724                :
   2725  146 0.9685%   :        return rc;
   2726                :
   2727   98 0.6501%   :}
   2728 </screen>
   2729 <para>
   2730 Here, the function is credited with 1,882 samples, but the annotations
   2731 below do not account for this. This is usually because of inline functions -
   2732 the compiler marks such code with debug entries for the inline function
   2733 definition, and this is where <command>opannotate</command> annotates
   2734 such samples. In the case above, <function>memset</function> is the most
   2735 likely candidate for this problem. Examining the mixed source/assembly
   2736 output can help identify such results.
   2737 </para>
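<para>
The size of the gap can be checked by adding up the per-line counts. As a
rough sketch (the helper <function>unattributed_samples()</function> is
hypothetical and not part of OProfile), using the figures copied from the
listing above:
</para>
<screen>
#include &lt;numeric&gt;
#include &lt;vector&gt;

// Samples credited to the symbol minus the samples visible on its
// source lines. Hypothetical helper, for illustration only.
long unattributed_samples(long symbol_total, const std::vector&lt;long&gt;&amp; per_line)
{
 return symbol_total - std::accumulate(per_line.begin(), per_line.end(), 0L);
}

// Per-line counts from the internal_sk_buff_alloc_security listing.
const std::vector&lt;long&gt; listed = { 353, 15, 10, 468, 3, 38, 10, 44, 32, 45, 31, 146, 98 };
</screen>
<para>
The listed lines add up to 1293 samples, so
<function>unattributed_samples(1882, listed)</function> evaluates to 589:
the number of samples credited to the symbol that do not appear on any of
its own source lines.
</para>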
   2738 <para>
This problem is more visible when there is no source file available. In the
following example, the per-symbol sample counts add up to 125, which does not
match the 109 samples reported for the file. The difference is due to
inline functions.
   2743 </para>
   2744 <screen>
   2745 /*
   2746  * Total samples for file : "arch/i386/kernel/process.c"
   2747  *
   2748  *    109  2.4616
   2749  */
   2750 
   2751  /* default_idle total:     84  1.8970 */
   2752  /* cpu_idle total:         21  0.4743 */
   2753  /* flush_thread total:      1  0.0226 */
   2754  /* prepare_to_copy total:   1  0.0226 */
   2755  /* __switch_to total:      18  0.4065 */
   2756 </screen>
   2757 <para>
The missing samples are not lost: they are credited to another source
location, where the inlined function is defined. Samples for an inlined
function are collected from all of its call sites and merged in one place in
the annotated source file, so there is no way to see which call site the
samples for an inlined function came from.
   2763 </para>
   2764 <para>
   2765 When running <command>opannotate</command>, you may get a warning
   2766 "some functions compiled without debug information may have incorrect source line attributions".
   2767 In some rare cases, OProfile is not able to verify that the derived source line
   2768 is correct (when some parts of the binary image are compiled without debugging
   2769 information). Be wary of results if this warning appears.
   2770 </para>
   2771 <para>
   2772 Furthermore, for some languages the compiler can implicitly generate functions,
   2773 such as default copy constructors. Such functions are labelled by the compiler
   2774 as having a line number of 0, which means the source annotation can be confusing.
   2775 </para>
   2776 <!-- FIXME so what *actually* happens to those samples ? ignored ? -->
   2777 </sect2>
   2778 <sect2 id="wrong-linenr-info">
   2779 <title>Inaccuracy in line number information</title>
   2780 <para>
   2781 Depending on your compiler you can fall into the following problem:
   2782 </para>
   2783 <screen>
   2784 struct big_object { int a[500]; };
   2785 
   2786 int main()
   2787 {
   2788 	big_object a, b;
   2789 	for (int i = 0 ; i != 1000 * 1000; ++i)
   2790 		b = a;
   2791 	return 0;
   2792 }
   2793 
   2794 </screen>
   2795 <para>
   2796 Compiled with <command>gcc</command> 3.0.4 the annotated source is clearly inaccurate:
   2797 </para>
   2798 <screen>
   2799                :int main()
   2800                :{  /* main total: 7871 100% */
   2801                :        big_object a, b;
   2802                :        for (int i = 0 ; i != 1000 * 1000; ++i)
   2803                :                b = a;
   2804  7871 100%     :        return 0;
   2805                :}
   2806 </screen>
   2807 <para>
The problem here is distinct from the IRQ latency problem: the debug line number
information is not precise enough. Again, looking at the output of <command>opannotate -as</command> can help.
   2810 </para>
   2811 <screen>
   2812                :int main()
   2813                :{
   2814                :        big_object a, b;
   2815                :        for (int i = 0 ; i != 1000 * 1000; ++i)
   2816                : 80484c0:       push   %ebp
   2817                : 80484c1:       mov    %esp,%ebp
   2818                : 80484c3:       sub    $0xfac,%esp
   2819                : 80484c9:       push   %edi
   2820                : 80484ca:       push   %esi
   2821                : 80484cb:       push   %ebx
   2822                :                b = a;
   2823                : 80484cc:       lea    0xfffff060(%ebp),%edx
   2824                : 80484d2:       lea    0xfffff830(%ebp),%eax
   2825                : 80484d8:       mov    $0xf423f,%ebx
   2826                : 80484dd:       lea    0x0(%esi),%esi
   2827                :        return 0;
   2828     3 0.03811% : 80484e0:       mov    %edx,%edi
   2829                : 80484e2:       mov    %eax,%esi
   2830     1 0.0127%  : 80484e4:       cld
   2831     8 0.1016%  : 80484e5:       mov    $0x1f4,%ecx
   2832  7850 99.73%   : 80484ea:       repz movsl %ds:(%esi),%es:(%edi)
   2833     9 0.1143%  : 80484ec:       dec    %ebx
   2834                : 80484ed:       jns    80484e0
   2835                : 80484ef:       xor    %eax,%eax
   2836                : 80484f1:       pop    %ebx
   2837                : 80484f2:       pop    %esi
   2838                : 80484f3:       pop    %edi
   2839                : 80484f4:       leave
   2840                : 80484f5:       ret
   2841 </screen>
   2842 <para>
So here it's clear that the copying is correctly credited with all of the samples, but the
line number information is misplaced. <command>objdump -dS</command> exposes the
same problem. Note that maintaining accurate debug information is difficult for compilers when optimizing, so this problem is not surprising.
The accuracy of debug information
also depends on the binutils version used; some BFD library versions
contain a work-around for known problems of <command>gcc</command>, while others do not. This is unfortunate but we must live with it,
since profiling is pointless when you disable optimisation (which would give better debugging entries).
   2850 </para>
   2851 </sect2>
   2852 </sect1>
   2853 <sect1 id="symbol-without-debug-info">
   2854 <title>Assembly functions</title>
   2855 <para>
Often the assembler cannot generate debug information automatically.
This means that you cannot get a source report unless
you manually define the necessary debug information; read your assembler documentation for how you might
do that. The only
debugging info currently needed by OProfile is the line-number/filename-VMA association. When profiling assembly
without debugging info you can always get a report for symbols, and optionally for VMAs, through <command>opreport -l</command>
or <command>opreport -d</command>, but this works only for symbols with the right attributes.
For <command>gas</command> you can get this by
   2864 </para>
   2865 <screen>
   2866 .globl foo
   2867 	.type	foo,@function
   2868 </screen>
   2869 <para> 
   2870 whilst for <command>nasm</command> you must use
   2871 </para>
   2872 <screen>
   2873 	  GLOBAL foo:function		; [1]
   2874 </screen>
   2875 <para>
   2876 Note that OProfile does not need the global attribute, only the function attribute.
   2877 </para>
   2878 </sect1>
   2879 <!-- 
   2880 
   2881 FIXME: I commented this bit out until we've written something ...
   2882 
   2883 improve this ? but look first why this file is special 
   2884 <sect2 id="small-functions">
   2885 <title>Small functions</title>
   2886 <para>
   2887 Very small functions can show strange behavior. The file in your source
   2888 directory of OProfile <filename>$SRC/test-oprofile/understanding/puzzle.c</filename>
   2889 show such example
   2890 </para>
   2891 </sect2>
   2892 --> 
   2893 
   2894 <sect1 id="overlapping-symbols">
   2895 	<title>Overlapping symbols in JITed code</title>
   2896 	<para>
   2897 	Some virtual machines (e.g., Java) may re-JIT a method, resulting in previously
   2898 	allocated space for a piece of compiled code to be reused. This means that, at one distinct
   2899 	code address, multiple symbols/methods may be present during the run time of the application.
   2900 	</para>
   2901 	<para>
	Since OProfile samples are buffered and don't have timing information, there is no way
   2903 	to correlate samples with the (possibly) varying address ranges in which the code for a symbol
   2904 	may reside.
   2905 	An alternative would be flushing the OProfile sampling buffer when we get an unload event,
   2906 	but this could result in high overhead.
   2907 	</para>
   2908 	<para>
   2909 	To moderate the problem of overlapping symbols, OProfile tries to select the symbol that was
   2910 	present at this address range most of the time. Additionally, other overlapping symbols
   2911 	are truncated in the overlapping area.
   2912 	This gives reasonable results, because in reality, address reuse typically takes place
	during phase changes of the application -- in particular, during application startup.
   2914 	Thus, for optimum profiling results, start the sampling session after application startup
   2915 	and burn in.
   2916 	</para>
   2917 </sect1>
   2918 
   2919 <sect1 id="hidden-cost">
   2920 <title>Other discrepancies</title>
   2921 <para>
   2922 Another cause of apparent problems is the hidden cost of instructions. A very
   2923 common example is two memory reads: one from L1 cache and the other from memory:
   2924 the second memory read is likely to have more samples.
   2925 There are many other causes of hidden cost of instructions. A non-exhaustive
   2926 list: mis-predicted branch, TLB cache miss, partial register stall,
   2927 partial register dependencies, memory mismatch stall, re-executed ops. If you want to write
   2928 programs at the assembly level, be sure to take a look at the Intel and
   2929 AMD documentation at <ulink url="http://developer.intel.com/">http://developer.intel.com/</ulink>
   2930 and <ulink url="http://developer.amd.com/devguides.jsp/">http://developer.amd.com/devguides.jsp</ulink>.
   2931 </para>
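<para>
The first example, two reads with very different costs, can be illustrated
with a pair of loops that read the same data in a different order. This is
only a sketch: the function names are hypothetical, and how large the
difference is (if any) depends on the cache sizes, the prefetchers and the
array size of the machine being profiled.
</para>
<screen>
#include &lt;cstddef&gt;
#include &lt;vector&gt;

// Reads every element in address order; after each cache line is
// fetched, the following reads are likely to hit in the L1 cache.
long sum_sequential(const std::vector&lt;int&gt;&amp; data)
{
 long total = 0;
 for (std::size_t i = 0; i &lt; data.size(); ++i)
  total += data[i];
 return total;
}

// Reads the same elements exactly once each, but jumps 'stride'
// elements between consecutive reads. With a large array and a large
// stride, more of these reads are likely to miss the cache.
// A stride of 0 reads nothing and returns 0.
long sum_strided(const std::vector&lt;int&gt;&amp; data, std::size_t stride)
{
 long total = 0;
 for (std::size_t start = 0; start &lt; stride; ++start)
  for (std::size_t i = start; i &lt; data.size(); i += stride)
   total += data[i];
 return total;
}
</screen>
<para>
Both functions return the same value, and the load instruction is the same
in each. When the array is much larger than the caches, the load in
<function>sum_strided()</function> can be expected to collect more samples,
even though nothing in the source suggests it is more expensive.
</para>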
   2932 </sect1>
   2933 </chapter>
   2934 
   2935 
   2936 <chapter id="ack">
   2937 <title>Acknowledgments</title>
   2938 <para>
Thanks to (in no particular order): Arjan van de Ven, Rik van Riel, Juan Quintela, Philippe Elie,
   2940 Phillipp Rumpf, Tigran Aivazian, Alex Brown, Alisdair Rawsthorne, Bob Montgomery, Ray Bryant, H.J. Lu,
   2941 Jeff Esper, Will Cohen, Graydon Hoare, Cliff Woolley, Alex Tsariounov, Al Stone, Jason Yeh,
   2942 Randolph Chung, Anton Blanchard, Richard Henderson, Andries Brouwer, Bryan Rittmeyer,
   2943 Maynard P. Johnson,
Richard Reich (rreich@rdrtech.com), Zwane Mwaikambo, Dave Jones, Charles Filtness; and finally Pulp, for "Intro".
   2945 </para>
   2946 </chapter>
   2947 
   2948 </book>
   2949