| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <html> |
| <!-- Copyright (C) 1988-2015 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 or |
| any later version published by the Free Software Foundation; with the |
| Invariant Sections being "Funding Free Software", the Front-Cover |
| Texts being (a) (see below), and with the Back-Cover Texts being (b) |
| (see below). A copy of the license is included in the section entitled |
| "GNU Free Documentation License". |
| |
| (a) The FSF's Front-Cover Text is: |
| |
| A GNU Manual |
| |
| (b) The FSF's Back-Cover Text is: |
| |
| You have freedom to copy and modify this GNU Manual, like GNU |
| software. Copies published by the Free Software Foundation raise |
| funds for GNU development. --> |
| <!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ --> |
| <head> |
| <title>GNU Compiler Collection (GCC) Internals: LTO Overview</title> |
| |
| <meta name="description" content="GNU Compiler Collection (GCC) Internals: LTO Overview"> |
| <meta name="keywords" content="GNU Compiler Collection (GCC) Internals: LTO Overview"> |
| <meta name="resource-type" content="document"> |
| <meta name="distribution" content="global"> |
| <meta name="Generator" content="makeinfo"> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <link href="index.html#Top" rel="start" title="Top"> |
| <link href="Option-Index.html#Option-Index" rel="index" title="Option Index"> |
| <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> |
| <link href="LTO.html#LTO" rel="up" title="LTO"> |
| <link href="LTO-object-file-layout.html#LTO-object-file-layout" rel="next" title="LTO object file layout"> |
| <link href="LTO.html#LTO" rel="prev" title="LTO"> |
| <style type="text/css"> |
| <!-- |
| a.summary-letter {text-decoration: none} |
| blockquote.smallquotation {font-size: smaller} |
| div.display {margin-left: 3.2em} |
| div.example {margin-left: 3.2em} |
| div.indentedblock {margin-left: 3.2em} |
| div.lisp {margin-left: 3.2em} |
| div.smalldisplay {margin-left: 3.2em} |
| div.smallexample {margin-left: 3.2em} |
| div.smallindentedblock {margin-left: 3.2em; font-size: smaller} |
| div.smalllisp {margin-left: 3.2em} |
| kbd {font-style:oblique} |
| pre.display {font-family: inherit} |
| pre.format {font-family: inherit} |
| pre.menu-comment {font-family: serif} |
| pre.menu-preformatted {font-family: serif} |
| pre.smalldisplay {font-family: inherit; font-size: smaller} |
| pre.smallexample {font-size: smaller} |
| pre.smallformat {font-family: inherit; font-size: smaller} |
| pre.smalllisp {font-size: smaller} |
| span.nocodebreak {white-space:nowrap} |
| span.nolinebreak {white-space:nowrap} |
| span.roman {font-family:serif; font-weight:normal} |
| span.sansserif {font-family:sans-serif; font-weight:normal} |
| ul.no-bullet {list-style: none} |
| --> |
| </style> |
| |
| |
| </head> |
| |
| <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> |
| <a name="LTO-Overview"></a> |
| <div class="header"> |
| <p> |
| Next: <a href="LTO-object-file-layout.html#LTO-object-file-layout" accesskey="n" rel="next">LTO object file layout</a>, Up: <a href="LTO.html#LTO" accesskey="u" rel="up">LTO</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> |
| </div> |
| <hr> |
| <a name="Design-Overview"></a> |
| <h3 class="section">24.1 Design Overview</h3> |
| |
| <p>Link-time optimization (LTO) is implemented as a GCC front end for a |
| bytecode representation of GIMPLE that is emitted in special sections |
| of <code>.o</code> files. Currently, LTO support is enabled in most |
| ELF-based systems, as well as Darwin, Cygwin and MinGW systems. |
| </p> |
| <p>Since GIMPLE bytecode is saved alongside final object code, object |
| files generated with LTO support are larger than regular object files. |
| This “fat” object format makes it easy to integrate LTO into |
| existing build systems, as one can, for instance, produce archives of |
| the files. Additionally, one might be able to ship one set of fat |
| objects which could be used both for development and the production of |
| optimized builds. A perhaps surprising side effect of this feature |
| is that any mistake in the toolchain (e.g. an older <code>libtool</code> |
| calling <code>ld</code> directly) leads to the LTO information not being |
| used, while the link still succeeds with the regular object code. |
| This is both an advantage, as the system is more robust, and a |
| disadvantage, as the user is not informed that the optimization has |
| been disabled. |
| </p> |
| <p>The current implementation only produces “fat” objects, effectively |
| doubling compilation time and increasing file sizes to as much as 5x the |
| original size. Fat objects hide the problem that some tools, such as |
| <code>ar</code> and <code>nm</code>, need to understand the symbol tables of LTO |
| sections. These tools were extended to use the plugin infrastructure, |
| and with these problems solved, GCC will also support “slim” objects |
| consisting of the intermediate code alone. |
| </p> |
| <p>At the highest level, LTO splits the compiler in two. The first half |
| (the “writer”) produces a streaming representation of all the |
| internal data structures needed to optimize and generate code. This |
| includes declarations, types, the callgraph and the GIMPLE representation |
| of function bodies. |
| </p> |
| <p>When <samp>-flto</samp> is given during compilation of a source file, the |
| pass manager executes all the passes in <code>all_lto_gen_passes</code>. |
| Currently, this phase is composed of two IPA passes: |
| </p> |
| <ul> |
| <li> <code>pass_ipa_lto_gimple_out</code> |
| This pass executes the function <code>lto_output</code> in |
| <samp>lto-streamer-out.c</samp>, which traverses the call graph, encoding |
| every reachable declaration, type and function. This generates an |
| in-memory representation of all the file sections described below. |
| |
| </li><li> <code>pass_ipa_lto_finish_out</code> |
| This pass executes the function <code>produce_asm_for_decls</code> in |
| <samp>lto-streamer-out.c</samp>, which takes the memory image built in the |
| previous pass and encodes it in the corresponding ELF file sections. |
| </li></ul> |
| |
| <p>The second half of LTO support is the “reader”. This is implemented |
| as the GCC front end <samp>lto1</samp> in <samp>lto/lto.c</samp>. When |
| <samp>collect2</samp> detects a link set of <code>.o</code>/<code>.a</code> files with |
| LTO information and <samp>-flto</samp> is enabled, it invokes |
| <samp>lto1</samp>, which reads the set of files and aggregates them into a |
| single translation unit for optimization. The main entry point for |
| the reader is <samp>lto/lto.c</samp>:<code>lto_main</code>. |
| </p> |
| <a name="LTO-modes-of-operation"></a> |
| <h4 class="subsection">24.1.1 LTO modes of operation</h4> |
| |
| <p>One of the main goals of the GCC link-time infrastructure was to allow |
| effective compilation of large programs. For this reason GCC implements two |
| link-time compilation modes. |
| </p> |
| <ol> |
| <li> <em>LTO mode</em>, in which the whole program is read into the |
| compiler at link-time and optimized much as if it |
| were a single source-level compilation unit. |
| |
| </li><li> <em>WHOPR or partitioned mode</em>, designed to utilize multiple |
| CPUs and/or a distributed compilation environment to quickly link |
| large applications. WHOPR stands for WHOle Program optimizeR (not to |
| be confused with the semantics of <samp>-fwhole-program</samp>). It |
| partitions the aggregated callgraph from many different <code>.o</code> |
| files and distributes the compilation of the sub-graphs to different |
| CPUs. |
| |
| <p>Note that distributed compilation is not implemented yet, but since |
| the parallelism is facilitated via generating a <code>Makefile</code>, it |
| would be easy to implement. |
| </p></li></ol> |
| |
| <p>WHOPR splits LTO into three main stages: |
| </p><ol> |
| <li> Local generation (LGEN) |
| This stage executes in parallel. Every file in the program is compiled |
| into the intermediate language and packaged together with the local |
| call-graph and summary information. This stage is the same for both |
| the LTO and WHOPR compilation modes. |
| |
| </li><li> Whole Program Analysis (WPA) |
| WPA is performed sequentially. The global call-graph is generated, and |
| a global analysis procedure makes transformation decisions. The global |
| call-graph is partitioned to facilitate parallel optimization during |
| phase 3 (LTRANS). The results of the WPA stage are stored into new object |
| files which contain the partitions of the program expressed in the |
| intermediate language and the optimization decisions. |
| |
| </li><li> Local transformations (LTRANS) |
| This stage executes in parallel. All the decisions made during phase 2 |
| (WPA) are implemented locally in each partitioned object file, and the |
| final object code is generated. Optimizations which cannot be decided |
| efficiently during phase 2 may be performed on the local |
| call-graph partitions. |
| </li></ol> |
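| |
| <p>As a rough sketch of why the WPA stage needs a view of the whole |
| program, the following toy C program makes one decision that depends on |
| a program-wide caller count. It is not GCC code: <code>struct summary</code>, |
| <code>count_callers</code>, <code>wpa_decide</code> and the inlining rule |
| (“exactly one caller and a small body”) are hypothetical and chosen only |
| for illustration. The array stands in for summaries that LGEN might |
| record per file, and the printed decisions are what an LTRANS-like stage |
| would then apply locally. |
| </p> |
| <div class="example"> |
| <pre class="example">#include &lt;stdio.h&gt; |
| #include &lt;string.h&gt; |
| |
| /* Hypothetical per-function summary, as LGEN might record it. */ |
| struct summary { |
|   const char *name;    /* function name */ |
|   int unit;            /* index of the file that defines it */ |
|   int size;            /* abstract body size */ |
|   const char *callee;  /* name of the one function it calls, or NULL */ |
| }; |
| |
| #define NFUNCS 4 |
| |
| /* Count the callers of NAME across all summaries (all files). */ |
| static int count_callers(const struct summary *s, int n, const char *name) |
| { |
|   int count = 0; |
|   for (int i = 0; i &lt; n; i++) |
|     if (s[i].callee &amp;&amp; strcmp(s[i].callee, name) == 0) |
|       count++; |
|   return count; |
| } |
| |
| /* Sequential "WPA" step: mark a function for inlining when it has |
|    exactly one caller in the whole program and its body is small. */ |
| static void wpa_decide(const struct summary *s, int n, int max_size, |
|                        int *inline_p) |
| { |
|   for (int i = 0; i &lt; n; i++) |
|     inline_p[i] = count_callers(s, n, s[i].name) == 1 |
|                   &amp;&amp; s[i].size &lt;= max_size; |
| } |
| |
| int main(void) |
| { |
|   /* Sample data: four functions spread over three files. */ |
|   const struct summary prog[NFUNCS] = { |
|     {"main", 0, 10, "helper"}, |
|     {"helper", 1, 3, "log_msg"}, |
|     {"other", 1, 8, "log_msg"}, |
|     {"log_msg", 2, 2, NULL}, |
|   }; |
|   int decisions[NFUNCS]; |
| |
|   wpa_decide(prog, NFUNCS, 5, decisions); |
|   for (int i = 0; i &lt; NFUNCS; i++) |
|     printf("%s (unit %d): %s\n", prog[i].name, prog[i].unit, |
|            decisions[i] ? "inline" : "keep"); |
|   return 0; |
| } |
| </pre></div> |
| |
| <p>With this sample data only <code>helper</code> is marked for inlining. |
| <code>log_msg</code> is small but has two callers in different files, which |
| no single file's summary reveals on its own. |
| </p> |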
| |
| <p>WHOPR can be seen as an extension of the usual LTO mode of |
| compilation. In LTO, WPA and LTRANS are executed within a single |
| execution of the compiler, after the whole program has been read into |
| memory. |
| </p> |
| <p>When compiling in WHOPR mode, the callgraph is partitioned during |
| the WPA stage. The whole program is split into a given number of |
| partitions of roughly the same size. The compiler tries to |
| minimize the number of references which cross partition boundaries. |
| The main advantage of WHOPR is that it allows the parallel execution of |
| the LTRANS stages, which are the most time-consuming part of the |
| compilation process. Additionally, it avoids the need to load the |
| whole program into memory. |
| </p> |
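| |
| <p>The partitioning goal described above can be sketched with a toy C |
| program. It is only an illustration under simplifying assumptions, not |
| GCC's partitioning algorithm: the names <code>partition</code> and |
| <code>count_crossing</code>, the function sizes and the edges are all |
| hypothetical. Functions are assigned in order to partitions of roughly |
| equal total size, and the sketch then merely counts the references that |
| cross a partition boundary, which is the quantity a real partitioner |
| would try to keep small. |
| </p> |
| <div class="example"> |
| <pre class="example">#include &lt;stdio.h&gt; |
| |
| #define NNODES 6 |
| #define NEDGES 6 |
| #define NPARTS 2 |
| |
| /* A reference (e.g. a call) from one function to another. */ |
| struct edge { int from, to; }; |
| |
| /* Assign functions, in order, to NPARTS partitions so that each one |
|    receives about total/nparts units of size. */ |
| static void partition(const int *size, int n, int nparts, int *part) |
| { |
|   int total = 0; |
|   for (int i = 0; i &lt; n; i++) |
|     total += size[i]; |
| |
|   int target = (total + nparts - 1) / nparts;  /* rounded up */ |
|   int cur = 0, filled = 0; |
|   for (int i = 0; i &lt; n; i++) { |
|     /* Start a new partition once the current one is full. */ |
|     if (filled &gt;= target &amp;&amp; cur &lt; nparts - 1) { |
|       cur++; |
|       filled = 0; |
|     } |
|     part[i] = cur; |
|     filled += size[i]; |
|   } |
| } |
| |
| /* Count the references whose endpoints lie in different partitions. */ |
| static int count_crossing(const struct edge *e, int nedges, const int *part) |
| { |
|   int crossing = 0; |
|   for (int i = 0; i &lt; nedges; i++) |
|     if (part[e[i].from] != part[e[i].to]) |
|       crossing++; |
|   return crossing; |
| } |
| |
| int main(void) |
| { |
|   /* Sample data: six functions with abstract sizes and six references. */ |
|   const int size[NNODES] = {4, 3, 3, 5, 2, 3}; |
|   const struct edge edges[NEDGES] = { |
|     {0, 1}, {0, 2}, {1, 2}, {3, 4}, {3, 5}, {2, 3} |
|   }; |
|   int part[NNODES]; |
| |
|   partition(size, NNODES, NPARTS, part); |
|   for (int i = 0; i &lt; NNODES; i++) |
|     printf("function %d: partition %d\n", i, part[i]); |
|   printf("crossing references: %d\n", |
|          count_crossing(edges, NEDGES, part)); |
|   return 0; |
| } |
| </pre></div> |
| |
| <p>For this sample data the first three functions land in partition 0 and |
| the last three in partition 1, each with a total size of 10, and a single |
| reference crosses the boundary. |
| </p> |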
| |
| |
| |
| |
| </body> |
| </html> |