| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <html> |
| <!-- Copyright (C) 1988-2015 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 or |
| any later version published by the Free Software Foundation; with the |
| Invariant Sections being "Funding Free Software", the Front-Cover |
| Texts being (a) (see below), and with the Back-Cover Texts being (b) |
| (see below). A copy of the license is included in the section entitled |
| "GNU Free Documentation License". |
| |
| (a) The FSF's Front-Cover Text is: |
| |
| A GNU Manual |
| |
| (b) The FSF's Back-Cover Text is: |
| |
| You have freedom to copy and modify this GNU Manual, like GNU |
| software. Copies published by the Free Software Foundation raise |
| funds for GNU development. --> |
| <!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ --> |
| <head> |
| <title>Using the GNU Compiler Collection (GCC): x86 Options</title> |
| |
| <meta name="description" content="Using the GNU Compiler Collection (GCC): x86 Options"> |
| <meta name="keywords" content="Using the GNU Compiler Collection (GCC): x86 Options"> |
| <meta name="resource-type" content="document"> |
| <meta name="distribution" content="global"> |
| <meta name="Generator" content="makeinfo"> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <link href="index.html#Top" rel="start" title="Top"> |
| <link href="Option-Index.html#Option-Index" rel="index" title="Option Index"> |
| <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> |
| <link href="Submodel-Options.html#Submodel-Options" rel="up" title="Submodel Options"> |
| <link href="x86-Windows-Options.html#x86-Windows-Options" rel="next" title="x86 Windows Options"> |
| <link href="VxWorks-Options.html#VxWorks-Options" rel="prev" title="VxWorks Options"> |
| <style type="text/css"> |
| <!-- |
| a.summary-letter {text-decoration: none} |
| blockquote.smallquotation {font-size: smaller} |
| div.display {margin-left: 3.2em} |
| div.example {margin-left: 3.2em} |
| div.indentedblock {margin-left: 3.2em} |
| div.lisp {margin-left: 3.2em} |
| div.smalldisplay {margin-left: 3.2em} |
| div.smallexample {margin-left: 3.2em} |
| div.smallindentedblock {margin-left: 3.2em; font-size: smaller} |
| div.smalllisp {margin-left: 3.2em} |
| kbd {font-style:oblique} |
| pre.display {font-family: inherit} |
| pre.format {font-family: inherit} |
| pre.menu-comment {font-family: serif} |
| pre.menu-preformatted {font-family: serif} |
| pre.smalldisplay {font-family: inherit; font-size: smaller} |
| pre.smallexample {font-size: smaller} |
| pre.smallformat {font-family: inherit; font-size: smaller} |
| pre.smalllisp {font-size: smaller} |
| span.nocodebreak {white-space:nowrap} |
| span.nolinebreak {white-space:nowrap} |
| span.roman {font-family:serif; font-weight:normal} |
| span.sansserif {font-family:sans-serif; font-weight:normal} |
| ul.no-bullet {list-style: none} |
| --> |
| </style> |
| |
| |
| </head> |
| |
| <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> |
| <a name="x86-Options"></a> |
| <div class="header"> |
| <p> |
| Next: <a href="x86-Windows-Options.html#x86-Windows-Options" accesskey="n" rel="next">x86 Windows Options</a>, Previous: <a href="VxWorks-Options.html#VxWorks-Options" accesskey="p" rel="prev">VxWorks Options</a>, Up: <a href="Submodel-Options.html#Submodel-Options" accesskey="u" rel="up">Submodel Options</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> |
| </div> |
| <hr> |
| <a name="x86-Options-1"></a> |
| <h4 class="subsection">3.17.53 x86 Options</h4> |
| <a name="index-x86-Options"></a> |
| |
| <p>These ‘<samp>-m</samp>’ options are defined for the x86 family of computers. |
| </p> |
| <dl compact="compact"> |
| <dt><code>-march=<var>cpu-type</var></code></dt> |
| <dd><a name="index-march-10"></a> |
| <p>Generate instructions for the machine type <var>cpu-type</var>. In contrast to |
| <samp>-mtune=<var>cpu-type</var></samp>, which merely tunes the generated code |
| for the specified <var>cpu-type</var>, <samp>-march=<var>cpu-type</var></samp> allows GCC |
| to generate code that may not run at all on processors other than the one |
| indicated. Specifying <samp>-march=<var>cpu-type</var></samp> implies |
| <samp>-mtune=<var>cpu-type</var></samp>. |
| </p> |
| <p>The choices for <var>cpu-type</var> are: |
| </p> |
| <dl compact="compact"> |
| <dt>‘<samp>native</samp>’</dt> |
| <dd><p>This selects the CPU to generate code for at compilation time by determining |
| the processor type of the compiling machine. Using <samp>-march=native</samp> |
| enables all instruction subsets supported by the local machine (hence |
| the result might not run on different machines). Using <samp>-mtune=native</samp> |
| produces code optimized for the local machine under the constraints |
| of the selected instruction set. |
| </p> |
| </dd> |
| <dt>‘<samp>i386</samp>’</dt> |
| <dd><p>Original Intel i386 CPU. |
| </p> |
| </dd> |
| <dt>‘<samp>i486</samp>’</dt> |
| <dd><p>Intel i486 CPU. (No scheduling is implemented for this chip.) |
| </p> |
| </dd> |
| <dt>‘<samp>i586</samp>’</dt> |
| <dt>‘<samp>pentium</samp>’</dt> |
| <dd><p>Intel Pentium CPU with no MMX support. |
| </p> |
| </dd> |
| <dt>‘<samp>pentium-mmx</samp>’</dt> |
| <dd><p>Intel Pentium MMX CPU, based on Pentium core with MMX instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>pentiumpro</samp>’</dt> |
| <dd><p>Intel Pentium Pro CPU. |
| </p> |
| </dd> |
| <dt>‘<samp>i686</samp>’</dt> |
| <dd><p>When used with <samp>-march</samp>, the Pentium Pro |
| instruction set is used, so the code runs on all i686 family chips. |
| When used with <samp>-mtune</samp>, it has the same meaning as ‘<samp>generic</samp>’. |
| </p> |
| </dd> |
| <dt>‘<samp>pentium2</samp>’</dt> |
| <dd><p>Intel Pentium II CPU, based on Pentium Pro core with MMX instruction set |
| support. |
| </p> |
| </dd> |
| <dt>‘<samp>pentium3</samp>’</dt> |
| <dt>‘<samp>pentium3m</samp>’</dt> |
| <dd><p>Intel Pentium III CPU, based on Pentium Pro core with MMX and SSE instruction |
| set support. |
| </p> |
| </dd> |
| <dt>‘<samp>pentium-m</samp>’</dt> |
| <dd><p>Intel Pentium M; low-power version of Intel Pentium III CPU |
| with MMX, SSE and SSE2 instruction set support. Used by Centrino notebooks. |
| </p> |
| </dd> |
| <dt>‘<samp>pentium4</samp>’</dt> |
| <dt>‘<samp>pentium4m</samp>’</dt> |
| <dd><p>Intel Pentium 4 CPU with MMX, SSE and SSE2 instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>prescott</samp>’</dt> |
| <dd><p>Improved version of Intel Pentium 4 CPU with MMX, SSE, SSE2 and SSE3 instruction |
| set support. |
| </p> |
| </dd> |
| <dt>‘<samp>nocona</samp>’</dt> |
| <dd><p>Improved version of Intel Pentium 4 CPU with 64-bit extensions, MMX, SSE, |
| SSE2 and SSE3 instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>core2</samp>’</dt> |
| <dd><p>Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3 |
| instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>nehalem</samp>’</dt> |
| <dd><p>Intel Nehalem CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, |
| SSE4.1, SSE4.2 and POPCNT instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>westmere</samp>’</dt> |
| <dd><p>Intel Westmere CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, |
| SSE4.1, SSE4.2, POPCNT, AES and PCLMUL instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>sandybridge</samp>’</dt> |
| <dd><p>Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, |
| SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>ivybridge</samp>’</dt> |
| <dd><p>Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3, |
| SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C |
| instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>haswell</samp>’</dt> |
| <dd><p>Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, |
| SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, |
| BMI, BMI2 and F16C instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>broadwell</samp>’</dt> |
| <dd><p>Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, |
| SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, |
| BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>bonnell</samp>’</dt> |
| <dd><p>Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 and SSSE3 |
| instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>silvermont</samp>’</dt> |
| <dd><p>Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, |
| SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>knl</samp>’</dt> |
| <dd><p>Intel Knight’s Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, |
| SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, |
| BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER and |
| AVX512CD instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>k6</samp>’</dt> |
| <dd><p>AMD K6 CPU with MMX instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>k6-2</samp>’</dt> |
| <dt>‘<samp>k6-3</samp>’</dt> |
| <dd><p>Improved versions of AMD K6 CPU with MMX and 3DNow! instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>athlon</samp>’</dt> |
| <dt>‘<samp>athlon-tbird</samp>’</dt> |
| <dd><p>AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and SSE prefetch instruction |
| support. |
| </p> |
| </dd> |
| <dt>‘<samp>athlon-4</samp>’</dt> |
| <dt>‘<samp>athlon-xp</samp>’</dt> |
| <dt>‘<samp>athlon-mp</samp>’</dt> |
| <dd><p>Improved AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and full SSE |
| instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>k8</samp>’</dt> |
| <dt>‘<samp>opteron</samp>’</dt> |
| <dt>‘<samp>athlon64</samp>’</dt> |
| <dt>‘<samp>athlon-fx</samp>’</dt> |
| <dd><p>Processors based on the AMD K8 core with x86-64 instruction set support, |
| including the AMD Opteron, Athlon 64, and Athlon 64 FX processors. |
| (This supersets MMX, SSE, SSE2, 3DNow!, enhanced 3DNow! and 64-bit |
| instruction set extensions.) |
| </p> |
| </dd> |
| <dt>‘<samp>k8-sse3</samp>’</dt> |
| <dt>‘<samp>opteron-sse3</samp>’</dt> |
| <dt>‘<samp>athlon64-sse3</samp>’</dt> |
| <dd><p>Improved versions of AMD K8 cores with SSE3 instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>amdfam10</samp>’</dt> |
| <dt>‘<samp>barcelona</samp>’</dt> |
| <dd><p>CPUs based on AMD Family 10h cores with x86-64 instruction set support. (This |
| supersets MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!, enhanced 3DNow!, ABM and 64-bit |
| instruction set extensions.) |
| </p> |
| </dd> |
| <dt>‘<samp>bdver1</samp>’</dt> |
| <dd><p>CPUs based on AMD Family 15h cores with x86-64 instruction set support. (This |
| supersets FMA4, AVX, XOP, LWP, AES, PCL_MUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, |
| SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.) |
| </p></dd> |
| <dt>‘<samp>bdver2</samp>’</dt> |
| <dd><p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This |
| supersets BMI, TBM, F16C, FMA, FMA4, AVX, XOP, LWP, AES, PCL_MUL, CX16, MMX, |
| SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set |
| extensions.) |
| </p></dd> |
| <dt>‘<samp>bdver3</samp>’</dt> |
| <dd><p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This |
| supersets BMI, TBM, F16C, FMA, FMA4, FSGSBASE, AVX, XOP, LWP, AES, |
| PCL_MUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and |
| 64-bit instruction set extensions.) |
| </p></dd> |
| <dt>‘<samp>bdver4</samp>’</dt> |
| <dd><p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This |
| supersets BMI, BMI2, TBM, F16C, FMA, FMA4, FSGSBASE, AVX, AVX2, XOP, LWP, |
| AES, PCL_MUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, |
| SSE4.2, ABM and 64-bit instruction set extensions.) |
| </p> |
| </dd> |
| <dt>‘<samp>btver1</samp>’</dt> |
| <dd><p>CPUs based on AMD Family 14h cores with x86-64 instruction set support. (This |
| supersets MMX, SSE, SSE2, SSE3, SSSE3, SSE4A, CX16, ABM and 64-bit |
| instruction set extensions.) |
| </p> |
| </dd> |
| <dt>‘<samp>btver2</samp>’</dt> |
| <dd><p>CPUs based on AMD Family 16h cores with x86-64 instruction set support. This |
| includes MOVBE, F16C, BMI, AVX, PCL_MUL, AES, SSE4.2, SSE4.1, CX16, ABM, |
| SSE4A, SSSE3, SSE3, SSE2, SSE, MMX and 64-bit instruction set extensions. |
| </p> |
| </dd> |
| <dt>‘<samp>winchip-c6</samp>’</dt> |
| <dd><p>IDT WinChip C6 CPU, handled in the same way as i486 with additional MMX instruction |
| set support. |
| </p> |
| </dd> |
| <dt>‘<samp>winchip2</samp>’</dt> |
| <dd><p>IDT WinChip 2 CPU, handled in the same way as i486 with additional MMX and 3DNow! |
| instruction set support. |
| </p> |
| </dd> |
| <dt>‘<samp>c3</samp>’</dt> |
| <dd><p>VIA C3 CPU with MMX and 3DNow! instruction set support. (No scheduling is |
| implemented for this chip.) |
| </p> |
| </dd> |
| <dt>‘<samp>c3-2</samp>’</dt> |
| <dd><p>VIA C3-2 (Nehemiah/C5XL) CPU with MMX and SSE instruction set support. |
| (No scheduling is |
| implemented for this chip.) |
| </p> |
| </dd> |
| <dt>‘<samp>geode</samp>’</dt> |
| <dd><p>AMD Geode embedded processor with MMX and 3DNow! instruction set support. |
| </p></dd> |
| </dl> |
| |
| </dd> |
| <dt><code>-mtune=<var>cpu-type</var></code></dt> |
| <dd><a name="index-mtune-14"></a> |
| <p>Tune to <var>cpu-type</var> everything applicable about the generated code, except |
| for the ABI and the set of available instructions. |
| While picking a specific <var>cpu-type</var> schedules things appropriately |
| for that particular chip, the compiler does not generate any code that |
| cannot run on the default machine type unless you use a |
| <samp>-march=<var>cpu-type</var></samp> option. |
| For example, if GCC is configured for i686-pc-linux-gnu |
| then <samp>-mtune=pentium4</samp> generates code that is tuned for Pentium 4 |
| but still runs on i686 machines. |
| </p> |
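| <p>As an illustration, the following minimal sketch reports which instruction |
| set extensions the compiler may use, based on the <code>__SSE2__</code> and |
| <code>__AVX2__</code> macros that GCC predefines when those extensions are enabled. |
| The file name <samp>isa.c</samp> and the command lines in the comment are |
| assumptions made for the example. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">/* Hypothetical build commands; compare the output of the two programs: |
|      gcc -mtune=haswell isa.c -o isa-tune   (tuning only) |
|      gcc -march=haswell isa.c -o isa-arch   (instruction set and tuning) */ |
| #include &lt;stdio.h&gt; |
| |
| int main(void) |
| { |
| #ifdef __SSE2__ |
|     puts("SSE2 instructions may be generated"); |
| #else |
|     puts("SSE2 instructions are not enabled"); |
| #endif |
| #ifdef __AVX2__ |
|     puts("AVX2 instructions may be generated"); |
| #else |
|     puts("AVX2 instructions are not enabled"); |
| #endif |
|     return 0; |
| } |
| </pre></div> |
| |
| <p>With <samp>-mtune</samp> alone, the reported set should match the default |
| machine type of the compiler, whereas <samp>-march=haswell</samp> is expected to |
| enable AVX2; a program built that way might not run on processors without it. |
| </p> |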
| <p>The choices for <var>cpu-type</var> are the same as for <samp>-march</samp>. |
| In addition, <samp>-mtune</samp> supports 2 extra choices for <var>cpu-type</var>: |
| </p> |
| <dl compact="compact"> |
| <dt>‘<samp>generic</samp>’</dt> |
| <dd><p>Produce code optimized for the most common IA32/AMD64/EM64T processors. |
| If you know the CPU on which your code will run, then you should use |
| the corresponding <samp>-mtune</samp> or <samp>-march</samp> option instead of |
| <samp>-mtune=generic</samp>. But, if you do not know exactly what CPU users |
| of your application will have, then you should use this option. |
| </p> |
| <p>As new processors are deployed in the marketplace, the behavior of this |
| option will change. Therefore, if you upgrade to a newer version of |
| GCC, code generation controlled by this option will change to reflect |
| the processors |
| that are most common at the time that version of GCC is released. |
| </p> |
| <p>There is no <samp>-march=generic</samp> option because <samp>-march</samp> |
| indicates the instruction set the compiler can use, and there is no |
| generic instruction set applicable to all processors. In contrast, |
| <samp>-mtune</samp> indicates the processor (or, in this case, collection of |
| processors) for which the code is optimized. |
| </p> |
| </dd> |
| <dt>‘<samp>intel</samp>’</dt> |
| <dd><p>Produce code optimized for the most current Intel processors, which are |
| Haswell and Silvermont for this version of GCC. If you know the CPU |
| on which your code will run, then you should use the corresponding |
| <samp>-mtune</samp> or <samp>-march</samp> option instead of <samp>-mtune=intel</samp>. |
| But if you want your application to perform better on both Haswell and |
| Silvermont, then you should use this option. |
| </p> |
| <p>As new Intel processors are deployed in the marketplace, the behavior of |
| this option will change. Therefore, if you upgrade to a newer version of |
| GCC, code generation controlled by this option will change to reflect |
| the most current Intel processors at the time that version of GCC is |
| released. |
| </p> |
| <p>There is no <samp>-march=intel</samp> option because <samp>-march</samp> indicates |
| the instruction set the compiler can use, and there is no common |
| instruction set applicable to all processors. In contrast, |
| <samp>-mtune</samp> indicates the processor (or, in this case, collection of |
| processors) for which the code is optimized. |
| </p></dd> |
| </dl> |
| |
| </dd> |
| <dt><code>-mcpu=<var>cpu-type</var></code></dt> |
| <dd><a name="index-mcpu-14"></a> |
| <p>A deprecated synonym for <samp>-mtune</samp>. |
| </p> |
| </dd> |
| <dt><code>-mfpmath=<var>unit</var></code></dt> |
| <dd><a name="index-mfpmath-1"></a> |
| <p>Generate floating-point arithmetic for the selected unit <var>unit</var>. The choices |
| for <var>unit</var> are: |
| </p> |
| <dl compact="compact"> |
| <dt>‘<samp>387</samp>’</dt> |
| <dd><p>Use the standard 387 floating-point coprocessor present on the majority of chips and |
| emulated otherwise. Code compiled with this option runs almost everywhere. |
| The temporary results are computed in 80-bit precision instead of the precision |
| specified by the type, resulting in slightly different results compared to most |
| other chips. See <samp>-ffloat-store</samp> for a more detailed description. |
| </p> |
| <p>This is the default choice for x86-32 targets. |
| </p> |
| </dd> |
| <dt>‘<samp>sse</samp>’</dt> |
| <dd><p>Use scalar floating-point instructions present in the SSE instruction set. |
| This instruction set is supported by Pentium III and newer chips, |
| and in the AMD line |
| by Athlon-4, Athlon XP and Athlon MP chips. The earlier version of the SSE |
| instruction set supports only single-precision arithmetic, thus the double and |
| extended-precision arithmetic are still done using 387. A later version, present |
| only in Pentium 4 and AMD x86-64 chips, supports double-precision |
| arithmetic too. |
| </p> |
| <p>For the x86-32 compiler, you must use <samp>-march=<var>cpu-type</var></samp>, <samp>-msse</samp> |
| or <samp>-msse2</samp> switches to enable SSE extensions and make this option |
| effective. For the x86-64 compiler, these extensions are enabled by default. |
| </p> |
| <p>The resulting code should be considerably faster in the majority of cases and avoid |
| the numerical instability problems of 387 code, but may break some existing |
| code that expects temporaries to be 80 bits. |
| </p> |
| <p>This is the default choice for the x86-64 compiler. |
| </p> |
| </dd> |
| <dt>‘<samp>sse,387</samp>’</dt> |
| <dt>‘<samp>sse+387</samp>’</dt> |
| <dt>‘<samp>both</samp>’</dt> |
| <dd><p>Attempt to utilize both instruction sets at once. This effectively doubles the |
| amount of available registers, and on chips with separate execution units for |
| 387 and SSE the execution resources too. Use this option with care, as it is |
| still experimental, because the GCC register allocator does not model separate |
| functional units well, resulting in unstable performance. |
| </p></dd> |
| </dl> |
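| <p>The effect of 80-bit temporaries can be probed with a comparison such as |
| the one in the following sketch. It is an illustration only: with |
| ‘<samp>387</samp>’ arithmetic, which branch is taken depends on the optimization |
| level and on whether the quotient is kept in a register or stored to memory, |
| so no particular result is guaranteed. The file name in the comment is an |
| assumption. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">/* Hypothetical build commands for a 32-bit target: |
|      gcc -m32 -mfpmath=387 fp.c |
|      gcc -m32 -msse2 -mfpmath=sse fp.c */ |
| #include &lt;stdio.h&gt; |
| |
| int main(void) |
| { |
|     /* volatile prevents the division from being folded at compile time */ |
|     volatile double a = 1.0, b = 3.0; |
|     double q = a / b;   /* rounded to double if it is stored to memory */ |
| |
|     /* With 387 code the second quotient may still carry 80-bit precision. */ |
|     if (q == a / b) |
|         puts("quotients compare equal"); |
|     else |
|         puts("quotients differ"); |
|     return 0; |
| } |
| </pre></div> |
| |
| <p>With ‘<samp>sse</samp>’ and SSE2 enabled, both divisions are performed in double |
| precision, so the two quotients are expected to compare equal. |
| </p> |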
| |
| </dd> |
| <dt><code>-masm=<var>dialect</var></code></dt> |
| <dd><a name="index-masm_003ddialect"></a> |
| <p>Output assembly instructions using the selected <var>dialect</var>. This option also affects |
| which dialect is used for basic <code>asm</code> (see <a href="Basic-Asm.html#Basic-Asm">Basic Asm</a>) and |
| extended <code>asm</code> (see <a href="Extended-Asm.html#Extended-Asm">Extended Asm</a>). Supported choices (in dialect |
| order) are ‘<samp>att</samp>’ or ‘<samp>intel</samp>’. The default is ‘<samp>att</samp>’. Darwin does |
| not support ‘<samp>intel</samp>’. |
| </p> |
| </dd> |
| <dt><code>-mieee-fp</code></dt> |
| <dt><code>-mno-ieee-fp</code></dt> |
| <dd><a name="index-mieee_002dfp"></a> |
| <a name="index-mno_002dieee_002dfp"></a> |
| <p>Control whether or not the compiler uses IEEE floating-point |
| comparisons. These correctly handle the case where the result of a |
| comparison is unordered. |
| </p> |
| </dd> |
| <dt><code>-msoft-float</code></dt> |
| <dd><a name="index-msoft_002dfloat-13"></a> |
| <p>Generate output containing library calls for floating point. |
| </p> |
| <p><strong>Warning:</strong> the requisite libraries are not part of GCC. |
| Normally the facilities of the machine’s usual C compiler are used, but |
| this can’t be done directly in cross-compilation. You must make your |
| own arrangements to provide suitable library functions for |
| cross-compilation. |
| </p> |
| <p>On machines where a function returns floating-point results in the 80387 |
| register stack, some floating-point opcodes may be emitted even if |
| <samp>-msoft-float</samp> is used. |
| </p> |
| </dd> |
| <dt><code>-mno-fp-ret-in-387</code></dt> |
| <dd><a name="index-mno_002dfp_002dret_002din_002d387"></a> |
| <p>Do not use the FPU registers for return values of functions. |
| </p> |
| <p>The usual calling convention has functions return values of types |
| <code>float</code> and <code>double</code> in an FPU register, even if there |
| is no FPU. The idea is that the operating system should emulate |
| an FPU. |
| </p> |
| <p>The option <samp>-mno-fp-ret-in-387</samp> causes such values to be returned |
| in ordinary CPU registers instead. |
| </p> |
| </dd> |
| <dt><code>-mno-fancy-math-387</code></dt> |
| <dd><a name="index-mno_002dfancy_002dmath_002d387"></a> |
| <p>Some 387 emulators do not support the <code>sin</code>, <code>cos</code> and |
| <code>sqrt</code> instructions for the 387. Specify this option to avoid |
| generating those instructions. This option is the default on FreeBSD, |
| OpenBSD and NetBSD. This option is overridden when <samp>-march</samp> |
| indicates that the target CPU always has an FPU and so the |
| instruction does not need emulation. These |
| instructions are not generated unless you also use the |
| <samp>-funsafe-math-optimizations</samp> switch. |
| </p> |
| </dd> |
| <dt><code>-malign-double</code></dt> |
| <dt><code>-mno-align-double</code></dt> |
| <dd><a name="index-malign_002ddouble"></a> |
| <a name="index-mno_002dalign_002ddouble"></a> |
| <p>Control whether GCC aligns <code>double</code>, <code>long double</code>, and |
| <code>long long</code> variables on a two-word boundary or a one-word |
| boundary. Aligning <code>double</code> variables on a two-word boundary |
| produces code that runs somewhat faster on a Pentium at the |
| expense of more memory. |
| </p> |
| <p>On x86-64, <samp>-malign-double</samp> is enabled by default. |
| </p> |
| <p><strong>Warning:</strong> if you use the <samp>-malign-double</samp> switch, |
| structures containing the above types are aligned differently than |
| the published application binary interface specifications for the x86-32 |
| and are not binary compatible with structures in code compiled |
| without that switch. |
| </p> |
| </dd> |
| <dt><code>-m96bit-long-double</code></dt> |
| <dt><code>-m128bit-long-double</code></dt> |
| <dd><a name="index-m96bit_002dlong_002ddouble"></a> |
| <a name="index-m128bit_002dlong_002ddouble"></a> |
| <p>These switches control the size of <code>long double</code> type. The x86-32 |
| application binary interface specifies the size to be 96 bits, |
| so <samp>-m96bit-long-double</samp> is the default in 32-bit mode. |
| </p> |
| <p>Modern architectures (Pentium and newer) prefer <code>long double</code> |
| to be aligned to an 8- or 16-byte boundary. In arrays or structures |
| conforming to the ABI, this is not possible. So specifying |
| <samp>-m128bit-long-double</samp> aligns <code>long double</code> |
| to a 16-byte boundary by padding the <code>long double</code> with an additional |
| 32-bit zero. |
| </p> |
| <p>In the x86-64 compiler, <samp>-m128bit-long-double</samp> is the default choice as |
| its ABI specifies that <code>long double</code> is aligned on 16-byte boundary. |
| </p> |
| <p>Notice that neither of these options enables any extra precision over the x87 |
| standard of 80 bits for a <code>long double</code>. |
| </p> |
| <p><strong>Warning:</strong> if you override the default value for your target ABI, this |
| changes the size of |
| structures and arrays containing <code>long double</code> variables, |
| as well as modifying the function calling convention for functions taking |
| <code>long double</code>. Hence they are not binary-compatible |
| with code compiled without that switch. |
| </p> |
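| <p>One way to check which layout is in effect is to print the size and |
| alignment of <code>long double</code>, as in the sketch below. The file name and |
| command lines in the comment are assumptions; <code>__alignof__</code> is the GCC |
| extension for querying alignment. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">/* Hypothetical build commands: |
|      gcc -m32 -m96bit-long-double ld.c |
|      gcc -m32 -m128bit-long-double ld.c */ |
| #include &lt;stdio.h&gt; |
| |
| int main(void) |
| { |
|     /* Storage size in bytes, including any padding added by the option. */ |
|     printf("sizeof(long double)      = %u\n", |
|            (unsigned) sizeof(long double)); |
|     printf("__alignof__(long double) = %u\n", |
|            (unsigned) __alignof__(long double)); |
|     return 0; |
| } |
| </pre></div> |
| |
| <p>Following the description above, the first build is expected to report a |
| size of 12 bytes (96 bits) and the second a size of 16 bytes (128 bits). |
| </p> |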
| </dd> |
| <dt><code>-mlong-double-64</code></dt> |
| <dt><code>-mlong-double-80</code></dt> |
| <dt><code>-mlong-double-128</code></dt> |
| <dd><a name="index-mlong_002ddouble_002d64-1"></a> |
| <a name="index-mlong_002ddouble_002d80"></a> |
| <a name="index-mlong_002ddouble_002d128-1"></a> |
| <p>These switches control the size of <code>long double</code> type. A size |
| of 64 bits makes the <code>long double</code> type equivalent to the <code>double</code> |
| type. This is the default for 32-bit Bionic C library. A size |
| of 128 bits makes the <code>long double</code> type equivalent to the |
| <code>__float128</code> type. This is the default for 64-bit Bionic C library. |
| </p> |
| <p><strong>Warning:</strong> if you override the default value for your target ABI, this |
| changes the size of |
| structures and arrays containing <code>long double</code> variables, |
| as well as modifying the function calling convention for functions taking |
| <code>long double</code>. Hence they are not binary-compatible |
| with code compiled without that switch. |
| </p> |
| </dd> |
| <dt><code>-malign-data=<var>type</var></code></dt> |
| <dd><a name="index-malign_002ddata"></a> |
| <p>Control how GCC aligns variables. Supported values for <var>type</var> are: |
| ‘<samp>compat</samp>’, which uses an increased alignment value compatible with GCC 4.8 |
| and earlier; ‘<samp>abi</samp>’, which uses the alignment value specified by the |
| psABI; and ‘<samp>cacheline</samp>’, which uses an increased alignment value to match |
| the cache line size. ‘<samp>compat</samp>’ is the default. |
| </p> |
| </dd> |
| <dt><code>-mlarge-data-threshold=<var>threshold</var></code></dt> |
| <dd><a name="index-mlarge_002ddata_002dthreshold"></a> |
| <p>When <samp>-mcmodel=medium</samp> is specified, data objects larger than |
| <var>threshold</var> are placed in the large data section. This value must be the |
| same across all objects linked into the binary, and defaults to 65535. |
| </p> |
| </dd> |
| <dt><code>-mrtd</code></dt> |
| <dd><a name="index-mrtd-1"></a> |
| <p>Use a different function-calling convention, in which functions that |
| take a fixed number of arguments return with the <code>ret <var>num</var></code> |
| instruction, which pops their arguments while returning. This saves one |
| instruction in the caller since there is no need to pop the arguments |
| there. |
| </p> |
| <p>You can specify that an individual function is called with this calling |
| sequence with the function attribute <code>stdcall</code>. You can also |
| override the <samp>-mrtd</samp> option by using the function attribute |
| <code>cdecl</code>. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>. |
| </p> |
| <p><strong>Warning:</strong> this calling convention is incompatible with the one |
| normally used on Unix, so you cannot use it if you need to call |
| libraries compiled with the Unix compiler. |
| </p> |
| <p>Also, you must provide function prototypes for all functions that |
| take variable numbers of arguments (including <code>printf</code>); |
| otherwise incorrect code is generated for calls to those |
| functions. |
| </p> |
| <p>In addition, seriously incorrect code results if you call a |
| function with too many arguments. (Normally, extra arguments are |
| harmlessly ignored.) |
| </p> |
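| <p>The per-function attributes can be sketched as follows. The function |
| names are hypothetical, and the example assumes a 32-bit x86 target, since |
| these attributes are not meaningful for the x86-64 calling convention. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">/* Hypothetical build command: gcc -m32 conv.c */ |
| #include &lt;stdio.h&gt; |
| |
| /* Callee pops its two arguments on return, as with -mrtd. */ |
| __attribute__((stdcall)) int add2(int a, int b) |
| { |
|     return a + b; |
| } |
| |
| /* Caller pops the arguments, even if the file is compiled with -mrtd. */ |
| __attribute__((cdecl)) int sub2(int a, int b) |
| { |
|     return a - b; |
| } |
| |
| int main(void) |
| { |
|     /* printf is declared in stdio.h, so its variable-argument |
|        prototype is visible, as the text above requires. */ |
|     printf("%d %d\n", add2(2, 3), sub2(5, 3)); |
|     return 0; |
| } |
| </pre></div> |
| |
| <p>Because the convention is recorded in each function’s type, callers that |
| see these declarations use the matching sequence for each call. |
| </p> |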
| </dd> |
| <dt><code>-mregparm=<var>num</var></code></dt> |
| <dd><a name="index-mregparm"></a> |
| <p>Control how many registers are used to pass integer arguments. By |
| default, no registers are used to pass arguments, and at most 3 |
| registers can be used. You can control this behavior for a specific |
| function by using the function attribute <code>regparm</code>. |
| See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>. |
| </p> |
| <p><strong>Warning:</strong> if you use this switch, and |
| <var>num</var> is nonzero, then you must build all modules with the same |
| value, including any libraries. This includes the system libraries and |
| startup modules. |
| </p> |
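| <p>A minimal sketch of the per-function form is shown below. The function |
| name is hypothetical, and the example assumes a 32-bit x86 target. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">/* Hypothetical build command: gcc -m32 regparm.c */ |
| #include &lt;stdio.h&gt; |
| |
| /* Up to three integer arguments are passed in registers |
|    instead of on the stack. */ |
| __attribute__((regparm(3))) int sum3(int a, int b, int c) |
| { |
|     return a + b + c; |
| } |
| |
| int main(void) |
| { |
|     printf("%d\n", sum3(1, 2, 3)); |
|     return 0; |
| } |
| </pre></div> |
| |
| <p>Using the attribute on selected functions limits the changed convention to |
| those functions, so libraries built with the default convention remain |
| callable. |
| </p> |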
| </dd> |
| <dt><code>-msseregparm</code></dt> |
| <dd><a name="index-msseregparm"></a> |
| <p>Use SSE register passing conventions for float and double arguments |
| and return values. You can control this behavior for a specific |
| function by using the function attribute <code>sseregparm</code>. |
| See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>. |
| </p> |
| <p><strong>Warning:</strong> if you use this switch then you must build all |
| modules with the same value, including any libraries. This includes |
| the system libraries and startup modules. |
| </p> |
| </dd> |
| <dt><code>-mvect8-ret-in-mem</code></dt> |
| <dd><a name="index-mvect8_002dret_002din_002dmem"></a> |
| <p>Return 8-byte vectors in memory instead of MMX registers. This is the |
| default on Solaris 8 and 9 and VxWorks to match the ABI of the Sun |
| Studio compilers until version 12. Later compiler versions (starting |
| with Studio 12 Update 1) follow the ABI used by other x86 targets, which |
| is the default on Solaris 10 and later. <em>Only</em> use this option if |
| you need to remain compatible with existing code produced by those |
| previous compiler versions or older versions of GCC. |
| </p> |
| </dd> |
| <dt><code>-mpc32</code></dt> |
| <dt><code>-mpc64</code></dt> |
| <dt><code>-mpc80</code></dt> |
| <dd><a name="index-mpc32"></a> |
| <a name="index-mpc64"></a> |
| <a name="index-mpc80"></a> |
| |
| <p>Set 80387 floating-point precision to 32, 64 or 80 bits. When <samp>-mpc32</samp> |
| is specified, the significands of results of floating-point operations are |
| rounded to 24 bits (single precision); <samp>-mpc64</samp> rounds the |
| significands of results of floating-point operations to 53 bits (double |
| precision) and <samp>-mpc80</samp> rounds the significands of results of |
| floating-point operations to 64 bits (extended double precision), which is |
| the default. When this option is used, floating-point operations in higher |
| precisions are not available to the programmer without setting the FPU |
| control word explicitly. |
| </p> |
| <p>Setting the rounding of floating-point operations to less than the default |
| 80 bits can speed some programs by 2% or more. Note that some mathematical |
| libraries assume that extended-precision (80-bit) floating-point operations |
| are enabled by default; routines in such libraries could suffer significant |
| loss of accuracy, typically through so-called “catastrophic cancellation”, |
| when this option is used to set the precision to less than extended precision. |
| </p> |
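| <p>The loss of precision can be illustrated with the following sketch, which |
| assumes a target where <code>long double</code> is the x87 extended type. The |
| file name and command lines in the comment are assumptions. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">/* Hypothetical build commands: |
|      gcc -mpc80 prec.c |
|      gcc -mpc64 prec.c */ |
| #include &lt;stdio.h&gt; |
| |
| int main(void) |
| { |
|     /* volatile prevents the sum from being folded at compile time */ |
|     volatile long double one = 1.0L; |
|     /* 2 to the power -60: needs more than 53 significand bits next to 1.0 */ |
|     volatile long double tiny = 0x1p-60L; |
|     long double sum = one + tiny; |
| |
|     if (sum == one) |
|         puts("tiny was rounded away"); |
|     else |
|         puts("tiny was preserved"); |
|     return 0; |
| } |
| </pre></div> |
| |
| <p>With the default 64-bit significand the sum is exactly representable, so |
| the second message is expected; with <samp>-mpc64</samp> the result is rounded |
| to 53 bits and the first message is expected. |
| </p> |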
| </dd> |
| <dt><code>-mstackrealign</code></dt> |
| <dd><a name="index-mstackrealign"></a> |
| <p>Realign the stack at entry. On the x86, the <samp>-mstackrealign</samp> |
| option generates an alternate prologue and epilogue that realigns the |
| run-time stack if necessary. This supports mixing legacy code that keeps |
| 4-byte stack alignment with modern code that keeps 16-byte stack alignment for |
| SSE compatibility. See also the attribute <code>force_align_arg_pointer</code>, |
| applicable to individual functions. |
| </p> |
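| <p>For the per-function alternative, a minimal sketch might look as follows. |
| The function name <code>callback_entry</code> is hypothetical; it stands for an |
| entry point that callers with 4-byte stack alignment might invoke. |
| </p> |

```c
#include <stdint.h>

/* Hypothetical entry point. The attribute asks GCC to realign the
   stack in this function's prologue, much as -mstackrealign does
   for every function in a translation unit. */
__attribute__((force_align_arg_pointer))
int callback_entry(void)
{
    /* A local that needs 16-byte alignment, as SSE data would. */
    _Alignas(16) volatile unsigned char scratch[16] = {0};

    /* Report whether the local really is 16-byte aligned. */
    return ((uintptr_t)scratch % 16u) == 0;
}
```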
| </dd> |
| <dt><code>-mpreferred-stack-boundary=<var>num</var></code></dt> |
| <dd><a name="index-mpreferred_002dstack_002dboundary"></a> |
| <p>Attempt to keep the stack boundary aligned to a 2 raised to <var>num</var> |
| byte boundary. If <samp>-mpreferred-stack-boundary</samp> is not specified, |
| the default is 4 (16 bytes or 128 bits). |
| </p> |
| <p><strong>Warning:</strong> When generating code for the x86-64 architecture with |
| SSE extensions disabled, <samp>-mpreferred-stack-boundary=3</samp> can be |
| used to keep the stack boundary aligned to an 8-byte boundary. Since the |
| x86-64 ABI requires 16-byte stack alignment, this is ABI-incompatible and |
| intended for controlled environments where stack space is an |
| important limitation. This option leads to wrong code when functions |
| compiled with 16-byte stack alignment (such as functions from a standard |
| library) are called with a misaligned stack. In this case, SSE |
| instructions may lead to misaligned memory access traps. In addition, |
| variable arguments are handled incorrectly for 16-byte aligned |
| objects (including x87 <code>long double</code> and <code>__int128</code>), leading to wrong |
| results. You must build all modules with |
| <samp>-mpreferred-stack-boundary=3</samp>, including any libraries. This |
| includes the system libraries and startup modules. |
| </p> |
| </dd> |
| <dt><code>-mincoming-stack-boundary=<var>num</var></code></dt> |
| <dd><a name="index-mincoming_002dstack_002dboundary"></a> |
| <p>Assume the incoming stack is aligned to a 2 raised to <var>num</var> byte |
| boundary. If <samp>-mincoming-stack-boundary</samp> is not specified, |
| the one specified by <samp>-mpreferred-stack-boundary</samp> is used. |
| </p> |
| <p>On Pentium and Pentium Pro, <code>double</code> and <code>long double</code> values |
| should be aligned to an 8-byte boundary (see <samp>-malign-double</samp>) or |
| suffer significant run time performance penalties. On Pentium III, the |
| Streaming SIMD Extension (SSE) data type <code>__m128</code> may not work |
| properly if it is not 16-byte aligned. |
| </p> |
| <p>To ensure proper alignment of these values on the stack, the stack boundary |
| must be as aligned as that required by any value stored on the stack. |
| Further, every function must be generated such that it keeps the stack |
| aligned. Thus calling a function compiled with a higher preferred |
| stack boundary from a function compiled with a lower preferred stack |
| boundary most likely misaligns the stack. It is recommended that |
| libraries that use callbacks always use the default setting. |
| </p> |
| <p>This extra alignment does consume extra stack space, and generally |
| increases code size. Code that is sensitive to stack space usage, such |
| as embedded systems and operating system kernels, may want to reduce the |
| preferred alignment to <samp>-mpreferred-stack-boundary=2</samp>. |
| </p> |
| </dd> |
| <dt><code>-mmmx</code></dt> |
| <dd><a name="index-mmmx"></a> |
| </dd> |
| <dt><code>-msse</code></dt> |
| <dd><a name="index-msse"></a> |
| </dd> |
| <dt><code>-msse2</code></dt> |
| <dt><code>-msse3</code></dt> |
| <dt><code>-mssse3</code></dt> |
| <dt><code>-msse4</code></dt> |
| <dt><code>-msse4a</code></dt> |
| <dt><code>-msse4.1</code></dt> |
| <dt><code>-msse4.2</code></dt> |
| <dt><code>-mavx</code></dt> |
| <dd><a name="index-mavx"></a> |
| </dd> |
| <dt><code>-mavx2</code></dt> |
| <dt><code>-mavx512f</code></dt> |
| <dt><code>-mavx512pf</code></dt> |
| <dt><code>-mavx512er</code></dt> |
| <dt><code>-mavx512cd</code></dt> |
| <dt><code>-msha</code></dt> |
| <dd><a name="index-msha"></a> |
| </dd> |
| <dt><code>-maes</code></dt> |
| <dd><a name="index-maes"></a> |
| </dd> |
| <dt><code>-mpclmul</code></dt> |
| <dd><a name="index-mpclmul"></a> |
| </dd> |
| <dt><code>-mclflushopt</code></dt> |
| <dd><a name="index-mclfushopt"></a> |
| </dd> |
| <dt><code>-mfsgsbase</code></dt> |
| <dd><a name="index-mfsgsbase"></a> |
| </dd> |
| <dt><code>-mrdrnd</code></dt> |
| <dd><a name="index-mrdrnd"></a> |
| </dd> |
| <dt><code>-mf16c</code></dt> |
| <dd><a name="index-mf16c"></a> |
| </dd> |
| <dt><code>-mfma</code></dt> |
| <dd><a name="index-mfma"></a> |
| </dd> |
| <dt><code>-mfma4</code></dt> |
| <dt><code>-mno-fma4</code></dt> |
| <dt><code>-mprefetchwt1</code></dt> |
| <dd><a name="index-mprefetchwt1"></a> |
| </dd> |
| <dt><code>-mxop</code></dt> |
| <dd><a name="index-mxop"></a> |
| </dd> |
| <dt><code>-mlwp</code></dt> |
| <dd><a name="index-mlwp"></a> |
| </dd> |
| <dt><code>-m3dnow</code></dt> |
| <dd><a name="index-m3dnow"></a> |
| </dd> |
| <dt><code>-mpopcnt</code></dt> |
| <dd><a name="index-mpopcnt"></a> |
| </dd> |
| <dt><code>-mabm</code></dt> |
| <dd><a name="index-mabm"></a> |
| </dd> |
| <dt><code>-mbmi</code></dt> |
| <dd><a name="index-mbmi"></a> |
| </dd> |
| <dt><code>-mbmi2</code></dt> |
| <dt><code>-mlzcnt</code></dt> |
| <dd><a name="index-mlzcnt"></a> |
| </dd> |
| <dt><code>-mfxsr</code></dt> |
| <dd><a name="index-mfxsr"></a> |
| </dd> |
| <dt><code>-mxsave</code></dt> |
| <dd><a name="index-mxsave"></a> |
| </dd> |
| <dt><code>-mxsaveopt</code></dt> |
| <dd><a name="index-mxsaveopt"></a> |
| </dd> |
| <dt><code>-mxsavec</code></dt> |
| <dd><a name="index-mxsavec"></a> |
| </dd> |
| <dt><code>-mxsaves</code></dt> |
| <dd><a name="index-mxsaves"></a> |
| </dd> |
| <dt><code>-mrtm</code></dt> |
| <dd><a name="index-mrtm"></a> |
| </dd> |
| <dt><code>-mtbm</code></dt> |
| <dd><a name="index-mtbm"></a> |
| </dd> |
| <dt><code>-mmpx</code></dt> |
| <dd><a name="index-mmpx"></a> |
| <p>These switches enable the use of instructions in the MMX, SSE, |
| SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, AVX512F, AVX512PF, AVX512ER, AVX512CD, |
| SHA, AES, PCLMUL, FSGSBASE, RDRND, F16C, FMA, SSE4A, FMA4, XOP, LWP, ABM, |
| BMI, BMI2, FXSR, XSAVE, XSAVEOPT, LZCNT, RTM, MPX or 3DNow! |
| extended instruction sets. Each has a corresponding <samp>-mno-</samp> option |
| to disable use of these instructions. |
| </p> |
| <p>These extensions are also available as built-in functions: see |
| <a href="x86-Built_002din-Functions.html#x86-Built_002din-Functions">x86 Built-in Functions</a>, for details of the functions enabled and |
| disabled by these switches. |
| </p> |
| <p>To generate SSE/SSE2 instructions automatically from floating-point |
| code (as opposed to 387 instructions), see <samp>-mfpmath=sse</samp>. |
| </p> |
| <p>GCC suppresses SSEx instructions when <samp>-mavx</samp> is used. Instead, it |
| generates new AVX instructions or AVX equivalents for all SSEx instructions |
| when needed. |
| </p> |
| <p>These options enable GCC to use these extended instructions in |
| generated code, even without <samp>-mfpmath=sse</samp>. Applications that |
| perform run-time CPU detection must compile separate files for each |
| supported architecture, using the appropriate flags. In particular, |
| the file containing the CPU detection code should be compiled without |
| these options. |
| </p> |
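| <p>The run-time detection pattern described above can be sketched in a single |
| file as follows. This is only an illustration: the function names are |
| hypothetical, and the <code>target</code> attribute stands in for the separate |
| source file compiled with <samp>-mavx2</samp> that the text recommends. The |
| sketch relies on the GCC built-ins <code>__builtin_cpu_init</code> and |
| <code>__builtin_cpu_supports</code>. |
| </p> |

```c
#include <stddef.h>

/* Baseline version: compiled with the default instruction set. */
static long sum_generic(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* AVX2 version. In a real project this would live in its own file
   built with -mavx2; the target attribute is a stand-in for that. */
__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

typedef long (*sum_fn)(const int *, size_t);

/* Detection code: built without any of the ISA options so that it
   can run on every supported CPU. */
sum_fn select_sum(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? sum_avx2 : sum_generic;
}
```

| <p>A caller would typically invoke <code>select_sum</code> once at startup and |
| keep the returned pointer for later calls. |
| </p> |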
| </dd> |
| <dt><code>-mdump-tune-features</code></dt> |
| <dd><a name="index-mdump_002dtune_002dfeatures"></a> |
| <p>This option instructs GCC to dump the names of the x86 performance |
| tuning features and default settings. The names can be used in |
| <samp>-mtune-ctrl=<var>feature-list</var></samp>. |
| </p> |
| </dd> |
| <dt><code>-mtune-ctrl=<var>feature-list</var></code></dt> |
| <dd><a name="index-mtune_002dctrl_003dfeature_002dlist"></a> |
| <p>This option is used for fine-grained control of x86 code generation features. |
| <var>feature-list</var> is a comma-separated list of <var>feature</var> names. See also |
| <samp>-mdump-tune-features</samp>. When specified, the <var>feature</var> is turned |
| on if it is not preceded by ‘<samp>^</samp>’; otherwise, it is turned off. |
| <samp>-mtune-ctrl=<var>feature-list</var></samp> is intended to be used by GCC |
| developers. Using it may lead to code paths not covered by testing and can |
| potentially result in compiler ICEs or runtime errors. |
| </p> |
| </dd> |
| <dt><code>-mno-default</code></dt> |
| <dd><a name="index-mno_002ddefault"></a> |
| <p>This option instructs GCC to turn off all tunable features. See also |
| <samp>-mtune-ctrl=<var>feature-list</var></samp> and <samp>-mdump-tune-features</samp>. |
| </p> |
| </dd> |
| <dt><code>-mcld</code></dt> |
| <dd><a name="index-mcld"></a> |
| <p>This option instructs GCC to emit a <code>cld</code> instruction in the prologue |
| of functions that use string instructions. String instructions depend on |
| the DF flag to select between autoincrement or autodecrement mode. While the |
| ABI specifies the DF flag to be cleared on function entry, some operating |
| systems violate this specification by not clearing the DF flag in their |
| exception dispatchers. The exception handler can be invoked with the DF flag |
| set, which leads to wrong direction mode when string instructions are used. |
| This option can be enabled by default on 32-bit x86 targets by configuring |
| GCC with the <samp>--enable-cld</samp> configure option. Generation of <code>cld</code> |
| instructions can be suppressed with the <samp>-mno-cld</samp> compiler option |
| in this case. |
| </p> |
| </dd> |
| <dt><code>-mvzeroupper</code></dt> |
| <dd><a name="index-mvzeroupper"></a> |
| <p>This option instructs GCC to emit a <code>vzeroupper</code> instruction |
| before a transfer of control flow out of the function to minimize |
| the AVX to SSE transition penalty as well as remove unnecessary <code>zeroupper</code> |
| intrinsics. |
| </p> |
| </dd> |
| <dt><code>-mprefer-avx128</code></dt> |
| <dd><a name="index-mprefer_002davx128"></a> |
| <p>This option instructs GCC to use 128-bit AVX instructions instead of |
| 256-bit AVX instructions in the auto-vectorizer. |
| </p> |
| </dd> |
| <dt><code>-mcx16</code></dt> |
| <dd><a name="index-mcx16"></a> |
| <p>This option enables GCC to generate <code>CMPXCHG16B</code> instructions. |
| <code>CMPXCHG16B</code> allows for atomic operations on 128-bit double quadword |
| (or oword) data types. |
| This is useful for high-resolution counters that can be updated |
| by multiple processors (or cores). This instruction is generated as part of |
| atomic built-in functions: see <a href="_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins">__sync Builtins</a> or |
| <a href="_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins">__atomic Builtins</a> for details. |
| </p> |
| </dd> |
| <dt><code>-msahf</code></dt> |
| <dd><a name="index-msahf"></a> |
| <p>This option enables generation of <code>SAHF</code> instructions in 64-bit code. |
| Early Intel Pentium 4 CPUs with Intel 64 support, |
| prior to the introduction of Pentium 4 G1 step in December 2005, |
| lacked the <code>LAHF</code> and <code>SAHF</code> instructions |
| which are supported by AMD64. |
| These are load and store instructions, respectively, for certain status flags. |
| In 64-bit mode, the <code>SAHF</code> instruction is used to optimize <code>fmod</code>, |
| <code>drem</code>, and <code>remainder</code> built-in functions; |
| see <a href="Other-Builtins.html#Other-Builtins">Other Builtins</a> for details. |
| </p> |
| </dd> |
| <dt><code>-mmovbe</code></dt> |
| <dd><a name="index-mmovbe"></a> |
| <p>This option enables use of the <code>movbe</code> instruction to implement |
| <code>__builtin_bswap32</code> and <code>__builtin_bswap64</code>. |
| </p> |
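| <p>For reference, a minimal use of the two built-ins is sketched below. The |
| wrapper names are hypothetical. The code compiles with or without |
| <samp>-mmovbe</samp>; the option only changes which instructions GCC may |
| select for the byte swap. |
| </p> |

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t x)
{
    return __builtin_bswap32(x);
}

/* Reverse the byte order of a 64-bit value. */
uint64_t swap64(uint64_t x)
{
    return __builtin_bswap64(x);
}
```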
| </dd> |
| <dt><code>-mcrc32</code></dt> |
| <dd><a name="index-mcrc32"></a> |
| <p>This option enables built-in functions <code>__builtin_ia32_crc32qi</code>, |
| <code>__builtin_ia32_crc32hi</code>, <code>__builtin_ia32_crc32si</code> and |
| <code>__builtin_ia32_crc32di</code> to generate the <code>crc32</code> machine instruction. |
| </p> |
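| <p>The following sketch shows one way <code>__builtin_ia32_crc32qi</code> might |
| be used. It assumes that the <code>target("sse4.2")</code> function attribute |
| makes the built-in available without a command-line option, and it falls |
| back to a portable bitwise loop when the CPU lacks SSE4.2. The |
| <code>crc32</code> instruction uses the CRC-32C (Castagnoli) polynomial; all |
| function names here are hypothetical. |
| </p> |

```c
#include <stddef.h>
#include <stdint.h>

/* Portable bitwise CRC-32C, reflected polynomial 0x82F63B78. */
static uint32_t crc32c_sw(uint32_t crc, const unsigned char *p, size_t n)
{
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Same update, one byte at a time, using the crc32 instruction. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(uint32_t crc, const unsigned char *p, size_t n)
{
    while (n--)
        crc = __builtin_ia32_crc32qi(crc, *p++);
    return crc;
}

/* CRC-32C with the conventional initial value and final inversion. */
uint32_t crc32c(const void *data, size_t n)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2"))
        crc = crc32c_hw(crc, p, n);
    else
        crc = crc32c_sw(crc, p, n);
    return crc ^ 0xFFFFFFFFu;
}
```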
| </dd> |
| <dt><code>-mrecip</code></dt> |
| <dd><a name="index-mrecip-1"></a> |
| <p>This option enables use of <code>RCPSS</code> and <code>RSQRTSS</code> instructions |
| (and their vectorized variants <code>RCPPS</code> and <code>RSQRTPS</code>) |
| with an additional Newton-Raphson step |
| to increase precision instead of <code>DIVSS</code> and <code>SQRTSS</code> |
| (and their vectorized |
| variants) for single-precision floating-point arguments. These instructions |
| are generated only when <samp>-funsafe-math-optimizations</samp> is enabled |
| together with <samp>-ffinite-math-only</samp> and <samp>-fno-trapping-math</samp>. |
| Note that while the throughput of the sequence is higher than the throughput |
| of the non-reciprocal instruction, the precision of the sequence can be |
| decreased by up to 2 ulp (i.e. the inverse of 1.0 equals 0.99999994). |
| </p> |
| <p>Note that GCC implements <code>1.0f/sqrtf(<var>x</var>)</code> in terms of <code>RSQRTSS</code> |
| (or <code>RSQRTPS</code>) already with <samp>-ffast-math</samp> (or the above option |
| combination), and doesn’t need <samp>-mrecip</samp>. |
| </p> |
| <p>Also note that GCC emits the above sequence with an additional Newton-Raphson step |
| for vectorized single-float division and vectorized <code>sqrtf(<var>x</var>)</code> |
| already with <samp>-ffast-math</samp> (or the above option combination), and |
| doesn’t need <samp>-mrecip</samp>. |
| </p> |
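| <p>To make the estimate-plus-refinement idea concrete, here is a hand-written |
| sketch of a reciprocal computed from <code>RCPSS</code> followed by one |
| Newton-Raphson step, x1 = x0 * (2 - a * x0). It uses the SSE intrinsics from |
| <code>&lt;xmmintrin.h&gt;</code>; the function name is hypothetical, and the exact |
| sequence GCC emits may differ. |
| </p> |

```c
#include <xmmintrin.h>

/* Approximate 1/a: start from the RCPSS estimate (roughly 12 bits of
   precision) and refine it with one Newton-Raphson iteration. */
float recip_newton(float a)
{
    /* x0 is the hardware estimate of 1/a. */
    float x0 = _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(a)));

    /* One refinement step: x1 = x0 * (2 - a * x0). */
    return x0 * (2.0f - a * x0);
}
```

| <p>The result is close to, but not necessarily identical to, the correctly |
| rounded quotient produced by <code>DIVSS</code>. |
| </p> |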
| </dd> |
| <dt><code>-mrecip=<var>opt</var></code></dt> |
| <dd><a name="index-mrecip_003dopt-1"></a> |
| <p>This option controls which reciprocal estimate instructions |
| may be used. <var>opt</var> is a comma-separated list of options, which may |
| be preceded by a ‘<samp>!</samp>’ to invert the option: |
| </p> |
| <dl compact="compact"> |
| <dt>‘<samp>all</samp>’</dt> |
| <dd><p>Enable all estimate instructions. |
| </p> |
| </dd> |
| <dt>‘<samp>default</samp>’</dt> |
| <dd><p>Enable the default instructions, equivalent to <samp>-mrecip</samp>. |
| </p> |
| </dd> |
| <dt>‘<samp>none</samp>’</dt> |
| <dd><p>Disable all estimate instructions, equivalent to <samp>-mno-recip</samp>. |
| </p> |
| </dd> |
| <dt>‘<samp>div</samp>’</dt> |
| <dd><p>Enable the approximation for scalar division. |
| </p> |
| </dd> |
| <dt>‘<samp>vec-div</samp>’</dt> |
| <dd><p>Enable the approximation for vectorized division. |
| </p> |
| </dd> |
| <dt>‘<samp>sqrt</samp>’</dt> |
| <dd><p>Enable the approximation for scalar square root. |
| </p> |
| </dd> |
| <dt>‘<samp>vec-sqrt</samp>’</dt> |
| <dd><p>Enable the approximation for vectorized square root. |
| </p></dd> |
| </dl> |
| |
| <p>So, for example, <samp>-mrecip=all,!sqrt</samp> enables |
| all of the reciprocal approximations, except for square root. |
| </p> |
| </dd> |
| <dt><code>-mveclibabi=<var>type</var></code></dt> |
| <dd><a name="index-mveclibabi-1"></a> |
| <p>Specifies the ABI type to use for vectorizing intrinsics using an |
| external library. Supported values for <var>type</var> are ‘<samp>svml</samp>’ |
| for the Intel short |
| vector math library and ‘<samp>acml</samp>’ for the AMD math core library. |
| To use this option, both <samp>-ftree-vectorize</samp> and |
| <samp>-funsafe-math-optimizations</samp> have to be enabled, and an SVML or ACML |
| ABI-compatible library must be specified at link time. |
| </p> |
| <p>GCC currently emits calls to <code>vmldExp2</code>, |
| <code>vmldLn2</code>, <code>vmldLog102</code>, <code>vmldPow2</code>, |
| <code>vmldTanh2</code>, <code>vmldTan2</code>, <code>vmldAtan2</code>, <code>vmldAtanh2</code>, |
| <code>vmldCbrt2</code>, <code>vmldSinh2</code>, <code>vmldSin2</code>, <code>vmldAsinh2</code>, |
| <code>vmldAsin2</code>, <code>vmldCosh2</code>, <code>vmldCos2</code>, <code>vmldAcosh2</code>, |
| <code>vmldAcos2</code>, <code>vmlsExp4</code>, <code>vmlsLn4</code>, <code>vmlsLog104</code>, |
| <code>vmlsPow4</code>, <code>vmlsTanh4</code>, <code>vmlsTan4</code>, |
| <code>vmlsAtan4</code>, <code>vmlsAtanh4</code>, <code>vmlsCbrt4</code>, <code>vmlsSinh4</code>, |
| <code>vmlsSin4</code>, <code>vmlsAsinh4</code>, <code>vmlsAsin4</code>, <code>vmlsCosh4</code>, |
| <code>vmlsCos4</code>, <code>vmlsAcosh4</code> and <code>vmlsAcos4</code> for the corresponding |
| function type when <samp>-mveclibabi=svml</samp> is used, and <code>__vrd2_sin</code>, |
| <code>__vrd2_cos</code>, <code>__vrd2_exp</code>, <code>__vrd2_log</code>, <code>__vrd2_log2</code>, |
| <code>__vrd2_log10</code>, <code>__vrs4_sinf</code>, <code>__vrs4_cosf</code>, |
| <code>__vrs4_expf</code>, <code>__vrs4_logf</code>, <code>__vrs4_log2f</code>, |
| <code>__vrs4_log10f</code> and <code>__vrs4_powf</code> for the corresponding function type |
| when <samp>-mveclibabi=acml</samp> is used. |
| </p> |
| </dd> |
| <dt><code>-mabi=<var>name</var></code></dt> |
| <dd><a name="index-mabi-3"></a> |
| <p>Generate code for the specified calling convention. Permissible values |
| are ‘<samp>sysv</samp>’ for the ABI used on GNU/Linux and other systems, and |
| ‘<samp>ms</samp>’ for the Microsoft ABI. The default is to use the Microsoft |
| ABI when targeting Microsoft Windows and the SysV ABI on all other systems. |
| You can control this behavior for specific functions by |
| using the function attributes <code>ms_abi</code> and <code>sysv_abi</code>. |
| See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>. |
| </p> |
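| <p>A short sketch of the per-function attributes follows. The function names |
| are hypothetical; the point is only that both conventions can coexist in |
| one translation unit and that GCC arranges the arguments correctly at each |
| call site. |
| </p> |

```c
/* Uses the Microsoft calling convention regardless of -mabi. */
__attribute__((ms_abi))
int add_ms(int a, int b)
{
    return a + b;
}

/* Uses the System V calling convention regardless of -mabi. */
__attribute__((sysv_abi))
int add_sysv(int a, int b)
{
    return a + b;
}

/* Calls one function of each kind from ordinary code. */
int call_both(int a, int b)
{
    return add_ms(a, b) + add_sysv(a, b);
}
```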
| </dd> |
| <dt><code>-mtls-dialect=<var>type</var></code></dt> |
| <dd><a name="index-mtls_002ddialect-1"></a> |
| <p>Generate code to access thread-local storage using the ‘<samp>gnu</samp>’ or |
| ‘<samp>gnu2</samp>’ conventions. ‘<samp>gnu</samp>’ is the conservative default; |
| ‘<samp>gnu2</samp>’ is more efficient, but it may add compile- and run-time |
| requirements that cannot be satisfied on all systems. |
| </p> |
| </dd> |
| <dt><code>-mpush-args</code></dt> |
| <dt><code>-mno-push-args</code></dt> |
| <dd><a name="index-mpush_002dargs"></a> |
| <a name="index-mno_002dpush_002dargs"></a> |
| <p>Use PUSH operations to store outgoing parameters. This method is shorter |
| and usually as fast as the method using SUB/MOV operations, and it is enabled |
| by default. In some cases disabling it may improve performance because of |
| improved scheduling and reduced dependencies. |
| </p> |
| </dd> |
| <dt><code>-maccumulate-outgoing-args</code></dt> |
| <dd><a name="index-maccumulate_002doutgoing_002dargs-1"></a> |
| <p>If enabled, the maximum amount of space required for outgoing arguments is |
| computed in the function prologue. This is faster on most modern CPUs |
| because of reduced dependencies, improved scheduling and reduced stack usage |
| when the preferred stack boundary is not equal to 2. The drawback is a notable |
| increase in code size. This switch implies <samp>-mno-push-args</samp>. |
| </p> |
| </dd> |
| <dt><code>-mthreads</code></dt> |
| <dd><a name="index-mthreads"></a> |
| <p>Support thread-safe exception handling on MinGW. Programs that rely |
| on thread-safe exception handling must compile and link all code with the |
| <samp>-mthreads</samp> option. When compiling, <samp>-mthreads</samp> defines |
| <samp>-D_MT</samp>; when linking, it links in a special thread helper library |
| <samp>-lmingwthrd</samp> which cleans up per-thread exception-handling data. |
| </p> |
| </dd> |
| <dt><code>-mno-align-stringops</code></dt> |
| <dd><a name="index-mno_002dalign_002dstringops"></a> |
| <p>Do not align the destination of inlined string operations. This switch reduces |
| code size and improves performance in case the destination is already aligned, |
| but GCC doesn’t know about it. |
| </p> |
| </dd> |
| <dt><code>-minline-all-stringops</code></dt> |
| <dd><a name="index-minline_002dall_002dstringops"></a> |
| <p>By default GCC inlines string operations only when the destination is |
| known to be aligned to at least a 4-byte boundary. |
| This option enables more inlining and increases code |
| size, but may improve performance of code that depends on fast |
| <code>memcpy</code>, <code>strlen</code>, |
| and <code>memset</code> for short lengths. |
| </p> |
| </dd> |
| <dt><code>-minline-stringops-dynamically</code></dt> |
| <dd><a name="index-minline_002dstringops_002ddynamically"></a> |
| <p>For string operations of unknown size, use run-time checks with |
| inline code for small blocks and a library call for large blocks. |
| </p> |
| </dd> |
| <dt><code>-mstringop-strategy=<var>alg</var></code></dt> |
| <dd><a name="index-mstringop_002dstrategy_003dalg"></a> |
| <p>Override the internal decision heuristic for the particular algorithm to use |
| for inlining string operations. The allowed values for <var>alg</var> are: |
| </p> |
| <dl compact="compact"> |
| <dt>‘<samp>rep_byte</samp>’</dt> |
| <dt>‘<samp>rep_4byte</samp>’</dt> |
| <dt>‘<samp>rep_8byte</samp>’</dt> |
| <dd><p>Expand using i386 <code>rep</code> prefix of the specified size. |
| </p> |
| </dd> |
| <dt>‘<samp>byte_loop</samp>’</dt> |
| <dt>‘<samp>loop</samp>’</dt> |
| <dt>‘<samp>unrolled_loop</samp>’</dt> |
| <dd><p>Expand into an inline loop. |
| </p> |
| </dd> |
| <dt>‘<samp>libcall</samp>’</dt> |
| <dd><p>Always use a library call. |
| </p></dd> |
| </dl> |
| |
| </dd> |
| <dt><code>-mmemcpy-strategy=<var>strategy</var></code></dt> |
| <dd><a name="index-mmemcpy_002dstrategy_003dstrategy"></a> |
| <p>Override the internal decision heuristic to decide if <code>__builtin_memcpy</code> |
| should be inlined and what inline algorithm to use when the expected size |
| of the copy operation is known. <var>strategy</var> |
| is a comma-separated list of <var>alg</var>:<var>max_size</var>:<var>dest_align</var> triplets. |
| <var>alg</var> is specified in <samp>-mstringop-strategy</samp>, <var>max_size</var> specifies |
| the max byte size with which inline algorithm <var>alg</var> is allowed. For the last |
| triplet, the <var>max_size</var> must be <code>-1</code>. The <var>max_size</var> of the triplets |
| in the list must be specified in increasing order. The minimal byte size for |
| <var>alg</var> is <code>0</code> for the first triplet and <code><var>max_size</var> + 1</code> of the |
| preceding range. |
| </p> |
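| <p>The way the triplets partition copy sizes into ranges can be modelled with |
| a few lines of C. This is an illustration of the rule stated above, not |
| GCC's own parser: the structure and function names are hypothetical, and |
| the <var>dest_align</var> field is left out for brevity. |
| </p> |

```c
#include <stddef.h>

/* One entry of the list, with max_size already converted to a
   number; -1 marks the open-ended last entry. */
struct strategy_range {
    const char *alg;
    long max_size;
};

/* Return the algorithm whose size range contains 'size'.
   The first range starts at 0; each later range starts at the
   preceding max_size + 1. */
const char *pick_alg(const struct strategy_range *list, size_t count,
                     long size)
{
    long min_size = 0;

    for (size_t i = 0; i < count; i++) {
        long max = list[i].max_size;

        if (max == -1 || (size >= min_size && size <= max))
            return list[i].alg;
        min_size = max + 1;
    }
    return NULL; /* malformed list: no entry ended with -1 */
}
```

| <p>With the entries <code>rep_8byte</code> up to 64, <code>unrolled_loop</code> up to |
| 256 and <code>libcall</code> with -1, a 64-byte copy maps to <code>rep_8byte</code>, a |
| 65-byte copy to <code>unrolled_loop</code>, and anything above 256 bytes to |
| <code>libcall</code>. |
| </p> |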
| </dd> |
| <dt><code>-mmemset-strategy=<var>strategy</var></code></dt> |
| <dd><a name="index-mmemset_002dstrategy_003dstrategy"></a> |
| <p>This option is similar to <samp>-mmemcpy-strategy=</samp> except that it controls |
| <code>__builtin_memset</code> expansion. |
| </p> |
| </dd> |
| <dt><code>-momit-leaf-frame-pointer</code></dt> |
| <dd><a name="index-momit_002dleaf_002dframe_002dpointer-2"></a> |
| <p>Don’t keep the frame pointer in a register for leaf functions. This |
| avoids the instructions to save, set up, and restore frame pointers and |
| makes an extra register available in leaf functions. The option |
| <samp>-fomit-leaf-frame-pointer</samp> removes the frame pointer for leaf functions, |
| which might make debugging harder. |
| </p> |
| </dd> |
| <dt><code>-mtls-direct-seg-refs</code></dt> |
| <dt><code>-mno-tls-direct-seg-refs</code></dt> |
| <dd><a name="index-mtls_002ddirect_002dseg_002drefs"></a> |
| <p>Controls whether TLS variables may be accessed with offsets from the |
| TLS segment register (<code>%gs</code> for 32-bit, <code>%fs</code> for 64-bit), |
| or whether the thread base pointer must be added. Whether or not this |
| is valid depends on the operating system, and whether it maps the |
| segment to cover the entire TLS area. |
| </p> |
| <p>For systems that use the GNU C Library, the default is on. |
| </p> |
| </dd> |
| <dt><code>-msse2avx</code></dt> |
| <dt><code>-mno-sse2avx</code></dt> |
| <dd><a name="index-msse2avx"></a> |
| <p>Specify that the assembler should encode SSE instructions with VEX |
| prefix. The option <samp>-mavx</samp> turns this on by default. |
| </p> |
| </dd> |
| <dt><code>-mfentry</code></dt> |
| <dt><code>-mno-fentry</code></dt> |
| <dd><a name="index-mfentry"></a> |
| <p>If profiling is active (<samp>-pg</samp>), put the profiling |
| counter call before the prologue. |
| Note: On x86 architectures the attribute <code>ms_hook_prologue</code> |
| isn’t possible at the moment for <samp>-mfentry</samp> and <samp>-pg</samp>. |
| </p> |
| </dd> |
| <dt><code>-mrecord-mcount</code></dt> |
| <dt><code>-mno-record-mcount</code></dt> |
| <dd><a name="index-mrecord_002dmcount"></a> |
| <p>If profiling is active (<samp>-pg</samp>), generate a <code>__mcount_loc</code> section |
| that contains pointers to each profiling call. This is useful for |
| automatically patching calls in and out. |
| </p> |
| </dd> |
| <dt><code>-mnop-mcount</code></dt> |
| <dt><code>-mno-nop-mcount</code></dt> |
| <dd><a name="index-mnop_002dmcount"></a> |
| <p>If profiling is active (<samp>-pg</samp>), generate the calls to |
| the profiling functions as nops. This is useful when they |
| should be patched in later dynamically. This is likely only |
| useful together with <samp>-mrecord-mcount</samp>. |
| </p> |
| </dd> |
| <dt><code>-mskip-rax-setup</code></dt> |
| <dt><code>-mno-skip-rax-setup</code></dt> |
| <dd><a name="index-mskip_002drax_002dsetup"></a> |
| <p>When generating code for the x86-64 architecture with SSE extensions |
| disabled, <samp>-mskip-rax-setup</samp> can be used to skip setting up the RAX |
| register when there are no variable arguments passed in vector registers. |
| </p> |
| <p><strong>Warning:</strong> Since the RAX register is used to avoid unnecessarily |
| saving vector registers on the stack when passing variable arguments, |
| with this option callees may waste some stack space, |
| misbehave or jump to a random location. GCC 4.4 or newer does not have |
| those issues, regardless of the RAX register value. |
| </p> |
| </dd> |
| <dt><code>-m8bit-idiv</code></dt> |
| <dt><code>-mno-8bit-idiv</code></dt> |
| <dd><a name="index-m8bit_002didiv"></a> |
| <p>On some processors, like Intel Atom, 8-bit unsigned integer divide is |
| much faster than 32-bit/64-bit integer divide. This option generates a |
| run-time check. If both dividend and divisor are within range of 0 |
| to 255, 8-bit unsigned integer divide is used instead of |
| 32-bit/64-bit integer divide. |
| </p> |
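| <p>In C terms, the run-time check amounts to something like the sketch below. |
| The function name is hypothetical, and the C cast only models the narrow |
| divide; the compiler emits the actual 8-bit instruction itself. |
| </p> |

```c
/* Model of the -m8bit-idiv fast path for unsigned operands. */
unsigned int divide_with_8bit_fast_path(unsigned int dividend,
                                        unsigned int divisor)
{
    /* One test covers both operands: neither has a bit set above
       bit 7, so both lie in the range 0 to 255. */
    if (((dividend | divisor) & ~0xFFu) == 0)
        return (unsigned char)dividend / (unsigned char)divisor;

    /* Otherwise fall back to the full-width divide. */
    return dividend / divisor;
}
```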
| </dd> |
| <dt><code>-mavx256-split-unaligned-load</code></dt> |
| <dt><code>-mavx256-split-unaligned-store</code></dt> |
| <dd><a name="index-mavx256_002dsplit_002dunaligned_002dload"></a> |
| <a name="index-mavx256_002dsplit_002dunaligned_002dstore"></a> |
| <p>Split 32-byte AVX unaligned load and store. |
| </p> |
| </dd> |
| <dt><code>-mstack-protector-guard=<var>guard</var></code></dt> |
| <dd><a name="index-mstack_002dprotector_002dguard_003dguard"></a> |
| <p>Generate stack protection code using a canary at <var>guard</var>. Supported |
| locations are ‘<samp>global</samp>’ for a global canary or ‘<samp>tls</samp>’ for a per-thread |
| canary in the TLS block (the default). This option takes effect only when |
| <samp>-fstack-protector</samp> or <samp>-fstack-protector-all</samp> is specified. |
| </p> |
| </dd> |
| </dl> |
| |
| <p>These ‘<samp>-m</samp>’ switches are supported in addition to the above |
| on x86-64 processors in 64-bit environments. |
| </p> |
| <dl compact="compact"> |
| <dt><code>-m32</code></dt> |
| <dt><code>-m64</code></dt> |
| <dt><code>-mx32</code></dt> |
| <dt><code>-m16</code></dt> |
| <dd><a name="index-m32-5"></a> |
| <a name="index-m64-5"></a> |
| <a name="index-mx32"></a> |
| <a name="index-m16"></a> |
| <p>Generate code for a 16-bit, 32-bit or 64-bit environment. |
| The <samp>-m32</samp> option sets <code>int</code>, <code>long</code>, and pointer types |
| to 32 bits, and |
| generates code that runs on any i386 system. |
| </p> |
| <p>The <samp>-m64</samp> option sets <code>int</code> to 32 bits and <code>long</code> and pointer |
| types to 64 bits, and generates code for the x86-64 architecture. |
| For Darwin only the <samp>-m64</samp> option also turns off the <samp>-fno-pic</samp> |
| and <samp>-mdynamic-no-pic</samp> options. |
| </p> |
| <p>The <samp>-mx32</samp> option sets <code>int</code>, <code>long</code>, and pointer types |
| to 32 bits, and |
| generates code for the x86-64 architecture. |
| </p> |
| <p>The <samp>-m16</samp> option is the same as <samp>-m32</samp>, except that |
| it outputs the <code>.code16gcc</code> assembly directive at the beginning of |
| the assembly output so that the binary can run in 16-bit mode. |
| </p> |
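| <p>The effect of these options on type sizes can be checked from C. The |
| sketch below relies on the predefined macros <code>__x86_64__</code>, |
| <code>__ILP32__</code> and <code>__i386__</code>; the function names are |
| hypothetical. |
| </p> |

```c
#include <limits.h>

/* Name the environment the translation unit was compiled for. */
const char *x86_data_model(void)
{
#if defined(__x86_64__) && defined(__ILP32__)
    return "x32 (-mx32): x86-64 code, 32-bit long and pointers";
#elif defined(__x86_64__)
    return "64-bit (-m64): 64-bit long and pointers";
#elif defined(__i386__)
    return "32-bit (-m32 or -m16): 32-bit long and pointers";
#else
    return "not an x86 target";
#endif
}

/* Widths, in bits, of the types the options above affect. */
int int_bits(void)     { return (int)(sizeof(int) * CHAR_BIT); }
int long_bits(void)    { return (int)(sizeof(long) * CHAR_BIT); }
int pointer_bits(void) { return (int)(sizeof(void *) * CHAR_BIT); }
```

| <p>Compiling the same file with <samp>-m32</samp>, <samp>-m64</samp> and |
| <samp>-mx32</samp> in turn, where the toolchain supports them, shows |
| <code>int</code> staying at 32 bits while <code>long</code> and pointers change |
| together. |
| </p> |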
| </dd> |
| <dt><code>-mno-red-zone</code></dt> |
| <dd><a name="index-mno_002dred_002dzone"></a> |
| <p>Do not use a so-called “red zone” for x86-64 code. The red zone is mandated |
| by the x86-64 ABI; it is a 128-byte area beyond the location of the |
| stack pointer that is not modified by signal or interrupt handlers |
| and therefore can be used for temporary data without adjusting the stack |
| pointer. The flag <samp>-mno-red-zone</samp> disables this red zone. |
| </p> |
| </dd> |
| <dt><code>-mcmodel=small</code></dt> |
| <dd><a name="index-mcmodel_003dsmall-3"></a> |
| <p>Generate code for the small code model: the program and its symbols must |
| be linked in the lower 2 GB of the address space. Pointers are 64 bits. |
| Programs can be statically or dynamically linked. This is the default |
| code model. |
| </p> |
| </dd> |
| <dt><code>-mcmodel=kernel</code></dt> |
| <dd><a name="index-mcmodel_003dkernel"></a> |
| <p>Generate code for the kernel code model. The kernel runs in the |
| negative 2 GB of the address space. |
| This model has to be used for Linux kernel code. |
| </p> |
| </dd> |
| <dt><code>-mcmodel=medium</code></dt> |
| <dd><a name="index-mcmodel_003dmedium-1"></a> |
| <p>Generate code for the medium model: the program is linked in the lower 2 |
| GB of the address space. Small symbols are also placed there. Symbols |
| with sizes larger than <samp>-mlarge-data-threshold</samp> are put into |
| large data or BSS sections and can be located above 2 GB. Programs can |
| be statically or dynamically linked. |
| </p> |
| </dd> |
| <dt><code>-mcmodel=large</code></dt> |
| <dd><a name="index-mcmodel_003dlarge-3"></a> |
| <p>Generate code for the large model. This model makes no assumptions |
| about addresses and sizes of sections. |
| </p> |
| </dd> |
| <dt><code>-maddress-mode=long</code></dt> |
| <dd><a name="index-maddress_002dmode_003dlong"></a> |
| <p>Generate code for long address mode. This is only supported for 64-bit |
| and x32 environments. It is the default address mode for 64-bit |
| environments. |
| </p> |
| </dd> |
| <dt><code>-maddress-mode=short</code></dt> |
| <dd><a name="index-maddress_002dmode_003dshort"></a> |
| <p>Generate code for short address mode. This is only supported for 32-bit |
| and x32 environments. It is the default address mode for 32-bit and |
| x32 environments. |
| </p></dd> |
| </dl> |
| |
| <hr> |
| <div class="header"> |
| <p> |
| Next: <a href="x86-Windows-Options.html#x86-Windows-Options" accesskey="n" rel="next">x86 Windows Options</a>, Previous: <a href="VxWorks-Options.html#VxWorks-Options" accesskey="p" rel="prev">VxWorks Options</a>, Up: <a href="Submodel-Options.html#Submodel-Options" accesskey="u" rel="up">Submodel Options</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> |
| </div> |
| |
| |
| |
| </body> |
| </html> |