| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <html> |
| <!-- Copyright (C) 1988-2015 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 or |
| any later version published by the Free Software Foundation; with the |
| Invariant Sections being "Funding Free Software", the Front-Cover |
| Texts being (a) (see below), and with the Back-Cover Texts being (b) |
| (see below). A copy of the license is included in the section entitled |
| "GNU Free Documentation License". |
| |
| (a) The FSF's Front-Cover Text is: |
| |
| A GNU Manual |
| |
| (b) The FSF's Back-Cover Text is: |
| |
| You have freedom to copy and modify this GNU Manual, like GNU |
| software. Copies published by the Free Software Foundation raise |
| funds for GNU development. --> |
| <!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ --> |
| <head> |
| <title>GNU Compiler Collection (GCC) Internals: Processor pipeline description</title> |
| |
| <meta name="description" content="GNU Compiler Collection (GCC) Internals: Processor pipeline description"> |
| <meta name="keywords" content="GNU Compiler Collection (GCC) Internals: Processor pipeline description"> |
| <meta name="resource-type" content="document"> |
| <meta name="distribution" content="global"> |
| <meta name="Generator" content="makeinfo"> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <link href="index.html#Top" rel="start" title="Top"> |
| <link href="Option-Index.html#Option-Index" rel="index" title="Option Index"> |
| <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents"> |
| <link href="Insn-Attributes.html#Insn-Attributes" rel="up" title="Insn Attributes"> |
| <link href="Conditional-Execution.html#Conditional-Execution" rel="next" title="Conditional Execution"> |
| <link href="Delay-Slots.html#Delay-Slots" rel="prev" title="Delay Slots"> |
| <style type="text/css"> |
| <!-- |
| a.summary-letter {text-decoration: none} |
| blockquote.smallquotation {font-size: smaller} |
| div.display {margin-left: 3.2em} |
| div.example {margin-left: 3.2em} |
| div.indentedblock {margin-left: 3.2em} |
| div.lisp {margin-left: 3.2em} |
| div.smalldisplay {margin-left: 3.2em} |
| div.smallexample {margin-left: 3.2em} |
| div.smallindentedblock {margin-left: 3.2em; font-size: smaller} |
| div.smalllisp {margin-left: 3.2em} |
| kbd {font-style:oblique} |
| pre.display {font-family: inherit} |
| pre.format {font-family: inherit} |
| pre.menu-comment {font-family: serif} |
| pre.menu-preformatted {font-family: serif} |
| pre.smalldisplay {font-family: inherit; font-size: smaller} |
| pre.smallexample {font-size: smaller} |
| pre.smallformat {font-family: inherit; font-size: smaller} |
| pre.smalllisp {font-size: smaller} |
| span.nocodebreak {white-space:nowrap} |
| span.nolinebreak {white-space:nowrap} |
| span.roman {font-family:serif; font-weight:normal} |
| span.sansserif {font-family:sans-serif; font-weight:normal} |
| ul.no-bullet {list-style: none} |
| --> |
| </style> |
| |
| |
| </head> |
| |
| <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> |
| <a name="Processor-pipeline-description"></a> |
| <div class="header"> |
| <p> |
| Previous: <a href="Delay-Slots.html#Delay-Slots" accesskey="p" rel="prev">Delay Slots</a>, Up: <a href="Insn-Attributes.html#Insn-Attributes" accesskey="u" rel="up">Insn Attributes</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> |
| </div> |
| <hr> |
| <a name="Specifying-processor-pipeline-description"></a> |
| <h4 class="subsection">16.19.9 Specifying processor pipeline description</h4> |
| <a name="index-processor-pipeline-description"></a> |
| <a name="index-processor-functional-units"></a> |
| <a name="index-instruction-latency-time"></a> |
| <a name="index-interlock-delays"></a> |
| <a name="index-data-dependence-delays"></a> |
| <a name="index-reservation-delays"></a> |
| <a name="index-pipeline-hazard-recognizer"></a> |
| <a name="index-automaton-based-pipeline-description"></a> |
| <a name="index-regular-expressions"></a> |
| <a name="index-deterministic-finite-state-automaton"></a> |
| <a name="index-automaton-based-scheduler"></a> |
| <a name="index-RISC"></a> |
| <a name="index-VLIW"></a> |
| |
| <p>To achieve better performance, most modern processors |
| (super-pipelined, superscalar <acronym>RISC</acronym>, and <acronym>VLIW</acronym> |
| processors) have many <em>functional units</em> on which several |
| instructions can be executed simultaneously. An instruction starts |
| execution if its issue conditions are satisfied. If not, the |
| instruction is stalled until its conditions are satisfied. Such |
| <em>interlock (pipeline) delay</em> causes interruption of the fetching |
| of successor instructions (or demands nop instructions, e.g. for some |
| MIPS processors). |
| </p> |
| <p>There are two major kinds of interlock delays in modern processors. |
| The first one is a data dependence delay determining <em>instruction |
| latency time</em>. The instruction execution is not started until all |
| source data have been evaluated by prior instructions (there are more |
| complex cases when the instruction execution starts even when the data |
| are not available but will be ready in given time after the |
| instruction execution start). Taking the data dependence delays into |
| account is simple. The data dependence (true, output, and |
| anti-dependence) delay between two instructions is given by a |
| constant. In most cases this approach is adequate. The second kind |
| of interlock delays is a reservation delay. The reservation delay |
| means that two instructions under execution will be in need of shared |
| processors resources, i.e. buses, internal registers, and/or |
| functional units, which are reserved for some time. Taking this kind |
| of delay into account is complex especially for modern <acronym>RISC</acronym> |
| processors. |
| </p> |
| <p>The task of exploiting more processor parallelism is solved by an |
| instruction scheduler. For a better solution to this problem, the |
| instruction scheduler has to have an adequate description of the |
| processor parallelism (or <em>pipeline description</em>). GCC |
| machine descriptions describe processor parallelism and functional |
| unit reservations for groups of instructions with the aid of |
| <em>regular expressions</em>. |
| </p> |
| <p>The GCC instruction scheduler uses a <em>pipeline hazard recognizer</em> to |
| figure out the possibility of the instruction issue by the processor |
| on a given simulated processor cycle. The pipeline hazard recognizer is |
| automatically generated from the processor pipeline description. The |
| pipeline hazard recognizer generated from the machine description |
| is based on a deterministic finite state automaton (<acronym>DFA</acronym>): |
| the instruction issue is possible if there is a transition from one |
| automaton state to another one. This algorithm is very fast, and |
| furthermore, its speed is not dependent on processor |
| complexity<a name="DOCF5" href="#FOOT5"><sup>5</sup></a>. |
| </p> |
| <a name="index-automaton-based-pipeline-description-1"></a> |
| <p>The rest of this section describes the directives that constitute |
| an automaton-based processor pipeline description. The order of |
| these constructions within the machine description file is not |
| important. |
| </p> |
| <a name="index-define_005fautomaton"></a> |
| <a name="index-pipeline-hazard-recognizer-1"></a> |
| <p>The following optional construction describes names of automata |
| generated and used for the pipeline hazards recognition. Sometimes |
| the generated finite state automaton used by the pipeline hazard |
| recognizer is large. If we use more than one automaton and bind functional |
| units to the automata, the total size of the automata is usually |
| less than the size of the single automaton. If there is no one such |
| construction, only one finite state automaton is generated. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_automaton <var>automata-names</var>) |
| </pre></div> |
| |
| <p><var>automata-names</var> is a string giving names of the automata. The |
| names are separated by commas. All the automata should have unique names. |
| The automaton name is used in the constructions <code>define_cpu_unit</code> and |
| <code>define_query_cpu_unit</code>. |
| </p> |
| <a name="index-define_005fcpu_005funit"></a> |
| <a name="index-processor-functional-units-1"></a> |
| <p>Each processor functional unit used in the description of instruction |
| reservations should be described by the following construction. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_cpu_unit <var>unit-names</var> [<var>automaton-name</var>]) |
| </pre></div> |
| |
| <p><var>unit-names</var> is a string giving the names of the functional units |
| separated by commas. Don’t use name ‘<samp>nothing</samp>’, it is reserved |
| for other goals. |
| </p> |
| <p><var>automaton-name</var> is a string giving the name of the automaton with |
| which the unit is bound. The automaton should be described in |
| construction <code>define_automaton</code>. You should give |
| <em>automaton-name</em>, if there is a defined automaton. |
| </p> |
| <p>The assignment of units to automata are constrained by the uses of the |
| units in insn reservations. The most important constraint is: if a |
| unit reservation is present on a particular cycle of an alternative |
| for an insn reservation, then some unit from the same automaton must |
| be present on the same cycle for the other alternatives of the insn |
| reservation. The rest of the constraints are mentioned in the |
| description of the subsequent constructions. |
| </p> |
| <a name="index-define_005fquery_005fcpu_005funit"></a> |
| <a name="index-querying-function-unit-reservations"></a> |
| <p>The following construction describes CPU functional units analogously |
| to <code>define_cpu_unit</code>. The reservation of such units can be |
| queried for an automaton state. The instruction scheduler never |
| queries reservation of functional units for given automaton state. So |
| as a rule, you don’t need this construction. This construction could |
| be used for future code generation goals (e.g. to generate |
| <acronym>VLIW</acronym> insn templates). |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_query_cpu_unit <var>unit-names</var> [<var>automaton-name</var>]) |
| </pre></div> |
| |
| <p><var>unit-names</var> is a string giving names of the functional units |
| separated by commas. |
| </p> |
| <p><var>automaton-name</var> is a string giving the name of the automaton with |
| which the unit is bound. |
| </p> |
| <a name="index-define_005finsn_005freservation"></a> |
| <a name="index-instruction-latency-time-1"></a> |
| <a name="index-regular-expressions-1"></a> |
| <a name="index-data-bypass"></a> |
| <p>The following construction is the major one to describe pipeline |
| characteristics of an instruction. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_insn_reservation <var>insn-name</var> <var>default_latency</var> |
| <var>condition</var> <var>regexp</var>) |
| </pre></div> |
| |
| <p><var>default_latency</var> is a number giving latency time of the |
| instruction. There is an important difference between the old |
| description and the automaton based pipeline description. The latency |
| time is used for all dependencies when we use the old description. In |
| the automaton based pipeline description, the given latency time is only |
| used for true dependencies. The cost of anti-dependencies is always |
| zero and the cost of output dependencies is the difference between |
| latency times of the producing and consuming insns (if the difference |
| is negative, the cost is considered to be zero). You can always |
| change the default costs for any description by using the target hook |
| <code>TARGET_SCHED_ADJUST_COST</code> (see <a href="Scheduling.html#Scheduling">Scheduling</a>). |
| </p> |
| <p><var>insn-name</var> is a string giving the internal name of the insn. The |
| internal names are used in constructions <code>define_bypass</code> and in |
| the automaton description file generated for debugging. The internal |
| name has nothing in common with the names in <code>define_insn</code>. It is a |
| good practice to use insn classes described in the processor manual. |
| </p> |
| <p><var>condition</var> defines what RTL insns are described by this |
| construction. You should remember that you will be in trouble if |
| <var>condition</var> for two or more different |
| <code>define_insn_reservation</code> constructions is TRUE for an insn. In |
| this case what reservation will be used for the insn is not defined. |
| Such cases are not checked during generation of the pipeline hazards |
| recognizer because in general recognizing that two conditions may have |
| the same value is quite difficult (especially if the conditions |
| contain <code>symbol_ref</code>). It is also not checked during the |
| pipeline hazard recognizer work because it would slow down the |
| recognizer considerably. |
| </p> |
| <p><var>regexp</var> is a string describing the reservation of the cpu’s functional |
| units by the instruction. The reservations are described by a regular |
| expression according to the following syntax: |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample"> regexp = regexp "," oneof |
| | oneof |
| |
| oneof = oneof "|" allof |
| | allof |
| |
| allof = allof "+" repeat |
| | repeat |
| |
| repeat = element "*" number |
| | element |
| |
| element = cpu_function_unit_name |
| | reservation_name |
| | result_name |
| | "nothing" |
| | "(" regexp ")" |
| </pre></div> |
| |
| <ul> |
| <li> ‘<samp>,</samp>’ is used for describing the start of the next cycle in |
| the reservation. |
| |
| </li><li> ‘<samp>|</samp>’ is used for describing a reservation described by the first |
| regular expression <strong>or</strong> a reservation described by the second |
| regular expression <strong>or</strong> etc. |
| |
| </li><li> ‘<samp>+</samp>’ is used for describing a reservation described by the first |
| regular expression <strong>and</strong> a reservation described by the |
| second regular expression <strong>and</strong> etc. |
| |
| </li><li> ‘<samp>*</samp>’ is used for convenience and simply means a sequence in which |
| the regular expression are repeated <var>number</var> times with cycle |
| advancing (see ‘<samp>,</samp>’). |
| |
| </li><li> ‘<samp>cpu_function_unit_name</samp>’ denotes reservation of the named |
| functional unit. |
| |
| </li><li> ‘<samp>reservation_name</samp>’ — see description of construction |
| ‘<samp>define_reservation</samp>’. |
| |
| </li><li> ‘<samp>nothing</samp>’ denotes no unit reservations. |
| </li></ul> |
| |
| <a name="index-define_005freservation"></a> |
| <p>Sometimes unit reservations for different insns contain common parts. |
| In such case, you can simplify the pipeline description by describing |
| the common part by the following construction |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_reservation <var>reservation-name</var> <var>regexp</var>) |
| </pre></div> |
| |
| <p><var>reservation-name</var> is a string giving name of <var>regexp</var>. |
| Functional unit names and reservation names are in the same name |
| space. So the reservation names should be different from the |
| functional unit names and can not be the reserved name ‘<samp>nothing</samp>’. |
| </p> |
| <a name="index-define_005fbypass"></a> |
| <a name="index-instruction-latency-time-2"></a> |
| <a name="index-data-bypass-1"></a> |
| <p>The following construction is used to describe exceptions in the |
| latency time for given instruction pair. This is so called bypasses. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_bypass <var>number</var> <var>out_insn_names</var> <var>in_insn_names</var> |
| [<var>guard</var>]) |
| </pre></div> |
| |
| <p><var>number</var> defines when the result generated by the instructions |
| given in string <var>out_insn_names</var> will be ready for the |
| instructions given in string <var>in_insn_names</var>. Each of these |
| strings is a comma-separated list of filename-style globs and |
| they refer to the names of <code>define_insn_reservation</code>s. |
| For example: |
| </p><div class="smallexample"> |
| <pre class="smallexample">(define_bypass 1 "cpu1_load_*, cpu1_store_*" "cpu1_load_*") |
| </pre></div> |
| <p>defines a bypass between instructions that start with |
| ‘<samp>cpu1_load_</samp>’ or ‘<samp>cpu1_store_</samp>’ and those that start with |
| ‘<samp>cpu1_load_</samp>’. |
| </p> |
| <p><var>guard</var> is an optional string giving the name of a C function which |
| defines an additional guard for the bypass. The function will get the |
| two insns as parameters. If the function returns zero the bypass will |
| be ignored for this case. The additional guard is necessary to |
| recognize complicated bypasses, e.g. when the consumer is only an address |
| of insn ‘<samp>store</samp>’ (not a stored value). |
| </p> |
| <p>If there are more one bypass with the same output and input insns, the |
| chosen bypass is the first bypass with a guard in description whose |
| guard function returns nonzero. If there is no such bypass, then |
| bypass without the guard function is chosen. |
| </p> |
| <a name="index-exclusion_005fset"></a> |
| <a name="index-presence_005fset"></a> |
| <a name="index-final_005fpresence_005fset"></a> |
| <a name="index-absence_005fset"></a> |
| <a name="index-final_005fabsence_005fset"></a> |
| <a name="index-VLIW-1"></a> |
| <a name="index-RISC-1"></a> |
| <p>The following five constructions are usually used to describe |
| <acronym>VLIW</acronym> processors, or more precisely, to describe a placement |
| of small instructions into <acronym>VLIW</acronym> instruction slots. They |
| can be used for <acronym>RISC</acronym> processors, too. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(exclusion_set <var>unit-names</var> <var>unit-names</var>) |
| (presence_set <var>unit-names</var> <var>patterns</var>) |
| (final_presence_set <var>unit-names</var> <var>patterns</var>) |
| (absence_set <var>unit-names</var> <var>patterns</var>) |
| (final_absence_set <var>unit-names</var> <var>patterns</var>) |
| </pre></div> |
| |
| <p><var>unit-names</var> is a string giving names of functional units |
| separated by commas. |
| </p> |
| <p><var>patterns</var> is a string giving patterns of functional units |
| separated by comma. Currently pattern is one unit or units |
| separated by white-spaces. |
| </p> |
| <p>The first construction (‘<samp>exclusion_set</samp>’) means that each |
| functional unit in the first string can not be reserved simultaneously |
| with a unit whose name is in the second string and vice versa. For |
| example, the construction is useful for describing processors |
| (e.g. some SPARC processors) with a fully pipelined floating point |
| functional unit which can execute simultaneously only single floating |
| point insns or only double floating point insns. |
| </p> |
| <p>The second construction (‘<samp>presence_set</samp>’) means that each |
| functional unit in the first string can not be reserved unless at |
| least one of pattern of units whose names are in the second string is |
| reserved. This is an asymmetric relation. For example, it is useful |
| for description that <acronym>VLIW</acronym> ‘<samp>slot1</samp>’ is reserved after |
| ‘<samp>slot0</samp>’ reservation. We could describe it by the following |
| construction |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(presence_set "slot1" "slot0") |
| </pre></div> |
| |
| <p>Or ‘<samp>slot1</samp>’ is reserved only after ‘<samp>slot0</samp>’ and unit ‘<samp>b0</samp>’ |
| reservation. In this case we could write |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(presence_set "slot1" "slot0 b0") |
| </pre></div> |
| |
| <p>The third construction (‘<samp>final_presence_set</samp>’) is analogous to |
| ‘<samp>presence_set</samp>’. The difference between them is when checking is |
| done. When an instruction is issued in given automaton state |
| reflecting all current and planned unit reservations, the automaton |
| state is changed. The first state is a source state, the second one |
| is a result state. Checking for ‘<samp>presence_set</samp>’ is done on the |
| source state reservation, checking for ‘<samp>final_presence_set</samp>’ is |
| done on the result reservation. This construction is useful to |
| describe a reservation which is actually two subsequent reservations. |
| For example, if we use |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(presence_set "slot1" "slot0") |
| </pre></div> |
| |
| <p>the following insn will be never issued (because ‘<samp>slot1</samp>’ requires |
| ‘<samp>slot0</samp>’ which is absent in the source state). |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_reservation "insn_and_nop" "slot0 + slot1") |
| </pre></div> |
| |
| <p>but it can be issued if we use analogous ‘<samp>final_presence_set</samp>’. |
| </p> |
| <p>The forth construction (‘<samp>absence_set</samp>’) means that each functional |
| unit in the first string can be reserved only if each pattern of units |
| whose names are in the second string is not reserved. This is an |
| asymmetric relation (actually ‘<samp>exclusion_set</samp>’ is analogous to |
| this one but it is symmetric). For example it might be useful in a |
| <acronym>VLIW</acronym> description to say that ‘<samp>slot0</samp>’ cannot be reserved |
| after either ‘<samp>slot1</samp>’ or ‘<samp>slot2</samp>’ have been reserved. This |
| can be described as: |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(absence_set "slot0" "slot1, slot2") |
| </pre></div> |
| |
| <p>Or ‘<samp>slot2</samp>’ can not be reserved if ‘<samp>slot0</samp>’ and unit ‘<samp>b0</samp>’ |
| are reserved or ‘<samp>slot1</samp>’ and unit ‘<samp>b1</samp>’ are reserved. In |
| this case we could write |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(absence_set "slot2" "slot0 b0, slot1 b1") |
| </pre></div> |
| |
| <p>All functional units mentioned in a set should belong to the same |
| automaton. |
| </p> |
| <p>The last construction (‘<samp>final_absence_set</samp>’) is analogous to |
| ‘<samp>absence_set</samp>’ but checking is done on the result (state) |
| reservation. See comments for ‘<samp>final_presence_set</samp>’. |
| </p> |
| <a name="index-automata_005foption"></a> |
| <a name="index-deterministic-finite-state-automaton-1"></a> |
| <a name="index-nondeterministic-finite-state-automaton"></a> |
| <a name="index-finite-state-automaton-minimization"></a> |
| <p>You can control the generator of the pipeline hazard recognizer with |
| the following construction. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(automata_option <var>options</var>) |
| </pre></div> |
| |
| <p><var>options</var> is a string giving options which affect the generated |
| code. Currently there are the following options: |
| </p> |
| <ul> |
| <li> <em>no-minimization</em> makes no minimization of the automaton. This is |
| only worth to do when we are debugging the description and need to |
| look more accurately at reservations of states. |
| |
| </li><li> <em>time</em> means printing time statistics about the generation of |
| automata. |
| |
| </li><li> <em>stats</em> means printing statistics about the generated automata |
| such as the number of DFA states, NDFA states and arcs. |
| |
| </li><li> <em>v</em> means a generation of the file describing the result automata. |
| The file has suffix ‘<samp>.dfa</samp>’ and can be used for the description |
| verification and debugging. |
| |
| </li><li> <em>w</em> means a generation of warning instead of error for |
| non-critical errors. |
| |
| </li><li> <em>no-comb-vect</em> prevents the automaton generator from generating |
| two data structures and comparing them for space efficiency. Using |
| a comb vector to represent transitions may be better, but it can be |
| very expensive to construct. This option is useful if the build |
| process spends an unacceptably long time in genautomata. |
| |
| </li><li> <em>ndfa</em> makes nondeterministic finite state automata. This affects |
| the treatment of operator ‘<samp>|</samp>’ in the regular expressions. The |
| usual treatment of the operator is to try the first alternative and, |
| if the reservation is not possible, the second alternative. The |
| nondeterministic treatment means trying all alternatives, some of them |
| may be rejected by reservations in the subsequent insns. |
| |
| </li><li> <em>collapse-ndfa</em> modifies the behaviour of the generator when |
| producing an automaton. An additional state transition to collapse a |
| nondeterministic <acronym>NDFA</acronym> state to a deterministic <acronym>DFA</acronym> |
| state is generated. It can be triggered by passing <code>const0_rtx</code> to |
| state_transition. In such an automaton, cycle advance transitions are |
| available only for these collapsed states. This option is useful for |
| ports that want to use the <code>ndfa</code> option, but also want to use |
| <code>define_query_cpu_unit</code> to assign units to insns issued in a cycle. |
| |
| </li><li> <em>progress</em> means output of a progress bar showing how many states |
| were generated so far for automaton being processed. This is useful |
| during debugging a <acronym>DFA</acronym> description. If you see too many |
| generated states, you could interrupt the generator of the pipeline |
| hazard recognizer and try to figure out a reason for generation of the |
| huge automaton. |
| </li></ul> |
| |
| <p>As an example, consider a superscalar <acronym>RISC</acronym> machine which can |
| issue three insns (two integer insns and one floating point insn) on |
| the cycle but can finish only two insns. To describe this, we define |
| the following functional units. |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_cpu_unit "i0_pipeline, i1_pipeline, f_pipeline") |
| (define_cpu_unit "port0, port1") |
| </pre></div> |
| |
| <p>All simple integer insns can be executed in any integer pipeline and |
| their result is ready in two cycles. The simple integer insns are |
| issued into the first pipeline unless it is reserved, otherwise they |
| are issued into the second pipeline. Integer division and |
| multiplication insns can be executed only in the second integer |
| pipeline and their results are ready correspondingly in 8 and 4 |
| cycles. The integer division is not pipelined, i.e. the subsequent |
| integer division insn can not be issued until the current division |
| insn finished. Floating point insns are fully pipelined and their |
| results are ready in 3 cycles. Where the result of a floating point |
| insn is used by an integer insn, an additional delay of one cycle is |
| incurred. To describe all of this we could specify |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_cpu_unit "div") |
| |
| (define_insn_reservation "simple" 2 (eq_attr "type" "int") |
| "(i0_pipeline | i1_pipeline), (port0 | port1)") |
| |
| (define_insn_reservation "mult" 4 (eq_attr "type" "mult") |
| "i1_pipeline, nothing*2, (port0 | port1)") |
| |
| (define_insn_reservation "div" 8 (eq_attr "type" "div") |
| "i1_pipeline, div*7, div + (port0 | port1)") |
| |
| (define_insn_reservation "float" 3 (eq_attr "type" "float") |
| "f_pipeline, nothing, (port0 | port1)) |
| |
| (define_bypass 4 "float" "simple,mult,div") |
| </pre></div> |
| |
| <p>To simplify the description we could describe the following reservation |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_reservation "finish" "port0|port1") |
| </pre></div> |
| |
| <p>and use it in all <code>define_insn_reservation</code> as in the following |
| construction |
| </p> |
| <div class="smallexample"> |
| <pre class="smallexample">(define_insn_reservation "simple" 2 (eq_attr "type" "int") |
| "(i0_pipeline | i1_pipeline), finish") |
| </pre></div> |
| |
| |
| <div class="footnote"> |
| <hr> |
| <h4 class="footnotes-heading">Footnotes</h4> |
| |
| <h3><a name="FOOT5" href="#DOCF5">(5)</a></h3> |
| <p>However, the size of the automaton depends on |
| processor complexity. To limit this effect, machine descriptions |
| can split orthogonal parts of the machine description among several |
| automata: but then, since each of these must be stepped independently, |
| this does cause a small decrease in the algorithm’s performance.</p> |
| </div> |
| <hr> |
| <div class="header"> |
| <p> |
| Previous: <a href="Delay-Slots.html#Delay-Slots" accesskey="p" rel="prev">Delay Slots</a>, Up: <a href="Insn-Attributes.html#Insn-Attributes" accesskey="u" rel="up">Insn Attributes</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p> |
| </div> |
| |
| |
| |
| </body> |
| </html> |