//===-- README.txt - Notes for Blackfin Target ------------------*- org -*-===//

* Condition codes
** DONE Problem with asymmetric SETCC operations
The instruction

  CC = R0 < 2

is not symmetric: there is no R0 > 2 instruction. On the other hand, IF CC
JUMP can take both CC and !CC as a condition. We cannot pattern-match (brcond
(not cc), target), because the DAG optimizer removes that kind of thing.

This is handled by creating a pseudo-register NCC that aliases CC. Register
classes JustCC and NotCC are used to control the inversion of CC.

** DONE CC as an i32 register
The AnyCC register class pretends to hold i32 values. It can only represent the
values 0 and 1, but we can copy to and from the D class. This hack makes it
possible to represent the setcc instruction without having i1 as a legal type.

In most cases, the CC register is set by a "CC = .." or BITTST instruction, and
then used in a conditional branch or move. The code generator thinks it is
moving 32 bits, but the value stays in CC. In other cases, the result of a
comparison is actually used as an i32 number, and CC will be copied to a D
register.

* Stack frames
** TODO Use Push/Pop instructions
We should use the push/pop instructions when saving callee-saved
registers. They are smaller, and we may even be able to use push-multiple
instructions.

** TODO requiresRegisterScavenging
We need more intelligence in determining when the scavenger is needed. We
should keep track of:
- Spilling D16 registers
- Spilling AnyCC registers

* Assembler
** TODO Implement PrintGlobalVariable
** TODO Remove LOAD32sym
It's a hack combining two instructions by concatenation.

* Inline Assembly

These are the GCC constraints from bfin/constraints.md:

| Code  | Register class                            | LLVM |
|-------+-------------------------------------------+------|
| a     | P                                         | C    |
| d     | D                                         | C    |
| z     | Call clobbered P (P0, P1, P2)             | X    |
| D     | EvenD                                     | X    |
| W     | OddD                                      | X    |
| e     | Accu                                      | C    |
| A     | A0                                        | S    |
| B     | A1                                        | S    |
| b     | I                                         | C    |
| v     | B                                         | C    |
| f     | M                                         | C    |
| c     | Circular I, B, L                          | X    |
| C     | JustCC                                    | S    |
| t     | LoopTop                                   | X    |
| u     | LoopBottom                                | X    |
| k     | LoopCount                                 | X    |
| x     | GR                                        | C    |
| y     | RET*, ASTAT, SEQSTAT, USP                 | X    |
| w     | ALL                                       | C    |
| Z     | The FD-PIC GOT pointer (P3)               | S    |
| Y     | The FD-PIC function pointer register (P1) | S    |
| q0-q7 | R0-R7 individually                        |      |
| qA    | P0                                        |      |
|-------+-------------------------------------------+------|
| Code  | Constant                                  |      |
|-------+-------------------------------------------+------|
| J     | 1<<N, N<32                                |      |
| Ks3   | imm3                                      |      |
| Ku3   | uimm3                                     |      |
| Ks4   | imm4                                      |      |
| Ku4   | uimm4                                     |      |
| Ks5   | imm5                                      |      |
| Ku5   | uimm5                                     |      |
| Ks7   | imm7                                      |      |
| KN7   | -imm7                                     |      |
| Ksh   | imm16                                     |      |
| Kuh   | uimm16                                    |      |
| L     | ~(1<<N)                                   |      |
| M1    | 0xff                                      |      |
| M2    | 0xffff                                    |      |
| P0-P4 | 0-4                                       |      |
| PA    | Macflag, not M                            |      |
| PB    | Macflag, only M                           |      |
| Q     | Symbol                                    |      |

** TODO Support all register classes
* DAG combiner
** Create test case for each illegal SETCC case
The DAG combiner may sometimes produce illegal i16 SETCC instructions.

*** TODO SETCC (srl (ctlz x), 5) == const
*** TODO SETCC (and load, const) == const
*** DONE SETCC (zext x) == const
*** TODO SETCC (sext x) == const

* Instruction selection
** TODO Better immediate constants
Like ARM, build constants as small imm + shift.

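A minimal sketch of the "small imm + shift" decomposition, assuming a signed
7-bit immediate (the imm7 class from the constraint table) followed by a left
shift. ImmShift and splitImmShift are hypothetical names, and the choice of a
7-bit immediate is an assumption made for this example rather than a statement
about what the selector should use.

```cpp
#include <cstdint>

struct ImmShift {
  int32_t imm;     // small signed immediate, assumed here to be 7 bits wide
  unsigned shift;  // left shift applied to imm
};

// Try to express value as (imm << shift) with imm in [-64, 63].
// Returns false if no such pair exists.
bool splitImmShift(int32_t value, ImmShift &out) {
  if (value == 0) {
    out = ImmShift{0, 0};
    return true;
  }
  // Count trailing zero bits; these become the shift amount.
  uint32_t bits = static_cast<uint32_t>(value);
  unsigned shift = 0;
  while ((bits & 1u) == 0) {
    bits >>= 1;
    ++shift;
  }
  // The division is exact because value has `shift` trailing zero bits.
  const int64_t imm = static_cast<int64_t>(value) / (int64_t{1} << shift);
  if (imm < -64 || imm > 63)
    return false;
  out = ImmShift{static_cast<int32_t>(imm), shift};
  return true;
}
```

For example, 768 splits into imm 3 and shift 8, so it could be materialized as
a load of 3 followed by a left shift by 8. A constant such as 0x12345 has no
such split and would still need the general two-halves load.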
** TODO Implement cycle counter
We have CYCLES and CYCLES2 registers, but the readcyclecounter intrinsic wants
to return i64, and the code generator doesn't know how to legalize that.

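The value the intrinsic has to produce is the two 32-bit registers glued
together. A small sketch of that combination, assuming CYCLES holds the low
word and CYCLES2 the high word; combineCycles is a hypothetical helper, and
how the two registers are read consistently is outside the scope of this
example.

```cpp
#include <cstdint>

// cycles:  value read from CYCLES  (assumed low 32 bits)
// cycles2: value read from CYCLES2 (assumed high 32 bits)
inline uint64_t combineCycles(uint32_t cycles, uint32_t cycles2) {
  return (static_cast<uint64_t>(cycles2) << 32) | cycles;
}
```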
** TODO Instruction alternatives
Some instructions come in different variants, for example:

  D = D + D
  P = P + P

Cross combinations are not allowed:

  P = D + D (bad)

Similarly for the subreg pseudo-instructions:

  D16L = EXTRACT_SUBREG D16, bfin_subreg_lo16
  P16L = EXTRACT_SUBREG P16, bfin_subreg_lo16

We want to take advantage of the alternative instructions. This could be done by
changing the DAG after instruction selection.

** Multipatterns for load/store
We should try to identify multipatterns for load and store instructions. The
available instruction matrix is a bit irregular.

Loads:

| Addr       | D | P | D 16z | D 16s | D16 | D 8z | D 8s |
|------------+---+---+-------+-------+-----+------+------|
| P          | * | * | *     | *     | *   | *    | *    |
| P++        | * | * | *     | *     |     | *    | *    |
| P--        | * | * | *     | *     |     | *    | *    |
| P+uimm5m2  |   |   | *     | *     |     |      |      |
| P+uimm6m4  | * | * |       |       |     |      |      |
| P+imm16    |   |   |       |       |     | *    | *    |
| P+imm17m2  |   |   | *     | *     |     |      |      |
| P+imm18m4  | * | * |       |       |     |      |      |
| P++P       | * |   | *     | *     | *   |      |      |
| FP-uimm7m4 | * | * |       |       |     |      |      |
| I          | * |   |       |       | *   |      |      |
| I++        | * |   |       |       | *   |      |      |
| I--        | * |   |       |       | *   |      |      |
| I++M       | * |   |       |       |     |      |      |

Stores:

| Addr       | D | P | D16H | D16L | D 8 |
|------------+---+---+------+------+-----|
| P          | * | * | *    | *    | *   |
| P++        | * | * |      | *    | *   |
| P--        | * | * |      | *    | *   |
| P+uimm5m2  |   |   |      | *    |     |
| P+uimm6m4  | * | * |      |      |     |
| P+imm16    |   |   |      |      | *   |
| P+imm17m2  |   |   |      | *    |     |
| P+imm18m4  | * | * |      |      |     |
| P++P       | * |   | *    | *    |     |
| FP-uimm7m4 | * | * |      |      |     |
| I          | * |   | *    | *    |     |
| I++        | * |   | *    | *    |     |
| I--        | * |   | *    | *    |     |
| I++M       | * |   |      |      |     |

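The offset classes in the two tables can be checked mechanically. The sketch
below reads uimmNmK as "fits in N unsigned bits and is a multiple of K" and
immNmK as the signed equivalent; that reading of the names is an assumption,
and isUImmMul, isImmMul, AddrForm and wordLoadForm are hypothetical helpers
written for this example. It picks between the two P+offset forms that the
tables list for a 32-bit D or P access.

```cpp
#include <cstdint>

// True if v fits in an unsigned field of `bits` bits and is a multiple of m.
bool isUImmMul(int64_t v, unsigned bits, int64_t m) {
  return v >= 0 && v < (int64_t{1} << bits) && v % m == 0;
}

// True if v fits in a signed field of `bits` bits and is a multiple of m.
bool isImmMul(int64_t v, unsigned bits, int64_t m) {
  const int64_t limit = int64_t{1} << (bits - 1);
  return v >= -limit && v < limit && v % m == 0;
}

enum class AddrForm { Short, Long, None };

// Choose an addressing form for a 32-bit access at P + offset.
AddrForm wordLoadForm(int64_t offset) {
  if (isUImmMul(offset, 6, 4))
    return AddrForm::Short;  // P+uimm6m4
  if (isImmMul(offset, 18, 4))
    return AddrForm::Long;   // P+imm18m4
  return AddrForm::None;     // offset must be computed separately
}
```

Under these assumptions an offset of 60 selects the short form, 64 and -4
select the long form, and a misaligned offset such as 2 matches neither.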
* Workarounds and features
Blackfin CPUs have bugs. Each model comes in a number of silicon revisions with
different bugs. We learn about the CPU model from the -mcpu switch.

** Interpretation of -mcpu value
- -mcpu=bf527 refers to the latest known BF527 revision
- -mcpu=bf527-0.2 refers to silicon rev. 0.2
- -mcpu=bf527-any refers to all known revisions
- -mcpu=bf527-none disables all workarounds

The -mcpu setting affects the __SILICON_REVISION__ macro and the enabled
workarounds:

| -mcpu      | __SILICON_REVISION__ | Workarounds        |
|------------+----------------------+--------------------|
| bf527      | Def Latest           | Specific to latest |
| bf527-1.3  | Def 0x0103           | Specific to 1.3    |
| bf527-any  | Def 0xffff           | All bf527-x.y      |
| bf527-none | Undefined            | None               |

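The table above can be turned into a small parser. This is only a sketch of
the mapping, assuming the revision x.y is encoded as (x << 8) | y, which
matches the 1.3 to 0x0103 row. RevKind, CpuSpec and parseMcpu are hypothetical
names. The numeric value of "latest" depends on the model, so the sketch
leaves revision at 0 in that case.

```cpp
#include <cstdint>
#include <string>

enum class RevKind { Latest, Specific, Any, None, Invalid };

struct CpuSpec {
  std::string model;  // e.g. "bf527"
  RevKind kind;
  uint16_t revision;  // meaningful only for Specific and Any
};

// Parse a decimal number in [0, 255]; returns false on any other input.
static bool parseByte(const std::string &s, unsigned &out) {
  if (s.empty())
    return false;
  out = 0;
  for (char c : s) {
    if (c < '0' || c > '9')
      return false;
    out = out * 10 + static_cast<unsigned>(c - '0');
    if (out > 0xff)
      return false;
  }
  return true;
}

CpuSpec parseMcpu(const std::string &arg) {
  CpuSpec spec{arg, RevKind::Latest, 0};
  const std::string::size_type dash = arg.find('-');
  if (dash == std::string::npos)
    return spec;  // bare model name: latest known revision
  spec.model = arg.substr(0, dash);
  const std::string rev = arg.substr(dash + 1);
  if (rev == "any") {
    spec.kind = RevKind::Any;
    spec.revision = 0xffff;
    return spec;
  }
  if (rev == "none") {
    spec.kind = RevKind::None;  // __SILICON_REVISION__ stays undefined
    return spec;
  }
  const std::string::size_type dot = rev.find('.');
  unsigned hi = 0, lo = 0;
  if (dot == std::string::npos || !parseByte(rev.substr(0, dot), hi) ||
      !parseByte(rev.substr(dot + 1), lo)) {
    spec.kind = RevKind::Invalid;
    return spec;
  }
  spec.kind = RevKind::Specific;
  spec.revision = static_cast<uint16_t>((hi << 8) | lo);
  return spec;
}
```

With this sketch, parseMcpu("bf527-1.3") yields model "bf527" and revision
0x0103, while "bf527-any" yields 0xffff.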
These are the known cores and revisions:

| Core        | Silicon            | Processors              |
|-------------+--------------------+-------------------------|
| Edinburgh   | 0.3, 0.4, 0.5, 0.6 | BF531 BF532 BF533       |
| Braemar     | 0.2, 0.3           | BF534 BF536 BF537       |
| Stirling    | 0.3, 0.4, 0.5      | BF538 BF539             |
| Moab        | 0.0, 0.1, 0.2      | BF542 BF544 BF548 BF549 |
| Teton       | 0.3, 0.5           | BF561                   |
| Kookaburra  | 0.0, 0.1, 0.2      | BF523 BF525 BF527       |
| Mockingbird | 0.0, 0.1           | BF522 BF524 BF526       |
| Brodie      | 0.0, 0.1           | BF512 BF514 BF516 BF518 |

** Compiler implemented workarounds
Most workarounds are implemented in header files and source code using the
__ADSPBF527__ macros. A few workarounds require compiler support.

|  Anomaly | Macro                          | GCC Switch       |
|----------+--------------------------------+------------------|
|      Any | __WORKAROUNDS_ENABLED          |                  |
| 05000074 | WA_05000074                    |                  |
| 05000244 | __WORKAROUND_SPECULATIVE_SYNCS | -mcsync-anomaly  |
| 05000245 | __WORKAROUND_SPECULATIVE_LOADS | -mspecld-anomaly |
| 05000257 | WA_05000257                    |                  |
| 05000283 | WA_05000283                    |                  |
| 05000312 | WA_LOAD_LCREGS                 |                  |
| 05000315 | WA_05000315                    |                  |
| 05000371 | __WORKAROUND_RETS              |                  |
| 05000426 | __WORKAROUND_INDIRECT_CALLS    | Not -micplb      |

** GCC feature switches
| Switch                    | Description                            |
|---------------------------+----------------------------------------|
| -msim                     | Use simulator runtime                  |
| -momit-leaf-frame-pointer | Omit frame pointer for leaf functions  |
| -mlow64k                  |                                        |
| -mcsync-anomaly           |                                        |
| -mspecld-anomaly          |                                        |
| -mid-shared-library       |                                        |
| -mleaf-id-shared-library  |                                        |
| -mshared-library-id=      |                                        |
| -msep-data                | Enable separate data segment           |
| -mlong-calls              | Use indirect calls                     |
| -mfast-fp                 |                                        |
| -mfdpic                   |                                        |
| -minline-plt              |                                        |
| -mstack-check-l1          | Do stack checking in L1 scratch memory |
| -mmulticore               | Enable multicore support               |
| -mcorea                   | Build for Core A                       |
| -mcoreb                   | Build for Core B                       |
| -msdram                   | Build for SDRAM                        |
| -micplb                   | Assume ICPLBs are enabled at runtime   |
    245