============
CMake Primer
============

.. contents::
   :local:

.. warning::
   Disclaimer: This documentation is written by LLVM project contributors, *not*
   anyone affiliated with the CMake project. This document may contain
   inaccurate terminology, phrasing, or technical details. It is provided with
   the best intentions.


Introduction
============

The LLVM project and many of the core projects built on LLVM build using CMake.
This document aims to provide a brief overview of CMake for developers modifying
LLVM projects or building their own projects on top of LLVM.

The official CMake language reference is available in the cmake-language
manpage and `cmake-language online documentation
<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.

10,000 ft View
==============

CMake is a tool that reads script files in its own language that describe how a
software project builds. As CMake evaluates the scripts, it constructs an
internal representation of the software project. Once the scripts have been
fully processed, if there are no errors, CMake will generate build files to
actually build the project. CMake supports generating build files for a variety
of command line build tools as well as for popular IDEs.

When a user runs CMake, it performs a variety of checks similar to how autoconf
worked historically. During the checks and the evaluation of the build
description scripts, CMake caches values into the CMakeCache. This is useful
because it allows the build system to skip long-running checks during
incremental development. CMake caching also has some drawbacks, but that will be
discussed later.

Scripting Overview
==================

CMake's scripting language has a very simple grammar. Every language construct
is a command that matches the pattern ``name(args)``. Commands come in three
primary types: language-defined (commands implemented in C++ in CMake), defined
functions, and defined macros. The CMake distribution also contains a suite of
CMake modules that contain definitions for useful functionality.

The example below is the full CMake build for building a C++ "Hello World"
program. The example uses only CMake language-defined functions.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)

The CMake language provides control flow constructs in the form of foreach loops
and if blocks. To make the example above more complicated you could add an if
block to define "APPLE" when targeting Apple platforms:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)
   if(APPLE)
     target_compile_definitions(HelloWorld PUBLIC APPLE)
   endif()

Variables, Types, and Scope
===========================

Dereferencing
-------------

In CMake, variables are "stringly" typed. All variables are represented as
strings throughout evaluation. Wrapping a variable name in ``${}`` dereferences
it, which results in a literal substitution of the value for the name. CMake
refers to this as "variable evaluation" in their documentation. Dereferences are
performed *before* the command being called receives the arguments. This means
dereferencing a list results in multiple separate arguments being passed to the
command.
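
A minimal sketch of this behavior is shown below. The ``count_args`` function
and the variable names are hypothetical and exist only for illustration; the
script is intended to be runnable with ``cmake -P``. An unquoted dereference of
a list expands to one argument per element, while a quoted dereference is passed
as a single argument:

.. code-block:: cmake

   # Hypothetical helper: reports how many arguments it received after the
   # output variable name. ARGC counts every argument, including out_var.
   function(count_args out_var)
     math(EXPR count "${ARGC} - 1")
     set(${out_var} ${count} PARENT_SCOPE)
   endfunction()

   set(sources a.cpp b.cpp c.cpp)

   count_args(unquoted_count ${sources})   # expands to three arguments
   count_args(quoted_count "${sources}")   # one argument: "a.cpp;b.cpp;c.cpp"

   message("unquoted: ${unquoted_count}, quoted: ${quoted_count}")
   # prints: unquoted: 3, quoted: 1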

Variable dereferences can be nested and be used to model complex data. For
example:

.. code-block:: cmake

   set(var_name var1)
   set(${var_name} foo) # same as "set(var1 foo)"
   set(${${var_name}}_var bar) # same as "set(foo_var bar)"

Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to conditionally set a variable, knowing that it will also be
used in code paths where the variable isn't set. There are examples of this
throughout the LLVM CMake build system.

An example of variable empty expansion is:

.. code-block:: cmake

   if(APPLE)
     set(extra_sources Apple.cpp)
   endif()
   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})

In this example the ``extra_sources`` variable is only defined if you're
targeting an Apple platform. For all other targets ``extra_sources`` will be
evaluated as empty before ``add_executable`` is given its arguments.

One big "Gotcha" with variable dereferencing is that ``if`` commands implicitly
dereference values. This has some unexpected results. For example:

.. code-block:: cmake

   if("${SOME_VAR}" STREQUAL "MSVC")

In this code sample ``MSVC`` will be implicitly dereferenced, which will result
in the if command comparing the value of the dereferenced variables ``SOME_VAR``
and ``MSVC``. A common workaround for this problem is to prepend an ``x`` to the
strings being compared.

.. code-block:: cmake

   if("x${SOME_VAR}" STREQUAL "xMSVC")

This works because while ``MSVC`` is a defined variable, ``xMSVC`` is not. This
pattern is uncommon, but it does occur in LLVM's CMake scripts.

.. note::

   Once the LLVM project upgrades its minimum CMake version to 3.1 or later we
   can prevent this behavior by setting CMP0054 to ``NEW``. For more information
   on CMake policies please see the cmake-policies manpage or the
   `cmake-policies online documentation
   <https://cmake.org/cmake/help/v3.4/manual/cmake-policies.7.html>`_.
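
As a rough sketch of the policy mentioned in the note, the script below opts in
to the ``NEW`` behavior of CMP0054 explicitly. It assumes CMake 3.1 or later,
and the values assigned to ``MSVC`` and ``SOME_VAR`` are made up for
illustration. With the policy set to ``NEW``, quoted arguments to ``if`` are
treated as plain strings and are not dereferenced a second time:

.. code-block:: cmake

   cmake_policy(SET CMP0054 NEW)

   set(MSVC 1)         # illustrative stand-in for a defined variable
   set(SOME_VAR "1")   # happens to equal the *value* of MSVC

   if("${SOME_VAR}" STREQUAL "MSVC")
     set(matched TRUE)
   else()
     set(matched FALSE)
   endif()

   # With CMP0054 set to NEW the comparison is between the strings "1" and
   # "MSVC", so this prints "matched: FALSE".
   message("matched: ${matched}")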

Lists
-----

In CMake, lists are semi-colon delimited strings, and it is strongly advised
that you avoid using semi-colons in list elements; it doesn't go smoothly. A few
examples of defining lists:

.. code-block:: cmake

   # Creates a list with members a, b, c, and d
   set(my_list a b c d)
   set(my_list "a;b;c;d")

   # Creates a string "a b c d"
   set(my_string "a b c d")
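
To make the distinction concrete, the sketch below uses the ``list`` command to
inspect these values. It is only an illustration (the variable names are
arbitrary) and is intended to be runnable with ``cmake -P``:

.. code-block:: cmake

   set(list_from_args a b c d)
   set(list_from_string "a;b;c;d")
   set(my_string "a b c d")

   list(LENGTH list_from_args args_length)
   list(LENGTH list_from_string string_list_length)
   list(LENGTH my_string plain_string_length)

   # Both list forms have four elements; the space-separated string is a
   # list with a single element.
   message("${args_length} ${string_list_length} ${plain_string_length}")
   # prints: 4 4 1

   # Appending simply extends the semi-colon delimited string.
   list(APPEND list_from_args e)
   message("${list_from_args}")
   # prints: a;b;c;d;e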

Lists of Lists
--------------

One of the more complicated patterns in CMake is lists of lists. Because a list
cannot contain an element with a semi-colon, to construct a list of lists you
make a list of variable names that refer to other lists. For example:

.. code-block:: cmake

   set(list_of_lists a b c)
   set(a 1 2 3)
   set(b 4 5 6)
   set(c 7 8 9)

With this layout you can iterate through the list of lists printing each value
with the following code:

.. code-block:: cmake

   foreach(list_name IN LISTS list_of_lists)
     foreach(value IN LISTS ${list_name})
       message(${value})
     endforeach()
   endforeach()

You'll notice that the inner foreach loop's list is doubly dereferenced. This is
because the first dereference turns ``list_name`` into the name of the sub-list
(a, b, or c in the example), then ``IN LISTS`` performs the second dereference
to get the value of the list.

This pattern is used throughout CMake; the most common example is the compiler
flags options, which CMake refers to using the following variable expansions:
``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.
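
The same composition of variable names can be sketched as follows. The flag
values and the ``lang`` variable are made up for illustration (in a real project
CMake and the user populate these variables), and the build type is upper-cased
here on the assumption that the per-configuration variables use upper-case
configuration names:

.. code-block:: cmake

   # Illustrative values only.
   set(CMAKE_CXX_FLAGS "-Wall")
   set(CMAKE_CXX_FLAGS_RELEASE "-O3")
   set(CMAKE_BUILD_TYPE Release)

   set(lang CXX)
   string(TOUPPER "${CMAKE_BUILD_TYPE}" build_type_upper)

   # The inner dereferences build the variable name, the outer one reads it.
   set(all_flags
       "${CMAKE_${lang}_FLAGS} ${CMAKE_${lang}_FLAGS_${build_type_upper}}")
   message("${all_flags}")
   # prints: -Wall -O3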

Other Types
-----------

Variables that are cached or specified on the command line can have types
associated with them. The variable's type is used by CMake's UI tool to display
the right input field. The variable's type generally doesn't impact evaluation.
One of the few exceptions is PATH variables, which CMake does have some special
handling for. You can read more about the special handling in `CMake's set
documentation
<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.

Scope
-----

CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
Variables set in a CMake module that is included in a CMakeLists file will be
set in the scope they are included from, and all subdirectories.

When a variable that is already set is set again in a subdirectory, it overrides
the value in that scope and any deeper subdirectories.

The CMake ``set`` command provides two scope-related options. ``PARENT_SCOPE``
sets a variable into the parent scope, and not the current scope. The ``CACHE``
option sets the variable in the CMakeCache, which results in it being set in all
scopes. The ``CACHE`` option will not set a variable that already exists in the
cache unless the ``FORCE`` option is specified.
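
A brief sketch of the ``CACHE`` and ``FORCE`` behavior follows.
``EXAMPLE_ENABLE_FOO`` is a hypothetical option name used only for illustration,
and the fragment is meant to be read as part of a CMakeLists file rather than
run on its own:

.. code-block:: cmake

   # Creates the cache entry with a BOOL type if it does not exist yet. If the
   # user already passed -DEXAMPLE_ENABLE_FOO=ON, the cached value is kept.
   set(EXAMPLE_ENABLE_FOO OFF CACHE BOOL "Enable the hypothetical Foo feature")

   # FORCE overwrites whatever value is currently in the cache.
   set(EXAMPLE_ENABLE_FOO ON CACHE BOOL "Enable the hypothetical Foo feature" FORCE)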

In addition to directory-based scope, CMake functions also have their own scope.
This means variables set inside functions do not bleed into the parent scope.
This is not true of macros, and it is for this reason LLVM prefers functions
over macros whenever reasonable.

.. note::
  Unlike C-based languages, CMake's loop and control flow blocks do not have
  their own scopes.
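
The function scoping rules can be sketched as below. The function and variable
names are hypothetical; the script is intended to be runnable with
``cmake -P``:

.. code-block:: cmake

   function(set_locally)
     # Only visible inside this function call.
     set(result "local value")
   endfunction()

   function(set_in_caller)
     # PARENT_SCOPE writes to the caller's scope instead of the function's.
     set(result "value from function" PARENT_SCOPE)
   endfunction()

   set(result "original")

   set_locally()
   message("${result}")
   # prints: original

   set_in_caller()
   message("${result}")
   # prints: value from function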

Control Flow
============

CMake features the same basic control flow constructs you would expect in any
scripting language, but there are a few quirks because, as with everything in
CMake, control flow constructs are commands.

If, ElseIf, Else
----------------

.. note::
  For the full documentation on the CMake if command go
  `here <https://cmake.org/cmake/help/v3.4/command/if.html>`_. That resource is
  far more complete.

In general CMake if blocks work the way you'd expect:

.. code-block:: cmake

  if(<condition>)
    # do stuff
  elseif(<condition>)
    # do other stuff
  else()
    # do other other stuff
  endif()

The single most important thing to know about CMake's if blocks coming from a C
background is that they do not have their own scope. Variables set inside
conditional blocks persist after the ``endif()``.
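
A small illustration of this, using an arbitrary variable name and intended to
be runnable with ``cmake -P``:

.. code-block:: cmake

   if(TRUE)
     set(set_in_if "still here")
   endif()

   # The variable set inside the block is visible after endif().
   message("${set_in_if}")
   # prints: still here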

Loops
-----

The most common form of the CMake ``foreach`` block is:

.. code-block:: cmake

  foreach(var ...)
    # do stuff
  endforeach()

The variable argument portion of the ``foreach`` block can contain dereferenced
lists, values to iterate, or a mix of both:

.. code-block:: cmake

  foreach(var foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var ${my_list})
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var ${my_list} out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

There is also a more modern CMake foreach syntax. The code below is equivalent
to the code above:

.. code-block:: cmake

  foreach(var IN ITEMS foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var IN LISTS my_list)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var IN LISTS my_list ITEMS out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

Similar to the conditional statements, these generally behave how you would
expect, and they do not have their own scope.

CMake also supports ``while`` loops, although they are not widely used in LLVM.
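
For completeness, a minimal ``while`` loop might look like the sketch below.
The variable names are arbitrary, and the script is intended to be runnable with
``cmake -P``. Note that the loop condition uses the same implicit dereferencing
as ``if``:

.. code-block:: cmake

   set(counter 0)
   set(visited "")
   while(counter LESS 3)
     list(APPEND visited ${counter})
     # math() evaluates an integer expression and stores the result.
     math(EXPR counter "${counter} + 1")
   endwhile()

   message("${visited}")
   # prints: 0;1;2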

Modules, Functions and Macros
=============================

Modules
-------

Modules are CMake's vehicle for enabling code reuse. CMake modules are just
CMake script files. They can contain code to execute on include as well as
definitions for commands.

In CMake, macros and functions are universally referred to as commands, and they
are the primary method of defining code that can be called multiple times.

In LLVM we have several CMake modules that are included as part of our
distribution for developers who don't build our project from source. Those
modules are the fundamental pieces needed to build LLVM-based projects with
CMake. We also rely on modules as a way of organizing the build system's
functionality for maintainability and re-use within LLVM projects.

Argument Handling
-----------------

When defining a CMake command, handling arguments is very useful. The examples
in this section will all use the CMake ``function`` block, but this all applies
to the ``macro`` block as well.

CMake commands can have named arguments, but all commands are implicitly
variadic. If the command has named arguments they are required and must
be specified at every call site. Below is a trivial example of providing a
wrapper function for CMake's built-in function ``add_dependencies``.

.. code-block:: cmake

   function(add_deps target)
     add_dependencies(${target} ${ARGN})
   endfunction()

This example defines a new function named ``add_deps`` which takes a required
first argument, and just calls another function passing through the first
argument and all trailing arguments. CMake defines the trailing arguments (those
after the named arguments) in a list named ``ARGN``, the full argument list
including the named arguments in ``ARGV``, and the count of all arguments in
``ARGC``.
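
The sketch below prints the implicit argument variables that CMake defines
inside a function: ``ARGC`` is the total number of arguments, ``ARGV`` is the
list of all arguments, and ``ARGN`` is the list of arguments after the named
ones. ``show_args`` is a hypothetical function used only for illustration, and
the script is intended to be runnable with ``cmake -P``:

.. code-block:: cmake

   function(show_args first)
     message("ARGC: ${ARGC}")     # number of arguments passed
     message("ARGV: ${ARGV}")     # every argument, named or not
     message("ARGN: ${ARGN}")     # arguments after the named ones
     message("first: ${first}")   # the named argument
   endfunction()

   show_args(a b c)
   # prints:
   #  ARGC: 3
   #  ARGV: a;b;c
   #  ARGN: b;c
   #  first: a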

CMake provides a module ``CMakeParseArguments`` which provides an implementation
of advanced argument parsing. We use this all over LLVM, and it is recommended
for any function that has complex argument-based behaviors or optional
arguments. CMake's official documentation for the module is in the
``cmake-modules`` manpage, and is also available at the
`cmake-modules online documentation
<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.

.. note::
  As of CMake 3.5 the cmake_parse_arguments command has become a native command
  and the CMakeParseArguments module is empty and only left around for
  compatibility.
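
A minimal sketch of ``cmake_parse_arguments`` in use is shown below. The
``describe_library`` function and its keywords are hypothetical and are not part
of LLVM's build system; the script is intended to be runnable with
``cmake -P``:

.. code-block:: cmake

   include(CMakeParseArguments)   # only needed before CMake 3.5

   # Hypothetical wrapper used only to demonstrate the parsing.
   function(describe_library name)
     cmake_parse_arguments(ARG
       "SHARED"            # options (flags without a value)
       "OUTPUT_NAME"       # keywords that take one value
       "SOURCES;DEPENDS"   # keywords that take multiple values
       ${ARGN})

     message("name: ${name}")
     message("shared: ${ARG_SHARED}")
     message("output name: ${ARG_OUTPUT_NAME}")
     message("sources: ${ARG_SOURCES}")
     message("depends: ${ARG_DEPENDS}")
   endfunction()

   describe_library(example SHARED
     OUTPUT_NAME example_lib
     SOURCES a.cpp b.cpp
     DEPENDS other_target)
   # prints:
   #  name: example
   #  shared: TRUE
   #  output name: example_lib
   #  sources: a.cpp;b.cpp
   #  depends: other_target

Each parsed keyword is stored in a variable named after the prefix (``ARG``
here) followed by the keyword, and arguments that match no keyword are collected
in ``ARG_UNPARSED_ARGUMENTS``.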

Functions Vs Macros
-------------------

Functions and Macros look very similar in how they are used, but there is one
fundamental difference between the two. Functions have their own scope, and
macros don't. This means variables set in macros will bleed out into the calling
scope. That makes macros suitable for defining very small bits of functionality
only.
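
The difference can be sketched with two otherwise identical commands. The names
are hypothetical, and the script is intended to be runnable with ``cmake -P``:

.. code-block:: cmake

   macro(set_in_macro)
     set(macro_var "set by macro")
   endmacro()

   function(set_in_function)
     set(function_var "set by function")
   endfunction()

   set_in_macro()
   set_in_function()

   # The macro's variable leaked into this scope; the function's did not.
   message("macro_var: '${macro_var}'")
   message("function_var: '${function_var}'")
   # prints:
   #  macro_var: 'set by macro'
   #  function_var: ''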

The other difference between CMake functions and macros is how arguments are
passed. Arguments to macros are not set as variables; instead, dereferences to
the parameters are resolved across the macro before executing it. This can
result in some unexpected behavior if using non-dereferenced variables. For
example:

.. code-block:: cmake

   macro(print_list my_list)
     foreach(var IN LISTS my_list)
       message("${var}")
     endforeach()
   endmacro()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # a
   # b
   # c
   # d

Generally speaking this issue is uncommon because it requires using
non-dereferenced variables with names that overlap in the parent scope, but it
is important to be aware of because it can lead to subtle bugs.

LLVM Project Wrappers
=====================

LLVM projects provide lots of wrappers around critical CMake built-in commands.
We use these wrappers to provide consistent behaviors across LLVM components
and to reduce code duplication.

We generally (but not always) follow the convention that commands prefaced with
``llvm_`` are intended to be used only as building blocks for other commands.
Wrapper commands that are intended for direct use are generally named with the
project in the middle of the command name (e.g. ``add_llvm_executable``
is the wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are
all defined in ``AddLLVM.cmake`` which is installed as part of the LLVM
distribution. It can be included and used by any LLVM sub-project that requires
LLVM.

.. note::

   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
   can be built without LLVM, and the compiler-rt sanitizer libraries are used
   with GCC.
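
As a rough sketch, an out-of-tree project might pull in the wrappers as shown
below. The target and source file names are hypothetical, the fragment assumes
an installed LLVM whose package configuration defines ``LLVM_CMAKE_DIR``, and
the details may differ between LLVM versions:

.. code-block:: cmake

   # Locate an installed LLVM through its CMake package configuration.
   find_package(LLVM REQUIRED CONFIG)

   # Make the modules shipped with LLVM visible to include().
   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
   include(AddLLVM)

   # Wrapper around add_executable; the names here are placeholders.
   add_llvm_executable(example-tool ExampleTool.cpp)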

Useful Built-in Commands
========================

CMake has a bunch of useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. To highlight a few useful functions, see:

* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_

The full documentation for CMake commands is in the ``cmake-commands`` manpage
and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.
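
As a quick taste of three of the commands listed above, the sketch below
combines ``list``, ``math``, and ``string``. The component names are arbitrary
sample data, and the script is intended to be runnable with ``cmake -P``:

.. code-block:: cmake

   set(components core support analysis)

   # list: query and modify lists in place.
   list(LENGTH components component_count)
   list(REVERSE components)

   # math: integer arithmetic on string values.
   math(EXPR doubled "${component_count} * 2")

   # string: text manipulation.
   string(TOUPPER "${components}" upper_components)
   string(REPLACE ";" ", " pretty_components "${upper_components}")

   message("${component_count} ${doubled} ${pretty_components}")
   # prints: 3 6 ANALYSIS, SUPPORT, CORE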