This document describes how malloc / new calls are routed in the various Chrome
platforms.

Bear in mind that the Chromium codebase does not always just use `malloc()`.
Some examples:
 - Large parts of the renderer (Blink) use two home-brewed allocators,
   PartitionAlloc and BlinkGC (Oilpan).
 - Some subsystems, such as the V8 JavaScript engine, handle memory management
   autonomously.
 - Various parts of the codebase use abstractions such as `SharedMemory` or
   `DiscardableMemory` which, similarly to the above, have their own page-level
   memory management.

Background
----------
The `allocator` target defines at compile-time the platform-specific choice of
the allocator and the extra hooks that service calls to malloc/new. The relevant
build-time flags involved are `use_allocator` and `win_use_allocator_shim`.

The default choices are as follows:

**Windows**  
`use_allocator: winheap`, the default Windows heap.
Additionally, `static_library` (i.e. non-component) builds have a shim
layer wrapping malloc/new, which is controlled by `win_use_allocator_shim`.  
The shim layer provides extra security features, such as preventing large
allocations that can hit signed vs. unsigned bugs in third_party code.

**Linux Desktop / CrOS**  
`use_allocator: tcmalloc`, a forked copy of tcmalloc which resides in
`third_party/tcmalloc/chromium`. Setting `use_allocator: none` causes the build
to fall back to the system (Glibc) symbols.

**Android**  
`use_allocator: none`, always use the allocator symbols coming from Android's
libc (Bionic). As it is developed as part of the OS, it is considered to be
optimized for small devices and more memory-efficient than other choices.  
The actual implementation backing malloc symbols in Bionic is up to the board
config and can vary (typically *dlmalloc* or *jemalloc* on most Nexus devices).

**Mac/iOS**  
`use_allocator: none`, we always use the system's allocator implementation.

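The Windows shim's guard against oversized allocations can be pictured with a
minimal C++ sketch. The cap of `INT_MAX` bytes and the names
`kSketchMaxAllocSize` and `SketchGuardedMalloc` are assumptions made for
illustration; this document does not state the actual limit or how the shim
enforces it. The idea is only that a request too large to fit in a signed
`int` is refused before third-party code can narrow it to a negative value.

```cpp
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical cap (an assumption, not the real shim's limit): refuse any
// request that would not fit in a signed int, so code that later narrows the
// size to int never sees a negative number.
constexpr std::size_t kSketchMaxAllocSize =
    static_cast<std::size_t>(std::numeric_limits<int>::max());

// Hypothetical guarded entry point: returns nullptr for oversized requests,
// otherwise forwards to the system allocator.
void* SketchGuardedMalloc(std::size_t size) {
  if (size > kSketchMaxAllocSize)
    return nullptr;
  return std::malloc(size);
}
```
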
In addition, when building for `asan` / `msan` / `syzyasan` / `valgrind`,
both the allocator and the shim layer are disabled.

Layering and build deps
-----------------------
The `allocator` target provides both the source files for tcmalloc (where
applicable) and the linker flags required for the Windows shim layer.
The `base` target is (almost) the only one depending on `allocator`. No other
targets should depend on it, with the exception of the very few executables /
dynamic libraries that don't depend, either directly or indirectly, on `base`
within the scope of a linker unit.

More importantly, **no other place outside of `/base` should depend on the
specific allocator** (e.g., directly include `third_party/tcmalloc`).
If such a functional dependency is required, it should be achieved using
abstractions in `base` (see `/base/allocator/allocator_extension.h` and
`/base/memory/`).

**Why does `base` depend on `allocator`?**  
Because it needs to provide services that depend on the actual allocator
implementation. In the past `base` used to pretend to be allocator-agnostic
and get the dependencies injected by other layers. This ended up being an
inconsistent mess.
See the [allocator cleanup doc][url-allocator-cleanup] for more context.

Linker unit targets (executables and shared libraries) that depend in some way
on `base` (most of the targets in the codebase) automatically get the correct
set of linker flags to pull in tcmalloc or the Windows shim layer.


Source code
-----------
This directory contains just the allocator (i.e. shim) layer that switches
between the different underlying memory allocation implementations.

The tcmalloc library originates outside of Chromium and exists in
`../../third_party/tcmalloc` (currently, the actual location is defined in the
allocator.gyp file). The third party sources use a vendor-branch SCM pattern to
track Chromium-specific changes independently from upstream changes.

The general intent is to push local changes upstream so that over
time we no longer need any forked files.


Unified allocator shim
----------------------
On most platforms, Chrome overrides the malloc / operator new symbols (and
corresponding free / delete and other variants). This is to enforce security
checks and, more recently, to enable the
[memory-infra heap profiler][url-memory-infra-heap-profiler].  
Historically each platform had its own special logic for defining the allocator
symbols in different places of the codebase. The unified allocator shim is
a project that aims to unify the symbol definition and allocator routing logic
in a central place.

 - Full documentation: [Allocator shim design doc][url-allocator-shim].
 - Current state: Available and enabled by default on Linux, CrOS and Android.
 - Tracking bug: [https://crbug.com/550886](https://crbug.com/550886).
 - Build-time flag: `use_experimental_allocator_shim`.

**Overview of the unified allocator shim**  
The allocator shim consists of three stages:
```
+-------------------------+    +-----------------------+    +----------------+
|     malloc & friends    | -> |       shim layer      | -> |   Routing to   |
|    symbols definition   |    |     implementation    |    |    allocator   |
+-------------------------+    +-----------------------+    +----------------+
| - libc symbols (malloc, |    | - Security checks     |    | - tcmalloc     |
|   calloc, free, ...)    |    | - Chain of dispatchers|    | - glibc        |
| - C++ symbols (operator |    |   that can intercept  |    | - Android      |
|   new, delete, ...)     |    |   and override        |    |   bionic       |
| - glibc weak symbols    |    |   allocations         |    | - WinHeap      |
|   (__libc_malloc, ...)  |    +-----------------------+    +----------------+
+-------------------------+
```

**1. malloc symbols definition**  
This stage takes care of overriding the symbols `malloc`, `free`,
`operator new`, `operator delete` and friends and routing those calls into the
allocator shim (next point).
This is taken care of by the headers in `allocator_shim_override_*`.

*On Linux/CrOS*: the allocator symbols are defined as exported global symbols
in `allocator_shim_override_libc_symbols.h` (for `malloc`, `free` and friends)
and in `allocator_shim_override_cpp_symbols.h` (for `operator new`,
`operator delete` and friends).
This enables proper interposition of malloc symbols referenced by the main
executable and any third party libraries. Symbol resolution on Linux is a
breadth-first search that starts from the root link unit, that is, the
executable
(see EXECUTABLE AND LINKABLE FORMAT (ELF) - Portable Formats Specification).
Additionally, when tcmalloc is the default allocator, some extra glibc symbols
are also defined in `allocator_shim_override_glibc_weak_symbols.h`, for subtle
reasons explained in that file.
The Linux/CrOS shim was introduced by
[crrev.com/1675143004](https://crrev.com/1675143004).

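As a rough illustration of the C++ half of this stage, the sketch below
replaces the global `operator new` / `operator delete` and forwards them to a
shim entry point. It relies only on the standard C++ rule that these functions
are replaceable by the program. The names `SketchShimNew`, `SketchShimDelete`
and `g_sketch_new_calls` are hypothetical stand-ins and are not taken from the
actual `allocator_shim_override_cpp_symbols.h`; the counter exists only to make
the routing observable and is not thread-safe.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, used only to observe that calls reach the shim.
std::size_t g_sketch_new_calls = 0;

// Hypothetical shim entry points standing in for the real shim layer.
void* SketchShimNew(std::size_t size) {
  ++g_sketch_new_calls;
  if (size == 0)
    size = 1;  // operator new must return a distinct non-null pointer.
  void* ptr = std::malloc(size);
  if (!ptr)
    throw std::bad_alloc();
  return ptr;
}

void SketchShimDelete(void* ptr) noexcept {
  std::free(ptr);
}

// Replacement global allocation functions: every `new` / `delete` in the
// program, including in linked libraries that resolve to these symbols, is
// now routed through the shim entry points above.
void* operator new(std::size_t size) {
  return SketchShimNew(size);
}
void operator delete(void* ptr) noexcept {
  SketchShimDelete(ptr);
}
void operator delete(void* ptr, std::size_t) noexcept {
  SketchShimDelete(ptr);
}
```
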
*On Android*: load-time symbol interposition (unlike the Linux/CrOS case) is not
possible. This is because Android processes are `fork()`-ed from the Android
zygote, which pre-loads libc.so and only later native code gets loaded via
`dlopen()` (symbols from `dlopen()`-ed libraries get a different resolution
scope).
In this case, the approach is instead to wrap symbol resolution at link time
(i.e. during the build), via the `-Wl,-wrap,malloc` linker flag.
The use of this wrapping flag causes:
 - All references to allocator symbols in the Chrome codebase to be rewritten as
   references to `__wrap_malloc` and friends. The `__wrap_malloc` symbols are
   defined in `allocator_shim_override_linker_wrapped_symbols.h` and
   route allocator calls into the shim layer.
 - References to the original `malloc` symbols (which are typically defined by
   the system's libc.so) to be accessible via the special `__real_malloc` and
   friends symbols (which will be relocated, at load time, against `malloc`).

In summary, this approach is transparent to the dynamic loader, which still sees
undefined symbol references to malloc symbols.
These symbols will be resolved against libc.so as usual.
More details in [crrev.com/1719433002](https://crrev.com/1719433002).

**2. Shim layer implementation**  
This stage contains the actual shim implementation. This consists of:
- A singly linked list of dispatchers (structs with function pointers to
`malloc`-like functions). Dispatchers can be dynamically inserted at runtime
(using the `InsertAllocatorDispatch` API). They can intercept and override
allocator calls.
- The security checks (terminating the process on malloc failure via
`std::new_handler`, etc).

This happens inside `allocator_shim.cc`.

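The dispatcher chain can be sketched in a few lines of C++. This is a
simplified model under stated assumptions, not the code in
`allocator_shim.cc`: the struct layout and every name here (`SketchDispatch`,
`SketchInsertDispatch`, `SketchShimMalloc`, and so on) are hypothetical, only
`malloc`/`free` are modelled, and the chain head is a plain pointer with no
thread-safety. Each dispatcher receives itself as an argument so that it can
either handle the call or delegate to `self->next`; the last element forwards
to the underlying allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical dispatcher: function pointers to malloc-like functions plus a
// link to the next dispatcher in the chain.
struct SketchDispatch {
  void* (*alloc_fn)(const SketchDispatch* self, std::size_t size);
  void (*free_fn)(const SketchDispatch* self, void* ptr);
  const SketchDispatch* next;
};

// Terminal dispatcher: forwards to the system allocator and ends the chain.
void* SystemAlloc(const SketchDispatch*, std::size_t size) {
  return std::malloc(size);
}
void SystemFree(const SketchDispatch*, void* ptr) {
  std::free(ptr);
}
const SketchDispatch g_system_dispatch = {&SystemAlloc, &SystemFree, nullptr};

// Head of the chain (not thread-safe in this sketch).
const SketchDispatch* g_chain_head = &g_system_dispatch;

// Pushes a dispatcher in front of the current head.
void SketchInsertDispatch(SketchDispatch* dispatch) {
  assert(dispatch != nullptr);
  dispatch->next = g_chain_head;
  g_chain_head = dispatch;
}

// Entry points that the overridden malloc/free symbols would call.
void* SketchShimMalloc(std::size_t size) {
  return g_chain_head->alloc_fn(g_chain_head, size);
}
void SketchShimFree(void* ptr) {
  g_chain_head->free_fn(g_chain_head, ptr);
}

// Example interceptor: counts allocations, then delegates down the chain.
std::size_t g_alloc_count = 0;
void* CountingAlloc(const SketchDispatch* self, std::size_t size) {
  ++g_alloc_count;
  return self->next->alloc_fn(self->next, size);
}
void CountingFree(const SketchDispatch* self, void* ptr) {
  self->next->free_fn(self->next, ptr);
}
SketchDispatch g_counting_dispatch = {&CountingAlloc, &CountingFree, nullptr};
```
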
**3. Final allocator routing**  
The final element of the aforementioned dispatcher chain is statically defined
at build time and ultimately routes the allocator calls to the actual allocator
(as described in the *Background* section above). This is taken care of by the
`allocator_shim_default_dispatch_to_*` files.


Appendixes
----------
**How does the Windows shim layer replace the malloc symbols?**  
The mechanism for hooking LIBCMT in Windows is rather tricky. The core
problem is that by default, the Windows library does not declare malloc and
free as weak symbols. Because of this, they cannot be overridden. To work
around this, we start with LIBCMT.LIB and manually remove all allocator-related
functions from it using the Visual Studio library tool. Once they are removed,
we can link against the library and provide custom versions of the
allocator-related functionality.
See the script `prep_libc.py` in this folder.

Related links
-------------
- [Unified allocator shim doc - Feb 2016][url-allocator-shim]
- [Allocator cleanup doc - Jan 2016][url-allocator-cleanup]
- [Proposal to use PartitionAlloc as default allocator](https://crbug.com/339604)
- [Memory-Infra: Tools to profile memory usage in Chrome](components/tracing/docs/memory_infra.md)

[url-allocator-cleanup]: https://docs.google.com/document/d/1V77Kgp_4tfaaWPEZVxNevoD02wXiatnAv7Ssgr0hmjg/edit?usp=sharing
[url-memory-infra-heap-profiler]: components/tracing/docs/heap_profiler.md
[url-allocator-shim]: https://docs.google.com/document/d/1yKlO1AO4XjpDad9rjcBOI15EKdAGsuGO_IeZy0g0kxo/edit?usp=sharing