# base/containers library

## What goes here

This directory contains some STL-like containers.

Things should be moved here that are generally applicable across the code base.
Don't add things here just because you need them in one place and think others
may someday want something similar. You can put specialized containers in
your component's directory and we can promote them here later if we feel there
is broad applicability.

### Design and naming

Containers should adhere as closely to STL as possible. Functions and behaviors
not present in STL should only be added when they are related to the specific
data structure implemented by the container.

For STL-like containers our policy is that they should use STL-like naming even
when it may conflict with the style guide. So functions and class names should
be lower case with underscores. Non-STL-like classes and functions should use
Google naming. Be sure to use the base namespace.

## Map and set selection

### Usage advice

  * Generally avoid **std::unordered\_set** and **std::unordered\_map**. In the
    common case, query performance is unlikely to be sufficiently higher than
    std::map to make a difference, insert performance is slightly worse, and
    the memory overhead is high. These containers make sense mostly for large
    tables where you expect a lot of lookups.

  * Most maps and sets in Chrome are small and contain objects that can be
    moved efficiently. In this case, consider **base::flat\_map** and
    **base::flat\_set**. You need to be aware of the maximum expected size of
    the container since individual inserts and deletes are O(n), giving O(n^2)
    construction time for the entire map. But because it avoids mallocs in most
    cases, inserts are better or comparable to other containers even for
    several dozen items, and efficiently-moved types are unlikely to have
    performance problems for most cases until you have hundreds of items. If
    your container can be constructed in one shot, the constructor from vector
    gives O(n log n) construction times and it should be strictly better than
    a std::map.

  * **base::small\_map** has better runtime memory usage without the poor
    mutation performance of large containers that base::flat\_map has. But this
    advantage is partially offset by additional code size. Prefer it in cases
    where you make many objects so that the code/heap tradeoff is good.

  * Use **std::map** and **std::set** if you can't decide. Even if they're not
    great, they're unlikely to be bad or surprising.

### Map and set details

Sizes are for 64-bit platforms. Stable iterators aren't invalidated when the
container is mutated.

| Container                                | Empty size            | Per-item overhead | Stable iterators? |
|:---------------------------------------- |:--------------------- |:----------------- |:----------------- |
| std::map, std::set                       | 16 bytes              | 32 bytes          | Yes               |
| std::unordered\_map, std::unordered\_set | 128 bytes             | 16-24 bytes       | No                |
| base::flat\_map and base::flat\_set      | 24 bytes              | 0 (see notes)     | No                |
| base::small\_map                         | 24 bytes (see notes)  | 32 bytes          | No                |

**Takeaways:** std::unordered\_map and std::unordered\_set have high
overhead for small container sizes, so prefer these only for larger workloads.

Code size comparisons for a block of code (see appendix) on Windows using
strings as keys.

| Container           | Code size  |
|:------------------- |:---------- |
| std::unordered\_map | 1646 bytes |
| std::map            | 1759 bytes |
| base::flat\_map     | 1872 bytes |
| base::small\_map    | 2410 bytes |

**Takeaways:** base::small\_map generates more code because of the inlining of
both brute-force and red-black tree searching. This makes it less attractive
for random one-off uses. But if your code is called frequently, the runtime
memory benefits will be more important. The code sizes of the other maps are
close enough it's not worth worrying about.

### std::map and std::set

A red-black tree. Each inserted item requires the memory allocation of a node
on the heap. Each node contains a left pointer, a right pointer, a parent
pointer, and a "color" for the red-black tree (32 bytes per item on 64-bit
platforms).

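As a rough illustration of where the 32 bytes come from, the sketch below
models a node header with three pointers and a color flag. `RbNodeHeader` is a
hypothetical struct made up for this example, not the layout used by any
particular standard library.

```cpp
#include <cstddef>

// Hypothetical model of a red-black tree node header. Real standard library
// implementations differ in field order, naming, and how they store color.
struct RbNodeHeader {
  RbNodeHeader* left;
  RbNodeHeader* right;
  RbNodeHeader* parent;
  bool is_red;  // The "color"; padded up to pointer alignment.
};

// Three pointers plus one padded flag occupy four pointer-sized slots, which
// is 32 bytes when pointers are 8 bytes wide.
constexpr std::size_t kNodeOverhead = sizeof(RbNodeHeader);
static_assert(kNodeOverhead == 4 * sizeof(void*),
              "expected three pointers plus one padded color slot");
```
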
### std::unordered\_map and std::unordered\_set

A hash table. Implemented on Windows as a std::vector + std::list and in libc++
as the equivalent of a std::vector + a std::forward\_list. Both implementations
allocate an 8-entry hash table (containing iterators into the list) on
initialization, and grow to 64 entries once 8 items are inserted. Above 64
items, the size doubles every time the load factor exceeds 1.

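The growth behavior described above can be observed from the outside with
`bucket_count()`. The helper below is a small sketch; `ObserveBucketGrowth` is
a name invented for this example, and the exact sequence it records depends on
the standard library in use.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Inserts `count` integers and records each distinct bucket count seen along
// the way. The sequence is implementation-specific.
std::vector<std::size_t> ObserveBucketGrowth(int count) {
  std::unordered_set<int> set;
  std::vector<std::size_t> history{set.bucket_count()};
  for (int i = 0; i < count; ++i) {
    set.insert(i);
    // A changed bucket count means the table was rehashed by this insert.
    if (set.bucket_count() != history.back())
      history.push_back(set.bucket_count());
  }
  return history;
}
```
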
The empty size is sizeof(std::unordered\_map) = 64 +
the initial hash table size which is 8 pointers. The per-item overhead in the
table above counts the list node (2 pointers on Windows, 1 pointer in libc++),
plus amortizes the hash table assuming a 0.5 load factor on average.

In a microbenchmark on Windows, inserts of 1M integers into a
std::unordered\_set took 1.07x the time of std::set, and queries took 0.67x the
time of std::set. For a typical 4-entry set (the statistical mode of map sizes
in the browser), query performance is identical to std::set and base::flat\_set.
On ARM, unordered\_set performance can be worse because integer division to
compute the bucket is slow, and a few "less than" operations can be faster than
computing a hash depending on the key type. The takeaway is that you should not
default to using unordered maps because "they're faster."

### base::flat\_map and base::flat\_set

A sorted std::vector. Searched via binary search; inserts in the middle require
moving elements to make room. Good cache locality. For large objects and large
set sizes, std::vector's doubling-when-full strategy can waste memory.

Supports efficient construction from a vector of items, which avoids the O(n^2)
cost of inserting each element separately.

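The difference between one-shot construction and repeated inserts can be
sketched with a plain sorted `std::vector`. This is not the base::flat\_map
implementation: `Entry`, `BuildSorted`, `Find`, and `InsertOne` are
hypothetical names used only to illustrate the idea.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a flat map: a vector of pairs kept sorted by key.
using Entry = std::pair<int, std::string>;

// One-shot construction: a single sort plus dedup, O(n log n) overall.
// stable_sort keeps the first of any duplicate keys ahead of later ones, and
// std::unique then drops the later ones.
std::vector<Entry> BuildSorted(std::vector<Entry> items) {
  std::stable_sort(
      items.begin(), items.end(),
      [](const Entry& a, const Entry& b) { return a.first < b.first; });
  items.erase(std::unique(items.begin(), items.end(),
                          [](const Entry& a, const Entry& b) {
                            return a.first == b.first;
                          }),
              items.end());
  return items;
}

// Lookup is a binary search over contiguous memory.
const std::string* Find(const std::vector<Entry>& sorted, int key) {
  auto it = std::lower_bound(
      sorted.begin(), sorted.end(), key,
      [](const Entry& e, int k) { return e.first < k; });
  return (it != sorted.end() && it->first == key) ? &it->second : nullptr;
}

// Single insert: finding the slot is O(log n), but shifting the later
// elements is O(n). Doing this n times gives the O(n^2) build cost.
void InsertOne(std::vector<Entry>& sorted, Entry e) {
  auto it = std::lower_bound(
      sorted.begin(), sorted.end(), e.first,
      [](const Entry& x, int k) { return x.first < k; });
  if (it == sorted.end() || it->first != e.first)
    sorted.insert(it, std::move(e));
}
```
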
The per-item overhead will depend on the underlying std::vector's reallocation
strategy and the memory access pattern. Assuming items are being linearly added,
one would expect it to be 3/4 full, so per-item overhead will be 0.25 *
sizeof(T).

flat\_set/flat\_map support a notion of transparent comparisons. Therefore you
can, for example, look up a base::StringPiece in a set of std::strings without
constructing a temporary std::string. This functionality is based on C++14
extensions to the std::set/std::map interface.

You can find more information about transparent comparisons here:
http://en.cppreference.com/w/cpp/utility/functional/less_void

Example, smart pointer set:

```cpp
// Declare a type alias using base::UniquePtrComparator.
template <typename T>
using UniquePtrSet = base::flat_set<std::unique_ptr<T>,
                                    base::UniquePtrComparator>;

// ...
// Collect data.
std::vector<std::unique_ptr<int>> ptr_vec;
ptr_vec.reserve(5);
std::generate_n(std::back_inserter(ptr_vec), 5, []{
  return std::make_unique<int>(0);
});

// Construct a set.
UniquePtrSet<int> ptr_set(std::move(ptr_vec), base::KEEP_FIRST_OF_DUPES);

// Use raw pointers to lookup keys.
int* ptr = ptr_set.begin()->get();
EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin());
```

Example, `base::flat_map<std::string, int>`:

```cpp
base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2}, {"b", 2}},
                                            base::KEEP_FIRST_OF_DUPES);

// Does not construct temporary strings.
str_to_int.find("c")->second = 3;
str_to_int.erase("c");
// After the erase, find() returns end(); compare iterators, not the value.
EXPECT_EQ(str_to_int.end(), str_to_int.find("c"));

// NOTE: This does construct a temporary string. This happens since if the
// item is not in the container, then it needs to be constructed, which is
// something that transparent comparators don't have to guarantee.
str_to_int["c"] = 3;
```

### base::small\_map

A small inline buffer that is brute-force searched and that overflows into a
full std::map or std::unordered\_map. This gives the memory benefit of
base::flat\_map for small data sizes without the degenerate insertion
performance for large container sizes.

Since instantiations require both code for a std::map and a brute-force search
of the inline container, plus a fancy iterator to cover both cases, code size
is larger.

The initial size in the above table assumes a very small inline table. Because
the storage must be able to hold either the inline buffer or the fallback map,
the actual size will be sizeof(int) + max(sizeof(std::map), sizeof(T) *
inline\_size).

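The inline-buffer-then-overflow idea can be sketched as follows. This is a
simplified illustration and not the base::small\_map API: `TinyMap` and its
methods are hypothetical, it requires default-constructible key and value
types, and it keeps both storages as plain members rather than sharing space
between them.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical sketch: up to N entries live in an inline array that is
// searched linearly; the next new key moves everything into a std::map.
template <typename K, typename V, std::size_t N>
class TinyMap {
 public:
  V* Find(const K& key) {
    if (overflowed_) {
      auto it = map_.find(key);
      return it == map_.end() ? nullptr : &it->second;
    }
    for (std::size_t i = 0; i < size_; ++i) {
      if (inline_[i].first == key)
        return &inline_[i].second;
    }
    return nullptr;
  }

  void InsertOrAssign(const K& key, V value) {
    if (V* existing = Find(key)) {
      *existing = std::move(value);
      return;
    }
    if (!overflowed_ && size_ == N) {
      // Inline buffer is full: migrate the entries to the tree.
      for (std::size_t i = 0; i < size_; ++i)
        map_.emplace(std::move(inline_[i]));
      size_ = 0;
      overflowed_ = true;
    }
    if (overflowed_) {
      map_.emplace(key, std::move(value));
    } else {
      inline_[size_++] = std::make_pair(key, std::move(value));
    }
  }

  bool UsesInlineStorage() const { return !overflowed_; }
  std::size_t size() const { return overflowed_ ? map_.size() : size_; }

 private:
  // A space-conscious implementation could overlay these two storages;
  // they are separate members here to keep the sketch short.
  std::array<std::pair<K, V>, N> inline_{};
  std::size_t size_ = 0;
  bool overflowed_ = false;
  std::map<K, V> map_;
};
```

The brute-force path and the tree path both get instantiated for every
`TinyMap` type, which mirrors the code size cost noted above.
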
## Deque

### Usage advice

Chromium code should always use `base::circular_deque` or `base::queue` in
preference to `std::deque` or `std::queue` due to memory usage and platform
variation.

The `base::circular_deque` implementation (and the `base::queue` which uses it)
provides performance that is consistent across platforms and that better
matches most programmers' expectations (it doesn't waste as much space as
libc++ and doesn't do as many heap allocations as MSVC). It also generates less
code than `std::queue`: using it across the code base saves several hundred
kilobytes.

Since `base::circular_deque` does not have stable iterators and it will move
the objects it contains, it may not be appropriate for all uses. If you need
stable iterators or objects that stay in place, consider using a `std::list`,
which will provide constant time insert and erase.

### std::deque and std::queue

The implementation of `std::deque` varies considerably, which makes it hard to
reason about. All implementations use a sequence of data blocks referenced by
an array of pointers. The standard guarantees random access, amortized
constant operations at the ends, and linear mutations in the middle.

In Microsoft's implementation, each block is the larger of 16 bytes or the
size of the contained element. This means in practice that every expansion of
the deque of non-trivial classes requires a heap allocation. libc++ (on Android
and Mac) uses 4K blocks, which eliminates the problem of many heap allocations
but generally wastes a large amount of space (an Android analysis revealed more
than 2.5MB wasted space from deque alone, resulting in some optimizations).
libstdc++ uses an intermediate-size 512 byte buffer.

Microsoft's implementation never shrinks the deque capacity, so the capacity
will always be the maximum number of elements ever contained. libstdc++
deallocates blocks as they are freed. libc++ keeps up to two empty blocks.

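To make the tradeoff concrete, the sketch below works through the arithmetic
for the three block sizes quoted above. It is a rough model only: it assumes a
block always holds at least one element and ignores any extra rules real
implementations apply. The function and constant names are invented for this
example.

```cpp
#include <cstddef>

// Block sizes as quoted in the text above.
constexpr std::size_t kMsvcBlockBytes = 16;
constexpr std::size_t kLibcxxBlockBytes = 4096;
constexpr std::size_t kLibstdcxxBlockBytes = 512;

// Rough model: a block holds as many elements as fit, but at least one.
constexpr std::size_t ElementsPerBlock(std::size_t block_bytes,
                                       std::size_t element_size) {
  return block_bytes / element_size > 0 ? block_bytes / element_size : 1;
}

// Bytes of element storage left unused in a block holding one live element.
constexpr std::size_t SlackWithOneElement(std::size_t block_bytes,
                                          std::size_t element_size) {
  return ElementsPerBlock(block_bytes, element_size) * element_size -
         element_size;
}
```

Under this model, a hypothetical 24-byte element gets one element per block
with the 16-byte setting (so every push allocates), 21 per 512-byte block, and
170 per 4K block, where a deque holding a single element leaves 4056 bytes of
the block unused.
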
### base::circular_deque and base::queue

A deque implemented as a circular buffer in an array. The underlying array will
grow like a `std::vector` while the beginning and end of the deque will move
around. The items will wrap around the underlying buffer so the storage will
not be contiguous, but fast random access iterators are still possible.

When the underlying buffer is filled, it will be reallocated and the contents
moved (like a `std::vector`). The underlying buffer will be shrunk if there is
too much wasted space (_unlike_ a `std::vector`). As a result, iterators are
not stable across mutations.

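A minimal sketch of the circular-buffer idea follows. It is not the
`base::circular_deque` implementation: `RingDeque` is a hypothetical class that
requires a default-constructible `T`, leaves popped slots holding stale values,
and omits the shrinking step described above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical ring-buffer deque for illustration only.
template <typename T>
class RingDeque {
 public:
  void push_back(T value) {
    GrowIfFull();
    buffer_[Wrap(begin_ + size_)] = std::move(value);
    ++size_;
  }

  void push_front(T value) {
    GrowIfFull();
    // Step the start backwards by one slot, wrapping past index 0.
    begin_ = Wrap(begin_ + buffer_.size() - 1);
    buffer_[begin_] = std::move(value);
    ++size_;
  }

  void pop_front() {
    assert(size_ > 0);
    begin_ = Wrap(begin_ + 1);
    --size_;
  }

  // Random access: logical index i maps to a wrapped physical slot.
  T& operator[](std::size_t i) {
    assert(i < size_);
    return buffer_[Wrap(begin_ + i)];
  }

  std::size_t size() const { return size_; }

 private:
  std::size_t Wrap(std::size_t i) const { return i % buffer_.size(); }

  // Reallocates and moves the contents, unwrapping them to start at slot 0.
  // Any outstanding references into the old buffer are invalidated here.
  void GrowIfFull() {
    if (size_ < buffer_.size())
      return;
    std::vector<T> bigger(buffer_.empty() ? 4 : buffer_.size() * 2);
    for (std::size_t i = 0; i < size_; ++i)
      bigger[i] = std::move(buffer_[Wrap(begin_ + i)]);
    buffer_ = std::move(bigger);
    begin_ = 0;
  }

  std::vector<T> buffer_;
  std::size_t begin_ = 0;
  std::size_t size_ = 0;
};
```
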
## Stack

`std::stack` is like `std::queue` in that it is a wrapper around an underlying
container. The default container is `std::deque` so everything from the deque
section applies.

Chromium provides `base/containers/stack.h` which defines `base::stack` that
should be used in preference to `std::stack`. This changes the underlying
container to `base::circular_deque`. The result will be very similar to
manually specifying a `std::vector` for the underlying implementation except
that the storage will shrink when it gets too empty (vector will never
reallocate to a smaller size).

Watch out: with some stack usage patterns it's easy to depend on unstable
behavior:

```cpp
base::stack<Foo> stack;
for (...) {
  Foo& current = stack.top();
  DoStuff();  // May call stack.push(), say if writing a parser.
  current.done = true;  // `current` may reference a moved or deleted item!
}
```

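One way to avoid the hazard is to not hold a reference across the call and to
look the element up again afterwards. The sketch below uses a `std::stack` over
`std::vector` as a stand-in for `base::stack`, since its storage can also move
when it grows. `Foo`, `DoStuff`, and `ProcessTop` are hypothetical, and the
sketch assumes the callee pops whatever it pushed.

```cpp
#include <stack>
#include <vector>

struct Foo {
  int id = 0;
  bool done = false;
};

// Stand-in for base::stack: the vector may reallocate on push.
using FooStack = std::stack<Foo, std::vector<Foo>>;

// Hypothetical callee that pushes and then pops nested items, as a recursive
// parser might. The pushes can reallocate the underlying storage.
void DoStuff(FooStack& stack, int nested) {
  for (int i = 0; i < nested; ++i)
    stack.push(Foo{});
  for (int i = 0; i < nested; ++i)
    stack.pop();
}

// Safe variant of the loop body: no reference is held across the call.
void ProcessTop(FooStack& stack, int nested) {
  DoStuff(stack, nested);
  stack.top().done = true;  // Fresh lookup, valid after any reallocation.
}
```
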
## Appendix

### Code for map code size comparison

This just calls insert and query a number of times, with printfs that prevent
things from being dead-code eliminated.

```cpp
TEST(Foo, Bar) {
  base::small_map<std::map<std::string, Flubber>> foo;
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo1", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar1", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  auto found = foo.find("asdf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("foo");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("asdfhf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar1");
  printf("Found is %d\n", (int)(found == foo.end()));
}
```
    296