# base/containers library

## What goes here

This directory contains some STL-like containers.

Move things here only when they are generally applicable across the code base.
Don't add things here just because you need them in one place and think others
may someday want something similar. You can put specialized containers in
your component's directory and we can promote them here later if we feel there
is broad applicability.

### Design and naming

Containers should adhere as closely to STL as possible. Functions and behaviors
not present in STL should only be added when they are related to the specific
data structure implemented by the container.

For STL-like containers our policy is that they should use STL-like naming even
when it may conflict with the style guide. So functions and class names should
be lower case with underscores. Non-STL-like classes and functions should use
Google naming. Be sure to use the base namespace.

## Map and set selection

### Usage advice

  * Generally avoid **std::unordered\_set** and **std::unordered\_map**. In the
    common case, query performance is unlikely to be sufficiently higher than
    std::map to make a difference, insert performance is slightly worse, and
    the memory overhead is high. These containers make sense mostly for large
    tables where you expect a lot of lookups.

  * Most maps and sets in Chrome are small and contain objects that can be
    moved efficiently. In this case, consider **base::flat\_map** and
    **base::flat\_set**. You need to be aware of the maximum expected size of
    the container since individual inserts and deletes are O(n), giving O(n^2)
    construction time for the entire map. But because it avoids mallocs in most
    cases, inserts are better or comparable to other containers even for
    several dozen items, and efficiently-moved types are unlikely to have
    performance problems for most cases until you have hundreds of items. If
    your container can be constructed in one shot, the constructor from vector
    gives O(n log n) construction times and it should be strictly better than
    a std::map.

  * **base::small\_map** has better runtime memory usage without the poor
    mutation performance of large containers that base::flat\_map has. But this
    advantage is partially offset by additional code size. Prefer it in cases
    where you make many objects so that the code/heap tradeoff is good.

  * Use **std::map** and **std::set** if you can't decide. Even if they're not
    great, they're unlikely to be bad or surprising.

### Map and set details

Sizes are on 64-bit platforms. Stable iterators aren't invalidated when the
container is mutated.

| Container                                | Empty size            | Per-item overhead | Stable iterators? |
|:---------------------------------------- |:--------------------- |:----------------- |:----------------- |
| std::map, std::set                       | 16 bytes              | 32 bytes          | Yes               |
| std::unordered\_map, std::unordered\_set | 128 bytes             | 16-24 bytes       | No                |
| base::flat\_map and base::flat\_set      | 24 bytes              | 0 (see notes)     | No                |
| base::small\_map                         | 24 bytes (see notes)  | 32 bytes          | No                |

**Takeaways:** std::unordered\_map and std::unordered\_set have high
overhead for small container sizes, so prefer these only for larger workloads.

Code size comparisons for a block of code (see appendix) on Windows using
strings as keys.

| Container           | Code size  |
|:------------------- |:---------- |
| std::unordered\_map | 1646 bytes |
| std::map            | 1759 bytes |
| base::flat\_map     | 1872 bytes |
| base::small\_map    | 2410 bytes |

**Takeaways:** base::small\_map generates more code because of the inlining of
both brute-force and red-black tree searching. This makes it less attractive
for random one-off uses. But if your code is called frequently, the runtime
memory benefits will be more important. The code sizes of the other maps are
close enough that it's not worth worrying about.

### std::map and std::set

A red-black tree. Each inserted item requires the memory allocation of a node
on the heap. Each node contains a left pointer, a right pointer, a parent
pointer, and a "color" for the red-black tree (32 bytes per item on 64-bit
platforms).

### std::unordered\_map and std::unordered\_set

A hash table. Implemented on Windows as a std::vector + std::list and in libc++
as the equivalent of a std::vector + a std::forward\_list. Both implementations
allocate an 8-entry hash table (containing iterators into the list) on
initialization, and grow to 64 entries once 8 items are inserted. Above 64
items, the size doubles every time the load factor exceeds 1.

The empty size is sizeof(std::unordered\_map) = 64 +
the initial hash table size which is 8 pointers. The per-item overhead in the
table above counts the list node (2 pointers on Windows, 1 pointer in libc++),
plus amortizes the hash table assuming a 0.5 load factor on average.

In a microbenchmark on Windows, inserts of 1M integers into a
std::unordered\_set took 1.07x the time of std::set, and queries took 0.67x the
time of std::set. For a typical 4-entry set (the statistical mode of map sizes
in the browser), query performance is identical to std::set and base::flat\_set.
On ARM, unordered\_set performance can be worse because integer division to
compute the bucket is slow, and a few "less than" operations can be faster than
computing a hash depending on the key type. The takeaway is that you should not
default to using unordered maps because "they're faster."

### base::flat\_map and base::flat\_set

A sorted std::vector. Searched via binary search; inserts in the middle require
moving elements to make room. Good cache locality. For large objects and large
set sizes, std::vector's doubling-when-full strategy can waste memory.

Supports efficient construction from a vector of items which avoids the O(n^2)
insertion time of each element separately.

The per-item overhead will depend on the underlying std::vector's reallocation
strategy and the memory access pattern. Assuming items are being linearly added,
one would expect it to be 3/4 full, so per-item overhead will be 0.25 *
sizeof(T).

flat\_set/flat\_map support a notion of transparent comparisons. Therefore you
can, for example, look up a base::StringPiece in a set of std::strings without
constructing a temporary std::string. This functionality is based on C++14
extensions to the std::set/std::map interface.

You can find more information about transparent comparisons here:
http://en.cppreference.com/w/cpp/utility/functional/less_void

Example, smart pointer set:

```cpp
// Declare a type alias using base::UniquePtrComparator.
template <typename T>
using UniquePtrSet = base::flat_set<std::unique_ptr<T>,
                                    base::UniquePtrComparator>;

// ...
// Collect data.
std::vector<std::unique_ptr<int>> ptr_vec;
ptr_vec.reserve(5);
std::generate_n(std::back_inserter(ptr_vec), 5, []{
  return std::make_unique<int>(0);
});

// Construct a set.
UniquePtrSet<int> ptr_set(std::move(ptr_vec), base::KEEP_FIRST_OF_DUPES);

// Use raw pointers to lookup keys.
int* ptr = ptr_set.begin()->get();
EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin());
```

Example, flat\_map<std::string, int>:

```cpp
base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2}, {"b", 2}},
                                            base::KEEP_FIRST_OF_DUPES);

// Does not construct temporary strings.
str_to_int.find("c")->second = 3;
str_to_int.erase("c");
// After the erase, the lookup returns the end iterator.
EXPECT_EQ(str_to_int.end(), str_to_int.find("c"));

// NOTE: This does construct a temporary string. This happens since if the
// item is not in the container, then it needs to be constructed, which is
// something that transparent comparators don't have to guarantee.
str_to_int["c"] = 3;
```

### base::small\_map

A small, brute-force-searched inline buffer that overflows into a full
std::map or std::unordered\_map. This gives the memory benefit of
base::flat\_map for small data sizes without the degenerate insertion
performance for large container sizes.

Since instantiations require both code for a std::map and a brute-force search
of the inline container, plus a fancy iterator to cover both cases, code size
is larger.

The initial size in the above table assumes a very small inline table. The
actual size will be sizeof(int) + max(sizeof(std::map), sizeof(T) *
inline\_size), since the inline buffer and the fallback map share storage.

## Deque

### Usage advice

Chromium code should always use `base::circular_deque` or `base::queue` in
preference to `std::deque` or `std::queue` due to memory usage and platform
variation.
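
One rough way to observe the platform variation mentioned above is to count how
many heap allocations `std::deque` makes for a fixed workload on each standard
library. The following is a minimal sketch in standard C++, not code from
base/. `CountingAllocator`, `Item`, and `CountDequeAllocations` are
hypothetical names introduced only for this illustration, and the 64-byte
element size is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>

// Hypothetical helper, not part of base/: number of allocate() calls seen.
static std::size_t g_allocation_count = 0;

// Minimal allocator that forwards to std::allocator and counts allocations.
template <typename T>
struct CountingAllocator {
  using value_type = T;

  CountingAllocator() = default;
  template <typename U>
  CountingAllocator(const CountingAllocator<U>&) {}

  T* allocate(std::size_t n) {
    ++g_allocation_count;
    return std::allocator<T>().allocate(n);
  }
  void deallocate(T* p, std::size_t n) {
    std::allocator<T>().deallocate(p, n);
  }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
  return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
  return false;
}

// Arbitrary element type, larger than a pointer (an assumption for the demo).
struct Item {
  char payload[64];
};

// Returns how many heap allocations a std::deque performs while it is
// constructed and then grown to |item_count| elements with push_back().
std::size_t CountDequeAllocations(std::size_t item_count) {
  g_allocation_count = 0;
  {
    std::deque<Item, CountingAllocator<Item>> items;
    for (std::size_t i = 0; i < item_count; ++i)
      items.push_back(Item{});
  }
  return g_allocation_count;
}
```

Calling `CountDequeAllocations(1000)` under different standard libraries gives
a number that depends on each library's block size, so no single expected value
is given here.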

The `base::circular_deque` implementation (and the `base::queue` which uses it)
provides performance consistent across platforms that better matches most
programmers' expectations (it doesn't waste as much space as libc++ and
doesn't do as many heap allocations as MSVC). It also generates less code than
`std::queue`: using it across the code base saves several hundred kilobytes.

Since `base::circular_deque` does not have stable iterators and it will move
the objects it contains, it may not be appropriate for all uses. If you need
stable iterators or objects that stay in place, consider using a `std::list`
which will provide constant time insert and erase.

### std::deque and std::queue

The implementation of `std::deque` varies considerably which makes it hard to
reason about. All implementations use a sequence of data blocks referenced by
an array of pointers. The standard guarantees random access, amortized
constant operations at the ends, and linear mutations in the middle.

In Microsoft's implementation, each block is the larger of 16 bytes or the
size of the contained element. This means in practice that every expansion of
the deque of non-trivial classes requires a heap allocation. libc++ (on Android
and Mac) uses 4K blocks which eliminates the problem of many heap allocations,
but generally wastes a large amount of space (an Android analysis revealed more
than 2.5MB wasted space from deque alone, resulting in some optimizations).
libstdc++ uses an intermediate-size 512 byte buffer.

Microsoft's implementation never shrinks the deque capacity, so the capacity
will always be the maximum number of elements ever contained. libstdc++
deallocates blocks as they are freed. libc++ keeps up to two empty blocks.

### base::circular_deque and base::queue

A deque implemented as a circular buffer in an array. The underlying array will
grow like a `std::vector` while the beginning and end of the deque will move
around. The items will wrap around the underlying buffer so the storage will
not be contiguous, but fast random access iterators are still possible.

When the underlying buffer is filled, it will be reallocated and the contents
moved (like a `std::vector`). The underlying buffer will be shrunk if there is
too much wasted space (_unlike_ a `std::vector`). As a result, iterators are
not stable across mutations.

## Stack

`std::stack` is like `std::queue` in that it is a wrapper around an underlying
container. The default container is `std::deque` so everything from the deque
section applies.

Chromium provides `base/containers/stack.h` which defines `base::stack` that
should be used in preference to `std::stack`. This changes the underlying
container to `base::circular_deque`. The result will be very similar to
manually specifying a `std::vector` for the underlying implementation except
that the storage will shrink when it gets too empty (vector will never
reallocate to a smaller size).

Watch out: with some stack usage patterns it's easy to depend on unstable
behavior:

```cpp
base::stack<Foo> stack;
for (...) {
  Foo& current = stack.top();
  DoStuff();  // May call stack.push(), say if writing a parser.
  current.done = true;  // |current| may reference a moved or deleted item!
}
```

## Appendix

### Code for map code size comparison

This just calls insert and query a number of times, with printfs that prevent
things from being dead-code eliminated.

```cpp
TEST(Foo, Bar) {
  base::small_map<std::map<std::string, Flubber>> foo;
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo1", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar1", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  auto found = foo.find("asdf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("foo");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("asdfhf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar1");
  printf("Found is %d\n", (int)(found == foo.end()));
}
```
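
### Sketch of a sorted-vector map

The base::flat\_map section describes a sorted std::vector that is searched via
binary search and can be built in one shot in O(n log n). The following is a
minimal standard-C++ sketch of that idea under stated assumptions; it is not
the base::flat\_map implementation. `Entry`, `BuildSorted`, and `Find` are
hypothetical names introduced for illustration, and keeping the first of any
duplicate keys is chosen here to mirror the examples above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical element type: a key/value pair stored contiguously.
using Entry = std::pair<std::string, int>;

// Builds the sorted storage in one shot: O(n log n) for the sort plus O(n)
// to drop duplicates. stable_sort keeps equal keys in input order, so
// std::unique retains the first entry seen for each key.
std::vector<Entry> BuildSorted(std::vector<Entry> items) {
  std::stable_sort(
      items.begin(), items.end(),
      [](const Entry& a, const Entry& b) { return a.first < b.first; });
  items.erase(
      std::unique(
          items.begin(), items.end(),
          [](const Entry& a, const Entry& b) { return a.first == b.first; }),
      items.end());
  return items;
}

// Binary search by key. The key is a std::string_view, so no temporary
// std::string is constructed for the lookup. Returns nullptr if absent.
const int* Find(const std::vector<Entry>& sorted, std::string_view key) {
  auto it = std::lower_bound(
      sorted.begin(), sorted.end(), key,
      [](const Entry& entry, std::string_view k) { return entry.first < k; });
  if (it == sorted.end() || it->first != key)
    return nullptr;
  return &it->second;
}
```

Inserting into the middle of such a vector afterwards shifts every later
element, which is where the O(n) per-insert cost mentioned in the usage advice
comes from.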