# base/containers library

## What goes here

This directory contains some STL-like containers.

Things should be moved here only if they are generally applicable across the
code base. Don't add things here just because you need them in one place and
think others may someday want something similar. You can put specialized
containers in your component's directory and we can promote them here later if
we feel there is broad applicability.

### Design and naming

Containers should adhere as closely to STL as possible. Functions and behaviors
not present in STL should only be added when they are related to the specific
data structure implemented by the container.

For STL-like containers our policy is that they should use STL-like naming even
when it may conflict with the style guide. So functions and class names should
be lower case with underscores. Non-STL-like classes and functions should use
Google naming. Be sure to use the base namespace.
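As a rough illustration of the naming rule, here is a sketch with two hypothetical classes. Neither exists in base, and the `example_base` namespace merely stands in for `base`. The STL-like container uses lower-case names that mirror `std::set`, while the non-STL-like helper uses Google naming.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace example_base {  // Stand-in for the real base namespace.

// Hypothetical STL-like container: lower_case class and method names.
template <typename T>
class tiny_set {
 public:
  using const_iterator = typename std::vector<T>::const_iterator;

  // Same shape as std::set::insert: the bool reports whether the value was
  // newly inserted.
  std::pair<const_iterator, bool> insert(const T& value) {
    auto it = std::find(items_.begin(), items_.end(), value);
    if (it != items_.end()) return {it, false};
    items_.push_back(value);
    return {items_.end() - 1, true};
  }
  const_iterator begin() const { return items_.begin(); }
  const_iterator end() const { return items_.end(); }
  std::size_t size() const { return items_.size(); }
  bool empty() const { return items_.empty(); }

 private:
  std::vector<T> items_;
};

// Hypothetical non-STL-like helper: Google naming.
class SizeTracker {
 public:
  void RecordSize(std::size_t size) { max_seen_ = std::max(max_seen_, size); }
  std::size_t MaxSeen() const { return max_seen_; }

 private:
  std::size_t max_seen_ = 0;
};

}  // namespace example_base

inline void NamingDemo() {
  example_base::tiny_set<int> values;
  example_base::SizeTracker tracker;
  values.insert(7);
  values.insert(7);  // Duplicate, ignored.
  tracker.RecordSize(values.size());
  assert(tracker.MaxSeen() == 1);
}
```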

## Map and set selection

### Usage advice

* Generally avoid **std::unordered\_set** and **std::unordered\_map**. In the
  common case, query performance is unlikely to be sufficiently higher than
  std::map to make a difference, insert performance is slightly worse, and
  the memory overhead is high. These containers make sense mostly for large
  tables where you expect a lot of lookups.

* Most maps and sets in Chrome are small and contain objects that can be
  moved efficiently. In this case, consider **base::flat\_map** and
  **base::flat\_set**. You need to be aware of the maximum expected size of
  the container since individual inserts and deletes are O(n), giving O(n^2)
  construction time for the entire map. But because it avoids mallocs in most
  cases, inserts are better or comparable to other containers even for
  several dozen items, and efficiently-moved types are unlikely to have
  performance problems for most cases until you have hundreds of items. If
  your container can be constructed in one shot, the constructor from vector
  gives O(n log n) construction times and it should be strictly better than
  a std::map.

* **base::small\_map** has better runtime memory usage without the poor
  mutation performance of large containers that base::flat\_map has. But this
  advantage is partially offset by additional code size. Prefer it in cases
  where you make many objects so that the code/heap tradeoff is good.

* Use **std::map** and **std::set** if you can't decide. Even if they're not
  great, they're unlikely to be bad or surprising.
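The difference between O(n^2) incremental construction and O(n log n) one-shot construction of a sorted-vector container can be sketched with a plain `std::vector`. This is only an illustration of the cost model, not the base::flat\_set implementation; the function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One-at-a-time insertion into a sorted vector: each insert is a binary
// search plus an O(n) shift of the later elements, so n inserts are O(n^2).
std::vector<int> BuildIncrementally(const std::vector<int>& input) {
  std::vector<int> sorted;
  for (int value : input) {
    auto pos = std::lower_bound(sorted.begin(), sorted.end(), value);
    if (pos == sorted.end() || *pos != value)
      sorted.insert(pos, value);  // Shifts everything after pos.
  }
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  return sorted;
}

// One-shot construction: sort once, then drop duplicates. O(n log n).
std::vector<int> BuildInOneShot(std::vector<int> input) {
  std::sort(input.begin(), input.end());
  input.erase(std::unique(input.begin(), input.end()), input.end());
  return input;
}
```

Both functions produce the same sorted, de-duplicated sequence; only the amount of element shuffling differs.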

### Map and set details

Sizes are on 64-bit platforms. Stable iterators aren't invalidated when the
container is mutated.

| Container                                | Empty size            | Per-item overhead | Stable iterators? |
|:---------------------------------------- |:--------------------- |:----------------- |:----------------- |
| std::map, std::set                       | 16 bytes              | 32 bytes          | Yes               |
| std::unordered\_map, std::unordered\_set | 128 bytes             | 16-24 bytes       | No                |
| base::flat\_map and base::flat\_set      | 24 bytes              | 0 (see notes)     | No                |
| base::small\_map                         | 24 bytes (see notes)  | 32 bytes          | No                |

**Takeaways:** std::unordered\_map and std::unordered\_set have high
overhead for small container sizes, so prefer them only for larger workloads.

Code size comparisons for a block of code (see appendix) on Windows using
strings as keys.

| Container           | Code size  |
|:------------------- |:---------- |
| std::unordered\_map | 1646 bytes |
| std::map            | 1759 bytes |
| base::flat\_map     | 1872 bytes |
| base::small\_map    | 2410 bytes |

**Takeaways:** base::small\_map generates more code because of the inlining of
both brute-force and red-black tree searching. This makes it less attractive
for random one-off uses. But if your code is called frequently, the runtime
memory benefits will be more important. The code sizes of the other maps are
close enough that it's not worth worrying about.

### std::map and std::set

A red-black tree. Each inserted item requires the memory allocation of a node
on the heap. Each node contains a left pointer, a right pointer, a parent
pointer, and a "color" for the red-black tree (32 bytes per item on 64-bit
platforms).
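The 32-byte figure follows from pointer size and alignment. The struct below is a hypothetical layout for illustration only; real standard library node types differ in detail.

```cpp
#include <cstddef>

// Hypothetical red-black tree node header (not any library's actual type).
enum class Color : unsigned char { kRed, kBlack };

struct TreeNodeHeader {
  TreeNodeHeader* left;
  TreeNodeHeader* right;
  TreeNodeHeader* parent;
  Color color;  // One byte, padded up to pointer alignment.
};

// Bookkeeping bytes each item pays in addition to the stored value.
constexpr std::size_t kPerItemOverhead = sizeof(TreeNodeHeader);

// On 64-bit platforms: three 8-byte pointers plus the padded color.
static_assert(sizeof(void*) != 8 || kPerItemOverhead == 32,
              "expected 32 bytes of per-node overhead on 64-bit");
```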

### std::unordered\_map and std::unordered\_set

A hash table. Implemented on Windows as a std::vector + std::list and in libc++
as the equivalent of a std::vector + a std::forward\_list. Both implementations
allocate an 8-entry hash table (containing iterators into the list) on
initialization, and grow to 64 entries once 8 items are inserted. Above 64
items, the size doubles every time the load factor exceeds 1.

The empty size is sizeof(std::unordered\_map) = 64 bytes plus the initial hash
table, which is 8 pointers (64 bytes on 64-bit platforms), for 128 bytes in
total. The per-item overhead in the table above counts the list node (2
pointers on Windows, 1 pointer in libc++), plus amortizes the hash table
assuming a 0.5 load factor on average.

In a microbenchmark on Windows, inserts of 1M integers into a
std::unordered\_set took 1.07x the time of std::set, and queries took 0.67x the
time of std::set. For a typical 4-entry set (the statistical mode of map sizes
in the browser), query performance is identical to std::set and base::flat\_set.
On ARM, unordered\_set performance can be worse because integer division to
compute the bucket is slow, and a few "less than" operations can be faster than
computing a hash depending on the key type. The takeaway is that you should not
default to using unordered maps because "they're faster."
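A comparison of this kind can be reproduced with a small timing harness. The sketch below is not the benchmark used for the numbers above; `TimingResult` and `TimeSet` are names invented for the example, and results will vary with platform, compiler flags, and key type.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <set>
#include <unordered_set>

struct TimingResult {
  double insert_ms;
  double query_ms;
  std::size_t hits;  // Returned so the lookups cannot be optimized away.
};

// Times `count` integer inserts followed by `count` lookups.
template <typename Set>
TimingResult TimeSet(int count) {
  assert(count >= 0);
  using Clock = std::chrono::steady_clock;
  using Ms = std::chrono::duration<double, std::milli>;

  Set set;
  const auto start = Clock::now();
  for (int i = 0; i < count; ++i) set.insert(i);
  const auto mid = Clock::now();

  std::size_t hits = 0;
  for (int i = 0; i < count; ++i) hits += set.count(i);
  const auto end = Clock::now();

  return {Ms(mid - start).count(), Ms(end - mid).count(), hits};
}
```

Calling `TimeSet<std::set<int>>(n)` and `TimeSet<std::unordered_set<int>>(n)` for both small and large `n` gives a first impression of where the crossover lies for a given key type.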

### base::flat\_map and base::flat\_set

A sorted std::vector. Searched via binary search; inserts in the middle require
moving elements to make room. Good cache locality. For large objects and large
set sizes, std::vector's doubling-when-full strategy can waste memory.

Supports efficient construction from a vector of items, which avoids the O(n^2)
cost of inserting each element separately.

The per-item overhead will depend on the underlying std::vector's reallocation
strategy and the memory access pattern. Assuming items are being linearly added,
one would expect it to be 3/4 full, so per-item overhead will be 0.25 *
sizeof(T).

flat\_set/flat\_map support a notion of transparent comparisons. Therefore you
can, for example, look up a base::StringPiece in a set of std::strings without
constructing a temporary std::string. This functionality is based on C++14
extensions to the std::set/std::map interface.

You can find more information about transparent comparisons here:
http://en.cppreference.com/w/cpp/utility/functional/less_void

Example, smart pointer set:

```cpp
// Declare a type alias using base::UniquePtrComparator.
template <typename T>
using UniquePtrSet = base::flat_set<std::unique_ptr<T>,
                                    base::UniquePtrComparator>;

// ...
// Collect data.
std::vector<std::unique_ptr<int>> ptr_vec;
ptr_vec.reserve(5);
std::generate_n(std::back_inserter(ptr_vec), 5, []{
  return std::make_unique<int>(0);
});

// Construct a set.
UniquePtrSet<int> ptr_set(std::move(ptr_vec), base::KEEP_FIRST_OF_DUPES);

// Use raw pointers to lookup keys.
int* ptr = ptr_set.begin()->get();
EXPECT_TRUE(ptr_set.find(ptr) == ptr_set.begin());
```

Example flat_map<std\::string, int>:

```cpp
base::flat_map<std::string, int> str_to_int({{"a", 1}, {"c", 2}, {"b", 2}},
                                            base::KEEP_FIRST_OF_DUPES);

// Does not construct temporary strings.
str_to_int.find("c")->second = 3;
str_to_int.erase("c");
// After the erase, the lookup returns end(). Compare iterators, not values.
EXPECT_EQ(str_to_int.end(), str_to_int.find("c"));

// NOTE: This does construct a temporary string. This happens since if the
// item is not in the container, then it needs to be constructed, which is
// something that transparent comparators don't have to guarantee.
str_to_int["c"] = 3;
```
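The same transparent-lookup mechanism can be tried with only the standard library. The sketch below uses `std::set` with `std::less<>` and `std::string_view` (C++17) as stand-ins for base::flat\_set and base::StringPiece; `StringSet`, `Contains`, and `TransparentLookupDemo` are names invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator: it defines is_transparent, which
// enables the templated find() overloads added in C++14.
using StringSet = std::set<std::string, std::less<>>;

bool Contains(const StringSet& set, std::string_view key) {
  // The string_view is compared against each std::string directly; no
  // temporary std::string is built for the key.
  return set.find(key) != set.end();
}

void TransparentLookupDemo() {
  StringSet names = {"alpha", "beta"};
  assert(Contains(names, "beta"));  // const char* converts to string_view.
  assert(!Contains(names, std::string_view("gamma")));
}
```

With the default `std::less<std::string>` comparator, the same `find` call would first convert the key to a `std::string`.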

### base::small\_map

A small inline buffer that is brute-force searched and that overflows into a
full std::map or std::unordered\_map. This gives the memory benefit of
base::flat\_map for small data sizes without the degenerate insertion
performance for large container sizes.

Since instantiations require both code for a std::map and a brute-force search
of the inline container, plus a fancy iterator to cover both cases, code size
is larger.

The initial size in the above table assumes a very small inline table. The
actual size will be sizeof(int) + min(sizeof(std::map), sizeof(T) *
inline\_size).
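The inline-buffer-then-overflow idea can be sketched as follows. This is a deliberately simplified illustration with invented names (`SmallMapSketch`, `Set`, `Find`, `UsesFallbackMap`); it makes no attempt to match the memory layout, iterator support, or API of base::small\_map, and it assumes default-constructible keys and values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Keeps up to kInline entries in an inline array that is searched linearly,
// then migrates everything into a std::map on overflow.
template <typename Key, typename Value, std::size_t kInline>
class SmallMapSketch {
 public:
  void Set(const Key& key, const Value& value) {
    if (using_map_) {
      overflow_[key] = value;
      return;
    }
    for (std::size_t i = 0; i < size_; ++i) {
      if (inline_[i].first == key) {
        inline_[i].second = value;
        return;
      }
    }
    if (size_ < kInline) {
      inline_[size_++] = {key, value};
      return;
    }
    // The inline buffer is full: move its entries into the fallback map.
    for (std::size_t i = 0; i < size_; ++i)
      overflow_.insert(std::move(inline_[i]));
    overflow_[key] = value;
    assert(overflow_.size() == kInline + 1);
    using_map_ = true;
    size_ = 0;
  }

  // Returns nullptr when the key is absent.
  const Value* Find(const Key& key) const {
    if (using_map_) {
      auto it = overflow_.find(key);
      return it == overflow_.end() ? nullptr : &it->second;
    }
    for (std::size_t i = 0; i < size_; ++i) {
      if (inline_[i].first == key) return &inline_[i].second;
    }
    return nullptr;
  }

  bool UsesFallbackMap() const { return using_map_; }

 private:
  std::array<std::pair<Key, Value>, kInline> inline_{};
  std::size_t size_ = 0;
  bool using_map_ = false;
  std::map<Key, Value> overflow_;
};
```

The two code paths in `Set` and `Find` are the reason every instantiation carries both the brute-force search and the full map code.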

## Deque

### Usage advice

Chromium code should always use `base::circular_deque` or `base::queue` in
preference to `std::deque` or `std::queue` due to memory usage and platform
variation.

The `base::circular_deque` implementation (and the `base::queue` which uses it)
provides performance that is consistent across platforms and better matches
most programmers' expectations (it doesn't waste as much space as libc++ and
doesn't do as many heap allocations as MSVC). It also generates less code than
`std::queue`: using it across the code base saves several hundred kilobytes.

Since `base::circular_deque` does not have stable iterators and it will move
the objects it contains, it may not be appropriate for all uses. If you need
these, consider using a `std::list` which will provide constant time insert and
erase.

### std::deque and std::queue

The implementation of `std::deque` varies considerably which makes it hard to
reason about. All implementations use a sequence of data blocks referenced by
an array of pointers. The standard guarantees random access, amortized
constant operations at the ends, and linear mutations in the middle.

In Microsoft's implementation, each block is the larger of 16 bytes or the
size of the contained element. This means in practice that every expansion of
the deque of non-trivial classes requires a heap allocation. libc++ (on Android
and Mac) uses 4K blocks which eliminates the problem of many heap allocations,
but generally wastes a large amount of space (an Android analysis revealed more
than 2.5MB wasted space from deque alone, resulting in some optimizations).
libstdc++ uses an intermediate-size 512 byte buffer.

Microsoft's implementation never shrinks the deque capacity, so the capacity
will always be the maximum number of elements ever contained. libstdc++
deallocates blocks as they are freed. libc++ keeps up to two empty blocks.
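To get a feel for how block size translates into heap allocations, here is a back-of-the-envelope sketch. It assumes a simplified model in which a block always holds at least one element and the pointer array is ignored; the helper names are invented, and real implementations have additional special cases.

```cpp
#include <cstddef>

// Elements that fit in one block of block_bytes; at least one per block.
constexpr std::size_t ElementsPerBlock(std::size_t block_bytes,
                                       std::size_t element_size) {
  return block_bytes / element_size > 0 ? block_bytes / element_size : 1;
}

// Block allocations needed to hold `count` elements.
constexpr std::size_t BlocksNeeded(std::size_t block_bytes,
                                   std::size_t element_size,
                                   std::size_t count) {
  return (count + ElementsPerBlock(block_bytes, element_size) - 1) /
         ElementsPerBlock(block_bytes, element_size);
}

// 100 elements of 64 bytes each, for 16-byte, 512-byte, and 4K blocks.
static_assert(BlocksNeeded(16, 64, 100) == 100, "one allocation per element");
static_assert(BlocksNeeded(512, 64, 100) == 13, "8 elements per block");
static_assert(BlocksNeeded(4096, 64, 100) == 2, "64 elements per block");
```

The flip side is visible for tiny deques: under this model a single 64-byte element still occupies a whole 4096-byte block when blocks are 4K.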

### base::circular\_deque and base::queue

A deque implemented as a circular buffer in an array. The underlying array will
grow like a `std::vector` while the beginning and end of the deque will move
around. The items will wrap around the underlying buffer so the storage will
not be contiguous, but fast random access iterators are still possible.

When the underlying buffer is filled, it will be reallocated and the contents
moved (like a `std::vector`). The underlying buffer will be shrunk if there is
too much wasted space (_unlike_ a `std::vector`). As a result, iterators are
not stable across mutations.
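The wrap-around indexing and grow-on-full behavior can be sketched in a few dozen lines. This is not the base::circular\_deque implementation: the class name is invented, shrinking and iterators are omitted, and `T` is assumed to be default-constructible and move-assignable.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

template <typename T>
class circular_deque_sketch {
 public:
  std::size_t size() const { return size_; }

  // Logical index i maps to a physical slot that wraps around the buffer.
  T& operator[](std::size_t i) {
    assert(i < size_);
    return buffer_[(begin_ + i) % buffer_.size()];
  }

  void push_back(T value) {
    grow_if_full();
    buffer_[(begin_ + size_) % buffer_.size()] = std::move(value);
    ++size_;
  }

  void push_front(T value) {
    grow_if_full();
    begin_ = (begin_ + buffer_.size() - 1) % buffer_.size();
    buffer_[begin_] = std::move(value);
    ++size_;
  }

  void pop_front() {
    assert(size_ > 0);
    begin_ = (begin_ + 1) % buffer_.size();
    --size_;
  }

 private:
  // Reallocates to a larger buffer and moves the contents so that they start
  // at slot 0. This step is what invalidates references and iterators.
  void grow_if_full() {
    if (size_ < buffer_.size()) return;
    std::vector<T> bigger(buffer_.empty() ? 4 : buffer_.size() * 2);
    for (std::size_t i = 0; i < size_; ++i)
      bigger[i] = std::move((*this)[i]);
    buffer_ = std::move(bigger);
    begin_ = 0;
  }

  std::vector<T> buffer_;
  std::size_t begin_ = 0;  // Physical slot of the first element.
  std::size_t size_ = 0;
};
```

Random access stays O(1) because it costs one addition and one modulo, even though the elements may be split across the end and the start of the buffer.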

## Stack

`std::stack` is like `std::queue` in that it is a wrapper around an underlying
container. The default container is `std::deque` so everything from the deque
section applies.

Chromium provides `base/containers/stack.h` which defines `base::stack` that
should be used in preference to `std::stack`. This changes the underlying
container to `base::circular_deque`. The result will be very similar to
manually specifying a `std::vector` for the underlying implementation except
that the storage will shrink when it gets too empty (vector will never
reallocate to a smaller size).

Watch out: with some stack usage patterns it's easy to depend on unstable
behavior:

```cpp
base::stack<Foo> stack;
for (...) {
  Foo& current = stack.top();
  DoStuff();  // May call stack.push(), say if writing a parser.
  current.done = true;  // `current` may reference a moved or deleted item!
}
```
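One way to avoid the hazard is to look up `top()` again after any call that might mutate the stack, instead of holding a reference across it. The sketch below uses `std::stack` over `std::vector` as a stand-in for `base::stack`, since any stack backed by reallocating storage behaves the same way; `Foo`, `DoStuff`, and `ProcessTop` are hypothetical.

```cpp
#include <cassert>
#include <stack>
#include <vector>

struct Foo {
  bool done = false;
};

using FooStack = std::stack<Foo, std::vector<Foo>>;

// Hypothetical worker that grows the stack and then restores its depth.
void DoStuff(FooStack& stack) {
  stack.push(Foo());
  stack.pop();
}

void ProcessTop(FooStack& stack) {
  assert(!stack.empty());
  DoStuff(stack);           // May reallocate the underlying storage.
  stack.top().done = true;  // Fetch top() afterwards; no cached Foo&.
}
```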

## Appendix

### Code for map code size comparison

This just calls insert and query a number of times, with printfs that prevent
things from being dead-code eliminated.

```cpp
TEST(Foo, Bar) {
  base::small_map<std::map<std::string, Flubber>> foo;
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo1", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar1", Flubber(8, "bar")));
  foo.insert(std::make_pair("foo", Flubber(8, "bar")));
  foo.insert(std::make_pair("bar", Flubber(8, "bar")));
  auto found = foo.find("asdf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("foo");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("asdfhf");
  printf("Found is %d\n", (int)(found == foo.end()));
  found = foo.find("bar1");
  printf("Found is %d\n", (int)(found == foo.end()));
}
```