Home How does std::unordered-map.clear() work when keys do not own the underlying memory?

Questions

How does std::unordered-map.clear() work when keys do not own the underlying memory?

April 7, 2023

I came across a relatively weird segfault when using a std::unordered_map<std::string_view, std::string_view> with MSVC 2017, compiling for C++17, with a simple debug build.

The TLDR here is that calling .clear() on my map after the char arrays to which the string views were ‘looking at’ changed or went out of scope led to a segfault.

After investigating the internals of the unordered map, it seems to me that calling .clear() assumes that one can correctly compute the hash of every key in the map. This I guess is just a way to satisfy the O(container_size) time complexity of clear that the standard states.

Now in the case of my unordered map of string views, I believe the hashing of a string view is based on the hashing of the char array that it looks at, therefore if the char array went out of scope,or was modified in any way, my hashing would be completely off, leading to the segfault.

However, this seems a very odd way to implement the .clear() function of the unordered map container. Why isn’t .clear() relying on some sort of internal state to decide where to start erasing from when a call to clear happens? In this way, it wouldn’t have to rely on the keys hash being correctly computable when a call to .clear() happens.

Does this imply that one can’t safely call .clear() on an unordered_map storing keys that do not own the underlying memory, and whose hashing are based on the underlying memory?

Can anyone please shed some light on this?

Thanks!

I would expect the call to .clear() to never lead to UB, no-matter what keys I am using in my map. I would expect the call to .clear() to be completely independent on hashing of the keys themselves.

>Solution :

The standard lays down several expectations about the behavior of keys in any "unordered" container:

For any two keys k1 and k2 in the same container, calling pred(k1, k2) shall always return the same value. For any key k in a container, calling hash(k) shall always return the same value.

These are rules that your key objects must abide by. Always: so long as the container exists, these must hold true. And if they don’t hold true, you get undefined behavior.

These statements do not provide an exception for clear or even the destructor. As such, any implementation is permitted to assume that the container still works during any function call, including clear or the destructor.

Could one implement the type in a way that clear would still work if you broke the rules? Maybe. But they don’t have to because your code is the one that’s ultimately at fault here.