About 18 months ago, back in November 2017, I made a note to write about an interesting mistake I made in a C++ program. Time passed and my note proved to be insufficiently detailed—I’d forgotten what made the mistake interesting at all. Last week, memories came flooding back as I made exactly the same mistake again. If only I’d written the original article, then history could never have possibly repeated itself, right?
Le Problem
I was using a std::map
to store a relationship between two types of object.
The Key
type looked something like this:
struct Key final
{
const std::string group;
const std::string name;
Key(std::string group_, std::string name_) :
group { std::move(group_) },
name { std::move(name_) }
{
}
};
Dead simple.
C++‘s standard library is clever in its simplicity. Every algorithm depends on
a single comparison being defined, operator<
. All other comparison operators
can be derived from this, as shown in the table below. There’s even a helper
for this in the standard library.1
Operation | Implementation |
---|---|
a >= b | !(a < b) |
a > b | b < a |
a <= b | !(b < a) |
a == b | (a < b) == (b < a) |
a != b | (a < b) != (b < a) |
So why is this relevant to std::map
? The standard mandates that it’s a
sorted container, with ordering enforced by a comparator object which,
by default, makes use of operator<
. So I implemented the following operator:
struct Key final
{
...
bool operator< (const Key &rhs_)
{
if (group < rhs_.group)
return true;
return name < rhs_.name;
}
};
Proud with my brevity, I moved onto the matter at hand; I didn’t actually care about enforcing an ordering, I just needed the associative data-structure. If you’ve already spotted the mistake: you’re doing better than I did!
Le Limpet Mine
This went unnoticed for a surprising amount of time. Yes I know I should have written a test for it, but it was simple and I felt cocky.
The mapping was being used to track plugins being loaded into an application, so the data being run through the faulty comparison was carefully curated (by accident). I ended up with a situation like this:
Group | Name |
---|---|
compute | maximise |
compute | minimise |
compute | optimise |
render | png |
All’s well; clearly the ordering is being enforced correctly. Then I add two
more plugins into the system, let’s call them { "logging", "file" }
and
{ "logging", "network" }
. Much to my frustration, I end up with an exception
being thrown from map::at
because one of the keys wasn’t found. I knew it
existed because I was in quite close control of what plugins were loaded, but
map::at
was failing nonetheless.
I spent ages checking the plugin loading logic, making sure that registration
and discovery functions were being called. I stepped through the code watching
what happened. I even added cout
’s as plugins were registered just in case.
Everything was working as intended…
Analysis
As a sanity check, I enumerated the map, printing out the keys to verify what was being stored was as I expected it to be:
Group | Name |
---|---|
logging | file |
compute | maximise |
compute | minimise |
logging | network |
compute | optimise |
render | png |
*Facepalm*. The second I saw it, I knew what I’d done. No wonder map::at
was failing to find any of the compute
plugins because it wouldn’t advance
beyond the two logging
groups (remember that it’s a sorted container).
Every cloud has a silver lining, and this lining was a jolt to the memory: I’d
made a note to write about this mistake before.
Last time this happened (yes, I’m still reeling from embarrassment…) I
produced a plot using matplotlib
and was presented with a pattern that
was surprisingly pleasing to look at. Given that I have this nice, shiny, new
blog now, I saw it as an opportunity to play with D3.js. After a bit of
playing and learning, I had this figure below:
I’ve mentioned before that I use Gatsby for this blog which uses React internally. I’m a huge fan of React, but it clearly has its own view of how the DOM should be managed, so I saw integrating a D3 plot into a React site as an interesting challenge. Perhaps I should write about that too…
Anyway, back to the analysis: the lack of false negatives in the above plot demonstrates how easy it is to forget to be a scientist. When we conceive a hypothesis, we instinctively jump to confirming it to be true. In this case, all positives are true positives, so naively the implementation clearly works. Science, however, is not about proving things to be correct (that’s maths); it’s about proving things to be not wrong. That difference is subtle but extremely powerful.
Uncovering the problem shown very clearly in the figure is a two step process:
- Does the hypothesis stand up against some indicative stimulus? If not: great, you’ve just ruled out a bad idea before spending much time on it. If so: awesome, now we can verify its usefulness.
- Given that it works some of the time, when does this model break down? This part is the real scientific work. If the model breaks down too soon, it clearly isn’t useful. If it breaks down at some point that can be justified, then it’s still useful2.
When writing my operator, I wore the wrong hat. My software engineer mind was pleased with the simplicity and brevity of the implementation while my scientist and mathematician hats were still on the rack. Ah well, live ‘n’ learn. My mathematician hat needs an awful lot of work anyway…
Le Fix
As with almost every single problem in C++, the fix is trivial and requires only juggling a handful of characters:
bool operator< (const Key &rhs_)
{
if (group == rhs_.group)
return name < rhs_.name;
return group < rhs_.group;
}
It’s almost as terse as before! That makes it even more embarrassing!
I’ve recently started Jeremy Kun’s A Programmer’s Introduction to Mathematics which neatly explains this process in the first chapter. As programmers/coders/software engineers/whatever, we all deal with these frustrations as we progress towards the mastery of a language or framework. We make these stupid mistakes and solve them, banging our heads against these walls thinking “but this is clearly correct!”
Thankfully, the more we bang our heads, the harder they become and the weaker the walls become. The walls always give way, but our heads never do, right? So, I suppose I should celebrate that my head is just a little bit harder now. 🙂
- The implementation of this requires that
operator==
is also defined, but that can itself be derived fromoperator<
too. I’m not sure why the implementation ofoperator!=
depends onoperator==
. I posit that it might be because allrel_ops
operators use exactly one comparison, but deriving eitheroperator==
oroperator!=
fromoperator<
would require two.↩ - Newtonian physics is probably the most accessible example of this. It doesn’t capture the nuances of quantum or relativistic physics, but it’s still pretty good in the environments in which we humans tend to operate. A huge chunk of engineering uses Newtonian physics and it works just fine.↩