How to ignore issues and keep secrets
Problems need to be big when you fix them, so that people know that you are strong
This thing has been happening in production, just every once in a while.
You see it in the log files, not when you are looking for it, but while you are debugging other issues. It is an odd error message that makes no sense to you, but it isn’t really your concern. You didn’t log in today to shave a yak, so you ignore it but make a “mental note” of it:
table ‘Customers’ does not exist or you do not have the correct permissions for access
Making a mental note is a funny thing because it isn’t just a note: it is also a feeling. And this feeling, mild uneasiness, also gets serialized along with the mental note¹. So when you see it again, while you are looking at yet another problem, you just reinforce that feeling: “oh no, not that again.”
You know that you should look into it, but you are already so busy, and it is just more work. And this work can’t be that urgent because nobody knows about it yet. Because you aren’t going to tell them.
Maybe it is nothing. All your years of experience are telling you that nothings normally turn into somethings, but all your years of experience are also telling you that wild goose chases and bugs that don’t have enough information are incredibly frustrating.
It does seem weird that the customer table isn't there or that the system thinks it isn’t there. What could that be? A bad connection string—wow, that would be bad. Seems like we would have heard about that by now. Does the system get confused in the ORM-written SQL somewhere? Man, it seems weird. Anyway.
So you continue to ignore it. No alarms are going off, and the part of the system that gets the error² has 57 retries, and this is only 1 or 2 of them. If anything, this is an automated test of the retry infrastructure, and the tests are passing.
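If you ever did want to know how often those retries fire, the retry wrapper could report it instead of swallowing it. What follows is only a sketch under stated assumptions: `TransientError`, `call_with_retries`, and `make_flaky` are hypothetical names made up for illustration, not anything from the real system. The only number taken from the story is the 57.

```python
import logging

logger = logging.getLogger("retry_sketch")


class TransientError(Exception):
    """Hypothetical stand-in for the 'table does not exist' error."""


def call_with_retries(operation, max_retries=57):
    """Run operation, retrying on TransientError.

    Returns (result, retries_used) so the caller can see how many
    retries were spent, rather than only seeing the final success.
    """
    retries_used = 0
    while True:
        try:
            result = operation()
        except TransientError as exc:
            if retries_used >= max_retries:
                raise  # out of retries: let the error surface
            retries_used += 1
            # Every retry leaves a trace that can be counted or alerted on.
            logger.warning(
                "retry %d of %d after: %s", retries_used, max_retries, exc
            )
        else:
            return result, retries_used


def make_flaky(failures):
    """Build a hypothetical operation that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientError("table 'Customers' does not exist")
        return "ok"

    return operation


result, retries_used = call_with_retries(make_flaky(2))
print(result, retries_used)  # prints: ok 2
```

Returning the retry count alongside the result is a design choice, not the only option: a counter metric or a log-based alert would do the same job. The point is that "1 or 2 of 57" becomes a number somebody can watch grow.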
Plus, anytime you bring up a problem and don’t have a solution, it’s just adding to the long list of tech debt items you are trying to get done. It’s just a future argument in a sprint planning session, and you are an easy-going person, right? We all remember what happened to Terry when he was labeled negative.
So you don’t create a card or look into it further. Don’t even think about it: how bad it would be if the customers table didn’t exist for a few seconds a day, or what the secondary effects of that would be.
Next thing you know, you have an Ugh field and you can’t think about it, even if you wanted to. When you read the logs you don’t even see it anymore, or notice that it is happening every morning now.
One Sunny Day
Then one day a production incident happens where the Customers table has mysteriously disappeared and you get to say:
Oh wow this has been happening for months
and are somehow still considered a professional.³
¹ Like how the ANSI_NULLS and QUOTED_IDENTIFIER setting values get saved with a stored procedure in SQL Server, nerd.
² FelineFactoryServiceProvider2
³ This particular example is real, and was caused by a database maintenance process running much later than it should have, which briefly caused a table to disappear as an OPTIMIZE TABLE statement was run on it, causing secondary effects.