How to design a system that never works, that you can't be blamed for
On making safe architectural decisions
Software is complicated, with many outside factors that can affect how well your system performs. As Ellen Ullman says:
We build our computers the way we build our cities -- over time, without a plan, on top of ruins
If your software is like a building, and your building falls down, you might get fired. So, it is best to have a prepared set of excuses.
Luckily for you, there are plenty:
The Network
it can be slow
it can not work
it can cost money, but you didn’t think that it would
it can be compromised
When someone complains, immediately ask them to check their bandwidth and say that the wifi might be spotty where they are, even if that is the cubicle next to you. Never mind that the reliability of the network can be designed around; you didn’t have time for that. And it is very complicated. Make sure to say things like “the system is only as good as its connection,” and roll your eyes when a network engineer starts to speak.
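For the record, "designing around" an unreliable network does not have to be very complicated. The sketch below is one minimal way to do it, assuming the failing call raises Python's built-in ConnectionError or TimeoutError; the function name call_with_retries and its parameters are hypothetical and not from any particular library.

```python
import random
import time


def call_with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on network-style errors with exponential backoff.

    `operation` is any zero-argument callable. `sleep` is injectable so the
    waiting can be stubbed out in tests.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the real failure
            # Wait base_delay, then 2x, 4x, ... plus some jitter so that
            # many clients do not all retry at the same instant.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Wrapping a flaky call is then a one-liner, for example call_with_retries(lambda: fetch_report(url)), where fetch_report stands in for whatever network call your system makes. Use this only if you are prepared to give up the excuse.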
Administrators and Users
Administrators who control your system and the users who attempt to use it are to blame in some cases, but you can create the suspicion that they are frequently to blame by constantly asking whether they are using it correctly. This works because users tend to think the problem is their fault, so you are simply playing into this weakness. This doesn’t go over as well with administrators.
Third-Party Software
Your software is, of course, made up of other software. The software beneath you likely has the reputation of being better than your software, but your manager might not know this. Operating system bugs do exist, and there are times when a compiler does something you don’t expect. While these cases are rare, marketing them as frequent can provide effective cover for not handling null conditions and other typical system behavior that you didn’t think of or test because you were playing video games instead.
There are two special cases of outside companies that you can shift blame onto easily: your software vendors and your hosting provider. It is even better if you use a cloud provider because it is a mix of both. If you use Microsoft libraries, you can blame them for having issues or doing upgrades that broke your system. If you host your code on Azure, you can check whether they are having an outage while your system is on fire and suggest that the outage is the likely cause. It isn’t. Make sure to forward your boss any cloud outage notices, even if they don’t affect you. If AWS SQS in Singapore is experiencing slowness, then who knows if this will affect your Azure installation in Iowa. Only time will tell.
Until then, when your system has bad days, you can always say things like, “Well, the system is online, so it’s only as good as its connection.”
Use a new and exciting technology that people have heard of
There is always one current hot technology. For some reason, the world cannot have two at a time. When this hot technology is at its peak, use a little of it in your system, then blame it when it goes out of fashion. Use something that the non-technical people who sign your checks have heard of, don’t understand, but seem to trust for reasons you don’t understand. Right now, this would be ChatGPT or its equivalents; before that, it was Machine Learning, and before that, Data Science. Your CIO might email you great candidates for this. When it turns out this technology isn’t as powerful as you thought, you can explain it as a group regret: a mistake we all made.
Focus on initial quality
One of the ways that cars are judged is via an Initial Quality Survey (IQS), which measures how much maintenance a car needs in its first year. Never mind that a car breaking down in its first year is likely a disaster; treat your software in the same way.
If your software immediately provides cost savings, better performance, or is better at any one thing that can be seen, point furiously at this aspect of it. This creates an initial impression of quality. Build a system that saves a lot of money upfront, buying you time to retire before the true costs set in. This provides you some buffer for when your software costs more, is not functional all the time, or is missing key features. The results might be wrong, but it now gets you the (wrong) results three times as fast.