Integrity, Uncertainty and Optimism…

Date: April 12th, 2023

#codinghumans – another thread of writing…

Managing any effort to bring something new into the world is challenging. Even a baby being born has a vast uncertainty around it, yet millions of years of evolution have made the underlying process pretty reliable, albeit uniquely weird, messy, scary and wonderful… What do I know, having been simply an observer of the process? I digress…

Our culture has an insatiable appetite for new technology. My background training is in electronic systems, which is also the discipline I earned my DPhil in. Back in 1999, having worked for a few years in academia, I moved to Canada and into industry. The pressures changed; the focus was now on building things that worked – turning ideas into technology that could go out into the world on its own. And for organizations that do this kind of work, doing that reliably is a big deal. No delivery = no money = no job (no pressure…)

Since 2003 I have worked as an independent consultant, having spent a few years working with a talented group of people making groundbreaking technology in a company that ultimately stumbled and shed half its workforce (including yours truly) as it recovered into a shell of its former self. The contacts I made when I first arrived in Canada, and with others in that company and industry, gave me enough of a foothold to establish a new niche, with something of a reputation for knowing stuff and getting s*** done.

And because people are endlessly fascinating – I am also a Coach. I have found in the last decade this has a lot to do with how I approach engineering work.

A strong personal value of mine is integrity. This can be a tough place to live when it comes to my daily exercise of staring down uncertainty. When a problem becomes known, until you have solved it – you have not solved it. And until it is solved, the unknown is ever-present and tends to bite. There is an art to dealing with the unknown; ensuring it has as few places to hide as possible is a good part of it. However, it does require tenacity to stare it in the face and stare it down when it comes for you. It always does.

Meanwhile, keeping your word to people outside this arena of combat: that all is actually well, and the mess will be cleaned up soon. Re-promising and re-negotiating when required. Integrity.

A few years back, a few friends and I (let's call them Mike and Norm, because those were their names) were engaged in psychological combat up against a deadline. Both of these guys are brilliant. Mike taught me something in the process of developing a tight partnership over a few weeks. He had already impressed me by never letting any problem linger. Once he had his teeth in, he would not let go. We were going to need that.

There was a *lot* of pressure to launch. The problem was that this particular piece of audio hardware, which had just been designed, built and prepared for shipping, had a software problem. It would work perfectly and then, out of the blue – stop working perfectly. Or stop doing anything at all. The display froze; the audio fell silent. It became a beautifully-designed brick – until the power was switched off and on again, when it came back to life as if nothing had ever happened. We figured out the mean time to failure was somewhere around 20 hours, but for any particular unit it could happen in 5 minutes or 5 days… There was no way it could be shipped with that flaw.

[Caution – technical garblespeak follows...]

The good news was we had a lot of units on hand (because, as the sales team kept reminding us, we really needed to ship them). We enlisted some help and got a shelf-full of 20 units into our lab. Each evening before heading home, we would start them all up (keeping the lab nice and warm for when we came in the following morning), and when we returned there would be 2-5 units that had frozen. These were gold – the only clues we had were buried deep in the memory and CPU state. Very, very carefully, Mike and I would connect up the JTAG debugger to talk to the DSP-CPU without causing it to reset and losing that invaluable data, and try to find out what was up.

Initially it made no sense; the program counter had ended up in the middle of nowhere. Something had screwed up the stack pointer – but how?

It took us two weeks – two painstaking weeks. Each session Mike and I would throw ideas back and forth between us, puzzling away at what could have caused the issue. Uncertainty in no uncertain terms. We just had to face it down. We kept fielding questions. People were jumpy. We kept the door closed and got on with it.

Finally, one Thursday morning, we found it. Some operating-system code had been written several years before by people who had since left the company. It had been working in multiple products, seemingly flawlessly. We found two machine instructions in a particular order which, if an interrupt occurred at exactly the right moment, would leave the machine in an incorrect state and screw up the stack. And our application just happened to be the one that used the features of this operating-system code that made this likely enough to be a problem. A one-instruction-wide race on a 300 MHz processor that would be hit, we figured out, with a probability that matched our mean time-to-frozen of 20 hours. The solution? Reverse the two instructions and close the hole that had been inadvertently left there and had never caused a problem before.
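
For the curious, here is a rough back-of-envelope sketch of how a window that small can still add up to a freeze every 20 hours or so. Only the 300 MHz clock comes from the story. The single-cycle instruction, the interrupt rate and the number of passes through the vulnerable code per second are made-up illustrative numbers, not the real figures from that product, and the model assumes interrupts arrive independently of the code being run.

```python
# Back-of-envelope model of a one-instruction race window.
CLOCK_HZ = 300e6          # processor clock, from the story
WINDOW_S = 1 / CLOCK_HZ   # assumption: the vulnerable instruction takes one cycle


def mean_time_to_failure_hours(passes_per_s, interrupts_per_s, window_s=WINDOW_S):
    """Estimate the mean time to failure, in hours.

    passes_per_s and interrupts_per_s are hypothetical inputs. Interrupts
    are assumed to arrive independently of where the code happens to be.
    """
    # Chance that an interrupt lands inside the window on any single pass.
    p_hit_per_pass = interrupts_per_s * window_s
    # Expected number of hits (freezes) per second of running.
    failures_per_s = passes_per_s * p_hit_per_pass
    return 1 / failures_per_s / 3600


# Hypothetical workload: 4 passes through the vulnerable code per second,
# with 1000 interrupts per second.
mttf = mean_time_to_failure_hours(passes_per_s=4.0, interrupts_per_s=1000.0)
print(f"{mttf:.1f} hours")
```

With those invented rates the estimate lands at roughly 21 hours: rare enough to sail through years of use in other products, common enough to freeze a few units on a shelf overnight.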


To do this kind of work requires a kind of mental toughness, an ability to distil frustration into laser-like focus and tenacity. A key is the ability to hold that. Frustration and anger are incredibly powerful tools. Anger says ‘something is wrong’ and the energy to deal with it is right there, along with tremendous focus. A mis-step and it can be lost; become someone else’s fault (“they screwed up, if only they had been more careful, they need to fix it…”), or my fault (“damn it why am I such an idiot? stupid fool. I thought you knew how to do this. This does not look good for me… better keep my head down…”) and the power dissipates – the opportunity to transcend, lost.

A developer who knows how to wield anger is a force of nature. It takes courage (“I can do this!”), self-belief (“I am smart enough to figure this out!”), responsibility/ownership (“I am it”), humility (“I need others to help me break-out of the mental hole that keeps me not solving this…”). It can be a vulnerable place…

And optimism. I have never met a productive engineer who was not an incurable optimist. This also leads us to make promises that are very challenging to keep once our friendly antagonist (Msr. Unknown) shows up. It probably has a lot to do with why, when presented with a deadline, most productive engineering folks I know see it as both a challenge and a joke at the same time.

Strangely, over time, doing the impossible gets easier. And, like in any other discipline, there lies a path to mastery…

12Apr23