I’ve always had trouble with the poachers wanting to be in charge of the gamekeepers. We see it all the time: developers wanting to control the code repository, the PMO wanting to decide what goes in the release. But my favorite is the application delivery team wanting to own release management. It’s not right and it needs to stop.
Now I know you’re thinking: I’m all good because we made it part of the service delivery team a decade ago. Well you got it wrong too.
OK: so if release management doesn’t report to apps and it doesn’t report to ops: who does it report to? Answer: the CIO, or at least it should in my view.
But before we can get there we need to rethink the purpose of release management. For many organizations release management has become a board-level discussion. The business has time-to-market imperatives that IT is just not satisfying. Application delivery wants to implement more agile ways of working, but release windows that are too infrequent stymie its ability to get perfectly good code out. And the data center is drowning not just in software changes but in hardware and infrastructure changes, and it needs to stem the flow so it can take a breath and understand the impact of all this change.
But we have to accept that change is going to happen. It is going to be more complex and carry more risk, and the cadence of releases is just going to keep increasing. I have a customer who releases 4 times a day and I read that Facebook releases 3 times an hour. So having four release windows a year might have been acceptable last century but it isn’t any more.
What is it that the operations teams need that they are not getting and what is it that the application delivery teams are doing such that it overloads the system?
In my experience there is a very long list of maladies that are surfacing as full-blown, business-critical problems now that the squeeze is on release management. I’m sure in your own organization you have your own list, but here are just some of the issues that are making release management the battleground.
Software quality is terrible. I know that we have had a focus on quality for decades now and we have created vast departments of testers and armed them with world-class tools. But the truth remains: initial build quality is unacceptably low, developers do too little unit testing and are allowed to get away with delivering code that has had no rigor applied to prove it. And this is because they are under time-to-market pressure. Everyone is rightly being business focused, but we are just not putting out code that is good enough. Why do we give the automated test tools to QA? Make the developers get 100% pass rates from automated testing before they give it to QA. The QA team did not put the bugs in the code, so they deserve to get better quality code when it is delivered.
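To make the "100% before QA" rule concrete, here is a minimal sketch of a handoff gate. It is an illustration only: the `AutomatedRun` record and the `ready_for_qa` function are hypothetical names, and I am assuming that skipped tests count against the build just as failures do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutomatedRun:
    """Summary of one automated test run (hypothetical structure)."""
    passed: int
    failed: int
    skipped: int = 0


def ready_for_qa(run: AutomatedRun) -> bool:
    """Allow the handoff only if tests actually ran and every one passed.

    Assumption: a skipped test is treated like a failure, and an empty
    run proves nothing, so it is rejected as well.
    """
    total = run.passed + run.failed + run.skipped
    return total > 0 and run.failed == 0 and run.skipped == 0


if __name__ == "__main__":
    print(ready_for_qa(AutomatedRun(passed=412, failed=0)))  # clean build
    print(ready_for_qa(AutomatedRun(passed=411, failed=1)))  # one failure blocks it
```

The point of the sketch is where the check sits: in the developer's build, before QA ever sees the code.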
Releases are too big. We have got into the habit of making releases that are the size of the Titanic and are likely to go the same way. These massive releases contain every last item the business analysts, project managers, release managers, architects and developers can squeeze into them. The releases end up having massive pre-requisites and co-requisites, contain dozens of requirements and hundreds of change requests, and result in thousands of code changes. The amount of churn these releases cause and the impact on the system’s stability introduce so much risk that it has become incalculable, and so we have stopped trying to estimate it. We have to have more releases, not fewer; they need to be smaller and thematic, and our mantra should stop being “what else can we squeeze in” and instead be “what more can we leave out”.
We have to trust each other. A typical release management process tests and tests and tests. We go from unit, to functional, to regression, to user acceptance, to integration, to performance, to pre-production (and no doubt you have a few more of your own) and we invariably retest what has already been tested. And we find errors that should have been found earlier. So we keep on doing this; we keep on checking the checkers instead of fixing the root cause. If we are letting errors through an earlier part of the testing cycle we need to fix the area that is missing things, NOT add more testing later to compensate. And better initial build quality will give us the time to do so. When we get there we have to trust our fellow testers and start doing large parts of the testing in parallel. This alone can halve the release time.
Approvers aren’t accountable. We set up elaborate meetings to get approval from stakeholders for the changes to go into production. We call them the million-dollar meetings. They happen every week, for two hours, and we walk through a spreadsheet of changes planned for the next release window. There are 60 people on the phone who earn over $100,000 each. And when we get to their item on the agenda they give us a non-committal “we’re happy with it” and we take this as positive affirmation that we can change the infrastructure. It has to stop. The release manager needs to collect electronic signatures, and when things go wrong all the approvers need to answer the question “why did you approve?” Not only do they need to be accountable; there need to be consequences.
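What might collecting signatures look like in practice? The following is only a rough sketch under my own assumptions: the `Approval` and `ApprovalLedger` names, the idea of a fixed list of required approvers, and the rule that every signature carries a written reason are all hypothetical, not a description of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Approval:
    """One named person's sign-off on one change (hypothetical record)."""
    change_id: str
    approver: str
    reason: str
    signed_at: datetime


class ApprovalLedger:
    """Tracks who signed what, and why, for each change."""

    def __init__(self, required_approvers):
        self.required = set(required_approvers)
        self._approvals = []

    def sign(self, change_id: str, approver: str, reason: str) -> Approval:
        if approver not in self.required:
            raise ValueError(f"{approver} is not a required approver")
        # A bare "we're happy with it" with no reasoning is not accepted.
        if not reason.strip():
            raise ValueError("an approval needs a stated reason")
        approval = Approval(change_id, approver, reason.strip(),
                            datetime.now(timezone.utc))
        self._approvals.append(approval)
        return approval

    def missing(self, change_id: str) -> set:
        """Required approvers who have not yet signed this change."""
        signed = {a.approver for a in self._approvals
                  if a.change_id == change_id}
        return self.required - signed

    def can_release(self, change_id: str) -> bool:
        return not self.missing(change_id)

    def why_approved(self, change_id: str) -> dict:
        """Answer "why did you approve?" for each signer of a change."""
        return {a.approver: a.reason for a in self._approvals
                if a.change_id == change_id}
```

With a record like this, the post-mortem question has an answer on file: `ledger.why_approved("CHG-1")` returns each approver's stated reason, and a change with anyone left in `ledger.missing("CHG-1")` simply does not go out.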
Who in the organization has the power to make these kinds of radical changes in how we do releases? Only the CIO. If these ideas come from the apps team or the ops team, the others are going to be suspicious.
It’s time release management reported to the CIO. It’s the only way.