Why, after all this time, do we still have so many problems?
With years of collective experience in building software, it feels like we should have all the answers. We apply best practices, looking to both our own experience and research to show us the right way to create technology and build a productive organization. So why, after all this time, do we still have so many problems? The technology stack has become more powerful, and likely more robust. Yet we’re still looking for that next great solution that will pull our messy, complex systems together. Just how many best practices do you need, anyhow? Let’s look at some of the main players.
Microservices
The hope:
Splitting the application into individual pieces with well-defined APIs between them will give us a clear separation of responsibility, allow more work to be managed independently by each team, and create systems that can scale and be modified fluidly.
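To make that concrete, here's a minimal sketch of what one of those pieces might look like: a tiny, hypothetical orders service whose API is the contract other teams build against. FastAPI, the endpoint paths, and the Order model here are illustrative assumptions, not a recommendation of any particular stack.

```python
# A minimal sketch of the "well-defined API" idea: one small service, owned by
# one team, with a typed contract at its boundary. Everything here (service
# name, endpoints, fields) is hypothetical.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="orders-service")

class Order(BaseModel):
    order_id: str
    customer_id: str
    total_cents: int

# In-memory stand-in for whatever storage this team manages independently.
_ORDERS: dict[str, Order] = {}

@app.post("/orders", response_model=Order)
def create_order(order: Order) -> Order:
    _ORDERS[order.order_id] = order
    return order

@app.get("/orders/{order_id}", response_model=Order)
def get_order(order_id: str) -> Order:
    if order_id not in _ORDERS:
        raise HTTPException(status_code=404, detail="order not found")
    return _ORDERS[order_id]
```

The appeal is that the schema and routes above are the whole agreement: as long as they hold, the team behind them can change anything else without a meeting.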
The reality:
No one can agree on what size or shape each individual service should take. Is it a single function? A unit of business requirements? Who decides those API definitions? How are you documenting it? The end result is often that you have a mish-mash of components defined by different scopes, core functionality that no one has figured out how to remove from the monolith you started with, and maybe even more meetings to keep track of it all.
On top of that, you might have teams that have decided to use their independence to introduce different programming languages or new platform requirements, adding even more complexity to the stack as a whole. The worst is when this becomes a grand rewrite project, with different versions of the same core features being built in parallel. Someday you'll switch over to the fully microserviced v2.0, right?
Infrastructure as Code
The hope:
Automating the process to provision and configure infrastructure makes for a scalable and resilient system. Essential policies are baked right into the tooling. The result is readable, consistent, and easy to audit. Unneeded resources can be torn down without breaking a sweat. Developers can set up their own infrastructure whenever they need to, putting them in charge of the entire software deployment lifecycle.
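As a rough picture of what that promise looks like in practice, here's a minimal sketch assuming Pulumi's Python SDK and the AWS provider; the resource names and tags are made up.

```python
# A minimal infrastructure-as-code sketch, assuming Pulumi with the AWS
# provider. Resource names and tags are hypothetical; the point is that the
# desired infrastructure lives in reviewable, repeatable code instead of a
# console session.
import pulumi
import pulumi_aws as aws

# An S3 bucket for build artifacts, described declaratively. Required tags
# (or encryption, logging, and so on) can be enforced here rather than
# remembered by whoever clicks through the console.
artifacts = aws.s3.Bucket(
    "build-artifacts",
    tags={"team": "platform", "managed-by": "pulumi"},
)

# Export the generated bucket name so other stacks and scripts can find it.
pulumi.export("artifacts_bucket", artifacts.id)
```

Check that into version control and every change to your infrastructure gets a diff, a review, and a way to roll it back.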
The reality:
IaC works beautifully where it's been applied. Yet somehow there are critical systems out there still being managed by hand, and the planning meetings for migrating away from that have dragged on and on. Teams seem to automate only the parts they find easy, not the parts that are most critical. Which you can understand, because no one has worked out a good way to test it. Your current disaster: a tooling change left a pile of resources orphaned, and the only way to deprovision them is by hand, once you manage to find them all. But at least the mess of shell scripts that makes up the rest of your infra tooling is being checked into GitHub. It's a start.
CI/CD
The hope:
"Never deploy on Friday" is a thing of the past. With continuous deployment and feature flagging new code can be pushed as soon as it's ready, and continuous integration ensures that tests always pass and package requirements are validated. Bug fixes get rolled out ASAP too. The test/build/release cycle runs smoothly and can fade into the background while the real work happens.
The reality:
There's that one CI step that kept failing, so the team marked it as optional. You really need to add another linter for the docs, since no one ever seems to run spellcheck. New requirements for supply chain security have been flagged as critical, but since you don't really have an ops team anymore, the engineering managers have been playing hot potato over who's going to be in charge of implementing them.
Agile
The hope:
Finally, a process that ensures the important work happens on time, makes it easy to change direction when needed, and keeps everyone aware of what's on deck. No more getting stuck in requirements-gathering loops that never seem to end, sitting through hours and hours of status update meetings, or being so bogged down in planning that no real work ever gets done.
The reality:
It turns out that “agile” encompasses so many things that arguing about which flavor to pursue takes up as much time as those endless status meetings did. You tried to resolve this by sending everyone to training, but there are at least as many options there, and figuring out which ones contradict each other is giving you a headache. Finally, you hired a Scrum-certified project manager, who on day one insisted on throwing out all the old processes and doing things a new way, by the book. Any time you ask your peers at other companies to help you understand why this is turning into a mess, they tell you that you’re doing agile wrong and send you a link to yet another variant that they swear will fix everything. Meanwhile, you’re still trying to ship software. The more motivated team members seem to be managing it, mostly because they’ve figured out how to check all the boxes for whichever flavor of agile you’re doing this week while still completing new features.
I'm being a little harsh, but I'm pretty sure you've seen at least some of these failure modes in your own organization. Friends in the industry have probably told you their own horror stories, too. Thankfully, your software stack is a mix of these hopes and realities: best practices are serving you well in some areas, and you probably agree about where they're still falling short. But if following best practices isn't a magic solution, what should we be doing instead?
I’ll talk about that next time.
Photo credits
Photo: https://www.flickr.com/photos/wocintechchat/25900866622/in/photostream/
Photo credit: CC-BY #WOCinTech Chat