What to Consider Before Embarking on a Rewrite

The itch to rewrite a piece of software from scratch can be hard to ignore. However, as many developers learn over the course of their careers, rewrites are tricky to pull off successfully because of the sheer amount of complexity and effort involved. A rewrite that is executed poorly or done for the wrong reasons will result in a loss or, if you're lucky, little to no gain. Thus, before embarking on that big rewrite, it’s important to consider why it’s required and how we can increase the likelihood of success.

Let's take a look at some of the steps we can take throughout the decision-making process. Typically, software engineers will be the first ones to raise the alarm because they're close to the source. After that happens, it’s important to investigate why. Intuition is a good indicator that something is up, but it doesn't necessarily point us to the root cause or the appropriate solution.

When we feel like a piece of software is becoming hard to deal with, it can be tempting to jump to conclusions and say that it has to be rewritten from scratch in a shiny new technology. This is like the example from "Thinking, Fast and Slow", where an investor picked Ford stock because he was impressed by their cars at an auto show. He had a tough question in front of him: "Should I invest in Ford stock?", but subconsciously answered an easier one: "Do I like Ford cars?". We all have that tendency, so we have to make sure we're asking the right questions, and not the easy one about which technology we like.

Therefore, before we start prescribing a solution, we must make sure we have a good understanding of the problem. What is it that we’re trying to fix here?

The proper reasons for requiring a change are roughly going to fall into one of two categories. Either it’s something needed in service of our users and that helps us with company objectives, or it has a big enough impact on developer productivity and happiness to make it worth our while.

If it's something related to users or product, then it's usually more obvious when there's an issue. When we look at metrics such as performance, weekly bug counts, incidents, and mean time to restore, and we can't find any explanation other than application-related problems, then it's clear we should fix something in the product. This doesn't necessarily mean a rewrite is in order (we'll investigate that question later), but it does mean something has to improve.

Now, if we’re talking about issues related to developer productivity, then humans are involved so the topic becomes quite a bit more complex. We can still look at metrics here, but it’s important to realize that when they relate to human behavior, the picture the metrics alone can paint will not be black-and-white. There’s going to be some gray area, and we'll need to interpret everything in context.
For example, if the cycle time is trending upwards, we should rule out other causes, like how the team breaks up stories, before concluding that it's caused by technical debt. If the team is healthy and has been stable for a while, then determining whether it's a tech or process issue should be pretty easy.
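
As a rough illustration of checking whether cycle time is really trending upwards, here is a minimal sketch in Python. The weekly figures are made up for the example, and a positive slope only tells us that something is changing, not why.

```python
from statistics import mean


def trend_slope(values):
    """Least-squares slope of the values against their position (units per period)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical weekly median cycle times in days (not real data).
weekly_cycle_time_days = [3.0, 3.5, 3.2, 4.1, 4.4, 5.0]

slope = trend_slope(weekly_cycle_time_days)
print(f"Cycle time changes by about {slope:.2f} days per week")
```

A slope close to zero suggests there is no real trend. A clearly positive one is a cue to start ruling out process causes before blaming the technology.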

Once we've identified the category that the problem falls in, have gotten a more objective view of the issue, and determined that it is indeed a technological problem, then we can look at the harder part of the decision, which is figuring out whether a rewrite is appropriate and whether the cost is acceptable.

You may have heard that Google rewrites most of its software every few years, but you're unlikely to be a Google-scale company with more than 20,000 engineers and billions of dollars in profits.

The reality is that rewrites are both costly and risky. The original code has been battle-tested for what may have been years. When we decide to rewrite, all that complexity will resurface and will have to be re-implemented. Knowing that, we can assume there will be surprises during development. Those surprises mean it takes longer to reach parity, and therefore longer before we can make any new product improvements. On top of that, things could happen during that period that cause the company to shift focus, and if we're caught in the middle of a rewrite, we may need to shelve it. That could mean losing months of work.

For those reasons, it's preferable to go for an incremental approach where possible. With this approach we try to identify the biggest pain points and incrementally start chipping away at them. That way, we'll start providing value much faster than a rewrite ever would. Additionally, it's less risky, because we can change direction in between the increments and won't lose a huge amount of work if we have to drop something.

In many cases it's also difficult to answer the question: Is the cost worth the gain? There's no easy answer for this. The best we can do is make a reasonable estimate based on a cost-benefit analysis. Doing this in some quantifiable way is good, as long as we are aware that it can never be completely accurate and only serves as a rough estimate. For any significantly complex piece of software, there will always be unintended consequences and we should be prepared to deal with those.

We've already seen that we tend to underestimate the cost, and while doing our analysis we should keep in mind that we often overestimate the benefits of switching technologies too. "Hype is the plague on the house of software" as Robert Glass put it in Facts and Fallacies of Software Engineering. Changes in tools and technologies typically only result in a 5 to 35 percent increase in productivity and quality, and that's after the initial learning curve during which productivity is lower. So while there are definitely cases where technology changes are warranted, we should be aware of the biases that lead us to misjudge both costs and benefits. Rewriting a React app in Vue probably won't benefit you much, but if you are still hiring COBOL developers, it might be worth thinking about incrementally replacing that system with something more modern to improve scalability and maintenance costs.
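
To make the cost-benefit estimate a bit more tangible, here is a minimal back-of-the-envelope sketch in Python. It rests on simplifying assumptions that are not from any real project: the team size stays constant, no new product work ships during the rewrite, and the productivity gain applies in full right after the rewrite (so the learning curve is ignored). The 12-month duration and the gain percentages are illustrative values within the range mentioned above.

```python
def break_even_months(rewrite_months: float, productivity_gain: float) -> float:
    """Months after the rewrite before the extra output repays the time spent on it.

    rewrite_months: how long the team is fully occupied by the rewrite.
    productivity_gain: fractional gain afterwards, e.g. 0.10 for 10 percent.
    """
    if rewrite_months < 0:
        raise ValueError("rewrite_months must not be negative")
    if productivity_gain <= 0:
        raise ValueError("productivity_gain must be positive")
    # Each month after the rewrite recovers `productivity_gain` months of lost work.
    return rewrite_months / productivity_gain


# Hypothetical scenario: a rewrite that takes 12 months.
for gain in (0.05, 0.15, 0.30):
    months = break_even_months(12, gain)
    print(f"{gain:.0%} gain: break-even after about {months:.0f} months")
```

Even under these optimistic assumptions, a modest gain takes years to pay back the rewrite, and any schedule overrun pushes the break-even point out further.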

Now, once we've truly exhausted all other options, and are certain that the benefits will outweigh the costs, we can conclude that a rewrite is in order and be reasonably confident that it's the right solution.