- Technical debt is a popular metaphor for communicating the long-term implications of architectural decisions and trade-offs to stakeholders, but its usefulness has limits.
- Architectural decisions can add or remove technical debt, but the change is hard to quantify in either financial or technical terms.
- Quality Attribute Requirements (QARs) can help to overcome these limitations by reframing stakeholder discussion in terms of capabilities that the system must possess instead of in terms of work that needs to be done.
- Teams make better decisions when they understand and acknowledge the impact of those decisions on technical debt.
- Teams need to validate each decision empirically as soon as possible to limit the cost of reversing it.
- Deferred maintenance is the engineering world's analogue of technical debt and may be a better metaphor, but it too has limitations.
What is Technical Debt?
Technical debt is a metaphor used in software development to help people understand that expedient short-term decisions carry a long-term cost. According to the metaphor, this cost grows over time in the way interest accrues on a loan. A more accurate statement is that the more work a team does based on a decision, the more work it may have to do to correct that decision later.
This term has gained a lot of traction in the software industry. Technical debt is not always bad; it is sometimes beneficial (e.g., a quick solution that gets a product to market). The concept was first introduced by Ward Cunningham:
Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. . . . The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise. 1
Two important ideas stand out in this passage. First, technical debt is often a useful expedient for actually shipping a software product. This matters: no product is perfect, and many software projects have been sunk by endless “polishing” of code that was good enough in the short run. Second, those same expedient decisions may need to be reconsidered once long-term supportability is taken into account.
The term “Technical Debt”, however, can be confusing when it is not clearly defined. In their book Managing Technical Debt, Kruchten, Nord, and Ozkaya provide an excellent overview of the concept and how to manage it. They offer the following definition of technical debt:
In software-intensive systems, technical debt consists of design or implementation constructs that are expedient in the short term but that set up a technical context that can make future change more costly or impossible. Technical debt is a contingent liability whose impact is limited to internal system qualities—primarily, but not only maintainability and evolvability. 2
We like this definition because it focuses more on the impact of technical debt than on the financial debt metaphor, which only partially captures the issue. As stated in Continuous Architecture in Practice,
The focus on maintainability and evolvability is key to how to think about technical debt. It implies that if your system is not expected to evolve, the focus on technical debt should be minimal. For example, software written for the Voyager spacecraft should have very limited focus on technical debt because it is not expected to evolve and has limited maintenance opportunities. 3
We refer readers to Managing Technical Debt (see endnote #2) for a complete treatment of this topic. There has also been a lot of empirical research on technical debt; for some examples, see the references section in Managing Technical Debt as well as “Technical Debt”, “An Empirical Model of Technical Debt and Interest”, and “Lehman’s Law”.
Categories of Technical Debt
According to Kruchten, Nord, and Ozkaya, technical debt can be classified into the following three broad categories:
- Architecture: Debt in this category results from architectural decisions made during the software development process;
- Code: This category includes expediently written code that is difficult to maintain and evolve, and
- Production infrastructure: This category includes decisions focused on the infrastructure and code that are used to build, test, and deploy a software system.
In addition, technical debt can be either accidental or intentional. Many practitioners consider latent defects to be part of technical debt, and these fall into the “accidental” category. Architectural decisions, in contrast, are almost always trade-offs between at least two conflicting QARs and therefore fall into the “intentional” category.
Technical Debt Metaphor Shortcomings
Every metaphor has shortcomings, and technical debt is no exception:
- It’s hypothetical. An issue identified as technical debt may never need to be resolved. The code may be ugly, but if it works and has few side effects or dependencies, teams usually have better things to do with their time, such as providing value for customers. Implying that it needs to be fixed overstates the contingent liability that the issue represents. If the effects of the issue can be localized, the issue may not need to be addressed.
- Unlike financial debt, which has a lender, technical debt comes with no external constraint on how much of it teams take on. Lenders assess the likelihood of a debtor repaying, and they will seek assurances that create consequences for default. Habitual abusers of debt eventually have their access to credit frozen.
Software teams have fewer constraints; for example, architecture, design, and code reviews, either automated or manual, may prevent accidental technical debt, but in complex systems, many errors cannot be caught by reviews and only become visible in the running system. Error budgets may prevent teams from adding new features until they resolve defects, but the errors have to be identified for this to work, and much technical debt, like an iceberg’s mass, remains hidden.
In most cases, however, software teams can “issue” as much intentional technical debt as they like, and no one will hold them accountable for unsustainable decisions. In fact, if they work in an organization that uses a project-oriented funding model, all they have to do is to release the project into production where it becomes IT operations’ problem, absolving them of responsibility.
- The cost of a deferred decision can grow rapidly over time. There is, in fact, no such thing as a deferred decision. The cost of taking a different approach in the future is more than the cost of the additional work; if the code to be reworked has many dependencies, the cost of rework can be dramatically higher, combinatorially higher in fact (see “In Search of a Metric for Managing Architectural Technical Debt”). The compound interest analogy behind technical debt implies that rework becomes more costly, but it obscures the fact that dependencies are what drive the cost escalation. Using design tactics such as encapsulation and modularization, teams can reduce or even completely flatten the growth of the cost of change.
- Short of catastrophe, most organizations ignore technical debt. This goes back to the first point; the problem may not need to be solved, and even if it really should be solved, business stakeholders may not value resolving it. The real problem is that the technical debt metaphor, by focusing only on cost, does not communicate what the business stakeholders are giving up by ignoring the technical debt. Arguments about long-term supportability don’t go very far with people whose horizon of concern is often quarterly or annual. The risk of not being able to scale, or of being vulnerable to a devastating security breach, is often more compelling.
- Technical debt is very hard to quantify in financial terms. As explained by Kevlin Henney during his keynote address at QCon Plus 2022 (see Ben Linders’s InfoQ article titled “Technical Debt is Quantifiable as Financial Debt: an Impossible Thing for Developers”), teams can’t measure the financial value of their technical debt. Managers find this baffling; they have heard for years that “You can’t manage what you can’t measure” 4, so the inability to express the financial impact of technical debt sometimes makes teams look unwise.
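The dependency point in the list above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the module names and the `DEPENDENTS` graph are hypothetical, and the rule that a module with a stable interface absorbs a change without passing it on is a simplifying assumption, not a measurement technique taken from the cited paper.

```python
from collections import deque

# Hypothetical dependency graph: each key is a module, each value lists the
# modules that depend on it directly. All names are illustrative only.
DEPENDENTS = {
    "pricing": ["orders", "invoicing"],
    "orders": ["checkout", "reporting"],
    "invoicing": ["reporting"],
    "checkout": [],
    "reporting": [],
}


def affected_modules(graph, changed, stable_interfaces=frozenset()):
    """Return the modules that may need rework when `changed` is reworked."""
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in graph.get(current, []):
            if dependent in seen:
                continue
            seen.add(dependent)
            # Assumption: a module with a stable (encapsulating) interface
            # absorbs the change internally, so it is not propagated further.
            if dependent not in stable_interfaces:
                queue.append(dependent)
    return seen


# Without encapsulation, reworking "pricing" ripples through the whole graph.
print(sorted(affected_modules(DEPENDENTS, "pricing")))
# → ['checkout', 'invoicing', 'orders', 'reporting']

# With stable interfaces on the direct dependents, the ripple stops there.
print(sorted(affected_modules(DEPENDENTS, "pricing", {"orders", "invoicing"})))
# → ['invoicing', 'orders']
```

The count of affected modules is a rough proxy for rework, not a cost figure; it shows why the growth in cost tracks dependencies rather than elapsed time.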
Architectural Decisions and Technical Debt
As we stated in an earlier article, software architecture is about decisions driven by QARs, and these decisions may add or remove technical debt – see Figure 1. The timing of these decisions determines which approach a team uses to create its architectural design. Making most architectural decisions at the beginning of a project, often before the QARs are precisely defined, results in an upfront architecture that may not be easy to evolve and will probably need significant refactoring once the QARs are better defined. In contrast, making a continuous flow of architectural decisions as part of each Sprint results in an agile architecture that can better respond to QAR changes.
Almost every architectural decision is a trade-off between at least two QARs. For example, consider security vs. usability. Regardless of the decision being made, it is likely to increase technical debt, either by making the system more vulnerable by giving priority to usability or making it less usable by giving priority to security. Either way, this will need to be addressed at some point in the future, as the user population increases, and the initial decision to prioritize one QAR over the other may need to be reversed to keep the technical debt manageable. Other examples include scalability vs. modifiability, and scalability vs. time to market.
These decisions are often characterized as “satisficing”, i.e., “good enough”. You can almost always do better, but you choose to stop when the result is good enough. As stated in Continuous Architecture in Practice, “Architectural decisions can add or remove technical debt.” 5 However, how much is added or removed is hard to quantify in financial or even technical terms.
Unless the team members are extremely knowledgeable or very lucky, some of their architectural decisions, regardless of the approach they are using, may need to be adjusted or even reversed at some point in the future, based on the information provided by the feedback loops (see Figure 1). Adjusting or reversing existing architectural decisions creates additional work, competing with other tasks already in the backlog. Those tasks are usually considered a higher priority as they are expected to deliver useful functionality to the stakeholders. As a result, work associated with adjusting or reversing architectural decisions may be postponed, increasing the software system’s “Technical Debt”.
Figure 1: Quality Attributes, Architectural Decisions, Technical Debt, and Feedback Loops
Every time team members make a decision, they must grapple with the possibility that the work they do based on that decision may need to be undone in the future. As a result, they need to treat decisions as hypotheses to be confirmed or rejected within a fairly short period of time, so that they do not keep building on work that will later need to be undone.
Because all decisions, at least until validated, hold some possibility of creating work that needs to be undone, a team can do one or more of the following:
- They can postpone decisions until they are absolutely necessary
- When they must make a decision, they can validate the decision as soon as possible to limit potential cost exposure
- They can reduce the growth of dependencies by using design techniques like encapsulation and abstraction
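As a minimal sketch of the last option, the example below hides a storage decision behind an interface so that reversing the decision touches one class rather than every caller. The names `OrderStore`, `InMemoryOrderStore`, `AppendOnlyOrderStore`, and `apply_discount` are hypothetical and exist only for illustration.

```python
from typing import Protocol


class OrderStore(Protocol):
    """The only surface the rest of the code is allowed to depend on."""

    def save(self, order_id: str, total: float) -> None: ...

    def total_for(self, order_id: str) -> float: ...


class InMemoryOrderStore:
    """Initial, expedient decision: keep order totals in a dictionary."""

    def __init__(self) -> None:
        self._orders: dict[str, float] = {}

    def save(self, order_id: str, total: float) -> None:
        self._orders[order_id] = total

    def total_for(self, order_id: str) -> float:
        return self._orders[order_id]


class AppendOnlyOrderStore:
    """A later, different decision: keep every write and read the latest one."""

    def __init__(self) -> None:
        self._log: list[tuple[str, float]] = []

    def save(self, order_id: str, total: float) -> None:
        self._log.append((order_id, total))

    def total_for(self, order_id: str) -> float:
        for logged_id, total in reversed(self._log):
            if logged_id == order_id:
                return total
        raise KeyError(order_id)


def apply_discount(store: OrderStore, order_id: str, percent: float) -> float:
    """Calling code knows the interface, not the storage decision behind it."""
    discounted = store.total_for(order_id) * (1 - percent / 100)
    store.save(order_id, discounted)
    return discounted


# The same calling code works unchanged with either storage decision.
for store in (InMemoryOrderStore(), AppendOnlyOrderStore()):
    store.save("A-1", 200.0)
    print(type(store).__name__, apply_discount(store, "A-1", 50))
```

Because `apply_discount` depends only on the interface, swapping one store for the other does not add to the set of callers that must change, which is the sense in which encapsulation limits the growth of dependencies.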
Understanding and acknowledging the impact of their decisions on technical debt help teams make better decisions. However, since it is very hard to quantify technical debt in financial terms, technical debt is not helpful in assessing the cost of a decision, and can hardly be used as a precise decision evaluation tool to select an appropriate trade-off between at least two QARs.
Going back to the security vs. usability example, it would be difficult to estimate the financial impact of prioritizing usability over security when designing a minimum viable architecture (MVA) in order to deploy an MVP in a short timeframe. A better approach would be to make a minimal set of technical decisions that are tested and evolved using empiricism over time. These decisions should be complemented by a minimal set of architectural practices that help the team to keep the product architecturally viable while they evolve it.
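One way to picture a decision that is "tested and evolved using empiricism" is to record it as a hypothesis tied to a measurable QAR and check it against observations. The sketch below is an assumption-laden illustration: the `DecisionHypothesis` class, the example decision, the 200 ms threshold, and the latency samples are all hypothetical.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class DecisionHypothesis:
    """An architectural decision recorded as a testable hypothesis."""

    decision: str
    quality_attribute: str
    max_p95_ms: float  # hypothetical threshold derived from a QAR


def p95(samples):
    """95th percentile of the samples (needs at least two values)."""
    return quantiles(samples, n=100)[94]


def validate(hypothesis, latency_samples_ms):
    """Return (holds, observed_p95) for the measured latencies."""
    observed = p95(latency_samples_ms)
    return observed <= hypothesis.max_p95_ms, observed


# Hypothetical decision and measurements, for illustration only.
hypothesis = DecisionHypothesis(
    decision="Check credentials on every request instead of caching sessions",
    quality_attribute="performance",
    max_p95_ms=200.0,
)
measured = [120.0, 135.0, 150.0, 140.0, 180.0, 160.0, 145.0, 155.0, 130.0, 170.0]

holds, observed = validate(hypothesis, measured)
print(f"{hypothesis.decision}: holds={holds}, observed p95={observed:.1f} ms")
```

A failed check does not put a price on the decision; it signals early that the decision may need to be revisited before more work is built on it.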
Still, communication is a very important part of any architectural effort, and leveraging the technical debt metaphor helps teams communicate the long-term ramifications of these decisions, even if the amount of technical debt associated with these decisions can’t be accurately quantified.
Perhaps a Better Metaphor: Deferred Maintenance?
Technical debt has an analogue in the world of engineering: deferred maintenance. Deferred maintenance is the reason why well-designed bridges fail, why well-designed buildings collapse, and why well-designed airplanes fall from the sky. In the physical world, increasing entropy has a cost, and failing to keep ahead of it has catastrophic consequences.
The advantage we have in considering the cost of deferred maintenance in the physical world is that it’s easy to see how quickly costs increase when maintenance is deferred: a simple new sealing coat of paint may be sufficient if the underlying steel is sound, but once rust sets in, all the old paint and the rust must be removed before repainting. If cracks develop, the steel may need to be reinforced, and if they are bad enough, entire sections of the structure may need to be replaced, if that is even possible.
But software is more complex than this. Not only are flaws not easily inspected or observed, but interconnections between components can cause rework to ripple throughout the code base. Even “simple” component replacements can be challenging because new components may have side effects, or they may require different parameter data to which the calling code does not have access. And if the changes are more deeply intrusive, like algorithmic changes, the increase in cost can quickly become exponential.
Even the term “maintenance” has limitations as a metaphor; its normal meaning suggests simple upkeep due to wear and tear, but software does not wear with use. Changes to software can be precipitated by external events like changes to operating systems or frameworks, vendors going out of business, or new versions of infrastructural software. More disruptive changes are caused by changes in customer behavior, business operations, or organizational strategy.
There are limits to every metaphor, and sometimes we must leave them behind to find a better model to help us make decisions.
Most teams do not think that the decisions that they are making now will need to be undone and reworked at some point in the future. The reality, however, is that things change and different approaches are needed to address these changes. While teams cannot predict the future, they can run experiments to detect the potential for change before the impact of that change becomes intolerable.
Technical debt is one example of known future change work. It occurs when a team makes a decision to defer work that they are fairly sure needs to be done. The debt metaphor implies that the cost of dealing with this change will go up exponentially over time, like compound interest. Paradoxically, as we pointed out in this article, technical debt is very hard to quantify in financial terms, which limits its usefulness as a cost model to evaluate decisions.
Although technical debt is helpful in communicating the technical ramifications of the team’s decisions to stakeholders, a better approach may be to reframe the discussion using Quality Attribute Requirements to explain to the stakeholders the capabilities that the software system must possess instead of debating the amount of work that needs to be done.
Finally, we would like to thank Thomas Betts, Murat Erder, John Klein, Philippe Kruchten, and Eoin Woods for reviewing an earlier version of this article.
1. Ward Cunningham, “The WyCash Portfolio Management System,” ACM SIGPLAN OOPS Messenger 4, no. 2 (1992).
2. Philippe Kruchten, Rod Nord, and Ipek Ozkaya, Managing Technical Debt: Reducing Friction in Software Development (Addison-Wesley, 2019).
3. Murat Erder, Pierre Pureur, and Eoin Woods, Continuous Architecture in Practice (Addison-Wesley, 2021).
4. This quote is often attributed to Peter Drucker and to W. Edwards Deming.
5. Murat Erder, Pierre Pureur, and Eoin Woods, Continuous Architecture in Practice (Addison-Wesley, 2021).