Legacy system failures expose the application knowledge gap’s harmful risks

February 24, 2021

February 24, 2021

by Steve Brothers

Government system failures during the rush to provide public benefits to alleviate the economic effects of the COVID-19 pandemic publicly exposed the mainframe knowledge crisis that also threatens financial institutions, healthcare providers, and many other organizations foundational to the world economy.

Several states discovered the knowledge-gap’s potentially devastating consequences when waves of unemployment-claims poured into their systems as COVID-19 ravaged the economy in early 2020. The states’ unemployment computer systems crashed trying to process the deluge of claims using mainframes and decades-old programming languages.

But it wasn’t the mainframes or the legacy programming languages that failed, despite what you may have read. It was the lack of available expert programmers necessary to maintain and update these systems to handle the voluminous claims.

According to The Verge’s article, “Unemployment checks are being held up by a coding language almost nobody knows,” Colorado employed exactly one full-time programmer to maintain the state’s COBOL system prior to the pandemic. Back then, Colorado processed roughly 2,000 unemployment claims per week. In March and April 2020, that number rocketed to as high as 104,572 claims per week.

Now governments, non-profits, and private organizations are reviewing their systems’ strategies to learn from these mistakes. If your business relies on legacy systems, you probably should keep reading – and schedule some time with your IT people.

Mainframes are cornerstones

Legacy mainframe systems and software bedrock many of our most trusted institutions, including government services, finance and banking, healthcare, and insurance. In a substantial number of cases the expert developers that created and maintained these systems and software are retiring without a supporting workforce to replace them.

Besides the people that make your business run, your software is potentially the most important resource your organization has. Internal applications likely drive your employees' capabilities and productivity. Customer-facing programs attract new customers, close business deals, and increase revenue. New applications and features can open new markets and opportunities.

To maintain and improve your critical applications, your software team relies on individual engineers that developed an expert understanding of your programs through years of experience. They know the applications and all the accumulated system changes and challenges.

When those experienced engineers depart your business, the developers that replace them must acquire the same application knowledge through training, mentorship, and on-the-job programming. This exercise introduces several material business risks.

Learning on the job

Developers new to software applications typically require 6-12 months of on-the-job learning to become productive, depending on the size of the source code base. To become proficient, programmers may need up to 3 years.

Without qualified software developers knowledgeable about your applications, you endanger your business's operations, reputation, and security. You also risk a significant decrease in your software teams' productivity and efficiency.

Consider the monumental task confronting newly hired or transferred IRS developers last March. Congress passed the CARES Act on March 25, 2020, and then Treasury Secretary Steven Mnuchin announced that individual stimulus checks would be mailed in early April.

To assist with the delivery of economic stimulus payments, the new developers were required to immediately start working with the agency's source code base, which, in 2019, was estimated at nearly 20 million lines of code and includes over 60 years of legislative and system changes. As of mid-May 2020, nearly 20 million people had not received their stimulus checks, and some recipients had problems throughout the year.

These engineers didn’t have 6-12 months to become productive. They had to hit the ground running on day one. And without the benefit of weeks or months of training and on-the-job learning, they didn’t have the application knowledge necessary to understand how even simple changes could affect entire applications.

And let's not forget the productivity loss due to the remaining IRS's experienced engineers for training, mentoring, and supervising the new recruits.

What’s your risk?

Your situation may not be as dire as what the IRS faced – for now. But how much time can you really give your new developers to learn the system, and how much productivity can you afford to lose while your experienced programmers train and supervise them?

How much do you trust the developers that have just started working on your critical applications?

How confident will you be when your CEO or Board of Directors asks for assurances that the next customer-facing application update will not result in outages and lost revenue – especially if the update was programmed by a developer you hired just weeks ago?

Your software is a critical part of your organization, especially if you rely on legacy mainframe systems. You must have a plan or tool that keeps the code running and bridges the gap between retiring and departing developers and the people that will replace them.

Steve Brothers is the President of Phase Change Software. You can reach him on LinkedIn or at [email protected].

How short-term maintenance practices can double application size in 5 years

December 11, 2019

December 11, 2019

by Todd Erickson and Elizabeth Richards

Software must evolve to stay effective, which makes application maintenance a persistent and growing obligation, especially for organizations with large, legacy systems.
Changing marketplaces, compliance updates, security patches, hardware improvements, bug fixes, and process updates all drive code changes.

In his book, Building Maintainable Software: Ten Guidelines for Future-Proof Code, Software Improvement Group Co-founder and CTO Tobias Kuipers says that in larger codebases, 15% of the source code is changed each year.

And software maintenance isn't cheap. Kuipers says that some of his clients spend up to 90% of their IT budgets on program upkeep.

A common problem is the technical debt that piles up when the software team doesn't understand the code they are modifying and its system interdependencies. Combine that lack of knowledge with time and resource demands and the team often resorts to short-sighted modification techniques that add code instead of modifying it in place, which only increases the codebase size, complexity, and technical debt.

Maintenance Challenges for Legacy Code

Legacy applications can have massive and complex code bases created by hundreds of developers throughout decades of work. For example, in 2012, The Bank of New York Mellon reported that its core banking system contained 323 million lines of code and 112,500 COBOL programs. With that size and complexity, even an experienced developer can’t know the whole system.

One issue is the lack of useful documentation. A Catholic University of Brazil study found that between 40% to 60% of maintenance activity is studying the software just to understand it, and the impact analysis required to make the changes without breaking functionality.

Updating documentation can shorten time-to-competency, but it's frequently a low-priority task when stakeholders are demanding that bug fixes, security updates, and functionality improvements be completed yesterday.

Another challenge is institutional brain drain. Inevitably, experienced developers depart the software team, and the loss of that expert knowledge extends the amount of time it takes new developers to understand the applications because there are fewer experienced colleagues they can rely on for assistance.

How do software teams cope?

Application change is required but the lack of understanding introduces risk. To decrease immediate costs and risks, developers and managers may choose to use short-sighted strategies.

Don’t touch the black box

One technique programmers use to avoid breaking unmanageable applications is building separate subsystems (Figure 1).


Just copy the whole damn thing

Another tactic is to duplicate the applicable code (Figure 2). Good development practices recommend editing code rather than duplicating it, but if developers don't understand the code they are editing or its interdependencies, they risk breaking the old functionality.

Instead, developers leave the applicable code in the application, but copy it and place the copy in a new location. Then they modify the duplicate code, hopeful that by leaving the original code in place, they won’t break its functionality.


Duplicating code reduces the risk of breaking the application in the short-term but increases maintenance costs and program complexity in the future. By adding 15% to the code base annually, it will double in just 5 years, making maintenance that much more difficult, expensive, and risky.


Conclusion

Companies face a difficult situation when they choose short-terms strategies that avoid immediate cost and risk but end up creating long-term technical debt.

The solution is to ensure that developers understand the code completely to make sound development decisions. However, the speed of business and technical change affords few organizations the time needed to completely understand their applications.

To learn more about how Phase Change's COBOL Colleague helps developers understand complex COBOL-based applications and make changes quickly and confidently, visit Phase Change's product page.

Elizabeth Richards is Phase Change's Director of Business Operations. You can reach her at [email protected].
Todd Erickson is a Technology Writer with Phase Change. You can reach him at [email protected].

Contact

651 Corporate Circle
Suite 209A
Golden, Colorado 80401
Phone: +1.303.586.8900
Email: [email protected]

© 2024 Phase Change Software, LLC