Transforming business operations is a constant need, and the pandemic-prompted emphasis on modernizing legacy computing systems has forced organizations across industries to accelerate their modernization plans. The problem with mainframe modernization, however, is that today’s code search tools, linters, and program analysis tools are deficient when it comes to mitigating the risks associated with improving and even simply maintaining legacy systems.
Phase Change President Steve Brothers recently authored a contributed article for DevOps.com about how artificial intelligence (AI) tools can help developers work more productively and decrease the risks associated with legacy system modernization and maintenance.
The article, "How AI Can Improve Software Development," explains how today's bug localization, code visualization, and error detection tools don't actually identify specific lines of code that require change. And, once the code is identified, developers are still required to build mental models of their applications to make sure any source code changes don't make even more bugs or crash the entire system.
Through intelligence augmentation, AI can automate the identification of specific lines of code that require change – developers simply ask the AI-driven knowledge repository where unwanted behaviors are coming from, and the AI quickly identifies the code associated with that behavior. Also, before the developers compile or check in the new code, the AI can forward simulate the changes and validate that they won't create more problems or break the system.
A technical paper co-authored by current and former Phase Change research scientists, and presented at the 2021 annual ICSME event, won a Distinguished Paper Award from the IEEE Computer Society Technical Council on Software Engineering (TCSE).
The paper, "Contemporary COBOL: Developers' Perspectives on Defects and Defect Location," was co-authored by current Phase Change Senior Research Scientist Rahul Pandita, former Senior Research Scientist Aleksander Chakarov, and former intern Agnieszka Ciborowska. It was presented at the 37th annual International Conference on Software Maintenance and Evolution (ICSME) 2021 in Luxembourg City, Grand Duchy of Luxembourg, September 27 - October 1.
The authors presented results from surveys of developers working with COBOL and with more modern programming languages about defects and defect-location strategies. While the software industry has made substantial advances in maintaining programs written in modern languages, mainframe programs have received limited attention. Meanwhile, mainframe systems face a critical shortage of experienced developers, and replacement developers face significant difficulties even during routine maintenance tasks.
Pandita, who has already co-authored a number of published papers, said that this award is particularly gratifying because all of the authors were working together at Phase Change when it was written, and that he hopes it is just the first of many more like it.
This is the fourth published technical paper co-authored by Phase Change scientists, the third to be presented at scientific conferences, and the second to win a distinguished paper award.
Although the modern enterprise moves quickly to adopt and support helpful new technologies, most organizations must continue to rely on their legacy systems for core functions. Legacy applications struggle to evolve fast enough to support shifting organizational demands. Companies frequently try alternative strategies to keep pace, such as building on top of existing applications or moving them to other platforms, but these approaches only compound another risk -- the software developer shortage.
On May 19, BetaNews.com published the article, "Leveraging AI to close the application knowledge gap," which was written by Phase Change President Steve Brothers. The story explains how the software-developer shortage forces many companies to work around legacy applications when they lose the expert developers that built and maintained them, and how those workarounds can produce disastrous results for the organizations' bottom lines and reputations.
Steve also describes how artificial intelligence (AI) can reinterpret what source-code computations represent and convert them into concepts so developers no longer have to research and discern the original developers' intent. This enables new developers to quickly understand the applications' behaviors, and with that knowledge, the AI can quickly guide developers to the precise area of code where changes need to be made.
A team of Oregon State University scientists partnered with Phase Change Research Scientist Rahul Pandita to study how cognitive biases affect software developers' everyday behavior. The resulting academic paper, "A Tale from the Trenches: Cognitive Biases and Software Development," was recently recognized by ICSE 2020 as an ACM SIGSOFT Distinguished Paper.
According to OpenResearch.org and ACM SIGSOFT, only 2% of all ICSE submissions earn Distinguished Paper Awards.
OSU researchers Nicholas Nelson and Anita Sarma enjoying time in Phase Change's offices.
"Bias is an essential tool for human cognition," said Rahul Pandita. "The presence of bias must not be automatically equated to something negative. In fact, some biases are extremely helpful in navigating the complexities of day to day life. The key is to understand how these biases operate. In the case of routine software development activities, such nuanced understanding allows us to develop effective intelligence augmentation (IA) technology to amplify the benefits of such biases and counter the detrimental effects."
The scientists conducted a two-part study from 2017 to 2018. Part one focused on observing Phase Change developers performing routine development tasks at our offices for a week in March 2018.
"Getting to see the 'behind the scene' workings of this agile, innovative team was a great way of understanding how startups work," said Anita Sarma, an Associate Professor at Oregon State.
Part two involved triangulating their findings by interviewing developers from three other companies about how they perceive and deal with the biases observed in part one.
Research Scientists Anita Sarma, Nicholas Nelson, Souti Chattopadhyay, and Christopher Sanchez, along with research interns Audrey Au and Natalie Morales, co-wrote the paper with Pandita.
The research results were presented at ICSE 2020, the 42nd annual International Conference on Software Engineering, July 6-11 in Seoul, South Korea, and virtually from June 27-July 19. All of the Distinguished Papers were announced during a July 10 awards ceremony and appeared on other slides shown throughout the conference.
Todd Erickson is a Technology Writer at Phase Change Software. You can reach him at [email protected].
Phase Change research scientist Dr. Rahul Pandita recently had two co-written papers published in well-known research journals. The first paper, “Are vulnerabilities discovered and resolved like other defects?,” was published in the June 2018 volume of Empirical Software Engineering: An International Journal and presented as a Journal First Paper at the 40th International Conference on Software Engineering (ICSE) in Gothenburg, Sweden.
The paper was co-written with Patrick Morrison, Dr. Xusheng Xiao, Dr. Ram Chillarege, and Dr. Laurie Williams. Patrick Morrison is a Ph.D. candidate in the Computer Science Department at North Carolina State University. Dr. Xusheng Xiao is an assistant professor in the Department of Electrical Engineering and Computer Science at Case Western Reserve University.
Dr. Ram Chillarege is the founder and president of Chillarege Inc. Dr. Laurie Williams is a professor, and the department head, at the North Carolina State University Department of Computer Science.
The paper
The goal of the project’s research was to determine if security defects (referred to as vulnerabilities in the paper) are discovered and resolved by different software-development practices in comparison to non-security defects. If true, technical leaders could use the distinction to drive security-specific software development process improvements.
The research consisted of extending Orthogonal Defect Classification (ODC), a well-established scheme for classifying software defects, to study process-related differences between vulnerabilities and non-security defects, thereby creating ODC + Vulnerabilities (ODC+V). This new classification was applied to 583 vulnerabilities and 583 defects across 133 releases of three open-source projects – the Firefox web browser, phpMyAdmin, and Google’s Chrome web browser.
The study found that compared with non-security defects, vulnerabilities are found much later in the development cycle and are more likely to be resolved through changes to conditional logic. The results indicate opportunities may exist for more efficient vulnerability detection and resolution.
The paper was accepted by the 40th International Conference on Software Engineering (ICSE), held in Gothenburg, Sweden, between May 27 and June 3, as part of the *ICSE 2018* Journal First Papers track. Dr. Williams presented it on May 31, 2018.
But wait, there’s more
The second paper, “Mapping the field of software life cycle security measures,” is scheduled to be published in the October 2018 issue of Information and Software Technology. It was co-written with Patrick Morrison, Dr. Laurie Williams, and David Moye, a program site lead with Aelius Exploration Technologies LLC.
The authors suspected that a catalog of software-development life cycle security metrics could assist practitioners in choosing appropriate metrics, and researchers in identifying opportunities for security measurement refinement.
They conducted a systematic mapping study, beginning with 4,818 papers and focusing on 71 papers reporting on 324 unique security metrics. For each metric, the researchers identified the subject being measured, how the metric had been validated, and how the metric was used. Then they categorized the metrics and included examples of metrics for each category.
The research found that approximately 85% of the security metrics studied were proposed and evaluated solely by their authors, leaving room for replication and confirmation through field studies. Approximately 60% of the metrics were empirically evaluated by their authors or others.
They concluded that the primary application of security metrics to the software development lifecycle is studying the relationship between properties of source code and reported vulnerabilities. This suggests that researchers need to refine vulnerability measurements and give greater attention to metrics for the requirement, design, and testing phases of development.
Rahul Pandita is a senior research scientist at Phase Change. He earned his Ph.D. in computer science from North Carolina State University. You can reach him at [email protected].
Todd Erickson is a tech writer at Phase Change. You can reach him at [email protected].
The AAAI conference is held each spring by the Association for the Advancement of Artificial Intelligence (AAAI) nonprofit and scientific society to promote research in artificial intelligence (AI) and scientific discussion among researchers, practitioners, scientists, and engineers in related fields.
The paper, "Towards J.A.R.V.I.S. for Software Engineering: Lessons Learned in Implementing a Natural Language Chat Interface," was co-written by Chakarov and fellow research scientists Rahul Pandita and Hugolin Bergier.
"We're excited about the opportunity to share our work with researchers and get their feedback," Pandita remarked. "We consider it the first of many stepping stones to present the science behind Phase Change's technology."
Phase Change is developing a ground-breaking cognitive platform and an AI-based collaborative agent called Mia that will dramatically improve software development productivity and efficiency. Mia uses a natural-language chat interface so users can get up and running quickly.
The interface is much like the virtual assistants in other industries that have demonstrated the potential to significantly improve users' digital experiences.
The paper relates the lessons our developers learned during the first iteration of the Mia chat interface implementation, including:
Reusing components to quickly prototype
Gradually migrating from rule-based to statistical approaches
Adopting recommendation systems
The paper describes these lessons and others, including our experiences applying subliminal priming and the benefits of data-driven prioritization, in more detail.
The workshop
"I feel like we did a good job of setting up the context – what problems we are solving, what our approach is – and then we moved to the takeaways very quickly," Aleksander said about his experience presenting the paper. "People were engaged."
He also described two comments made during his session's brief Q&A time. The first commentator explained how current scientific research supports the paper's findings about subliminal priming and how conversations change over time.
The second commentator discussed our use of a rule-based approach at first to develop an optimal work environment, followed by a gradual move toward a statistical approach. He suggested that there is also a third tactic that uses simulations to quickly gather data and hasten the inclusion of statistical approaches. We will investigate his suggestions for further use.
We welcome your comments and observations.
Rahul Pandita is a senior research scientist at Phase Change. He earned his Ph.D. in computer science from North Carolina State University. You can reach him at [email protected].
Todd Erickson is a tech writer with Phase Change. You can reach him at [email protected].
Our research scientists recently published a workshop paper on the lessons learned implementing the company's natural-language chat interface. This post summarizes the key lessons learned and identifies the open questions we faced during our initial implementation.
Phase Change is developing a ground-breaking cognitive platform and an AI-based collaborative agent, called Mia, that will dramatically improve software development productivity and efficiency. Mia utilizes natural-language processing (NLP) chatbot capabilities so new users can use the technology immediately with little or no training.
The paper, "Towards J.A.R.V.I.S. for Software Engineering: Lessons Learned in Implementing a Natural Language Chat Interface," was co-written by research scientists Rahul Pandita, Aleksander Chakarov, and Hugolin Bergier, and inventor and company founder Steve Bucuvalas. The full paper text is available here.
The paper
Virtual assistants have demonstrated the potential to significantly improve information technology workers' digital experiences. Mia will help software developers radically improve program comprehension. Then we will gradually expand its capabilities to include program composition and verification.
Here are a few things we learned during the first iteration of the Mia chat interface implementation.
Reuse components to quickly prototype
Instead of building everything from scratch, consider reusing existing frameworks and libraries to quickly prototype and get feedback.
Gradually migrate from rule-based to statistical approaches
With the ever-increasing popularity and efficacy of statistical approaches, teams are often tempted to implement them without enough data to design an optimal work environment.
We have noticed that recent advances in transfer learning require only a small amount of data to begin reaping the benefits of statistical approaches. However, rule-based approaches still allow prototypes to get up-and-running with only a small amount of set-up time.
A rule-based approach also allowed us to collect more data for a better understanding of the chatbot requirements, and positioned us to effectively leverage statistical approaches in the future.
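To make the idea concrete, here is a minimal sketch of a rule-based interpreter that also records every request it sees. It is not Mia's implementation: the rule patterns, the intent names, and the `request_log` list are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical rules for illustration: each pairs a pattern with an intent name.
RULES = [
    (re.compile(r"^filter by actor (?P<actor>\w+)$"), "filter_by_actor"),
    (re.compile(r"^list computations\b"), "list_computations"),
]

# Every request is logged with the intent it matched (or None), so the log
# can later serve as data for understanding requirements or training a model.
request_log = []


def interpret(utterance):
    """Return (intent, slots) for an utterance, or (None, {}) if no rule matches."""
    text = utterance.strip().lower()
    for pattern, intent in RULES:
        match = pattern.search(text)
        if match:
            request_log.append((text, intent))
            return intent, match.groupdict()
    request_log.append((text, None))
    return None, {}
```

The unmatched entries in the log are often the most useful part, because they show which requests the rules do not yet cover.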
Adopt recommendation systems
In our testing phase, we learned that although users appreciated honesty when our chatbot did not understand a request, they didn't take it well (to put it mildly) when the chatbot did not provide a way to remedy the situation.
There can be many causes for the chatbot failing to understand a request. For instance, the request might actually fall outside the chatbot's capabilities, or, in our case, one class of incomprehensible requests was due to implementation limitations.
While we can't do much about the former, building a recommendation system for the latter class of requests almost always proves beneficial and vastly improves user experience.
For example, the noise in a speech-to-text (STT) component is a major cause of incomprehensible requests. In our fictional banking system, we've created software that allows pets to interact with ATMs, and a Mia user might form a query to discover all of the use cases in which the actor "pet" participates.
If the user says "filter by actor pet," we could expect transcripts like the following from the STT component, which, unfortunately, caused the subsequent pipeline components to misfire:
filter boy actor pet
filter by act or pet
filter by act or pad
filter by a store pet
filter by actor pass
filter by active pet
filter by actor Pat
While users will most likely be more deliberate in their subsequent interactions with the STT component, we noticed that these errors were commonplace and had a very negative effect on user experience.
To remedy the situation, we used a light-weight, string-similarity-based method to provide recommendations. Subsequent observations indicated that users almost always liked recommendations - except when they were too vague.
To avoid annoying users, we came up with two heuristics. First, we provided no more than three recommendations. Second, to be considered as a candidate query for recommendation, the query's similarity measure had to score higher than an empirically determined threshold with respect to incoming requests.
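The two heuristics can be sketched in a few lines. This is only an illustration, not our production code: it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure, and the list of known queries and the threshold value of 0.75 are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical set of queries the chatbot understands.
KNOWN_QUERIES = [
    "filter by actor pet",
    "filter by output balance less than 0",
    "list computations with a negative balance",
]

MAX_RECOMMENDATIONS = 3       # heuristic 1: never show more than three
SIMILARITY_THRESHOLD = 0.75   # heuristic 2: assumed value; tune empirically


def similarity(a, b):
    """Return a similarity score in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def recommend(request, known_queries=KNOWN_QUERIES,
              threshold=SIMILARITY_THRESHOLD, limit=MAX_RECOMMENDATIONS):
    """Return up to `limit` known queries scoring above `threshold`, best first."""
    scored = [(similarity(request, query), query) for query in known_queries]
    candidates = [pair for pair in scored if pair[0] > threshold]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [query for _, query in candidates[:limit]]
```

With these settings, a noisy transcript such as "filter boy actor pet" yields "filter by actor pet" as a suggestion, while a request that resembles nothing in the list yields no suggestions at all, which avoids the vague recommendations users disliked.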
Over time users stop using fully formed sentences
The novelty of using a natural language interface quickly wears off. We observed that most users began sessions by forming requests with proper English sentences, but the conversation was quickly reduced to keyword utterances. Chatbot designers should plan for this eventuality. 😉
Actually, I find this quite fascinating and the natural evolution of conversation. I think of this phenomenon as mirroring our natural conversations. When we first meet someone new, we are deliberate in our conversation. However, over time, conversations become more informal. But that is a topic for future posts. ~Rahul Pandita, Phase Change research scientist
Subliminal priming
In formal conversation study, the entrainment effect is informally defined as the convergence of the vocabulary of conversation participants over a period of time to achieve effective communication. We stumbled on this effect when we observed that users employed an affected accent to get better mileage out of the STT component.
In psychology and cognitive science, subliminal priming is the phenomenon of eliciting a specific motor or cognitive response from a subject without explicitly asking for it.
We decided to see if subliminal priming would expedite entrainment. We began playing back a normalized version of a query with the query responses. That simple change led users to quickly converge to our chatbot vocabulary.
Consider the frequencies of the following user request variations in our system:
| Query | # of uses by test subjects |
| --- | --- |
| list computations with a negative balance | 30 |
| filter for computations where output concept Balance is less than 0 | 17 |
| filter by balance Less Than 0 | 16 |
| filter by output concept balance is less than 0 | 9 |
| show computations where output concept balance is less than 0 | 1 |
| filter by output balance less than 0 | 224 |
By playing back "our system found following instances where output concept balance is less than 0" with each of these responses, we observed that users increasingly used the phrase "output balance less than 0," as shown in the frequency counts.
For the keen-eyed, notice that the repeated proper phrase, "filter by output concept balance is less than 0" is used less. However, remember that over time, users stop using fully formed sentences. We also observed that talking with affected American or British accents works. This may be a product of an unbalanced training set used during creation of the speech-to-text models. That's why fairness testing is important. But that is yet another topic for future posts. ~Rahul Pandita
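The playback idea is simple enough to sketch. The snippet below is a hypothetical illustration rather than Mia's code: the `ParsedFilter` type, its field names, and the result identifiers are assumptions, and the wording of the playback is modeled on the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedFilter:
    """Hypothetical parsed form of a request, however the user phrased it."""
    concept: str
    operator: str
    value: str


def canonical_phrase(parsed):
    """Render the request in the chatbot's own normalized vocabulary."""
    return f"output concept {parsed.concept} is {parsed.operator} {parsed.value}"


def respond(parsed, results):
    """Prefix the results with the normalized query so users hear it every time."""
    header = f"Our system found the following instances where {canonical_phrase(parsed)}:"
    return "\n".join([header, *results])
```

Whether the user said "list computations with a negative balance" or "filter by balance less than 0," both would parse to the same `ParsedFilter` and so receive the same normalized playback.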
Data-driven prioritization
We also realized the benefits of leveraging data to prioritize engineering tasks as opposed to going with your gut.
A pipeline design is often used for chatbot realization. Like most pipeline designs, the efficacy of the final product is a function of how well the individual components work in tandem within the pipeline. Thus, optimizing the design involves iteratively tuning and fixing various individual components.
So how does one decide which components to tune first? This is where data-driven prioritization can really help. For instance, in our setting, a light-weight error analysis helped on more than one occasion to identify the components we needed to focus on.
I can only imagine that data-driven prioritization will become more useful in the future as we experiment with statistical approaches that often have a pipeline design. ~Rahul Pandita
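A light-weight error analysis of this kind can be as simple as tagging each failed request with the component judged responsible and counting. The sketch below assumes that tagging has already been done by hand; the component names and the sample failures are made up for illustration and are not our measurements.

```python
from collections import Counter

# Illustrative data only: (failed request, component judged responsible).
failed_requests = [
    ("filter boy actor pet", "speech_to_text"),
    ("filter by act or pad", "speech_to_text"),
    ("filter by a store pet", "speech_to_text"),
    ("filter by actor Pat", "speech_to_text"),
    ("show me pets", "intent_parser"),
    ("pets?", "intent_parser"),
    ("filter by actor unicorn", "query_engine"),
]


def rank_components(failures):
    """Return (component, failure_count) pairs, most failures first."""
    counts = Counter(component for _, component in failures)
    return counts.most_common()


for component, count in rank_components(failed_requests):
    print(f"{component}: {count}")
```

The component at the top of the ranking is the first candidate for tuning, and rerunning the count after each fix shows whether the priority has shifted.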
We hope that our observations will be helpful for those embarking on the journey to build virtual assistants. We would love to hear your experiences.
Rahul Pandita is a senior research scientist at Phase Change. He earned his Ph.D. in computer science from North Carolina State University. You can reach him at [email protected].
Todd Erickson is a tech writer with Phase Change. You can reach him at [email protected].
Only a few short years ago, only humans could interpret the meaning of text and speech. Now our cell phones understand our voices and language well enough to distinguish accents, metaphors, and sarcasm.
IBM's Watson supercomputer even understood Alex Trebek well enough to beat some of Jeopardy!'s® best players.
Computers achieve natural-language understanding through a series of logically consistent normalization steps -- starting with the processing of basic sounds to recognizing words and then understanding sentences.
If computers can understand natural language using logically consistent processes, shouldn't we be able to use similar processes to break down and normalize software?
In fact, shouldn't software be easier to normalize than the messy ambiguity of human communication?
The answer is yes.
Phase Change normalizes software source code into formal data types and organizes them into hierarchical structures that are probabilistically linked (horizontally and vertically). Our technology unlocks the vast domain and system knowledge embedded in software and makes it available to anyone involved in creating and supporting software.
To learn more about how Phase Change's revolutionary technology transforms chaotic code into coherent data and intractable software into artificially intelligent agents, read Steve Bucuvalas' paper: "An Analogy: Software AI and Natural Language."
This is the fourth and final in a series of practical talks by founder and CEO Steve Bucuvalas about Phase Change Software, what we are developing, the math and science behind our technology, and the impact on the software development process.
Using a whimsical example of dog banking, Steve discusses how the knowledge that’s encoded in software is normalized into a data structure, which enables us to create an assistive AI and solve the learning curve problem.
Podcast Slides and References
| Time Stamps | Slides and References |
| --- | --- |
| 00:11 | Steve Bucuvalas Podcast – Equality: The fundamental operation for software as data -- science podcast 3 of 4 |
This is the third in a series of practical talks by founder and CEO Steve Bucuvalas about Phase Change Software, what we are developing, the math and science behind our technology, and the impact on the software development process.
In this podcast, Steve addresses equality, the fundamental operation that allows software to be treated as data, and begins by asking how we know when a fundamental unit of software is equal to something else. The first talk in this series introduces the idea of compiling programs into an AI representation. The second talk shows that the Turing and Rice proofs apply only to the mental domain of computation.
Podcast Slides and References
| Time Stamps | Slides and References |
| --- | --- |
| 00:28 | Steve Bucuvalas Podcast – Changing the essence of software and creating breakaway efficiency -- science podcast 1 of 4 |
| 00:36 | Steve Bucuvalas Podcast – The Turing machine, the Halting problem, and Rice’s use of the Turing proof -- science podcast 2 of 4 |