Some time back, I wrote about software code being generated by generative software. The possibility of code being generated by Artificial Intelligence or otherwise is very real, and industry leaders are predicting that the move towards automated code generation will happen within this decade itself, as businesses demand faster delivery. A recent 2022 DevOps report blog bluntly points to this possibility. Let’s talk about testing generated software in this blog.
Generative software started some time back as an experiment. As startups and enterprises looked to generate GUI code faster based on customer feedback, a bunch of GUI-generating tools came into existence, even without using Artificial Intelligence (relying on simple rule-based and specification-based directives). At that point, the scope of generative software was restricted to GUIs, since the need was quick prototypes.
With the success seen with GUI tools, organisations are moving towards generative software even for domain rules (backends). Let’s look at the challenges involved in testing the generated code.
Generated code caters only to the limited scope of its specifications. Agreed, from a code perspective, redundant or unnecessary code is wasteful and can even be dangerous. But at the same time, anticipating the various ways the software might behave is tough, even when writing software units with approaches like TDD. We can do a bunch of things, like satisfying the unit’s specification expectations, boundary value checks, and other generally applicable checks, but when the units come together in integration, the complexity increases. We know this is tough even with careful human inspection and care, and it’s going to be even more challenging with blind generation of software.
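Here’s a toy sketch of what I mean. The pricing functions are hypothetical (mine, not from any real generator): each unit passes its own specification and boundary checks, yet they break in integration because they disagree on what a “rate” means.

```python
# Hypothetical sketch: two "generated" units, each correct against its own spec.

def discount_rate(loyalty_years: int) -> float:
    """Spec: return the discount as a percentage, 0 to 20."""
    if loyalty_years < 0:
        raise ValueError("loyalty_years must be non-negative")
    return float(min(loyalty_years * 2, 20))  # 2% per year, capped at 20%

def apply_discount(price: float, rate: float) -> float:
    """Spec: rate is a fraction between 0 and 1."""
    if not 0 <= rate <= 1:
        raise ValueError("rate must be a fraction between 0 and 1")
    return price * (1 - rate)

# Unit-level boundary checks pass for both units in isolation.
assert discount_rate(0) == 0
assert discount_rate(15) == 20              # capped at the boundary
assert apply_discount(100.0, 0.5) == 50.0

# Integration breaks: one unit speaks percentages, the other fractions.
# apply_discount(100.0, discount_rate(5))   # ValueError: rate is 10, not 0.10
```

No amount of unit-level checking on either function would have caught this; the bug only exists in the combination.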
As far as testing this goes, should more automated checks be run against the automated code? For speed’s sake, companies may think so, but from a Quality perspective, it would be disastrous. For one, there will be a lot of going back and forth between the automated code and the automated checks, plus the associated maintenance (like we already do with GUI automation low-code/no-code tools!). And then, if the code is generated using Machine Learning, there will be a struggle to get the model right. Most such models use Neural Networks, which cannot explain why a particular piece of code was generated. And who is going to review that reasoning even if it exists? Obviously, Software Testing folks are going to look for explanations of why the code was generated the way it was.
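To make the “checks checking generated code” problem concrete, here is another hypothetical sketch (names and formulas are illustrative assumptions). When the code and its check are derived from the same specification, the check largely re-asserts the code:

```python
# Hypothetical sketch: code and check "generated" from the same specification.

def generated_total(items):
    # Imagine a generator emitted this from the spec.
    return sum(i["qty"] * i["price"] for i in items)

def generated_check(items):
    # ...and emitted this check alongside it, from the same spec.
    expected = sum(i["qty"] * i["price"] for i in items)  # the same formula!
    return generated_total(items) == expected

# The check passes by construction, even if the shared formula is wrong
# (say, the spec forgot tax). And every regeneration of the code forces a
# matching regeneration of the check: the back-and-forth maintenance loop.
assert generated_check([{"qty": 2, "price": 9.99}])
```

A check like this gives green builds and zero information, which is exactly the disaster I’m pointing at.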
Yesterday, I wrote about Continuous Testing, and certainly there will be voices advocating that no humans should be involved in the continuous pipeline and that the software should be tested with checks alone. In reality, this won’t materialize, as the checks will fail to match the rigor of what actually happens out there in production (based on production feedback).
Even if we assume functional programming, where the code avoids complexities like global variables and branching, I think generating and testing that code is still going to be challenging.
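For instance, here is a minimal sketch using the (real) hypothesis property-based testing library; the `normalise` function and the property I check are my own assumptions. Even for a pure, side-effect-free function, the test only works because a human chose a meaningful property, an oracle the generator cannot invent for us:

```python
# Minimal sketch: property-based testing of a pure function with hypothesis.
from hypothesis import given, strategies as st

def normalise(xs: list[float]) -> list[float]:
    """Pure function: scale positive values so they sum to 1."""
    total = sum(xs)
    return [x / total for x in xs]

# The human-chosen property: normalised values should sum to (about) 1.
@given(st.lists(st.floats(min_value=0.1, max_value=100), min_size=1))
def test_normalise_sums_to_one(xs):
    assert abs(sum(normalise(xs)) - 1.0) < 1e-6

if __name__ == "__main__":
    test_normalise_sums_to_one()  # hypothesis runs many generated inputs
```

Purity makes the function easier to test, but deciding *what* to assert is still a human judgment.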
We’ve got to wait and see how this pans out. Will generative software and automated checks replace humans? On the contrary: there will be a lot of work for humans sorting out the mess. It’s just that the challenges will get bigger.
I suggest – slow down and give humans a chance.
Feel free to chat with me about testing generated software.