SE Radio 674: Vilhelm von Ehrenheim on Autonomous Testing
Vilhelm von Ehrenheim, co-founder and chief AI officer of QA.tech, speaks with SE Radio's Brijesh Ammanath about autonomous testing.
The discussion starts by covering the fundamentals, and how testing has evolved from manual to automated to now autonomous. Vilhelm then dives into the details of autonomous testing and the role agents play in it.
They consider the challenges in adopting autonomous testing, and Vilhelm describes the experiences of some clients who have made the transition. Toward the end of the show, Vilhelm describes the impact of autonomous testing on the traditional QA career and what test professionals can do to upskill.
This episode is sponsored by Fly.io.
Transcript
Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.
Brijesh Ammanath 00:00:18 Welcome to Software Engineering Radio. I’m your host, Brijesh Ammanath. Today I will be discussing autonomous testing with Vilhelm von Ehrenheim. Vilhelm is the co-founder and Chief AI Officer of QA.tech, a startup that develops autonomous agents that can interact with and test the functionality of webpages. He has over 10 years of experience in the data science and machine learning domain. Before co-founding QA.tech, Vilhelm built Mother Brain at EQT. Vilhelm has published papers in prestigious conferences such as EMNLP, KDD and CIKM. Vilhelm, welcome to the show.
Vilhelm von Ehrenheim 00:00:54 Thank you. I’m very glad to be here.
Brijesh Ammanath 00:00:56 We’ll start with the fundamentals. Can you help by defining what autonomous testing is and how it differs from traditional automated testing?
Vilhelm von Ehrenheim 00:01:06 Yeah, so I like to think of testing and the levels of autonomy in different stages. So the first stage is manual testing, where nothing is really automated. You’re just doing everything as a human and potentially trying to repeat the same thing you have done before. The next stage is where you start using automation, so scripts or different kinds of programs that can repeatedly do the same things over and over. That has been popularized by tools like Cypress, Selenium and Playwright. Today we see more and more things that come into a new category called autonomous testing, where we raise the level of autonomy even more. So instead of hard-coded scripts, you focus more on either self-healing, so that you don’t have to spend as much time developing and maintaining the test suites that you have, or you have fully autonomous agents that can understand and validate different kinds of objectives that you want the page to support.
Brijesh Ammanath 00:02:12 Right. Can you expand on that a bit more and maybe walk us through the evolution of software testing? How did it evolve from manual to automated and now to autonomous?
Vilhelm von Ehrenheim 00:02:24 Yeah. I think the manual side of things comes pretty naturally. When you have built something that you want to ship to a potential customer or a user, you want to make sure that it works. And this is something that I think most developers are very accustomed to. You try the different features that you have built, you click around or you interact with it in different ways to make sure that it functions the way that you have intended. The automation of that comes naturally. So when you have different ways that you want to test your software, usually you use different kinds of testing in different layers. So you have things like unit tests, testing specific snippets of code; you have integration tests, making sure that stuff works across different systems. Then you have the end-to-end tests, where you script that something is working in the browser or in the application, and hard-code those steps.
Vilhelm von Ehrenheim 00:03:20 So, for example, maybe you have a possibility to send an invoice in your system or do a checkout, for example. Then you script what should be filled in and you make sure that it clicks on the right buttons and then you wait and try to validate that it went through as expected. On the autonomous side, well first of all, the automated tests are pretty hard to maintain. When you think about hardcoded things in general, they’re very brittle to change. And what’s problematic with scripting something and testing that against a system that is continuously evolving and changing is that then those tests will continuously break. So, when you build a new feature or you change something in your checkout flow, then suddenly all of your tests are failing, not because it’s no longer functioning, but because they no longer do the right thing.
Vilhelm von Ehrenheim 00:04:17 So the buttons have changed or the identifiers on the page are not the same anymore, and that then requires the developer to go back to that code and also update the test suites to make sure that they adhere to the new changes that you made. On the autonomous level, we try to mitigate that by raising the abstraction one more layer again. AI and Machine Learning systems are essentially designed to be able to handle a vast range of changing input parameters and still produce a reasonable answer. So essentially generalizing across a lot of different potential things that could happen, which is the same as a human would do. So for example, if I added an extra button in a step in a checkout, then I wouldn’t fail the test because I understand that, oh, that’s a new button and I can take a decision not to click it or interact with it in different ways and still be able to complete the checkout. And this is where AI comes in as well. If we change the application in any of the numerous ways that we normally do when we develop them, it’s then possible to let AI understand and take decisions in real time when it’s doing the testing instead of having to rely on updating all of these tests that we have created before.
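The contrast Vilhelm draws here can be sketched in a few lines of Python. This is only an illustration, not how QA.tech or any testing tool works: the page is modeled as a plain list of buttons, the names `Button`, `scripted_checkout`, `objective_checkout`, and the identifier `btn-pay` are all hypothetical, and a simple keyword match stands in for the decision an AI agent would make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Button:
    element_id: str  # identifier a scripted test would hard-code
    label: str       # visible text a human (or an agent) would read


class ElementNotFound(Exception):
    """Raised when a test cannot find the control it needs."""


def scripted_checkout(page: list[Button]) -> str:
    """Hard-coded script: looks for one fixed identifier (hypothetical 'btn-pay')."""
    for button in page:
        if button.element_id == "btn-pay":
            return button.label
    raise ElementNotFound("btn-pay")


def objective_checkout(page: list[Button], objective: str = "pay") -> str:
    """Placeholder for an agent: picks the button whose label matches the objective.

    A real agent would use a model to decide; this keyword match is an assumption
    made purely to keep the sketch runnable.
    """
    for button in page:
        if objective in button.label.lower():
            return button.label
    raise ElementNotFound(objective)


def script_breaks_on(page: list[Button]) -> bool:
    """Report whether the hard-coded script fails on the given page."""
    try:
        scripted_checkout(page)
        return False
    except ElementNotFound:
        return True


# Original checkout page.
old_page = [Button("btn-pay", "Pay now")]

# After a redesign: the identifier was renamed and an extra button was added.
new_page = [
    Button("btn-gift", "Add gift wrap"),
    Button("checkout-submit", "Pay now"),
]

print(script_breaks_on(old_page))    # script still works on the old page
print(script_breaks_on(new_page))    # script fails although checkout still works
print(objective_checkout(new_page))  # objective-based lookup ignores the new button
```

The scripted version fails on the redesigned page even though the checkout itself still works, which is the maintenance cost described above. The objective-based version skips the extra button and still completes the step.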
Brijesh Ammanath 00:05:31 And that is what you meant by self-healing tests?
[...]