Testers face distinct hurdles when it comes to applications that use artificial intelligence and machine learning techniques. These systems are mainly black boxes that process input in a series of layers and produce a response using various algorithms—sometimes hundreds of them.

Testing can be difficult for any program, but at its most basic level it means verifying that the results produced are those expected for a given input. That is precisely the difficulty with AI/ML systems: the software provides a response, but testers often have no way of determining whether it is the correct one, because they don’t always know what the correct answer is for a particular set of inputs.

Some application results may even be amusing. E-commerce recommendation engines frequently make individual errors, but as long as they collectively persuade shoppers to add items to their carts, they are deemed an economic success. But how can you know whether your machine learning application will succeed before deploying it?

As a result, the definition of a correct answer depends not only on the application, but also on the level of accuracy required. It’s simple if the answer must be exact, but otherwise, how close is close enough? And is it always going to be close enough?

For testers, this is the ultimate black hole. You can’t tell objectively whether or not a result is right unless you have a working statistical definition of accuracy based on the needs of the problem domain.
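One way to pin down such a working definition is to turn “close enough” into a statistical acceptance test. The sketch below is a minimal, hypothetical example: a prediction counts as correct when it falls within a relative tolerance, and the check passes when a minimum fraction of predictions do. The 5% tolerance and 75% acceptance threshold are illustrative assumptions, not values from any real project.

```python
import math

# Hypothetical acceptance test for ML outputs: a prediction is "correct"
# when it falls within a problem-defined relative error band, and the
# model passes when enough predictions do. Thresholds are illustrative.
def within_tolerance(predicted, expected, rel_tol=0.05):
    return math.isclose(predicted, expected, rel_tol=rel_tol)

def acceptance_rate(predictions, expectations, rel_tol=0.05):
    hits = sum(within_tolerance(p, e, rel_tol)
               for p, e in zip(predictions, expectations))
    return hits / len(predictions)

preds = [10.2, 19.5, 31.0, 39.8]
truth = [10.0, 20.0, 30.0, 40.0]
rate = acceptance_rate(preds, truth)
assert rate >= 0.75  # acceptance threshold set by the problem domain
```

The key design point is that both knobs, the per-prediction tolerance and the overall acceptance rate, come from the problem domain rather than from the testing tool.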

From there, things only get worse. Even with binary answers, testers may have no notion of whether an answer is correct or incorrect. While it may be possible to go back to the training data and locate a similar scenario in certain cases, in many others there is still no obvious way to confirm results.

Does it make a difference? Yes, perhaps much more than in traditional business applications. In a standard commercial application, the great majority of outputs can be easily categorized as correct or incorrect. Testers don’t need to understand how the underlying algorithms work, though it would be beneficial if they did.

Errors in machine learning applications aren’t immediately apparent. A result may appear right yet be incorrect due to bias or misinterpreted training data. Inaccurate answers can also stem from deploying the wrong machine learning model, one that gives less-than-optimal results on a regular or recurring basis. Explainable AI (XAI) can aid in this situation.

Explainable AI explained

XAI is a means for an AI or machine learning program to explain why it came to a particular conclusion. By providing a defined path from input to output, XAI allows a tester to grasp logic between inputs and outputs that would otherwise be impenetrable.

XAI is a very new field, and most commercial AI/ML applications have yet to adopt it. Its techniques aren’t yet well defined. While having a justification for a result helps app users gain trust, an explanation also aids development and testing teams in validating algorithms and training data, and in ensuring that results appropriately reflect the problem domain.

Pepper, the SoftBank robot that responds to tactile stimuli, is a fascinating example of an early XAI project. Pepper is designed to talk through its commands as they are being carried out. By narrating its instructions, a form of XAI, the robot lets users comprehend why it is completing specific sequences of actions. Pepper will also spot any inconsistencies or ambiguities during this process and will know when to seek clarification.

Consider how a program feature like this could help testers. The tester can get a result using test data, then ask the program how it arrived at that result, working through how the input data was transformed to demonstrate why the result is correct.

But that’s only scratching the surface; XAI must cater to a variety of stakeholders. It can assist developers in validating the technical approach and techniques used. It assists testers in confirming correctness and quality. And it is a method of establishing trust in the program for end users.

The three legs of the XAI stool

So, how does XAI function? Although there is still a long way to go, a few strategies show potential. Transparency, interpretability, and explainability are the guiding principles of XAI.

  • Transparency refers to the ability to see into the algorithms and observe how they analyze input data. While this does not reveal how those algorithms were trained, it does reveal the path to the results, and it is meant for design and development teams to comprehend.
  • The interpretability of the results refers to how they are presented for human comprehension. To put it another way, if you have an application and get a specific result, you should be able to see and understand how that result was obtained using the input data and processing algorithms. Between data inputs and result outputs, there should be a logical path.
  • Explainability remains a hazy idea while scholars work out exactly how it should function. We may want to support inquiries into our results, as well as detailed descriptions of more specific processing steps. Until a greater consensus is reached, however, this characteristic will remain a grey area.
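To make transparency and interpretability tangible, here is a minimal sketch of a transparent, rule-based scorer that records the logical path from inputs to output, the kind of path a tester would want an XAI-enabled system to expose. The loan-scoring scenario, feature names, and thresholds are all hypothetical; a real ML model would be far less transparent, which is precisely the problem XAI addresses.

```python
# Illustrative sketch: a transparent scorer that records each rule it
# applies, so a tester can inspect why a particular decision was made.
# Rules, features, and thresholds are hypothetical.
def score_loan(income, debt_ratio):
    trace = []
    score = 0
    if income >= 50_000:
        score += 2
        trace.append(f"income {income} >= 50000: +2")
    if debt_ratio < 0.4:
        score += 1
        trace.append(f"debt_ratio {debt_ratio} < 0.4: +1")
    decision = "approve" if score >= 2 else "decline"
    trace.append(f"score {score} -> {decision}")
    return decision, trace

decision, trace = score_loan(60_000, 0.3)
for step in trace:
    print(step)  # the logical path from data inputs to result output
```

A tester who disagrees with a decision can point at the exact rule in the trace, which is exactly what opaque learned models do not offer out of the box.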

XAI techniques

Several strategies can aid explainability in AI/ML applications. Most of them produce quantitative measurements that help testers qualitatively interpret a given result.

Shapley values and integrated gradients are two common strategies. Both provide quantifiable assessments of how each set of data or feature contributes to a specific outcome.
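As a rough illustration of the idea behind Shapley values, the sketch below computes them exactly for a toy three-feature linear model by enumerating feature coalitions and replacing absent features with a baseline value. The model, weights, and baseline are invented for the example; real tooling such as the SHAP library approximates this computation for larger models, since exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Toy model: weighted sum of three features (weights are illustrative)
    w = [3.0, 1.0, 2.0]
    return sum(wi * xi for wi, xi in zip(w, x))

def shapley(x, baseline):
    """Exact Shapley value per feature, by enumerating coalitions.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = [x[j] if j in S or j == i else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in S else baseline[j]
                             for j in range(n)]
                phi += weight * (model(with_i) - model(without_i))
        values.append(phi)
    return values

phi = shapley(x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
# For a linear model each value reduces to w_i * (x_i - baseline_i),
# and the values sum to model(x) - model(baseline).
```

The summing property in the final comment is what makes these values useful to testers: every unit of the output is attributed to some input feature, with nothing left unexplained.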

The contrastive explanations approach, likewise, is an after-the-fact computation, one that attempts to explain specific results in terms of why one result occurred rather than another. To put it another way: why did the program return this result instead of that one?

This is another quantitative metric, one that ranks the likelihood of one outcome over another. The numbers indicate how strongly each input pushes toward one outcome rather than the other.
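A minimal sketch of that contrastive idea, using a hypothetical two-class linear scorer: each feature’s contribution to the score gap between the predicted class and the runner-up shows how strongly it argues for one outcome over the other. The classes, weights, and inputs are illustrative assumptions, not part of any real system.

```python
# Hypothetical two-class linear scorer; weights and inputs are invented.
weights = {
    "approve": [2.0, -1.0, 0.5],
    "decline": [0.5, 1.5, 0.2],
}
x = [1.0, 0.5, 2.0]

def score(cls):
    return sum(w * xi for w, xi in zip(weights[cls], x))

# Per-feature contribution to the gap "approve minus decline".
# Positive entries argue for "approve", negative ones for "decline",
# and the entries sum exactly to score("approve") - score("decline").
contrast = [(wa - wd) * xi
            for wa, wd, xi in zip(weights["approve"], weights["decline"], x)]

# Rank features by how decisively they separate the two outcomes.
ranked = sorted(enumerate(contrast), key=lambda t: -abs(t[1]))
```

The ranked list answers the contrastive question directly: the top entry is the feature most responsible for this result occurring instead of the alternative.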

Data gets you only partway there

Because AI/ML systems rely on data, and data manipulation must use quantitative methods, data science is the only approach we have for delivering explanations. Numerical weights may play a part in interpretability, but they are still a long way from full explainability.

These strategies are important for AI/ML development teams to understand and apply, both for their own benefit and for the benefit of testers and users. In particular, without some level of explanation of the result, testers may be unable to evaluate whether or not the returned result is correct.

Testers must be able to determine where results come from in order to ensure the quality and integrity of AI/ML systems. XAI is a start, but fully realizing this technology will take some time.

For more info: https://mammoth-ai.com/testing-services/
