Enabling AI to explain its predictions in plain language
Machine-learning models can make mistakes and be difficult to use, so scientists have developed explanation methods to help users understand when and how much to trust a model’s predictions.
However, these explanations are often complex and may contain information about hundreds of model features. They are sometimes presented as multifaceted visualizations that can be difficult for users without machine-learning expertise to fully understand.
To help people understand artificial intelligence explanations, MIT researchers used large language models (LLMs) to convert plot-based explanations into simple language.
They developed a two-part system that converts machine learning explanations into human-readable text passages, then automatically assesses the quality of the narrative so end users know whether to trust it.
By providing just a few example explanations as prompts, researchers can customize the system’s narrative descriptions to meet user preferences or the requirements of specific applications.
In the long term, the researchers hope to build on this technology by allowing users to ask follow-up questions to the model and understand how it makes predictions in the real world.
“Our goal with this study is to take the first step in enabling users to have a full conversation with machine-learning models and understand why they make certain predictions, so they can make better decisions about whether to listen to the model,” said Alexandra Zytek, an electrical engineering and computer science (EECS) graduate student and lead author of the paper.
She is joined on the paper by MIT postdoc Sara Pido; Sarah Alnegheimish, an EECS graduate student; Laure Berti-Équille, a research director at the French National Research Institute for Sustainable Development; and senior author Kalyan Veeramachaneni, a principal research scientist in the Laboratory for Information and Decision Systems. The research will be presented at the IEEE Big Data Conference.
Clarifying explanations
The researchers focused on a popular type of machine-learning explanation called SHAP. In a SHAP explanation, every feature the model uses to make a prediction is assigned a value. For example, if a model predicts house prices, one of its features might be the location of the house. Location would be assigned a positive or negative value indicating how much that feature modifies the model’s overall prediction.
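As a rough illustration of what a SHAP explanation contains (a sketch, not code from the paper), the snippet below computes per-feature SHAP values for a single house-price prediction. The California housing dataset and XGBoost model are illustrative assumptions.

```python
# Minimal sketch: per-feature SHAP values for one house-price prediction.
# Assumes the shap, xgboost, and scikit-learn packages are installed;
# the dataset and model are illustrative, not the paper's setup.
import shap
import xgboost
from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = xgboost.XGBRegressor(n_estimators=100).fit(X, y)

explainer = shap.Explainer(model)      # SHAP assigns each feature a signed value
explanation = explainer(X.iloc[:1])    # explain a single prediction

# Positive values push the predicted price up; negative values push it down.
for feature, value in zip(X.columns, explanation.values[0]):
    print(f"{feature:>12s}: {value:+.3f}")
```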
Typically, SHAP explanations are presented as bar graphs showing which features are most or least important. But for models with more than 100 features, this bar chart can quickly become unwieldy.
“As researchers, we have to make a lot of choices about what to present visually. If we choose to show only the top 10 features, people might wonder what happened to another feature that isn’t in the plot. Using natural language means we don’t have to make those choices,” Veeramachaneni said.
However, rather than leveraging a large language model to generate an explanation from scratch, the researchers use the LLM to convert an existing SHAP explanation into a readable narrative.
Zytek explains that having the LLM handle only the natural-language part of the process limits the opportunity to introduce inaccuracies into the explanation.
Their system, called EXPLINGO, is divided into two parts that work together.
The first component, called NARRATOR, uses an LLM to create a narrative description of a SHAP explanation that matches the user’s preferences. By initially feeding NARRATOR three to five written examples of narrative explanations, the LLM mimics that style when generating text.
“Rather than having users try to define what type of explanation they are looking for, it is easier to just have them write what they want to see,” Zytek said.
This allows NARRATOR to be easily customized for new use cases by presenting it with a different set of hand-written examples.
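A hedged sketch of how a NARRATOR-style few-shot prompt might be assembled is shown below; the prompt wording and the `llm_complete` placeholder are assumptions, not the paper’s actual implementation.

```python
# Sketch of NARRATOR-style few-shot prompting. The prompt wording is an
# assumption, not the paper's actual prompt; `llm_complete` is a placeholder
# for whatever LLM API is in use.
def build_narrator_prompt(shap_explanation: str, example_narratives: list[str]) -> str:
    examples = "\n\n".join(f"Example narrative:\n{ex}" for ex in example_narratives)
    return (
        "Convert the following SHAP explanation into a short narrative that "
        "matches the style of the examples. Do not add, drop, or alter any "
        "feature values.\n\n"
        f"{examples}\n\n"
        f"SHAP explanation:\n{shap_explanation}\n\n"
        "Narrative:"
    )

# Usage (hypothetical):
# narrative = llm_complete(build_narrator_prompt(shap_text, user_examples))
```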
After NARRATOR creates a plain-language explanation, the second component, GRADER, uses an LLM to score the narrative on four metrics: conciseness, accuracy, completeness, and fluency. GRADER automatically prompts the LLM with the text produced by NARRATOR and the SHAP explanation it describes.
“We found that even if an LLM makes a mistake while performing a task, it generally does not make a mistake when checking or verifying that task,” she said.
Users can also customize GRADER to assign different weights to each metric.
“You could imagine, for example, that in high-stakes situations, accuracy and completeness would be weighted much more heavily than fluency,” she added.
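As a hedged sketch of that kind of weighted scoring (the helper names and 0–1 scale are assumptions, not the paper’s code), GRADER-style aggregation might look like the following.

```python
# Sketch of GRADER-style weighted scoring (hypothetical helper names).
# Each metric is assumed to be rated on a 0-1 scale.
METRICS = ("conciseness", "accuracy", "completeness", "fluency")

def score_metric(metric: str, narrative: str, shap_explanation: str) -> float:
    """Placeholder: prompt an LLM to rate `narrative` against the SHAP
    explanation on `metric`, returning a score in [0, 1]."""
    raise NotImplementedError("wire up an LLM call here")

def grade(narrative: str, shap_explanation: str, weights: dict[str, float]) -> float:
    """Weighted average of the four metric scores, in [0, 1]."""
    scores = {m: score_metric(m, narrative, shap_explanation) for m in METRICS}
    return sum(weights[m] * scores[m] for m in METRICS) / sum(weights.values())

# High-stakes setting: weight accuracy and completeness more than fluency.
# grade(narrative, shap_text,
#       weights={"conciseness": 1, "accuracy": 3, "completeness": 3, "fluency": 1})
```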
Analyzing narratives
One of the biggest challenges for Zytek and her colleagues was tuning the LLM so that it produced natural-sounding narratives. The more guidelines they added to control style, the more likely the LLM was to introduce errors into the explanation.
“It took a lot of adjusting to find and fix each mistake one at a time,” she said.
To test their system, the researchers took nine machine-learning datasets with explanations and had different users write narratives for each dataset. This allowed them to evaluate NARRATOR’s ability to imitate a unique style. They used GRADER to score each narrative explanation on all four metrics.
Ultimately, the researchers found that their system could produce high-quality narrative explanations and effectively mimic different writing styles.
Their results show that providing a few manually written example explanations significantly improves the narrative style. However, those examples must be written carefully; including comparative words, such as “bigger,” can cause GRADER to flag an accurate explanation as incorrect.
Based on these results, the researchers hope to explore techniques that can help their system handle comparative words better. They also hope to extend EXPLINGO by adding plausibility to explanations.
In the long term, they hope to use this work as a stepping stone to an interactive system where users can ask the model follow-up questions about the interpretation.
“This will help with decision-making in many ways. If people disagree with the model’s predictions, we want them to be able to quickly figure out whether their intuition is correct, or whether the model’s intuition is correct, and where the difference comes from,” Zytek said.