AI tool for NASH diagnosis receives first EMA qualification

The European Medicines Agency has qualified the first artificial intelligence-based tool designed to evaluate liver biopsies in patients with metabolic dysfunction-associated steatohepatitis (MASH). The system, called AIM-NASH, helps pathologists assess disease severity and reduce variability in histological scoring, with the aim of accelerating drug development for this increasingly common liver condition.

Breakthrough in MASH clinical trials

The European Medicines Agency (EMA) has granted a Qualification Opinion (QO) for AIM-NASH, the first artificial intelligence (AI) tool designed to assist pathologists in assessing liver biopsies for metabolic dysfunction-associated steatohepatitis (MASH, formerly known as non-alcoholic steatohepatitis or NASH). The milestone, announced on 20 March 2025, represents a significant advancement in standardising histological assessment for MASH clinical trials.

MASH is characterised by fat accumulation in the liver that causes inflammation and scarring in people who do not consume significant amounts of alcohol. It is closely associated with obesity, type 2 diabetes, hypertension, dyslipidaemia, and central adiposity. If untreated, MASH can progress to advanced liver disease, including cirrhosis and liver failure.

The AIM-NASH tool employs a machine learning model trained on over 100,000 annotations from 59 pathologists who assessed more than 5,000 liver biopsies across nine large clinical trials. The system helps pathologists analyse liver biopsy scans to determine MASH severity using the NAFLD Activity Score (NAS) and fibrosis staging.
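As a rough illustration of the scoring scheme, the sketch below models a single biopsy read. The component ranges (steatosis 0 to 3, lobular inflammation 0 to 3, ballooning 0 to 2, fibrosis stage 0 to 4) follow the widely used NASH CRN system and are not stated in this article. The class and field names are hypothetical and are not taken from AIM-NASH.

```python
from dataclasses import dataclass

# Upper bounds per component, per the NASH CRN system (assumption: not from the article).
_UPPER_BOUNDS = {
    "steatosis": 3,
    "lobular_inflammation": 3,
    "ballooning": 2,
    "fibrosis_stage": 4,
}


@dataclass(frozen=True)
class BiopsyRead:
    """Hypothetical record of one histological read of a liver biopsy."""

    steatosis: int
    lobular_inflammation: int
    ballooning: int
    fibrosis_stage: int  # staged separately; not part of the NAS

    def __post_init__(self) -> None:
        # Reject scores outside the assumed ordinal ranges.
        for name, upper in _UPPER_BOUNDS.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be between 0 and {upper}, got {value}")

    @property
    def nas(self) -> int:
        """NAFLD Activity Score: the sum of the three activity components (0 to 8)."""
        return self.steatosis + self.lobular_inflammation + self.ballooning


read = BiopsyRead(steatosis=2, lobular_inflammation=1, ballooning=1, fibrosis_stage=2)
print(read.nas)  # prints 4
```

Fibrosis is kept outside the sum because it is staged separately from the activity score.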

Reducing variability in MASH assessment

A critical challenge in MASH clinical trials has been the high variability in histological assessments. Even expert pathologists frequently disagree on the severity of inflammation or scarring in biopsy samples, making it difficult to reliably measure treatment efficacy.

According to the EMA’s Committee for Medicinal Products for Human Use (CHMP), evidence demonstrates that AIM-NASH biopsy readings, verified by one expert pathologist, can reliably determine MASH disease activity with less variability than the current standard, which typically requires consensus among three independent pathologists.

The qualification findings indicate that the AI-assisted approach is superior to manual pathologist scoring for hepatocellular ballooning and lobular inflammation, two particularly challenging aspects of MASH assessment, and non-inferior for steatosis and fibrosis evaluation.

“The tool is an aid to a single central pathologist that is to be used for enrolment/inclusion of patients into clinical phase 2 and phase 3 trials in MASH as well as for the evaluation of the study outcomes (primary or secondary) in case this is intended to be based on histology evaluation,” states the EMA qualification document.

Enhancing clinical trial efficiency

The AIM-NASH system is expected to improve both the reliability and efficiency of clinical trials for new MASH treatments. By reducing variability in measuring disease activity, researchers may obtain clearer evidence on treatment benefits with fewer patients. This could ultimately accelerate the delivery of effective treatments to patients.

The qualification opinion specifies that “the pathologist will review the output of AIM-NASH and take an active role in its interpretation by accepting or rejecting each of the NAS components and fibrosis stage, after confirming sample evaluability and determining the presence of any additional findings.”

Importantly, the tool is not intended to replace pathologist expertise but rather supplement it. The pathologist maintains final decision-making authority, with the AI serving as a support tool to enhance consistency.
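The accept-or-reject review described in the qualification opinion could be pictured along the following lines. This is only a sketch: the function, the dictionary keys, and above all the rule that a rejected component is replaced by the pathologist's own score are assumptions for illustration, since the article does not say what happens after a rejection.

```python
from typing import Dict, Optional

COMPONENTS = ("steatosis", "lobular_inflammation", "ballooning", "fibrosis_stage")


def finalise_read(
    ai_scores: Dict[str, int],
    accepted: Dict[str, bool],
    overrides: Optional[Dict[str, int]] = None,
) -> Dict[str, int]:
    """Combine AI output with the pathologist's per-component decisions.

    Hypothetical workflow: an accepted component keeps the AI score; a rejected
    component must come with a pathologist-supplied score (assumption).
    """
    overrides = overrides or {}
    final: Dict[str, int] = {}
    for name in COMPONENTS:
        if accepted[name]:
            final[name] = ai_scores[name]
        elif name in overrides:
            final[name] = overrides[name]
        else:
            raise ValueError(f"{name} was rejected but no pathologist score was supplied")
    return final


ai_output = {"steatosis": 2, "lobular_inflammation": 1, "ballooning": 1, "fibrosis_stage": 3}
decisions = {"steatosis": True, "lobular_inflammation": True, "ballooning": False, "fibrosis_stage": True}

# The pathologist rejects the ballooning score and records their own grade.
print(finalise_read(ai_output, decisions, overrides={"ballooning": 2}))
```

Raising an error when a rejection has no replacement score keeps the final read complete, which reflects the point that the pathologist, not the model, has the last word on every component.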

Technical validation and performance

The AIM-NASH tool underwent extensive validation through multiple studies, including standalone analytical verification, integrated analytical verification, platform validation, overlay validation, and clinical validation.

In clinical validation studies involving over 1,500 cases, AIM-NASH demonstrated weighted kappa values of 0.677 for steatosis, 0.419 for lobular inflammation, 0.563 for hepatocellular ballooning, and 0.653 for fibrosis when compared to the ground truth consensus. These results were either non-inferior or superior to independent manual reads by pathologists.
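For readers unfamiliar with the statistic, weighted kappa measures agreement between two sets of ordinal scores after correcting for chance, penalising large disagreements more than small ones. A value of 1 means perfect agreement and 0 means agreement no better than chance. The sketch below is a minimal pure-Python version; the article does not state which weighting scheme was used in the validation studies, so the linear weights here are an assumption, and the example scores are invented.

```python
from collections import Counter
from typing import Sequence


def weighted_kappa(
    rater_a: Sequence[int],
    rater_b: Sequence[int],
    n_categories: int,
    power: int = 1,
) -> float:
    """Cohen's weighted kappa for ordinal scores in 0..n_categories-1.

    power=1 gives linear weights, power=2 quadratic weights (assumed options).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of cases")
    if n_categories < 2:
        raise ValueError("at least two categories are required")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of score pairs
    margin_a = Counter(rater_a)                # how often rater A used each score
    margin_b = Counter(rater_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (abs(i - j) / (n_categories - 1)) ** power
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (margin_a[i] / n) * (margin_b[j] / n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_disagreement / expected_disagreement


# Invented fibrosis stages (0 to 4) for six biopsies: tool versus consensus read.
tool_scores = [0, 1, 2, 3, 4, 2]
consensus_scores = [0, 1, 2, 2, 4, 3]
print(round(weighted_kappa(tool_scores, consensus_scores, n_categories=5), 3))
```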

Reproducibility was also evaluated. Agreement between AIM-NASH scores for slides scanned by different operators on different scanners exceeded 85% for hepatocellular ballooning and approached that threshold for the other NASH components.

“For this proposed context of use, performance must satisfy both high levels of accuracy and consistency or reproducibility requirements,” noted the qualification document. “The combination of the accuracy demonstrated overall for steatosis, fibrosis, inflammation, and ballooning, and for the specific clinical trial composite scores comprising a large range of disease activity with varying individual histologic component scores and the superior repeatability/reproducibility of AIM-NASH compared to manual pathology should result in more accurate, standardized and consistent enrolment and detection of steatosis grade change or fibrosis stage change for a patient in a trial.”

Future considerations

The qualified tool is “locked,” meaning that modifying or replacing the machine learning model may require re-qualification. The CHMP nonetheless encourages optimisation of the model, while acknowledging that major changes may require reassessment.

This qualification is part of EMA’s broader AI workplan, which aims to ensure safe and responsible use of artificial intelligence across the European medicines regulatory network. As the first AI tool qualified by the EMA for assessing liver biopsies in MASH clinical trials, AIM-NASH represents an important step toward improving clinical trial design and accelerating drug development for this increasingly prevalent liver disease.