AI-generated doctor’s letters show promise in German healthcare study

Researchers at the Medical Center – University of Freiburg demonstrate that artificial intelligence can effectively produce medical documentation, potentially reducing physicians’ administrative burden.

Artificial intelligence (AI) may soon play a larger role in medical documentation, according to a new study from researchers at the Medical Center – University of Freiburg. The study, published in JMIR Medical Informatics, found that AI language models can generate high-quality doctor’s letters: in expert review, 93.1% of the letters produced by the best-performing model were judged usable with minor or no changes. This result could significantly reduce the administrative workload for medical professionals, allowing them to focus more on patient care.

The challenge of medical documentation

In today’s healthcare environment, physicians spend a considerable amount of time on documentation tasks. Recent surveys indicate that doctors dedicate nearly three hours daily to these activities, often extending beyond their regular work hours. This administrative burden has been linked to increased burnout rates and decreased work-life satisfaction among medical professionals.

Dr. Christian Haverkamp, Acting Director of the Institute for Digitalization in Medicine at the Medical Center – University of Freiburg and the study’s leader, emphasises the potential impact of their findings: “Our results show that models specially trained for the German language can provide valuable support in the creation of medical reports. This could significantly simplify workflows in everyday clinical practice.”

Training AI for medical writing

The research team, led by first author Felix Heilmeyer, utilised 82,482 unique patient encounters spanning approximately a decade of clinical practice at the Department of Ophthalmology. This extensive dataset, comprising about 140 MB of uncompressed text, was used to train several AI language models.

The researchers explored different model architectures and training techniques, focusing on models with around 7 billion parameters. This relatively modest size, compared to behemoths like GPT-3 with 175 billion parameters, was chosen to make the technology more accessible and economically viable for healthcare providers.
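To give a sense of why model size matters for on-site deployment, the following back-of-the-envelope sketch estimates the memory needed just to hold a model's weights. It assumes 2 bytes per parameter (16-bit weights); that figure and the function name `weight_memory_gb` are illustrative assumptions, not details taken from the study.

```python
# Rough estimate of the memory needed to store model weights.
# Assumption (not from the study): weights are stored in 16-bit precision,
# i.e. 2 bytes per parameter. Activations and caches are ignored.
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(n_params: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Return the approximate weight memory in gigabytes (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


small = weight_memory_gb(7e9)    # model with around 7 billion parameters
large = weight_memory_gb(175e9)  # GPT-3-scale model with 175 billion parameters

print(f"7B model:   about {small:.0f} GB")
print(f"175B model: about {large:.0f} GB")
```

Under this assumption, the smaller model needs roughly 14 GB for its weights, against roughly 350 GB for a 175-billion-parameter model, a 25-fold difference that tracks the ratio of parameter counts.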

Interestingly, the study found that a model specifically designed for the German language, BLOOM-CLP-German, outperformed more extensively trained multilingual models like LLaMA and LLaMA-2. This suggests that language alignment may be more crucial than extended training periods for generating high-quality medical documentation.

Evaluating AI-generated letters

The research team employed a two-step evaluation process to assess the quality of the AI-generated doctor’s letters. First, they used another AI model, Claude-v2, to compare the generated text with human-written reports across three main categories: main diagnosis, therapeutic procedures, and recommendations for further intervention.

Following this automated evaluation, two senior physicians independently reviewed 102 reports generated by the best-performing model. They judged 93.1% of the reports suitable for use with minor or no changes.
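The article gives the share of acceptable reports but not the underlying count. As a quick check, the sketch below searches for the whole-number counts out of 102 that round to 93.1%. The resulting count is an inference from the two published figures, not a number stated in the text, and the helper name `consistent_counts` is hypothetical.

```python
# Which whole-number counts out of 102 reports round to the reported 93.1%?
# The count itself is inferred here; the article states only the percentage.
TOTAL_REPORTS = 102
REPORTED_PERCENT = 93.1


def consistent_counts(total: int, percent: float, decimals: int = 1) -> list[int]:
    """Return every count k whose share of `total` rounds to `percent`."""
    return [
        k for k in range(total + 1)
        if round(100 * k / total, decimals) == percent
    ]


counts = consistent_counts(TOTAL_REPORTS, REPORTED_PERCENT)
print(counts)
```

Only one count satisfies the check, which implies that 95 of the 102 reviewed reports were rated usable and 7 were not.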

Addressing legal and ethical concerns

One of the key advantages of the approach taken in this study is its potential to address legal and ethical concerns surrounding the use of AI in healthcare. By using non-proprietary models that can be deployed on-site, healthcare providers can maintain full control over patient data and comply with strict data protection regulations, such as the European Union’s General Data Protection Regulation (GDPR).

Prof. Dr. Frederik Wenz, Chief Medical Director of the Medical Center – University of Freiburg, highlights the broader implications of this research: “The AI doctor’s letter is an excellent example of how much potential AI applications have in medicine. For such solutions, we need bright minds who are willing to experiment and develop new things. I am delighted that we have created an environment at the University Medical Center Freiburg that strongly promotes these activities.”

Future directions and limitations

While the results of this study are promising, the researchers acknowledge several limitations and areas for future work. The evaluation was conducted in a dedicated research setting, and it remains to be seen how AI writing assistance will perform in real clinical environments with more complex cases.

The team suggests that leveraging German clinical corpora for pre-training could provide useful in-domain semantics, potentially improving the models’ performance further. They plan to explore the use of their models in real-world settings in future studies.

As healthcare systems worldwide grapple with increasing administrative burdens on medical professionals, this research offers a glimpse into a future where AI can shoulder some of that load. By demonstrating the feasibility of using non-proprietary, on-site AI models to generate high-quality medical documentation, the study paves the way for more efficient, cost-effective, and compliant AI solutions in healthcare.

While there is still work to be done before such systems can be widely implemented, the potential benefits for both healthcare providers and patients are significant. As AI continues to evolve and improve, it may soon become an invaluable tool in the medical profession, allowing doctors to focus more on what matters most: patient care.

Reference:

Heilmeyer, F., Böhringer, D., Reinhard, T., et al. (2024). Viability of Open Large Language Models for Clinical Documentation in German Health Care: Real-World Model Evaluation Study. JMIR Medical Informatics, 12, e59617. https://doi.org/10.2196/59617