Paper: Agentic Systems in Radiology: Design, Applications, Evaluation and Challenges
Beyond building ever more capable AI models, much can be gained from making the models we already have work better with our systems and workflows. This is a critical step toward making these models more useful for radiology, a field of highly complex, multi-step tasks.
Building artificial agents — systems that observe their environment and act on it — has long been a goal of AI. The emergence of large language models (LLMs) has recently made this pursuit much more feasible, thanks to their remarkable ability to process and interact in natural language. This enables LLMs to plan, reason, and make decisions, though how far that capability extends remains a matter of active research. Embedding LLMs within frameworks where they can iteratively refine their context and select between tools allows them to interact flexibly with connected systems.
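To make the idea of iteratively refining context and selecting between tools concrete, here is a minimal sketch in Python. It is not taken from the review: the tool functions, the patient ID, the canned findings, and `stub_policy` (a hard-coded stand-in for the LLM's decision step) are all hypothetical and exist only to show the shape of the loop.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools; in a real deployment these would query connected systems.
def lookup_prior_report(patient_id: str) -> str:
    return f"Prior report for {patient_id}: 4 mm nodule, right upper lobe."

def measure_nodule(patient_id: str) -> str:
    return "Current nodule diameter: 6 mm."

TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_prior_report": lookup_prior_report,
    "measure_nodule": measure_nodule,
}

def stub_policy(context: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM call: choose the next action from the context so far."""
    joined = " ".join(context)
    if "Prior report" not in joined:
        return ("lookup_prior_report", "")
    if "Current nodule" not in joined:
        return ("measure_nodule", "")
    return ("finish", "Nodule grew from 4 mm to 6 mm; flag for radiologist review.")

def run_agent(task: str, patient_id: str,
              policy: Callable[[List[str]], Tuple[str, str]] = stub_policy,
              max_steps: int = 5) -> List[str]:
    """Observe-act loop: each tool result is appended to the context
    before the policy decides on the next step."""
    context = [task]
    for _ in range(max_steps):
        action, answer = policy(context)
        if action == "finish":
            context.append(answer)
            return context
        context.append(TOOLS[action](patient_id))
    # Hard step limit so a looping policy cannot run forever.
    context.append("Stopped: step limit reached.")
    return context

trace = run_agent("Check nodule follow-up", "P001")
```

The step limit and the explicit `finish` action are design choices of this sketch: they keep the loop bounded and give a human reviewer a clear hand-off point.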
In a new review, we explored how LLMs can help radiology move beyond fragmented "islands of automation" toward adaptive, context-aware systems. This enables a new form of (semi-)automation across a spectrum of autonomy, from LLM-assisted workflows where models make targeted, predefined contributions to more autonomous LLM-based agents that operate in feedback loops, adapting dynamically to evolving tasks and information.
In this work, we:
🔹 Define what LLM-driven agents are
🔹 Outline the technical foundations of agentic AI: planning, reasoning, memory, and tool use
🔹 Frame radiology as an "agent environment" of IT systems (PACS, RIS, HIS) and stakeholders that LLMs can connect with and act upon through interfaces such as MCP, FHIR, and DICOMweb
🔹 Illustrate real-world use cases from report consistency checking to semi-automated lung cancer screening and adaptive multidisciplinary team assistance
🔹 Propose an evaluation framework covering planning, execution, outcomes, and system-level impact
🔹 Discuss the challenges ahead, from cascading errors to governance, safety, and human–AI collaboration
Authors: Christian Bluethgen, Dave Van Veen, Daniel Truhn, Jakob Nikolas Kather, Michael Moor, Małgorzata Połacin, Akshay Chaudhari, Thomas Frauenfelder, Curtis Langlotz, Michael Krauthammer, and Farhad Nooralahzadeh.
Full preprint: https://arxiv.org/abs/2510.09404
Narrated by NotebookLM.
This podcast walks you through the main findings of the review.