
Artificial Intelligence in Proton Therapy: What Works, What Does Not, and What Is Next.

Cancer journal (Sudbury, Mass.), 2026, Vol. 32(2)

Li H, Yang M, Ndassi DK, Meng S, Jia X


APA Li H, Yang M, et al. (2026). Artificial Intelligence in Proton Therapy: What Works, What Does Not, and What Is Next. Cancer journal (Sudbury, Mass.), 32(2). https://doi.org/10.1097/PPO.0000000000000821
MLA Li H, et al. "Artificial Intelligence in Proton Therapy: What Works, What Does Not, and What Is Next." Cancer journal (Sudbury, Mass.), vol. 32, no. 2, 2026.
PMID 41880273

Abstract

Artificial intelligence is increasingly shaping the evolution of proton therapy, with applications spanning imaging, treatment planning, quality assurance, adaptive workflows, and outcome modeling. Unlike conventional task-specific algorithms, modern AI methods, including machine learning and deep learning, enable integration of heterogeneous data and capture complex relationships across the clinical workflow. These capabilities are particularly relevant in proton therapy, where sensitivity to range uncertainty, anatomic variation, and biological heterogeneity presents persistent clinical and operational challenges. This review summarizes current and emerging AI applications in proton therapy, including image reconstruction and synthesis, segmentation, dose prediction, robustness and uncertainty management, biological optimization, and adaptive treatment strategies. We also discuss the expanding role of AI in quality assurance and workflow coordination, emphasizing the distinction between task-level automation and workflow-level intelligence. Finally, we address broader considerations related to clinical validation, safety, interpretability, economic value, and access, which will be critical for translating AI-enabled proton therapy into routine clinical practice.


1. Introduction and Scope
Artificial intelligence (AI) has rapidly expanded across radiation therapy (RT) [1–4], influencing imaging[5], treatment planning[6], quality assurance[7], and outcome modeling[8]. In photon-based workflows, AI has primarily served to automate labor-intensive tasks or improve efficiency within well-established clinical paradigms[5, 9]. Proton therapy, however, presents a distinct clinical and technical context compared to photon-based RT, largely due to its sensitivity to patient-specific anatomy and tissue composition[10]. These characteristics introduce both opportunities and constraints for AI integration, such that the role of AI in proton therapy cannot be understood as a simple extension of experience from photon RT.
The clinical impact of this sensitivity in proton therapy is that small uncertainties in imaging, patient setup, or anatomical change can translate into clinically meaningful deviations in dose distribution[11]. To manage this risk, proton therapy workflows have evolved around conservative planning margins, robustness evaluation, and resource-intensive verification strategies that are designed to mitigate uncertainty rather than resolve it at the individual patient level[12, 13]. While these approaches have enabled safe clinical delivery, they impose practical limits on treatment adaptation, scalability, and broader access. It is within this tension between physical precision and operational burden that AI has the potential to play a transformative role.
Importantly, not all AI applications address the same problems or operate at the same level of the clinical workflow. Task-specific models, such as those used for image segmentation, dose prediction, or stopping power estimation, directly interact with physical or numerical components of treatment planning and delivery[14, 15]. Other forms of AI, including large language models, operate at a different layer, supporting reasoning, coordination, and interpretation across complex workflows.[16–18] Treating these approaches as interchangeable risks obscuring their distinct capabilities, limitations, and safety considerations.
The existing literature on AI in proton therapy reflects this diversity. Numerous studies have demonstrated technical feasibility for specific tasks, often within single institutions or controlled settings. Fewer studies have addressed generalizability, robustness, or clinical integration at scale. Moreover, the maturity of AI applications varies widely across domains, ranging from near-routine use in segmentation to exploratory investigations in biologically guided planning and outcome prediction. A critical synthesis therefore requires not only summarizing what has been attempted, but also assessing what is ready, what remains limited, and what directions are most likely to influence clinical practice.
In this review, we examine AI applications across the proton therapy workflow, with emphasis on areas where AI addresses challenges intrinsic to proton physics and clinical delivery. We focus on imaging and segmentation, stopping power and range modeling, treatment planning and optimization, quality assurance and adaptation, and cross-cutting workflow intelligence. Rather than providing an exhaustive catalog, we aim to distinguish demonstrated capability from aspirational applications, highlight domain-specific risks, and identify pathways toward responsible clinical integration. By doing so, we seek to clarify not only how AI is being used in proton therapy, but how it may realistically shape the field in the coming years.

2. Data Foundations
The effectiveness of AI in proton therapy is fundamentally constrained by data availability, quality, and diversity. Unlike more established photon therapy workflows, proton therapy is delivered at a relatively limited number of centers, each with distinct treatment platforms, imaging protocols, and planning practices. As a result, datasets suitable for AI development are often fragmented, institution-specific, and heterogeneous, posing challenges for model generalizability and clinical translation.
Proton therapy data are also inherently multimodal, encompassing volumetric imaging, structure sets, spot-level delivery parameters, and longitudinal treatment information. While this richness creates opportunities for advanced modeling, it also increases the complexity of data curation and harmonization. Differences in scanner calibration, beam models, and contouring conventions can introduce systematic variability that confounds naïve model training. AI systems trained without careful attention to these factors risk learning institution-specific artifacts rather than clinically meaningful patterns.
In addition, proton therapy represents a comparatively young modality relative to its photon counterpart, and the characteristics of the data it generates have changed over time. Advances such as pencil beam scanning, intensity-modulated proton therapy, adaptive workflows, image guidance strategies, and emerging delivery techniques have continuously reshaped planning and treatment paradigms. Consequently, historical datasets may reflect outdated practices that are no longer representative of current clinical standards. This temporal heterogeneity in both technology and clinical practice introduces an additional layer of complexity for AI development, as models trained on legacy data may not generalize to modern systems without careful curation and contextualization. This issue must be addressed explicitly, particularly for longitudinal or multi-generation AI models.
Privacy and governance considerations further complicate centralized data aggregation[19]. Proton therapy datasets are often linked to rare disease populations or pediatric cohorts, where patient numbers are small and reidentification risk is heightened. Traditional approaches that rely on pooling data at a single site may therefore be impractical or undesirable, limiting the scale at which AI models can be developed and validated.
Federated learning has emerged as a promising strategy to address some of these challenges by enabling collaborative model training without direct data sharing[20]. In a federated framework, models are trained locally at participating institutions, and only model updates are exchanged and aggregated centrally. This approach allows institutions to retain control of sensitive patient data while contributing to a shared learning process. For proton therapy, where multi-institutional datasets are essential but difficult to centralize, federated learning offers a pragmatic pathway toward more generalizable AI models.
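The aggregation scheme described above can be sketched in a few lines. The following is a minimal, hypothetical illustration of federated averaging (FedAvg): a linear least-squares model stands in for each institution's network, and the two synthetic "sites," learning rate, and round counts are illustrative assumptions rather than a production framework.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=20):
    # On-site training step: plain gradient descent on a linear
    # least-squares model, standing in for an institution's network.
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_average(global_w, site_data, rounds=10):
    # FedAvg: every site trains locally; only weight vectors are
    # exchanged and pooled, weighted by local sample count.
    # No raw patient data leaves any site.
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in site_data]
        sizes = np.array([len(y) for _, y in site_data], dtype=float)
        global_w = sum(w * n for w, n in zip(updates, sizes)) / sizes.sum()
    return global_w

# Two synthetic "institutions" drawn from the same underlying relationship.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -0.5])
sites = []
for n in (40, 60):
    X = rng.normal(size=(n, 2))
    sites.append((X, X @ true_w + 0.01 * rng.normal(size=n)))

w_global = federated_average(np.zeros(2), sites)
```

In a real deployment the local step would be stochastic training of a deep model and the exchanged updates would typically be compressed or privacy-protected; weighting by sample count is the standard FedAvg choice, and it is exactly this weighting that imbalanced site distributions can distort.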
However, federated approaches introduce their own complexities. Variability in data distributions across institutions can lead to imbalanced learning, and differences in computational infrastructure may affect participation and performance. Moreover, federated learning does not eliminate the need for careful dataset definition, quality control, and external validation. Without consistent labeling standards and performance benchmarks, federated models may converge without achieving true clinical robustness.
In summary, data foundations represent both the primary enabler and the primary bottleneck for AI in proton therapy. Addressing issues of heterogeneity, privacy, and collaboration is essential for translating promising AI methods into reliable, multi-institutional clinical tools. Federated learning provides a viable framework for progress, but its success will depend on coordinated governance, shared standards, and sustained cross-center collaboration.

3. Artificial Intelligence Across the Proton Therapy Workflow
3a. Imaging and Segmentation
Imaging and target delineation form the foundation of proton therapy, where geometric accuracy and tissue characterization directly affect range estimation, dose distribution, and robustness[13, 21]. Conventional imaging and segmentation workflows rely heavily on manual effort and modality-specific assumptions, which become increasingly limiting as treatment complexity and adaptation demands grow. AI introduces capabilities that fundamentally change how imaging data can be generated, interpreted, and integrated into proton therapy workflows.
A key contribution of AI is the ability to generate modality-translated images that are suitable for proton dose calculation. Synthetic CT generation from cone-beam CT or magnetic resonance imaging addresses longstanding barriers to adaptive and MRI-only proton therapy workflows[15, 22, 23]. Traditional approaches to synthetic imaging require explicit physical modeling or deformable image registration, both of which are sensitive to anatomical change and imaging artifacts. In contrast, deep learning models learn voxel-level relationships directly from paired datasets, enabling rapid generation of synthetic CT volumes that preserve anatomical detail relevant to proton dose calculation. This capability reduces dependence on repeated diagnostic-quality CT scans and supports more frequent image-based assessment during treatment.
AI also uniquely enables scalable, consistent segmentation across complex anatomy. Manual contouring remains a major bottleneck in proton therapy, particularly for sites requiring extensive organ-at-risk delineation or repeated recontouring during adaptive workflows. Deep learning–based auto-segmentation models can produce contours with consistency that is difficult to achieve across planners and institutions, especially for small or anatomically variable structures.[5] While conventional atlas-based methods offer partial automation, they often struggle with large anatomical deviations or postoperative changes, where data-driven models demonstrate greater robustness.
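Agreement between automated and reference contours is conventionally quantified with the Dice similarity coefficient; a minimal sketch (the mask shapes and overlap below are arbitrary illustrations):

```python
import numpy as np

def dice(a, b):
    # Dice similarity coefficient between two binary masks:
    # 2*|A ∩ B| / (|A| + |B|), the standard overlap metric used to
    # benchmark auto-segmentation against reference contours.
    a = a.astype(bool)
    b = b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

# Two toy 2D masks offset by one voxel in each direction.
ref = np.zeros((10, 10), dtype=bool)
ref[2:8, 2:8] = True           # 36 voxels
auto = np.zeros((10, 10), dtype=bool)
auto[3:9, 3:9] = True          # 36 voxels, 25 shared with ref
score = dice(ref, auto)        # 2*25/72 ≈ 0.694
```

Volumetric overlap alone can mask boundary errors along the beam path, which is one reason contour evaluation in proton therapy also draws on distance-based and anatomical-plausibility checks rather than a single Dice threshold.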
AI also facilitates integration of time-resolved and four-dimensional imaging into proton workflows. Proton therapy is particularly sensitive to motion and anatomical variation, yet conventional segmentation strategies are typically static. Deep learning approaches applied to sparse or low-quality volumetric imaging enable reconstruction and segmentation of time-resolved datasets, supporting motion-aware planning and dose evaluation[24, 25]. This capability is especially relevant for pediatric and thoracic applications, where repeated high-dose imaging is undesirable and anatomical change is common.
Despite these advantages, limitations remain. AI-generated images and contours may introduce systematic errors that are not immediately apparent, particularly when models are applied outside their training domain. Image synthesis errors can propagate into dose calculation, and segmentation inaccuracies may disproportionately affect structures critical to proton range. As such, clinical deployment requires careful validation, uncertainty awareness, and integration with existing quality assurance processes rather than blind automation.
In summary, AI uniquely enables proton therapy imaging workflows by making synthetic imaging, large-scale segmentation, and adaptive recontouring feasible within clinical time constraints. These capabilities address long-standing bottlenecks in proton therapy and create a foundation for downstream advances in range modeling, planning, and adaptation, while underscoring the need for continued evaluation and governance.

3b. Range and Stopping Power Modeling
Range uncertainty is not merely one of several technical challenges in proton therapy; it is the defining limitation that shapes treatment margins, robustness strategies, and clinical confidence[26]. Unlike photon therapy, where modest uncertainties in tissue composition have limited impact on dose deposition, proton therapy relies critically on accurate estimation of stopping power along the beam path. For decades, this dependency has been managed through population-based calibration curves, conservative margin recipes, and robustness evaluation frameworks that acknowledge uncertainty without resolving it at the individual patient level.[13, 27]
AI introduces a fundamentally different approach to this problem. Conventional stopping power estimation relies on explicit tissue classification and analytically derived relationships between Hounsfield units and relative stopping power, typically calibrated on population averages. In contrast, AI-based methods learn voxel-level mappings directly from imaging data and reference stopping power information, implicitly capturing nonlinear dependencies, imaging artifacts, and tissue heterogeneity that are difficult to model analytically[28–30]. This shift from calibration to data-driven inference is not incremental; it represents a change in how uncertainty itself is modeled.
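The contrast between calibration and learned inference can be illustrated with a toy example. The piecewise-linear breakpoints and the simulated systematic deviation below are illustrative stand-ins, not clinical calibration data; a per-bin conditional mean plays the role of a learned voxel-wise model.

```python
import numpy as np

# Conventional approach: one population-level piecewise-linear curve
# from Hounsfield units (HU) to relative stopping power (RSP).
HU_NODES  = np.array([-1000.0, 0.0, 100.0, 1500.0])
RSP_NODES = np.array([0.001, 1.0, 1.07, 1.85])

def rsp_calibration(hu):
    # Same curve for every patient, every voxel.
    return np.interp(hu, HU_NODES, RSP_NODES)

def fit_datadriven_rsp(hu_train, rsp_train, n_bins=64):
    # Minimal stand-in for a learned model: per-bin conditional means
    # of reference RSP given observed HU. A deep network generalizes
    # this idea, adding spatial context and artifact awareness.
    edges = np.linspace(hu_train.min(), hu_train.max(), n_bins + 1)
    idx = np.clip(np.digitize(hu_train, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    means = np.array([rsp_train[idx == b].mean() if np.any(idx == b) else np.nan
                      for b in range(n_bins)])
    ok = ~np.isnan(means)
    return lambda hu: np.interp(hu, centers[ok], means[ok])

# Simulated "truth" deviating systematically from the calibration curve,
# mimicking tissue effects a population curve cannot capture.
rng = np.random.default_rng(0)
hu = rng.uniform(-200.0, 1200.0, 20000)
rsp_true = rsp_calibration(hu) + 0.05 * np.tanh(hu / 300.0)
rsp_obs = rsp_true + 0.005 * rng.normal(size=hu.size)

model = fit_datadriven_rsp(hu, rsp_obs)
rmse_cal   = np.sqrt(np.mean((rsp_calibration(hu) - rsp_true) ** 2))
rmse_model = np.sqrt(np.mean((model(hu) - rsp_true) ** 2))
```

In this construction the data-driven estimator recovers the systematic deviation that the fixed calibration absorbs as error, which is the core of the calibration-to-inference shift described above.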
A growing body of literature demonstrates the technical feasibility of AI-based stopping power estimation across a range of imaging modalities. Deep learning models trained on paired CT or dual-energy CT datasets have shown improved agreement with reference stopping power maps compared with conventional single-energy CT calibration, particularly under conditions of image noise, artifacts, and tissue heterogeneity that challenge analytic calibration methods.[30, 31] Importantly, these gains are not limited to ideal imaging conditions. Several studies have extended AI-based approaches to cone-beam CT, a modality long considered unsuitable for proton dose calculation due to scatter contamination and unreliable Hounsfield units. By learning modality-specific error characteristics, AI models have enabled generation of stopping power maps from cone-beam CT with accuracy sufficient for investigational adaptive proton therapy workflows.[15]
The implications of this capability are substantial. Cone-beam CT is acquired routinely during treatment, yet has historically been excluded from proton range evaluation. AI-based stopping power estimation effectively rehabilitates this data source, allowing anatomical changes that affect range to be assessed without repeated simulation CT scans. This directly addresses one of the major practical barriers to adaptive proton therapy, shifting adaptation from a theoretically appealing concept to an operationally plausible strategy.
AI similarly enables proton therapy workflows that decouple stopping power estimation from conventional CT altogether. MRI-only planning has long been attractive for its superior soft-tissue contrast and lack of ionizing radiation, particularly in pediatric and neuro-oncology settings. However, the absence of electron density information has constrained its use in proton therapy. AI-driven synthetic dual-energy CT generation and direct stopping power inference provide a practical pathway toward MRI-based proton planning, allowing proton-relevant physical information to be inferred from modalities optimized for anatomy rather than physics[32–34].
Beyond enabling new imaging workflows, AI highlights limitations of current robustness paradigms. Existing margin and robustness recipes treat range uncertainty as a fixed, population-level quantity, applied uniformly across patients, beam paths, and imaging conditions. This approach is intentionally conservative, but it is also blunt. Evidence from AI-based stopping power studies suggests that a significant fraction of the uncertainty absorbed into generic margins reflects limitations of the conversion model, such as image noise, partial volume effects, and image artifacts, rather than the irreducible physical variability that sets the lower bound on achievable accuracy.
More importantly, AI enables a transition from deterministic to data-informed uncertainty characterization. Rather than representing range uncertainty as a single percentage value, data-driven models can, in principle, associate uncertainty with specific anatomical regions, imaging modalities, or acquisition conditions. The literature already hints at this possibility: performance of AI-based stopping power estimation varies systematically with image quality, anatomical complexity, and training domain coverage.[15, 35, 36] These dependencies are largely invisible to conventional calibration-based methods but become explicit when modeled statistically. This reframes robustness evaluation as a problem of uncertainty modeling rather than margin selection.
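A simple way to make such uncertainty explicit is an ensemble: train several models and read per-voxel disagreement as an uncertainty map. The sketch below fakes ensemble members by perturbing a linear HU-to-RSP conversion (all coefficients illustrative); real members would be independently trained networks.

```python
import numpy as np

def ensemble_predictions(hu_map, n_members=30, seed=0):
    # Stand-in for an ensemble of trained stopping-power models: each
    # member applies a slightly perturbed conversion, mimicking the
    # spread that independently trained networks exhibit.
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        slope = 1.0 + rng.normal(0.0, 0.01)
        offset = rng.normal(0.0, 0.002)
        members.append(slope * (1.0 + hu_map / 1000.0) + offset)
    return np.stack(members)

# A toy 1D "beam path": water-like tissue with a bone-like segment.
hu_path = np.array([0.0, 10.0, 1400.0, 1500.0, 20.0])
preds = ensemble_predictions(hu_path)

rsp_estimate = preds.mean(axis=0)    # working estimate
rsp_uncert   = preds.std(axis=0)     # per-voxel disagreement

# Disagreement concentrates where the conversion is least constrained
# (here, the bone-like voxels), suggesting locally informed margins
# rather than one global range-uncertainty percentage.
```

The per-voxel map, rather than a single percentage, is what would feed a region-specific robustness evaluation.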
At the same time, AI complicates robustness in ways that must be acknowledged explicitly. Unlike analytic calibration curves, AI models may fail in structured, nonrandom ways when applied outside their training distribution. Errors in stopping power estimation may be spatially correlated and clinically subtle, potentially escaping detection through standard robustness perturbations. This introduces a nontrivial risk of silent failure: a plan may appear robust under conventional setup and range perturbations while being systematically biased due to model error. Consequently, AI-based range modeling does not eliminate the need for robustness evaluation; it heightens the importance of coupling data-driven inference with physics-based safeguards and conservative clinical oversight.
Taken together, the literature supports a nuanced interpretation of AI’s role in proton range modeling. AI-based stopping power estimation has demonstrated technical feasibility and, in some settings, improved accuracy relative to conventional methods. However, its most transformative impact is unlikely to be immediate margin reduction. Rather, AI enables expansion of feasible imaging modalities, and opens the door to patient-specific uncertainty characterization. Whether this potential translates into routine clinical practice will depend less on algorithmic performance alone than on rigorous multi-institutional validation, explicit evaluation of failure modes, and disciplined integration with existing robustness frameworks.

3c. Treatment Planning, Dose Prediction, and Biological Optimization
Proton therapy treatment planning is shaped by competing demands, including robustness to uncertainty, plan quality consistency, biological complexity, and clinical efficiency. While advances in optimization algorithms and computational power have improved efficiency, these approaches remain constrained by predefined abstractions that simplify delivery and optimization at the expense of patient-specific flexibility. AI introduces capabilities that complement, rather than replace, conventional planning by enabling data-driven guidance across multiple stages of the planning process.
One of the most established roles of AI in proton therapy planning is the ability to learn plan quality implicitly from prior clinical experience.[37] Deep learning models trained on curated proton plans can encode relationships among patient anatomy, beam arrangement, and achievable dose distributions without relying on hand-crafted planning rules.[38, 39] By reflecting how trade-offs are resolved in clinically accepted plans, these models can generate planning objectives or reference dose distributions that reduce inter-planner variability and reliance on expert manual tuning.[40, 41] In this context, AI functions as an intelligent prior that narrows the solution space to clinically realistic regions while preserving physician oversight and physics-based optimization.
AI also enables near-instantaneous dose estimation, which fundamentally alters how planning workflows can be structured. Conventional proton dose calculation and iterative optimization are computationally expensive, limiting the number of candidate solutions that can be explored. Deep learning–based dose prediction models provide rapid approximations of three-dimensional dose distributions[42], allowing planners to assess feasibility, compare trade-offs, and refine beam strategies early in the planning process. When integrated into dose-mimicking or inverse planning workflows, these models act as accelerators rather than endpoints, preserving the rigor of final dose calculation while enabling more informed decision-making.
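The accelerator role can be made concrete with a toy triage loop: a cheap surrogate scores every candidate configuration, and the expensive physics-based calculation is reserved for a shortlist. Both scoring functions below are made-up one-dimensional stand-ins (real inputs would be anatomy and full beam geometry).

```python
import numpy as np

def full_dose_quality(angle_deg):
    # Stand-in for an expensive Monte Carlo / pencil-beam calculation
    # plus plan scoring: a smooth toy quality peaking near 40 degrees.
    return np.cos(np.radians(angle_deg - 40.0)) ** 2

def surrogate_quality(angle_deg):
    # Stand-in for a trained dose-prediction network: cheap to evaluate
    # and slightly biased relative to the full calculation.
    return np.cos(np.radians(angle_deg - 38.0)) ** 2 + 0.01

candidates = np.arange(0, 180, 10)

# Triage: the surrogate ranks all candidate angles almost instantly...
scores = surrogate_quality(candidates.astype(float))
shortlist = candidates[np.argsort(scores)[-3:]]

# ...and the full calculation is spent only on the shortlist.
best_angle = max(shortlist, key=full_dose_quality)
```

Note that the surrogate's bias (its peak sits at 38 rather than 40 degrees) is harmless as long as the true optimum survives into the shortlist; the final decision still rests on the rigorous dose engine, mirroring the accelerator-not-endpoint role described above.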
Beyond accelerating existing workflows, AI suggests a future direction of intensity-modulated proton therapy planning that is less dependent on predefined spot lattices and heuristic spot-selection rules. Current proton planning systems rely on fixed lateral spot grids and rule-based pruning strategies that are applied uniformly across patients to balance plan quality, robustness, and delivery efficiency. While effective, these abstractions constrain the solution space before optimization begins and may not reflect patient-specific anatomical or dosimetric needs. Data-driven approaches offer a pathway toward learning which spatial degrees of freedom and spot patterns are most relevant for a given anatomy and uncertainty context, while still respecting discrete energy availability and machine delivery constraints. In this paradigm, AI does not alter physical dose engines or hardware limitations, but guides spot utilization and modulation in a patient-adaptive manner.
Transformer-based architectures further extend AI’s potential role in planning by capturing long-range dependencies inherent to proton dose formation.[43] Proton dose deposition is influenced by interactions along extended beam paths, making it difficult to represent with purely local models. Architectures designed to model global relationships offer a promising framework for learning spot-level interactions and dose patterns that complement traditional dose engines. These approaches remain early-stage, but they highlight a direction toward more structure-aware AI models that align naturally with the physics of proton therapy.
Beyond physical dose optimization, AI also provides a practical pathway toward incorporating biological information into planning. Variations in linear energy transfer (LET) and relative biological effectiveness are recognized features of proton therapy[44], yet their clinical use has been limited by computational burden and modeling uncertainty. Deep learning–based LET estimation enables rapid voxel-level surrogates that can be integrated into planning objectives or evaluation metrics within clinically feasible time frames[38, 45, 46]. While biological optimization remains investigational and requires further clinical correlation, AI lowers the technical barriers that have historically constrained its exploration.
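One common form of LET-aware objective scales physical dose by a linear LET term; with a fast LET surrogate, such a quantity becomes cheap enough to include in routine plan evaluation. The coefficient and voxel values below are purely illustrative, not validated clinical parameters.

```python
import numpy as np

def let_weighted_dose(dose_gy, letd_kev_um, c=0.04):
    # LET-weighted dose of the form D * (1 + c * LETd), a simple proxy
    # for variable biological effect. Both c and the model form remain
    # open research questions; values here are illustrative only.
    return dose_gy * (1.0 + c * letd_kev_um)

# Two OAR voxels receiving identical physical dose: the distal voxel
# sits near end of range, where dose-averaged LET is elevated.
dose = np.array([2.0, 2.0])          # Gy per fraction
letd = np.array([2.0, 8.0])          # keV/um, distal voxel higher

d_bio = let_weighted_dose(dose, letd)
# Same physical dose, but the distal voxel carries a higher weighted
# dose, which a LET-aware objective could penalize during optimization.
```

A deep-learning LET surrogate would supply `letd` in seconds rather than via lengthy Monte Carlo transport, which is the practical barrier this line of work aims to lower.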
Finally, AI supports decision-making upstream of plan generation by assisting with patient selection and feasibility assessment. By linking anatomical features, predicted dose distributions, and toxicity models, AI-based frameworks can help identify patients most likely to benefit from proton therapy before substantial planning resources are expended.[47, 48] This capability is particularly relevant in settings where planning efficiency and appropriate resource allocation are critical.
Despite these capabilities, AI-driven planning remains constrained by data availability, generalizability, and the need for robust validation. Most current models are trained on institution-specific datasets and do not explicitly incorporate uncertainty into training objectives. Moreover, emerging concepts such as patient-adaptive spot utilization and biological optimization require careful evaluation of failure modes before influencing clinical decisions. Accordingly, AI should be viewed as an enabling layer that augments established planning paradigms, enhancing efficiency and flexibility while remaining grounded in physics-based safeguards and clinical oversight.

3d. Quality Assurance, Adaptation, and Digital Twins
Quality assurance (QA) and adaptation are central to the safe delivery of proton therapy, where small deviations in anatomy, setup, or delivery parameters can translate into clinically meaningful dosimetric consequences. Conventional QA workflows rely heavily on manual review, deterministic checks, and periodic verification imaging, approaches that are increasingly strained by the complexity of modern proton therapy techniques. AI introduces capabilities that fundamentally alter how safety, consistency, and adaptability can be achieved at scale.
A key contribution of AI is its ability to continuously monitor and evaluate complex, high-dimensional data streams that exceed human review capacity. In proton therapy, this includes contours, plans, and delivery-related information that must be assessed for consistency and plausibility. AI-based contour QA tools can automatically identify outliers, inconsistencies, or anatomically implausible segmentations, enabling rapid triage and targeted human review.[49, 50] This function is particularly important as auto-segmentation becomes more prevalent, where silent errors may propagate downstream if not detected.
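The outlier-triage idea can be sketched with crude shape features and a z-score rule; deployed contour-QA tools use far richer geometric, anatomical, and learned features, so everything below (the sphere-like cohort, the feature set, the threshold) is an illustrative assumption.

```python
import numpy as np

def contour_features(mask, voxel_volume=1.0):
    # Cheap shape descriptors for a binary mask: volume and centroid.
    idx = np.argwhere(mask)
    return np.array([idx.shape[0] * voxel_volume, *idx.mean(axis=0)])

def flag_outliers(feature_matrix, z_thresh=3.0):
    # Flag contours whose features deviate strongly from the cohort:
    # a minimal stand-in for learned contour-QA models.
    mu = feature_matrix.mean(axis=0)
    sd = feature_matrix.std(axis=0) + 1e-9
    z = np.abs((feature_matrix - mu) / sd)
    return np.where(z.max(axis=1) > z_thresh)[0]

# Twenty plausible sphere-like "organ" contours plus one implausible one.
rng = np.random.default_rng(1)
zz, yy, xx = np.mgrid[:24, :24, :24]
masks = []
for _ in range(20):
    r = 5.0 + rng.normal(0.0, 0.2)
    masks.append((zz - 12) ** 2 + (yy - 12) ** 2 + (xx - 12) ** 2 <= r ** 2)
bad = np.zeros((24, 24, 24), dtype=bool)
bad[2:4, 2:4, 2:4] = True            # tiny, displaced contour
masks.append(bad)

feats = np.array([contour_features(m) for m in masks])
flagged = flag_outliers(feats)       # only the implausible contour
```

The flagged indices would be routed to targeted human review, which is the triage pattern described above: automation narrows attention, it does not replace the reviewer.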
AI also provides a systematic framework for identifying bias, variability, and failure modes in automated workflows[51]. Traditional QA processes are often binary and rule-based, designed to detect gross errors rather than subtle systematic deviations. Data-driven approaches can quantify performance variability across patient subgroups, anatomical regions, and clinical contexts, offering insights into where models perform reliably and where caution is warranted. This capability supports more transparent and evidence-based governance of AI tools in proton therapy, moving beyond anecdotal confidence toward measurable trust.
Perhaps the most transformative role of AI in this domain is its enablement of practical adaptive proton therapy[52–55]. While the dosimetric rationale for adaptation has long been recognized, its clinical implementation has been limited by the time and expertise required for recontouring, replanning, and verification. AI-based segmentation, dose estimation, and quality checks reduce these barriers, making it feasible to respond to anatomical changes within clinically acceptable time frames. In this sense, AI shifts adaptation from an exceptional intervention to a potentially routine component of proton therapy.
The concept of the digital twin represents an extension of this adaptive paradigm. Digital twins aim to maintain a continuously updated, patient-specific representation of anatomy, delivery, and response throughout the treatment course. AI serves as the enabling layer that integrates imaging, planning, and delivery data into a coherent model capable of informing decision-making in near real time. While still largely conceptual, early demonstrations suggest that such frameworks could support proactive adaptation, scenario evaluation, and longitudinal assessment of treatment fidelity[56, 57].
Despite these advances, AI-enabled QA and adaptation raise important challenges. Automated systems may obscure error provenance, and excessive reliance on AI-generated outputs risks eroding independent verification practices. Moreover, adaptive decisions driven by AI must be carefully bounded within clinically defined safety envelopes, with clear criteria for human intervention. As with other AI applications in proton therapy, successful integration depends on thoughtful workflow design that preserves accountability and leverages AI as a decision-support tool rather than an autonomous authority.
In summary, AI uniquely enables continuous quality monitoring, scalable safety assurance, and feasible online adaptation in proton therapy. By augmenting human oversight and reducing operational barriers, AI provides the technical foundation for adaptive and digital twin–based proton therapy, while reinforcing the need for rigorous validation and disciplined clinical governance.

4. Other Considerations
While much of the current AI literature in proton therapy focuses on technical feasibility and workflow integration, broader considerations related to clinical outcomes, economic value, access, and governance are increasingly relevant as these tools move closer to routine use. This is particularly true as newer classes of AI, including large language models, are introduced into clinical environments, where their influence is more likely to manifest through workflow coordination, documentation, and decision support rather than direct physical modeling. In many cases, evidence in these areas remains limited, and a balanced review requires acknowledging both potential impact and current uncertainty.
Clinical outcomes and toxicity prediction represent a natural extension of AI-enabled proton therapy workflows.[47, 58, 59] By integrating imaging, dose, and clinical features, AI-based models have the potential to support patient selection, personalize treatment strategies, and anticipate normal tissue complications. However, compared with technical domains such as imaging or planning, outcome-driven AI applications in proton therapy remain relatively immature. The main challenges are small cohort sizes, reliance on surrogate endpoints, and the lack of external validation. As a result, while AI-assisted outcome modeling is conceptually attractive, its role in guiding proton therapy decisions remains largely exploratory and should be interpreted cautiously.
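In its simplest form, outcome modeling of this kind reduces to a calibrated classifier over dosimetric and clinical features. The sketch below uses logistic regression with made-up weights and a hypothetical feature vector purely to show the shape of such a model; real NTCP-style models require fitted, externally validated parameters.

```python
import numpy as np

def toxicity_probability(features, weights, bias):
    # Logistic model mapping features to a complication probability,
    # the backbone of many NTCP-style outcome models.
    z = float(features @ weights + bias)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical features: [mean OAR dose (Gy), age / 10, prior surgery flag]
w = np.array([0.15, 0.05, 0.8])      # illustrative, not fitted values
b = -4.0

p_low  = toxicity_probability(np.array([10.0, 5.0, 0.0]), w, b)
p_high = toxicity_probability(np.array([30.0, 7.0, 1.0]), w, b)
# The model orders risk sensibly, but without external validation such
# numbers cannot guide treatment selection.
```

The gap between this toy and clinical utility (cohort size, endpoint definition, external validation) is exactly the immaturity noted above.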
Economic considerations are also central to the adoption of AI in proton therapy. Proton therapy is resource intensive, and AI is often promoted as a means to improve efficiency, reduce planning time, and lower operational burden. Automation of labor-intensive tasks such as contouring, planning, and QA may yield meaningful workflow gains, particularly as treatment volumes increase. However, the costs associated with AI development, validation, integration, and maintenance are nontrivial and are rarely accounted for systematically. Demonstrating true value will require moving beyond time savings to evaluate downstream effects on throughput, staffing models, and clinical outcomes.
Global access and equity considerations further complicate the picture. Proton therapy remains concentrated in high-resource settings, and AI has the potential to either mitigate or exacerbate existing disparities. On one hand, AI-enabled automation and decision support could lower expertise barriers and facilitate more standardized care across institutions. On the other hand, models trained predominantly on data from well-resourced centers may not generalize to diverse clinical environments. Ensuring that AI tools contribute to broader access rather than reinforcing existing inequities will require intentional dataset design, inclusive validation strategies, and attention to deployment contexts. Beyond routine clinical workflows, AI has also been increasingly leveraged in accelerator design, beam delivery optimization, and physics-based modeling [60, 61], enabling the development of more compact, efficient, and potentially more affordable proton therapy systems and thereby supporting broader global access to this treatment modality.
Large language models (LLMs) represent a distinct class of AI whose relevance to proton therapy lies at the workflow level rather than in physics-based computation. Unlike task-specific models used for imaging, dose prediction, or range estimation, LLMs operate on unstructured information and are optimized for synthesis, interpretation, and communication. Early literature in radiation therapy has explored LLMs primarily in assistive roles, such as treatment plan summarization, protocol interpretation, chart review, and clinical documentation, with a consistent emphasis on human oversight rather than autonomous decision-making. In proton therapy, where planning, robustness evaluation, QA, and adaptive workflows generate fragmented outputs across multiple systems, LLMs may offer value by organizing information, highlighting deviations from prior practice, and supporting documentation and governance processes. Importantly, these applications do not replace numerical optimization or physical modeling; instead, they address the growing cognitive and organizational burden associated with increasingly complex proton workflows. Given known limitations, including non-determinism and the risk of plausible but incorrect outputs, LLM use in proton therapy must remain constrained to supervised, assistive functions embedded within established clinical and governance frameworks.
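The "highlighting deviations from prior practice" role described above can be illustrated, independent of any particular LLM, as a simple rule-based check over plan parameters. The parameter names and tolerance ranges below are entirely hypothetical; in an LLM-assisted workflow, flags of this kind would be collected from multiple systems and summarized in natural language for human review.

```python
def flag_deviations(current_plan, reference_ranges):
    """Flag plan parameters that fall outside a site's historical range.
    The model surfaces deviations for human review; it does not make
    the clinical decision."""
    flags = []
    for name, value in current_plan.items():
        lo, hi = reference_ranges.get(name, (float("-inf"), float("inf")))
        if not lo <= value <= hi:
            flags.append(f"{name}={value} outside historical range [{lo}, {hi}]")
    return flags

# Hypothetical plan parameters and institutional reference ranges.
plan = {"num_fields": 5, "setup_margin_mm": 7.0}
ranges = {"num_fields": (2, 4), "setup_margin_mm": (3.0, 5.0)}
deviation_flags = flag_deviations(plan, ranges)
```

Keeping the deterministic check separate from any language-model summarization layer reflects the constraint the review emphasizes: the safety-relevant logic stays auditable, and the LLM is limited to an assistive, supervised role.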
Finally, ethical and governance considerations cut across all AI applications in proton therapy [62, 63]. Issues related to transparency, interpretability, accountability, and human oversight become particularly salient as AI systems influence safety-critical decisions. While professional guidelines increasingly emphasize responsible development and reporting, practical implementation requires embedding AI tools within established clinical governance structures. This includes clear delineation of responsibility, mechanisms for monitoring performance over time, and processes for responding to model failure or drift.
Taken together, these considerations underscore that technical capability alone is insufficient to justify widespread adoption of AI in proton therapy. Clinical impact, value, access, and governance must evolve in parallel to ensure that AI-enhanced workflows translate into meaningful and equitable improvements in patient care.

5. Summary and Conclusion
AI has moved beyond exploratory application in proton therapy and is beginning to influence how core clinical challenges are addressed. However, the maturity and impact of AI vary substantially across domains. Treating AI as a monolithic capability risks obscuring where it meaningfully advances practice and where limitations remain. A critical assessment therefore requires distinguishing demonstrated clinical utility from promising but immature approaches.
Several AI applications have reached a level of readiness where they can support routine clinical workflows. Automated image segmentation has become a practical enabler of adaptive strategies by reducing manual effort and inter-observer variability. AI-assisted treatment planning, particularly in the form of dose prediction and plan quality guidance, has demonstrated value as an assistive tool that accelerates optimization and facilitates exploration of trade-offs without replacing established planning systems. In these areas, AI primarily improves scalability and consistency rather than altering the fundamental treatment paradigm.
The most proton-specific and potentially transformative role of AI lies in stopping power estimation and range uncertainty management. Data-driven approaches have demonstrated improved accuracy relative to conventional calibration methods and, critically, have enabled investigational use of imaging modalities such as cone-beam CT and MRI that were previously impractical for proton dose calculation. Beyond incremental accuracy gains, AI exposes limitations of population-based robustness strategies and opens the possibility of more patient-specific uncertainty characterization. At the same time, these advances introduce new risks, including structured model failure and silent error propagation, underscoring the need for rigorous validation and conservative clinical integration.
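The contrast drawn above between population-based calibration and data-driven, patient-specific stopping power estimation can be sketched in miniature. The piecewise-linear curve below mimics the shape of a conventional HU-to-stopping-power-ratio calibration, but its breakpoints and slopes are illustrative only, and the learned patient-specific correction is reduced to a single residual term; neither is a clinical calibration.

```python
def population_spr(hu):
    """Toy piecewise-linear mapping from CT number (HU) to stopping
    power ratio, mimicking a population-based calibration curve.
    Breakpoints and slopes are illustrative, not clinical values."""
    if hu < 0:
        return 1.0 + hu / 1000.0   # air/lung to soft tissue: steeper segment
    return 1.0 + hu / 2000.0       # soft tissue to bone: shallower segment

def corrected_spr(hu, learned_residual):
    """A data-driven model (e.g., trained on dual-energy CT or proton
    imaging) would replace the fixed curve with a patient-specific
    mapping; here that is reduced to a per-voxel residual term."""
    return population_spr(hu) + learned_residual
```

The sketch also shows why the review stresses validation: if the learned residual is wrong in a structured way across a region of tissue, the error propagates silently into range estimates, which is exactly the failure mode that population-based margins were designed to absorb.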
Other areas remain at an earlier stage of development. AI-driven biological optimization, outcome prediction, and digital twin concepts are scientifically compelling but currently lack sufficient validation for routine clinical decision-making. In these domains, AI serves primarily as a research tool rather than a clinical one, and its outputs should be interpreted cautiously. Progress will depend on prospective studies, standardized evaluation frameworks, and integration with clinical endpoints rather than surrogate metrics alone.
LLMs represent a qualitatively different class of AI with cross-cutting relevance to proton therapy. Existing literature demonstrates their feasibility as workflow-level intelligence, supporting plan interpretation, documentation, protocol compliance, and coordination across complex clinical processes. Their value lies in reducing cognitive and organizational burden rather than performing numerical inference. When appropriately constrained and supervised, LLMs may enhance consistency and governance across the proton therapy workflow, but they are not substitutes for physics-based modeling or safety-critical decision-making.
Taken together, the current body of evidence supports a selective and disciplined view of AI in proton therapy. AI already works as an enabler of efficiency and feasibility in imaging, planning support, and selected adaptive workflows. It falls short when asked to replace physics-based models or to autonomously guide high-stakes clinical decisions. The most impactful next steps lie in combining data-driven methods with robust uncertainty management, multi-institutional validation, and workflow-level intelligence that supports human expertise rather than displacing it. Under these conditions, AI has the potential not merely to automate existing practice, but to reshape how precision, robustness, and scalability are balanced in proton therapy.

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article.
