Multi-Patient Vision Transformer for Markerless Tumor Motion Forecasting.

Rotsart de Hertaing G; Manjah D; Macq B

doi:10.3390/biomedicines14030496

Biomedicines 2026 Vol.14(3)

Cite this paper

APA Rotsart de Hertaing G, Manjah D, Macq B (2026). Multi-Patient Vision Transformer for Markerless Tumor Motion Forecasting. Biomedicines, 14(3). https://doi.org/10.3390/biomedicines14030496

MLA Rotsart de Hertaing G, et al. "Multi-Patient Vision Transformer for Markerless Tumor Motion Forecasting." Biomedicines, vol. 14, no. 3, 2026.

PMID 41898143

DOI 10.3390/biomedicines14030496

Abstract

Accurate forecasting of lung tumor motion is crucial for precise radiotherapy. Deep-learning-based markerless tracking methods have been explored, but extending them to predict future tumor trajectories remains largely unaddressed. We frame markerless lung tumor motion forecasting as a spatio-temporal prediction task and use a vision transformer to estimate three-dimensional tumor positions over short horizons. Digitally reconstructed radiographs (DRRs) generated from four-dimensional computed tomography scans of 12 lung cancer patients were used to train a multi-patient (MP) model. Patient-specific (PS) models trained solely on planning data served as a comparison, and the MP model was further fine-tuned using a small number of patient-specific treatment images under realistic clinical constraints. Models processed sequences of 12 DRRs, and performance was evaluated via root mean square error. The results indicate that low-resolution inputs with larger patch sizes outperform higher-resolution configurations by reducing image noise. PS models require extensive data to match MP performance, whereas fine-tuning the MP model with limited patient-specific data achieves comparable or superior forecasting accuracy at a lower cost. These findings demonstrate that vision transformers can extend markerless tracking methods to accurate short-term forecasting and highlight fine-tuning as an efficient strategy for personalized prediction.
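
As a rough illustration of two quantities the abstract mentions, the sketch below computes a root mean square error over three-dimensional positions and counts the patch tokens produced by a sequence of 12 frames. It is not the authors' code. The function names, the square-frame and non-overlapping-patch assumptions, the example image and patch sizes, and the choice to average the squared Euclidean error over samples are all hypothetical, since the abstract does not specify them.

```python
import math


def rmse_3d(predicted, target):
    """Root mean square error between two lists of (x, y, z) positions.

    Assumption: the squared Euclidean error is averaged over samples
    before taking the square root. The paper may aggregate differently
    (for example, per axis).
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(predicted, target):
        total += sum((pi - ti) ** 2 for pi, ti in zip(p, t))
    return math.sqrt(total / len(predicted))


def tokens_per_sequence(image_size, patch_size, frames=12):
    """Count non-overlapping square patches across a sequence of frames.

    Assumption: square frames whose side is a multiple of the patch side.
    The default of 12 frames matches the sequence length in the abstract;
    the image and patch sizes used below are illustrative only.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_frame = (image_size // patch_size) ** 2
    return frames * per_frame


if __name__ == "__main__":
    # Hypothetical forecast vs. ground truth, in millimetres.
    forecast = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    truth = [(0.0, 0.0, 0.0), (1.0, 1.0, 3.0)]
    print(rmse_3d(forecast, truth))

    # Hypothetical configurations: a smaller input with the same patch
    # size yields far fewer tokens for the transformer to attend over.
    print(tokens_per_sequence(64, 16))
    print(tokens_per_sequence(224, 16))
```

Under these assumptions, lowering the input resolution or enlarging the patch shortens the token sequence the transformer must process. The abstract itself attributes the better performance of low-resolution inputs to reduced image noise, not to sequence length.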