Reliability of [18F]FDG PET/CT for post-treatment surveillance of non-small cell lung cancer: agreement among multiple centers.
APA
Guldbrandsen KF, Lonsdale MN, et al. (2025). Reliability of [18F]FDG PET/CT for post-treatment surveillance of non-small cell lung cancer: agreement among multiple centers. European journal of nuclear medicine and molecular imaging, 53(1), 275-282. https://doi.org/10.1007/s00259-025-07420-x
MLA
Guldbrandsen KF, et al. "Reliability of [18F]FDG PET/CT for post-treatment surveillance of non-small cell lung cancer: agreement among multiple centers." European journal of nuclear medicine and molecular imaging, vol. 53, no. 1, 2025, pp. 275-282.
PMID
40542151
Abstract
[PURPOSE] Fluorine-18 fluorodeoxyglucose positron emission tomography/computed tomography ([18F]FDG PET/CT) has shown promise for post-treatment surveillance in patients with non-small cell lung cancer (NSCLC). This study evaluated interobserver agreement of PET/CT interpretation for NSCLC surveillance in a multicenter setting.
[METHODS] Nine teams from seven centers, each team consisting of a nuclear medicine specialist and a radiologist, participated in the study. A total of 150 PET/CT scans were selected, and each was independently reviewed by two randomly assigned teams. Scans were performed six months post-treatment for scheduled recurrence assessment in stage Ia-IIIc NSCLC patients. Each scan was evaluated for suspicion of recurrence using two methods: without any pre-specified criteria (conventional assessment) and using pre-specified, qualitative criteria (Hopkins criteria). Both scoring methods were compared to a reference standard to assess accuracy.
[RESULTS] Conventional assessment showed moderate interobserver agreement (κ = 0.55, 95% CI 0.41-0.69; 79% overall agreement) for the diagnosis of recurrence. Hopkins criteria demonstrated substantial agreement (κ = 0.61, 95% CI 0.45-0.77; 87% overall agreement). Measured against the reference standard, the area under the curve (AUC) did not differ significantly between conventional assessment (0.80, 95% CI 0.72-0.88) and Hopkins criteria (0.82, 95% CI 0.74-0.90) (p = 0.21).
[CONCLUSIONS] Interobserver agreement for [18F]FDG PET/CT interpretation in NSCLC surveillance was moderate to substantial. While applying pre-specified reporting criteria did not significantly improve agreement, it did not reduce diagnostic accuracy. Efforts to reduce the variability of reporting, including continuous training and structured reporting, could improve the clinical impact of this technology.
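The abstract reports both overall agreement and κ, two measures that can diverge because κ discounts the agreement expected by chance. The sketch below shows how the two relate for a pair of readers making a binary call (recurrence suspected or not). It assumes Cohen's kappa for two raters; the abstract does not state which kappa variant the authors used. The function name `cohen_kappa` and the counts in the example are hypothetical and are not taken from the study.

```python
def cohen_kappa(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg):
    """Return (overall agreement, Cohen's kappa) for a 2x2 table of two raters.

    Arguments are counts of scans: both raters positive, rater A positive
    and B negative, rater A negative and B positive, both raters negative.
    """
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    if n == 0:
        raise ValueError("table is empty")

    # Observed agreement: fraction of scans on which the raters concur.
    observed = (both_pos + both_neg) / n

    # Marginal rates of a positive call for each rater.
    a_pos = (both_pos + a_pos_b_neg) / n
    b_pos = (both_pos + a_neg_b_pos) / n

    # Agreement expected by chance if the raters were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for 100 scans (illustration only, not study data).
observed, kappa = cohen_kappa(both_pos=20, a_pos_b_neg=10,
                              a_neg_b_pos=10, both_neg=60)
print(f"overall agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

With these made-up counts the raters agree on 80% of scans, yet κ is only about 0.52, because 58% agreement would be expected by chance given how often each rater calls a scan negative.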
MeSH Terms
Humans; Positron Emission Tomography Computed Tomography; Carcinoma, Non-Small-Cell Lung; Fluorodeoxyglucose F18; Lung Neoplasms; Male; Female; Middle Aged; Aged; Reproducibility of Results; Observer Variation; Radiopharmaceuticals; Aged, 80 and over
Highly cited papers by the same first author (2)
- Corrigendum to 'Surveillance With Fluorine-18 Fluorodeoxyglucose Positron Emission Tomography/Computed Tomography of Patients With Stage I-to-III Lung Cancer After Completion of Curative treatment (SUPE_R): A Randomized Controlled Trial' [Journal of Thoracic Oncology Volume 20 Issue 8 (2025) 1086-1097].
- Diagnostic Accuracy of [18F]FDG PET/CT versus CT for NSCLC Surveillance: Secondary Analysis of a Randomized Clinical Trial.