Artificial Intelligence-assisted Periapical Radiographic Assessment: Lesion Detection, Endodontic Complication Analysis, and Review of Clinical Treatment Recommendations

Akyüz, İpek; Kıvırcık, Beyza; Aslan, Tuğrul

doi:10.1016/j.joen.2026.04.008

Akyüz İ. E., Kıvırcık B. E., Aslan T.

Journal of Endodontics, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Publication Date: 2026
DOI: 10.1016/j.joen.2026.04.008
Journal Name: Journal of Endodontics
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Keywords: Artificial intelligence, endodontic diagnosis, postoperative complications, root canal treatment, treatment outcome
Affiliated with Erciyes University: Yes

Abstract

Introduction: Artificial intelligence (AI) systems are increasingly used in dental radiology to support endodontic diagnosis. However, their diagnostic reliability across different clinical categories remains unclear. This study compared 3 vision–language AI models (ChatGPT-5 Plus, Gemini 2.5 Pro, and Copilot Pro) with expert endodontists by assessing sensitivity, specificity, overall diagnostic agreement, and Youden's Index across multiple endodontic conditions.

Methods: This retrospective diagnostic accuracy study evaluated AI interpretation of periapical radiographs for treatment decisions, procedural complications, and lesion detection. Expert endodontists served as the reference standard. Diagnostic categories included primary treatment selection, nonsurgical retreatment, final treatment decisions, perforation, underfilling, overfilling, broken file, calcification, and periapical lesion detection.

Results: Agreement between the endodontists was almost perfect (κ = 0.95). Gemini 2.5 Pro demonstrated the highest diagnostic accuracy, particularly in periapical lesion detection (sensitivity 100%, specificity 88%), while ChatGPT-5 Plus showed similarly strong performance in treatment selection. Copilot Pro exhibited markedly low sensitivity for complications such as perforation and instrument fracture. Kappa values for preoperative and postoperative treatment decisions were high for Gemini 2.5 Pro and ChatGPT-5 Plus, but low for Copilot Pro. The Friedman test confirmed significant differences among the groups (P < .001).

Conclusions: AI systems demonstrated promising diagnostic accuracy in treatment selection and lesion detection, but performed less reliably in identifying complex procedural complications. Gemini 2.5 Pro showed the most balanced performance, whereas Copilot Pro displayed the highest variability across diagnostic categories.
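The abstract reports sensitivity, specificity, Youden's Index, and kappa without defining them. The sketch below shows the standard definitions of these measures for a single binary category (for example, periapical lesion present or absent), with the expert reading taken as the reference. It is an illustration only, not the authors' analysis code: the `ConfusionCounts` type, the function names, and the example counts are hypothetical, chosen so that sensitivity and specificity match the 100% and 88% reported for lesion detection. The abstract does not give the underlying case counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts for one diagnostic category, AI versus expert reference."""
    tp: int  # AI positive, expert positive
    fp: int  # AI positive, expert negative
    fn: int  # AI negative, expert positive
    tn: int  # AI negative, expert negative


def sensitivity(c: ConfusionCounts) -> float:
    # Share of expert-positive cases the AI also called positive.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Share of expert-negative cases the AI also called negative.
    return c.tn / (c.tn + c.fp)


def youden_index(c: ConfusionCounts) -> float:
    # Youden's J = sensitivity + specificity - 1.
    return sensitivity(c) + specificity(c) - 1.0


def cohen_kappa(c: ConfusionCounts) -> float:
    # Observed agreement corrected for agreement expected by chance,
    # where chance agreement comes from the marginal totals.
    n = c.tp + c.fp + c.fn + c.tn
    observed = (c.tp + c.tn) / n
    chance_pos = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    chance_neg = ((c.tn + c.fn) / n) * ((c.tn + c.fp) / n)
    expected = chance_pos + chance_neg
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts (NOT from the study): 25 expert-positive and
# 25 expert-negative radiographs.
example = ConfusionCounts(tp=25, fp=3, fn=0, tn=22)

print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
print(f"Youden's J  = {youden_index(example):.2f}")
print(f"kappa       = {cohen_kappa(example):.2f}")
```

With sensitivity 1.00 and specificity 0.88, Youden's Index is 1.00 + 0.88 - 1 = 0.88. The functions assume both positive and negative reference cases are present; otherwise a denominator is zero.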