Purpose To evaluate the association between lumbar lordosis and age using an AI-based automated measurement model applied to a large dataset of standing lateral spinal radiographs.
Materials and Methods This retrospective study analyzed 904 high-quality radiographs selected from 2,397 images acquired between 2019 and 2021. Lumbar lordosis was defined as the angle between the superior endplates of L1 and S1 and automatically measured using a validated deep learning model. Subjects were categorized into nine age groups. One-way ANOVA compared lumbar lordosis across age groups, and Pearson correlation assessed the relationship between age and lumbar lordosis.
Results Lumbar lordosis ranged from 0° to 84° (mean 45.9°±13.4°). The highest mean value was in the 10–19-year group (52.1°), and the lowest in the ≥80-year group (39.6°). Minimum values decreased to 0° in individuals aged ≥60 years. No significant differences were found across age groups (p=0.561). A weak but significant negative correlation was observed between age and lumbar lordosis (r=–0.247, p<0.0001).
Conclusions AI-based automated measurement enabled efficient large-scale analysis and revealed a wide distribution of lumbar lordosis with a gradual age-related decline. These findings highlight the value of AI in spinal alignment assessment.
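The correlation result can look contradictory: r=–0.247 is weak, yet p<0.0001. A minimal sketch of the standard t statistic for a Pearson coefficient shows how a large sample makes a weak correlation significant. It assumes the correlation was computed over all 904 radiographs, which the abstract does not state explicitly. The function name is illustrative and not taken from the study.

```python
import math


def pearson_t_statistic(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, given Pearson r and sample size n."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


r = -0.247   # reported correlation between age and lumbar lordosis
n = 904      # assumption: all analyzed radiographs entered the correlation

r_squared = r * r                 # share of variance in LL associated with age
t_stat = pearson_t_statistic(r, n)

print(f"r^2 = {r_squared:.3f}")
print(f"t   = {t_stat:.2f} with {n - 2} degrees of freedom")
```

Under these assumptions age is associated with only about 6% of the variance in lumbar lordosis, while the t statistic is large in magnitude because it scales with the square root of the sample size.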
Purpose To develop and validate a deep learning–based artificial intelligence (AI) model for automated measurement of lumbar lordosis (LL) angles from whole spine lateral radiographs.
Materials and Methods A total of 888 lateral spine X-rays (2019–2021) were retrospectively collected and annotated with four anatomical keypoints (L1 and S1 vertebral landmarks). An AI model using Detectron2 with a Keypoint R-CNN and ResNeXt-101 backbone was trained with data augmentation. Performance was evaluated on 50 test images, comparing AI results to manual annotations by two orthopedic surgeons using intraclass correlation coefficient (ICC), Pearson’s correlation, and Bland–Altman analysis.
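Once four keypoints are available, the LL angle follows from plane geometry. The sketch below is an illustration, not the authors' implementation. It assumes the four keypoints are the anterior and posterior corners of the L1 and S1 superior endplates, given as (x, y) pixel coordinates in a consistent anterior-to-posterior order. The abstract does not specify the landmarks at this level of detail, and the function name and coordinates are hypothetical.

```python
import math


def endplate_angle_deg(l1_anterior, l1_posterior, s1_anterior, s1_posterior):
    """Unsigned angle in degrees between the L1 and S1 superior endplate lines.

    Each argument is an (x, y) pixel coordinate. Both endplates must be given
    in the same anterior-to-posterior order (an assumption of this sketch).
    """
    v1 = (l1_posterior[0] - l1_anterior[0], l1_posterior[1] - l1_anterior[1])
    v2 = (s1_posterior[0] - s1_anterior[0], s1_posterior[1] - s1_anterior[1])
    if v1 == (0, 0) or v2 == (0, 0):
        raise ValueError("endplate keypoints must not coincide")
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # atan2 of cross and dot is numerically stable for near-parallel lines.
    return abs(math.degrees(math.atan2(cross, dot)))


# Hypothetical keypoints: a horizontal L1 endplate and an S1 endplate tilted 45 degrees.
angle = endplate_angle_deg((0, 0), (10, 0), (0, 0), (10, 10))
print(f"LL angle: {angle:.1f} degrees")
```

Using `atan2` on the cross and dot products avoids the precision loss that `acos` of a normalized dot product suffers when the two endplates are nearly parallel.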
Results The model achieved an average precision of 71.63 for bounding boxes and 86.61 for keypoints. ICCs between AI and human raters ranged from 0.918 to 0.962. Pearson correlation coefficients were r=0.849 and r=0.903. Bland–Altman analysis showed minor underestimation biases (–3.42° and –4.28°) with acceptable agreement.
Conclusions The AI model showed excellent agreement with expert measurements and high reliability in LL angle assessment. Despite a slight underestimation, it offers a scalable, consistent tool for clinical use. Further studies should evaluate generalizability and interpretability in broader settings.
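The underestimation biases reported above come from Bland–Altman analysis, which summarizes the paired differences between two measurement methods. A minimal sketch of that calculation is given below. The angle values are hypothetical and were chosen only to produce a negative bias similar in size to the reported ones. The use of 1.96 standard deviations for the limits of agreement is the common convention, assumed here rather than taken from the abstract.

```python
import statistics


def bland_altman(ai_angles, manual_angles):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of (AI - manual); the limits of agreement are
    bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(ai_angles) != len(manual_angles) or len(ai_angles) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [a - m for a, m in zip(ai_angles, manual_angles)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical LL angles in degrees, not data from the study.
ai = [40.0, 50.0, 45.0, 38.0]
manual = [44.0, 53.0, 48.0, 43.0]

bias, lower, upper = bland_altman(ai, manual)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A negative bias means the AI reads lower than the human rater on average, which is the direction of the underestimation described in the Results.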
Citations to this article as recorded by Crossref
Deep Learning–based AI Analysis of the Correlation Between Lumbar Lordosis and Age. Soo-Bin Lee, Ja-Yeong Yoon, Dong-Sik Chae, Sang-Bum Kim, Young-Seo Park, Kyung-Yil Kang, Min-Kyu Lee. Journal of Advanced Spine Surgery. 2025; 15(2): 78.
Efficacy of Biportal Endoscopic Decompression for Lumbar Spinal Stenosis: A Meta-Analysis With Single-Arm Analysis and Comparative Analysis With Microscopic Decompression and Uniportal Endoscopic Decompression. Shuangwen Lv, Haiwen Lv, Yupeng He, Xiansheng Xia. Operative Neurosurgery. 2024; 27(2): 158.
Objective To investigate the utility of a deep learning model in diagnosing traumatic lumbar fractures on computed tomography (CT) images.
Summary of Background Data CT scans are widely used as the first choice for detecting spinal fractures in patients with severe trauma. Although CT scans have high diagnostic accuracy, fractures can occasionally be missed. Recently, deep learning has been applied in various fields of medical imaging.
Methods CT images from 480 patients (3695 vertebrae) who visited a level-one trauma center with lumbar fractures were retrospectively analyzed. The diagnostic results were confirmed by two experienced musculoskeletal radiologists and one experienced spine surgeon using magnetic resonance imaging (MRI). Deep learning networks were employed for diagnosis, with 425 cases used for training and 55 cases for testing. Sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUROC) were calculated to evaluate diagnostic performance.
Results The model successfully identified 107 out of 129 vertebrae with fractures, achieving a sensitivity of 82.95%, a specificity of 93.24%, an AUROC of 0.936, and an overall accuracy of 88.45%.
Conclusions This study demonstrated that the deep learning model showed high accuracy in diagnosing traumatic lumbar fractures. This approach has the potential to assist spine specialists, radiologists, and trauma care experts. Further validation is needed to determine its effectiveness in clinical settings.
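The reported sensitivity, specificity, and accuracy all derive from a 2×2 confusion matrix over test vertebrae. The sketch below recomputes them. Only the fracture counts (107 detected out of 129) are stated in the abstract. The non-fracture counts (138 correct out of 148) are an assumption, back-calculated here as one combination consistent with the reported percentages; they are not reported by the authors.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    if min(tp, fn, tn, fp) < 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("counts must be non-negative with both classes present")
    return {
        "sensitivity": tp / (tp + fn),            # fractures correctly detected
        "specificity": tn / (tn + fp),            # intact vertebrae correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Reported in the abstract: 107 of 129 fractured vertebrae identified.
tp, fn = 107, 129 - 107
# Assumption: 138 of 148 non-fractured vertebrae classified correctly
# (back-calculated from the reported percentages, not stated in the abstract).
tn, fp = 138, 148 - 138

metrics = diagnostic_metrics(tp, fn, tn, fp)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

AUROC cannot be recovered this way, since it depends on the model's continuous output scores rather than on a single decision threshold.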