Upper gastrointestinal cancers (including oesophageal cancer and gastric cancer) are
among the most common cancers worldwide. Artificial intelligence platforms using deep learning
algorithms have made remarkable progress in medical imaging, but their application
in upper gastrointestinal cancers has been limited. We aimed to develop and validate
the Gastrointestinal Artificial Intelligence Diagnostic System (GRAIDS) for the diagnosis
of upper gastrointestinal cancers through analysis of imaging data from clinical endoscopies.
This multicentre, case-control, diagnostic study was done in six hospitals of different
tiers (ie, municipal, provincial, and national) in China. The images of consecutive
participants, aged 18 years or older, who had not had a previous endoscopy were retrieved
from all participating hospitals. All patients with upper gastrointestinal cancer
lesions (including oesophageal cancer and gastric cancer) that were histologically
proven malignancies were eligible for this study. Only images with standard white
light were deemed eligible. The images from Sun Yat-sen University Cancer Center were
randomly assigned (8:1:1) to the training and intrinsic verification datasets for
developing GRAIDS, and the internal validation dataset for evaluating the performance
of GRAIDS. Its diagnostic performance was evaluated using an internal and prospective
validation set from Sun Yat-sen University Cancer Center (a national hospital) and
additional external validation sets from five primary care hospitals. The performance
of GRAIDS was also compared with that of endoscopists with three degrees of expertise: expert,
competent, and trainee. The diagnostic accuracy, sensitivity, specificity, positive
predictive value, and negative predictive value of GRAIDS and endoscopists for the
identification of cancerous lesions were evaluated, with 95% CIs calculated using
the Clopper-Pearson method.
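
As an illustration of the statistical step described above, the following is a minimal Python sketch of how the diagnostic metrics and exact Clopper-Pearson 95% CIs could be computed from a 2×2 confusion table. It is not the study's analysis code: the function names (`binom_cdf`, `clopper_pearson`, `diagnostic_metrics`) and the counts in the usage lines are hypothetical, and the interval is obtained here by bisection on the binomial tail probabilities, which is only one of several equivalent ways to compute it.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def _bisect(goes_right, iterations=60):
    """Bisection on (0, 1); goes_right(mid) says whether the root lies above mid."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if goes_right(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) CI for a proportion with k successes in n trials."""
    half = alpha / 2
    # Lower bound: the p at which P(X >= k) equals alpha / 2 (0 when k == 0).
    lower = 0.0 if k == 0 else _bisect(lambda p: 1.0 - binom_cdf(k - 1, n, p) < half)
    # Upper bound: the p at which P(X <= k) equals alpha / 2 (1 when k == n).
    upper = 1.0 if k == n else _bisect(lambda p: binom_cdf(k, n, p) > half)
    return lower, upper


def diagnostic_metrics(tp, fp, tn, fn):
    """Return {metric: (estimate, (ci_low, ci_high))} from confusion-table counts."""
    pairs = {
        "accuracy": (tp + tn, tp + fp + tn + fn),
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, clopper_pearson(k, n)) for name, (k, n) in pairs.items()}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not taken from the study).
    for name, (est, (low, high)) in diagnostic_metrics(90, 10, 80, 20).items():
        print(f"{name}: {est:.3f} (95% CI {low:.3f}-{high:.3f})")
```

Each metric is treated as a simple binomial proportion, so the same interval routine serves all five; the exact interval is conservative, which is a common reason for preferring it when proportions lie close to 0 or 1.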
1 036 496 endoscopy images from 84 424 individuals were used to develop and test GRAIDS.
The diagnostic accuracy in identifying upper gastrointestinal cancers was 0·955 (95%
CI 0·952–0·957) in the internal validation set, 0·927 (0·925–0·929) in the prospective
set, and ranged from 0·915 (0·913–0·917) to 0·977 (0·977–0·978) in the five external
validation sets. GRAIDS achieved diagnostic sensitivity similar to that of the expert
endoscopist (0·942 [95% CI 0·924–0·957] vs 0·945 [0·927–0·959]; p=0·692) and superior
sensitivity compared with competent (0·858 [0·832–0·880], p<0·0001) and trainee (0·722
[0·691–0·752], p<0·0001) endoscopists. The positive predictive value was 0·814 (95% CI
0·788–0·838) for GRAIDS, 0·932 (0·913–0·948) for the expert endoscopist, 0·974
(0·960–0·984) for the competent endoscopist, and 0·824 (0·795–0·850) for the trainee
endoscopist. The negative predictive value was 0·978 (95% CI 0·971–0·984) for GRAIDS,
0·980 (0·974–0·985) for the expert endoscopist, 0·951 (0·942–0·959) for the competent
endoscopist, and 0·904 (0·893–0·916) for the trainee endoscopist.
GRAIDS achieved high diagnostic accuracy in detecting upper gastrointestinal cancers,
with sensitivity similar to that of expert endoscopists and superior to that of
non-expert endoscopists. This system could assist community-based hospitals in improving
their effectiveness in upper gastrointestinal cancer diagnoses.
This study was funded by the National Key R&D Program of China, the Natural Science Foundation of Guangdong
Province, the Science and Technology Program of Guangdong, the Science and Technology
Program of Guangzhou, and the Fundamental Research Funds for the Central Universities.