IADR Abstract Archives

Predicting Oral Cancer Risk Using Machine Learning

Objectives: Oral cancer screening is most effective when targeted at high-risk individuals. This study aims to develop a machine learning-based platform to predict the risk of oral cancer and oral potentially malignant disorders (OPMDs).
Methods: Visual oral examination (VOE) was performed prospectively by three calibrated dentists on 1467 participants of a community-based screening program. Each individual’s status was defined as positive or negative for oral cancer/OPMDs, and positive cases underwent histologic confirmation of epithelial dysplasia (ED) or squamous cell carcinoma (SCC). The follow-up status of those who screened negative was monitored via state-linked electronic health records. Information on demographic, habitual, lifestyle, and familial risk factors was obtained, and expired carbon monoxide levels (in ppm) were assessed using a monitor. Input features (n=40) and histologic diagnoses were used to train 12 machine learning algorithms, with a random 80:20 train-test split applied to the data during development. Recursive feature elimination with 10-fold cross-validation was used for feature selection, while the synthetic minority oversampling technique with edited nearest neighbors was implemented to correct class imbalance. Internal validation was conducted on the held-out 20% of the data, and model outputs were compared using McNemar’s test to select the optimal model. Performance metrics included recall, specificity, and F1-score.
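The model comparison step named in the Methods can be illustrated with a minimal sketch of McNemar’s test on paired predictions. The abstract does not state which variant of the test was used, so the exact (binomial) two-sided form below is an assumption, and the function names `discordant_counts` and `mcnemar_exact` are hypothetical helpers written for this illustration, not part of the study’s code.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count test cases where exactly one of the two models is correct."""
    a_only = b_only = 0
    for truth, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == truth), (b == truth)
        if a_ok and not b_ok:
            a_only += 1
        elif b_ok and not a_ok:
            b_only += 1
    return a_only, b_only


def mcnemar_exact(a_only, b_only):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis the discordant pairs split 50:50, so the
    smaller count follows a Binomial(n, 0.5) distribution.
    """
    n = a_only + b_only
    if n == 0:
        return 1.0  # the models never disagree in correctness
    k = min(a_only, b_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy labels and predictions, for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
model_a = [1, 0, 0, 1, 0, 0, 0, 0]
model_b = [1, 1, 1, 0, 0, 1, 0, 0]
a_only, b_only = discordant_counts(y_true, model_a, model_b)
p_value = mcnemar_exact(a_only, b_only)
```

Only the cases on which the two models disagree in correctness enter the test, which is why it suits a comparison of two classifiers evaluated on the same held-out participants.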
Results: Suspicious lesions and confirmed ED/SCC were identified in 4.50% (n=66) and 1.64% (n=24) of participants, respectively. AdaBoost (F1: 0.98±0.02, accuracy: 0.99±0.03) and k-nearest-neighbors (kNN) (F1: 0.99±0.01, accuracy: 0.99±0.01) classifiers outperformed other algorithms. Upon internal validation, the AdaBoost model (accuracy: 0.94, recall: 0.75, specificity: 0.95) was significantly better than the kNN model (accuracy: 0.85, recall: 0.75, specificity: 0.85) (p<0.001) and comparable to the status classification provided by the trained examiners on-site for oral cancer and OPMDs following VOE (specificity and accuracy: 0.91) (p=0.839). Models were deployed as web-based tools available at https://oral-cancer-risk-predictor-hku.herokuapp.com.
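For readers less familiar with the reported metrics, the following minimal sketch shows how recall, specificity, accuracy, and F1-score are derived from a binary confusion matrix. The counts used are hypothetical and are not the study’s data; `screening_metrics` is an illustrative helper name, and the sketch assumes every denominator is non-zero.

```python
def screening_metrics(tp, fn, fp, tn):
    """Derive screening metrics from confusion-matrix counts.

    tp/fn: diseased cases flagged / missed by the model
    fp/tn: healthy cases flagged / correctly cleared by the model
    """
    return {
        "recall": tp / (tp + fn),            # sensitivity: diseased cases detected
        "specificity": tn / (tn + fp),       # healthy cases correctly cleared
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),   # harmonic mean of precision and recall
    }


# Hypothetical counts for illustration only (not taken from the abstract).
metrics = screening_metrics(tp=8, fn=2, fp=5, tn=85)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With few positive cases, as in a screening population, accuracy and specificity are dominated by the negative class, so recall and F1-score give a clearer view of how well the positive class is detected.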
Conclusions: Machine learning is successful in predicting oral cancer risk and may be applied to identify ‘at-risk populations’ in opportunistic and organized screening.
Meeting: 2022 IADR/APR General Session (Virtual)
Year: 2022
Final Presentation ID: 1635
Abstract Category: e-Oral Health Network
Authors
  • Adeoye, John  ( University of Hong Kong , Hong Kong , Hong Kong )
  • Alkandari, Abdulrahman  ( University of Hong Kong , Hong Kong , Hong Kong )
  • Zhu, Wang-yong  ( University of Hong Kong , Hong Kong , Hong Kong )
  • Zheng, Li-wu  ( University of Hong Kong , Hong Kong , Hong Kong )
  • Thomson, Peter  ( James Cook University , Cairns , Queensland , Australia )
  • Choi, Siu-wai  ( University of Hong Kong , Hong Kong , Hong Kong )
  • Su, Yuxiong  ( University of Hong Kong , Hong Kong , Hong Kong )
  • Financial Interest Disclosure: NONE
    SESSION INFORMATION
    Interactive Talk Session
    e-Oral Health Network I
    Saturday, 06/25/2022 , 02:00PM - 03:30PM