Objectives: Lung cancer is the leading cause of cancer-related mortality worldwide, with poor prognosis largely due to late-stage diagnosis. Current screening methods such as low-dose CT face accessibility and cost barriers in resource-limited settings. This study develops a lightweight multichannel convolutional neural network for lung cancer screening support through longitudinal risk stratification using routine pre-diagnostic healthcare data.
Methods: We conducted a retrospective cohort study using Taiwan’s National Health Insurance Research Database, comprising 99 615 individuals (575 lung cancer cases; 99 040 non-cancer controls). Diagnostic codes, medication records and medical orders within a 36-month observation window were extracted. Log-likelihood ratio feature selection was implemented to reduce dimensionality, achieving 99.8% reduction in computational requirements while retaining clinical relevance. A multichannel Convolutional Neural Network (CNN) architecture was designed to process these heterogeneous data modalities simultaneously.
Results: The proposed method achieved an F₁-score of 0.5738, precision of 0.7149, Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.8316 and Area Under the Precision-Recall Curve (AUPRC) of 0.1617, outperforming baseline methods in precision and F₁-score. Ablation studies confirmed that medical orders provide primary predictive value, while medication features contribute limited discriminative signal in the pre-diagnostic phase. SHapley Additive exPlanations analysis revealed that routine healthcare utilisation patterns, rather than cancer-specific features, drive risk stratification.
Discussion: The lightweight architecture enables deployment in resource-constrained clinical environments while maintaining robust performance, offering potential as a preliminary screening tool to identify high-risk individuals for further diagnostic examination.
Conclusion: Efficient deep learning models using routine clinical data can facilitate lung cancer risk stratification and screening, providing a scalable solution for clinical implementation.
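The abstract does not spell out how the log-likelihood ratio feature selection was computed. As a rough illustration only, the sketch below assumes a Dunning-style G² statistic on a 2x2 table (code present or absent, in cases or controls) and keeps the highest-scoring codes. The function names, the `top_k` cut-off and the example counts are hypothetical and are not taken from the paper; only the cohort sizes (575 cases, 99 040 controls) come from the abstract.

```python
import math


def g2(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: cases with the code      k12: cases without it
    k21: controls with the code   k22: controls without it
    Assumption: this Dunning-style form may differ from the paper's exact formula.
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    score = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                score += o * math.log(o / expected)
    return 2.0 * score


def select_features(counts, n_cases, n_controls, top_k):
    """Rank codes by G^2 and return the top_k code names (hypothetical helper)."""
    scored = []
    for code, (case_hits, control_hits) in counts.items():
        s = g2(case_hits, n_cases - case_hits,
               control_hits, n_controls - control_hits)
        scored.append((s, code))
    scored.sort(reverse=True)
    return [code for _, code in scored[:top_k]]


# Made-up counts: (number of cases with the code, number of controls with the code).
example_counts = {
    "code_A": (200, 5000),  # much more frequent among cases
    "code_B": (30, 5000),   # roughly equal rates in both groups
    "code_C": (3, 500),     # rare and roughly equal rates
}
selected = select_features(example_counts, n_cases=575, n_controls=99040, top_k=1)
```

A code whose frequency is the same in cases and controls scores near zero and is dropped, which is how a filter of this kind can shrink a large code vocabulary before the network is trained.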
Longitudinal multisource clinical model for early lung cancer risk stratification and screening
Chia-Hui Chien,Shih-Chuan Chang,Yung-Chun Chang,Y. Li
Published 2026 in BMJ Health & Care Informatics
PUBLICATION RECORD
- Publication year: 2026
- Venue: BMJ Health & Care Informatics
- Publication date: 2026-02-01
- Fields of study: Medicine
- Source metadata: Semantic Scholar