OA06 - Refining Lung Cancer Screening (ID 131)
- Event: WCLC 2019
- Type: Oral Session
- Track: Screening and Early Detection
- Presentations: 1
- Now Available
- Moderators: Tomasz Grodzki, Lluis Esteban Tejero
- Coordinates: 9/09/2019, 11:00 - 12:30, Hilton Head (1978)
OA06.05 - Evaluation of a Deep Learning-Based Automatic Classifier for the Classification of Perifissural Nodules (Now Available) (ID 1928)
11:00 - 12:30 | Presenting Author(s): Daiwei Han
Perifissural nodules (PFNs) comprise approximately 20% of screening-detected nodules and are almost certainly benign. Automatic PFN classification could therefore reduce the number of follow-up procedures required for nodule work-up. Prior work has shown some success in AI classification with limited datasets. Here we evaluate the performance of a new deep convolutional neural network (CNN) for PFN classification, trained on a dataset of nodules retrospectively collected from multiple European centers, including validation on an independent reader-study dataset.
Method
Data (1103 patients, 1557 unique nodules, and 3320 nodule images) were collected from three centers in the UK and the Netherlands. Each nodule was categorized into subtypes, including “PFN”, by on-site radiologists. Labels were reviewed centrally, overseen by a single clinician to ensure consistency between sites.
A CNN classifier was trained, using five-fold cross-validation, to produce a score that classifies nodules as (typical) PFN or not. The PFN classifier was developed by “transfer learning” from an existing benign-vs-malignant AI model trained on data from the US National Lung Screening Trial.
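The five-fold cross-validation scheme can be illustrated with a minimal sketch. The abstract does not say how folds were formed; the version below assumes that all images from one patient are kept in the same fold, a common precaution against leakage when a patient contributes several nodule images. The function names and the `patient_ids` input are hypothetical and are not taken from the study.

```python
import random


def patient_grouped_folds(patient_ids, n_folds=5, seed=0):
    """Return one fold index per sample, keeping each patient in a single fold.

    patient_ids: list with one (hypothetical) patient identifier per nodule image.
    Grouping by patient is an assumption of this sketch, not a stated study detail.
    """
    unique_patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_patients)
    # Deal the shuffled patients round-robin into the folds.
    fold_of_patient = {p: i % n_folds for i, p in enumerate(unique_patients)}
    return [fold_of_patient[p] for p in patient_ids]


def cross_validation_splits(patient_ids, n_folds=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the n_folds folds."""
    folds = patient_grouped_folds(patient_ids, n_folds, seed)
    for k in range(n_folds):
        train = [i for i, f in enumerate(folds) if f != k]
        val = [i for i, f in enumerate(folds) if f == k]
        yield train, val
```

Each sample is used for validation exactly once, and the cross-validated AUC is then the mean of the per-fold AUCs.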
To compare the CNN with human performance, independent validation was performed on a separate dataset of 158 patients with benign nodules (196 nodules/nodule images) from two of the sites. Three readers (two radiologists and a radiology resident) were asked to label each nodule as typical PFN, atypical PFN, or non-PFN. To match the AI training procedure, only the typical-PFN labels were treated as positive in the reader study; these were compared against nodules classified as atypical PFN or non-PFN.
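The label handling in the reader study can be sketched as follows: each three-way reader label is collapsed to a binary "typical PFN or not" call, and a consensus is formed when at least two of the three readers agree, as the evaluation uses. The label strings and function names here are illustrative assumptions, not the study's own encoding.

```python
TYPICAL = "typical PFN"  # assumed label string; the other two are "atypical PFN" and "non-PFN"


def is_typical_pfn(label):
    """Collapse a three-way reader label to a binary typical-PFN call."""
    return label == TYPICAL


def consensus_typical_pfn(reader_labels, min_agree=2):
    """Return True if at least `min_agree` readers called the nodule a typical PFN."""
    votes = sum(is_typical_pfn(label) for label in reader_labels)
    return votes >= min_agree
```

With three readers and a binary call, a two-reader majority always exists, so every nodule receives a consensus label.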
Model performance was evaluated by area under the ROC curve (AUC). For the independent validation, Cohen’s kappa was used to measure both the model’s agreement with reader consensus (at least 2 in agreement) and inter-reader agreement. For Cohen’s kappa calculations, the CNN score was binarized using a threshold determined from the internal validation data.
Result
The mean cross-validated AUC on the internal dataset was 92% (95% CI 90.6–92.9). For the independent dataset, the classifier labelled 61/196 (31%) nodules as typical PFNs, and reader consensus gave 45/196 (23%). Against reader consensus, the AUC of the CNN on the reader-study dataset was 96% (95% CI 93.3–98.4). Both the classifier–reader agreement (κ = 0.74, 90% agreement) and the inter-reader agreement (κ = 0.64–0.79, 88%–92% agreement) were substantial.
Conclusion
The performance of the PFN classifier is similar to that of radiologists and is within the inter-reader variability of radiologists. This demonstrates the potential utility of CNN-based systems for automatic PFN classification.
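For readers less familiar with the statistics used above, the following is a minimal pure-Python sketch of the two metrics: Cohen's kappa for two binary raters, and AUC computed as the probability that a positive case scores higher than a negative one (ties count one half). The function names and the threshold argument are illustrative; the threshold actually used in the study is not reported in the abstract.

```python
def binarize(scores, threshold):
    """Turn continuous classifier scores into binary calls at a given threshold."""
    return [s >= threshold for s in scores]


def cohens_kappa(a, b):
    """Cohen's kappa for two equally long sequences of binary (0/1 or bool) ratings."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    # Agreement expected by chance from each rater's marginal rates.
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single class")
    return (observed - expected) / (1 - expected)


def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is reported alongside the agreement percentages rather than in place of them.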