Deep learning and convolutional neural networks (CNNs) in particular are increasingly popular tools for segmentation and classification of medical images. CNNs were shown to be successful for segmentation of brain tumors into multiple regions or labels. However, in the environment which fosters data-sharing and collection of multi-institutional datasets, a question arises: does training with data from another institution with potentially different imaging equipment, contrast protocol, and patient population impact the segmentation performance of the CNN? Our study presents preliminary data towards answering this question. Specifically, we used MRI data of glioblastoma (GBM) patients for two institutions present in The Cancer Imaging Archive. We performed a process of training and testing CNN multiple times such that half of the time the CNN was tested on data from the same institution that was used for training and half of the time it was tested on another institution, keeping the training and testing set size constant. We observed a decrease in performance as measured by Dice coefficient when the CNN was trained with data from a different institution as compared to training with data from the same institution. The changes in performance for the entire tumor and for four different labels within the tumor were: 0.72 to 0.65 (p=0.06), 0.61 to 0.58 (p=0.49), 0.54 to 0.51 (p=0.82), 0.31 to 0.24 (p<0.03), and 0.43 to 0.31(p<0.003) respectively. In summary, we found that while data across institutions can be used for development of CNNs, this might be associated with a decrease in performance.
|