Published! (November 6, 2021)
Success! After one round of reviews, which required adding a few more experiments, my first first-author paper is officially accepted and published online in The Journal of Pathology (impact factor = 6.253)! As mentioned, the title is: "The utility of color normalization for AI-based diagnosis of hematoxylin and eosin-stained pathology images". I have copied and pasted the abstract below, and you can read the full paper with this link.
Abstract
The color variation of hematoxylin and eosin (H&E)-stained tissues has presented a challenge for applications of artificial intelligence (AI) in digital pathology. Many color normalization algorithms have been developed in recent years in order to reduce the color variation between H&E images. However, previous efforts in benchmarking these algorithms have produced conflicting results and none have sufficiently assessed the efficacy of the various color normalization methods for improving diagnostic performance of AI systems. In this study, we systematically investigated eight color normalization algorithms for AI-based classification of H&E-stained histopathology slides, in the context of using images both from one center and from multiple centers. Our results show that color normalization does not consistently improve classification performance when both training and testing data are from a single center. However, using four multi-center datasets of two cancer types (ovarian and pleural) and objective functions, we show that color normalization can significantly improve the classification accuracy of images from external datasets (ovarian cancer: 0.25 AUC increase, p = 1.6 e-05; pleural cancer: 0.21 AUC increase, p = 1.4 e-10). Furthermore, we introduce a novel augmentation strategy by mixing color-normalized images using three easily accessible algorithms that consistently improves the diagnosis of test images from external centers, even when the individual normalization methods had varied results. We anticipate our study to be a starting point for reliable use of color normalization to improve AI-based, digital pathology-empowered diagnosis of cancers sourced from multiple centers.
My First, First-Author Paper (July 7, 2021)
This project highlighted my versatility in quickly learning multiple new algorithms and implementing them from various sources. Because the research compared eight different color normalization pre-processing algorithms, the first step was to get all the existing implementations running; some were written in MATLAB, some in Python, and one used a generative adversarial network (GAN). Where possible, I created Singularity containers (similar to Docker) to make each method easier to reuse in future projects.
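For readers who have not met color normalization before, here is a minimal sketch of the general idea, not one of the eight algorithms from the paper. It assumes a simplified approach: shift and scale each RGB channel of a source image so its mean and standard deviation match those of a reference image. Published methods typically work in other color spaces or on stain concentrations. The function name `match_channel_stats` and the random test arrays are my own placeholders.

```python
import numpy as np


def match_channel_stats(source, target):
    """Rescale each channel of `source` so its mean/std match `target`.

    Both inputs are H x W x 3 uint8 arrays. This is a simplified,
    illustrative normalization done directly in RGB space.
    """
    src = source.astype(np.float64)
    tgt = target.astype(np.float64)

    # Per-channel statistics, computed over all pixels.
    src_mean, src_std = src.mean(axis=(0, 1)), src.std(axis=(0, 1))
    tgt_mean, tgt_std = tgt.mean(axis=(0, 1)), tgt.std(axis=(0, 1))

    # Avoid dividing by zero when a source channel is perfectly flat.
    safe_std = np.where(src_std > 0, src_std, 1.0)

    out = (src - src_mean) / safe_std * tgt_std + tgt_mean
    return np.clip(out, 0, 255).round().astype(np.uint8)


# Stand-in "images": random arrays with different color statistics.
rng = np.random.default_rng(0)
source = rng.integers(100, 200, size=(32, 32, 3), dtype=np.uint8)
target = rng.integers(50, 250, size=(32, 32, 3), dtype=np.uint8)

normalized = match_channel_stats(source, target)
print("target mean per channel:    ", target.mean(axis=(0, 1)))
print("normalized mean per channel:", normalized.mean(axis=(0, 1)))
```

After normalization, the per-channel means of the source image should sit close to those of the reference, which is the kind of shift that is meant to make slides from different centers look alike to a model.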
Through this experience, I also honed many skills that are key to being an effective programmer and data scientist. While collaborating with my teammates using Git, I contributed to writing and troubleshooting the workflow pipeline (mainly in Python and Bash), including the deep learning models (using the PyTorch framework). Also, because of the COVID-19 pandemic, I was working from home the whole time and everything had to be run on remote servers. As a result, I got very comfortable navigating Linux systems and using Vim as my text editor of choice. Although it was initially difficult to learn so many new things at once by myself at home, I eventually mastered all of these skills and was even put in charge of training newcomers to the lab (maybe because I understood their struggles, haha). Overall, I am very thankful for everything so far, especially for the ongoing support of Dr. Ali Bashashati (my supervisor), Dr. Hossein Farahani (the lead machine learning scientist), and the rest of the AIM Lab team!