Table of Contents
Definition / general | Essential features | Terminology | Diagrams / tables | Description of convolutional neural networks | Image analysis | Applications in pathology | Tools for performing image analysis using convolutional neural networks | Additional references | Board review style question #1 | Board review style answer #1 | Board review style question #2 | Board review style answer #2
Cite this page: Cheng J. Convolutional neural networks. PathologyOutlines.com website. https://www.pathologyoutlines.com/topic/informaticsconvnet.html. Accessed November 27th, 2024.
Definition / general
- Also referred to as ConvNet
- Convolutional neural network (CNN) is a machine learning method that is inspired by the way our visual cortex processes images through receptive fields, whereby individual retinal neurons receive stimuli from different regions of the visual field; information from multiple retinal neurons is subsequently passed on to neurons further down the chain (The Data Science Blog: A Quick Introduction to Neural Networks [Accessed 2 March 2022])
- Likewise, many CNN architectures have a feed forward neural network architecture composed of convolution and pooling (downsampling) layers, followed by 1 or more fully connected layers (J Cancer 2019;10:4876)
- Mainly used for image classification, object detection with classification, semantic segmentation and natural language processing
- Generative adversarial network (GAN): pits 2 neural networks (generator and discriminator) against each other to create realistic looking fake images; many GANs have CNNs as part of their architecture but other types of neural networks may also be used in GANs
Essential features
- In machine learning, a convolutional neural network is a class of deep, feed forward artificial neural networks, most commonly applied in pathology to image classification and semantic segmentation (Wikipedia: Convolutional Neural Network [Accessed 2 March 2022])
- GANs have been used for stain normalization, virtual staining, ink dot removal, image data augmentation and synthesis of pathology images
- Neural networks, like other supervised machine learning methods, are trained using a dataset with an expected outcome and other parameters that contribute to the prediction of the outcome
- For instance, a dataset for predicting the presence of hemolysis would have entries for the patient's sex, hemoglobin levels, serum lactate dehydrogenase, serum haptoglobin, indirect bilirubin and the presence or absence of hemolysis as the target feature
- Variables that contribute to the prediction include laboratory values and parameter values called weights
- Weight values are adjusted iteratively during training: the error of the neural network's predictions is measured through a formula (loss function), backpropagation determines how much each weight contributed to that error and the weights are updated step by step until they reach the values that give the best prediction accuracy
- In convolutional neural networks involving images, weights are often in the form of 3 dimensional matrices; the target feature is the class an image belongs to (e.g., benign versus malignant) and the variables that contribute to the prediction are data from the image itself
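The iterative weight adjustment described in this section can be sketched with a single artificial neuron trained by gradient descent. This is a minimal illustration under stated assumptions, not a CNN and not a clinical model: the 2 input features and the labels are synthetic toy values invented for the example, and the function names, learning rate and number of epochs are arbitrary choices.

```python
import math


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Weighted sum of the inputs followed by the sigmoid function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def log_loss(weights, bias, rows, labels):
    """Loss function: average cross entropy between predictions and labels."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for features, y in zip(rows, labels):
        p = predict(weights, bias, features)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(rows)


def train(rows, labels, learning_rate=0.5, epochs=500):
    """Repeatedly nudge the weights in the direction that lowers the loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, y in zip(rows, labels):
            error = predict(weights, bias, features) - y  # gradient of the loss w.r.t. z
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Synthetic toy data (NOT clinical values): 2 scaled features per entry, label 0 or 1
ROWS = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
LABELS = [0, 0, 0, 1, 1, 1]

initial_loss = log_loss([0.0, 0.0], 0.0, ROWS, LABELS)
weights, bias = train(ROWS, LABELS)
final_loss = log_loss(weights, bias, ROWS, LABELS)
print(f"loss before training: {initial_loss:.3f}, after training: {final_loss:.3f}")
```

In a CNN the same loop applies, except that the weights are the filter values and backpropagation carries the error gradient backward through many layers.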
Terminology
- Feed forward: refers to how the data flows from 1 layer of the network to a subsequent layer of the network; it is then further passed on to the next layer of the network after calculations are made in the preceding layer
- Transfer learning: machine learning models trained to solve 1 type of problem can be reused to solve a problem with different subject matter (e.g., a machine learning model trained on a nonhistopathological dataset like ImageNet may be utilized to categorize benign and malignant pathology images with some fine tuning or addition of another machine learning layer)
- Semantic segmentation: a pixelwise classification of objects based on image class (e.g., some CNNs may be used to highlight cancerous and stromal regions in an image with different colors) (Am J Pathol 2019;189:1686)
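Pixelwise classification can be illustrated with a short sketch. It assumes that a segmentation network has already produced 1 score map per class (the scores, class names and the `segment` helper below are hypothetical and chosen only for illustration); each pixel is then assigned the class with the highest score.

```python
def segment(score_maps, class_names):
    """Assign every pixel the class whose score map is highest at that pixel.

    score_maps: dict mapping class name -> 2D list of scores (all the same size)
    """
    first = score_maps[class_names[0]]
    height, width = len(first), len(first[0])
    mask = []
    for r in range(height):
        row = []
        for c in range(width):
            best = max(class_names, key=lambda name: score_maps[name][r][c])
            row.append(best)
        mask.append(row)
    return mask


# Hypothetical per-pixel scores for a 2 x 2 image (not real model output)
SCORES = {
    "tumor": [[0.9, 0.2], [0.7, 0.4]],
    "stroma": [[0.1, 0.8], [0.3, 0.6]],
}

mask = segment(SCORES, ["tumor", "stroma"])
for row in mask:
    print(row)
```

Mapping each class name in the resulting mask to a color gives the kind of highlighted overlay described in the semantic segmentation entry.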
Diagrams / tables
- The figure below illustrates the type of calculations that image data goes through in convolution and pooling operations
- Convolution operations involve an elementwise product between the filter and different segments of equal dimensions from the input matrix; the products for each segment are summed to give 1 value in the output matrix
- Pooling operations perform an aggregate operation (e.g., maximum or average) on a region
- In the example below, the maximum value was returned from 2 x 2 regions of the input matrix
Contributed by Jerome Cheng, M.D.
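The 2 operations described in this section can also be written out in a few lines of plain Python. This is a minimal sketch, not the code behind the figure: the 5 x 5 input and 2 x 2 filter are made up values, the convolution uses no padding and a stride of 1, and the filter is not flipped (the convention followed by most deep learning libraries).

```python
def convolve2d(image, kernel):
    """Slide the filter over the input; multiply elementwise and sum each window."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def max_pool2d(matrix, size=2):
    """Return the maximum of each non-overlapping size x size region."""
    output = []
    for r in range(0, len(matrix) - size + 1, size):
        row = []
        for c in range(0, len(matrix[0]) - size + 1, size):
            window = [matrix[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(max(window))
        output.append(row)
    return output


# Made up input matrix and filter, for illustration only
IMAGE = [
    [1, 2, 0, 1, 3],
    [0, 1, 3, 2, 1],
    [2, 1, 0, 0, 2],
    [1, 0, 2, 3, 1],
    [0, 2, 1, 1, 0],
]
KERNEL = [
    [1, 0],
    [0, -1],
]

feature_map = convolve2d(IMAGE, KERNEL)  # 4 x 4 result
pooled = max_pool2d(feature_map)         # 2 x 2 result after 2 x 2 max pooling
print(feature_map)
print(pooled)
```

Replacing `max(window)` with `sum(window) / len(window)` would give average pooling instead of maximum pooling.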
Description of convolutional neural networks
- Convolution or pooling operations are carried out on information from 1 layer and the results are passed on to a deeper layer of the network
- Calculations involved in a convolutional neural network (CNN) are complex
- Fortunately, with all the tools available to us, we do not need to write a program specifying all of the mathematical operations involved
- Machine learning frameworks, such as TensorFlow and PyTorch, simplify the process of designing and training CNN models
- Reference: Nature 2015;521:436
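As data passes from 1 layer to a deeper layer, the width and height of the intermediate result change in a predictable way: for an input of size n, a filter of size k, padding p and stride s, the output size is floor((n + 2p - k) / s) + 1. The sketch below traces this through a small stack of layers. The layer stack and the helper names `output_size` and `trace_shapes` are hypothetical; frameworks such as TensorFlow and PyTorch perform this bookkeeping automatically.

```python
def output_size(input_size, kernel_size, stride=1, padding=0):
    """Width (or height) produced by a convolution or pooling layer."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def trace_shapes(input_size, layers):
    """Return the feature map size after each layer, starting with the input."""
    sizes = [input_size]
    for _name, kernel, stride, padding in layers:
        sizes.append(output_size(sizes[-1], kernel, stride, padding))
    return sizes


# Hypothetical layer stack: (name, kernel size, stride, padding)
LAYERS = [
    ("conv 3 x 3", 3, 1, 1),
    ("max pool 2 x 2", 2, 2, 0),
    ("conv 3 x 3", 3, 1, 1),
    ("max pool 2 x 2", 2, 2, 0),
]

sizes = trace_shapes(224, LAYERS)
for (name, *_), size in zip(LAYERS, sizes[1:]):
    print(f"{name}: {size} x {size}")
```

Each pooling layer halves the width and height here, which is why pooling is described as downsampling.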
Image analysis
- Image analysis through convolutional neural network (CNN) is usually performed on digital slides obtained from a whole slide scanner or an image taken through a digital camera mounted to a microscope
- Convolutional neural network architecture can be built from scratch or pretrained models can be used for image classification
- Several pretrained CNN models, such as those based on VGG-16, VGG-19, Inception v3, ResNet50, MobileNet and EfficientNet, are freely available on the internet and were trained on a 1,000 category subset of the ImageNet dataset
- The ImageNet image database comprises over a million images belonging to thousands of classes of real world objects, such as animals, cars and tables
- Despite being trained on nonhistological images, these can still be used for analysis of pathology based image datasets, due to an overlap of low level image features (color, lines, dots, curves) in both types of images
- Training a new CNN model with a purely histopathological image dataset should further improve prediction accuracy but it would take considerable effort to collect millions of images; additionally, model training is expected to take several days to complete
- Through a process referred to as transfer learning, pretrained CNN models can produce results in minutes; in contrast, training a CNN model from scratch with a large image dataset can take days, even with the aid of a powerful GPU (graphics processing unit)
- CNNs are widely regarded as black boxes, due to the millions of parameters (weights) involved in calculations and difficulty in understanding how these arrive at a prediction; however, in some types of CNNs, class activation maps can highlight which regions in an image contributed most to the prediction (Diagnostics (Basel) 2019;9:38)
- With the increasing adoption of whole slide digital imaging solutions in pathology departments for research, education and clinical practice, CNNs may be used on digital slides to aid in identifying histological structures, such as mitotic figures, nuclei, cancerous tissue and regions with cancer metastasis in lymph nodes
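The class activation maps mentioned in this section can be sketched as follows. The sketch assumes the common formulation for networks whose last convolutional layer is followed by global average pooling and a single fully connected layer: the map for a class is the sum of the final feature maps, each multiplied by that class's weight. The feature maps, weights and helper names are toy values and hypothetical names, not output from a real model.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of the final feature maps for 1 class."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]
    return cam


def hottest_location(cam):
    """Row and column of the region that contributed most to the class score."""
    best = max((value, r, c) for r, row in enumerate(cam) for c, value in enumerate(row))
    return best[1], best[2]


# Toy values: 2 feature maps of size 2 x 2 and the weights linking them to 1 class
FEATURE_MAPS = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [2.0, 0.0]],
]
CLASS_WEIGHTS = [0.5, -1.0]

cam = class_activation_map(FEATURE_MAPS, CLASS_WEIGHTS)
print(cam)
print(hottest_location(cam))
```

In practice the resulting map is resized to the dimensions of the input image and displayed as a heatmap overlay.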
Applications in pathology
- Identification of tumor regions in digital slides / images and prediction of survival outcome based on tumor shape (Sci Rep 2018;8:10393)
- Nuclei segmentation (AMIA Jt Summits Transl Sci Proc 2018;2017:227)
- Cancer detection (PLoS One 2018;13:e0196828)
- Mitosis detection (J Med Imaging (Bellingham) 2014;1:034003)
- Automated detection of Mycobacterium tuberculosis (J Thorac Dis 2018;10:1936)
- Automated interpretation of blood culture Gram stains (J Clin Microbiol 2018;56:e01521)
- Hepatocellular carcinoma nuclei grading (Comput Biol Med 2017;84:156)
- Malaria parasite detection in blood smear images (PeerJ 2019;7:e6977)
- Lymphoma classification (Ann Clin Lab Sci 2019;49:153)
- Natural language processing (JCO Clin Cancer Inform 2019;3:1)
- Stain normalization (Sci Rep 2020;10:14398)
- Data augmentation (Math Biosci Eng 2021;18:1740)
- Virtual staining (e.g., H&E to trichrome) (Mod Pathol 2021;34:808)
- Virtually removing ink dots from digital slides (J Pathol Inform 2021;12:43)
- Synthesis of diagnostic quality images (J Pathol 2020;252:178)
Tools for performing image analysis using convolutional neural networks
- Python (programming language):
  - Currently the most popular language used for machine learning
  - Has several libraries for convolutional neural network (CNN)
- Orange (Biolab):
  - Includes an image analysis add-on that can extract features from images using 1 of 7 pretrained CNN models: VGG-16, VGG-19, Inception v3, OpenFace, DeepLoc, Painters, SqueezeNet
  - Extracted features can be combined with machine learning algorithms, such as Random Forest, to create an image classifier (e.g., benign versus malignant lesions)
- TensorFlow:
  - Open source machine learning library developed by Google
  - Can be used along with Python to develop a CNN architecture for classifying images
- Keras:
  - High level neural network library that works on top of TensorFlow, providing a simpler programming framework for developing deep learning models
- PyTorch:
  - Open source machine learning framework popular in research
- PyHIST:
  - Splits a digital slide into separate tiles; whole slide images are very large so these have to be tiled into smaller and uniformly sized images before they can be processed by a CNN for training or classification
- QuPath:
  - Good for whole slide image annotation (e.g., draw an outline or rectangle around cancer regions) and tiling
  - Also has image analysis capabilities
- Google Colab:
  - Provides a preconfigured interactive Python environment (web based) for deep learning
  - Useful for running deep learning experiments with TensorFlow or PyTorch
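The tiling step mentioned for PyHIST and QuPath can be illustrated with a short sketch of the coordinate arithmetic alone. It does not use or reproduce the interface of either tool: the function `tile_coordinates`, the slide dimensions and the tile size are hypothetical, and real tools read pixel data from the slide file and often also discard tiles that contain mostly background.

```python
def tile_coordinates(slide_width, slide_height, tile_size, stride=None):
    """Top left (x, y) corner of every full tile; partial tiles at the edges are dropped."""
    if stride is None:
        stride = tile_size  # non-overlapping tiles by default
    coords = []
    for y in range(0, slide_height - tile_size + 1, stride):
        for x in range(0, slide_width - tile_size + 1, stride):
            coords.append((x, y))
    return coords


# Hypothetical 1000 x 600 pixel region cut into 256 x 256 pixel tiles
tiles = tile_coordinates(1000, 600, 256)
print(len(tiles), "tiles")
print(tiles)
```

Setting `stride` smaller than `tile_size` produces overlapping tiles, which is 1 option when structures of interest may straddle tile borders.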
Board review style question #1
Which machine learning method was inspired by the way our visual cortex processes images through receptive fields, whereby retinal neurons receive stimuli from different regions of the visual field and information from multiple retinal neurons is relayed to neurons further down the chain?
- A. Convolutional neural network (CNN)
- B. Logistic regression
- C. Random Forest
- D. Support vector machine
Board review style answer #1
A. CNN has a feed forward neural network architecture composed of convolution and pooling (downsampling) layers, followed by 1 or more fully connected layers. Convolution or pooling operations are carried out on information from 1 layer and the results are passed on to a deeper layer of the network. CNN has been used on digital slides to aid in identifying histological structures, such as mitosis, nuclei and regions with cancer metastasis.
Reference: Convolutional neural networks
Board review style question #2
Board review style answer #2
C. The pooling layer performs a downsampling operation (e.g., the maximum, minimum or average value of elements belonging to a 2 x 2 matrix may be computed)
Reference: Convolutional neural networks