Table of Contents
Definition / general | Essential features | Terminology | Motivation for artificial intelligence algorithms | Scope and types of artificial intelligence | Types of machine learning | Developing an AI algorithm | Applications | Implementation | Advantages | Limitations | Software | Methods to improve data curation | Diagrams / tables | Board review style question #1 | Board review style answer #1 | Board review style question #2 | Board review style answer #2

Cite this page: Levy J, Vaickus L. Artificial intelligence. PathologyOutlines.com website. https://www.pathologyoutlines.com/topic/informaticsAI.html. Accessed November 27th, 2024.
Definition / general
- Through a set of computational heuristics, artificial intelligence (AI) technologies efficiently parse and summarize millions of clinical variables collected in modern pathology laboratories in a knowledge / rules based or data driven way to augment clinical decision making
- Machine learning (ML) is a subset of AI approaches that learn patterned associations and rules to solve specific problems in instances where the number of clinical variables is far too large and complex for normal human comprehension
- Supervised algorithms can make predictions on data that has been annotated by pathologists, whereas unsupervised algorithms do not require pathologist annotations
Essential features
- Advances in genomics and imaging have generated complex biomedical data, posing challenges in comprehensive evaluation of clinical variables (Lab Invest 2021;101:412)
- Early AI applications relied on rules based approaches embedded in electronic medical record systems (Artif Intell Med 1992;4:463)
- Machine learning algorithms can derive patterns and rules from diverse pathology datasets, with deep learning excelling in image / text processing (Acad Pathol 2019;6:2374289519873088)
- Machine learning models learn from training sets and their generalizability is assessed using validation / test sets
- Several informatics software solutions have been developed to streamline the AI algorithm development process
- Investments in technical personnel, computing and data infrastructure enable rapid prototyping and techniques like transfer learning and expert in the loop which reduce costs in data collection and annotation (Pathol Res Pract 2020;216:153040)
- Potential for batch effects requires partitioning patients into separate training / test sets to avoid biasing the prediction models (JCO Clin Cancer Inform 2019;3:1)
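The patient level partitioning described above can be sketched in plain Python (the patient IDs and slide counts below are hypothetical; in practice, utilities such as scikit-learn's GroupShuffleSplit serve the same purpose):

```python
import random

def split_by_patient(samples, test_fraction=0.3, seed=0):
    """Partition samples so that all samples from a given patient land in
    exactly one of the training or test cohorts, avoiding batch effects
    from the same patient appearing on both sides of the split."""
    patients = sorted({s["patient_id"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, int(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    train = [s for s in samples if s["patient_id"] not in test_patients]
    test = [s for s in samples if s["patient_id"] in test_patients]
    return train, test

# Hypothetical example: 3 patients with 4 slides each
samples = [{"patient_id": p, "slide": i} for p in ("A", "B", "C") for i in range(4)]
train, test = split_by_patient(samples)
assert not ({s["patient_id"] for s in train} & {s["patient_id"] for s in test})
```

Splitting at the slide level instead of the patient level would let near identical tissue from one patient leak into both cohorts, inflating apparent performance.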
Terminology
- Whole slide image (WSI): digitized representation of histologic slide after whole slide scanning at 20x or 40x resolution (e.g., using Aperio AT2 or GT450 scanner); slide dimensionality can exceed 100,000 pixels in any given spatial dimension and typically contains 3 color channels, red, green and blue (RGB) (Annu Rev Pathol 2013;8:331)
- Subimage / patch: smaller, local rectangular region extracted from a WSI, often done to reduce the computational resources required for the development and deployment of machine learning algorithms (J Pathol Inform 2019;10:9)
- Gene expression array: using DNA microarrays and next generation sequencing to simultaneously estimate the expression of thousands of genes from a sample; used for diagnosis, prognosis and the selection of optimal therapeutics (Clin Biochem Rev 2011;32:177)
- Spatial / single cell omics: technologies that report gene expression for individual genes or locations within distinct spatial architectures within a tissue section (Nat Methods 2021;18:997, Trends Biotechnol 2010;28:281)
- Pathology note / sign out: textual representation of clinical narrative that can be broken down into words and phrases for further analysis (J Biomed Inform 2021;116:103712, Sci Rep 2021;11:23823, JCO Clin Cancer Inform 2018;2:1, J Pathol Inform 2019;10:13, J Pathol Inform 2022;13:3)
- Artificial intelligence: computational approaches developed to perform tasks that typically require human intelligence / semantic understanding (Lancet Oncol 2019;20:e253)
- Machine learning: computational heuristics that learn patterns from data without requiring explicit programming to make decisions (Med Image Anal 2016;33:170)
- Artificial neural networks (ANN) (Nature 2015;521:436)
- Type of machine learning algorithm that represents input data (e.g., images) as nodes (neurons)
- Learns image filters (e.g., color, shapes) used to extract histomorphological / cytological features
- Comprised of multiple processing layers to represent object at multiple levels of abstraction (deep learning)
- Inspired by the visual cortex
- Classification: use of computer algorithm for assigning type of object into a specific, predetermined grouping (Med Image Anal 2016;33:170)
- Regression: computer algorithm that can predict a continuous measure from input information (Med Image Anal 2016;33:170)
- Clustering: use of computer algorithm to group objects together, either based on similar features or spatially based on their colocalization (Med Image Anal 2016;33:170)
- Dimensionality reduction: use of a computer algorithm to visualize high dimensional data (e.g., many genes) into a low dimensional space (e.g., 2D scatterplot); each dimension typically represents a combination of markers and the distance between points in the scatterplot depicts relationships between datapoints in a simplified form (Med Image Anal 2016;33:170)
- Feature selection: use of a computer algorithm to rank and select features based on their perceived relevance to the target of interest using a quantitative metric (Med Image Anal 2016;33:170)
- Segmentation: use of a computer algorithm for pixelwise assignment of specific classes (e.g., nucleus) without specific separation of objects (Med Image Comput Comput Assist Interv 2015;18:234)
- Detection: use of computer algorithm to isolate specific objects in an image and report object’s bounding box location, etc. (Am J Pathol 2021;191:1693)
- Generative adversarial networks (GAN): type of neural network that generates highly realistic synthetic images from input signal (e.g., noise, source image) through iterative optimization of a generator that synthesizes images and discriminator / critic that attempts to distinguish generated from real images (Mod Pathol 2021;34:808)
- Evaluation metrics: used to depict performance of automated algorithm (Arch Pathol Lab Med 2021;145:1228)
- Accuracy: proportion of observations that were classified correctly
- Sensitivity: proportion of cases that were classified correctly at the given cutoff probability threshold
- Specificity: proportion of controls that were classified correctly at the given cutoff probability threshold
- AUC: area under the receiver operating characteristic curve, an overall measure of performance considering sensitivity / specificity reported across many cutoff thresholds
- F1 score: harmonic mean of precision and sensitivity (recall), capturing the tradeoff between the two
- Intersection over union (IoU): used to evaluate accuracy of cell localization algorithm by comparing the area overlap between the predicted cell location and ground truth location to the area union between the predicted / ground truth location
- Mean average precision (mAP): averages the precision of the model across many IoU thresholds (i.e., minimum IoU between predicted / ground truth locations to indicate a true positive detection)
- Training cohort: set of cases used for training machine learning model
- Validation / test cohort: set of cases used for evaluating machine learning model that the model does not train on
- Validation set is used to optimize the hyperparameters of the machine learning model, i.e., its training configuration (e.g., learning rate)
- Data complexities (Reg Anesth Pain Med 2021;46:936, Advances in Molecular Pathology 2022;5:e1)
- Confounding: when the difference in risk of a specific outcome associated with an exposure can be attributed to an unrelated third variable linked to both the exposure and the outcome
- Effect modification: risk of a specific outcome, given an exposure, changes depending on the patient subgroup
- Nonlinearity: relationship between a specific clinical variable and the outcome changes based on the variable's value
- Interactions: relationship between a specific clinical variable and the outcome changes based on a separate clinical variable, similar to effect modification
- Dimensionality: a large number of clinical predictors (millions to billions) are regularly collected per patient, surpassing the number of patients; these variables are further complicated by nonlinearity and interactions
- Personalized medicine: individualized categorization of health conditions by considering various factors specific to each person
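The evaluation metrics defined above can be sketched in plain Python (a minimal illustration, with binary labels coded 1 for cases and 0 for controls and bounding boxes as (x1, y1, x2, y2) tuples):

```python
def confusion_counts(y_true, y_pred):
    """Tally true / false positives and negatives for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(y_true, y_pred):  # proportion of cases classified correctly
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)

def specificity(y_true, y_pred):  # proportion of controls classified correctly
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and sensitivity (recall)."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes, as used
    to evaluate cell localization against a ground truth location."""
    ix = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

For example, with y_true = [1, 1, 0, 0] and y_pred = [1, 0, 0, 0], sensitivity is 0.5 and specificity is 1.0.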
Motivation for artificial intelligence algorithms
- Current health assessment methods are often unreliable, slow, nuanced and tedious, highlighting the need for reliable, quantitative and efficient algorithms (Mod Pathol 2022;35:1540)
- Data complexities necessitate pattern mining AI algorithms: assessing nuanced information becomes challenging due to confounding, effect modification, nonlinearity and interactions in the context of hyperdimensional data, where each patient may have millions to billions of features (BMC Med Res Methodol 2020;20:171)
- Pursuit of personalized medicine will lead to individualized assessments, enhancing health outcomes across different diagnostic settings and geographic regions (Genome 2021;64:416)
- Includes molecular alterations, comorbidities and healthcare data that interact across space and time and considers unique screening, assessment and treatment options
- While it is important to understand population level risk factors, it is equally important to assess risks for individuals of diverse backgrounds, epigenetics / genetic makeup and exposures who experience care differently
Scope and types of artificial intelligence
- Originated as a summer project at Dartmouth College in 1956 (Acad Radiol 2021;28:1810)
- Concept of machines imitating human cognitive abilities
- Turing test: if a human evaluator cannot distinguish between responses from a computer and a human, it indicates that the machine is capable of human level intelligence (Nat Cancer 2020;1:137)
- Narrow intelligence: optimize / solve one narrow problem at a time, excel in one domain at a time (Lab Invest 2021;101:412)
- Does not possess human-like cognitive processes
- Encompasses the following methods
- Expert systems: rules based approaches that leverage human curated knowledge, common to most EMR systems and used in real time decision support systems focused on monitoring a specific function or a small set of information (Diagram 1)
- Knowledge engineering: embedding human expertise and knowledge into computer systems; domain experts provide rules, heuristics and guidelines to implement in AI systems for clinical decision making
- Symbolic reasoning: representing knowledge through a series of logical statements, symbols (that contain certain mathematical properties) and rules; the manipulation of these 3 components is used to solve problems
- Automated planning processes: generation of plans and sequences of actions to achieve a goal of interest
- Given a problem, search algorithms / optimization techniques are employed to generate a series of possible events and relevant actions
- The set of actions taken reflects the maximum possible expected reward
- Planning / scheduling tasks may be used to optimize the laboratory workflow and the surgical pathology schedule based on a number of constraints and the patient’s complexity
- Case based problem solving
- Identifies similar cases in a database of prior cases and their solutions to solve a problem
- Used to solve rare cases by providing a similar guide to follow
- Quality assurance / detect errors through comparison to nominal findings
- Used to compare against similar cases for clinical decision support
- Evolutionary computation: optimization systems that recommend a number of solutions that compete / combine with each other to form an optimal solution, inspired by natural selection with generations of recombination, mutation and selection; genetic algorithms can be used as an optimization routine for the decision support systems themselves
- Machine learning: discovers rules and patterns by directly learning relevant features from the data (e.g., collections of relevant shapes within image)
- Artificial general intelligence (AGI): intelligence comparable to that of a human, capable of solving large, complex tasks from first principles (Semin Diagn Pathol 2023;40:71)
- Artificial superintelligence (ASI): computer possessing intelligence that far surpasses that of a human (Arrhythm Electrophysiol Rev 2021;10:223)
- Natural language processing: development of algorithms to understand, interpret / parse and imitate human language (Am J Pathol 2022 Aug 17 [Epub ahead of print])
- Tasks include tokenization (breaking sentences into words), parsing (locating specific words), stop word removal (removing common words), lemmatization / word stemming (converting word to common word root), part of speech tagging, named entity recognition (e.g., identifying mention of disease within pathology report), topic modeling (i.e., identifying prevailing trends), text generation (e.g., autocomplete for pathology report), classification (e.g., assigning report case complexity CPT code), etc.
- Count matrices (frequency of each words by report) and word vectors / embeddings (i.e., assigning each word a numerical descriptor that can be compared to other words; e.g., word2vec, transformer, etc.) used to represent data
- Large language models (e.g., ChatGPT) have risen in popularity over the past few years and have many use cases in pathology, including assistance with writing / summarizing clinical reports and grant proposals (PLOS Digit Health 2023;2:e0000198)
- Computer vision: aims to identify and quantify the presence of significant patterns within an image (NPJ Digit Med 2021;4:5)
- Imaging features can be used to effectively organize and extract meaningful information from the image
- Capable of taking measurements such as nucleus size, the number of nuclear grooves and other relevant metrics derived from imaging features
- Tasks include classification (e.g., binning cell based on assigned cell type), segmentation (e.g., separating nucleus from cytoplasm in pixel by pixel manner), detection (e.g., locating instances of cells within slide images), coregistration (aligning 2 restained sections together to tag cells with multiple immunostains), etc.
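Several of the natural language processing steps above (tokenization, stop word removal and count matrices) can be sketched in plain Python; the report text and stop word list below are illustrative only and libraries such as NLTK or spaCy provide production grade versions:

```python
import re
from collections import Counter

STOP_WORDS = {"the", "of", "with", "and", "a", "is", "in"}  # illustrative subset

def tokenize(text):
    """Tokenization: break a sentence into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def count_vector(text):
    """Frequency of each non-stop word: one row of a count matrix."""
    return Counter(tok for tok in tokenize(text) if tok not in STOP_WORDS)

report = "Sections of the breast biopsy show invasive ductal carcinoma of the breast."
vec = count_vector(report)
assert vec["breast"] == 2 and "the" not in vec
```

Stacking one such count vector per report yields the count matrix described above, which classification or topic modeling algorithms then consume.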
Types of machine learning
- Supervised learning: when diagnosis / outcome is known
- Classification: prediction of a categorical assignment
- Regression: prediction of a continuous dependent variable
- Survival analysis: special form of regression analysis to estimate time to event outcomes; commonly formulated using Cox proportional hazards
- Example approaches: multivariable linear / logistic regression, decision trees, random forest, support vector machines, discriminant analyses, K nearest neighbors
- Unsupervised learning: when diagnosis / outcome is not known
- Dimensionality reduction: reduces the number of high dimensional input variables to a manageable form (i.e., makes it easier to visualize data in a 2 dimensional scatterplot to see how patients relate to one another) (Nat Biotechnol 2018 Dec 3 [Epub ahead of print], mSystems 2021;6:e0069121)
- Example approaches: principal component analysis (PCA), uniform manifold approximation and projection (UMAP), T distributed stochastic neighbor embedding (TSNE), variational autoencoders (VAE)
- Clustering: process of grouping patients, genes etc., based on similar characteristics
- Example algorithms: K means, hierarchical clustering, spectral clustering, density based clustering, mixture models
- Deep learning: leverage artificial neural networks (ANN), comprised of multiple processing layers to represent objects at several levels of abstraction (J R Soc Interface 2018;15:20170387)
- Can perform both supervised and unsupervised tasks
- Requires significant computing capabilities in the form of graphics processing units (GPU)
- Convolutional neural networks: ideal for image data (e.g., whole slide images), work by storing filters to extract task relevant shapes and patterns
- Recurrent neural networks: well suited for sequence data (e.g., genomics, text, time series) by keeping a working memory of previous states in a sequence
- Graph neural networks: optimal for graph structured data through message passing operations between linked entities (e.g., incorporating information from adjacent histological structures, cells, etc.) (Nat Biomed Eng 2022;6:1353)
- Attention mechanism: dynamically assigns weights to subcomponents of different data types (e.g., images, text, genomics, graphs, etc.), considering their relevance and importance
- Generative adversarial networks: capable of generating synthetic data for various pathological data types (e.g., images of cells, simulating application of different chemical staining reagents)
- Reinforcement learning: learns from feedback in the environment or system of interaction to anticipate future actions based on the patient's state and expected rewards (Acad Pathol 2019;6:2374289519873088)
- Popularly used for drug design and delivery with only limited applications in pathology
- Data preprocessing: transform data into a format that can be readily understood by AI algorithms (Pac Symp Biocomput 2020;25:307)
- Missing data: imputation (i.e., replacement) of missing features if data is missing at random (MAR) or missing completely at random (MCAR); removal of features if missingness is excessive or if data is missing not at random (MNAR; i.e., missingness is tied to the outcome) and the reason for missingness is not well understood
- Standardization / normalization: rescaling features to comparable ranges (e.g., zero mean and unit variance) to lessen the impact of extreme values and improve model performance
- Feature selection: selecting a subset of the most relevant input variables to improve predictive models (Machine Intelligence and Pattern Recognition 1994;16:403)
- Variance filtering: retains features with greatest variation within the dataset
- Variance inflation factor: metric that scores features based on correlation / redundancy of variable to other independent variables (Qual Quant 2007;41:673)
- LASSO / ridge: shrinks or eliminates predictors, largely based on whether they are correlated / contain redundant information
- Feature importance: model specific scoring of features based on contribution to predictive performance; retains most relevant features
- Recursive feature elimination: iteratively considers smaller set of features using the metrics defined above
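As one illustration of the supervised approaches listed above, a minimal K nearest neighbors classifier can be written in plain Python (the toy feature values and class labels are purely illustrative):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points
    (Euclidean distance): a minimal K nearest neighbors classifier."""
    dists = sorted((math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy example: 2 well separated classes in a 2 dimensional feature space
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]
assert knn_predict(train_X, train_y, (0.5, 0.5), k=3) == "benign"
assert knn_predict(train_X, train_y, (5.5, 5.5), k=3) == "malignant"
```

Here k is a hyperparameter: it is not learned from the data but chosen via the validation set, as described in the terminology section.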
Developing an AI algorithm
- Algorithmic development process (Diagram 2) (Nat Med 2020;26:1320, BMJ Health Care Inform 2021;28:e100385)
- Defining the problem: guide collection of data germane to the task at hand; decide whether the problem conforms to a specific task (e.g., classification, regression, etc.) based on the available data and the anticipated outcomes
- Data: organizing (e.g., annotation by a pathologist using software) and preprocessing data into a format digestible by computer algorithms and partitioned to demonstrate broad scale applicability
- ASAP, Qupath, ImageJ: used for annotating image data (Sci Rep 2017;7:16878, Comput Struct Biotechnol J 2021;19:852)
- Doccano: used for annotating text data (arXiv: POTATO - The Portable Text Annotation Tool [Accessed 19 July 2023])
- Cross validation procedures iteratively partition the dataset to compare models (e.g., random forest) and their set of hyperparameters (i.e., specifies constraints for how a machine learning model learns from the data; e.g., maximum depth of the decision tree) to decide on an optimal set to use as the final model while avoiding overfitting (memorization of the input data); model is trained across the cross validated dataset and evaluated on the test set
- Alternatively, separate training, validation and test datasets can be specified, identifying the optimal set based on validation set performance, ensuring patients and other sources of systematic variation are solely assigned into separate cohorts (e.g., all samples from one patient are only in the validation set)
- See evaluation metrics
- Solutions
- Statistics: interpreting the relationship between predictors and outcomes
- Algorithms: prioritizing prediction accuracy for specific tasks, with less emphasis on interpretation
- Evaluation: utilizing performance metrics to assess algorithm performance in real world clinical settings
- Interpretation techniques assign scores to features that are pertinent to the prediction; used to evaluate the coherence and validity of the model's outputs or identify sources of systematic bias
- Feedback / refine: iteratively collecting data, improving algorithms and aligning algorithms with clinical needs through consultation with clinical stakeholders
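The cross validation partitioning described above can be sketched in plain Python (a simple K fold split; a real pipeline would additionally group by patient to keep systematic sources of variation in separate folds):

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs: each fold serves once as the
    validation set while the remaining folds form the training set."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, k=5))
assert len(folds) == 5 and all(len(val) == 2 for _, val in folds)
```

Each candidate model and hyperparameter set is scored on every validation fold; the configuration with the best average score becomes the final model, which is then evaluated once on the held out test set.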
Applications
- Anatomic pathology (Diagram 3)
- Prostate cancer Gleason grading (Mod Pathol 2018;31:S96)
- Surgical pathology: rapid intraoperative margin assessment for Mohs surgery (medRxiv 2023 May 16 [Preprint], medRxiv 2022 May 20 [Preprint])
- Nuclei detection: localizes nuclei within an H&E or IHC slide for further characterization of their spatial distribution (Nat Methods 2019;16:1233)
- Cytopathology
- Separation of cytoplasmic boundaries for characterization of cell clusters (Cancer Cytopathol 2023;131:19)
- Rapid bladder cancer screening and recurrence assessment (medRxiv 2023 March 2 [Preprint], medRxiv 2023 March 5 [Preprint])
- Subclassification of thyroid nodules with atypia of undetermined significance (J Pathol Inform 2022;13:100004)
- Virtual staining: digital conversion between different chemical staining reagents (Light Sci Appl 2020;9:78, Mod Pathol 2021;34:808)
- Image registration: process of aligning 2 different images (e.g., 2 different IHC stained slides from serial sections) into a common coordinate system (Cancer Res 2023;83:2078)
- Graph neural networks: algorithms capable of contextualizing smaller regions / points of interest within a tissue slide by their surrounding architecture (Comput Med Imaging Graph 2022;95:102027, Pac Symp Biocomput 2021;26:285)
- Clinical trials: uses digital algorithms to predict disease outcomes / quantitatively assess prognostic risk as a study endpoint or to derive biomarker for clinical validation (NPJ Precis Oncol 2022;6:37)
- Molecular pathology (Diagram 4)
- Genetic polymorphisms: fast algorithms for calling single nucleotide polymorphisms, estimating epistatic interactions (Hum Genet 2011;129:101)
- DNA methylation: AI can be used to study age acceleration, cellular heterogeneity, cancer subtyping and prognostication (BMC Bioinformatics 2020;21:108)
- RNASeq: machine learning to define intrinsic molecular subtypes (e.g., PAM50, scleroderma) with associated therapeutic response (Arthritis Rheumatol 2019;71:1701)
- Single cell and spatial omics: helps localize distinct cellular populations to distinct histological architectures; machine learning tools help map single cells to tissue slides and predict spatial expression from H&E WSI (Nat Rev Genet 2019;20:257, Nat Methods 2022;19:534)
- Genetic diversity as a prognostic signature: classification and dimensionality reduction of genomic reads assigned to taxa can inform prognostically important microbiome diversity measures (Bioinformatics 2012;28:i356, Front Microbiol 2021;12:634511)
- CRISPR: prediction of off target binding effects and diagnostics (PLoS One 2022;17:e0262299)
- Intersections between anatomic and molecular pathology
- Multimodal models: enhances the ability to prognosticate by leveraging / combining information from different data types (IEEE Trans Med Imaging 2022;41:757, Cancer Cell 2022;40:1095)
- Spatial multimodal models: learn additional biologically relevant features from both coregistered H&E and spatial molecular information (Diagram 4) (J Pathol Inform 2023;14:100308)
- Tumor purity control: uses image analysis methods to estimate the proportion of malignant cells within macrodissected tumor regions for accurate estimation of the tumor mutational burden (Mod Pathol 2022;35:1791)
- NLP and pathology notes
- Current procedural terminology (CPT) code prediction: identifies instances of underbilling by serving as second check for assignment of primary CPT codes (Diagram 5) (J Pathol Inform 2022;13:3)
- Named entity recognition: extraction of relevant reporting information (e.g., staining / staging results) into a structured format for integration into electronic health records (IEEE J Biomed Health Inform 2018;22:244)
- Automated input / autocomplete of pathology reports: improves the speed of inputting information into pathology reports and can assist with scanning and structuring reports shared from other institutions (Sci Rep 2021;11:23823, Acta Neurochir Suppl 2022;134:207, arXiv: Fast, Structured Clinical Documentation via Contextual Autocomplete [Accessed 19 July 2023])
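A minimal, rules based sketch of named entity recognition for stain results follows (the report snippet, stain names and regular expression are illustrative only; production systems typically use statistical NLP models):

```python
import re

# Hypothetical report snippet
report = "Immunostains: TTF1 positive, p40 negative, Ki67 positive."

def extract_stain_results(text):
    """Pull (stain, result) pairs into a structured dict: a minimal
    rules based named entity recognition step for report parsing."""
    pattern = re.compile(r"(\w+)\s+(positive|negative)", re.IGNORECASE)
    return {stain: result.lower() for stain, result in pattern.findall(text)}

results = extract_stain_results(report)
assert results == {"TTF1": "positive", "p40": "negative", "Ki67": "positive"}
```

The resulting dictionary is the kind of structured output that can be integrated into electronic health records, as described above.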
Implementation
- Guidelines for development and validation of AI algorithms
- CLIA / CAP (CAP: How to Validate AI Algorithms in Anatomic Pathology [Accessed 19 July 2023])
- Using AI to preclassify tissue specimens with follow up review by the pathologist - must be integrated with FDA approved digital pathology system
- Validation on at least 60 samples is required for new diagnostic decision aids and should follow guidelines for validating whole slide imaging systems, reflecting real world clinical scenarios; additional validation is needed for changes to the system
- Pathologists cannot always claim that the diagnostic overlay misled them in instances where discrepancies arise; understanding appropriate usage of the system is crucial
- For instance, a system that is configured for diagnostic purposes (e.g., correctly classifying a benign reactive case for cervical biopsies for cervical cancer screening) cannot be used to localize cells suggestive of herpes simplex virus if that is not its intended use
- The pathologist would still be required to both affirm the benign reactive case if they are in agreement and perform manual examination for additional findings
- Devices not approved by the FDA can be used as long as the system is validated and information on the regulatory status of the device is disclosed in the pathology report
- Health care delivery science: design thinking, process improvement and implementation science, along with continuous quality control / monitoring and evaluation of technologies in realistic scenarios are essential for envisioning successful implementation of AI in the clinic (NPJ Digit Med 2020;3:107)
- Examples of checklists for design and evaluation of AI technologies
- SPIRIT-AI: trial protocols - guidelines for designing clinical trials involving the use of artificial intelligence interventions (Nat Med 2020;26:1351)
- CONSORT-AI: trial reports - guidelines for reporting clinical trials involving the use of artificial intelligence interventions (Nat Med 2020;26:1364)
- MI-CLAIM: minimum information required to evaluate a clinical AI study, involving assessment of the following components: study design, data separation, optimization and final model selection, performance evaluation, model examination and reproducible pipeline (NPJ Digit Med 2022;5:2)
- Features a tiered system of reporting standards, enabling release of different levels of information
- Enhancing diversity in biomedical AI applications (EBioMedicine 2021;67:103358, NPJ Digit Med 2022;5:2)
- Short term solution: diversify data collections and monitor AI algorithms for sources of bias
- Long term solutions: policy, regulatory changes regarding funding, education and publications (e.g., health disparities in FDA guidelines), enhancing the development of diverse teams
- Large language models authorship policies: large language models (LLM) should be used to help perform minor edits to originally drafted content to improve readability and language rather than generate new content
- Most journals do not consider large language models as satisfying authorship criteria as they are unable to assume accountability for their work
- A statement in the methods or acknowledgements section can address use of LLM for academic writing assistance
Advantages
- Advantages: informatics approaches that can be employed in the clinical setting (Ann R Coll Surg Engl 2004;86:334, J Clin Med 2020;9:3697)
- Data driven: leverages various types of data to facilitate enhanced learning and synthesis of information from multiple sources, including but not limited to genomics, laboratory data and imaging data
- Offers distinct perspectives on a patient's health and wellness
- Technologies sift through vast amounts of data, uncovering hidden, interconnected patterns that would have been difficult to identify individually
- Flexible: programming can be adapted to develop a comprehensive understanding of the abundant data generated across numerous healthcare systems
- Efficient: empower healthcare professionals to accomplish more within shorter timeframes
- Digital connectivity: algorithms can efficiently process and deliver results regardless of the data's location, providing real time insights, enabling faster, less error prone decision making and significantly reducing the burden of tedious tasks, making them more manageable and higher throughput
- Reduces barriers to entry for nonspecialists, particularly helpful within lower resourced areas around the world
- Effective: potential to translate into positive health outcomes, enhance healthcare delivery and contribute to overall better health and well being through personalized care recommendations
Limitations
- Disadvantages / ethical considerations
- Algorithms should complement clinical training and intuition, serving as a secondary check or to identify areas to focus on or to select high priority cases for triage; relying solely on automation can transfer the responsibility of clinical decisions to the algorithm makers, potentially resulting in erroneous decisions due to overreliance (JMIR Form Res 2022;6:e36501)
- Data ownership: the question of who owns the data (patients, departments, hospitals, research organizations) and generates the models can influence clinical decisions and inform concerns about bias, conflicts of interest and commercialization interests; collaboration between healthcare organization, researchers, patients and regulatory bodies are essential for establishing clear ethical guidelines
- Bias in AI algorithms can result in algorithmic behavior that disproportionately affects historically underserved groups, leading to higher rates of underdiagnosis and misdiagnosis in these populations, further emphasizing the need for increased representation and diversity in AI development and STEM training (Engag Sci Technol Soc 2017;3:139, PLoS Comput Biol 2022;18:e1009719)
- Data hungry AI algorithms are susceptible to security risks, which can lead to costly litigation
- Federated learning and other security / privacy preserving technologies can enable the development of AI models across multiple institutions while minimizing security / privacy risks by keeping the data hidden / encrypted (NPJ Digit Med 2020;3:119)
- Not every clinical problem can be effectively solved with AI and incorporating clinical input into development and deployment strategies can foster successful translation and adoption of medical AI technologies (BMC Med 2019;17:143)
- Focus is primarily on prediction rather than statistical inference, which may overlook important risk factors that require intervention; sometimes simpler modeling heuristics are sufficient to solve most biomedical problems (BMC Bioinformatics 2018;19:270, BMC Med Res Methodol 2020;20:171)
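The federated averaging idea mentioned above can be sketched in a few lines of plain Python; this is an illustrative toy of the server-side aggregation step only (the site names, parameter values and sample counts are invented), not a production federated learning implementation:

```python
def federated_average(client_weights, client_sizes):
    """Server-side aggregation for federated learning: combine locally
    trained model parameters without ever pooling the underlying
    patient-level data; each site contributes in proportion to its
    number of samples."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    avg = [0.0] * n_params
    for weights, n in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            avg[i] += w * (n / total)
    return avg

# Three hypothetical institutions share only parameter vectors, never raw data
site_weights = [[0.2, 0.4], [0.4, 0.8], [0.3, 0.6]]
site_sizes = [100, 300, 100]
global_weights = federated_average(site_weights, site_sizes)  # ~ [0.34, 0.68]
```

In practice, each institution would retrain locally on the broadcast global weights and the averaging step would repeat over many rounds, often with added encryption or differential privacy.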
Software
- Select examples for AI algorithm development include
- Scikit-image, OpenCV (cv2): image analysis frameworks (PeerJ 2014;2:e453, Comput Biol Med 2017;84:189)
- Scikit-learn, Caret: machine learning framework (Bioinformatics 2023;39:btac829, J Open Source Softw 2019;4:1903)
- PyTorch, Keras, TensorFlow: deep learning frameworks (Mol Cancer Res 2022;20:202)
- Instructional book for developing deep learning workflows: D2L: Dive into Deep Learning [Accessed 19 July 2023]
- Detectron2, MMDetection: cell detection frameworks (Cancer Cytopathol 2023;131:19)
- Regex, TextBlob, spaCy, NLTK, Gensim, Hugging Face: text processing frameworks (arXiv: HuggingFace's Transformers - State-of-the-art Natural Language Processing [Accessed 19 July 2023], Sarkar: Text Analytics with Python - A Practitioner's Guide to Natural Language Processing, 2nd Edition, 2019)
- Captum, SHAP: model interpretation frameworks (Nat Mach Intell 2020;2:56, arXiv: Captum - A Unified and Generic Model Interpretability Library for PyTorch [Accessed 19 July 2023])
- OWL: web ontology language, used for knowledge representation and reasoning (Decision Support Systems 2010;50:1)
- ROS: example of robotics operating systems (Koubaa: Robot Operating System (ROS) - The Complete Reference (Volume 2) (Studies in Computational Intelligence, 707), 1st Edition, 2017)
- DEAP: distributed evolutionary algorithms in Python, example of an evolutionary computation framework (The Journal of Machine Learning Research 2012;13:2171)
- TPOT: example of an AutoML approach (train a machine learning model with a single line of code) (Hutter: Automated Machine Learning, 1st Edition, 2019)
- Docker / Singularity and Conda: software to create reproducible, production level software for deploying machine learning algorithms (PLoS Comput Biol 2020;16:e1008316, Nat Methods 2021;18:1161)
- Introductory machine learning tutorials in R and Python (GitHub: Molecular Pathology Machine Learning Tutorial [Accessed 19 July 2023])
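The train / validation split workflow these frameworks support can be sketched with scikit-learn; this is a minimal illustrative example on a bundled toy dataset (the hyperparameters are arbitrary choices), not a clinically validated model:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Supervised workflow: fit on a training set, then estimate
# generalizability on a held out test set never seen during fitting
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)
test_acc = accuracy_score(y_test, clf.predict(X_test))
```

The held out accuracy, not the training accuracy, is what speaks to how the model may behave on new cases.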
Methods to improve data curation
- Human in the loop: systems that incorporate user feedback in real time to improve predictive models
- Bayesian active learning: iteratively labeling images based on algorithmic uncertainty / mistakes (Sci Rep 2019;9:14347)
- Unsupervised clustering: sorts objects into groups and suggests groups to annotate (J Pathol Inform 2022;13:100146, Nat Commun 2012;3:1032, Am J Clin Pathol 2022;157:5)
- Point annotations: avoids need to outline objects in whole slide images (WSI) by extrapolating from a single central point (Med Image Anal 2020;65:101771, IEEE J Biomed Health Inform 2021;25:1673, IEEE Trans Med Imaging 2020;39:3655)
- Segment anything model: deep learning approach that can generate polygonal annotations from minimal specification of points and boxes (e.g., used to quickly annotate epithelial regions in WSI or localize nuclei) (arXiv: Segment Anything Model (SAM) for Digital Pathology - Assess Zero-shot Segmentation on Whole Slide Imaging [Accessed 19 July 2023])
- Transfer learning: training models on data from a separate, related domain / tissue type (e.g., stomach cancer) and further improving / fine tuning these models on the target tissue type (e.g., colon cancer) (Nat Commun 2020;11:6367)
- Less data is needed as deep learning model populates an initial information registry from the stomach cancer set that will complement the colon cancer set
- Assumes similar domains between source and target models
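The uncertainty based querying behind active learning can be sketched with scikit-learn; this is a toy illustration on synthetic data (the pool sizes and the query budget of 5 are arbitrary assumptions), not a full Bayesian active learning pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Uncertainty sampling: ask the annotator to label the cases the current
# model is least sure about, rather than labeling cases at random
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
labeled = np.arange(20)          # small initial annotated pool
unlabeled = np.arange(20, 200)   # pool awaiting annotation

model = LogisticRegression().fit(X[labeled], y[labeled])
proba = model.predict_proba(X[unlabeled])
margin = np.abs(proba[:, 1] - 0.5)          # 0 = maximally uncertain
query = unlabeled[np.argsort(margin)[:5]]   # 5 cases to annotate next
```

After the queried cases are labeled, the model is refit and the loop repeats, concentrating annotation effort where it changes the model most.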
Diagrams / tables
Board review style question #1
Board review style answer #1
B. Decision trees are a prime example of a supervised learning algorithm. Answers C and D are incorrect because they are both examples of unsupervised machine learning algorithms, dimensionality reduction and clustering respectively. Answer A is incorrect because an artificial neural network can be used for supervised tasks as well as unsupervised tasks (e.g., dimensionality reduction, clustering).
Comment Here
Reference: Artificial intelligence
Board review style question #2
Which of the following is an example of an artificial intelligence algorithm for pathology?
A. Deep learning algorithm used to detect vertebral fractures from X-rays
B. Hard programmed system designed by programmers and informed by domain experts, which is able to calculate the total NAS score / NAFLD activity score from a collection of pathology reports input into the electronic medical record system, helping determine a NASH diagnosis for each patient
C. Manual examination of a urine cytology slide for evidence of cytological atypia for bladder cancer screening
D. Multivariable logistic regression model that is able to ascertain a risk estimate for the impact of copper depletion within and around tumors on the potential for metastasis
Board review style answer #2
B. Hard programmed system designed by programmers and informed by domain experts, which is able to calculate the total NAS score / NAFLD activity score from a collection of pathology reports input into the electronic medical record system, helping determine a NASH diagnosis for each patient. Answer B represents a rules based system that can sum the scores of steatosis (0 - 3), lobular inflammation (0 - 3) and hepatic ballooning (0 - 2) to determine a final NAS score, where a final NAS score ≥ 5 is highly associated with definitive diagnosis of NASH. The explicit programming and incorporation of expert knowledge on NASH, along with integration into the EMR, demonstrate its applicability as a predictive AI approach to inform disease diagnostics, despite its simplicity compared to answers A and D.
Answer C is incorrect because it does not involve the use of any programming or computer analysis as decisions are rendered through manual assessment. Answer A is incorrect because it is an example of an artificial intelligence technique; however, the domain of application is not pathology but more likely found in a radiology setting. Answer D is incorrect because it represents a statistical analysis that was done to study an association between a specific treatment (copper depletion) and outcome (formation of metastasis). The emphasis for the employed algorithm was to study a potential risk factor or treatment of a health outcome rather than to make a prediction for diagnostic decision making.
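The rules based scoring described in answer B can be sketched as a small hypothetical helper; the function name and interpretation strings are invented for illustration, and a real EMR-integrated system would first extract the component scores from report text:

```python
def nas_score(steatosis, lobular_inflammation, ballooning):
    """Rules based NAS (NAFLD activity score): sum of steatosis (0 - 3),
    lobular inflammation (0 - 3) and hepatocyte ballooning (0 - 2);
    a total >= 5 is highly associated with definite NASH."""
    assert 0 <= steatosis <= 3, "steatosis graded 0 - 3"
    assert 0 <= lobular_inflammation <= 3, "lobular inflammation graded 0 - 3"
    assert 0 <= ballooning <= 2, "ballooning graded 0 - 2"
    total = steatosis + lobular_inflammation + ballooning
    label = "highly associated with NASH" if total >= 5 else "below NASH threshold"
    return total, label

score, interpretation = nas_score(3, 2, 1)  # total of 6
```

Explicit, expert-derived rules like this are the oldest form of clinical AI, predating the data driven machine learning approaches in answers A and D.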
Comment Here
Reference: Artificial intelligence