Promise, Pain, and Process
At Virgo, we believe that GI endoscopy is ripe for AI-driven innovation. We share that view with clinical, industrial, and academic thought leaders, and are building tools to improve clinical performance and patient care. Virgo customers are on the cutting edge and determine which tools we build and how they’re implemented.
We know there is a great deal of buzz in the market about the promises AI products will deliver. Despite that excitement, important discussions about the practicality of how these tools get into end users’ hands often don’t make it to print. We get it: it isn’t usually all that fun to pop the balloon. But we all need to come together and be realistic if we ever want to see true clinical value from AI. From our perspective, two important topics missing from the conversation are data acquisition and data management.
Machine learning, in the supervised form used here, is the process of labeling enough data that a neural network can be trained to classify a given scenario on its own. This requires a diverse dataset that allows the neural network to generalize: to accurately classify inputs it has never seen and generate metadata when it recognizes the characteristics for which it was trained.
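To make this concrete, here is a minimal sketch of the idea: a toy "classifier" trained by gradient descent on labeled feature vectors. The data, feature names, and single-layer model are illustrative stand-ins, not Virgo's actual pipeline, which operates on video frames at far larger scale.

```python
import numpy as np

# Toy stand-in for labeled frames: each "frame" is a 4-number feature
# vector, and the label marks whether a (hypothetical) instrument is
# present. Real systems learn from pixels; the training loop is analogous.
rng = np.random.default_rng(0)
instrument = rng.normal(loc=1.0, scale=0.3, size=(100, 4))   # label 1
background = rng.normal(loc=-1.0, scale=0.3, size=(100, 4))  # label 0
X = np.vstack([instrument, background])
y = np.array([1] * 100 + [0] * 100)

# Minimal logistic-regression "network": one linear layer plus a sigmoid,
# trained by gradient descent on the cross-entropy loss.
w, b, lr = np.zeros(4), 0.0, 0.1
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

# On held-in toy data this simple model separates the classes cleanly.
accuracy = np.mean(((1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5) == y)
```

The point of the sketch is the dependency it exposes: the quality of `w` and `b` is entirely a function of how representative and correctly labeled `X` and `y` are.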
This is why machine learning is absolutely dependent upon large-scale data acquisition. In general, the larger your dataset, the greater the likelihood it includes the low-probability scenarios required to train highly accurate models, the kind that provide genuine value to physicians.
Consider, also, the most fundamental rule of computation: GIGO, Garbage In, Garbage Out. A large dataset is useless without the tooling and processes required to find the needle in the haystack and separate the wheat from the chaff. Models MUST be trained with high-quality data in order to provide value.
For all of these reasons, the best machine learning and artificial intelligence systems are built atop a large, diverse, and continuously growing dataset coupled with high-quality classification and verification. That dataset must be continuously fed into precise processes that train, validate, and deploy improved neural networks.
To that end, Virgo has amassed over 100 terabytes of endoscopic video (more than 17.5 billion frames) from 15+ leading endoscopy centers, including Northwestern University, the University of Virginia, and the University of Colorado. This includes video from modern Olympus, Pentax, and Fuji endoscopy video processors, as well as ultrasound, fluoroscopy, SpyGlass, and other advanced imaging modalities.
We are applying these principles to our production and commercially available products, as well as our pipeline initiatives.
Atop our large, high-quality, and diverse dataset, we have trained our first Auto Procedures model to bring high-value AI to our platform. With zero input from physicians, our system now automatically detects basic procedure types and performs basic instrument detection. Our customers no longer need to search through raw video: our AI platform has classified over 85,000 endoscopic procedures, a large portion of which also contain instrument detections, allowing physicians to skip directly to the critical parts of a procedure when they need to review it. This has saved physicians an immense amount of time.
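The "skip to the critical parts" experience rests on a simple post-processing step: per-frame classifications are collapsed into time ranges a reviewer can jump to. Here is a minimal sketch of that idea, assuming hypothetical label strings and thresholds; Virgo's production logic is not public and surely differs in detail.

```python
def detect_segments(frame_labels, target, min_len=3, max_gap=2):
    """Collapse per-frame classifications into (start, end) frame ranges
    for `target`, bridging short gaps (misclassified frames) and dropping
    very short runs (likely noise). Thresholds are illustrative."""
    segments, start, gap = [], None, 0
    for i, label in enumerate(frame_labels):
        if label == target:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:  # run has truly ended
                end = i - gap
                if end - start + 1 >= min_len:
                    segments.append((start, end))
                start, gap = None, 0
    if start is not None:  # run extends to end of video
        end = len(frame_labels) - 1 - gap
        if end - start + 1 >= min_len:
            segments.append((start, end))
    return segments

# One-frame misclassification is bridged; a two-frame blip is dropped.
labels = ["bg"] * 5 + ["snare"] * 4 + ["bg"] + ["snare"] * 3 \
         + ["bg"] * 10 + ["snare"] * 2
segments = detect_segments(labels, "snare")  # → [(5, 12)]
```

Mapping frame indices back to timestamps (frame index divided by frame rate) then gives the clickable markers a physician uses to jump within the recording.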
We’re now training and validating our second generation Auto Procedures model, built atop Google’s AutoML image classification neural net. This process is nearly complete, and we have a substantial pipeline of additional models, including advanced Auto Procedures classification, advanced instrument classification, basic polyp detection and classification, and basic IBD detection and scoring.
In order to provide ongoing value, model training must be repeatable, validatable, and continuously adjusted to ensure high classification accuracy and regression-free operation. This is challenging because both labeling and training carry significant costs in time and money. To meet these challenges, Virgo has invested in software development, and will continue to invest in it.
Our process and software compute statistics to guarantee the input set is diverse across video systems, tools, procedures, and institutions. The process is auditable and repeatable across multiple iterations, using a highly efficient asynchronous architecture running on systems with high-I/O SSD RAID configurations and gigabyte-per-second network interface cards in order to saturate all available CPU capacity, even on massively parallel systems with 224 CPU threads. This architecture trains and refines models in hours or days, work that often takes weeks or months using less efficient means.
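The diversity check described above can be sketched in a few lines: count training samples per category along each stratification axis and flag categories that fall below a minimum share. The field names and threshold here are illustrative assumptions, not Virgo's actual schema.

```python
from collections import Counter

def diversity_report(samples, keys=("system", "institution", "procedure"),
                     min_share=0.05):
    """For each stratification key, count samples per category and flag
    categories below `min_share` of the total, so under-represented
    strata can be topped up before a training run."""
    report, total = {}, len(samples)
    for key in keys:
        counts = Counter(s[key] for s in samples)
        flagged = [c for c, n in counts.items() if n / total < min_share]
        report[key] = {"counts": dict(counts), "underrepresented": flagged}
    return report
```

Running this before each training iteration, and storing the report alongside the resulting model, is one simple way to make dataset composition auditable across runs.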
The final mile of ideal AI development, and one we deeply believe in, is democratizing the data labeling process. Our development pipeline includes features such as end-user-driven post-procedure labeling, which will move us toward model-initiated labeling: the needles in the haystack are first roughly identified by machine models, and humans simply verify the results to keep the models honest and provide ever-increasing texture. This may well be the ideal way to ensure that models continuously improve and adapt to present conditions.
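The verification loop described above is, at its core, a confidence-based triage: proposals the model is sure of pass through, uncertain ones go to a human reviewer, and the verified results feed the next training round. Here is a minimal sketch under that assumption; the thresholds and tuple shape are hypothetical, not Virgo's interface.

```python
def triage_predictions(predictions, auto_accept=0.95, review_floor=0.5):
    """Split model proposals (frame_id, label, confidence) into
    auto-accepted labels and a human review queue. Proposals below
    `review_floor` are discarded as too noisy to be worth a reviewer's
    time. Thresholds are illustrative."""
    accepted, review_queue = [], []
    for frame_id, label, confidence in predictions:
        if confidence >= auto_accept:
            accepted.append((frame_id, label))
        elif confidence >= review_floor:
            review_queue.append((frame_id, label, confidence))
    return accepted, review_queue

proposals = [("f1", "polyp", 0.99), ("f2", "polyp", 0.70),
             ("f3", "polyp", 0.30)]
accepted, queue = triage_predictions(proposals)
# accepted → [("f1", "polyp")]; queue → [("f2", "polyp", 0.7)]
```

A design note: sorting the review queue by ascending confidence turns this into a simple form of active learning, since the most uncertain examples are exactly the ones whose human labels teach the model the most.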
We are always looking for physicians and researchers who are as excited as we are about applying artificial intelligence to endoscopy. We look forward to a future where anyone who is interested can actively capture data and contribute to AI projects relevant to their needs and expertise.
If you want to help us build that future, please contact firstname.lastname@example.org.