AI Diagnostic Computer Outperforms Human Expert


“Our findings show that a deep learning algorithm can use images collected during routine cervical cancer screening to identify precancerous changes that, if left untreated, may develop into cancer,” commented Mark Schiffman, M.D., M.P.H., of NCI’s Division of Cancer Epidemiology and Genetics. “In fact, the computer analysis of the images was better at identifying precancer than a human expert reviewer of Pap tests under the microscope (cytology).”

Dr. Schiffman is referring to the collaboration between the National Institutes of Health and Global Good, which resulted in the creation of a computer algorithm that can analyze digital images of a woman’s cervix and accurately identify precancerous changes that require medical attention. The approach, called automated visual evaluation, could revolutionize cancer screening, particularly for low-resource settings.


At the moment, where advanced screening methods are unavailable, health care workers use a method called visual inspection with acetic acid (VIA). It relies on visual examination of the cervix for “aceto whitening,” a possible indicator of disease. While it is cheap and convenient, VIA is known to be inaccurate.

To design the algorithm, the team used more than 60,000 cervical images from an NCI archive of photos taken during a screening study in the 1990s. More than 9,400 women participated, and follow-up lasted up to 18 years. Thanks to the prospective nature of the study, researchers gained nearly complete information on which cervical changes became precancers and which did not.

“When this algorithm is combined with advances in HPV vaccination, emerging HPV detection technologies, and improvements in treatment, it is conceivable that cervical cancer could be brought under control, even in low-resource settings,” noted Maurizio Vecchione, executive vice president of Global Good.

The next step would be to train the algorithm on sample images from women in communities around the world, captured with a variety of cameras and other imaging options. This would build a more complete picture and teach the algorithm to distinguish among the possible subtle variations.