Healthcare provides various types of Real World Data (RWD) that can be used to generate Real World Evidence (RWE). Traditional statistical approaches are suitable for the majority of the data (numbers, measures, classes). However, utilizing the full potential of more complex data types (such as images, free text, and various signals from medical devices) requires more sophisticated analytic tools. This is where machine learning (ML) based algorithms and artificial intelligence (AI) have a role to play.

The Finnish RWD/RWE landscape has a lot to offer. The various registries, biobanks and hospital data-lakes collect massive amounts of data from everyday patient care and beyond. Some offer nationwide coverage, and some offer data that is updated in almost real time. The Finnish RWE landscape has been described in more detail in previous Future Care Finland posts (‘Finnish ecosystem supports the use of RWE’ and ‘Finnish hospital biobanks promote research both nationally and internationally’). These data sources can be used effectively for RWE generation in research projects. Usually, the main interest has been in structured data (numbers and labels in a table) that can be easily processed and analyzed.

Antti Karlsson, PhD (Theoretical Physics), Data Science Development Manager, Auria Biobank

However, some of the data sources (namely hospital data-lakes and biobanks) also collect and store more complex data types and formats, such as images (for example from pathology or radiology), unstructured text (such as patient records and medication prescriptions), and signals from various measurement devices (such as ECG signals). These data formats are difficult or impossible to analyze using traditional statistical approaches, so more sophisticated analysis tools are needed to unleash the full potential of the data.

Auria Biobank is one of the Finnish data sources that collect and utilize these data for scientific research purposes. The development manager of Auria Biobank, Antti Karlsson, works hands-on with these tools. He was an invited speaker at Medaffcon’s Customer Evening “EMMA 2019” and shared his insights on neural networks, with the main emphasis on image processing. *

The neural networks Antti described have solved multiple challenging problems that, only a decade ago, were believed to be almost impossible for computers to solve. When it comes to the analysis of images, neural networks can, for example, classify images with very high accuracy. These tools have made possible the recent leaps in fields such as artificial vision and facial recognition. Everyday applications based on advanced image analytics include Google’s reverse image search and the real-time “selfie filters” used in Snapchat and other such photo/video apps.

However, in the medical field, these approaches are still mainly limited to academic research, even though ML-based image classifiers can reach very high classification accuracies or detect patterns and differences that are impossible for humans to detect, as Antti demonstrated in his presentation.

One reason this is still the case is the “lack of training data” in the medical field. In non-medical fields of science, the images that can be used for ML training are generally easily and publicly available, or quick and cheap to generate. Medical images, however, are sensitive patient data and thus are not openly distributed, in order to protect patients. Additionally, taking medical images is relatively slow and expensive, and might require invasive procedures (such as biopsies to generate histology images). Correctly pre-labeled medical images are especially hard to find, which makes classifier training hard or even impossible for certain problems, at least for the time being.

Overall, there is a great need for high-quality and properly annotated medical images if we want to utilize the full potential of image recognition in the medical field as well. Here also lies a huge potential for Finnish hospital data-lakes: the work they are now putting into collecting and storing such Real World Data, for use now and in the near future, might be the key to bringing these approaches, and the tools based on them, into everyday medical practice.


*Antti’s ideas and views on text mining and natural language processing have been published in a previous FCF post, ‘AI-based tools for mining electronic health record data’, and, for example, in Auria Biobank’s blog post ‘Mikä ihmeen ULMFiT?’ (in Finnish).
