Elin Björnsson, MPALG, and Jan Liu, MPCAS

Automatic assessment of cardiac ultrasound images using deep learning

A deep learning based solution for classification of viewpoint and ejection fraction in echocardiograms

Examiner: Fredrik Kahl, Department of Electrical Engineering
Supervisor: Jennifer Alvén, Department of Electrical Engineering


By performing an echocardiography, a physician is able to detect, determine and even anticipate heart diseases. However, it takes several years to train a physician to become a senior specialist in echocardiography. This master's thesis proposes the initial steps in the development of a deep learning based solution, using convolutional neural networks, for classification of echocardiogram viewpoints and the ejection fraction (EF) of the left ventricle. The information gained from a convolutional neural network may assist physicians in assessing cardiac ultrasound images and act as support at hospitals that have no senior specialist in echocardiography.

Models for viewpoint classification were trained on a manually labelled set of echocardiograms from 510 patients, where each examination includes 6-15 videos. The dataset was divided into a training/validation set (75%) and a test set (25%). 23 viewpoint classes were used, and two physicians annotated different subsets of the training/validation set. The test set was annotated separately by both physicians, with an inter-observer accuracy of 91.5%. This inter-observer accuracy was used as a baseline when evaluating the models. Measured against the test-set labels of each of the two physicians, the best-performing viewpoint model reached an accuracy of 89.6% and 87.0%, respectively. The purpose of the viewpoint networks was to identify videos from 13 specific viewpoints in a large dataset consisting of 7.3 TB from 205,000 examinations. These 13 viewpoints are the ones that are useful for EF classification, and the resulting dataset, consisting only of viewpoints relevant to EF, is intended for training a network for EF classification. For these 13 viewpoint classes, the viewpoint model achieved an accuracy of 93.8% and 93.6%.
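The accuracy figures above all compare two sets of labels for the same videos: model against physician, or physician against physician. A minimal sketch of that comparison follows, assuming the metric is the plain fraction of videos on which the two label sets agree. The function name `agreement` and the sample labels are illustrative and do not come from the thesis.

```python
def agreement(labels_a, labels_b):
    """Fraction of items on which two label sequences agree.

    Assumption: both sequences label the same videos in the same order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if not labels_a:
        raise ValueError("need at least one labelled item")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical labels for four test videos (placeholder class names).
physician_1 = ["view_a", "view_b", "view_c", "view_d"]
physician_2 = ["view_a", "view_b", "view_c", "view_a"]
model = ["view_a", "view_b", "view_d", "view_d"]

inter_observer = agreement(physician_1, physician_2)  # baseline
model_vs_1 = agreement(model, physician_1)
model_vs_2 = agreement(model, physician_2)
```

Computing the model's agreement separately against each annotator, as sketched here, gives two accuracy values that can both be compared with the inter-observer baseline.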

The best-performing viewpoint model was a ResNet30. It was evaluated on each frame of the echocardiogram video, and the viewpoint of each video was then decided by majority voting over the per-frame predictions. The 2D model with majority voting was compared to 3D models, i.e. models that replace the majority voting with learnable convolutions over the frames. However, the 2D model outperformed all 3D models, possibly indicating that these 3D models can be improved. The 3D models were also tested on EF classification. The convolutional networks were trained and evaluated on echocardiograms from the same 510 patients for one specific viewpoint. Unfortunately, these models failed to predict the EF. The next step of the project is to improve the 3D models for EF classification by utilising the larger dataset consisting of 7.3 TB from 205,000 examinations and by using several different viewpoints of the echocardiogram as input.
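The majority-voting step described above can be sketched as follows. This is an illustration only, assuming each frame yields a single predicted class label and that ties are broken in favour of the label seen first. The function name `majority_vote` and the placeholder labels are not taken from the thesis.

```python
from collections import Counter


def majority_vote(frame_predictions):
    """Return the most frequent per-frame label for one video.

    Assumption: ties go to the label that appears first in the video.
    """
    if not frame_predictions:
        raise ValueError("need at least one frame prediction")
    # most_common orders equal counts by first occurrence (Python 3.7+).
    return Counter(frame_predictions).most_common(1)[0][0]


# Hypothetical per-frame outputs of a 2D classifier for one video.
frames = ["view_a", "view_a", "view_b", "view_a", "view_c"]
video_label = majority_vote(frames)
```

A 3D model, by contrast, would take the stack of frames as a single input and learn how to combine information across time instead of using a fixed voting rule.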

Keywords: Convolutional neural networks, echocardiogram viewpoint, ejection fraction, cardiac ultrasound images, deep learning.

Category: Student project
Location: Blå rummet (room 3340), Hörsalsvägen 11, floor 3
Time: 2020-02-03 10:00
End time: 2020-02-03 11:00

Published: Thu 09 Jan 2020.