KIT - Institute of Biomedical Engineering - Teaching - Student Projects - Bronchoscopic Video Translation using Generative Adversarial Network

Bronchoscopic Video Translation using Generative Adversarial Network

Type: Master thesis
Tutor: M.Sc. Lu Guo

Motivation

A vision-based bronchoscopic navigation system helps physicians locate the bronchoscope during endobronchial inspection and diagnostic procedures. It applies video-CT registration techniques and works much like a GPS. Among the registration approaches, recovering the 3D geometric structure of the scene through depth estimation from bronchoscopic videos has been shown to be more robust to illumination and texture variations and to preserve the morphological scene information. A significant step in this approach is the development and evaluation of data-driven depth estimation methods, which requires a large number of bronchoscopic videos with corresponding ground-truth depth. Such data is difficult to obtain under real clinical conditions. One way to overcome this obstacle is to use synthetic videos together with their rendered depth maps, provided that the synthetic videos look as realistic as possible.

Fig. Examples of translation from virtual to realistic-looking bronchoscopic images using CycleGAN

Project Description

The goal of this work is to generate realistic-looking bronchoscopic videos using GAN-based methods. The scope of the work includes:

- Literature research on image/video translation

- Generate virtual bronchoscopic videos using virtual bronchoscopy that mimics the behavior of physicians during real bronchoscopic operations

- Translate virtual bronchoscopic videos to realistic-looking ones

With this project, you will try to answer the following questions:

- Which state-of-the-art GAN-based method performs best at translating bronchoscopic videos, given the limitation of “scarce data” and the artefacts typical of bronchoscopic videos?

- What could be improved to enhance the translation performance and to close the gap between bronchoscopic image translation and video translation?

- How can the generated bronchoscopic videos be evaluated in terms of realism and temporal consistency?

- How can the robustness and generalization of the translation be evaluated?
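
To make the temporal-consistency question more concrete, here is a minimal sketch of one crude proxy: compare how much consecutive frames change in the translated video with how much they change in the virtual source video. This is an illustration only, not a metric prescribed by the project. The function names `mean_frame_change` and `flicker_ratio` are hypothetical, and the sketch assumes both videos are NumPy arrays of the same length, normalized to the same intensity range.

```python
import numpy as np


def mean_frame_change(video: np.ndarray) -> float:
    """Mean absolute intensity change between consecutive frames.

    `video` has shape (T, H, W) or (T, H, W, C) with T >= 2.
    (Hypothetical helper, named for this illustration only.)
    """
    video = np.asarray(video, dtype=np.float64)
    if video.shape[0] < 2:
        raise ValueError("need at least two frames")
    # np.diff along axis 0 yields the T-1 frame-to-frame differences.
    return float(np.abs(np.diff(video, axis=0)).mean())


def flicker_ratio(source: np.ndarray, translated: np.ndarray) -> float:
    """Frame-to-frame change of the translated video relative to the source.

    Assumption: both videos share the same intensity range, so a ratio
    well above 1 hints that the translation introduced flicker that the
    camera motion in the source video does not explain.
    """
    source_change = mean_frame_change(source)
    if source_change == 0.0:
        raise ValueError("source video is static; ratio is undefined")
    return mean_frame_change(translated) / source_change


if __name__ == "__main__":
    # Toy data: a source video whose brightness rises by 1 per frame,
    # and a "translated" copy with extra brightness on every other frame.
    source = np.stack([np.full((4, 4), float(t)) for t in range(5)])
    flickering = source.copy()
    flickering[1::2] += 3.0
    print(flicker_ratio(source, source))
    print(flicker_ratio(source, flickering))
```

A ratio close to 1 only means the translated video changes about as much as the source; it says nothing about realism. The proxy also ignores scene motion. Measures that first align consecutive frames, for example with optical flow, are one possible refinement.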

If you are interested or have any questions, please get in touch!