DE GRUYTER Current Directions in Biomedical Engineering 2022;8(1): 133-137

Ramy A. Zeineldin*, Alex Pollok**, Tim Mangliers**, Mohamed E. Karar, Franziska Mathis-Ullrich, and Oliver Burgert

Deep automatic segmentation of brain tumours in interventional ultrasound data

https://doi.org/10.1515/cdbme-2022-0034

*Corresponding author: Ramy A. Zeineldin: Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany, e-mail: email@example.com
Ramy A. Zeineldin, Alex Pollok, Tim Mangliers, Oliver Burgert: Research Group Computer Assisted Medicine, Reutlingen University, Reutlingen, Germany
Ramy A. Zeineldin, Mohamed E. Karar: Faculty of Electronic Engineering (FEE), Menoufia University, Menouf, Egypt
Franziska Mathis-Ullrich: Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Karlsruhe, Germany
**The second and third authors made equal contributions.
Open Access. © 2022 The Author(s), published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0 International License.

Abstract: Intraoperative imaging can assist neurosurgeons in defining brain tumours and other surrounding brain structures. Interventional ultrasound (iUS) is a convenient modality with fast scan times. However, iUS data may suffer from noise and artefacts which limit their interpretation during brain surgery. In this work, we use two deep learning networks, namely UNet and TransUNet, for automatic and accurate segmentation of the brain tumour in iUS data. Experiments were conducted on a dataset of 27 iUS volumes. The results show that combining a transformer with UNet is advantageous, providing efficient segmentation that models long-range dependencies in each iUS image. In particular, the enhanced TransUNet was able to predict cavity segmentation in iUS data with an inference rate of more than 125 FPS. These promising results suggest that deep learning networks can be successfully deployed to assist neurosurgeons in the operating room.

Keywords: Brain tumour, Deep learning, Image-guided neurosurgery, iUS, Segmentation.

1 Introduction

Interventional ultrasound (iUS) imaging offers real-time guidance information about the brain tissues including the brain tumour and surrounding anatomical structures [1, 2]. iUS is frequently used during brain surgery to guide neurosurgeons and to ensure that the tumour is resected completely while keeping other healthy tissues safe. However, the iUS signal can be poorly contrasted, which limits the accurate definition of tumour and healthy brain parenchyma. Further, iUS data have a limited field of view and may contain artefacts. All these issues make interpreting iUS data challenging and highly dependent on the surgeon's experience.

In fact, manual delineation is robust against noise; however, it is a time-consuming and error-prone process that cannot be performed in the operating room due to the human-machine interaction constraints in sterile environments. Alternatively, automatic segmentation methods can be used to enhance the visualization of the brain tumour and the risk structures intraoperatively. In recent years, deep learning-based approaches, especially convolutional neural networks (CNNs), have achieved tremendous success in medical segmentation tasks [3, 4].

Few studies in the literature have addressed the introduction of deep learning into automatic structure segmentation in iUS data. For instance, Canalini et al. [5] focused on the segmentation of salient structures, namely sulci and falx cerebri, for guiding the registration of US volumes acquired before and after resection. The authors in [6] proposed an enhanced method based on the segmentation of the resection cavity using a 3D CNN as a prior step to register corresponding iUS images. Similarly, Carton et al. [7] proposed an automatic method for the segmentation of low-grade brain tumours in iUS images using two UNet-based models.

In this work, we apply deep learning to the problem of automatic brain tumour segmentation in iUS data during neurosurgery. To do this, we use two CNN models, UNet [8] and TransUNet [9]. First, UNet [8] is used as the baseline CNN due to its impressive performance and popularity in medical image segmentation tasks. The main limitation of UNet is its inability to learn long-range dependencies, since each convolutional layer operates on only a local subset of the input image, making the network focus on local features instead of the global context. On the other hand, TransUNet [9] was proposed to tackle this problem by using the self-attention mechanism that encodes the dependency between all given input pixels. Further, we evaluate the performance of the two networks on the RESECT dataset and compare the results with the state-of-the-art methods.

2 Materials and Methods

2.1 Data and pre-processing

We used a publicly available dataset containing iUS images obtained at two different stages of tumour resection. Manual ground-truth annotations for resection cavity segmentation were provided separately by two experts who annotated the volumes using the MEVIS Draw tool (https://www.mevis.fraunhofer.de/en/research-and-technologies/image-and-data-analysis/mevis-draw.html). The dataset contains 27 3D iUS scan volumes for low-grade glioma patients obtained during and after tumour resection. Each iUS volume was acquired using the Sonowand Invite System with a frequency range of 6–12 MHz and with a voxel spatial resolution of 0.14 × 0.14 × 0.14 mm to 0.24 × 0.24 × 0.24 mm.

In order to obtain suitable 2D data for our models, we sliced the manual segmentations along all three axes. This procedure results in a total of 28057 data points, of which 8746 contain resection cavities. To achieve a uniform image size, we resized all images to 256 × 256 pixels. Then, the resultant images were normalized by subtracting the mean value and dividing by the standard deviation for each image. Training our models with the whole dataset did not yield good results due to the high proportion of data points without resection cavities. Therefore, we limited our experiments to the subset containing only resection cavities.

2.2 UNet

The baseline model is an encoder-decoder CNN based on UNet [8], with the architectural properties described in the following. The encoder consists of four 3 × 3 convolutional blocks with an output of 16, 32, 64, and 128 feature channels, respectively. All convolutions use a stride of 1 × 1 and zero-padding of the input. Each convolutional block is followed by batch normalization, Rectified Linear Unit (ReLU) activation, and a 2 × 2 max-pooling layer. Skip connections pass the ReLU output of each convolution block to its counterpart in the decoder. The maximum number of feature channels was fixed to 128.

The decoder consists of four convolutional blocks, each being preceded by a 2 × 2 up-sampling convolutional block and a concatenation block which includes the data of the corresponding encoder blocks. The feature channels are reduced by half after each convolution block. In the last convolutional block, there is an additional convolution that reduces the feature channels to one with a stride of one in each dimension. Finally, the output layer consists of a sigmoid activation function after the 1 × 1 convolution.

2.3 TransUNet

Figure 1 displays the second deep learning model, which is designed after the TransUNet [9] architecture. TransUNet consists of a CNN-Transformer hybrid model as the encoder and a classical cascaded upsampler as the decoder. Transformers come from the field of natural language processing (NLP) and are built upon stacked self-attention mechanisms. In the hybrid encoder, patch embedding is applied to the extracted feature map instead of raw images. The cascaded upsampler is applied to decode the hidden feature to enable precise localization of the segmentation output.

Figure 1: An overview of the customized TransUNet architecture with a detailed representation of the transformer layer on the left.

We made some modifications to the original TransUNet as follows. The number of feature channels for the encoding levels is 16, 32, 48, and 64, respectively. The number of stacked convolutional layers was set to one for both downsampling and upsampling levels. Similarly, the number of transformer blocks and attention heads was set to one. Furthermore, the transformer multi-layer perceptrons (MLPs) each use a Gaussian Error Linear Unit (GELU) [11]. Elsewhere, a ReLU was used after each convolution layer, together with batch normalization and a sigmoid output activation function. Finally, downsampling was performed through max-pooling and upsampling with bilinear interpolation.

3 Experiments

3.1 Experimental setup

For all experiments, the deep learning models were implemented with TensorFlow and Keras and trained on an Nvidia RTX 3060 graphics card. The dataset was randomly split image-wise in a ratio of 80:10:10 into training, validation, and testing subsets, respectively. To optimize the hyperparameters (learning rate, batch size, and whether augmentations should be used), a manual grid search was performed. In total, the UNet model has 1,227,521 trainable parameters while the TransUNet model has 8,135,297 trainable parameters. As an evaluation metric, the Dice coefficient (Dice) was used for our quantitative assessment.

3.2 Results

Figure 2 visualizes the segmentation results from our two deep learning models: UNet and TransUNet. It can be seen that UNet tends to provide resection cavities with smoother edges than in the ground truth segmentations. On the other hand, TransUNet performs better than UNet, making better use of long-range dependencies with sharper segmentation edges. Examples can be seen in rows 1, 2, and 5, where UNet misses smaller cavities while TransUNet can detect the long-range dependencies well. An exception can be seen in row 3, where TransUNet identified part of the resection cavity as part of the background.

Figure 2: Qualitative results of the deep learning networks on the test dataset. The red box highlights regions where TransUNet performs better than UNet (volumes 4, 30, 460, and 460) or attempts to segment non-existing small regions (volume 317).

Table 1 summarizes the quantitative results achieved by using the proposed segmentation approaches. The average of the Dice score over all dataset volumes is calculated and provided in the last column, since the training, validation, and test splits were selected randomly from the whole dataset. In general, the quantitative evaluation supports the visual observations. Both networks, UNet and TransUNet, were able to outline the brain tumour in iUS precisely with average Dice scores of 93.50 and 93.70, respectively.

Table 1: Quantitative comparison of our best models and other models at different stages. Segmentation performance measured with Dice.

Method                   Training   Validation   Test    Average
UNet                     93.79      93.63        93.09   93.50
TransUNet                94.72      93.14        93.26   93.70
Canalini et al. [6]      -          -            75.00   84.00
Carton et al. (2D) [7]   -          -            67.00   67.00
Carton et al. (3D) [7]   -          -            72.00   72.00

Moreover, the utilized deep neural networks were compared with the approaches in [6] and [7]. In [6], Canalini et al. proposed a segmentation method as a prior step for the registration of iUS data during neurosurgery. Furthermore, they provided a manually annotated version of the RESECT dataset containing 27 iUS volumes acquired during and after brain tumour resection. An average Dice of 0.84 over the 27 volumes, including the training and validation volumes, was reported. Another work by Carton et al. [7] also used the RESECT dataset to train and evaluate their networks. 2D and 3D UNets were trained with manual segmentations based on 17 volumes before resection. They concluded that the 3D model performed better than the 2D model due to more contextual information. However, their 2D model was faster and generalized well to the unseen test dataset (refer to Table 1).

It is worth noting that TransUNet performed a prediction on 55 batches of 16 images each in 6.94 s, resulting in ~8 ms per image. The model loaded in 0.50 s and the results were written to the hard drive in 1 min 2.5 s. The inference time for an average volume with around 300 slices is, therefore, ~3 s, not including the time needed to process the volume before and after it is used in the model.

3.3 Discussion

These findings support the concept that employing a Transformer as a feature extractor enables precise localization of resection cavities. In the case of UNet, feature representations of the image in the encoder are generated by convolutions. Later, they are decoded back to the full spatial resolution by the decoder. Convolution operations are intrinsically local and UNet, therefore, has problems accounting for long-range dependencies in the data. Transformers, on the other hand, with their global self-attention mechanism, can consider the global context. Hence, TransUNet combines the advantages of both techniques, which means that it can consider both local and global contexts in signals.

Quantitatively, it can be observed that our developed networks outperform the state-of-the-art approaches with average Dice coefficients of 93.50 and 93.70 versus 84.00 achieved by Canalini et al. [6]. Remarkably, the enhanced TransUNet model can run inference at more than 125 FPS using a modern GPU, which allows its use in interventional settings assisting brain surgery.

Furthermore, our architectural modifications result in a much smaller model with fewer parameters than the original TransUNet architecture. This choice was motivated by our aim of achieving a short inference time to check whether the method could be suitable for real-time application. One question still unanswered is whether this affects the inference time in 3D, which can be explored in future work.

4 Conclusion

In this study, we investigated the use of two deep learning-based methods, UNet and TransUNet, to automatically segment the resection cavity in iUS volumes for neurosurgical assistance. Quantitative and qualitative results indicate that both networks were able to correctly segment the brain tumour in the iUS images. TransUNet provided slightly better performance than UNet with a mean Dice of 93.70. In particular, using transformers with UNet successfully improves the performance for brain tumour segmentation over standard CNNs, indicating its potential for use in neurosurgical guidance. Nevertheless, the training dataset plays an important role in these results; a larger number of segmented tumours in iUS data is needed to confirm them.

Future work would include testing on an iUS dataset from our clinical partners at Ulm University Hospital. In addition, the proposed models will be evaluated using other assessment parameters such as the contour mean distance (CMD).

Author Statement

Research funding: The corresponding author is funded by the German Academic Exchange Service (DAAD) (No. 91705803). Conflict of interest: Authors state no conflict of interest. Informed consent: The patient data included in this article are from a publicly available dataset. Ethical approval: This article does not contain any studies with human participants or animals performed by the authors.

References

[1] Miner RC. Image-Guided Neurosurgery. J Med Imaging Radiat Sci. 2017;48:328-35.
[2] Bastos DCA, Juvekar P, Tie Y, Jowkar N, Pieper S, Wells WM, et al. Challenges and Opportunities of Intraoperative 3D Ultrasound With Neuronavigation in Relation to Intraoperative MRI. Front Oncol. 2021;11:656519.
[3] Zeineldin RA, Karar ME, Coburger J, Wirtz CR, Burgert O. DeepSeg: deep neural network framework for automatic brain tumor segmentation using magnetic resonance FLAIR images. Int J Comput Assist Radiol Surg. 2020;15:909-20.
[4] Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods. 2021;18:203-11.
[5] Canalini L, Klein J, Miller D, Kikinis R. Segmentation-based registration of ultrasound volumes for glioma resection in image-guided neurosurgery. Int J Comput Assist Radiol Surg. 2019;14:1697-713.
[6] Canalini L, Klein J, Miller D, Kikinis R. Enhanced registration of ultrasound volumes by segmentation of resection cavity in neurosurgical procedures. Int J Comput Assist Radiol Surg. 2020;15:1963-74.
[7] Carton FX, Chabanas M, Le Lann F, Noble JH. Automatic segmentation of brain tumor resections in intraoperative ultrasound images using U-Net. J Med Imaging (Bellingham). 2020;7:031503.
[8] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. 2015. p. 234-41.
[9] Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, et al. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306. 2021.
[10] Xiao Y, Fortin M, Unsgard G, Rivaz H, Reinertsen I. REtroSpective Evaluation of Cerebral Tumors (RESECT): A clinical database of pre-operative MRI and intra-operative ultrasound in low-grade glioma surgeries. Med Phys. 2017;44:3875-82.
[11] Hendrycks D, Gimpel K. Gaussian error linear units (GELUs). arXiv preprint arXiv:1606.08415. 2016.
[12] Munkvold BKR, Bo HK, Jakola AS, Reinertsen I, Berntsen EM, Unsgard G, et al. Tumor Volume Assessment in Low-Grade Gliomas: A Comparison of Preoperative Magnetic Resonance Imaging to Coregistered Intraoperative 3-Dimensional Ultrasound Recordings. Neurosurgery. 2018;83:288-96.
[13] Ilunga-Mbuyamba E, Avina-Cervantes JG, Lindner D, Arlt F, Ituna-Yudonago JF, Chalopin C. Patient-specific model-based segmentation of brain tumors in 3D intraoperative ultrasound images. Int J Comput Assist Radiol Surg. 2018;13:331-42.
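The pre-processing described in Section 2.1 (slicing each volume along all three axes, keeping only slices that contain a resection cavity, and normalizing each image by its own mean and standard deviation) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the nested-list data layout, and the toy 2 × 2 × 2 volume are assumptions made for illustration, and the resizing step to 256 × 256 pixels is omitted.

```python
from statistics import fmean, pstdev


def slices_along_all_axes(volume):
    """Return all 2D slices of a 3D nested list, taken along axes 0, 1 and 2."""
    nx, ny, nz = len(volume), len(volume[0]), len(volume[0][0])
    axis0 = [[[volume[i][j][k] for k in range(nz)] for j in range(ny)] for i in range(nx)]
    axis1 = [[[volume[i][j][k] for k in range(nz)] for i in range(nx)] for j in range(ny)]
    axis2 = [[[volume[i][j][k] for j in range(ny)] for i in range(nx)] for k in range(nz)]
    return axis0 + axis1 + axis2


def contains_cavity(mask_slice):
    """True if at least one pixel of the annotation slice is labelled as cavity."""
    return any(value > 0 for row in mask_slice for value in row)


def normalize(image):
    """Per-image normalization: subtract the mean, divide by the standard deviation."""
    flat = [value for row in image for value in row]
    mean, std = fmean(flat), pstdev(flat)
    std = std or 1.0  # assumption: leave constant images unscaled to avoid division by zero
    return [[(value - mean) / std for value in row] for row in image]


def preprocess(volume, mask):
    """Pair image and mask slices, keep slices with a cavity, normalize the images."""
    pairs = zip(slices_along_all_axes(volume), slices_along_all_axes(mask))
    return [(normalize(img), msk) for img, msk in pairs if contains_cavity(msk)]


# Hypothetical toy data: one cavity voxel at position (1, 1, 1).
toy_volume = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
toy_mask = [[[0, 0], [0, 0]], [[0, 0], [0, 1]]]
samples = preprocess(toy_volume, toy_mask)
print(len(slices_along_all_axes(toy_volume)), len(samples))  # 6 3
```

In the toy example only three of the six slices pass through the labelled voxel, which mirrors on a small scale the reported reduction from 28057 to 8746 data points.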
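The Dice coefficient used as the evaluation metric in Section 3.1 is commonly defined as twice the overlap between prediction and ground truth divided by the sum of their sizes. The sketch below assumes a sigmoid output binarized at 0.5 and a convention of returning 1.0 when both masks are empty; neither choice is stated in the article, and the function name is hypothetical.

```python
def dice_coefficient(prediction, truth, threshold=0.5):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for two 2D masks given as nested lists."""
    # Assumption: sigmoid outputs are binarized at `threshold`.
    pred = [value >= threshold for row in prediction for value in row]
    true = [value > 0 for row in truth for value in row]
    intersection = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * intersection / total


# Hypothetical 2 x 2 example: three predicted pixels, two true pixels, two shared.
predicted = [[0.9, 0.8], [0.1, 0.6]]
ground_truth = [[1, 1], [0, 0]]
print(dice_coefficient(predicted, ground_truth))  # 0.8
```

Multiplying the result by 100 gives values on the scale used in Table 1.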
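The timing figures in Section 3.2 can be checked with a few lines of arithmetic. The input numbers (55 batches of 16 images in 6.94 s, about 300 slices per volume, 0.50 s model loading time) are taken from the text; the variable names are illustrative.

```python
# Figures reported in Section 3.2.
batches = 55
batch_size = 16
total_seconds = 6.94
slices_per_volume = 300
model_load_seconds = 0.50

images = batches * batch_size                    # 880 images in total
seconds_per_image = total_seconds / images       # about 7.9 ms, quoted as ~8 ms
frames_per_second = 1.0 / seconds_per_image      # about 127 FPS, i.e. more than 125 FPS
volume_seconds = slices_per_volume * seconds_per_image  # about 2.4 s of pure inference

print(f"{seconds_per_image * 1000:.1f} ms per image")
print(f"{frames_per_second:.1f} frames per second")
print(f"{volume_seconds:.2f} s per volume, {volume_seconds + model_load_seconds:.2f} s with model loading")
```

The per-volume estimate of roughly 2.4 s (about 2.9 s if model loading is counted) is of the same order as the ~3 s quoted in the text; which rounding the authors applied is not stated.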
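The contrast drawn in Section 3.3 between local convolutions and global self-attention can be made concrete with a minimal single-head, scaled dot-product self-attention sketch. It is a simplification and not the TransUNet implementation: the learned query, key, and value projections are replaced by the identity, and the three 2-dimensional patch tokens are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); real transformer layers learn these projections.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in tokens]
        row = softmax(scores)
        weights.append(row)
        outputs.append([sum(w * value[c] for w, value in zip(row, tokens)) for c in range(dim)])
    return outputs, weights


# Hypothetical patch embeddings: three tokens with two features each.
patch_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = self_attention(patch_tokens)
for row in attention_weights:
    print([round(w, 3) for w in row])
```

Every token receives a non-zero weight from every other token, regardless of spatial distance, which is the global context that a stack of 3 × 3 convolutions only reaches gradually.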
Current Directions in Biomedical Engineering – de Gruyter
Published: Jul 1, 2022
Keywords: Brain tumour; Deep learning; Image-guided neurosurgery; iUS; Segmentation