Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T., & Ronneberger, O. (2016). 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 424–432.
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Dodge, J., Gururangan, S., Card, D., Schwartz, R., & Smith, N. A. (2019). Show your work: Improved reporting of experimental results. arXiv preprint arXiv:1909.03004.
Tait, S., & Green, D. (2012). Mitochondria and cell signalling. Journal of Cell Science, 125.
Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256.
Schlemper, J., Oktay, O., Schaap, M., Heinrich, M., Kainz, B., Glocker, B., & Rueckert, D. (2018). Attention gated networks: Learning to leverage salient regions in medical images. Medical Image Analysis, 53.
Milletari, F., Navab, N., & Ahmadi, S.-A. (2016). V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In 2016 Fourth International Conference on 3D Vision (3DV), IEEE, pp. 565–571.
Chaurasia, A., & Culurciello, E. (2017). LinkNet: Exploiting encoder representations for efficient semantic segmentation. In 2017 IEEE Visual Communications and Image Processing (VCIP), IEEE, pp. 1–4.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826.
Cheng, H.-C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE International Conference on Image Processing (ICIP), IEEE, pp. 590–594.
Dai, J., Li, Y., He, K., & Sun, J. (2016). R-FCN: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems, pp. 379–387.
Roy, A. G., Navab, N., & Wachinger, C. (2018). Concurrent spatial and channel squeeze & excitation in fully convolutional networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 421–429.
Minaee, S., Boykov, Y. Y., Porikli, F., Plaza, A. J., Kehtarnavaz, N., & Terzopoulos, D. (2021). Image segmentation using deep learning: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Ibtehaz, N., & Rahman, M. (2019). MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Networks, 121.
Poole, A., Thomas, R., Andrews, L., McBride, H., Whitworth, A., & Pallanck, L. (2008). The PINK1/Parkin pathway regulates mitochondrial morphology. Proceedings of the National Academy of Sciences, 105.
Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440.
Moen, E., Bannon, D., Kudo, T., Graf, W., Covert, M., & Van Valen, D. (2019). Deep learning for cellular image analysis. Nature Methods, 16.
Lucchi, A., Li, Y., & Fua, P. (2013). Learning for structured prediction using approximate subgradient descent with working sets. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1987–1994.
Noh, H., Hong, S., & Han, B. (2015). Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1520–1528.
Jégou, S., Drozdzal, M., Vazquez, D., Romero, A., & Bengio, Y. (2017). The one hundred layers tiramisu: Fully convolutional DenseNets for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 11–19.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.
Meyer, F. (1994). Topographic distance and watershed lines. Signal Processing, 38.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Identity mappings in deep residual networks. In European Conference on Computer Vision, Springer, pp. 630–645.
Lucchi, A., Li, Y., Smith, K., & Fua, P. (2012). Structured image segmentation using kernelized features. In European Conference on Computer Vision, Springer, pp. 400–413.
Oztel, I., Yolcu, G., Ersoy, I., White, T., & Bunyak, F. (2017). Mitochondria segmentation in electron microscopy volumes using deep convolutional neural network. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), IEEE, pp. 1195–1200.
Zhou, Z., Siddiquee, M. M. R., Tajbakhsh, N., & Liang, J. (2018). UNet++: A nested U-Net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer, pp. 3–11.
Casser, V., Kang, K., Pfister, H., & Haehn, D. (2020). Fast mitochondria detection for connectomics. In Medical Imaging with Deep Learning.
Wolf, S., Pape, C., Bailoni, A., Rahaman, N., Kreshuk, A., Köthe, U., & Hamprecht, F. A. (2018). The mutex watershed: Efficient, parameter-free image partitioning. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 546–562.
Wallace, D. (2012). Mitochondria and cancer. Nature Reviews Cancer, 12.
Lucchi, A., Becker, C., Neila, P. M., & Fua, P. (2014). Exploiting enclosing membranes and contextual cues for mitochondria segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 65–72.
Roy, A. G., Navab, N., & Wachinger, C. (2018). Recalibrating fully convolutional networks with spatial and channel "squeeze and excitation" blocks. IEEE Transactions on Medical Imaging, 38.
Wei, D., Lin, Z., Franco-Barranco, D., Wendt, N., Liu, X., Yin, W., Huang, X., Gupta, A., Jang, W.-D., Wang, X., Arganda-Carreras, I., Lichtman, J., & Pfister, H. (2020). MitoEM dataset: Large-scale 3D mitochondria instance segmentation from EM images. In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Springer, 12265.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9.
Liu, J., Li, W., Xiao, C., Hong, B., Xie, Q., & Han, H. (2018). Automatic detection and segmentation of mitochondria from SEM images using deep neural network. In 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, pp. 628–631.
Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141.
Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. arXiv preprint arXiv:1505.04597.
Jin, Q., Meng, Z.-P., Pham, T., Chen, Q., Wei, L., & Su, R. (2018). DUNet: A deformable network for retinal vessel segmentation. Knowledge-Based Systems, 178.
Litjens, G., Kooi, T., Bejnordi, B., Setio, A., Ciompi, F., Ghafoorian, M., van der Laak, J., van Ginneken, B., & Sánchez, C. (2017). A survey on deep learning in medical image analysis. Medical Image Analysis, 42.
Bello, I., Fedus, W., Du, X., Cubuk, E. D., Srinivas, A., Lin, T.-Y., Shlens, J., & Zoph, B. (2021). Revisiting ResNets: Improved training and scaling strategies. arXiv preprint arXiv:2103.07579.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034.
Gu, Z., Cheng, J., Fu, H., Zhou, K., Hao, H., Zhao, Y., Zhang, T., Gao, S., & Liu, J. (2019). CE-Net: Context encoder network for 2D medical image segmentation. IEEE Transactions on Medical Imaging, 38.
Zhuang, J. (2018). LadderNet: Multi-path networks based on U-Net for medical image segmentation. arXiv preprint arXiv:1810.07810.
Isensee, F., Jaeger, P., Kohl, S., Petersen, J., & Maier-Hein, K. (2020). nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nature Methods, 18.
Buhmann, J., Krause, R., Lentini, R., Eckstein, N., Cook, M., Turaga, S., & Funke, J. (2018). Synaptic partner prediction from point annotations in insect brains.
Fu, H., Cheng, J., Xu, Y., Wong, D., Liu, J., & Cao, X. (2018). Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Transactions on Medical Imaging, 37.
Haque, I., & Neubert, J. (2020). Deep learning approaches to biomedical image segmentation. Informatics in Medicine Unlocked, 18.
He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969.
Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2012). Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging, 31.
Fulda, S., Galluzzi, L., & Kroemer, G. (2010). Targeting mitochondria for cancer therapy. Nature Reviews Drug Discovery, 9.
Aurélien Lucchi, Pablo Márquez-Neila, C. Becker, Yunpeng Li, Kevin Smith, G. Knott, P. Fua (2015)
Learning Structured Models for Segmentation of 2-D and 3-D ImageryIEEE Transactions on Medical Imaging, 34
(Bello, I., Fedus, W., Du, X., Cubuk, E. D., Srinivas, A., Lin, T.-Y., Shlens, J., & Zoph, B. (2021). Revisiting ResNets: Improved Training and Scaling Strategies. arXiv preprint arXiv:2103.07579)
Bello, I., Fedus, W., Du, X., Cubuk, E. D., Srinivas, A., Lin, T.-Y., Shlens, J., & Zoph, B. (2021). Revisiting ResNets: Improved Training and Scaling Strategies. arXiv preprint arXiv:2103.07579Bello, I., Fedus, W., Du, X., Cubuk, E. D., Srinivas, A., Lin, T.-Y., Shlens, J., & Zoph, B. (2021). Revisiting ResNets: Improved Training and Scaling Strategies. arXiv preprint arXiv:2103.07579, Bello, I., Fedus, W., Du, X., Cubuk, E. D., Srinivas, A., Lin, T.-Y., Shlens, J., & Zoph, B. (2021). Revisiting ResNets: Improved Training and Scaling Strategies. arXiv preprint arXiv:2103.07579
Meijering, E. (2020). A bird's-eye view of deep learning in bioimage analysis. Computational and Structural Biotechnology Journal, 18.
Minaee, S., Boykov, Y., Porikli, F., Plaza, A., Kehtarnavaz, N., & Terzopoulos, D. (2021). Image segmentation using deep learning: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44.
Hu, J., Shen, L., Albanie, S., Sun, G., & Wu, E. (2018). Squeeze-and-excitation networks. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Kasthuri, N., Hayworth, K., Berger, D., Schalek, R., Conchello, J., Knowles-Barley, S., Lee, D., Vázquez-Reina, A., Kaynig, V., Jones, T., Roberts, M., Morgan, J., Tapia, J., Seung, H., Roncal, W., Vogelstein, J., Burns, R., Sussman, D., Priebe, C., Pfister, H., & Lichtman, J. (2015). Saturated reconstruction of a volume of neocortex. Cell, 162.
Wei, D., Lin, Z., Franco-Barranco, D., Wendt, N., Liu, X., Yin, W., Huang, X., Gupta, A., Jang, W.-D., Wang, X., et al. (2020). MitoEM dataset: Large-scale 3D mitochondria instance segmentation from EM images. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 66–76.
Roy, A. G., Navab, N., & Wachinger, C. (2018). Recalibrating fully convolutional networks with spatial and channel "squeeze and excitation" blocks. IEEE Transactions on Medical Imaging, 38(2), 540–549.
Xiao, C., Chen, X., Li, W., Li, L., Wang, L., Xie, Q., & Han, H. (2018). Automatic mitochondria segmentation for EM data using a 3D supervised convolutional network. Frontiers in Neuroanatomy, 12.
Zhuang, J. (2018). LadderNet: Multi-path networks based on U-Net for medical image segmentation. arXiv preprint arXiv:1810.07810.
Moura, M., Santos, L., & Van Houten, B. (2010). Mitochondrial dysfunction in neurodegenerative diseases and cancer. Environmental and Molecular Mutagenesis, 51.
Buhmann, J., Krause, R., Lentini, R. C., Eckstein, N., Cook, M., Turaga, S., & Funke, J. (2018). Synaptic partner prediction from point annotations in insect brains. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 309–316.
He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN.
Dodge, J., Gururangan, S., Card, D., Schwartz, R., & Smith, N. A. (2019). Show your work: Improved reporting of experimental results. arXiv preprint arXiv:1909.03004.
Arganda-Carreras, I., Turaga, S., Berger, D., Ciresan, D., Giusti, A., Gambardella, L., Schmidhuber, J., Laptev, D., Dwivedi, S., Buhmann, J., Liu, T., Seyedhosseini, M., Tasdizen, T., Kamentsky, L., Burget, R., Uher, V., Tan, X., Sun, C., Pham, T., Bas, E., Uzunbas, M., Cardona, A., Schindelin, J., & Seung, H. (2015). Crowdsourcing the creation of image segmentation algorithms for connectomics. Frontiers in Neuroanatomy, 9.
Badrinarayanan, V., Kendall, A., & Cipolla, R. (2017). SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the Inception architecture for computer vision. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Noh, H., Hong, S., & Han, B. (2015). Learning deconvolution network for semantic segmentation. In 2015 IEEE International Conference on Computer Vision (ICCV).
Legland, D., Arganda-Carreras, I., & Andrey, P. (2016). MorphoLibJ: Integrated library and plugins for mathematical morphology with ImageJ. Bioinformatics, 32(22).
Garcia-Garcia, A., Orts, S., Oprea, S., Villena-Martinez, V., Martinez-Gonzalez, P., & Rodríguez, J. (2018). A survey on deep learning techniques for image and video semantic segmentation. Applied Soft Computing, 70.
Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60.
Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 234–241.
Neuroinformatics (2022) 20:437–450

Correspondence: Daniel Franco-Barranco (daniel_franco001@ehu.eus), Arrate Muñoz-Barrutia (mamunozb@ing.uc3m.es), Ignacio Arganda-Carreras (ignacio.arganda@ehu.eus). Affiliations: Donostia International Physics Center (DIPC), Donostia-San Sebastián, Spain; Department of Computer Science and Artificial Intelligence, University of the Basque Country (UPV/EHU), Donostia-San Sebastián, Spain; Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Investigación Sanitaria Gregorio Marañón, Madrid, Spain; Ikerbasque, Basque Foundation for Science, Bilbao, Spain.

Abstract: Electron microscopy (EM) allows the identification of intracellular organelles such as mitochondria, providing insights for clinical and scientific studies. In recent years, a number of novel deep learning architectures have been published reporting superior performance, or even human-level accuracy, compared to previous approaches on public mitochondria segmentation datasets. Unfortunately, many of these publications make neither the code nor the full training details public, leading to reproducibility issues and dubious model comparisons. Thus, following a recent code of best practices in the field, we present an extensive study of the state-of-the-art architectures and compare them to different variations of U-Net-like models for this task. To unveil the impact of architectural novelties, a common set of pre- and post-processing operations has been implemented and tested with each approach. Moreover, an exhaustive sweep of hyperparameters has been performed, running each configuration multiple times to measure their stability. Using this methodology, we found very stable architectures and training configurations that consistently obtain state-of-the-art results on the well-known EPFL Hippocampus mitochondria segmentation dataset and outperform all previous works on two other available datasets: Lucchi++ and Kasthuri++. The code and its documentation are publicly available at https://github.com/danifranco/EM_Image_Segmentation.

Keywords: Electron microscopy · Mitochondria · Semantic segmentation · Deep learning · Bioimage analysis

Introduction

Recent imaging methods in electron microscopy (EM) allow scientists to identify subcellular organelles such as vesicles or mitochondria with nano-scale precision. Mitochondria play an important role in some crucial functions in the cell, such as energy production, signaling, differentiation, cell growth and death (Tait & Green, 2012). For that reason, the automated and accurate segmentation of mitochondria is especially relevant not only for basic research in neuroscience but also for clinical studies, since their number and morphology are related to severe diseases such as cancer (De Moura et al., 2010; Fulda et al., 2010; Wallace, 2012), Parkinson's disease (Poole et al., 2008) or Alzheimer's disease (De Moura et al., 2010).

In the past decade, advances in computer vision, especially those based on deep learning (DL), have helped scientists to automatically quantify the size and morphology of cells and organelles in microscopy images (Moen et al., 2019; Meijering, 2020). However, with an increasing number of DL-based bioimage segmentation publications every year, there is a lack of benchmarks for different image modalities and segmentation problems that would allow comparing state-of-the-art methods under the same conditions. Moreover, DL methods are usually too data-specialized, making it difficult to identify those approaches that perform well on datasets different from those they have been tested on (Isensee et al., 2021). On top of that, many of such approaches are published without their supporting code and image data, leading to major reproducibility and reliability problems. Such issues have not gone unnoticed: they have become the main target of recently proposed challenges (https://paperswithcode.com/rc2020) in which the machine learning community aims at reproducing the computational experiments and verifying the empirical results already published at top venues.

As pointed out by recent works (Bello et al., 2021; Isensee et al., 2021), while many publications insist on presenting architectural novelties, the overall performance of a network depends substantially on its corresponding pre-processing, training, inference and post-processing strategies. Even though such choices play a critical role in the final results, they very often tend to be omitted from the method descriptions and from comparisons with competing approaches. Another issue inherent to the use of deep learning architectures (and frequently not discussed in publications) is the sometimes non-negligible variability of the results produced by different executions of the same architecture and training configuration. Despite programmatically setting all initial random seeds, the non-deterministic nature of graphical processing units (GPUs) introduces variations from execution to execution, resulting in slightly different performances. This variability is usually not taken into account when presenting results, although it could be crucial to select models, training, and inference strategies that repeatedly lead to stable results.

In the particular task of mitochondria segmentation, the de facto benchmark dataset adopted by the community is the EPFL Hippocampus dataset (Lucchi et al., 2011), hereafter referred to as the Lucchi dataset. Published in 2011, it contains two image volumes (training and test) of the same size, and their respective semantic segmentation labels are both public. As the reference in the field for a decade, many methods have been published proposing solutions for this dataset. Unfortunately, most of them suffer from the aforementioned problems, forcing other scientists to code their own versions of the published algorithms, often knowing too few details about the original implementations, training, and inference methodologies.

To address these deficiencies in the field, we first re-implemented the top-performing DL architectures for the Lucchi dataset following the descriptions of their original publications. After our own modifications, an extensive hyperparameter search, and multiple runs of the same configuration, some of these methods occasionally achieved their claimed results. Next, we compared the performance of state-of-the-art biomedical semantic segmentation architectures on the same dataset, evaluated under the same training and inference framework. In particular, we focused on the stability of the resulting metric values after several executions of the same configuration and scrutinized the impact of different popular post-processing and output reconstruction methods. Finally, based on our findings, we propose light encoder-decoder architectures that consistently lead to robust state-of-the-art results on Lucchi as well as on other public mitochondria segmentation datasets.

In brief, our main contributions are as follows:

1. We performed a thorough study on the reproducibility and stability of the top-performing DL segmentation methods published for the Lucchi dataset, exposing major issues in consistently achieving their claimed results.
2. We made a comprehensive comparison of the performance of the most popular deep learning architectures for biomedical segmentation using the Lucchi dataset, and show their stability under the same training and post-processing conditions.
3. We propose different variations of light-weight encoder-decoder architectures, together with a training/inference workflow, that lead to stable and robust results across mitochondria segmentation datasets.
Related Work

In the last decade, DL approaches have become dominant in the most common target applications of computer vision (Garcia-Garcia et al., 2018; Minaee et al., 2021), including semantic segmentation of biomedical images (Haque & Neubert, 2020; Litjens et al., 2017). Semantic segmentation aims at associating each pixel in an image with a class label. The first steps towards solving this problem using DL were taken by means of fully convolutional networks (FCNs) (Long et al., 2015). More specifically, fully connected layers were replaced by convolutional layers in some classic networks (Krizhevsky et al., 2012; Simonyan & Zisserman, 2014; Szegedy et al., 2015), and information from intermediate layers was fused to upsample the feature maps encoded by the network, producing a pixel-wise classification. This idea of encoding the image through a convolutional neural network (CNN), outputting a vector feature map (also called bottleneck), and recovering its original spatial shape in a decoding path was further extended in subsequent works (Noh et al., 2015; Ronneberger et al., 2015; Milletari et al., 2016; Jégou et al., 2017; Badrinarayanan et al., 2017; Chaurasia & Culurciello, 2017). A major breakthrough was the U-Net (Ronneberger et al., 2015), which extended the encoding and decoding idea with an upsampling path of up-convolutions after the bottleneck to recover the original image size. In addition, the authors proposed skip connections between the contracting and the expanding path, allowing the upsampling path to recover fine-grained details. The U-Net is the baseline of numerous approaches due to its success in multiple biomedical applications (Zhou et al., 2018; Schlemper et al., 2019; Roy et al., 2018; Arganda-Carreras et al., 2015; Gu et al., 2019; Buhmann et al., 2018; Ibtehaz & Rahman, 2020; Zhuang, 2018; Jin et al., 2019).
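The contracting/expanding mechanics described above can be traced with a small sketch. The function below is purely illustrative (a toy shape calculator assuming the common convention that pooling halves the spatial resolution while the number of filters doubles at each level, e.g. a four-level network with 16 initial filters); it shows why each skip connection meets the decoder at a matching resolution:

```python
def unet_shapes(input_size=256, base_filters=16, depth=4):
    """Trace (spatial size, channels) through a symmetric U-Net encoder/decoder."""
    encoder = []
    size, ch = input_size, base_filters
    for _ in range(depth):
        encoder.append((size, ch))        # features kept for the skip connection
        size, ch = size // 2, ch * 2      # 2x2 max-pooling halves size, filters double
    bottleneck = (size, ch)
    decoder = []
    for skip_size, skip_ch in reversed(encoder):
        size, ch = size * 2, ch // 2      # up-convolution recovers resolution
        assert size == skip_size          # skip features concatenate at the same scale
        decoder.append((size, ch + skip_ch))  # channels after concatenation
    return encoder, bottleneck, decoder
```

For a 256-pixel input this yields a 16 × 16 bottleneck with 256 channels, and the last decoder level returns to the full 256-pixel input resolution, which is what makes pixel-wise classification at the original image size possible.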
In the specific case of mitochondria segmentation, early works attempting to segment the Lucchi dataset (Lucchi et al., 2011) leveraged traditional image processing and machine learning techniques (Lucchi et al., 2012, 2013, 2014a, b). In their last two works, Lucchi et al. (2014a, b) proposed alternative methodologies to segment mitochondria on their own dataset by explicitly modeling their membranes. From those results, Casser et al. (2020) inferred a Jaccard index or intersection over union (IoU) lower-bound value of 0.895 in the test set. The IoU is a common way of measuring the overlap between the ground truth and the produced segmentation, with values that range from 0 to 1, where 1 represents a perfect match (see "Experimental Setup").

More modern approaches made use of DL architectures to segment the Lucchi dataset. For instance, Oztel et al. (2017) trained a CNN with four convolutional layers to classify 32 × 32 pixel patches extracted from the training data into mitochondria and background. After that, they fed the network with the full test images to simulate a sliding-window process and applied three consecutive post-processing methods: 1) spurious detection to remove small false blobs, 2) marker-controlled watershed transform (Meyer, 1994) for border refinement, and 3) median filtering to smooth labels along the z-axis. This way, they reported an IoU value of 0.907 in the test set, which is the highest value to date. Liu et al. (2018) used instead a modified Mask R-CNN (He et al., 2017) to detect and segment mitochondria. As post-processing methods, they performed: 1) a morphological opening to eliminate small regions and smooth large ones, 2) a multi-layer fusion operation to exploit 3D mitochondria information, and 3) a size-based filtering to remove tiny segments that have an IoU score below a given threshold. As a result, they reported an IoU value of 0.849 in the test set. Cheng and Varshney (2017) applied both a 2D and a 3D version of an asymmetric U-Net-like network. They introduced the stochastic downsampling method, an operation they named feature level augmentation: on that downsampling layer, they subdivided the image into fixed square regions and picked random rows and columns inside them to select the pixels/voxels that constitute the downsampled output. Moreover, they implemented factorized convolutions (Szegedy et al., 2016) instead of classical ones to drastically reduce the number of network parameters. As their best result, they reported an IoU value of 0.889 in the test set using their 3D network. Xiao et al. (2018) employed a variant of a 3D U-Net model with residual blocks. In the decoder of the network, they included two auxiliary outputs to address the vanishing gradient issue. Their final output is the result of the ensemble prediction of the 16 possible 3D variations (using flips and axis rotations) per 3D subvolume. They reported an IoU value of 0.900 in the test set. In a more recent work, Casser et al. (2020) presented a light version of a 2D U-Net aiming to achieve real-time segmentation and reported an IoU value of 0.890, applying median Z-filtering as a post-processing method.

Methods

Although architectural modifications of a basic U-Net to perform biomedical segmentation are continuously published, it is usually unclear whether their claimed superiority is merely due to an incomplete optimization of the basic network for the task at hand (Isensee et al., 2021; Bello et al., 2021). We hypothesize that, on top of answering that question, a full optimization can also lead to lightweight models that consistently produce stable and robust results across datasets. To prove it, we explored basic U-Net configurations together with popular architectural tweaks such as residual connections (He et al., 2016a) or attention gates (Schlemper et al., 2019). Additionally, to disentangle the impact of each training choice, all configurations are run several times and their results are shown in the context of different post-processing and output reconstruction methods.

Proposed Networks

Building upon the state of the art, we have explored different lightweight U-Net-like architectures in 2D and 3D. The general scheme is represented in Fig. 1, where our basic and Attention U-Net models use convolutional blocks as processing blocks (two 3 × 3 convolutional layers, Fig. 2a) and our Residual U-Net is formed by full pre-activation (He et al., 2016b) residual blocks (two 3 × 3 convolutional layers with a shortcut, Fig. 2b). Both the basic and Residual U-Net use concatenation as the feature merge operation, while our Attention U-Net introduces there an attention gate (Schlemper et al., 2019).
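The attention gate used as the feature merge operation can be illustrated with a toy, per-pixel scalar version of the additive gating of Schlemper et al. (2019). In the real network the weights below are learned 1 × 1 convolutions applied to whole feature maps; here they are hypothetical scalars chosen only to show how the gating signal g from the coarser level suppresses or keeps the skip features x:

```python
import math

def attention_gate(x, g, w_x=1.0, w_g=1.0, w_psi=4.0):
    """Toy additive attention gate on per-pixel scalar features.

    alpha = sigmoid(w_psi * relu(w_x * x + w_g * g)) scales each skip feature,
    so positions that the gating signal marks as irrelevant are attenuated.
    """
    gated = []
    for xi, gi in zip(x, g):
        q = max(0.0, w_x * xi + w_g * gi)           # ReLU(W_x x + W_g g)
        alpha = 1.0 / (1.0 + math.exp(-w_psi * q))  # attention coefficient in (0, 1)
        gated.append(alpha * xi)                    # recalibrated skip feature
    return gated
```

In this sketch, a position where the gating signal disagrees (x = 2, g = -5) is halved, while an agreeing position (x = 2, g = 5) passes almost unchanged; the learned multi-channel version behaves analogously over whole feature maps.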
Based on a thorough hyperparameter exploration (see stochastic downsampling method, an operation they named supplementary material), we found the following optimal feature level augmentation. More specifically, on that down- configuration for each architecture: sampling layer, they subdivided the image into fixed square regions and picked random rows and columns inside them – Basic U-Net. In 2D, it is a four-level U-Net with 16 filters to select the pixels/voxels that will constitute the downsam- in the initial level that get doubled on each level, drop- pled output. Moreover, they implemented factorized con- out in each block (from 0.1 up to 0.3 in the bottleneck volutions (Szegedy et al., 2016) instead of classical ones to and reversely, from 0.3 to 0.1 in the upsampling layers), drastically reduce the number of network parameters. As ELU activation functions and transposed convolutions their best result, they reported an IoU value of 0.889 in the to perform the upsampling in the decoder. In 3D, the test set using their 3D network. Xiao et al. (2018) employed architecture is very similar, but using 3 levels, with 28, a variant of a 3D U-Net model with residual blocks. In the 36, 48 and 64 (in the bottleneck) 3D filters on each layer. decoder of the network, they included two auxiliary outputs – Residual U-Net. In 2D, this network is identical to our to address the vanishing gradient issue. Their final output best basic U-Net architecture but swapping each convo- 1 3 440 Neuroinformatics (2022) 20:437–450 Fig. 1 Graphical representa- tion of the proposed network Input Output architectures. Depending image image on the model of choice, the processing blocks can be either simply convolutional or residual blocks, while the feature merge operations may imply a single concatenation or an additional attention gate processing block downsampling skip connection upsampling feature merge operation lutional block by a residual block (He et al., 2016a). 
For tions are undone and the results are averaged into a final the 3D residual approach, we achieved our best results prediction for an ensemble effect. going one level deeper than the non-residual 3D network – Blending overlapped patches. When networks work and 28, 36, 48, 64 and 80 (bottleneck) filters per level. on image patches, the final prediction is reconstructed – Attention U-Net. These networks are the same as Basic as a mosaic of the patches predictions. The presence of U-Net but incorporating attention gates (Schlemper et al., jagged predictions on the borders of the output patches 2019) in the features passed by the skip connections are a recurrent problem (Fig. 4) that can be mitigated by (Fig. 3). Such attention mechanism emphasizes salient creating overlapping patches and smoothly blending the feature maps that are in charge of the class decision and resulting predictions using a second order spline window suppress irrelevant ones endowing the network with the function. Due to its computational cost, we only experi- ability to focus on relevant regions of the image. mented with this technique in 2D. – Median Z-filt ering. A simple median filter along the Post‑Processing Z-axis (Casser et al., 2020; Oztel et al., 2017) can be used to correct label predictions in consecutive image slices. As the network outputs are pixel-wise predictions, it is common practice to apply basic post-processing methods to Output Reconstruction improve the results. We experimented with three techniques and studied their impact in the final segmentation result: During the training of deep networks, the input images are commonly divided into patches due to GPU memory limita- – Test-time data augmentation. Inference is applied on tions. Later, those patches need to be merged back together the multiples of 90 rotations and flipped versions of each to form the final output at full-image size. In some publi- image. 
Consequently, eight versions are created in 2D cations, the authors specify clearly the way they infer and and 16 versions in 3D. Finally, the individual transforma- merge their predictions (Xiao et al., 2018), while in oth- ers this process is not described (Cheng & Varshney, 2017; Oztel et al., 2017; Casser et al., 2020), hindering a direct comparison between methods’ performance. Following the code of good practices to show deep learning-based results proposed by Dodge et al. (2019), all results presented in this paper state the reconstruction strategy used. Namely, the conv 3x3,,, ELU, drop pout co conv nv 3x 3x3E 3, ELLU U, dr drop opou outt implemented options are as follows: ELU conv 3x3 Add conv 3x3, ELU 1. Per patch. The metric value corresponds to the average (a) (b) value over all patches. 2. Per image (with 50% overlap). The patches are merged Fig. 2 Types of processing blocks. Convolutional blocks (a) are used together using 50% of overlap and the metric value is the in the U-Net and Attention U-Net architectures, and residual blocks (b) are used in the Residual U-Net average overall reconstructed images. 1 3 Neuroinformatics (2022) 20:437–450 441 16 16 32 16 16 conv 3x3, ELU up-conv 2x2 Input Output sigmoid conv 3x3, ELU, dropout AG maxpooling 2x2 ReLU 256x 256x 1 256x 256x 1 conv 1x1 skip connection 64 32 32 32 x multiply + add AG 64 64 128 64 + X AG 2 2 Attentiongate 32 32 Fig. 3 Proposed 2D Attention U-Net architecture. Example with three downsampling levels and a detailed description of the attention gates used in the skip connections 3. Full image. Inference is applied on the full-sized biomedical semantic segmentation and test them in other images. The metric value is the average over all images. public datasets. In all our experiments, we present average This strategy is not always feasible, since it depends on scores obtained running the same configuration 10 times the input image size and the available GPU memory. 
(hereafter referred as a run) together with the corresponding standard deviation. Experimental Results Datasets To test our hypothesis and focusing on model reproduc- All the experiments performed in this work are based on the ibility and stability, we conducted a thorough study on the following publicly available datasets: top-performing segmentation methods recently published EPFL Hippocampus or Lucchi dataset (Lucchi et al., in the Lucchi dataset. Additionally, we introduce our own 2011). The original volume represents a 5 × 5 × 5 m section solutions, compare them with state-of-the-art approaches in of the CA1 hippocampus region of a mouse brain, with an isotropic resolution of 5 × 5 × 5 nm per voxel. The volume of 2048 × 1536 × 1065 voxels was acquired using focused ion beam scanning electron microscopy (FIB-SEM). The mitochondria of two subvolumes formed by 165 images of 1024 × 768 pixels were manually labeled by experts (Fig. 5 (red)), and are commonly used as training and test data. Lucchi++ dataset (Casser et al., 2020). This is a version of the Lucchi dataset after two neuroscientists and a senior biologist re-labeled mitochondria by fixing misclassifica - tions and boundary inconsistencies. Kasthuri++ dataset (Casser et al., 2020). This is a re-labeling of the dataset by Kasthuri et al. (2015) (Fig. 5 (blue)). The volume corresponds to a part of the soma- tosensory cortex of an adult mouse and was acquired using Fig. 4 Border effect in output image reconstruction. From left to serial section electron microscopy (ssEM). The train and right: output image reconstructed from patches with visible jagged test volume dimensions are 85 × 1463 × 1613 voxels and predictions; and output image reconstructed using both the blending 75 × 1334 × 1553 voxels respectively, with an anisotropic and ensemble techniques. Blue and red boxes show zoomed areas on resolution of 3 × 3 × 30 nm per voxel. 
Neuroinformatics (2022) 20:437–450

Fig. 5 Sample images from public mitochondria datasets. From left to right: Lucchi and Kasthuri++ data samples with their corresponding binary masks. Blue and red boxes show zoomed areas on both images.

Experimental Setup

Evaluation metrics. We evaluate our methods using the Jaccard index of the positive class, or foreground IoU, defined as IoU = TP/(TP + FP + FN), where TP are the true positives, FP the false positives and FN the false negatives. As a convention, the positive class is foreground and the negative class is background. The background IoU is defined likewise, by swapping the positive and negative classes. To obtain these values, the probability image returned by the network is binarized using a threshold value of 0.5. Nevertheless, to compare our results with other related works, we also define the overall IoU as IoU_O = (IoU_F + IoU_B)/2, where IoU_F and IoU_B are the foreground and background IoU, respectively. Notice that the high proportion of background pixels typically inflates the overall IoU score, resulting in greater values than the foreground IoU.

Training setup and data augmentation. To find the best solutions, we made an exhaustive search of hyperparameters and training configurations, exploring different loss functions, optimizers, learning rates, batch sizes, and data augmentation techniques. We also explored the use of different input patch sizes, their selection method (random or systematic), and the discarding of image patches with low foreground class information (Oztel et al., 2017). When selecting a random patch, we define a probability map to choose patches with a higher probability of containing mitochondria, therefore addressing the class imbalance problem. Finally, we have also studied the effect of selecting the validation set as either consecutive training images or at random. Here we describe the best training configuration found; the details of our exhaustive search are available in the supplementary material. In particular, for the 2D networks, we minimize the binary cross-entropy (BCE) loss using the Stochastic Gradient Descent (SGD) optimizer with 0.99 momentum and no decay, a learning rate of 0.002, a batch size of 6 and a patch size of 256 × 256 pixels. The validation set is formed by 10% of the training images, selected at random. We use a GeForce GTX 1080 GPU card to train the network for 360 epochs (an epoch is completed when all training data has been explored), with a patience of 100 epochs monitoring the validation loss, and we keep the model that performs best on the validation set. Moreover, we apply on-the-fly data augmentation (DA) with random rotations and vertical and horizontal flips. For the 3D networks, the same hyperparameters as in 2D are used, but we employ elastic transformations as well (in 2D we did not observe an improvement), with a patch size of 80 × 80 × 80 voxels.

Experiments on Lucchi Dataset

Reproducing Top State-Of-The-Art Methods

We aimed at reproducing the state-of-the-art deep learning-based methods that report top performance on the Lucchi dataset, published by Cheng and Varshney (2017), Casser et al. (2020), Xiao et al. (2018) and Oztel et al. (2017). Only the code by Casser et al. (2020) is publicly available, so we plugged their network architecture into our training workflow. The code for the rest of the methods was requested from the corresponding authors without success.

In all cases, a first implementation attempt was made following the methodology and exact parameters described in each publication. Where information was missing, we proceeded using the most common practice in the field. In addition, following the same procedure we use for our own models, we modified the original configuration (i.e., architecture and training workflow) aiming to improve the results and their stability (full details available in the supplementary material). These configurations are hereafter referred to as original and modified, respectively. A systematic search of the best hyperparameters and training configurations was performed and the results are shown in Table 1.

The original 2D network configuration by Cheng and Varshney (2017) produces results with a high standard deviation, probably due to the high learning rate employed (0.05), even though it is reduced upon reaching 50% and 75% of the total epochs. Our modified configuration differs in the optimizer used (Adam instead of SGD) and the learning rate (fixed to 0.0001). Additionally, we performed extra DA with random rotations, removed the dropout layers, reduced the number of epochs and extracted 12 random patches per training image instead of just one. Without post-processing (none is used in the original publication), the foreground IoU value reported (0.865) can only be reached through our modified configuration and by taking the maximum values of the 50% overlap or full image reconstruction strategies. Even better values can be obtained thanks to post-processing.
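The probability-map patch selection described in the training setup is not given in closed form in the text; below is a minimal NumPy sketch under our own assumption that the map simply up-weights mitochondria pixels over a small uniform background floor (the `floor` parameter is hypothetical, not a value from the paper):

```python
import numpy as np

def center_probability_map(labels, floor=0.1):
    """Per-pixel probability of being chosen as a patch center.

    Pixels belonging to mitochondria (labels > 0) get a high weight and
    background pixels a small uniform floor, so most sampled patches
    contain foreground. The weighting scheme is an assumption for
    illustration, not the paper's exact formula.
    """
    weights = np.where(labels > 0, 1.0, floor).astype(np.float64)
    return weights / weights.sum()

def sample_patch(image, labels, patch=256, rng=None):
    """Draw one random patch whose center follows the probability map."""
    rng = rng or np.random.default_rng()
    prob = center_probability_map(labels)
    flat_idx = rng.choice(prob.size, p=prob.ravel())
    y, x = np.unravel_index(flat_idx, prob.shape)
    half = patch // 2
    # Clamp the center so the patch stays fully inside the image.
    y = min(max(y, half), image.shape[0] - half)
    x = min(max(x, half), image.shape[1] - half)
    return image[y - half:y + half, x - half:x + half]
```

Raising `floor` toward 1 recovers uniform random patch selection, while lowering it concentrates sampling on mitochondria, which is how the class imbalance is addressed.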
Table 1 Foreground IoU (mean±standard deviation) of the reproduced state-of-the-art works on the Lucchi dataset. Original refers to the exact configurations as reported by the authors, while Modified corresponds to the best configuration found by us. The different output reconstruction and post-processing methods adopted are indicated. More details available in Table S3.1. Columns under 50% Overlap: (1) Per Patch, (2) +Test-time aug., (3) +Test-time aug. +Z-Filtering, (4) +Blending, (5) +Blending +Test-time aug., (6) +Blending +Test-time aug. +Z-Filtering; columns under Full Image: (7) Full image, (8) +Test-time aug., (9) +Test-time aug. +Z-Filtering. 3D configurations report only the first four values, as extracted.

Cheng et al. (2D), 0.6M parameters, reported 0.865:
  Original (0.59M): 0.503±0.233 | 0.517±0.240 | 0.517±0.239 | 0.521±0.243 | 0.541±0.250 | 0.548±0.254 | 0.526±0.244 | 0.537±0.244 | 0.543±0.252
  Modified (0.59M): 0.848±0.012 | 0.851±0.011 | 0.863±0.010 | 0.868±0.010 | 0.865±0.008 | 0.871±0.008 | 0.853±0.011 | 0.865±0.009 | 0.871±0.008
  Maximum: 0.864 | 0.865 | 0.877 | 0.881 | 0.878 | 0.883 | 0.865 | 0.878 | 0.881

Casser et al., 1.96M parameters, reported 0.890:
  Original (1.96M): 0.824±0.014 | 0.815±0.016 | 0.825±0.013 | 0.831±0.013 | 0.831±0.011 | 0.838±0.011 | 0.820±0.016 | 0.833±0.011 | 0.839±0.012
  Modified (1.96M): 0.844±0.014 | 0.837±0.008 | 0.846±0.016 | 0.850±0.017 | 0.850±0.016 | 0.855±0.017 | 0.842±0.006 | 0.853±0.015 | 0.858±0.015
  Maximum: 0.846 | 0.846 | 0.861 | 0.865 | 0.862 | 0.867 | 0.848 | 0.865 | 0.870

Oztel et al., 0.14M parameters, reported 0.907:
  Original (0.14M): – | – | – | – | – | – | 0.425±0.080 | 0.457±0.060 | 0.466±0.061
  Modified (0.07M): – | – | – | – | – | – | 0.451±0.042 | 0.476±0.049 | 0.487±0.053
  Maximum: – | – | – | – | – | – | 0.500 | 0.531 | 0.544

Cheng et al. (3D), 0.63M parameters, reported 0.889:
  Original (0.79M): 0.053±0.000 | 0.053±0.000 | 0.053±0.000 | 0.053±0.000 | – | – | – | – | –
  Modified (0.79M): 0.623±0.039 | 0.714±0.040 | 0.053±0.034 | 0.053±0.034 | – | – | – | – | –
  Maximum: 0.694 | 0.787 | 0.799 | 0.800 | – | – | – | – | –

Xiao et al., 1.1M parameters, reported 0.900:
  Original (1.08M): 0.874±0.003 | 0.863±0.004 | 0.866±0.004 | 0.867±0.004 | – | – | – | – | –
  Modified (1.08M): 0.882±0.002 | 0.872±0.003 | 0.874±0.003 | 0.874±0.003 | – | – | – | – | –
  Maximum: 0.885 | 0.880 | 0.880 | 0.880 | – | – | – | – | –

The 3D approach of the same authors, Cheng and Varshney (2017), produces IoU values close to 0 in its original form, since with the proposed learning rate (0.1) the network easily gets trapped in local minima. Moreover, the subvolume shape adopted, 128 × 128 × 96 voxels, makes train/validation data splitting difficult, so we train the network until convergence with no validation data. Our modified configuration produces better results, but far from the reported ones and highly unstable (0.800 in its best run vs. the reported 0.889).

The original configuration proposed by Casser et al. (2020) reaches high IoU values with a low standard deviation as well. We modified it by selecting two random patches per training image instead of one and by using a probability map to prioritize patches with mitochondria pixels in the center, which leads to more stable results. The maximum value was obtained by applying Z-filtering to the predictions over full test images, measuring 0.870 of foreground IoU. In the original code, the authors optimized the training by using the test set as validation set, which could explain their better reported value.

The work presented by Xiao et al. (2018) provided a detailed explanation of their training procedure, architecture and output reconstruction strategy. Thus, the only modification we made is the use of elastic transformations in DA. As shown in Table 1, this change substantially improves the results obtained. They merge the predictions with overlap and ensemble, so, to be fair, the maximum value of patch merging using 50% overlap and ensemble predictions should be used for comparison. They reported a foreground IoU of 0.900, compared to the maximum of 0.880 achieved by our modified version.

Finally, the original configuration proposed by Oztel et al. (2017) produces very low foreground IoU values. We reproduced their model and tried modifying their network by adding more non-linearities (ReLU) and changing the dropout values or the feature maps used, but the results obtained are far from those presented by the authors. The number of parameters in the original network compared with other state-of-the-art approaches is also relatively low (0.14M). Furthermore, we implemented their post-processing pipeline, whose results are presented in Table 2. We adapted it to specifically improve the segmentation made by the proposed network. Although the final metric value increased by a large margin, our results are far from their reported IoU.

Table 2 Foreground IoU results of the original and modified configurations of Oztel et al. (2017) using their consecutive post-processing methods, i.e., Spurious Detection is applied over Full Images, which are then passed through Watershed and finally through Z-Filtering.

            Full Image    Spurious Detection   Watershed     Z-Filtering
Original    0.425±0.080   0.426±0.091          0.540±0.100   0.573±0.106
Modified    0.451±0.042   0.449±0.067          0.562±0.057   0.599±0.067
Maximum     0.500         0.539                0.619         0.683

The instructions to reproduce all models can be found at our official documentation site: https://em-image-segmentation.readthedocs.io/en/latest/manuscripts/stable_mitochondria.html. In addition, the details of each experiment can be found in the supplementary material, with a link to the template that reproduces its results.

Proposed Networks vs. State-Of-The-Art Networks for Semantic Segmentation

Here, we introduce the performance of our proposed architectures together with an in-depth study of the main state-of-the-art semantic segmentation networks for natural and biomedical images. Namely, FCN 8/32 (Long et al., 2015), MultiResUNet (Ibtehaz & Rahman, 2020), MNet (Fu et al., 2018), Tiramisu (Jégou et al., 2017), U-Net++ (Zhou et al., 2018), 3D Vanilla U-Net (Çiçek et al., 2016) and nnU-Net (Isensee et al., 2021). All implementations have been obtained or ported from their official sites, and all networks have been optimized under the same conditions: same training and validation partitions, DA, optimizers and learning rate ranges (see supplementary material). The case of nnU-Net is special, since it is designed to optimize the whole segmentation pipeline. For a fair comparison, we extracted the optimal architecture found following the regular nnU-Net processing and plugged it into our own workflow.

All 2D networks use an input patch size of 256 × 256 pixels, while 3D networks use subvolumes of 80 × 80 × 80 voxels to exploit the isotropic resolution of the Lucchi dataset. The results from the best configuration found for each network are shown in Table 3. Notice that the 3D networks do not have results using full image reconstructions due to GPU memory limitations, as the whole dataset would have to be fed to the network. Similarly, blending estimation was not implemented in 3D networks given its computational cost.

Performance of state-of-the-art biomedical segmentation networks. The results of Tiramisu (Jégou et al., 2017), MNet (Fu et al., 2018), nnU-Net (Isensee et al., 2021), MultiResUNet (Ibtehaz & Rahman, 2020) and 3D Vanilla U-Net (Çiçek et al., 2016) are below 0.880 of foreground IoU even when using output reconstructions with 50% overlap and post-processing techniques such as ensemble predictions or Z-filtering.
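The foreground, background and overall IoU values reported throughout follow the definitions given in the Experimental Setup; they translate directly into a small NumPy helper (binarization threshold 0.5, as stated there):

```python
import numpy as np

def iou_scores(prob, target, threshold=0.5):
    """Foreground, background and overall IoU of a probability map.

    The network output 'prob' is binarized at 'threshold'; 'target' is
    the binary ground truth (1 = mitochondria, the positive class).
    """
    pred = prob >= threshold
    gt = target.astype(bool)
    tp = np.sum(pred & gt)        # true positives
    fp = np.sum(pred & ~gt)       # false positives
    fn = np.sum(~pred & gt)       # false negatives
    tn = np.sum(~pred & ~gt)      # true negatives
    iou_f = tp / (tp + fp + fn)   # foreground IoU = TP/(TP+FP+FN)
    iou_b = tn / (tn + fn + fp)   # background IoU: classes swapped
    return iou_f, iou_b, (iou_f + iou_b) / 2  # overall IoU_O
```

On a background-dominated image the overall IoU exceeds the foreground IoU, which is exactly the inflation effect noted in the Experimental Setup.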
Table 3 Performance of proposed and state-of-the-art networks for semantic segmentation on the Lucchi dataset (foreground IoU, mean±standard deviation). Scores are shown using the different post-processing and output reconstruction methods adopted. 3D patches required a minimum overlap, so they are marked with *. Best results for each column and type of network (2D or 3D) are shown in bold. More details in Table S3.2. Columns as in Table 1: under 50% Overlap, (1) Per Patch, (2) +Test-time aug., (3) +Test-time aug. +Z-Filtering, (4) +Blending, (5) +Blending +Test-time aug., (6) +Blending +Test-time aug. +Z-Filtering; under Full Image, (7) Full image, (8) +Test-time aug., (9) +Test-time aug. +Z-Filtering. 3D configurations report only the first four values, as extracted.

FCN 32 (Dai et al., 2016), 50.38M: 0.040±0.000 | 0.677±0.005 | 0.679±0.006 | 0.680±0.006 | 0.659±0.004 | 0.661±0.004 | 0.657±0.003 | 0.659±0.003 | 0.660±0.003
MultiResUNet (Ibtehaz & Rahman, 2020), 7.26M: 0.815±0.000 | 0.814±0.014 | 0.820±0.010 | 0.824±0.010 | 0.834±0.010 | 0.840±0.009 | 0.828±0.016 | 0.833±0.010 | 0.839±0.010
Tiramisu (Jégou et al., 2017), 9.4M: 0.810±0.028 | 0.833±0.027 | 0.851±0.018 | 0.857±0.017 | 0.850±0.016 | 0.855±0.016 | 0.830±0.029 | 0.846±0.019 | 0.851±0.018
MNet (Fu et al., 2018), 8.54M: 0.851±0.011 | 0.865±0.008 | 0.870±0.007 | 0.874±0.007 | 0.874±0.006 | 0.878±0.006 | 0.867±0.008 | 0.872±0.006 | 0.876±0.008
U-Net++ (Zhou et al., 2018), 37.7M: 0.734±0.012 | 0.872±0.005 | 0.877±0.004 | 0.881±0.004 | 0.880±0.003 | 0.884±0.003 | 0.875±0.004 | 0.878±0.003 | 0.882±0.003
2D SE U-Net (ours), 1.95M: 0.863±0.002 | 0.873±0.003 | 0.878±0.003 | 0.882±0.003 | 0.880±0.003 | 0.883±0.003 | 0.875±0.002 | 0.881±0.002 | 0.881±0.002
2D Residual U-Net (ours), 2.03M: 0.867±0.005 | 0.873±0.005 | 0.877±0.004 | 0.880±0.004 | 0.878±0.003 | 0.882±0.003 | 0.875±0.004 | 0.877±0.003 | 0.880±0.004
FCN 8 (Dai et al., 2016), 50.38M: 0.860±0.005 | 0.880±0.003 | 0.884±0.002 | 0.888±0.002 | 0.887±0.002 | 0.891±0.002 | 0.881±0.003 | 0.886±0.002 | 0.891±0.002
nnU-Net (Isensee et al., 2021), 52.1M: 0.867±0.004 | 0.876±0.004 | 0.881±0.003 | 0.884±0.003 | 0.882±0.003 | 0.886±0.003 | 0.861±0.007 | 0.864±0.009 | 0.868±0.008
2D U-Net (ours), 1.95M: 0.874±0.003 | 0.881±0.002 | 0.884±0.002 | 0.888±0.002 | 0.884±0.000 | 0.889±0.002 | 0.882±0.003 | 0.884±0.002 | 0.887±0.003
2D Attention U-Net (ours), 1.99M: 0.875±0.004 | 0.882±0.003 | 0.885±0.001 | 0.890±0.002 | 0.886±0.001 | 0.892±0.001 | 0.884±0.002 | 0.886±0.001 | 0.890±0.002
3D Vanilla U-Net (Çiçek et al., 2016), 19.07M: 0.402±0.005 (*) | 0.851±0.004 | 0.857±0.006 | 0.857±0.006 | – | – | – | – | –
3D SE U-Net (ours), 0.79M: 0.387±0.007 (*) | 0.867±0.009 | 0.873±0.007 | 0.874±0.007 | – | – | – | – | –
3D Attention U-Net (ours), 0.79M: 0.389±0.005 (*) | 0.870±0.003 | 0.876±0.003 | 0.876±0.003 | – | – | – | – | –
3D U-Net (ours), 0.79M: 0.394±0.005 (*) | 0.871±0.006 | 0.878±0.004 | 0.878±0.004 | – | – | – | – | –
3D Residual U-Net (ours), 1.50M: 0.394±0.004 (*) | 0.877±0.004 | 0.883±0.002 | 0.883±0.002 | – | – | – | – | –

On top of these networks, U-Net++ achieved the best results, scoring 0.881 ± 0.004 of foreground IoU. The 3D Vanilla U-Net, nnU-Net, U-Net++ and MNet seem to produce stable results (low standard deviation), while Tiramisu and MultiResUNet show larger variability within their results. Besides that, the difference in their number of trainable parameters is remarkable.
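Z-filtering, which appears as a post-processing column throughout Tables 1 and 3, is not spelled out in this section; the sketch below assumes the common choice of a per-pixel sliding median over 2·radius+1 consecutive slice predictions, which suppresses single-slice outliers before thresholding.

```python
import numpy as np

def z_filter(stack, radius=1):
    """Median-filter a stack of per-slice probability maps along z.

    'stack' has shape (z, height, width). Each output slice is the
    per-pixel median of its neighborhood of slices; windows are
    shortened at the volume borders. This is an assumed definition
    of Z-filtering, used here for illustration only.
    """
    z = stack.shape[0]
    out = np.empty_like(stack)
    for i in range(z):
        lo, hi = max(0, i - radius), min(z, i + radius + 1)
        out[i] = np.median(stack[lo:hi], axis=0)
    return out
```

A prediction that appears in a single slice but in none of its neighbors is removed by the median, which matches the intent of enforcing continuity of mitochondria across consecutive sections.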
The 3D Vanilla U-Net, nnU-Net and U-Net++ models have between 2× and 5× more parameters than the other state-of-the-art approaches. Concerning the FCN networks (Long et al., 2015), FCN32 reports low IoU values, while FCN8 achieves results comparable to our best 2D U-Net configuration. Nevertheless, the number of trainable parameters in FCN8 is 50.38M, compared to fewer than 2M in our proposed 2D models.

Performance of our proposed networks. Regarding our proposed approaches ("Proposed networks"), the best values were obtained with the 2D U-Net and its version with attention gates: 0.888 ± 0.002 and 0.890 ± 0.002, applying test-time data augmentation and Z-filtering post-processing, respectively. Our 3D networks do not reach the performance obtained with their 2D versions. This may be explained by inspecting the mitochondria labels in 3D: we observed that they frequently lose shape continuity through slices, penalizing the learning capacity of 3D networks (see Fig. S2.2).

Remarkably, our 3D networks have three times fewer training parameters than our 2D approaches, leading to more computationally efficient models. To complete the overview of state-of-the-art networks and architectures, we experimented with Squeeze-and-Excitation (SE) blocks (Hu et al., 2018) in our proposed 2D and 3D models. These blocks perform dynamic channel-wise feature recalibration through squeeze and excite operations. The squeeze operation collects global spatial information into a channel descriptor using global average pooling. After that, features are recalibrated by the excite operation, which emphasizes channel-wise features with a simple gating mechanism based on a ReLU and a Sigmoid activation. The best results are obtained with SE blocks everywhere except the bottleneck, as suggested by Roy et al. (2018). Nevertheless, we also experimented with inserting SE blocks after every convolutional layer. As shown in Table 3, these blocks do not bring a boost in performance in this case.

A full description of the configurations tested can be found in Section S3 (supplementary material).

Comparison with Reported Results

We have summarized in Table 4 the reported results of the top-performing published methods, together with those of the state-of-the-art approaches and our proposed networks. All reproduced values correspond to the best configuration found, i.e., using the optimal pre-processing, architecture, output reconstruction, and post-processing strategies for each method. The availability of original code, including that of the present paper, is also indicated. Notice that the gap between the averaged IoU and the reported values increases with the standard deviation, underlining the importance of finding stable configurations so as not to depend on a large computation budget (Dodge et al., 2019).

Our proposed 2D U-Net and Attention U-Net models, together with the FCN8 model, reached the highest reproducible foreground IoU score, with a value of 0.893. In particular, the 2D Attention U-Net achieved a slightly higher average score in a very consistent manner. The best values were obtained using blending and ensemble for output reconstruction and Z-filtering as post-processing (see Fig. S1.1 for an example of some of the proposed networks' predictions). As opposed to other approaches, the standard deviation of our results is consistently low, guaranteeing good performance and reducing the number of experiments needed to reach an optimal segmentation.

As expected, the lack of code associated with a publication enormously hinders the reproduction of the claimed results. Interestingly, in the case of the 2D approach by Cheng and Varshney (2017), our implementation improved over their published results, stressing the benefits of optimizing the whole segmentation workflow. Notice there are two table entries for results with nnU-Net (Isensee et al., 2021): one using their entire training framework, and one plugging the best architecture found by their framework into ours.

Table 4 Reported vs. reproduced scores on the Lucchi dataset. Reported values correspond to the scores claimed by the authors of each publication or to the maximum score obtained by us. The Reproduced values refer to the maximum, mean and standard deviation obtained while reproducing each corresponding method. Best scores of each column are presented in bold. Columns: Description | Implementation | Code | Foreground IoU (Reported, Reproduced) | Overall IoU (Reported, Reproduced).

FCN 32 | Ours using (Dai et al., 2016) | ✓ | 0.688, 0.688 (0.680±0.006) | 0.835, 0.835 (0.831±0.003)
MultiResUNet | Ours using (Ibtehaz & Rahman, 2020) | ✓ | 0.847, 0.847 (0.824±0.010) | 0.919, 0.919 (0.902±0.007)
2D CNN | (Cheng & Varshney, 2017) | | 0.865, 0.883 (0.871±0.008) | –, 0.938 (0.932±0.004)
3D Vanilla U-Net | Ours using (Çiçek et al., 2016) | ✓ | 0.866, 0.866 (0.857±0.006) | 0.929, 0.929 (0.924±0.003)
Tiramisu | Ours using (Jégou et al., 2017) | ✓ | 0.872, 0.872 (0.857±0.017) | 0.932, 0.932 (0.924±0.009)
2D U-Net | (Casser et al., 2020) | ✓ | 0.878, 0.865 (0.853±0.015) | 0.935, 0.930 (0.922±0.007)
3D SE U-Net | Ours | ✓ | 0.879, 0.879 (0.874±0.007) | 0.936, 0.936 (0.933±0.004)
3D Attention U-Net | Ours | ✓ | 0.880, 0.880 (0.876±0.003) | 0.936, 0.936 (0.934±0.002)
nnU-Net framework | (Isensee et al., 2021) | ✓ | 0.882, – | 0.938, –
MNet | Ours using (Fu et al., 2018) | ✓ | 0.883, 0.883 (0.874±0.007) | 0.938, 0.938 (0.929±0.004)
2D Residual U-Net | Ours | ✓ | 0.885, 0.885 (0.880±0.004) | 0.939, 0.939 (0.937±0.002)
3D U-Net | Ours | ✓ | 0.885, 0.885 (0.878±0.004) | 0.939, 0.939 (0.935±0.002)
nnU-Net | Ours using (Isensee et al., 2021) | ✓ | 0.888, 0.888 (0.881±0.005) | 0.941, 0.941 (0.937±0.003)
3D Residual U-Net | Ours | ✓ | 0.888, 0.888 (0.883±0.002) | 0.941, 0.941 (0.938±0.001)
2D SE U-Net | Ours | ✓ | 0.888, 0.888 (0.882±0.003) | 0.941, 0.941 (0.937±0.002)
U-Net++ | Ours using (Zhou et al., 2018) | ✓ | 0.888, 0.888 (0.884±0.003) | 0.941, 0.941 (0.938±0.001)
3D CNN | (Cheng & Varshney, 2017) | | 0.889, 0.800 (0.738±0.034) | –, 0.894 (0.860±0.018)
2D U-Net+Z-filtering | (Casser et al., 2020) | ✓ | 0.890, 0.870 (0.858±0.015) | 0.942, 0.931 (0.925±0.007)
FCN 8 | Ours using (Dai et al., 2016) | ✓ | 0.893, 0.893 (0.888±0.002) | 0.943, 0.943 (0.941±0.001)
2D U-Net | Ours | ✓ | 0.893, 0.893 (0.888±0.002) | 0.942, 0.942 (0.941±0.001)
2D Attention U-Net | Ours | ✓ | 0.893, 0.893 (0.890±0.002) | 0.943, 0.943 (0.942±0.001)
3D U-Net | (Xiao et al., 2018) | | 0.900, 0.881 (0.875±0.003) | –, 0.937 (0.934±0.002)
CNN+3 Post-proc. | (Oztel et al., 2017) | | 0.907, 0.683 (0.599±0.067) | –, 0.800 (0.757±0.106)

Ablation Study

To investigate the relevance of each component in our proposed networks, we performed an ablation study of our 2D U-Net baseline architecture. We compared six ablated versions with incremental changes: 1) a baseline four-level 2D U-Net model containing ReLU activations, Glorot uniform kernel initialization (Glorot & Bengio, 2010), 16 feature maps in the first level of the network, doubled at each level, and no regularization or DA; 2) the baseline with basic DA (random rotations and horizontal and vertical flips); 3) adding batch normalization; 4) adding dropout as regularization method; 5) using ELU as activation function (α = 1); 6) using He normal (He et al., 2015) as kernel initialization; 7) adding attention gates (Schlemper et al., 2019) in the skip connections.

The evaluation results on the Lucchi dataset for each case are shown in Table 5. Notice that the IoU values vary significantly depending on whether they are provided per patch or after reconstructing the final output, highlighting once more the need to specify the chosen framework when presenting results. The use of DA together with dropout clearly outperforms the baseline architecture by a large margin. Batch normalization decreases the performance, so it was not included in successive models. In the same way, the usage of ELU improves over the use of ReLU activation functions. Conversely, changing the kernel initialization from Glorot uniform to He normal has marginal effects on the final result, so either can be used. Finally, introducing attention in the skip connections, as suggested by Schlemper et al. (2019), helped to increase the network performance while maintaining the stability of the results.

Table 5 Ablation study of our full 2D model (foreground IoU). From top to bottom, each row applies an incremental modification on top of the previous configuration, except batch normalization, which was discarded as it decreases the performance.

Method                 Per Patch      50% Overlap    Full Image
Baseline (2D U-Net)    0.725±0.020    0.748±0.027    0.739±0.002
+ DA                   0.859±0.007    0.872±0.003    0.871±0.004
(+ Batch norm.)        0.856±0.005    0.864±0.004    0.869±0.002
+ Dropout              0.870±0.003    0.880±0.002    0.881±0.002
+ ELU activation       0.873±0.003    0.880±0.001    0.881±0.002
+ He initializer       0.873±0.003    0.880±0.002    0.881±0.003
+ Attention Gates      0.875±0.003    0.882±0.003    0.884±0.002

A comprehensive study of how the different IoU values of the ablation results relate to the size and shape of the reconstructed mitochondria is presented in Section S4 (supplementary material).

Results on Lucchi++ and Kasthuri++

To test how well the best solutions found for Lucchi generalize to other datasets, we applied the same configurations to Lucchi++ and Kasthuri++ and compared their performance with that reported by Casser et al. (2020). In Table 6, we can see that our models outperform all previously reported results by a large margin. Since these datasets corrected the mitochondria label continuity through the slices, the best performance is obtained with 3D networks. This supports the hypothesis that the labeling inconsistencies of the Lucchi dataset hinder the learning capacity of the 3D networks, which are usually expected to perform better than 2D networks in such a context (Wolf et al., 2018). Moreover, the Kasthuri++ dataset is anisotropic (lower resolution in the z-axis). Therefore, we modified our proposed 3D networks by removing the z-axis downsampling in their pooling operations and using shallower architectures (three levels instead of four).

Table 6 Results obtained on the Lucchi++ and Kasthuri++ datasets. All our model scores correspond to the optimal architectures found for Lucchi. Columns: Description | Author | Foreground IoU (Maximum, mean±std) | Overall IoU (Maximum, mean±std).

Lucchi++:
  2D U-Net | Casser et al. (2020) | 0.888, – | 0.940, –
  2D U-Net+Z-filtering | Casser et al. (2020) | 0.900, – | 0.946, –
  2D Residual U-Net (*) | Ours | 0.908, 0.904±0.004 | 0.943, 0.948±0.002
  2D U-Net (*) | Ours | 0.916, 0.911±0.006 | 0.955, 0.952±0.003
  2D Attention U-Net (*) | Ours | 0.919, 0.914±0.003 | 0.956, 0.954±0.001
  3D U-Net (a) | Ours | 0.923, 0.915±0.007 | 0.958, 0.954±0.004
  3D Attention U-Net (a) | Ours | 0.923, 0.912±0.008 | 0.959, 0.953±0.004
  3D Residual U-Net (a) | Ours | 0.926, 0.919±0.005 | 0.960, 0.957±0.003

Kasthuri++:
  2D U-Net | Casser et al. (2020) | 0.845, – | 0.920, –
  2D U-Net+Z-filtering | Casser et al. (2020) | 0.846, – | 0.920, –
  2D Residual U-Net (a) | Ours | 0.908, 0.906±0.001 | 0.953, 0.950±0.001
  2D Attention U-Net (a) | Ours | 0.915, 0.913±0.001 | 0.956, 0.954±0.001
  2D U-Net (a) | Ours | 0.916, 0.913±0.002 | 0.955, 0.954±0.001
  3D U-Net (a) | Ours | 0.934, 0.932±0.001 | 0.965, 0.965±0.001
  3D Residual U-Net (a) | Ours | 0.934, 0.933±0.001 | 0.966, 0.966±0.000
  3D Attention U-Net (a) | Ours | 0.937, 0.934±0.001 | 0.967, 0.966±0.001

(*) 0% overlap output reconstruction, blended ensemble and Z-filtering post-processing
(a) 50% overlap output reconstruction and ensemble post-processing
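The anisotropy-aware modification described above amounts to using a pooling factor of 1 along z. A small sketch of the encoder feature-map shapes makes the effect concrete (the Kasthuri++ patch shape below is illustrative only, not a value taken from the text):

```python
def encoder_shapes(shape, levels, pool):
    """(z, y, x) feature-map shape after each pooling step of a U-Net encoder."""
    shapes = [shape]
    for _ in range(levels - 1):
        # Each level divides every axis by its pooling factor.
        shape = tuple(s // p for s, p in zip(shape, pool))
        shapes.append(shape)
    return shapes

# Isotropic Lucchi subvolumes: four levels, halving every axis.
iso = encoder_shapes((80, 80, 80), levels=4, pool=(2, 2, 2))
# Anisotropic Kasthuri++ (hypothetical patch): three levels, z untouched.
aniso = encoder_shapes((16, 128, 128), levels=3, pool=(1, 2, 2))
```

With pool factor (1, 2, 2) the low-resolution z-axis is never collapsed, so the shallower three-level network retains all of its z-slices at every level, whereas the isotropic four-level encoder halves every axis down to an 8× smaller bottleneck per dimension.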
Conclusion

Through a complete experimental study of state-of-the-art DL models with modern training workflows, we have revealed significant reproducibility problems in the domain of mitochondria segmentation in EM data. Moreover, by disentangling the effects of novel architectures from those of the training choices (i.e., pre-processing, data augmentation, output reconstruction, and post-processing strategies) over a set of multiple executions of the same configurations, we have found stable lightweight models that consistently lead to state-of-the-art results on the existing public datasets.

Have novel methods reached human performance? To answer that question, Casser et al. (2020) compared the results of human annotators on the Lucchi dataset, producing a foreground IoU value of 0.884. This would suggest that many of the models presented in Table 4 do indeed outperform humans in this task. Nevertheless, all methods fall short of the 0.907 foreground IoU reported by Oztel et al. (2017), which could be due to the annotation inconsistencies discussed in "Proposed networks vs. state-of-the-art networks for semantic segmentation". To investigate further, we created two slightly different versions of the mitochondria ground truth labels by 1-pixel morphological dilation and erosion. The foreground IoU of the resulting labels against the original ground truth was 0.885 (dilation) and 0.904 (erosion). This enforces the idea that the dataset is not pixel-level accurate, so it could be argued that all methods with IoU values within a range of 0.009 or less can be considered to have similar performance. The same experiment was done with the ground truth labels of Lucchi++ (foreground IoU: 0.898, 0.919) and Kasthuri++ (foreground IoU: 0.927, 0.922). Indeed, even the average score of many of our models outperforms those values (Table 6). This suggests that the performance on all three datasets has probably saturated, as new architectures and training frameworks cannot improve beyond the limits inherent to semantic segmentation and the size of the datasets.

In closing, we believe further progress in mitochondria segmentation in EM will require (1) larger and more complex datasets (Wei et al., 2020), and (2) the adoption of a reproducibility checklist or set of best practices (Dodge et al., 2019) to report more comprehensive results and allow robust future comparisons.

Information Sharing Statement

The datasets utilized for the training and testing of the models presented in this work are freely available.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s12021-021-09556-1.

Acknowledgements None.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was supported by Ministerio de Ciencia, Innovación y Universidades, Agencia Estatal de Investigación, under Grants TEC2016-78052 and PID2019-109820RB-I00, MCIN/AEI/10.13039/501100011033/, co-financed by the European Regional Development Fund (ERDF), "A way of making Europe." I.A-C would like to acknowledge the support of the Leonardo Grant for Researchers and Cultural Creators, BBVA Foundation.

Code Availability The developed software that supports the findings of this study is publicly available on GitHub: https://github.com/danifranco/EM_Image_Segmentation.

Data Availability The Lucchi dataset is available at https://www.epfl.ch/labs/cvlab/data/data-em/. The Lucchi++ and Kasthuri++ datasets can be downloaded from https://sites.google.com/view/connectomics/.

Declarations

Conflicts of Interest The authors declare that they have no competing interest.

Consent for Publication Not applicable.

Ethics Approval and Consent to Participate Not applicable.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Arganda-Carreras, I., Turaga, S. C., Berger, D. R., Cireşan, D., Giusti, A., Gambardella, L. M., et al. (2015). Crowdsourcing the creation of image segmentation algorithms for connectomics. Frontiers in Neuroanatomy, 9, 142.

Badrinarayanan, V., Kendall, A., & Cipolla, R. (2017). SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), 2481–2495.

Bello, I., Fedus, W., Du, X., Cubuk, E. D., Srinivas, A., Lin, T.-Y., Shlens, J., & Zoph, B. (2021). Revisiting ResNets: Improved training and scaling strategies. arXiv preprint arXiv:2103.07579.

Buhmann, J., Krause, R., Lentini, R. C., Eckstein, N., Cook, M., Turaga, S., & Funke, J. (2018). Synaptic partner prediction from point annotations in insect brains. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 309–316.

Casser, V., Kang, K., Pfister, H., & Haehn, D. (2020). Fast mitochondria detection for connectomics. In Medical Imaging with Deep Learning.

Chaurasia, A., & Culurciello, E. (2017). LinkNet: Exploiting encoder representations for efficient semantic segmentation. In 2017 IEEE Visual Communications and Image Processing (VCIP), IEEE, pp. 1–4.

Cheng, H.-C., & Varshney, A. (2017). Volume segmentation using convolutional neural networks with limited training data. In 2017 IEEE International Conference on Image Processing (ICIP), IEEE, pp. 590–594.

Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T., & Ronneberger, O. (2016). 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 424–432.

Dai, J., Li, Y., He, K., & Sun, J. (2016). R-FCN: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems, pp. 379–387.

De Moura, M. B., dos Santos, L. S., & Van Houten, B. (2010). Mitochondrial dysfunction in neurodegenerative diseases and cancer. Environmental and Molecular Mutagenesis, 51(5), 391–405.

Dodge, J., Gururangan, S., Card, D., Schwartz, R., & Smith, N. A. (2019). Show your work: Improved reporting of experimental results. arXiv preprint arXiv:1909.03004.

Fu, H., Cheng, J., Xu, Y., Wong, D. W. K., Liu, J., & Cao, X. (2018). Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Transactions on Medical Imaging, 37(7), 1597–1605.

Fulda, S., Galluzzi, L., & Kroemer, G. (2010). Targeting mitochondria for cancer therapy. Nature Reviews Drug Discovery, 9(6), 447–464.

Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., Martinez-Gonzalez, P., & Garcia-Rodriguez, J. (2018). A survey on deep learning techniques for image and video semantic segmentation. Applied Soft Computing, 70, 41–65.

Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256.

Gu, Z., Cheng, J., Fu, H., Zhou, K., Hao, H., Zhao, Y., et al. (2019). CE-Net: Context encoder network for 2D medical image segmentation. IEEE Transactions on Medical Imaging, 38(10), 2281–2292.

Haque, I. R. I., & Neubert, J. (2020). Deep learning approaches to biomedical image segmentation. Informatics in Medicine Unlocked, 18, 100297.

He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034.

He, K., Zhang, X., Ren, S., & Sun, J. (2016a). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.

He, K., Zhang, X., Ren, S., & Sun, J. (2016b). Identity mappings in deep residual networks. In European Conference on Computer Vision, Springer, pp. 630–645.

He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969.

Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141.

Ibtehaz, N., & Rahman, M. S. (2020). MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Networks, 121, 74–87.

Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nature Methods, 18(2), 203–211.

Jégou, S., Drozdzal, M., Vazquez, D., Romero, A., & Bengio, Y. (2017). The one hundred layers tiramisu: Fully convolutional DenseNets for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 11–19.

Jin, Q., Meng, Z., Pham, T. D., Chen, Q., Wei, L., & Su, R. (2019). DUNet: A deformable network for retinal vessel segmentation. Knowledge-Based Systems, 178, 149–162.

Kasthuri, N., Hayworth, K. J., Berger, D. R., Schalek, R. L., Conchello, J. A., Knowles-Barley, S., et al. (2015). Saturated reconstruction of a volume of neocortex. Cell, 162(3), 648–661.

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1097–1105.

Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., et al. (2017). A survey on deep learning in medical image analysis.

Noh, H., Hong, S., & Han, B. (2015). Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1520–1528.

Oztel, I., Yolcu, G., Ersoy, I., White, T., & Bunyak, F. (2017). Mitochondria segmentation in electron microscopy volumes using deep convolutional neural network. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), IEEE, pp. 1195–1200.

Poole, A. C., Thomas, R. E., Andrews, L. A., McBride, H. M., Whitworth, A. J., & Pallanck, L. J. (2008). The PINK1/Parkin pathway regulates mitochondrial morphology. Proceedings of the National Academy of Sciences, 105(5), 1638–1643.

Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, pp. 234–241.

Roy, A. G., Navab, N., & Wachinger, C. (2018). Concurrent spatial and channel squeeze & excitation in fully convolutional networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention.
Computer-Assisted Intervention, Springer, pp. 421–429. Medical Image Analysis, 42, 60–88. Roy, A. G., Navab, N., & Wachinger, C. (2018). Recalibrating fully Liu, J., Li, W., Xiao, C., Hong, B., Xie, Q., & Han, H. (2018). Auto- convolutional networks with spatial and channel “squeeze and matic detection and segmentation of mitochondria from sem excitation” blocks. IEEE Transactions on Medical Imaging 38(2), images using deep neural network. In 2018 40th Annual Inter- 540–549. national Conference of the IEEE Engineering in Medicine and Schlemper, J., Oktay, O., Schaap, M., Heinrich, M., Kainz, B., Glocker, Biology Society (EMBC), IEEE, pp. 628–631. B., & Rueckert, D. (2019). Attention gated networks: Learning to Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks leverage salient regions in medical images. Medical Image Analy- for semantic segmentation. In Proceedings of the IEEE Conference sis, 53, 197–207. on Computer Vision and Pattern Recognition, pp. 3431–3440. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional Lucchi, A., Smith, K., Achanta, R., Knott, G., & Fua, P. (2011). networks for large-scale image recognition. arXiv preprint Supervoxel-based segmentation of mitochondria in em image arXiv:14091556 stacks with learned shape features. IEEE Transactions on Medi- Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., cal Imaging, 31(2), 474–486. Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper Lucchi, A., Li, Y., Smith, K., & Fua, P. (2012). Structured image seg- with convolutions. In Proceedings of the IEEE Conference on mentation using kernelized features. In European Conference on Computer Vision and Pattern Recognition, pp. 1–9. Computer Vision, Springer, pp. 400–413. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Lucchi, A., Li, Y., & Fua, P. (2013). Learning for structured predic- Rethinking the inception architecture for computer vision. 
In Pro- tion using approximate subgradient descent with working sets. ceedings of the IEEE Conference on Computer Vision and Pattern In Proceedings of the IEEE Conference on Computer Vision and Recognition, pp. 2818–2826. Pattern Recognition, pp. 1987–1994. Tait, S. W., & Green, D. R. (2012). Mitochondria and cell signalling. Lucchi, A., Becker, C., Neila, P. M., & Fua, P. (2014a). Exploiting Journal of Cell Science, 125(4), 807–815. enclosing membranes and contextual cues for mitochondria seg- Wallace, D. C. (2012). Mitochondria and cancer. Nature Reviews Cancer, mentation. In International Conference on Medical Image Com- 12(10), 685–698. puting and Computer-Assisted Intervention, Springer, pp. 65–72. Wei, D., Lin, Z., Franco-Barranco, D., Wendt, N., Liu, X., Yin, W., Lucchi, A., Márquez-Neila, P., Becker, C., Li, Y., Smith, K., Knott, G., Huang, X., Gupta, A., Jang, W.-D., Wang, X. et al. (2020). & Fua, P. (2014b). Learning Structured Models for Segmentation MitoEM Dataset: Large-Scale 3D Mitochondria Instance Segmen- of 2-D and 3-D Imagery. IEEE Transactions on Medical Imaging, tation from EM Images. In International Conference on Medical 34(5), 1096–1110. Image Computing and Computer-Assisted Intervention, Springer, Meijering, E. (2020). A bird’s-eye view of deep learning in bioimage anal- pp. 66–76. ysis. Computational and Structural Biotechnology Journal, 18, 2312. Xiao, C., Chen, X., Li, W., Li, L., Wang, L., Xie, Q., & Han, H. (2018). Meyer, F. (1994). Topographic distance and watershed lines. Signal Automatic mitochondria segmentation for EM data using a 3D super- Processing, 38(1), 113–125. vised convolutional network. Frontiers in Neuroanatomy, 12, 92. Milletari, F., Navab, N., & Ahmadi, S.-A. (2016). V-net: Fully convo- Zhou, Z., Siddiquee, M. M. R., Tajbakhsh, N., & Liang, J. (2018). lutional neural networks for volumetric medical image segmenta- Unet++: A nested U-Net architecture for medical image seg- tion. 
In 2016 fourth International Conference on 3D Vision (3DV), mentation. In Deep Learning in Medical Image Analysis and IEEE, pp. 565–571. Multimodal Learning for Clinical Decision Support. Springer, Minaee, S., Boykov, Y. Y., Porikli, F., Plaza, A. J., Kehtarnavaz, N., & pp. 3–11. Terzopoulos, D. (2021). Image segmentation using deep learning: Zhuang, J. (2018). LadderNet: Multi-path networks based on U-Net for A survey. IEEE Transactions on Pattern Analysis and Machine medical image segmentation. arXiv preprint arXiv:1810.07810 Intelligence. Moen, E., Bannon, D., Kudo, T., Graf, W., Covert, M., & Van Valen, Publisher’s Note Springer Nature remains neutral with regard to D. (2019). Deep learning for cellular image analysis. Nature meth- jurisdictional claims in published maps and institutional affiliations. ods, 1–14. 1 3
Neuroinformatics – Springer Journals
Published: Apr 1, 2022
Keywords: Electron microscopy; Mitochondria; Semantic segmentation; Deep learning; Bioimage analysis