MEDnet, a neural network for automated detection of avascular area in OCT angiography

Yukun Guo; Acner Camino; Jie Wang; David Huang; Thomas S. Hwang; Yali Jia

doi:10.1364/BOE.9.005147

1. Introduction

Diabetic retinopathy (DR) is a leading cause of blindness [1,2]. Capillary damage from hyperglycemia causes vision loss through downstream effects, such as retinal ischemia, edema, and neovascularization. Structural optical coherence tomography (OCT) has been used to objectively guide the treatment of diabetic macular edema, however, the disease severity and treatment threshold are still largely dependent on subjective interpretation of fluorescein angiography (FA) [3]. The recently developed OCT angiography (OCTA) [4,5] provides label-free, three-dimensional images of retinal and choroidal circulation with capillary detail. Not only is it safer, faster, and less expensive than conventional dye-based angiography, OCTA provides the potential of giving clinicians objective tools for determining severity of disease by detecting and quantifying neovascularization and non-perfusion (avascular) areas.

Our prior studies [6–9] have demonstrated that the avascular area of the superficial capillary complex in the retina is an important indicator of DR stage and progression. In order to accurately measure the avascular area, OCTA flow pixels need to be classified properly in order to correctly identify regions with abnormal inter-vascular space. However, discriminating vascular signal in OCTA is a challenging task owing to the dependence of background flow signal on local tissue reflectivity and the confounding effects of eye motion [10,11].

In our previous work, we proposed a method to automatically quantify the avascular area on 3 × 3 mm² OCTA image of the macula [6,9]. Today, high-speed OCT systems [12] and efficient OCTA algorithms [13] have made possible the acquisition of considerably larger fields of view (6 × 6 mm² or more); unfortunately, larger fields of view introduce new image processing challenges to the classification of flow pixels. For instance, 6 × 6 mm² OCT angiograms are more likely to contain shadows caused by vitreous floaters or pupil vignetting, and are more vulnerable to shadowing effects due to lower sampling rates. Moreover, the 6 × 6 mm² area encompasses vasculature on two sides of the fovea (optic disc vs temporal side) in which the normal inter-vascular space is significantly different, demanding a more sophisticated detection/segmentation algorithm.

This segmentation task can be considered a pixel-wise classification problem and is amenable to machine learning approaches. Semantic image segmentation with deep convolutional net is an active research field and a lot of deep network solutions have been proposed. Fully convolutional networks [14] were proposed to transform fully connected layers in CNNs into convolutional layers, in order to convert the network output into a heat map. Because the encoding module reduces the resolution of the input by a factor of 32, it is difficult for the decoding module to produce a fine segmentation map. To solve the loss of resolution, across-layer connections have been used in fully convolutional solutions. A successful FCN called U-net [15] added a contracting path to capture context and a symmetric expanding path to identify the location of objects with precision. In order to improve the segmentation accuracy, Deeplab used atrous convolution kernels [16,17], which not only can reduce the loss of resolution but also the number of trainable parameters. Using the state of the art network structure (e.g. VGG [18], ResNet [19], and Inception [20]) as a part of semantic segmentation network can streamline the design process of network and can take the advantage of the superior performance of existing networks. For example, Segnet [21] borrowed VGG16 network structure and built an efficient semantic segmentation network.

Recently, several machine learning solutions have successfully segmented pathological areas with abnormal tissue reflectance characteristics in OCT images [22–27]. However, the metrics based on OCTA image analysis can complement OCT for earlier assessment of ocular diseases with a vascular component, such as DR. Pavle et al. had proved deep convolution network can segment the foveal microvasculature [28] on OCTA images. In this study, we developed a novel deep learning architecture, named multi-scaled encoder-decoder neural network (MEDnet), with a powerful multi-scale feature extraction capability incorporated to segment the non-perfusion area in 6 × 6 mm² angiograms of superficial retinal flow.

2. Methods

The segmentation task consists of pixel-wise classification into two classes: vascular versus avascular area. Our solution, called MEDnet, is a fully convolutional network containing a layer with multi-scale atrous convolutions of different dilation rates aimed at generating feature maps sensitive to the different scales of non-perfusion. This is necessary because the size of avascular areas can be variable and because pixels contained in them might encounter noise in their vicinity, potentially confounding the classification process. The inclusion of this type of layer also helps to reduce the size of the network and, consequently, the number of model parameters to only 890,000.

2.1 Data acquisition

OCTA scans were acquired over a 6 × 6 mm² region using a 70-kHz OCT commercial AngioVue system (RTVue-XR; Optovue, Inc.) centered at 840nm with a full-width half maximum bandwidth of 45nm. Two repeated B-scans were taken at each of 304 raster positions and each B-scan comprised 304 A-lines. The commercial version of the split-spectrum amplitude decorrelation angiography (SSADA) algorithm [13] was used to calculate OCTA data. Then, the retinal layers (Fig. 1(A)) and the boundaries of retinal vascular plexuses (Fig. 1(B)) were segmented using a graph search method [29]. The reflectance (Fig. 1(C-D)) and angiographic images (Fig. 1(E-F)) of the superficial vascular complex (SVC) angiogram were obtained by average and maximum projection of the corresponding volumetric data within the slab of interest, respectively.

Fig. 1 Image processing for the generation of en face visualization of the OCT tissue reflectance and OCT angiography of the superficial vascular complex (SVC) slab. (A) Results of the layer segmentation on a B-scan, delineating seven retinal interfaces. (B) Definition of the SVC boundaries. (C-D) The mean projection of the OCT data within the SVC produce an en face visualization of the retinal reflectivity. (E-F) The maximum projection of the OCTA data within the SVC slab produces an en face image of the superficial retinal flow in the macular region. SVC – Superficial vascular complex. DVC – Deep vascular complex. B – Boundary.

Download Full Size | PDF

2.2 Network architecture

The architecture of the proposed network is illustrated in Fig. 2(A). It can be divided into two parts, encoder and decoder. In the encoder section, a multi-scale block formed by three atrous convolutions with different dilation rates (Fig. 2(B)) was employed to extract multi-scale features from input images. The outputs of the blocks containing the atrous convolutions were concatenated across the depth dimension into a single tensor before being fed to the next layer. After that, each “Conv block” consisted of successive convolutions with 3 × 3-pixel kernel followed by a batch normalization stage and a max pooling operation. The batch normalization was applied to accelerate deep network training and reduce overfitting. The role of the convolution blocks (conv_2 and conv_3) was to encode the image whereas dConv_5, dConv_6 and dConv_7 made up the decoder. The decoder blocks received the outputs of encoder blocks through across-layer connections that allowed the resolution of the output to be preserved and stabilized the training phase [15,30]. A Sigmoid activation function was used in the output layer for pixel-wise classification.

Fig. 2 (A) Network architecture of multi-scaled encoder-decoder neural network (MEDnet). (B) Kernel sizes of atrous convolution blocks with different dilation rates.

Download Full Size | PDF

2.3 Network parameters

The parameters of each layer are listed in Table 1. Convolutional layers had kernels of size 3 × 3 pixels except for the 1 × 1 pixel convolution operations included in the atrous convolution block with the purpose of reducing the depth of the output, and hence, the computational cost. The default atrous convolution itself had a 3 × 3-pixel kernel size. The dilation rate n meant that n-1 zeros have been padded in between the rows and columns of the original filter (Fig. 2(B)).

Table 1. Parameters of MEDnet’s layers

View Table | View all tables in this article

2.4 Training

2.4.1 Training data

The training data consisted of en face angiograms of the SVC (Fig. 3(A)), OCT reflectance of the same slab (Fig. 3(B)), and the corresponding manually segmented non-perfusion binary map (Fig. 3(C)). During the imaging process, occlusion of the back-scattered signal by anterior objects (eyelashes, vitreous floaters, pupil vignetting) might cause loss of the flow signal at the corresponding position and this may be responsible of pixel misclassification. In order to prevent the potentially confounding impact of shadowed areas, we incorporated the corresponding OCT reflectance images in the training stage. Three expert graders manually delineated non-perfusion area maps, and the ground truth maps were generated by the pixel-wise vote on the three manually labeled maps. In order to alleviate the limitations associated with a small training data set, data augmentation techniques [16,17,31] (flipping, noise addition, and rotation) were used to increase the amount of training data.

Fig. 3 Representative data used for training. (A) The en face angiogram of the superficial vascular complex from a patient with diabetic retinopathy. (B) The reflectance image acquired by projecting the reflectance OCT data within the same slab used in (A). (C) The ground truth map of the avascular area. (D) Manually segmented avascular area, overlaid on the superficial vascular complex angiogram.

Download Full Size | PDF

2.4.2 Loss function and optimizer

The loss function used in training stage was the mean square error (Eq. (1)) with L2 regularization loss (Eq. (2)). Mean square error can provide the distance between the actual label and the predicted value whereas L2 regularization loss can measure the scale of the model and avoid overfitting [31].The total loss is the sum of the mean square error and L2 regularization loss (Eq. (3)).

E = \frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - {\hat{y}}_{i})}^{2}

R = \sum_{i = 1}^{p} w_{i}^{2}

T = E + R

where

E

is the mean square error,

N

is the number of samples in a training batch,

y

is the label,

\hat{y}

is the predicted value,

w

is weight factor of the model,

p

is the total number of weight factor of the model,

R

is L2 regularization loss and

T

is the total loss.

In the training phase, we used the stochastic gradient descent optimizer (SGD) with an exponential decay learning rate (Eq. (4)) to optimize the total loss.

l_{t} = l * a^{\frac{t}{s}}

where

t

is the training step (250 per epoch),

l_{t}

is the learning rate of

t

-th training step,

a

is the decay factor,

l

is the initial learning rate and

s

is the step decay responsible for reducing the learning rate.

3. Results

3.1 Data set

To train MEDnet we collected OCTA volume data from 76 healthy eyes and 104 eyes with DR, for a total of 180 en face angiograms of SVC (304 × 304 pixels). The DR eyes were arranged by disease severity into three sub-groups, severe DR (include severe nonproliferative diabetic retinopathy (NPDR) and proliferative diabetic retinopathy (PDR)), mild to moderate NPDR, and diabetes without retinopathy (Table 2). These images were annotated for the avascular area by three expert graders. We randomly divided the data set into two groups, 140 samples in the training set and 40 samples in the test set. Both training set and test set have same disease severity distribution. After application of randomized data augmentation operations (Gaussian noise (mean = 0, sigma = 0.5), salt and pepper (salt = 0.001, pepper = 0.001), horizontal flipping, vertical flipping and 180° rotation) the training set was increased to 750 images. During the training phase, 10% of the images were isolated for cross-validation.

Table 2. Data distribution of data set for MEDnet training (the number of subjects)

View Table | View all tables in this article

3.2 Implementation

MEDnet was implemented in Python 3.6 with Keras (Tensorflow-backend) and run on a PC with an Intel i7 CPU, GTX 1080Ti GPU, and 32G RAM. The hyper-parameters are specified in Table 3. We stop training when the accuracy rate became stable in the learning curve, and found that the network can achieve the best generalization performance by the 15-th training epoch. Three samples per training batch were suitable for the memory space available in our hardware. We used a large initial learning rate (L = 0.1) to acquire a high convergence speed, a decay factor a = 0.9 and a step decay s = 200 to obtain a smooth decline in learning rates.

Table 3. The hyper-parameters of MEDnet

View Table | View all tables in this article

3.3 Performance evaluation

To evaluate the performance of MEDnet we applied the trained model on the test set. Several factors can affect the performance of the network, principally a low OCT signal strength index (SSI) and the severity of the disease. To evaluate these dependencies we separated the test set into two groups of twenty eyes each. In the first group we divided images into 4 different sub-groups with different SSI ranges; each group contained five images (Table 3). The other group was arranged by disease severity into another four sub-groups of five scans, containing healthy control subjects, diabetes without retinopathy, mild to moderate NPDR, and severe DR respectively. Since the output of MEDnet consists of a probability map with a value ranging from 0 to 1, a threshold of 0.5 is set to represent the non-perfusion area from the background to compare it with the manual avascular area map. The training phase takes less than 30 minutes on a single NVidia 1080ti GPU, and segment one image takes 2.5 seconds on Intel Core i7 CPU.

The accuracy, precision, recall and F1-score (Eq. (5)) were evaluated in Table 4,

\begin{matrix} A c c u r a c y = \frac{T P + T N}{T P + F P + T N + F N} \\ P r e c i s i o n = \frac{T P}{T P + F P} \\ \begin{matrix} R e c a l l = \frac{T P}{T P + F N} \\ F 1 = 2 \times \frac{P r e c i s i o n \times R e c a l l}{P r e c i s i o n + R e c a l l} \end{matrix} \end{matrix}

where TP are true positives (as in correctly predicted non-perfusion area pixels), TN are true negatives, FP are false positives and FN are false negatives. The evaluation shows that classification accuracy, precision (the ability to not classify normal areas as diseased), and F1-score deteriorated for high disease severity and low SSI, which was expected. The recall was very close to one, indicating excellent sensitivity, as almost all of the avascular area pixels were detected. The peculiarity that precision was lower than recall indicated that the inaccuracies found were mostly caused by the avascular area size being overestimated with respect to the ground truth. Because the network did not perform equally in avoiding false positive and false negative pixels, the F1-score was a better metric to describe network performance as it took both observations into consideration.

Table 4. Agreement (in pixels) between MEDnet and Ground Truth of the Avascular Area (mean ± standard deviation)

View Table | View all tables in this article

The avascular area in the healthy controls is concentrated in the macular area (Fig. 4(A4, B4)), while in the DR groups, there are more avascular areas outside the FAZ and randomly distributed over the SVC angiograms (Fig. 4(C4, D4)). Therefore, the cumulative classification error on the severe DR group was larger than for healthy controls, as it was more likely to exhibit mismatch with the subjective manual segmentation (Table 4). With regards to data with different signal strengths, our method could achieve good accuracy for all scans within the range of SSI values recommended by the manufacturer (SSI>55) but better accuracy for high quality scans (Table 4). This is owing to the fact that low quality scans have a larger prevalence of artifacts causing artificially high OCTA signal (such as in motion artifacts) or signal loss due to pupil vignetting (Fig. 4(D2-D3)). Moreover, the low quality scans usually exhibited deteriorated vascular integrity and might have biased the classification towards larger avascular areas.

Fig. 4 Results of the avascular area detection. (A1-D1) En face superficial vascular complex (SVC) angiograms of healthy and non-proliferative diabetic retinopathy (NPDR) subjects. (A2-D2) Probability maps of the avascular areas generated by MEDnet overlaid on the en face angiograms. (A3-D3) Region detected as avascular area overlaid on the en face angiograms. (A4-D4) Ground truth of the avascular areas generated manually by an expert grader, overlaid on the en face angiograms.

Download Full Size | PDF

3.4 Performance on wide field of view OCTA

In a disease like DR it is useful to gain access to the largest available field of view (FOV). Although the FOV of a single OCTA scan is limited by hardware capabilities, software registration and stitching of images acquired in different retinal locations can assist to generate ultra-wide FOV OCTA. For this purpose, a DR subject was imaged on the optic disc, macula, and temporal retina using a wide-field OCTA prototype housed at the Center for ophthalmic optics and lasers of Oregon Health and Science University and reported previously [12]. For wide-field OCTA, 800 OCT B-scans were acquired at 400 positions over an area of 8 × 10 mm² (vertical priority) and the SSADA algorithm was used to generate the angiograms. Detection of the avascular area was applied on the 8 × 10 mm² images and they were later rigidly montaged to represent an area of 20 × 10 mm² (Fig. 5). The network was able to detect the avascular area apparent to a human grader- despite the large prevalence of noise and without having been trained for images from this OCT instrument- over a considerably larger FOV.

Fig. 5 (A) Ultra-wide field OCTA of an eye with diabetic retinopathy obtained by montaging three 8 × 10mm² wide field OCTA en face angiograms of the superficial vascular complex (SVC). (B) Avascular area detected on the eye represented in (A) overlaid on the en face angiogram of the SVC.

Download Full Size | PDF

4. Discussion

We designed a deep convolutional neural network named MEDnet for automated quantification of retinal non-perfusion in DR using OCTA in 6 × 6 mm² macular angiograms of the SVC. To the best of our knowledge this is the first deep learning solution used to segment pathological areas in the retina using OCTA. The network uses a multi-scale block with atrous convolutions to enhance the multi-scale feature extraction capabilities [16,17] and across-layer connections to preserve lateral resolution. The features of retinal vasculature in OCTA en face images are very complex and difficult to describe using traditional computer vision methods. However, deep convolutional networks have strong feature representation capabilities. Before us Pavle et al. [28] had used deep learning based method to successfully segment the foveal microvasculature on OCTA en face images. In this work, MEDnet proved deep learning can generalize well to the complex vasculature of OCTA images and accurately localize the regions with loss of perfusion.

The experimental results indicated that MEDnet has good performance (F1-score>80%) for scans of different disease severity and image quality. Although the performance is satisfying, it should be interpreted with care, as the manual segmentation is done using subjective criteria. The threshold that determines whether an inter-vascular space is an avascular area is arbitrary. Moreover, owing to the complexity of retinal angiograms and the amount of detail available, expert graders are unlikely to segment the whole extent of the avascular area. For this reason, the area calculated by the network tends to be larger than the area delineated manually.

Another limitation of this method is related to the confounding factor introduced by low OCT signal artifacts. Since both angiographic and reflectance en face images are fed to the network, it has the ability to identify OCTA signal loss owing to vignetting and vitreous floaters and prevent their classification as avascular areas (Fig. 6(A1-A4)). In some cases, the avascular area could extend to areas affected by vignetting (Fig. 6(B1-B4)). Potential misclassification can occur there, which it is difficult to evaluate, given that the difference between vignetting and avascular areas are not very evident to human graders either. Although some work has been done in reflectance-adjusted thresholding [10,32] and boosting of OCTA signal underneath occluding material [33], the OCT signal level at which OCTA data is irretrievable is still unknown. A mathematical model is still necessary to identify these areas and further exclude them from analysis. Such model could be constructed from scans acquired from healthy eyes and containing local shadows from vitreous floaters or pupil vignetting, but with signal strength above the threshold recommended by the manufacturer.

Fig. 6 Results of the avascular area detection. (A1-C1) reflectance images generated by the mean projection of the reflectance OCT data within the superficial vascular complex (SVC) of (A) one diabetic subject without diabetic retinopathy (DR) and (B-C) two subjects with severe DR. (A2-C2) SVC angiograms. (A3-C3) Probability maps of the avascular areas generated by MEDnet overlaid on the corresponding en face angiograms. (A4-D4) Region detected as avascular area overlaid on the corresponding en face angiograms. The positions of arrows in A1-B4 are shadow area caused by vignetting or vitreous floaters. The positions of arrows in C1-C4 are motion artifacts that cause detection errors.

Download Full Size | PDF

Besides optical signal occlusion, other artifacts are caused by microsaccadic eye motion during scanning. These artifacts are very common in OCTA [11] and are apparent in en face angiograms as continuous lines crossing entire B-scans. As expected, microsaccadic artifacts can confuse the network into classifying these pixels as vascular, owing to their large OCTA signal level. (Fig. 6(C1-C4)).

The data presented in this manuscript corresponds to the superficial vascular complex. However, DR effects on ocular circulation are not restrained to superficial retinal flow. In fact, the capillaries forming the intermediate and deep vascular plexuses have been reported to be strongly affected by DR disease progression [8]. Integrated with projection-resolved (PR) OCTA [34], there is potential in the future to extend the algorithm’s capabilities to all three retinal plexuses found in the macular region.

5. Conclusions

In summary, we have reported a deep learning solution for the segmentation of avascular areas in the retina of DR eyes using OCTA. The network could classify pixels with confidence owing to access to multi-scale context and preservation of the lateral resolution. The inclusion of atrous convolutions with different dilations allowed it to generate features with different receptive fields without increasing the computational load. Consequently, the multi-scale feature maps offered more accurate decision making in the classification process, despite the prevalence of noise in avascular areas. Moreover, the excellent performance on ultra-wide field OCTA highlights the potential clinical applications of this deep learning configuration for the early detection and progression assessment of DR.

Funding

National Institutes of Health (R01 EY027833, R01 EY024544, R01 EY023285, DP3 DK104397, P30 EY010572); Unrestricted Departmental Funding Grant and William & Mary Greve Special Scholar Award from Research to Prevent Blindness (New York, NY).

Disclosures

Oregon Health & Science University (OHSU), David Huang and Yali Jia, have a significant financial interest in Optovue, Inc. These potential conflicts of interest have been reviewed and managed by OHSU.

References

1. A. M. Joussen, V. Poulaki, M. L. Le, K. Koizumi, C. Esser, H. Janicki, U. Schraermeyer, N. Kociok, S. Fauser, B. Kirchhof, T. S. Kern, and A. P. Adamis, “A central role for inflammation in the pathogenesis of diabetic retinopathy,” FASEB J. 18(12), 1450–1452 (2004). [CrossRef] [PubMed]

2. D. A. Antonetti, R. Klein, and T. W. Gardner, “Diabetic Retinopathy,” N. Engl. J. Med. 366(13), 1227–1239 (2012). [CrossRef] [PubMed]

3. M. M. Wessel, N. Nair, G. D. Aaker, J. R. Ehrlich, D. J. D’Amico, and S. Kiss, “Peripheral retinal ischaemia, as evaluated by ultra-widefield fluorescein angiography, is associated with diabetic macular oedema,” Br. J. Ophthalmol. 96(5), 694–698 (2012). [CrossRef] [PubMed]

4. Y. Jia, S. T. Bailey, T. S. Hwang, S. M. McClintic, S. S. Gao, M. E. Pennesi, C. J. Flaxel, A. K. Lauer, D. J. Wilson, J. Hornegger, J. G. Fujimoto, and D. Huang, “Quantitative optical coherence tomography angiography of vascular abnormalities in the living human eye,” Proc. Natl. Acad. Sci. U.S.A. 112(18), E2395–E2402 (2015). [CrossRef] [PubMed]

5. N. Hussain, A. Hussain, M. Zhang, J. P. Su, G. Liu, T. S. Hwang, S. T. Bailey, and D. Huang, “Diametric measurement of foveal avascular zone in healthy young adults using Optical Coherence Tomography Angiography,” Int. J. Retina Vitreous 2(OCT), 27–36 (2016). [CrossRef] [PubMed]

6. M. Zhang, T. S. Hwang, C. Dongye, D. J. Wilson, D. Huang, and Y. Jia, “Automated Quantification of Nonperfusion in Three Retinal Plexuses Using Projection-Resolved Optical Coherence Tomography Angiography in Diabetic Retinopathy,” Invest. Ophthalmol. Vis. Sci. 57(13), 5101–5106 (2016). [CrossRef] [PubMed]

7. T. S. Hwang, Y. Jia, S. S. Gao, S. T. Bailey, A. K. Lauer, C. J. Flaxel, D. J. Wilson, and D. Huang, “Optical Coherence Tomography Angiography Features of Diabetic Retinopathy,” Retina 35(11), 2371–2376 (2015). [CrossRef] [PubMed]

8. T. S. Hwang, M. Zhang, K. Bhavsar, X. Zhang, J. P. Campbell, P. Lin, S. T. Bailey, C. J. Flaxel, A. K. Lauer, D. J. Wilson, D. Huang, and Y. Jia, “Visualization of 3 distinct retinal plexuses by projection-resolved optical coherence tomography angiography in diabetic retinopathy,” JAMA Ophthalmol. 134(12), 1411–1419 (2016). [CrossRef] [PubMed]

9. T. S. Hwang, A. M. Hagag, J. Wang, M. Zhang, A. Smith, D. J. Wilson, D. Huang, and Y. Jia, “Automated quantification of nonperfusion areas in 3 vascular plexuses with optical coherence tomography angiography in eyes of patients with diabetes,” JAMA Ophthalmol. 136(8), 929–936 (2018). [CrossRef] [PubMed]

10. A. Camino, Y. Jia, G. Liu, J. Wang, and D. Huang, “Regression-based algorithm for bulk motion subtraction in optical coherence tomography angiography,” Biomed. Opt. Express 8(6), 3053–3066 (2017). [CrossRef] [PubMed]

11. A. Camino, M. Zhang, S. S. Gao, T. S. Hwang, U. Sharma, D. J. Wilson, D. Huang, and Y. Jia, “Evaluation of artifact reduction in optical coherence tomography angiography with real-time tracking and motion correction technology,” Biomed. Opt. Express 7(10), 3905–3915 (2016). [CrossRef] [PubMed]

12. G. Liu, J. Yang, J. Wang, Y. Li, P. Zang, Y. Jia, and D. Huang, “Extended axial imaging range, widefield swept source optical coherence tomography angiography,” J. Biophotonics 10(11), 1464–1472 (2017). [CrossRef] [PubMed]

13. Y. Jia, O. Tan, J. Tokayer, B. Potsaid, Y. Wang, J. J. Liu, M. F. Kraus, H. Subhash, J. G. Fujimoto, J. Hornegger, and D. Huang, “Split-spectrum amplitude-decorrelation angiography with optical coherence tomography,” Opt. Express 20(4), 4710–4725 (2012). [CrossRef] [PubMed]

14. J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence 39(4), 3431–3440 (2015)

15. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi, eds. (Springer International Publishing, Cham, 2015), pp. 234–241.

16. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs,” IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018). [CrossRef] [PubMed]

17. L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, “Rethinking Atrous Convolution for Semantic Image Segmentation,” CoRR abs/1706.05587 (2017).

18. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint https://arxiv.org/abs/1409.1556 (2014).

19. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition(2016), pp. 770–778.

20. C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, “Inception-v4, inception-resnet and the impact of residual connections on learning,” in AAAI (2017), p. 12.

21. V. Badrinarayanan, A. Kendall, and R. Cipolla, “Segnet: A deep convolutional encoder-decoder architecture for image segmentation,” arXiv preprint, https://arxiv.org/abs/1511.00561 (2015).

22. L. Fang, D. Cunefare, C. Wang, R. H. Guymer, S. Li, and S. Farsiu, “Automatic segmentation of nine retinal layer boundaries in OCT images of non-exudative AMD patients using deep learning and graph search,” Biomed. Opt. Express 8(5), 2732–2744 (2017). [CrossRef] [PubMed]

23. P. P. Srinivasan, L. A. Kim, P. S. Mettu, S. W. Cousins, G. M. Comer, J. A. Izatt, and S. Farsiu, “Fully automated detection of diabetic macular edema and dry age-related macular degeneration from optical coherence tomography images,” Biomed. Opt. Express 5(10), 3568–3577 (2014). [CrossRef] [PubMed]

24. S. P. Karri, D. Chakraborty, and J. Chatterjee, “Transfer learning based classification of optical coherence tomography images with diabetic macular edema and dry age-related macular degeneration,” Biomed. Opt. Express 8(2), 579–592 (2017). [CrossRef] [PubMed]

25. A. Camino, Z. Wang, J. Wang, M. E. Pennesi, P. Yang, D. Huang, D. Li, and Y. Jia, “Deep learning for the segmentation of preserved photoreceptors on en face optical coherence tomography in two inherited retinal diseases,” Biomed. Opt. Express 9(7), 3092–3105 (2018). [CrossRef] [PubMed]

26. Z. Wang, A. Camino, A. M. Hagag, J. Wang, R. G. Weleber, P. Yang, M. E. Pennesi, D. Huang, D. Li, and Y. Jia, “Automated detection of preserved photoreceptor on optical coherence tomography in choroideremia based on machine learning,” J. Biophotonics 11(5), e201700313 (2018). [CrossRef] [PubMed]

27. Z. Wang, A. Camino, M. Zhang, J. Wang, T. S. Hwang, D. J. Wilson, D. Huang, D. Li, and Y. Jia, “Automated detection of photoreceptor disruption in mild diabetic retinopathy on volumetric optical coherence tomography,” Biomed. Opt. Express 8(12), 5384–5398 (2017). [CrossRef] [PubMed]

28. P. Prentašic, M. Heisler, Z. Mammo, S. Lee, A. Merkur, E. Navajas, M. F. Beg, M. Šarunic, and S. Lončarić, “Segmentation of the foveal microvasculature using deep learning networks,” J. Biomed. Opt. 21(7), 75008 (2016). [CrossRef] [PubMed]

29. M. Zhang, J. Wang, A. D. Pechauer, T. S. Hwang, S. S. Gao, L. Liu, L. Liu, S. T. Bailey, D. J. Wilson, D. Huang, and Y. Jia, “Advanced image processing for optical coherence tomographic angiography of macular diseases,” Biomed. Opt. Express 6(12), 4661–4675 (2015). [CrossRef] [PubMed]

30. A. G. Roy, S. Conjeti, S. P. K. Karri, D. Sheet, A. Katouzian, C. Wachinger, and N. Navab, “ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks,” Biomed. Opt. Express 8(8), 3627–3642 (2017). [CrossRef] [PubMed]

31. A. Y. Ng, “Feature selection, L 1 vs. L 2 regularization, and rotational invariance,” presented at the Proceedings of the twenty-first international conference on Machine learning, Banff, Alberta, Canada2004. [CrossRef]

32. S. S. Gao, Y. Jia, L. Liu, M. Zhang, H. L. Takusagawa, J. C. Morrison, and D. Huang, “Compensation for Reflectance Variation in Vessel Density Quantification by Optical Coherence Tomography Angiography,” Invest. Ophthalmol. Vis. Sci. 57(10), 4485–4492 (2016). [CrossRef] [PubMed]

33. Q. Zhang, F. Zheng, E. H. Motulsky, G. Gregori, Z. Chu, C.-L. Chen, C. Li, L. de Sisternes, M. Durbin, P. J. Rosenfeld, and R. K. Wang, “A Novel Strategy for Quantifying Choriocapillaris Flow Voids Using Swept-Source OCT Angiography,” Invest. Ophthalmol. Vis. Sci. 59(1), 203–211 (2018). [CrossRef] [PubMed]

34. M. Zhang, T. S. Hwang, J. P. Campbell, S. T. Bailey, D. J. Wilson, D. Huang, and Y. Jia, “Projection-resolved optical coherence tomographic angiography,” Biomed. Opt. Express 7(3), 816–828 (2016). [CrossRef] [PubMed]

Blocks	Layers	Filter size	Stride	Dilation rate	Filter number	Padding mode	activation
Conv_1	Convolution	3x3	1	1	64	same	ReLU
AConv_(1,2,3)	Convolution	1x1	1	1	32	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	3x3	1	(1,2,4)	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	1x1	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
Merge	Merge_concat	Concatenate the outputs of AConv_(1,2,3)
Conv_(2,3)	Convolution	3x3	1	1	32	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	MaxPooling	2x2	2	-	-	valid	-
Conv_4	Convolution	3x3	1	1	64	same	ReLU
Conv_4	Batch Normalization	-	-	-	-	-	-
dConv_5	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Merge_concat	Concatenate the outputs of Batch Normalization and Conv_3
	Convolution	3x3	1	1	64	same	ReLU
dConv_(6,7)	UpSampling	2x2	-	-	-	-	-
	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Merge_concat	Concatenate the outputs of Batch Normalization and (Conv_2, Merge)
	Convolution	3x3	1	1	64	same	ReLU
Output	Convolution	3x3	1	1	32	same	ReLU
	Convolution	3x3	1	1	16	same	ReLU
	Convolution	3x3	1	1	1	same	sigmoid

		Accuracy	Precision	Recall	F1-score	Detected Avascular area(mm²)
Disease	Control	0.89 ± 0.04	0.84 ± 0.05	1.00 ± 0.00	0.91 ± 0.03	0.37 ± 0.18
	Diabetes without retinopathy	0.79 ± 0.11	0.77 ± 0.11	0.99 ± 0.00	0.87 ± 0.07	0.69 ± 0.59
	Mild to moderate NPDR	0.87 ± 0.07	0.85 ± 0.09	0.99 ± 0.00	0.91 ± 0.05	1.08 ± 0.69
	Severe DR	0.76 ± 0.06	0.68 ± 0.06	1.00 ± 0.00	0.81 ± 0.04	2.03 ± 1.11
SSI	55-61	0.76 ± 0.09	0.75 ± 0.07	1.00 ± 0.00	0.85 ± 0.06	2.84 ± 1.88
	61-66	0.77 ± 0.11	0.75 ± 0.14	1.00 ± 0.01	0.85 ± 0.08	1.90 ± 1.27
	66-71	0.86 ± 0.06	0.83 ± 0.08	0.99 ± 0.01	0.90 ± 0.05	1.12 ± 0.94
	>71	0.85 ± 0.05	0.80 ± 0.07	0.99 ± 0.00	0.89 ± 0.04	0.65 ± 0.60

Blocks	Layers	Filter size	Stride	Dilation rate	Filter number	Padding mode	activation
Conv_1	Convolution	3x3	1	1	64	same	ReLU
AConv_(1,2,3)	Convolution	1x1	1	1	32	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	3x3	1	(1,2,4)	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	1x1	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
Merge	Merge_concat	Concatenate the outputs of AConv_(1,2,3)
Conv_(2,3)	Convolution	3x3	1	1	32	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	MaxPooling	2x2	2	-	-	valid	-
Conv_4	Convolution	3x3	1	1	64	same	ReLU
Conv_4	Batch Normalization	-	-	-	-	-	-
dConv_5	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Merge_concat	Concatenate the outputs of Batch Normalization and Conv_3
	Convolution	3x3	1	1	64	same	ReLU
dConv_(6,7)	UpSampling	2x2	-	-	-	-	-
	Convolution	3x3	1	1	64	same	ReLU
	Batch Normalization	-	-	-	-	-	-
	Merge_concat	Concatenate the outputs of Batch Normalization and (Conv_2, Merge)
	Convolution	3x3	1	1	64	same	ReLU
Output	Convolution	3x3	1	1	32	same	ReLU
	Convolution	3x3	1	1	16	same	ReLU
	Convolution	3x3	1	1	1	same	sigmoid

		Accuracy	Precision	Recall	F1-score	Detected Avascular area(mm²)
Disease	Control	0.89 ± 0.04	0.84 ± 0.05	1.00 ± 0.00	0.91 ± 0.03	0.37 ± 0.18
	Diabetes without retinopathy	0.79 ± 0.11	0.77 ± 0.11	0.99 ± 0.00	0.87 ± 0.07	0.69 ± 0.59
	Mild to moderate NPDR	0.87 ± 0.07	0.85 ± 0.09	0.99 ± 0.00	0.91 ± 0.05	1.08 ± 0.69
	Severe DR	0.76 ± 0.06	0.68 ± 0.06	1.00 ± 0.00	0.81 ± 0.04	2.03 ± 1.11
SSI	55-61	0.76 ± 0.09	0.75 ± 0.07	1.00 ± 0.00	0.85 ± 0.06	2.84 ± 1.88
	61-66	0.77 ± 0.11	0.75 ± 0.14	1.00 ± 0.01	0.85 ± 0.08	1.90 ± 1.27
	66-71	0.86 ± 0.06	0.83 ± 0.08	0.99 ± 0.01	0.90 ± 0.05	1.12 ± 0.94
	>71	0.85 ± 0.05	0.80 ± 0.07	0.99 ± 0.00	0.89 ± 0.04	0.65 ± 0.60

MEDnet, a neural network for automated detection of avascular area in OCT angiography

Abstract

1. Introduction

2. Methods

2.1 Data acquisition

2.2 Network architecture

2.3 Network parameters

2.4 Training

2.4.1 Training data

2.4.2 Loss function and optimizer

3. Results

3.1 Data set

3.2 Implementation

3.3 Performance evaluation

3.4 Performance on wide field of view OCTA

4. Discussion

5. Conclusions

Funding

Disclosures

References

Cited By

Figures (6)

Tables (4)

Equations (5)

Biomedical Optics Express

	Patient	Eye	Gender		Age
	Patient	Eye	Male	Female	1931-1950s	1951-1970s	1971-1990s	≥1991
Healthy control	59	76	26	32	38	5	14	2
Severe DR	39	39	24	15	8	21	10	0
Mild to moderate NPDR	31	31	10	21	12	18	1	0
Diabetes without retinopathy	34	34	17	17	10	18	4	2
Total	163	180	77	85	68	62	29	4