Optica Publishing Group

Lymphatic vessel segmentation in optical coherence tomography by adding U-Net-based CNN for artifact minimization

Open Access

Abstract

The lymphatic system branches throughout the body to transport bodily fluid and plays a key immune-response role. Optical coherence tomography (OCT) is an emerging technique for the noninvasive and label-free imaging of lymphatic capillaries that exploits the low scattering of the lymph fluid. Here, the proposed lymphatic segmentation method combines a U-Net-based CNN, a Hessian vesselness filter, and a modified intensity-thresholding step that searches nearby pixels based on the binarized Hessian mask. Compared to previous approaches, the method extracts vessel shapes more precisely, and the segmented result contains minimal artifacts, achieving a Dice coefficient of 0.83, a precision of 0.859, and a recall of 0.803.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

The lymphatic system maintains tissue fluid and plays an essential role in immune cell trafficking and surveillance [1,2]. Visualizing the lymphatic system can be helpful in various applications, such as the diagnosis of infectious diseases, dermatology treatment, and cancer diagnosis and care. Non-invasive visualization techniques currently used in clinics include computed tomography (CT), positron emission tomography (PET), magnetic resonance imaging (MRI), and near-infrared (NIR) fluorescence imaging [3,4]. Each of these techniques, however, has certain limitations, such as poor spatial resolution, the need for ionizing radiation, contrast agents, or tracer dyes, and possible phototoxicity. Therefore, several novel approaches remain under development. Optical coherence tomography (OCT) is an emerging technique for noninvasive and label-free imaging of lymphatic capillaries [5]. The principle of OCT is based on white-light interferometry, and it can provide real-time cross-sectional tomography images and three-dimensional (3D) reconstruction images with adequate tissue penetration depth (1–2 mm) and up to 10 times higher resolution than ultrasonography. The diameter of lymphatic capillaries is 10–60 µm in human skin and 20–30 µm in mouse skin [6,7]. OCT is capable of imaging the lymphatic capillaries without a contrast agent; thus, it has a competitive edge over techniques like PET, CT, and MRI in observing the tumor microenvironment. Lymphatic vessels appear as low scattering areas in OCT owing to the transparency of the lymph fluid.

Because the OCT signal attenuates with depth and low scattering areas cause artifacts [8,9], segmentation approaches that rely on thresholding of the OCT cross-sectional image have certain limitations. Other techniques employ an adaptive threshold [10] that performs thresholding inside a predefined window. Nevertheless, the window-size parameter can affect the segmentation performance, making it challenging to accommodate various lymphatic vessel sizes and thus decreasing the accuracy.

Many improvements are required for intensity-based methods. Peijun et al. [11] proposed a segment-joining algorithm to mitigate partial-volume effects with small vessels. However, the stability of this method may be impacted by other low scattering structures. To this end, Yousefi et al. proposed a lymphatic vessel segmentation method using a Hessian vesselness filter [12]. This method searches for geometrical structures that can be considered tubular vessels by analyzing the second-order derivative of the OCT signal. The second-order derivative is calculated by convolving the cross-sectional OCT image with Gaussian derivatives. The second derivative of a Gaussian kernel can measure the intensity fluctuation both within and outside the range (−σ, σ), where σ denotes the standard deviation of the Gaussian kernel. However, the lymphatic vessel diameter can vary widely within a 3D volume, and it is challenging to choose a parameter σ that fits all lymphatic vessels in the OCT cross-sectional image. Consequently, some of the vessels may be distorted. The methods mentioned above may be problematic in tissue that contains other low scattering structures, such as mouse-ear cartilage, and thus involve extra preprocessing steps to remove them. Therefore, more refined processing techniques should be developed to distinguish such artifacts from vessels.
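The Hessian vesselness idea can be made concrete with a short sketch. The following is a minimal 2D Frangi-style implementation, not the exact filter of Yousefi et al. [12]; the σ dependence discussed above enters through the Gaussian-derivative scale, and the `beta` and `c` parameters are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def vesselness_2d(img, sigma=2.0, beta=0.5):
    """Frangi-style vesselness response for dark tubular structures.
    Illustrative re-implementation; beta and the structureness scale c
    are assumed values, not taken from the paper."""
    s2n = sigma ** 2  # scale normalization
    # Second-order Gaussian derivatives (axis 0 = depth, axis 1 = lateral)
    Iyy = gaussian_filter(img, sigma, order=(2, 0)) * s2n
    Ixx = gaussian_filter(img, sigma, order=(0, 2)) * s2n
    Ixy = gaussian_filter(img, sigma, order=(1, 1)) * s2n
    # Eigenvalues of the 2x2 Hessian at every pixel
    mu = (Ixx + Iyy) / 2.0
    rad = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    a, b = mu - rad, mu + rad
    swap = np.abs(a) > np.abs(b)
    lam1 = np.where(swap, b, a)  # ordered so that |lam1| <= |lam2|
    lam2 = np.where(swap, a, b)
    rb2 = (lam1 / (lam2 + 1e-12)) ** 2    # blob-vs-tube measure
    s2 = lam1 ** 2 + lam2 ** 2            # structureness
    c = 0.5 * np.sqrt(s2).max() + 1e-12
    v = np.exp(-rb2 / (2 * beta ** 2)) * (1.0 - np.exp(-s2 / (2 * c ** 2)))
    v[lam2 < 0] = 0.0  # dark ridges on a bright background have lam2 > 0
    return v
```

A single σ responds strongly only to vessels whose diameter roughly matches that scale, which is exactly the shape-distortion issue noted above when vessel diameters vary widely.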

Deep-learning-based neural networks have proven to be robust tools in the computer vision field. The convolutional neural network (CNN) is a neural network variant that is widely employed in tasks such as classification, segmentation, and object detection. Previous works have applied CNNs to medical image analysis, such as retinal layer and fluid segmentation [13], 2D/3D medical image registration [14], and lung image patch classification [15]. Among them, U-Net [16] is a renowned network for segmentation of 2D images and has been applied in many segmentation tasks [17–19]. U-Net is composed of two parts: a down-sampling (encoding) path that computes features and an up-sampling (decoding) path that spatially localizes patterns in the image. Feature maps from the down-sampling path are concatenated with those of the up-sampling path, which avoids losing spatial information. Since U-Net does not contain any fully-connected layer, the number of model parameters is reduced, and it can be trained on a small labeled dataset with proper data augmentation.

The objective of the present study was to segment the lymphatic vessels while minimizing other low scattering structures and noise. A mouse ear was used as an example because it contains low scattering cartilage. In the proposed method, a Hessian vesselness filter is employed to detect the lymphatic vessels, and the cartilage artifacts are removed by employing a U-Net-based architecture. After cartilage artifact removal, an Otsu-based intensity-thresholding step is incorporated to handle the shape distortion introduced by the Hessian vesselness filter. To the best of the authors’ knowledge, the proposed approach is the first to implement a CNN for the segmentation of mouse ear cartilage. This method can be regarded as a combination of the Hessian vesselness filter and modified intensity-based methods.

2. Experimental setup

2.1 OCT setup description and scan pattern

The OCT system was developed in-house [20,21]. In brief, the supercontinuum source (NKT Photonics, Denmark) has a central wavelength of 1275 nm and a spectral bandwidth of 240 nm. The light is coupled to a fiber-based Michelson interferometer. The reference arm of the interferometer consists of a mirror. The sample arm of the interferometer comprises a dual-axis scanning mirror and an objective lens (LSM03-BB, Thorlabs Inc.). The axial resolution in air was measured as ∼5 µm by using a mirror with an attenuator as a sample. A 1951 USAF resolution test target (Edmund Optics) was used as the standard test sample for measurement of the lateral resolution, confirming that the lateral resolution of the OCT setup was approximately 7 µm. The x-scanner in the galvo system is steered using a saw-tooth waveform, and the y-scanner is steered using a step function. The incident power at the sample is around 10 mW, and the angle between the imaging objective lens and the sample is around 90 degrees. Backscattered light from the sample and reference arms was captured using a line-scan spectrometer (Wasatch Photonics). In this study, four repeated frames were obtained at each position; each frame contained 900 A-lines, and the imaging speed was 33 frames per second. Each 3D data set was constructed from 400 cross-section images (x-z plane) using ImageJ [22] and covered an image volume of 4 × 4 × 2 mm³ (x-y-z). 400 vascular images (x-z plane) and 400 lymphatic vessel images (x-z plane) were obtained after signal processing (see Section 2.3) and can be reconstructed into a 3D volume or reduced to a maximum intensity projection (MIP, x-y en face view).
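The scan parameters above fix the per-volume acquisition time, which can be checked with a quick back-of-the-envelope calculation (all values taken from the text):

```python
frames_per_position = 4   # repeated B-scans at each y position
y_positions = 400         # cross-sections per 3D volume
fps = 33                  # B-scan frame rate (frames per second)

total_frames = frames_per_position * y_positions  # B-scans per volume
volume_time_s = total_frames / fps                # acquisition time in seconds
```

This gives 1600 B-scans and roughly 48.5 s per volume, consistent with the approximately 50 seconds per imaging session reported in Section 2.2.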

2.2 Animal preparation

All animal procedures were reviewed and approved by the Institutional Animal Care and Use Committee (IACUC, number: 1060612r) of National Yang-Ming University, where these experiments were performed. Tyr::CreER; BRAFCA; Ptenlox/lox mice from a breeding pair at the National Laboratory Animal Center (Tainan, Taiwan) were obtained. Mice were treated topically on the ear with 4-Hydroxytamoxifen (4-HT) to elicit BRAFV600E and to silence Pten expression. For the topical treatment, 4-HT (H6278; Sigma) was dissolved in 99.9% alcohol (32205; Sigma), and 1.5 mg/500 µL was administered to the dorsal part of the ears over 2 weeks [23]. This protocol generates spontaneously induced melanoma in the skin with 100% incidence at week 6; histopathology showed pigmented or nonpigmented S-100-positive tumor cells in the dermis [24]. A total of 10 mice from three different mouse models were used in this study, including 5 BRAF V600E/V600E;Pten -/- mice, 3 BRAF CA/CA;Pten loxp/loxp mice, and 2 wild type (WT) mice.

During the OCT experiment, the mouse was anesthetized using 1% isoflurane. Body temperature was maintained at 37 degrees Celsius with a heating pad. The mouse’s body was not restrained; only its ears were fixed on the stage to reduce motion effects. One to three regions were chosen per ear for OCT scanning; each imaging session took approximately 50 seconds. After image acquisition, the mouse was sacrificed, and the ear was excised and stained with H&E for cartilage validation.

In vivo fluorescence lymphangiography [25] was used for lymphatic vessel validation. A 10% w/v solution of FITC-dextran (Mw ∼ 150,000; Sigma, Poole, UK) was injected cutaneously into the skin of the mouse ear using a 12.7 mm × 30G syringe needle (BD Ultra-Fine II), and the network of functioning dermal lymphatic vessels was visualized using a fluorescent imaging system (LT-9900 Illumatool bright light) and a camera (Canon EOS 6D). This method reveals functioning dermal lymphatic vessels as the lymphatic capillaries take up the fluorescent tracer and transport the macromolecular marker away from the injected interstitial depot.

2.3 Signal processing and network learning

After the signals were captured by the spectrometer, axial signals were retrieved through the Fourier transform of the spectral interferogram. The complex scattering signal first underwent phase correction and motion correction [26], and angiography was determined using the optical microangiography algorithm [27]. Simultaneously, the magnitude of the complex signal was calculated for the construction of cross-sectional images. A total of 100 OCT cross-sectional images were used for human-annotated cartilage: 28 images acquired from WT mouse ears and 72 images from BRAFV600E-induced, PTEN-deficient metastatic melanoma mice. Additional data were collected from the ears of melanoma mice because the ears of such mice are often thicker and uneven, making it difficult to distinguish each tissue layer in the OCT B-scan. The following data augmentations were randomly applied during training: horizontal flipping, scaling, and elastic deformation [28]. Eventually, the training set was extended to 17,280 B-scans.
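The elastic deformation of [28] can be sketched as follows: a random displacement field is low-pass filtered and used to resample the B-scan. This is an illustrative re-implementation; the `alpha` and `sigma` values are assumptions, as the paper does not state them.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(img, alpha=30.0, sigma=4.0, rng=None):
    """Elastic deformation in the style of Simard et al. [28].
    alpha scales the displacement; sigma smooths the random field
    (both are assumed, illustrative values)."""
    rng = np.random.default_rng(rng)
    h, w = img.shape
    # Random per-pixel displacements, low-pass filtered then scaled
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    y, x = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.array([y + dy, x + dx])
    # Bilinear resampling at the displaced coordinates
    return map_coordinates(img, coords, order=1, mode="reflect")
```

With `alpha=0` the transform is the identity; increasing `alpha` produces the wavy distortions that make the cartilage annotations robust to tissue-shape variation.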

A representative OCT cross-section image is shown in Fig. 1(A). The histology (Fig. 1(B)) was used for cartilage validation. Figure 1(C) shows that, in the training phase, the OCT B-scans and human-annotated ground truth were fed into the network, which learned the data features and adjusted its weights. In the inference phase, only the OCT B-scans were fed into the network, and the trained U-Net-based CNN made the predictions; the ratio of the training set to the validation set was 80:20. The experimental environment was a Windows system with a 3.30 GHz Intel Core i9-7900X CPU, 32 GB of memory, and an Nvidia GeForce GTX 1080 GPU (8 GB).


Fig. 1. (A) OCT cross-sectional image and (B) histology of mouse ear. (C) Deep learning training compared to inference.


Figure 2 shows the network architecture (Fig. 2(A)), mean loss (Fig. 2(B)), mean Dice coefficient (Fig. 2(C)), and the Precision vs. Recall curve (Fig. 2(D)) obtained over five cross-validation folds. The network consists of a down-sampling (encoding) path and an up-sampling (decoding) path. The down-sampling path consists of three convolutional blocks. Each block has three convolutional layers with a filter size of 3×3 and a stride of one, along with batch normalization and ReLU layers (indicated by the grey arrow). For down-sampling, max pooling with a 2×2 stride is applied at the end of each block except the last one (indicated by the green arrow). In the up-sampling path, each block begins with an up-sampling layer (indicated by the purple arrow). Unlike the original U-Net architecture, each block in the up-sampling path has three convolutional layers with a filter size of 3×3, a stride of one, and batch normalization and ReLU layers. We reduced the number of convolutional blocks to three and the convolution channels to 8, 16, 32, and 64, which reduces the number of model parameters and speeds up inference. The input image size is 512×512. Training was implemented in Keras, and the U-Net was trained with the RMSProp optimizer; the training process took about 8.7 hours for 60 epochs.


Fig. 2. (A) Architecture of the U-Net-based CNN used in the experiments. (B) Mean training loss curve and (C) mean training Dice coefficient obtained over five cross-validation folds. (D) Precision-Recall (PR) curve of the proposed approach based on the 5-fold cross-validation; corresponding AP values are given in parentheses.


The Dice coefficient [29] is widely used for evaluating image segmentation tasks. Given a segmented image $I_m$ and annotated ground truth $I_g$, the Dice coefficient is defined by:

$$\textrm{Dice coefficient} = \frac{2\,|I_m \cap I_g|}{|I_m| + |I_g|}$$

The Dice coefficient quantifies the similarity between the manual segmentation and the model predictions. Here, the network was trained with the Dice coefficient loss and a batch size of four. The mean validation Dice coefficient over the five cross-validation folds reached 84.28%, and the average precision (AP) over the validation images was 0.9224. The learning process stopped when the validation accuracy did not improve within 20 iterations.
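The definition above translates directly into code; a minimal NumPy version for binary masks might look like:

```python
import numpy as np

def dice_coefficient(im, ig):
    """Dice coefficient between a segmented mask `im` and annotated
    ground truth `ig` (boolean arrays), following the definition above."""
    im, ig = np.asarray(im, bool), np.asarray(ig, bool)
    denom = im.sum() + ig.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(im, ig).sum() / denom
```

The same quantity, computed on soft predictions instead of binary masks, serves as the Dice loss used for training.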

2.4 The procedure of segmentation method

Figure 3 presents a flowchart of the proposed method (left). The results of each step, represented by (A)–(F), are shown on the right. The cross-sectional image contrast is first improved using the deshadowing method proposed by Girard et al. [30] (Fig. 3(B)). Then, it is processed using the Hessian vesselness filter (Fig. 3(C)). In the image in Fig. 3(C), not only the lymphatic vessels are present; artifacts caused by the air-epidermis and dermis-cartilage junctions (denoted by blue arrows) are also apparent. The noisy region caused by the air-epidermis junction can be removed once the first boundary between the air and the mouse ear is identified. Because the air-tissue boundary exhibits a larger intensity change than other layer boundaries in tissue, the top boundary is relatively easy to determine. We calculate the vertical dark-to-light gradient using the Sobel operator, and then the method described in [31] is used to find the top boundary automatically. Here, the OCT cross-sectional image is represented as a graph consisting of nodes (i.e., image pixels) and edges; weights are assigned to each edge based on the gradients, and the shortest path traversing the edges is found from the leftmost column of the image to the rightmost column.
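The boundary search can be sketched as follows. This is a simplified dynamic-programming stand-in for the graph search of [31]: the path moves strictly left to right, shifting at most one row per column, and a high dark-to-light gradient maps to a low traversal cost. The exact edge-weight formula of [31] is not reproduced here, and a plain finite-difference gradient stands in for the Sobel operator.

```python
import numpy as np

def top_boundary(img):
    """Trace the air-tissue boundary as a minimum-cost left-to-right path.
    Simplified sketch: per-column moves are restricted to one row up/down."""
    # Vertical dark-to-light gradient: positive where intensity increases
    # with depth. High gradient -> low traversal cost.
    grad = np.gradient(img.astype(float), axis=0)
    cost = 1.0 - grad / (np.abs(grad).max() + 1e-12)
    h, w = img.shape
    acc = np.full((h, w), np.inf)   # accumulated path cost
    acc[:, 0] = cost[:, 0]
    back = np.zeros((h, w), dtype=int)
    for c in range(1, w):
        for r in range(h):
            lo, hi = max(0, r - 1), min(h, r + 2)
            prev = acc[lo:hi, c - 1]
            k = int(np.argmin(prev))
            acc[r, c] = cost[r, c] + prev[k]
            back[r, c] = lo + k
    # Backtrack from the cheapest endpoint in the last column
    rows = np.empty(w, dtype=int)
    rows[-1] = int(np.argmin(acc[:, -1]))
    for c in range(w - 1, 0, -1):
        rows[c - 1] = back[rows[c], c]
    return rows
```

On a synthetic B-scan with a flat air-tissue interface, the returned row indices hug the step edge where the dark-to-light gradient peaks.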


Fig. 3. Flowchart of the proposed method, with the results of each step represented by (A)–(F). The details of the intensity-thresholding steps are shown in (G)–(K). (G) Enlarged image of Fig. 3(E). (H) Binarized image of (G). (I) Overlap of the enlarged (B) with (H); (B) is shown in red. (J) Enlarged image of the yellow square area in (I). (K) Final segmentation after applying the search window in (J).


However, the noise caused by the cartilage is challenging to remove because the cartilage often has a curved structure and poor contrast, especially when the ear becomes thicker under conditions such as inflammation or infection. Therefore, the U-Net-based CNN is used to predict the cartilage region (Fig. 3(D)). After the first boundary and the cartilage layer are found, the background subtraction step is performed; the result is shown in Fig. 3(E). Although the lymphatic vessel has been detected and the background is less noisy, the vessel shape is not precise because of the parameters chosen for the Hessian vesselness filter.

To overcome this problem, a modified intensity-thresholding step is added as the final step, as shown in Figs. 3(G)–(K), which leads to the result in Fig. 3(F). Figure 3(G) is an enlarged image of the result after Hessian vesselness filtering and background subtraction (Fig. 3(E)). Figure 3(H) is a binarized image of Fig. 3(G), and Fig. 3(I) is an enlarged image of the result of the deshadowing process (Fig. 3(B)) overlapped with the Hessian binary mask (Fig. 3(H)). Segmenting the lymphatic vessel using only the Hessian vesselness filter is not adequate to fit the shape well, as shown in Figs. 3(I) and (J). Therefore, a search window is centered on each “true” pixel in the Hessian binary mask (Fig. 3(H)) to determine whether any pixel within the window has an intensity value close to that of the central pixel in Fig. 3(B); if such a pixel exists, it is added to the binary mask. The result is shown in Fig. 3(K). The size of the search window used in the experiment is 11×11. Because the lymphatic vessels are connected across the 3D volume, the neighboring pixels are also searched based on the previous four Hessian binary masks.
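The 2D part of the search-window step can be sketched as follows, assuming a simple absolute-intensity tolerance for "close"; the paper does not state the exact similarity criterion, so `tol` is an assumption, and the search over the previous four masks is omitted.

```python
import numpy as np

def refine_mask(hessian_mask, intensity, win=11, tol=0.1):
    """Grow the binarized Hessian mask: around every 'true' pixel, add
    window pixels whose intensity is within `tol` of the center pixel.
    `tol` is an assumed similarity criterion; `win`=11 as in the paper."""
    half = win // 2
    refined = hessian_mask.copy()
    h, w = intensity.shape
    for r, c in zip(*np.nonzero(hessian_mask)):
        r0, r1 = max(0, r - half), min(h, r + half + 1)
        c0, c1 = max(0, c - half), min(w, c + half + 1)
        close = np.abs(intensity[r0:r1, c0:c1] - intensity[r, c]) <= tol
        refined[r0:r1, c0:c1] |= close
    return refined
```

Because the Hessian mask already sits on true vessel pixels, the intensity criterion only extends the mask into adjacent vessel tissue, which is how the step recovers the shape lost to the fixed-σ filter.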

3. Results

Figure 4 shows a photograph of a mouse with the orientation of the base-periphery axis, OCT cross-sectional images, and the top boundary and cartilage predictions from (A) WT, (B) BRAF CA/CA;Pten loxp/loxp, and (C) BRAF V600E/V600E;Pten -/- mouse ears. Green lines represent the scanned lines. The results demonstrate that the trained model generalizes well to images captured in different areas of the mouse ear. The top boundary was also found in the OCT cross-sectional images from all three animal models.


Fig. 4. Photograph of mouse with orientation of base-periphery axis, OCT cross-sectional images, top boundary and cartilage prediction from (A) WT mice ears (B) BRAF CA/CA;Pten loxp/loxp and (C) BRAF V600E/V600E;Pten -/- mouse ear. The green lines represent the scanned lines.


Representative OCT cross-sectional images of WT and BRAF V600E/V600E;Pten -/- mouse ears are shown in Figs. 5(A) and (B), where the magenta line represents the human-annotated result. The same line is also shown in the segmentation images of the proposed method (Figs. 5(C), (D)), the Hessian vesselness filter (Figs. 5(E), (F)), and the intensity-threshold method (Figs. 5(G), (H)) for comparison. In the cross-sectional image of the BRAF V600E/V600E;Pten -/- mouse ear (Fig. 5(B)), numerous lymphatic vessels vary in shape and diameter, making it difficult for the Hessian vesselness filter to precisely detect all the vessels. Furthermore, the low scattering cartilage appears as artifacts in the resulting image. The image processed with the intensity threshold also suffers from a noisy background. The (Dice coefficient, Precision, Recall) values of Figs. 5(C), 5(E), and 5(G) relative to Fig. 5(A) are (0.842, 0.826, 0.859), (0.425, 0.387, 0.477), and (0.104, 0.051, 0.758), respectively. The corresponding values of Figs. 5(D), 5(F), and 5(H) relative to Fig. 5(B) are (0.854, 0.86, 0.846), (0.354, 0.29, 0.446), and (0.24, 0.14, 0.82), respectively.


Fig. 5. Cross-sectional images of (A) WT and (B) BRAF V600E/V600E;Pten -/- mouse processed with (C, D) the proposed method, (E, F) the Hessian vesselness filter, and (G, H) the intensity-threshold method. Magenta lines in Figs. 5(A)–5(H) represent the human-annotated result.


We compared thresholding and the Hessian vesselness filter with our method and examined the effect of subtracting the cartilage, as shown in Fig. 6. All images in Fig. 6 are presented in the projection view (x-y en face), as illustrated in Fig. 6(A). Figure 6(B) is a maximum intensity projection (MIP) reconstructed from the human-annotated images. Figures 6(C), 6(D), and 6(E) are the MIPs of the lymphatic vessels segmented using the proposed method, the Hessian vesselness filter, and intensity thresholding, each with cartilage removal, respectively. Figures 6(F) and 6(G) are the MIPs of the lymphatic vessels segmented using the Hessian vesselness filter and intensity thresholding without cartilage removal. Comparing Fig. 6(C) with Fig. 6(D), when a vessel with a larger diameter than the surrounding vessels is present (red arrow), the vessel in Fig. 6(D) is distorted. Although the cartilage has been removed in the thresholding result (Fig. 6(E)), the background is still noisy compared to Figs. 6(C) and 6(D). The Hessian vesselness filter takes the vessel shape into account, so it is more robust to noise than the intensity-threshold method. However, when no cartilage-removal step is involved, the segmented results contain many artifacts, as demonstrated in Figs. 6(F) and 6(G).


Fig. 6. (A) Illustration showing the x, y, and z directions. (B) Maximum intensity projection (MIP) of the human-annotated lymphangiography. MIP of the lymphatic vessels segmented using (C) the proposed method, (D) the Hessian vesselness filter, and (E) intensity thresholding along with cartilage removal. (F, G) MIP of the lymphatic vessels segmented using the Hessian vesselness filter and the intensity-thresholding method, respectively, without cartilage removal.


Table 1 summarizes the mean Dice coefficient, mean Precision, and mean Recall, averaged over the segmentation results of 400 cross-sectional images, for our proposed method, the Hessian vesselness filter, and the intensity-threshold method. The segmented result of the Hessian vesselness filter has the highest Precision of 0.86; however, it suffers from a poor Recall of 0.664, meaning it cannot find as many vessels as our method does. The intensity-threshold method has the lowest Dice coefficient, Precision, and Recall of the three. Aside from the cartilage, the intensity-threshold method is also easily affected by other low scattering noise (as shown in Figs. 5(G) and (H) and Figs. 6(E) and (G)). Using only the Hessian vesselness filter or intensity thresholding without manual background subtraction resulted in low Dice coefficients of 0.592 and 0.406, respectively, and in reduced precision even though the Recall is high. These results show the importance of proper background subtraction. Our proposed method balances precision and recall and has the highest Dice coefficient of 0.83, with a Precision of 0.859 and a Recall of 0.803.


Table 1. Performance comparison.

Figure 7(A) shows in vivo fluorescence lymphangiography for lymphatic vessel validation, in which a network of functioning dermal lymphatic vessels is visualized as the lymphatic capillaries take up the fluorescent tracer. Figures 7(B) and (C) show merged OCT lymphangiography (green) and angiography (red) of a WT mouse; the lymphatic vessels were segmented by the proposed method (Fig. 7(B)) and by the Hessian vesselness filter with manual cartilage removal (Fig. 7(C)). Comparing the segmented lymphatic vessels in Figs. 7(B) and 7(C), the shape obtained by our proposed method is closer to that in Fig. 7(A) and more continuous (yellow arrow).


Fig. 7. (A) In vivo fluorescence lymphangiography. FITC-dextran was injected cutaneously as a depot. Scale bar = 1 mm. (B) In vivo OCT lymphangiography: lymphatic vessels (green) segmented by the proposed method merged with OCT angiography (red). (C) Lymphatic vessels (green) segmented by the Hessian vesselness filter (with manual cartilage removal) merged with OCT angiography (red).


Figure 8(A) illustrates the image reconstruction process. After the OCT scan, we had 400 OCT cross-section images, 400 processed blood vessel images, and 400 lymphatic vessel images. The 3D reconstructed image of the ear was reduced to an en face maximum intensity projection (MIP, x-y view); 3D reconstruction images were obtained using ImageJ [22]. Photographs, en face MIP maps, and 3D reconstructions of the merged lymphatic (grey) and blood vessel (red) images of healthy skin, skin with amelanotic melanoma (lacking melanin), and skin with pigmented melanoma in mouse ears are shown in Figs. 8(B), 8(C), and 8(D), respectively.


Fig. 8. (A) Illustration of the image processing procedure. Photograph, en face MIP map, and 3D reconstruction of the merged lymphatic (grey) and vascular (red) image of (B) healthy skin, (C) skin with amelanotic melanoma (lacking melanin), and (D) skin with pigmented melanoma in the mouse ear.


The lymphatic vessels are coded grey; blood vessels are coded red. The proposed segmentation technique for lymphatic vessels, along with OCT angiography, enables simultaneous visualization of blood and lymphatic vessels in vivo. Figure 8(C) shows disorganized vessels surrounding a non-pigmented tumor, while the hemorrhagic area inside the tumor, being static, presents no OCT angiography signal. Figure 8(D) shows densely distributed lymphatic vessels on a pigmented melanoma. The vessel pattern differs from that observed in healthy skin (Fig. 8(B)).

4. Discussion

Image segmentation methods have been developed for many years, from the earliest approaches [32], such as histogram-based thresholding [33], region growing [34], k-means clustering [35], and watersheds [36], to more advanced algorithms such as active contours [37], mean shift [38,39], graph cuts [40], conditional and Markov random fields [41], and sparsity-based methods [42,43]. In recent years, deep-learning-based image segmentation has become the primary development trend because it provides the highest accuracy [44]. We chose U-Net as our segmentation architecture because its number of model parameters is reduced, making it suitable for training on small medical-image datasets.

The Hessian vesselness filter accounts for the vessel shape and measures how likely a structure is to be tubular. In our experience, although it is more robust to noise than the conventional intensity-threshold method, the segmented lymphatic vessel shape is not precise when lymphatic vessels of various diameters are present in the tissue. In some cases, we also found that it could not differentiate well between the lymphatic vessels and the cartilage. Careful and repeated manual tuning of the Hessian filter parameters is necessary to obtain optimized segmentation results under different conditions. To this end, in the presented method the cartilage artifacts are removed by employing a U-Net-based architecture. Intensity thresholding based on Otsu’s method [33] was added as the final step to automatically search the nearby pixels based on the binarized Hessian mask. Thus, the proposed method is more robust to lymphatic vessel diameter variations than the Hessian vesselness filter alone, and it is also robust to noise.
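Otsu's method [33] itself is a one-dimensional histogram optimization: it picks the threshold maximizing the between-class variance of the intensity histogram, with no operator-supplied parameter. A minimal NumPy sketch of the threshold selection (the window search that follows it is described in Section 2.4):

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Otsu's method [33]: return the threshold that maximizes the
    between-class variance of the intensity histogram."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    p = hist.astype(float) / hist.sum()
    w0 = np.cumsum(p)              # class-0 (below threshold) probability
    m = np.cumsum(p * centers)     # cumulative first moment
    mt = m[-1]                     # global mean intensity
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)    # between-class variance per candidate
    between[valid] = (mt * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[int(np.argmax(between))]
```

For a bimodal intensity distribution, such as dark lymph lumen against brighter dermis, the selected threshold falls between the two modes, which is what removes the operator-dependent threshold tuning.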

We showed that the trained model generalizes well to cross-sectional images that vary in structure and contrast (Fig. 4). After background subtraction, the result is free from artifacts caused by the cartilage (Figs. 5(C), 5(D)). In the cross-sectional image of the BRAF V600E/V600E;Pten -/- mouse ear (Fig. 5(B)), the tissue is thicker than that of normal mice, and an elongated lymphatic vessel is present. Under this condition, the result of the Hessian filter likewise suffers from the inaccurate-shape problem (Fig. 5(F)). The same situation happened when the lymphatic vessels were presented in the projection map (x-y view), as shown in Fig. 6(D) (red arrow) and Fig. 7(C) (yellow arrow).

The threshold parameter in a conventional intensity-threshold method can significantly affect performance: setting the threshold too high results in a noisy background, whereas setting it too low affects the shapes of the segmented lymphatic vessels. Because the signal attenuates with depth, it is difficult for an intensity-based method to distinguish the lymphatic vessels from noise, and the intensity-threshold method therefore has the lowest Dice coefficient (see Table 1). In this study, the intensity-thresholding step is added at the end of our proposed method to make the vessel shape more precise. This step is based on Otsu’s method [33], with nearby pixels searched based on the binarized Hessian mask, so no operator-dependent threshold determination is required. The proposed method, combining a CNN, a Hessian filter, and modified intensity thresholding, provides excellent performance in both healthy and cancerous conditions, achieving a Dice coefficient of 0.83, a Precision of 0.859, and a Recall of 0.803 using only 100 cartilage-annotated images with appropriate data augmentation.

Although we successfully applied the U-Net-based CNN to segment the ear cartilage of 10 mice (spanning three different mouse models), further studies are needed to validate the trained model in other tissues or in humans. In the future, this can be achieved through transfer learning [45], in which the model is fine-tuned with only a small number of annotated images from the target domain, enabling the proposed method to achieve clinical utility. The other limitation of our proposed method is that some failure predictions occurred when the contrast of the OCT image was low (Fig. 9(A)) or when there was severe distortion of the mouse ear (Fig. 9(B)). These conditions may affect accuracy.


Fig. 9. (A, B) OCT cross-sectional images. (C, D) Corresponding cartilage predictions.


Other tissues may contain optically low scattering areas such as hair follicles, extracellular fluid, and the nerve epineurium. We used a deshadowing algorithm to remove the shadows caused by hair follicles. The lymphatic vessels can be differentiated from extracellular fluid by considering their connectivity across neighboring frames and searching the neighboring pixels based on the previous four Hessian binary masks. Moreover, the structure of the nerve epineurium has been reported [8] and is very different from the beaded appearance of lymphatic vessels [46]. Still, further processing steps could be refined to exploit this property.

5. Summary

In this paper, a technique for lymphatic vessel segmentation in OCT images was proposed that combines a Hessian vesselness filter with intensity-based methods. The advantages of the Hessian vesselness filter were leveraged, including robustness to noise and the ability to detect lymphatic vessels. A modified U-Net architecture was used to detect the cartilage and perform background subtraction, because mouse ear cartilage is a low scattering structure and appears as an artifact in the result of the Hessian vesselness filter. Finally, the vessel shape was made more precise by applying a search window centered on each pixel in the Hessian vesselness filter result. This method accounts for both shape and intensity features, minimizes other low scattering areas in the tissue, and requires no user-identified region of interest. The capability of the proposed segmentation method was evaluated on both normal and cancerous skin tissue. In future work, more data will be collected, and a model will be trained for various applications. The segmentation method proposed in this paper may therefore serve as a potential tool for realizing OCT lymphangiography in future clinical applications. Combining this approach with numerical analyses can further advance the possibility of quantitative evaluation of lymphangiogenesis.

Funding

Ministry of Education, Taiwan (108RSB0013).

Disclosures

The authors declare no conflicts of interest.

References

1. G. J. Randolph, S. Ivanov, B. H. Zinselmeyer, and J. P. Scallan, “The Lymphatic System: Integral Roles in Immunity,” Annu. Rev. Immunol. 35(1), 31–52 (2017). [CrossRef]  

2. K.-W. Kim and J.-H. Song, “Emerging Roles of Lymphatic Vasculature in Immunity,” Immune. Netw. 17(1), 68–76 (2017). [CrossRef]  

3. L. L. Munn and T. P. Padera, “Imaging the lymphatic system,” Microvasc. Res. 96, 55–63 (2014). [CrossRef]  

4. J. C. Rasmussen, I. C. Tan, M. V. Marshall, C. E. Fife, and E. M. Sevick-Muraca, “Lymphatic Imaging in Humans with Near-Infrared Fluorescence,” Curr. Opin. Biotechnol. 20(1), 74–82 (2009). [CrossRef]  

5. B. J. Vakoc, R. M. Lanning, J. A. Tyrrell, T. P. Padera, L. A. Bartlett, T. Stylianopoulos, L. L. Munn, G. J. Tearney, D. Fukumura, R. K. Jain, and B. E. Bouma, “Three-dimensional microscopy of the tumor microenvironment in vivo using optical frequency domain imaging,” Nat. Med. 15(10), 1219–1223 (2009). [CrossRef]  

6. R. Paduch, “The role of lymphangiogenesis and angiogenesis in tumor metastasis,” Cell Oncol. 39(5), 397–410 (2016). [CrossRef]  

7. H. Y. Lim, J. M. Rutkowski, J. Helft, S. T. Reddy, M. A. Swartz, G. J. Randolph, and V. Angeli, “Hypercholesterolemic mice exhibit lymphatic vessel dysfunction and degeneration,” Am. J. Pathol. 175(3), 1328–1337 (2009). [CrossRef]  

8. V. Demidov, L. A. Matveev, O. Demidova, A. L. Matveyev, V. Y. Zaitsev, C. Flueraru, and I. A. Vitkin, “Analysis of low-scattering regions in optical coherence tomography: applications to neurography and lymphangiography,” Biomed. Opt. Express 10(8), 4207–4219 (2019). [CrossRef]  

9. J. M. Schmitt, A. Knuttel, M. Yadlowsky, and M. A. Eckhaus, “Optical-coherence tomography of a dense tissue: statistics of attenuation and backscattering,” Phys. Med. Biol. 39(10), 1705–1720 (1994). [CrossRef]  

10. W. Qin, U. Baran, and R. Wang, “Lymphatic response to depilation-induced inflammation in mouse ear assessed with label-free optical lymphangiography,” Lasers Surg. Med. 47(8), 669–676 (2015). [CrossRef]  

11. P. Gong, R. McLaughlin, Y. Liew, P. Munro, F. Wood, and D. Sampson, “Assessment of human burn scars with optical coherence tomography by imaging the attenuation coefficient of tissue after vascular masking,” J. Biomed. Opt. 19(2), 021111 (2013). [CrossRef]  

12. S. Yousefi, J. Qin, Z. Zhi, and R. K. Wang, “Label-free optical lymphangiography: development of an automatic segmentation method applied to optical coherence tomography to visualize lymphatic vessels using Hessian filters,” J. Biomed. Opt. 18(8), 086004 (2013). [CrossRef]  

13. A. G. Roy, S. Conjeti, S. P. K. Karri, D. Sheet, A. Katouzian, C. Wachinger, and N. Navab, “ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks,” Biomed. Opt. Express 8(8), 3627–3642 (2017). [CrossRef]  

14. J. Zheng, S. Miao, Z. Jane Wang, and R. Liao, “Pairwise domain adaptation module for CNN-based 2-D/3-D registration,” J. Med. Imag. 5(2), 021204 (2018). [CrossRef]  

15. Q. Li, W. Cai, X. Wang, Y. Zhou, D. D. F. Feng, and M. Chen, “Medical image classification with convolutional neural network,” in 2014 13th International Conference on Control Automation Robotics & Vision (ICARCV) (2014), pp. 844–848.

16. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (Springer, 2015), pp. 234–241.

17. Y. Gordienko, P. Gang, J. Hui, W. Zeng, Y. Kochura, O. Alienin, O. Rokovyi, and S. Stirenko, “Deep Learning with Lung Segmentation and Bone Shadow Exclusion Techniques for Chest X-Ray Analysis of Lung Cancer,” (2019), pp. 638–647.

18. K. Sirinukunwattana, J. P. W. Pluim, H. Chen, X. Qi, P.-A. Heng, Y. B. Guo, L. Y. Wang, B. J. Matuszewski, E. Bruni, U. Sanchez, A. Böhm, O. Ronneberger, B. B. Cheikh, D. Racoceanu, P. Kainz, M. Pfeiffer, M. Urschler, D. R. J. Snead, and N. M. Rajpoot, “Gland segmentation in colon histology images: The glas challenge contest,” Med. Image Anal. 35, 489–502 (2017). [CrossRef]  

19. B. S. Lin, K. Michael, S. Kalra, and H. R. Tizhoosh, “Skin lesion segmentation: U-Nets versus clustering,” in 2017 IEEE Symposium Series on Computational Intelligence (SSCI) (2017), pp. 1–7.

20. P.-H. Chen, Y.-J. Chen, Y.-F. Chen, Y.-C. Yeh, K.-W. Chang, M.-C. Hou, and W.-C. Kuo, “Quantification of structural and microvascular changes for diagnosing early-stage oral cancer,” Biomed. Opt. Express 11(3), 1244–1256 (2020). [CrossRef]  

21. W. C. Kuo, Y. M. Kuo, and S. Y. Wen, “Quantitative and rapid estimations of human sub-surface skin mass using ultra-high-resolution spectral domain optical coherence tomography,” J. Biophotonics 9(4), 343–350 (2016). [CrossRef]  

22. C. A. Schneider, W. S. Rasband, and K. W. Eliceiri, “NIH Image to ImageJ: 25 years of image analysis,” Nat. Methods 9(7), 671–675 (2012). [CrossRef]  

23. C. H. Chang, C. J. Kuo, T. Ito, Y. Y. Su, S. T. Jiang, M. H. Chiu, Y. H. Lin, A. Nist, M. Mernberger, T. Stiewe, S. Ito, K. Wakamatsu, Y. A. Hsueh, S. Y. Shieh, I. Snir-Alkalay, and Y. Ben-Neriah, “CK1alpha ablation in keratinocytes induces p53-dependent, sunburn-protective skin hyperpigmentation,” Proc. Natl. Acad. Sci. U. S. A. 114(38), E8035–E8044 (2017). [CrossRef]  

24. D. Dankort, D. P. Curley, R. A. Cartlidge, B. Nelson, A. N. Karnezis, W. E. Damsky Jr., M. J. You, R. A. DePinho, M. McMahon, and M. Bosenberg, “Braf(V600E) cooperates with Pten loss to induce metastatic melanoma,” Nat. Genet. 41(5), 544–552 (2009). [CrossRef]  

25. A. W. Stanton, P. Kadoo, P. S. Mortimer, and J. R. Levick, “Quantification of the initial lymphatic network in normal human forearm skin using fluorescence microlymphography and stereological methods.”

26. J. Lee, V. Srinivasan, H. Radhakrishnan, and D. A. Boas, “Motion correction for phase-resolved dynamic optical coherence tomography imaging of rodent cerebral cortex,” Opt. Express 19(22), 21258–21270 (2011). [CrossRef]  

27. R. K. Wang, “Optical Microangiography: A Label Free 3D Imaging Technology to Visualize and Quantify Blood Circulations within Tissue Beds in vivo,” IEEE J. Sel. Top. Quantum Electron. 16(3), 545–554 (2010). [CrossRef]  

28. P. Y. Simard, D. Steinkraus, and J. C. Platt, “Best practices for convolutional neural networks applied to visual document analysis,” in Seventh International Conference on Document Analysis and Recognition (ICDAR) (2003), pp. 958–963.

29. L. R. Dice, “Measures of the Amount of Ecologic Association Between Species,” Ecology 26(3), 297–302 (1945). [CrossRef]  

30. M. J. Girard, N. G. Strouthidis, C. R. Ethier, and J. M. Mari, “Shadow removal and contrast enhancement in optical coherence tomography images of the human optic nerve head,” Invest. Ophthalmol. Visual Sci. 52(10), 7738–7748 (2011). [CrossRef]  

31. P. P. Srinivasan, S. J. Heflin, J. A. Izatt, V. Y. Arshavsky, and S. Farsiu, “Automatic segmentation of up to ten layer boundaries in SD-OCT images of the mouse retina with and without missing layers due to pathology,” Biomed. Opt. Express 5(2), 348–365 (2014). [CrossRef]  

32. S. Minaee, Y. Boykov, F. Porikli, A. Plaza, N. Kehtarnavaz, and D. Terzopoulos, Image Segmentation Using Deep Learning: A Survey (2020).

33. N. Otsu, “A Threshold Selection Method from Gray-Level Histograms,” IEEE Trans. Syst., Man, Cybern. 9(1), 62–66 (1979). [CrossRef]  

34. R. Nock and F. Nielsen, “Statistical Region Merging,” IEEE Trans. Pattern Anal. Machine Intell. 26(11), 1452–1458 (2004). [CrossRef]  

35. D. Nameirakpam, K. Singh, and Y. Chanu, “Image Segmentation Using K -means Clustering Algorithm and Subtractive Clustering Algorithm,” Procedia Computer Science 54, 764–771 (2015). [CrossRef]  

36. L. Najman and M. Schmitt, “Watershed of a continuous function,” Signal Process. 38(1), 99–112 (1994). [CrossRef]  

37. M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active contour models,” Int. J. Comput. Vision 1(4), 321–331 (1988). [CrossRef]  

38. D. Comaniciu and P. Meer, “Mean shift: a robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Machine Intell. 24(5), 603–619 (2002). [CrossRef]  

39. K. Zhang, J. T. Kwok, and M. Tang, “Accelerated Convergence Using Dynamic Mean Shift,” in Computer Vision – ECCV 2006, (Springer, 2006), 257–268.

40. Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Anal. Machine Intell. 23(11), 1222–1239 (2001). [CrossRef]  

41. N. Plath, M. Toussaint, and S. Nakajima, “Multi-class image segmentation using conditional random fields and global classification,” in Proceedings of the 26th International Conference on Machine Learning (2009), p. 103.

42. J. Starck, M. Elad, and D. L. Donoho, “Image decomposition via the combination of sparse representations and a variational approach,” IEEE Trans. on Image Process. 14(10), 1570–1582 (2005). [CrossRef]  

43. S. Minaee and Y. Wang, “An ADMM Approach to Masked Signal Decomposition Using Subspace Representation,” IEEE Trans. on Image Process. 28(7), 3192–3204 (2019). [CrossRef]  

44. M. H. Hesamian, W. Jia, X. He, and P. Kennedy, “Deep Learning Techniques for Medical Image Segmentation: Achievements and Challenges,” J. Digit. Imaging 32(4), 582–596 (2019). [CrossRef]  

45. R. V. M. d. Nóbrega, S. A. Peixoto, S. P. P. d. Silva, and P. P. R. Filho, “Lung Nodule Classification via Deep Transfer Learning in CT Lung Images,” in 2018 IEEE 31st International Symposium on Computer-Based Medical Systems (CBMS) (2018), pp. 244–249.

46. J. W. Breslin, Y. Yang, J. P. Scallan, R. S. Sweat, S. P. Adderley, and W. L. Murfee, “Lymphatic Vessel Network Structure and Physiology,” Compr. Physiol. 9(1), 207–299 (2018). [CrossRef]  

Figures (9)

Fig. 1. (A) OCT cross-sectional image and (B) histology of mouse ear. (C) Deep learning training compared to inference.

Fig. 2. (A) Architecture of the U-Net-based CNN used in the experiments. (B) Mean training loss curve. (C) Mean training Dice coefficient obtained over five cross-validation folds. (D) Precision-recall (PR) curve of the proposed approach based on the 5-fold cross-validation; corresponding AP values are given in parentheses.

Fig. 3. Flowchart of the proposed method. The results of each step are represented by (A)–(F). The details of the intensity-thresholding steps are shown in (G)–(K). (G) Enlarged image of Fig. 3(E). (H) Binarized image of (G). (I) Overlay of the enlarged (B), colored red, on (H). (J) Enlarged image of the yellow square area in (I). (K) Final segmentation after applying the search window in (J).

Fig. 4. Photograph of mouse with orientation of the base-periphery axis, OCT cross-sectional images, and top-boundary and cartilage predictions from (A) WT mouse ears, (B) BRAF CA/CA;Pten loxp/loxp, and (C) BRAF V600E/V600E;Pten -/- mouse ears. The green lines represent the scanned lines.

Fig. 5. Cross-sectional images of (A) WT and (B) BRAF V600E/V600E;Pten -/- mouse processed with (C, D) the proposed method, (E, F) the Hessian vesselness filter, and (G, H) the intensity-threshold method. Magenta lines in Fig. 5(A)–5(H) represent the human-annotated result.

Fig. 6. (A) Illustration showing the x, y, and z directions. (B) Maximum intensity projection (MIP) of human-annotated lymphangiography. MIP of segmented lymphatic vessels using (C) the proposed method, (D) the Hessian vesselness filter, and (E) intensity thresholding along with cartilage removal. (F, G) MIP of segmented lymphatic vessels using the Hessian vesselness filter and the intensity-thresholding method but without cartilage removal.

Fig. 7. (A) In vivo fluorescence lymphangiography. FITC-dextran is cutaneously injected as a depot. Scale bar = 1 mm. (B) In vivo OCT lymphangiography. Lymphatic vessels (green) segmented by the proposed method merged with OCT angiography (red). (C) Lymphatic vessels (green) segmented by the Hessian vesselness filter (with manual cartilage removal) merged with OCT angiography (red).

Fig. 8. (A) Illustration of the image processing procedure. Photograph, MIP en face map, and 3D reconstruction of the merged lymphatic (grey) and vascular (red) images of (B) healthy skin, (C) skin with amelanotic melanoma (lacking melanin), and (D) skin with pigmented melanoma in mouse ears.

Fig. 9. (A, B) OCT cross-sectional images. (C, D) Cartilage predictions.

Tables (1)

Table 1. Performance comparison.

Equations (1)

$$\text{Dice coefficient} = \frac{2\,\lvert I_m \cap I_g\rvert}{\lvert I_m\rvert + \lvert I_g\rvert}$$
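The Dice coefficient above can be computed in a few lines of NumPy, where $I_m$ is the machine-segmented binary mask and $I_g$ the ground-truth (human-annotated) mask. This is a minimal sketch; the function name and the toy masks are illustrative, not from the paper.

```python
import numpy as np

def dice_coefficient(mask_pred: np.ndarray, mask_gt: np.ndarray) -> float:
    """Dice coefficient between a predicted binary mask (I_m) and a
    ground-truth mask (I_g): 2*|I_m ∩ I_g| / (|I_m| + |I_g|)."""
    intersection = np.logical_and(mask_pred, mask_gt).sum()
    total = mask_pred.sum() + mask_gt.sum()
    return 2.0 * intersection / total if total > 0 else 1.0

# Toy example: two partially overlapping 4x4 masks (6 pixels each, 4 shared)
pred = np.zeros((4, 4), dtype=bool)
gt = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True
gt[1:3, 0:3] = True
print(round(dice_coefficient(pred, gt), 3))  # prints 0.667
```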