Optically guided level set for underwater object segmentation

Zhe Chen; Nan Qiu; Hong Song; Lizhong Xu; Yunbo Xiong

doi:10.1364/OE.27.008819

1. Introduction

Object segmentation from underwater images offers condensed and informative details that can be used for many aspects of underwater research (e.g., underwater object recognition [1] and image classification [2]) and applications (docking control [3], structure flaw detection, and marine resources exploration [4], etc.). The aim of underwater object segmentation is to accurately identify the contour of underwater objects. In contrast to numerous successful methods employed on the ground, few object segmentation methods have been proposed for underwater images due to the well-known difficulties associated with underwater optical imaging [5]. In general, underwater optical imaging is seriously influenced by challenging underwater optical environments from two aspects: (i) the wavelength-selective absorption and scattering effects significantly reduce and distort the contrast between the object and the background [6], and (ii) the inhomogeneous illumination effect generates false and deceptive appearances, giving the impression of being underwater objects themselves [7]. Moreover, underwater object segmentation can be difficult due to the existence of complicated and variable contextual textures in the scene background. In such situations, none of the existing techniques can accurately segment objects in underwater scenes [8,9]. However, discerning object morphology details is quite important for many underwater applications.

An obvious idea for underwater object segmentation would be to apply the level set method to underwater images, given its excellent performance and strong theoretical basis. The basic concept of the level set method is to represent the object contour as a zero level set of the level set function (LSF), and the contour evolution process is formulated by an energy function [10].

Ε (ϕ) = u ℜ_{p} (ϕ) + λ L_{g} (ϕ) + α A_{g} (ϕ),

where

ϕ

is the level set;

u

,

λ

, and

α

are the coefficients of the energy function;

ℜ_{p} (ϕ)

is the level set regularization term; and

L_{g} (ϕ)

and

A_{g} (ϕ)

are the external energies. Energy

Ε (ϕ)

is designed such that it will reach the minimum when the level set

ϕ

stops at the desired position (object contour). Intrinsically, the level set method transforms the object segmentation to a mathematical optimization issue, wherein we can find a solution with mathematical convergence. This optimization process is motivated and controlled by an edge stop function (ESF) g.

g = \frac{1}{1 + {| \nabla G_{σ} \times I |}^{2}},

where

G_{σ}

is the Gaussian kernel with standard deviation

σ

and the convolution calculation

\nabla G_{σ} \times I

is used to smooth images.

Consequently, let $ϕ$ be an LSF defined on a region $Ω$ . The LSF can be analytically expressed as

Ε (ϕ) = u \int_{Ω} \frac{1}{2} ({| \nabla ϕ - 1 |}^{2}) d x d y + λ \int_{Ω} g δ (ϕ) | \nabla ϕ | d x d y + α \int_{Ω} g H (ϕ) d x d y,

where

H (\cdot)

is the Heaviside function and

δ (\cdot)

is the Dirac function.

H_{ω} (x) = {\begin{cases} 0, x < - ω \\ 1, x > ω \\ \frac{1}{2} [1 + \frac{x}{ω} + \frac{1}{π} \sin (\frac{π x}{ω})], | x | \leq ω \end{cases}

and

δ_{ω} (x) = {\begin{cases} 0, | x | > ω \\ \frac{1}{2 ω} [1 + \cos (\frac{π x}{ω})], | x | < ω \end{cases}

where

ω

is the smooth moderation parameter.

In theory, the success of Eq. (3) depends on a strict correspondence between our desired edges and the ESF calculated using Eq. (2). This is the case for images on the ground, whereas for underwater images, this correspondence is no longer obeyed, which makes the ESF ill-posed and causes the LSF to converge at a distance from our desired objects. To demonstrate this problem, a qualitative comparison is presented with our carefully designed calibrations (see the first row in Fig. 1) and natural images (see the second row in Fig. 1). The only difference between the samples in the first rows of Figs. 1 (a) and (c) is that they indicate whether or not the imaging light transmits through the water column. Equation (2) is employed to generate the deviation maps for all samples (Figs. 1(b) and 1(d)). From the experimental results shown in Fig. 1, we conclude that the water medium has a significant effect on the Gaussian deviation, which is degenerated on our desired edges but obvious on contextual textures, thus yielding distorted and noisy deviation maps for underwater images.

Fig. 1 Comparison of image deviation using Eq. (2). (a) Original images in the water, (b) deviation maps in the water, (c) original images on the ground, and (d) deviation maps on the ground. (b) and (d) are the results given by Eq. (2).

Download Full Size | PDF

Another practical issue is the initialization for the level set. Although the random level set initialization can converge on desired edges with a forward-and-backward (FAB) diffusion mechanism [10] in theory, this can only be realized with a simple background. Hence, in most cases, a man–machine interaction is required for level set initialization, making the initial contour closer to our desired regions. However, this trick seriously reduces the automation and feasibility of the level set method. To solve this problem, an image-clustering preprocessing method was introduced to replace the manual process [11,12]. However, it is still difficult to ensure that the initial level set correctly includes the object of interest, because the clustering methods can only provide clusters with homogeneous intensity, and in some cases, objects are likely to be split into different clusters. This difficulty is aggravated under water in that the inhomogeneous illumination inevitably splits an underwater image into regions with different intensities.

These difficulties are systematically handled by our novel optically guided level set method. Active imaging technologies have been widely adopted in underwater environments to compensate for imaging light attenuation [13]. To maximize the visibility of underwater images, it is necessary to shorten the underwater imaging distance and ensure that the optical axis of the light source collimates the center of the underwater objects. This underwater imaging type generates an optical collimation effect, which causes the intensity bias over underwater images. In previous methods, the intensity bias over underwater images was considered as a problem. However, in our method, the optical collimation effect is transformed into valuable guidance for underwater object segmentation. By recognizing optical collimation, we can roughly identify the candidate region of underwater objects. Furthermore, the structures of the objects can be recognized from the irradiation distribution. In general, the optical collimation guidance can benefit underwater object segmentation from the following three aspects:

(i) Evolution: The optical collimation guidance can help the LSF to evolve correctly toward the object regions.
(ii) Initialization: The optical collimation guidance can ensure a correct level set initialization that is spatially close to the underwater objects.
(iii) Automation: The optical collimation guidance can eliminate the need for man–machine interaction and segment the underwater objects with a completely automated pattern.

In general, the aim of our method can be deemed a combination of underwater object detection and underwater image segmentation as it can identify both the location and the contours of objects from underwater images. To achieve this, our novel optical guided level set method combines the underwater optical principles (optical collimation guidance) and mathematical optimization process (level set method), which is the fundamental novelty of our method. Based on this novelty, our method can transform optical challenges in underwater environments into valuable guidance for underwater object segmentation. This advantage is the key reason behind the success of our method.

The remainder of this paper is organized as follows. Section 2 reviews the related literature. The optical models and optical collimation guidance in underwater environments are analyzed in Section 3. Section 4 presents the optically guided level set. The experimental results and analysis are provided in Section 5. Section 6 concludes the paper.

2. Related works

Few underwater object segmentation methods exist at this time owing to the difficulties caused by the water medium. At present, most underwater object detection or segmentation tasks utilize methods that have been proved successful on the ground. However, this strategy is questionable as it overlooks challenges posed by underwater environments in particular. In general, underwater object segmentation can be categorized as follows with regard to two object classifications: man-made calibrations and natural objects.

2.1 Man-made calibrations

In ocean engineering, deploying carefully designed calibrations is the most feasible locating method as man-made calibrations are designed with distinguishable appearances and previously designed templates may be used to locate them. For instance, Yu et al. designed four orange and yellow LED lights to indicate specific landmark points. The signboard segmentation results were utilized to support the short-range navigation for an autonomous underwater vehicle platform [14]. The brightness feature was also considered for detecting man-made calibrations. For instance, Lee made docking marks using an underwater light-emitting diode ring with five large lights [3]. In Lee’s study, the power of the lights was adjustable to obtain an optimal threshold for image segmentation.

Empirically, we can obtain a high segmentation accuracy using man-made calibrations. However, these methods can adapt to special engineering tasks only in controlled underwater environments; apart from these carefully designed calibrations, underwater objects do not have such distinguishable appearances.

2.2 Natural objects

In contrast to man-made calibration segmentation, natural objects are more difficult to segment due to special challenges under water. Moreover, difficulties are also posed by the limited prior regarding strange objects. Many researchers tried to eliminate the haze and color distortion effects caused by water, their aim being to extract distinguishable features of natural objects. In practice, this idea is realized by a multiple-phase framework in which objects are segmented after image preprocessing. For instance, Lee et al. found that color restoration can improve detection of corners from underwater images [15]. Recently, a similar strategy was realized using the method proposed by Kim et al. [16]. The experimental results demonstrated the satisfying performance of SIFT corner for extracting sharp structures. However, this method cannot be generalized to most underwater objects, which commonly have smooth contours but lack enough corners. Other two-phase frameworks were established with a coarse-to-fine process, where the image saliency is first detected to initially identify the candidate regions for underwater objects, and the object segmentation results are then obtained by a refining process. For instance, Edgington et al. introduced the famous Itti model to initially identify pop-out in underwater scenes. Then, the segmentation results were obtained using a thresholding method [17]. However, the use of the Itti model is problematic, because it is believe that the inputs of the model, such as the color, intensity, and orientation, are not salient in underwater scenes [17]. This problem was acknowledged in recent research, where color correction was performed before saliency detection [1,18]. Recently, our team proposed a method based on intensity–color–depth feature fusion for object segmentation [19], which can efficiently provide comparable results without any preprocessors. However, this method can only detect rough object regions rather than their accurate contours.

2.3 Background of our method

Our method segments strange natural objects in underwater images and thus relates closer to the second aforementioned category. However, our method differs from existing methods for natural object segmentation in that it investigates the potential of underwater optical principles for improving underwater object segmentation. In theory, our method is based on a novel strategy: segmenting natural underwater objects under optical collimation guidance. This is a radically new strategy for underwater object segmentation, which combines practical and accepted implementation in the field of underwater optics and mathematical optimization research. In practice, our method can transform optical challenges in underwater environments into valuable guidance for underwater object segmentation. Thus, our method can accurately find and describe the structures of underwater objects.

3. Underwater active optical imaging model and optical collimation

Active optical imaging is widely adopted in underwater applications as additive light sources can efficiently compensate for the attenuation of imaging light under water. The process of underwater active optical imaging is illustrated in Fig. 2. Light is emitted from the underwater light source, reflected by the object, and enters the camera, where the images of the underwater object are captured.

Fig. 2 Illustration of underwater optical imaging. Light is emitted by the light source (denoted as A), reflected by the object (e.g., P), and enters the camera (C). The angle between the illuminating ray AP and the optical axis of the light source is denoted as φ. The angle between the imaging ray PC and the optical axis of the camera (i.e., DC) is denoted as θ. The projection of ray AP in the optical axis is denoted as AB and the length is denoted as z. During the imaging process, light scattered by particles in the water column (e.g., Q) enters the camera as well, thus blurring the image.

Download Full Size | PDF

3.1 Underwater light field

With regard to the light field generated by the underwater light source (see Fig. 3), the irradiance of light along the optical axis (line AB in Fig. 3) decreases as the distance from the light source increases. Without losing generality, we assume that the decrease in the irradiance of light at the optical axis follows a quadratic law, and thus consider that light is attenuated by the water. Then, the irradiance along the optical axis (denoted as $E (z, λ)$ ) can be represented as

E (z, λ) = E_{0} (λ) \frac{1}{{(z + a)}^{2}} e^{- α (λ) z},

where z represents the distance from a point on the optical axis to the light source, and λ is the wavelength of light.

E_{0} (λ)

is the initial irradiance (z = 0). The quadratic term

\frac{1}{{(z + a)}^{2}}

represents that the irradiance changes with the distance z in a quadratic manner. Parameter a is added in the denominator to avoid infinite irradiance when the distance z approaches zero. The exponential term

e^{- α (λ) z}

accounts for the light attenuation by water with the attenuation coefficient of water α(λ).

Fig. 3 Illustration of underwater light field distribution. Plane S is perpendicular to the optical axis of the underwater light source, the distance between them being z. The distance between points P and B is denoted as $\vec{r}$ . For any perpendicular plane located at a distance z = z₀ from the underwater light source, the irradiance E(λ) at any point generally decreases with its distance to the optical axis.

Download Full Size | PDF

Consider a point P in the underwater object, which is not necessarily in the optical axis. The irradiance at P (denoted as $E_{P} (λ)$ ) can be represented as

E_{P} (λ) = E (z, λ) f (\vec{r}),

where the function

f (\vec{r})

represents the distribution of irradiance in the plane S perpendicular to the optical axis, and the vector

\vec{r}

denotes the coordinates of point P in the plane. Considering that the light field is generally symmetric around the optical axis, the function

f (\vec{r})

can be approximated, for instance, by a Gaussian or quadratic function.

3.2 Underwater light field

We denote the reflectance of point P as $R_{P} (λ)$ . Then, the radiance reflected from P (denoted as $L_{P} (λ)$ ) can be represented as

L_{P} (λ) = R_{P} (λ) E_{P} (λ) / 4 π .

According to the underwater imaging models [13,20], the irradiance at the image point P in the image sensor (denoted as $E_{P'} (λ)$ ) can be represented as

E_{P'} (λ) = A (λ) G (l) e^{- α (λ) l} L_{P} (λ) + E_{b} (λ) e^{- β (λ) l} + E_{s} (λ),

where

A (λ)

is used to represent the combined transitivity of the lens, optical window, and color filter; and

G (l)

represents the geometric factor of the imaging system. The exponential term

e^{- α (λ) l}

represents the light loss due to water attenuation by the Beer–Lambda law [21], where

l

is the underwater distance from the object to the camera.

The term $E_{b} (λ) e^{- β (λ) l}$ represents the influence of optical scattering on underwater images with parameters $E_{b} (λ)$ and $β (λ)$ . Due to optical scattering under water, light emitted from all points except P’ also enters the camera and increases the brightness of the image point P’, leading to an undesirable brightness increment in P’. It’s worth noting that the scattered light contributes to the brightness increment not only in P’, but also over the whole image. Thus, the contrast of underwater images is significantly degraded and the image is blurred. $E_{s} (λ)$ represents the influence of stray light under water, which adds to the haze in the underwater images as well.

From Eq. (9), it is clear that the light reflected from the object is attenuated by the water, leading to information degeneration for the image point P’. Simultaneously, scattered light from other points enters the camera and leads to information distortion. These two factors jointly cause severe difficulties in underwater object segmentation.

3.3 Underwater optical collimation

To tackle challenging underwater environments and compensate for the light attenuation caused by the water during the underwater imaging processes, light sources always positively collimate objects. This process is called optical collimation. On the one hand, the optical collimation of the light source can significantly increase the visibility of underwater images. On the other hand, the optical collimation presents more opportunities to correctly detect and segment underwater objects due to the overlap between the object region and the region of optical collimation.

Given an underwater image, an intuitive idea for optical collimation recognition is to use a thresholding method under the assumption that the light from the additive sources displays the highest intensity over underwater images. This is true in pure environments where the scene is covered only by the parallel ambient light, but this is not the case for most underwater environments, where the scene radiation is also derived from a mixture of skylight and self-illuminations. As the result, the area including the highest intensity in underwater images may deviate far from the optical collimation.

However, the optical collimation exhibits other special characteristics in underwater images. First, the optical collimation will distort the smoothness of underwater images as they provide additional irradiance $E_{P} (λ)$ at point P. Hence, the optical collimation region will be locally inhomogeneous and have a large contrast (not intensity) compared to other regions. Second, according to Eq. (7), the intensity of $E_{P} (λ)$ decreases with an increasing distance between point P and the optical axis. Hence, the intensity of any point inside the optical collimation region depends on its position, and the intensity will increase if the point is located closer to the light axis and vice versa. However, such an effect cannot be presented in other regions where parallel light dominates. Third, spatially, additive light sources are much closer to the object, in contrast to the ambient light and skylight. In this situation, according to the Beer–Lambda law in Eq. (9), the attenuation of the light from underwater light sources is relatively much lower, especially in the red channel. As the result, light collimation regions will present a large contrast to other regions in the red channel. Fourth, the low wavelength-selective attenuation of additive light sources will significantly reduce the channel variation inside the optical collimation region.

The above four optical principles can be measured with the following four image features.

(i) Global intensity contrast or the point-to-point differences in the intensity: It measures the inhomogeneity in the intensity between the optical collimation region and other regions.
(ii) Intensity–position relation or position-dependent intensity: It measures the underwater imaging light field in different positions.
(iii) Red-channel contrast or the point-to-point differences in the red channel: It measures the difference in the light attenuation in the red channel between the optical collimation region and other regions.
(iv) Channel variation or the intensity difference between channels: It measures the variation between channels over images.

In underwater images, the optical collimation of underwater light sources locates in the region where all the above optical principles are satisfied simultaneously. This nature can be formally expressed by following four relationships.

(i) Inverse relationship between the global intensity contrast and intensity–position relation: Inside the optical collimation region, the points with a higher global intensity contrast must be spatially closer to the point with the maximum intensity in local regions, and vice versa.
(ii) Positive relationship between the global intensity contrast and the red-channel contrast: Inside the optical collimation region, a correspondence is observed between the global contrast in the intensity and the red channel.
(iii) Inverse relationship between the global intensity contrast and channel variation: Inside the optical collimation region, the points with a higher intensity contrast must have lower channel variation, and vice versa.
(iv) Positive relationship between the intensity–position relation and channel variation: Inside the optical collimation region, the points with a lower channel variation must be spatially closer to the point with the maximum intensity in local regions, and vice versa.

We mathematically scale these four relationships using two-dimensional correlation, as shown in Table 1.

Table 1. Correlations between different features.

View Table | View all tables in this article

Here, $c o r r 2 ()$ is the operator for the two-dimensional correlation calculation, and $C_{}^{i}$ , $D_{}^{d}$ , $C_{}^{r}$ , and $V_{}^{c}$ are the numerical matrixes of the intensity contrast, intensity–position relation, red-channel contrast, and channel variation. The elements for each matrix can be calculated as follows.

C_{x}^{i} = \sum_{\forall y \in Ν} C (I_{x}^{i}, I_{y}^{i}) = \sum_{\forall y \in Ν} ‖ I_{x}^{i} - I_{y}^{i} ‖,

where

C (I_{x}^{i}, I_{y}^{i})

is the absolute intensity difference between points x and y, N is the formatted presentation of all points in the images, and the superscript i indicates the intensity.

D_{x}^{d} =exp (D (x, m)) = \exp (\sqrt{{(ξ_{1} - ξ_{2})}^{2} + {(γ_{1} - γ_{2})}^{2}}),

where

D (x, m)

is the Euclidean distance from the point x to the point m, which has the highest intensity in local regions, and

(ξ_{1}, γ_{1})

and

(ξ_{2}, γ_{2})

are the coordinates of points x and m.

C_{x}^{r} = \sum_{\forall y \in Ν} C (I_{x}^{r}, I_{y}^{r}) = \sum_{\forall y \in Ν} ‖ I_{x}^{r} - I_{y}^{r} ‖ .

C (I_{x}^{r}, I_{y}^{r})

is the absolute intensity difference between points x and y in the red channel, and the superscript r denotes the red channel.

V_{x}^{c} = \sqrt{\sum V (I_{x}^{c}, I_{x}^{i})} = \sqrt{{(I_{x}^{r} - I_{x}^{i})}^{2} + {(I_{x}^{g} - I_{x}^{i})}^{2} + {(I_{x}^{b} - I_{x}^{i})}^{2}},

where

V (I_{x}^{c}, I_{x}^{i})

is the channel variance of point x in the

r * g * b

color space.

Finally, the optical collimation region can be identified by comprehensively integrating the above correlations.

S = c o r r 2 (C_{}^{i}, (1 - D_{}^{d})) c o r r 2 (C_{}^{i}, C_{}^{r}) c o r r 2 (C_{}^{i}, (1 - V_{}^{c})) c o r r 2 (D_{}^{d}, V_{}^{c}) .

A desirable advantage of S is that it can identify the optical collimation via its numerical values. Here, we use the thresholding method to recognize the optical collimation.

L = {\begin{cases} 1 i f S > T \\ 0 o t h e r w i s e \end{cases},

where T is the threshold given by the OTSU method [22]. Consequently, the optical collimation region is determinatively located at points where L = 1.

Moreover, the irradiation distribution inside the optical collimation region can be estimated.

W = L (C_{}^{i} + C_{}^{r} - D_{}^{d} - V_{}^{c}) .

4. Optically guided level set

Recall that our method segments underwater objects with the level set. In theory, the performance of the level set depends on two issues: the level set initialization and the level set evolution. As mentioned in Section 1, these two issues pose serious difficulties in underwater environments. These issues can be systematically overcome using our optical collimation guidance.

4.1 Optically guided level set initialization

Initialization and configuration of the level set are desired to enclose the underwater object as tightly as possible. Doing so can verify and speed up the convergence of the zero level set on the object edges, which is more important as underwater scenes have numerous textures. In this study, we initialize the level set according to the overlap between the object region and optical collimation region under water.

ϕ_{0} (x, y) = - 4 ω (0.5 - L),

where

ω

is a constant regulating the Dirac function in Eq. (5), and L is the normalized determinative value obtained from Eq. (8). We define the LSF that takes negative values inside the initial level set and positive values outside it. As the initial contour detected by the optical collimation region necessarily encloses the object, the parameter

α

in Eq. (3) should be positive to maintain a shrunk initialized level set during the evolution process.

4.2 Optically guided level set evolution

Evolution of the level set is desired to rapidly move the level set toward object edges. Then, the evolution process stops and the LSF converges to the minimum. Here, the ESF g in Eq. (2) plays a critical role in controlling the evolution process. Because our method involves segmenting objects in underwater images, the ESF should appropriately identify the deviation between the object and background, while remaining smooth inside or outside the objects. This indicates that our ESF must have an active discriminability for underwater object edges.

As mentioned in Section 3, the light source necessarily collimates objects during underwater imaging. This means that the irradiation inside the optical collimation reaches the maximum at the center of the object body, decays at the edge, and gradually disappears. Hence, the object edge must present the mean value of the irradiation intensity. Hence, it is convenient to identify the object edges after normalizing the optical collimation distribution.

ρ (W) = β {(2 (W - 0.5))}^{2},

where

W

is the optical collimation distribution given by Eq. (16), and

β

is the weighting coefficient.

ρ

is a convex function, which will reach the global minimum nearby object edges, and thus, it mathematically ensures the convergence of the level set evolution. Our novel optically guided ESF can be modeled as follows.

g^{'} = g ρ .

In contrast to $g$ in Eq. (2), $g^{'}$ in Eq. (19) shows better performance for accurate identification of the deviations at object edges. A sample of the optically guided ESF calculated for a real underwater image is shown in Fig. 4. We can see that the textual background leads to serious clutter in original image deviations g (Fig. 4(b)). However, this type of clutter is largely smoothed in the map of $ρ$ (Fig. 4(c)), and our optically guided ESF can accurately identify the object edge. This is a desired property as it ensures that our optically guided ESF can force the initial level set to ultimately converge on the exact object edges. In our model, the standard ESF is replaced with our optically guided ESF to segment underwater objects.

Fig. 4 Optically guided ESF $g^{'}$ , (a) original underwater image, (b) map of image deviations g, (c) map of the coefficient optical guidance $ρ$ , and (d) map of $g^{'}$ .

Download Full Size | PDF

4.3 Optically guided parameter tuning

Our optically guided level set method involves several adjustable parameters (Table 2). Each of them plays a special role in model behaviors. Therefore, it is necessary to properly adjust these parameters, which unfortunately vary from case to case. For example, it is known that for a large-sized image, a large time step $τ$ is preferred. However, this empirical rule is unsuitable for a situation with small-sized objects, especially when the level set approaches the object. Moreover, underwater images present some peculiarities unlike common images. For example, underwater images are smoother, and thus, we prefer a smaller $σ$ for the Gaussian kernel to retain the details of the underwater images. The desired threshold $T$ must create an optical collimation region (initial LSF) that encloses the object accurately. According to the underwater active optical imaging model in Eq. (7), the optical collimation distribution mostly satisfies the Gaussian mode. Thus, an adaptive threshold $T$ can be set.

T = ave (S) + 2 var (S),

where

ave (S)

and

var (S)

are the average and variance of the optical collimation distribution. The threshold

T

given by Eq. (20) can adapt to all underwater environments. The parameter

β

is used to refine the edges between the object and background. Due to the low contrast of the underwater image, a relatively large

β

is preferred; thus,

β = 2

. Given the initial LSF

ϕ_{0}

from the optical collimation, it is easy to calculate the contour length

l

and the area

ℑ

.

l = \int δ (ϕ_{0}) d x d y

and

ℑ = \int H^{'} (ϕ_{0}) d x d y,

where the Heaviside function

H^{'} (ϕ_{0})

is defined as

Table 2. Parameters controlling optically guided level set segmentation.

View Table | View all tables in this article

H^{'} (ϕ_{0}) = {\begin{cases} 1, ϕ_{0} \leq 0 \\ 0, ϕ_{0} > 0 \end{cases} .

We find that if the underwater objects are large and the level set contour is circle-like, the LSF evolution will be faster. This case can be mathematically described by the isoperimetric inequality as a ratio between the area and length.

ς = ℑ / l .

The penalty weight for the regularization term $u$ is set accordingly.

u = 0.2 / ς .

As $u \times τ \leq 0.25$ is necessary to maintain a stable evolution. From Eqs. (24) and 25, a large-sized object will lead to a large $ς$ and a small $u$ , and the interaction time step $τ$ is simultaneously and adaptively adjusted to a large value. As our new level set initialization is spatially close to the object edge, the weight for the edge description term $λ$ is set as

λ = 0.5 ς .

The weight parameter $α$ for the area description term is responsible for two effects in the level set evolution process. First, the sign of $α$ determines the direction of the LSF evolution, namely, positive for shrinkage and negative for expansion. Second, a larger $α$ can move the level set faster, and vice versa. Notice that our new level set initialization necessarily encloses the object region. Thus, a positive $α$ is used to control the evolution direction at the start point. In a previous study on level set methods [10], $α$ is assigned as a global constant value during the evolution process. However, this strategy is questionable as it is desirable to speed up the evolution when the level set is located far away from the object and lessen its speed near object edges. Recall that the optical collimation distribution is symmetric around the optical axis that focuses on the object. Hence, the optical collimation distribution can be regarded as a non-scale distance from the object. Based on this measurement, an adaptive parameter for the area description term can be obtained.

α = 2 (0.5 - W)

As $W$ is normalized from 0 to 1, the speedup force $α$ ( $\in [- 1, 1]$ ) controls the shrinking or expanding force at each time step. In other words, when the level set contour locates outside the object region ( $0 \leq W \leq 0.5$ ), $0 \leq α \leq 1$ and the LSF will be attracted toward the objects. On the contrary, $- 1 \leq α < 0$ once the level set locates inside the object region ( $0.5 < W \leq 1$ ).

Our optical collimation guidance provides several contributions to the level set method, which can be concluded from two aspects: initialization and convergence. These contributions jointly yield a novel optically guided level set method that is adaptable to automatically segmenting objects from diverse underwater backgrounds. To be specific, this adaptability can initialize the level set to correctly enclose the underwater object, and then, the direction and speed of the level set evolution are adaptively determined with its distance from the object. This allows a large iteration of evolution $Κ$ , and prevents insufficient or excessive segmentation.

5. Experimental results and discussion

We evaluate our optically guided level set method and compare it with the state-of-the-art methods for underwater benchmark data sets. The compared methods include the DRLSE method [10], the local intensity clustering-based level set method [23], the edge-region-based level set method [24], the spatial fuzzy clustering-based level set method [11], the fuzzy region competition-based level set method [12], and the localized region-based active contour [25]. These diverse comparisons can comprehensively demonstrate the performance of our new level set method from different aspects.

Compared with the DRLSE, which is the basis of our proposed method, the contribution of our optical collimation guidance for underwater object segmentation can be clearly demonstrated. In theory, the second method (local intensity clustering-based level set method) can address the inhomogeneity in intensity over images by using the local intensity clustering property. This is a desired property for underwater image segmentation as intensity inhomogeneity is ubiquitous over underwater images. The third method (edge-region-based level set method) is a typical method that segments images, jointly considering image edges and regions, and thus, is excellent for degenerated images. By introducing the fuzzy clustering method as the preprocessor, the fourth and fifth methods (spatial fuzzy clustering and fuzzy region competition-based level set method) likely show better performance when initializing the level set. Specifically, the man–machine interaction mechanism in the fifth method can actively select the region where the level set evolves. The reason for comparing our method with the last active contours-based method lies in their similar aims for object segmentation.

Moreover, we evaluate our method by comparing it to the salient object detection methods, which have proven successful in pop-out object identification. Hence, these comparisons can fairly demonstrate the usefulness of our method for object identification. The compared salient object detection methods include a filter-based method, such as the Itti method [26]; three frequency spectrum analysis methods, namely the spectral residual (SR) [27], phase Fourier spectrum (PFT) [28], and the hypercomplex Fourier transform (HFT) [29], three statistics-based methods, namely the attention based on information maximization (AIM) [30], boolean map saliency (BMS) [31], and saliency using natural statistics (SUN) [32]; and a graph-based method, namely Graph-based Visual Saliency (GBVS) [33]. For all tests, color underwater images were first converted to their grayscale form.

5.1 Details of our object segmentation process

Our optically guided level set method is first tested on underwater images with plenty textural context, as shown in Fig. 5(a). These results present the details of the initialization and the evolution processes of our optically guided level set method. For example, in Figs. 5(b) and 5(c), our method can consistently evolve the initial LSF toward our desired object edges, providing accurate underwater object segmentation results, as shown in Fig. 5(d).

Fig. 5 Segmentation process and results of our method: (a) Original image, (b) initial LSF, (c) final LSF, and (d) segmentation results.

Download Full Size | PDF

5.2 Comparison with DRLSE

Our results are compared with those obtained by DRLSE method. Notice that the basis of our method is DRLSE, and the only difference between our method and the DRLSE approach is whether our optical guidance is introduced. Hence, this experimental comparison can clearlydemonstrate the contribution of our optical collimation guidance for underwater object segmentation. In the classic DRLSE method, the initial contour is set by a window with a fixed center and size. For example, the default values in the DRLSE demo are maintained as x = 25:35, y = 40:50. However, in practice, this initialization may be distant from the object, seriously affecting the convergence process of the level set. This is demonstrated by the results in Fig. 6. Here, we can see that the inhomogeneous intensity—rather than object edges—controls the evolution process of the level set. As the result, the final contour results are meaningless.

Fig. 6 Segmentation results of the DRLSE method with a fixed initial window (x = 25:35, y = 40:50).

Download Full Size | PDF

The results in Fig. 7 demonstrate the contributions of our optical guidance for underwater object segmentation in steps. From Fig. 7, the initial contour guided from the optical collimation region is close to the underwater objects themselves and excludes a large amount of background textures. This effect practically benefits the level set evolution in that only a small number of iterations are enough for convergence. However, in spite of this advantage, more efforts are necessary because combining our new initialization (Fig. 7(a)) and the traditional ESF (Fig. 7(b)) still cannot provide satisfying results (Fig. 7(c)). The noise-like textures surrounding initial contours cause the LSF to fall into a premature state, which is convergent but distant from underwater objects. This problem can be solved by adding guidance from the optical collimation distribution (Fig. 7(d)). The results in Fig. 7(e) show that our new ESF can maintain appropriate deviation at the object edges and smooth the textures inside or outside object regions. This ensures that the level set evolution finally stops at our desired edges (Fig. 7(f)).

Fig. 7 Segmentation results with our optical collimation guidance. (a) Initial LSF under the optical collimation region guidance, (b) original ESF, (c) final contour by combining (a) and (b), (d) optical collimation distribution, (e) new ESF under the optical collimation distribution guidance, and (f) final contour by combining (a) and (e).

Download Full Size | PDF

Overall, the unique contribution of our optical collimation guidance is evident. As for underwater object segmentation, the optical collimation region guidance can ensure a good start of the level set evolution, and a regular convergence is maintained by the optical collimation distribution guidance. Moreover, during level set evolution, model parameters are adaptively controlled by the optical collimation guidance. These three factors, which characterize our new method, cooperate in a uniform framework to generate satisfying underwater object segmentation results. However, such results cannot be obtained by using only one or two of these factors.

5.3 Comparison with other image segmentation methods

In this section, our method is compared with five typical image segmentation methods that have shown excellent performance for various data sets, and thus have been frequently cited in the literature. The parameters initialized by the optical collimation guidance are shown in Table 3. From the results in Fig. 8, we observe that for underwater object segmentation, the best results are obtained using our method. Among all the methods, only our contour can correctly and tightly enclose underwater objects. Moreover, notably, for local intensity clustering and edge-region-based level set methods, we try our best to manually initialize the level set to improve their performance, as shown in Fig. 9. However, this careful effort does not give the desired results. As shown in Figs. 8(a) and 8(b), the final contour intricately spread over the entire image.

Table 3. Initial parameter values in Fig. 8.

View Table | View all tables in this article

Fig. 8 Segmentation results. (a) Local intensity clustering-based level set, (b) edge-region-based level set, (c) spatial fuzzy clustering-based level set, (d) fuzzy region competition-based level set, (e) localized region-based active contour, and (f) our method.

Download Full Size | PDF

Fig. 9 Manually initialized level set.

Download Full Size | PDF

For the preprocessor-based methods, such as the spatial fuzzy clustering and fuzzy region competition-based level set methods, underwater images are clustered into homogeneous regions before the level set initialization. However, these clusters cannot identify the region that contains the objects. As the result, it is still difficult for these methods to selectively move the level set toward object edges. To solve this problem, we manually select the region including the objects and expect the level set to evolve there. However, in spite of this effort, according to Figs. 8(c) and 8(d), the final results are still incorrect, because the margin between region partitions, and the spur noises seriously disturb the evolution process.

Unlike the level set method, the development of the last active contour method is driven by a uniform modeling energy. It is claimed to have an ability to model the foreground and background as constant intensities represented by their means [25]. In the experiment with the active contour model, we manually set the initial contour to include the underwater objects as the active contour is known to be sensitive to its initialization. However, according to the results given by the active contour (Fig. 8(e)), the man-made initialization cannot evolve into the desired contours even with >1,000 iterations.

Overall, for underwater images, only our method can initialize and develop the contour in a totally automatic and adaptive pattern, while the other compared methods used in this experiment depend on a full-supervision or semi-supervision pattern. Thus, it quite unexpected that our proposed method performed so well. These comparisons highlight the advantage of our optical collimation guidance for underwater object segmentation.

5.4 Quantitative comparison

In addition to qualitative evaluation, the accuracy of the underwater object segmentation can be evaluated quantitatively and objectively using the Jaccard similarity index (JSI), which is the ratio of the overlap between the segmented volume R₁ and the ground-truth volume R₂.

J S I (R_{1}, R_{2}) = \frac{| R_{1} \cap R_{2} |}{| R_{1} \cup R_{2} |} .

The JSI ranges from 0 to 1, and a higher value corresponds to a more accurate segmentation result. The scores for each method are scaled by 10 samples, as shown in Figs. 5 and 8.

The scores in Table 4 show the significant superiority of our method: our method is the only one that correctly segments the objects from underwater scenes as JSI > 0.5, which is commonly deemed as the criterion for a successful segmentation result.

Table 4. JSI values of the image segmentation results shown in Figs. 4 and 8.

View Table | View all tables in this article

5.5 Comparison with salient region detection methods

State-of-the-art salient region detection methods are tested to demonstrate the performance of our method for underwater object identification. The results shown in Fig. 10 indicate diverse outcomes from different salient object detection methods; some methods perform well for underwater object identification while others are susceptible to contextual textures. Among the salient object detection methods, none can provide results comparable to our method for all benchmarks. Moreover, these saliency-based methods can only provide a transition between underwater objects and the background; they cannot accurately depict the object edges and structures, as shown in Figs. 10(b)–10(i).

Fig. 10 Object identification results with our method and salient object detection methods. (a) Ground-truth, (b) Itti, (c) SR, (d) PFT, (e) HFT, (f) AIM, (g) BMS, (h) SUN, (i) GBVS, and (j) our method.

Download Full Size | PDF

6. Conclusion

A novel optically guided level set has been proposed for underwater object segmentation. After exploring the optical principles in underwater environments, we propose a novel type of guidance—optical guidance—that recognizes the optical collimation region and determines the optical collimation distribution of the underwater light source. Under a non-supervision pattern, this guidance automatically ensures that the initial contour correctly encloses the objects and evolves toward the desired edges. Also, this ability is intrinsically adapted to various underwater environments. Combining our novel optical collimation guidance with DRLSE, we obtain a novel optically guided level set. In contrast to existing image segmentation methods as well as the methods for salient object detection, our method shows better performance for identifying the object region and object contour from underwater images.

Our results demonstrate that the optical principles in challenging underwater environments provide us important cues for overcoming the difficulties inherent in such conditions. The work presented in this paper is essentially interdisciplinary in that it involves the application of optical principles to the mathematical optimization process. This is a novel idea in underwater optical imaging and image processing, and will motivate future research in areas as diverse as computer vision.

Funding

National Natural Science Foundation of China (NSFC) (61671201 and 61701166), and the Fundamental Research Funds for the Central Universities (2017B01914).

References

1. Meng-Che Chuang, Jenq-Neng Hwang, and K. Williams, “A feature learning and object recognition framework for underwater fish images,” IEEE Trans. Image Process. 25(4), 1862–1872 (2016). [PubMed]

2. Y. Li, H. Lu, J. Li, X. Li, Y. Li, and S. Serikawa, “Underwater image de-scattering and classification by deep neural network,” Comput. Electr. Eng. 54, 68–77 (2016). [CrossRef]

3. P. M. Lee, B. H. Jeon, and S. M. Kim, “Visual servoing for underwater docking of an autonomous underwater vehicle with one camera,” Oceans 1, 677–682 (2003).

4. R. Gibson, R. Atkinson, and J. Gordon, “A review of underwater stereo-image measurement for marine biology and ecology applications,” Oceanogr. Mar. Biol. Annu. Rev. 47, 257–292 (2016).

5. A. Jantzi, W. Jemison, A. Laux, L. Mullen, and B. Cochenour, “Enhanced underwater ranging using an optical vortex,” Opt. Express 26(3), 2668–2674 (2018). [CrossRef] [PubMed]

6. S. Q. Duntley, “Light in the sea,” JOSA 53(2), 214–233 (1963). [CrossRef]

7. M. Twardowski and A. Tonizzo, “Scattering and absorption effects on asymptotic light fields in seawater,” Opt. Express 25(15), 18122–18130 (2017). [CrossRef] [PubMed]

8. K. Jung, P. Youn, S. Choi, J. Lee, H. Kang, and H. Myung, “Development of retro-reflective marker and recognition algorithm for underwater environment,” 4th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), 666–670 (2017). [CrossRef]

9. C. O. Ancuti, C. Ancuti, C. De Vleeschouwer, and P. Bekaert, “Color Balance and Fusion for Underwater Image Enhancement,” IEEE Trans. Image Process. 27(1), 379–393 (2018). [CrossRef] [PubMed]

10. C. Li, C. Xu, C. Gui, and M. D. Fox, “Distance regularized level set evolution and its application to image segmentation,” IEEE Trans. Image Process. 19(12), 3243–3254 (2010). [CrossRef] [PubMed]

11. B. N. Li, C. K. Chui, S. Chang, and S. H. Ong, “Integrating spatial fuzzy clustering with level set methods for automated medical image segmentation,” Comput. Biol. Med. 41(1), 1–10 (2011). [CrossRef] [PubMed]

12. B. N. Li, J. Qin, R. Wang, M. Wang, and X. Li, “Selective level set segmentation using fuzzy region competition,” IEEE Access 4, 4777–4788 (2016). [CrossRef]

13. Y. Guo, H. Song, H. Liu, H. Wei, P. Yang, S. Zhan, H. Wang, H. Huang, N. Liao, Q. Mu, J. Leng, and W. Yang, “Model-based restoration of underwater spectral images captured with narrowband filters,” Opt. Express 24(12), 13101–13120 (2016). [CrossRef] [PubMed]

14. S. C. Yu, T. Ura, T. Fujii, and H. Kondo, “Navigation of autonomous underwater vehicles based on artificial underwater landmarks,” Oceans 1, 409–416 (2001).

15. D. Lee, G. Kim, D. Kim, H. Myung, and H. T. Choi, “Vision-based object detection and tracking for autonomous navigation of underwater robots,” Ocean Eng. 48, 59–68 (2012). [CrossRef]

16. D. Kim, D. Lee, H. Myung, and H.-T. Choi, “Artificial landmark-based underwater localization for AUVs using weighted template matching,” Intell. Serv. Robot. 7(3), 175–184 (2014). [CrossRef]

17. D. R. Edgington, K. A. Salamy, M. Risi, R. E. Sherlock, D. Walther, and C. Koch, “Automated event detection in underwater video,” Oceans 5, 2749–2753 (2003).

18. D. L. Rizzini, F. Kallasi, F. Oleari, and S. Caselli, “Investigation of vision-based underwater object detection with multiple datasets,” Int. J. Adv. Robot. Syst. 12(6), 77 (2015). [CrossRef]

19. Z. Chen, Z. Zhang, F. Dai, Y. Bu, and H. Wang, “Monocular Vision-Based Underwater Object Detection,” Sensors (Basel) 17(8), 1784–1796 (2017). [CrossRef] [PubMed]

20. J. W. Kaeli, H. Singh, C. Murphy, and C. Kunz, “Improving color correction for underwater image surveys,” Oceans 11, 1 (2011). [CrossRef]

21. D. Calloway, “Beer-lambert law,” J. Chem. Educ. 74(7), 744–761 (1997). [CrossRef]

22. N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979). [CrossRef]

23. C. Li, R. Huang, Z. Ding, J. C. Gatenby, D. N. Metaxas, and J. C. Gore, “A level set method for image segmentation in the presence of intensity inhomogeneities with application to MRI,” IEEE Trans. Image Process. 20(7), 2007–2016 (2011). [CrossRef] [PubMed]

24. S. Balla-Arabé, X. Gao, and B. Wang, “GPU accelerated edge-region based level set evolution constrained by 2D gray-scale histogram,” IEEE Trans. Image Process. 22(7), 2688–2698 (2013). [CrossRef] [PubMed]

25. S. Lankton and A. Tannenbaum, “Localizing region-based active contours,” IEEE Trans. Image Process. 17(11), 2029–2039 (2008). [CrossRef] [PubMed]

26. L. Itti and C. Koch, “Computational modelling of visual attention,” Nat. Rev. Neurosci. 2(3), 194–203 (2001). [CrossRef] [PubMed]

27. X. Hou and L. Zhang, “Saliency Detection: A Spectral Residual Approach,” IEEE Conference on Computer Vision and Pattern Recognition 1–8 (2007). [CrossRef]

28. X. Bai, Y. Fang, W. Lin, L. Wang, and B.-F. Ju, “Saliency-Based Defect Detection in Industrial Images by Using Phase Spectrum,” IEEE Trans. Industr. Inform. 10(4), 2135–2145 (2014). [CrossRef]

29. J. Li, M. D. Levine, X. An, X. Xu, and H. He, “Visual Saliency Based on Scale-Space Analysis in the Frequency Domain,” IEEE Trans. Pattern Anal. Mach. Intell. 35(4), 996–1010 (2013). [CrossRef] [PubMed]

30. N. Bruce and J. Tsotsos, “Saliency based on information maximization,” Adv. Neural Inf. Process. Syst. 18, 155–169 (2006).

31. J. Zhang and S. Sclaroff, “Exploiting Surroundedness for Saliency Detection: A Boolean Map Approach,” IEEE Trans. Pattern Anal. Mach. Intell. 38(5), 889–902 (2016). [CrossRef] [PubMed]

32. C. Kanan, M. H. Tong, L. Zhang, and G. W. Cottrell, “SUN: Top-down saliency using natural statistics,” Vis. Cogn. 17(6-7), 979–1003 (2009). [CrossRef] [PubMed]

33. J. Harel, C. Koch, and P. Perona, “Graph-based visual saliency,” NIPS, 545–552 (2006).

Parameter		Significance
$T$		Threshold for identifying the optical collimation region (initial LSF)
$β$		Weight for enhancing the edges of underwater objects
$σ$		Controlling the width of the Gaussian smoothing function
$ω$		Regulator for Dirac function
$u$		Weight for the regularization term
$λ$		Weight for the edge description term
$α$		Weight for the area description term (speedup force)
$τ$		Time step of level set evolution
$Κ$		Maximum iteration of level set evolution

Image/rows	$λ$	$β$	$σ$	$ω$	$u$	T	$α$	$τ$	$Κ$
1st	0.523	2	1.5	1.5	0.065	1.538	[-1,1]	3.0769	1200
2nd	0.810	2	1.5	1.5	0.083	1.205	[-1,1]	2.4096	1200
3rd	0.755	2	1.5	1.5	0.084	1.190	[-1,1]	2.3810	1200
4th	0.550	2	1.5	1.5	0.092	1.087	[-1,1]	2.1739	1200
5th	0.675	2	1.5	1.5	0.077	1.298	[-1,1]	2.5974	1200
6th	0.636	2	1.5	1.5	0.087	1.149	[-1,1]	2.2989	1200
7th	0.798	2	1.5	1.5	0.075	1.333	[-1,1]	2.6667	1200
8th	0.850	2	1.5	1.5	0.086	1.163	[-1,1]	2.3256	1200
9th	0.812	2	1.5	1.5	0.079	1.266	[-1,1]	2.5316	1200

Method			1		2		3		4
Local intensity clustering			0.2152		0.0089		0.0037		0.0215
Edge-region			0.0072		0.0056		0.0096		0.0120
Spatial fuzzy clustering			0.0848		0.0245		0.0699		0.0439
Fuzzy region competition			0.0680		0.0070		0.0114		0.0101
Active contour			0.1717		0.0141		0.0402		0.0761
Our method			0.8156		0.7444		0.5640		0.5630
The bold values denote the best segmentation results.
Table 4. Continued
5	6	7		8		9		10
0.1169	0.0958	0.0707		0.0637		0.0352		0.0296
0.1011	0.0439	0.0392		0.0467		0.0295		0.0237
0.1263	0.1823	0.1794		0.2119		0.1621		0.2285
0.1027	0.0682	0.0421		0.0355		0.0395		0.0309
0.2504	0.1161	0.1269		0.1857		0.1235		0.0385
0.7480	0.8043	0.8767		0.9301		0.8329		0.7922

Parameter		Significance
$T$		Threshold for identifying the optical collimation region (initial LSF)
$β$		Weight for enhancing the edges of underwater objects
$σ$		Controlling the width of the Gaussian smoothing function
$ω$		Regulator for Dirac function
$u$		Weight for the regularization term
$λ$		Weight for the edge description term
$α$		Weight for the area description term (speedup force)
$τ$		Time step of level set evolution
$Κ$		Maximum iteration of level set evolution

Image/rows	$λ$	$β$	$σ$	$ω$	$u$	T	$α$	$τ$	$Κ$
1st	0.523	2	1.5	1.5	0.065	1.538	[-1,1]	3.0769	1200
2nd	0.810	2	1.5	1.5	0.083	1.205	[-1,1]	2.4096	1200
3rd	0.755	2	1.5	1.5	0.084	1.190	[-1,1]	2.3810	1200
4th	0.550	2	1.5	1.5	0.092	1.087	[-1,1]	2.1739	1200
5th	0.675	2	1.5	1.5	0.077	1.298	[-1,1]	2.5974	1200
6th	0.636	2	1.5	1.5	0.087	1.149	[-1,1]	2.2989	1200
7th	0.798	2	1.5	1.5	0.075	1.333	[-1,1]	2.6667	1200
8th	0.850	2	1.5	1.5	0.086	1.163	[-1,1]	2.3256	1200
9th	0.812	2	1.5	1.5	0.079	1.266	[-1,1]	2.5316	1200

Optically guided level set for underwater object segmentation

Abstract

1. Introduction

2. Related works

2.1 Man-made calibrations

2.2 Natural objects

2.3 Background of our method

3. Underwater active optical imaging model and optical collimation

3.1 Underwater light field

3.2 Underwater light field

3.3 Underwater optical collimation

4. Optically guided level set

4.1 Optically guided level set initialization

4.2 Optically guided level set evolution

4.3 Optically guided parameter tuning

5. Experimental results and discussion

5.1 Details of our object segmentation process

5.2 Comparison with DRLSE

5.3 Comparison with other image segmentation methods

5.4 Quantitative comparison

5.5 Comparison with salient region detection methods

6. Conclusion

Funding

References

Cited By

Figures (10)

Tables (4)

Equations (28)

Optics Express