Automatic land-sea classification in a nearshore environment using satellite-based photon-counting LiDAR data

Guoping Zhang; Shuai Xing; Qing Xu; Songtao Guo; Ming Gao; Li Chen; Dandi Wang

doi:10.1364/OE.479449

1. Introduction

A new generation of satellite-based photon-counting light detection and ranging (LiDAR), the Ice, Cloud, and Land Elevation Satellite-2 (ICESat-2), was launched in late September 2018 [1]. The onboard Advanced Topographic Laser Altimeter System (ATLAS) uses a 532 nm laser with a 10 kHz pulse repetition frequency. The laser footprint is 11 m in diameter, and the spacing between footprints is 0.7 m [2], a much denser sampling than that of previous Earth observation missions [3,4]. ICESat-2 has been widely used in forest canopy monitoring and terrain surveying [5–7]. Unexpectedly, its laser can also penetrate water to ∼1 Secchi depth (a measure of water clarity that indicates how deep the laser can penetrate) and can survey underwater topography down to 40 m [8,9]. Accurate classification of ICESat-2 data into land and sea is the prerequisite for producing high-resolution data products over different surfaces worldwide.

Although manual classification is relatively easy, it is still challenging to classify massive ICESat-2 data in the nearshore environment automatically. First, ATLAS is sensitive and can be triggered by solar radiation, atmospheric scattering, and water backscattering [10,11], so noise photons are randomly distributed in ICESat-2 data. Moreover, surfaces with different reflectivity produce different noise rates [12,13], and the background noise rate from the land is higher than that from the sea [14], which makes it difficult to remove noise photons before classification [15]. In addition, the nearshore environment is one of the most variable environments on Earth. Various types of coastal zones and different data collection conditions make it difficult to predict the land-sea boundary [16–18]. Finally, the reflected energy or intensity is absent in ICESat-2 data [2], so the land-sea boundary cannot be detected with intensity-based indices such as the normalized difference water index (NDWI) used for passive optical images [19,20] or the scan line intensity-elevation ratio (SLIER) used for airborne oscillating LiDAR [21,22]. A novel method for land-sea classification of photon-counting LiDAR data is needed.

ICESat-2 data provides land cover labels obtained from MODIS and other classification results with a resolution of 0.05° [23], which can only be used in regional-scale research. In fine-resolution research, land cover products, such as the National Land Cover Database (NLCD) 2016, are often referenced [11,24]. However, the low update frequency of these products cannot keep up with the variability of the nearshore environment. Therefore, background-noise-rate-based methods were proposed. Kwok [25] realized ice-water classification by setting empirical thresholds of background noise rate. Zhang et al. [14] introduced the specular reflection theory into the classification, improved the solar background noise model of land and sea, and, based on this, realized the classification on the east coast of North Carolina. However, this method relied on the National Centers for Environmental Prediction (NCEP) wind speed data [26] and did not utilize other spatial features of ICESat-2 data. Xie et al. [27] trained the regression model with spatial features and classified sediments in the Alaska coastal zone. The ICESat-2 science team [28,29] also used the random forest to identify sea ice whilst creating advanced products (ATL07 and ATL10). The machine-learning-based methods show potential in land cover classification, especially the random forest, which has good adaptability and performance. However, the manual collection of training data restricts their further application.

Since the ICESat-2 data lacks accurate land cover labels, the proposed land-sea classification methods depend on auxiliary information or manually collected samples. This study aims to realize the automatic land-sea classification in nearshore environments and avoid complicated theoretical derivation and heavy labor. To achieve this goal, we need to 1) design an index called normalized photon rate-elevation ratio (NPRER) for measuring the possibility of sea surface appearance; 2) propose an automatic land-sea classification method for nearshore environments so ICESat-2 data can be classified without manual intervention.

2. Materials

2.1 Study site

As shown in Fig. 1, the study site is at east Cook Inlet (61.06°N, 150.00°W), an inlet extending from the Gulf of Alaska to south-central Alaska. Cook Inlet separates the Kenai Peninsula from mainland Alaska, and there are various geomorphologies on both banks of Cook Inlet, which are suitable to verify the performance of our classification method. Additionally, Cook Inlet is rich in natural gas resources and is an essential habitat for endangered beluga whales [30]. Accurate mapping of Cook Inlet benefits the harmonious coexistence of man and nature.

Fig. 1. Study site in the east Cook Inlet, Alaska, with the trajectories of ICESat-2 data.


2.2 ICESat-2 ATL03 data

ATLAS is equipped with three pairs of laser beams, and each pair includes a strong beam and a weak beam, labeled gt1r, gt1l, gt2r, gt2l, and gt3r, gt3l, respectively. The laser energy ratio between the strong beam and weak beam is close to 4:1 [2]. ICESat-2 ATL03 (Global Geolocated Photons) data corresponds to the ATLAS beam label, and the data bin is divided every 20 m along the track direction. Each photon's time, longitude, latitude, and altitude are recorded and stored in HDF format [23].

Six pieces of ICESat-2 ATL03 data obtained in 2021 and 2022 were collected, and the strong beams were used in the study. As shown by the trajectories in Fig. 1, these data have been cut into segments of about 4 km. The data collection details are shown in Table 1. Due to the lack of fine-resolution land cover data at the same time, the referenced land-sea boundary of ICESat-2 data is obtained by visual interpretation.

Table 1. ICESat-2 ATL03 Data Collection Details in Study Site


Table 2. Seven Photon Features Used in Reclassification


3. Normalized photon rate-elevation ratio (NPRER)

The difference in the surface's laser reflection, transmission, absorption ability, topography, and roughness results in the difference in reflectivity. That would lead to the difference of photon number in ICESat-2 data bins within a certain distance under a different surface; that is, the land cover would cause the change of ICESat-2 photon rate.

Figure 2 illustrates a typical nearshore environment. Due to the strong penetration ability of the 532 nm laser in water, the absorption and scattering of laser energy by the water, the strong directivity of the sea surface reflection, and the influence of waves, the reflectivity of the sea is always lower than that of land surfaces. At a laser wavelength of 532 nm, the reflectivity of water is lower than 5%. In contrast, at the same wavelength, the reflectivity of vegetation is ∼10%, and the reflectivity of rock is higher than 20%. This difference in reflectivity makes the signal photon rate of the sea consistently lower than that of the land. Moreover, assuming that the sea surface is a specular reflector, the number of noise photons from the land surface is always dozens of times that from the sea surface, so the noise photon rate on the sea surface is also lower than that on the land surface [14].

Fig. 2. Typical nearshore environment during ICESat-2 surveying. The photon rate and elevation of the sea are lower than those of the land.


As shown in formula (1), the photons recorded in ICESat-2 data are composed of signal photons, background noise photons, and noise photons caused by the dark current in the instrument, which has no apparent difference in photon rate under different land covers. Thus, the photon rate of the sea surface is always lower than that of the land.

(1)$$N{P_t} = N{P_{sf}} + \left( {N{P_a} + N{P_{sb}} + N{P_d}} \right)$$

where $N{P_t}$ is the total photon number, $N{P_{sf}}$ is the number of signal photons, and $N{P_a}$, $N{P_{sb}}$ and $N{P_d}$ are the numbers of photons introduced by the atmosphere, the solar background, and the dark current of the instrument, respectively.

Therefore, a normalized photon rate ratio (NPRR) is proposed to measure the difference in reflectivity between land and sea as follows:

(2)$$\textrm{N}PR{R_i} = \frac{{P{R_{max}} - P{R_i}}}{{P{R_{max}} - P{R_{\textrm{min}}}}}$$

where $P{R_{max}}$, $P{R_{min}}$ and $P{R_i}$ refer to the largest photon rate, the smallest photon rate and photon rate of the ${i^{th}}$ data bin, respectively.
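As a minimal sketch of Eq. (2), the snippet below counts photons in 20 m along-track bins and normalizes the counts into NPRR. The helper names (`photon_counts_per_bin`, `nprr`) and the choice to use the raw photon count per bin as the photon rate are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def photon_counts_per_bin(along_track_m, bin_length_m=20.0):
    """Count photons in consecutive along-track bins (hypothetical helper).

    The count per bin is used here as a stand-in for the photon rate PR_i.
    """
    d = np.asarray(along_track_m, dtype=float)
    idx = np.floor((d - d.min()) / bin_length_m).astype(int)
    return np.bincount(idx)

def nprr(photon_rate):
    """Eq. (2): (PR_max - PR_i) / (PR_max - PR_min) for every bin."""
    pr = np.asarray(photon_rate, dtype=float)
    span = pr.max() - pr.min()
    if span == 0:
        raise ValueError("photon rate is constant; Eq. (2) is undefined")
    return (pr.max() - pr) / span

# Toy along-track distances in metres (made-up values, not ICESat-2 data).
distances = [0.0, 5.0, 19.9, 20.0, 45.0, 46.0, 47.0, 59.0]
counts = photon_counts_per_bin(distances)
ratios = nprr(counts)
```

Bins with few photons receive values near 1 and bins with many photons receive values near 0, so a sea-like bin scores high.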

However, ATLAS works in photon-counting mode, and its photomultiplier follows a Poisson distribution when recording photon counts [23], which may cause statistical fluctuations in photon rate. Using NPRR alone is therefore not enough, and the difference in topography is also considered. In most nearshore environments, as shown in Fig. 2, the elevation of the sea surface is lower than that of the land. To evaluate the elevation difference, it is not feasible to directly calculate the mean elevation of each data bin because of the influence of noise photons. Instead, the data are divided by collection time or along-track distance into several data bins, and a single-peak Gaussian function is used to fit the photon elevation distribution of each bin [31]. The signal photons are denser than the noise photons, so the mean value of the fitting result represents the land or sea surface elevation. Thus, the improved normalized photon rate-elevation ratio (NPRER) is proposed to make the difference between land and sea more obvious.

(3)$$\textrm{NPRER}_i = \frac{PR_{max} - PR_i}{PR_{max} - PR_{min}} \cdot \frac{E_{max} - E_i}{E_{max} - E_{min}}$$

where ${E_{max}}$, ${E_{min}}$ and ${E_i}$ refer to the largest mean value, the smallest mean value and mean value of the fitting result of the ${i^{th}}$ data bin, respectively.
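The elevation fit and Eq. (3) can be sketched as follows. To stay NumPy-only, the Gaussian is fitted with a weighted log-parabola least-squares fit (a Caruana-style linearization), which is swapped in here for whatever fitting routine the authors used; the function names and toy values are likewise assumptions.

```python
import numpy as np

def fit_single_gaussian(centers, counts):
    """Fit amp * exp(-(z - mu)^2 / (2 sigma^2)) to an elevation histogram.

    Uses a parabola fit to log(counts), weighted by counts. This is an
    illustrative substitute, not necessarily the fitting method of [31].
    """
    z = np.asarray(centers, dtype=float)
    c = np.asarray(counts, dtype=float)
    keep = c > 0                      # log() needs strictly positive counts
    if keep.sum() < 3:
        raise ValueError("need at least three non-empty elevation cells")
    a, b, c0 = np.polyfit(z[keep], np.log(c[keep]), 2, w=c[keep])
    if a >= 0:
        raise ValueError("histogram has no single peak")
    mu = -b / (2.0 * a)
    sigma = np.sqrt(-1.0 / (2.0 * a))
    amp = np.exp(c0 - b * b / (4.0 * a))
    return amp, mu, sigma

def nprer(photon_rate, elevation):
    """Eq. (3): normalized photon-rate term times normalized elevation term."""
    pr = np.asarray(photon_rate, dtype=float)
    e = np.asarray(elevation, dtype=float)
    rate_term = (pr.max() - pr) / (pr.max() - pr.min())
    elev_term = (e.max() - e) / (e.max() - e.min())
    return rate_term * elev_term

# Noise-free toy histogram: the fit should recover the generating parameters.
z = np.arange(0.0, 9.0)
hist = 50.0 * np.exp(-(z - 4.2) ** 2 / (2.0 * 1.5 ** 2))
amp, mu, sigma = fit_single_gaussian(z, hist)

# Toy per-bin photon rates and fitted mean elevations (two sea-like, two land-like).
index = nprer([2.0, 2.0, 10.0, 8.0], [0.0, 0.1, 5.0, 3.0])
```

In this toy case the two low-rate, low-elevation bins score close to 1, while the bin with the highest rate and elevation scores 0.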

4. Automatic land-sea classification method

Although the machine-learning-based method has been applied to the high-resolution land-sea classification of photon-counting LiDAR data, the need to collect samples manually limits this method in the variable nearshore environment. According to the difference in reflectivity and elevation in the nearshore environment, NPRER can automatically identify sea bins in the data. However, the reflection of water has strong directivity. When waves tilt the sea surface so that it directly faces the ATLAS photon detector, the photon rate $P{R_i}$ is exaggerated. The NPRER therefore describes the statistical characteristics of land and sea and measures the possibility of sea detection, but it is not the only basis for land-sea classification.

Therefore, a novel automatic land-sea classification method is proposed, as shown in Fig. 3. Firstly, the NPRER of each data bin is calculated and adjusted to a form favorable for classification. The potential sea bins are extracted according to the classification threshold automatically obtained by the Otsu method [32]. The upper boundary of the sea surface is estimated so that the preliminary land-sea classification can be carried out. Then, the preliminary classification results are used as training labels corresponding to different photon features, the random forests classifier is trained, and data are reclassified. In post-processing enhancement, the salt-and-pepper effects are checked, and the misclassified results are corrected.

Fig. 3. The workflow of automatic land-sea classification method.


4.1 Preliminary classification

The preliminary classification aims to obtain classified labels automatically. In the NPRER calculation, ICESat-2 data is divided into data bins according to the time or along-track distance, and its NPRER is calculated. To keep consistent with the length of data bin in ICESat-2 ATL03 data, our study divides the data every 20 m in the along-track direction. Although NPRER makes the difference between land and sea apparent, its value range is [0, 1], so getting an accurate classification threshold is not easy. A feasible method is to convert NPRER into the form shown in the following formula, to not only enlarge the difference between land and sea but also keep the index at the same magnitude:

(4)$$CI_i = \textrm{lg}\left( \frac{1}{\textrm{NPRER}_i} \right)$$

where $C{I_i}$ is the classification index (CI) of the ${i^{th}}$ data bin and $\textrm{lg}$ refers to the logarithm with base 10.
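A short sketch of Eq. (4) follows. Because NPRER equals 0 for the bin with the largest photon rate or the largest elevation, the logarithm would diverge there; the clipping floor `eps` below is an assumption added for numerical safety and is not described in the paper.

```python
import numpy as np

def classification_index(nprer, eps=1e-6):
    """Eq. (4): CI_i = lg(1 / NPRER_i), with NPRER clipped to [eps, 1].

    `eps` is a hypothetical guard against division by zero.
    """
    v = np.clip(np.asarray(nprer, dtype=float), eps, 1.0)
    return np.log10(1.0 / v)

ci = classification_index([1.0, 0.1, 0.01, 0.0])
```

Sea-like bins (NPRER near 1) map to CI near 0, and each tenfold drop in NPRER raises CI by 1, which spreads out the small land values that are crowded together near 0 on the original scale.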

The Otsu method automatically obtains the threshold that divides the CI values into land and sea parts [32]. It takes the interclass variance as the criterion and selects the CI at which this variance is largest as the classification threshold. Specifically, let the data bins be sorted by CI in ascending order, let n be the number of all data bins, and let t be the number of potential sea bins (the t bins with the smallest CI). The interclass variance is calculated as follows:

(5)$${\sigma ^2} = {\omega _s}(\textrm{t} ){({{{\mu}_s}(\textrm{t} )- {\mu} (\textrm{t} )} )^2} + {\omega _l}(\textrm{t} ){({{{\mu}_l}(\textrm{t} )- {\mu} (\textrm{t} )} )^2}$$

where ${\sigma ^2}$ is the interclass variance, ${\omega _s}(t )$ is the proportion of potential sea bins, ${\omega _l}(t )$ is the proportion of potential land bins, ${\mu _s}(t )$ represents the average CI of potential sea bins, ${\mu _l}(t )$ represents the average CI of potential land bins, and $\mu (t )$ represents the average CI of all data bins, as follows:

(6)$${\omega _s}(t )= \frac{t}{n}$$

(7)$${\omega _l}(t )= 1 - \frac{t}{n}$$

(8)$${\mu _s}(t )= \frac{{\mathop \sum \nolimits_{i = 1}^t C{I_i}}}{t}$$

(9)$${\mu _l}(t )= \frac{{\mathop \sum \nolimits_{i = t + 1}^n C{I_i}}}{{n - t}}$$

(10)$$\mu (t )= {\omega _s}(t ){\mu _s}(t )+ {\omega _l}(t ){\mu _l}(t )$$

If t changes, ${\mu _s}(t )$ and ${\mu _l}(t )$ are recalculated, and ${\sigma ^2}$ is refreshed. When ${\sigma ^2}$ becomes the largest, the CI at this time is regarded as the optimal threshold, and the data bins with CI smaller than this threshold are labeled as potential sea bins.
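Equations (5)-(10) can be evaluated for every candidate t at once with cumulative sums, as in the sketch below. Placing the returned threshold midway between the t-th and (t+1)-th sorted CI values is an assumption, chosen so that "CI smaller than the threshold" selects exactly the t potential sea bins; the function name is hypothetical.

```python
import numpy as np

def otsu_threshold(ci):
    """Return (threshold, t) maximizing the interclass variance of Eq. (5)."""
    s = np.sort(np.asarray(ci, dtype=float))    # ascending: sea-like bins first
    n = s.size
    if n < 2:
        raise ValueError("need at least two data bins")
    t = np.arange(1, n)                         # candidate numbers of sea bins
    csum = np.cumsum(s)
    w_s = t / n                                 # Eq. (6)
    w_l = 1.0 - w_s                             # Eq. (7)
    mu_s = csum[:-1] / t                        # Eq. (8)
    mu_l = (csum[-1] - csum[:-1]) / (n - t)     # Eq. (9)
    mu = w_s * mu_s + w_l * mu_l                # Eq. (10)
    sigma2 = w_s * (mu_s - mu) ** 2 + w_l * (mu_l - mu) ** 2   # Eq. (5)
    t_best = int(np.argmax(sigma2)) + 1
    threshold = 0.5 * (s[t_best - 1] + s[t_best])   # assumed midpoint rule
    return threshold, t_best

# Toy CI values: three sea-like bins near 0 and three land-like bins near 2.
threshold, t_best = otsu_threshold([0.1, 0.2, 0.15, 2.0, 2.2, 1.9])
```

Note that $\mu(t)$ in Eq. (10) is the mean of all CI values and therefore does not actually depend on t; only the class means and proportions change as t varies.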

Note that this classified label is only used to help understand the situation of the sea surface in the nearshore environment and to estimate the upper boundary of the sea surface by utilizing the continuity of the sea according to the following equation:

(11)$${E_{\textrm{UP}}} = \textrm{max}({{E_i}} )$$

where ${E_{UP}}$ represents the elevation of the upper boundary of the sea surface, ${E_i}$ refers to the mean value of gaussian fitting of the ${i^{th}}$ data bin [31].

The estimated upper boundary of the sea surface is used to classify the data preliminarily. Data bins with ${E_i} \le {E_{UP}}$ are labeled as sea, and the remaining data bins are labeled as land. After the above steps, the preliminary classification results are used as training labels in the reclassification.
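The preliminary labeling step can be sketched as below, assuming the maximum in Eq. (11) is taken over the potential sea bins selected by the Otsu threshold. The sketch does not model the sea-continuity consideration mentioned above, and the function name and toy values are illustrative only.

```python
import numpy as np

def preliminary_labels(ci, elevation, threshold):
    """Label bins as sea (True) or land (False) via the sea upper boundary.

    Assumes E_UP is the largest fitted elevation among potential sea bins,
    i.e. bins whose CI is below the Otsu threshold.
    """
    ci = np.asarray(ci, dtype=float)
    e = np.asarray(elevation, dtype=float)
    potential_sea = ci < threshold
    if not potential_sea.any():
        raise ValueError("no potential sea bins below the threshold")
    e_up = e[potential_sea].max()        # Eq. (11)
    return e <= e_up, e_up

# Toy example: the last bin has a land-like CI but a sea-level elevation.
is_sea, e_up = preliminary_labels(
    ci=[0.1, 0.2, 0.15, 2.0, 2.2, 1.9],
    elevation=[0.1, 0.2, 0.15, 5.0, 6.0, 0.18],
    threshold=1.05,
)
```

The last bin shows why the elevation test is applied after the CI test: a sea bin whose photon rate was inflated (for example by specular reflection) is still recovered because its elevation lies below the estimated boundary.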

4.2 Reclassification

After the data is divided into data bins, the difference in land cover can be calculated and described. For each data bin, seven photon features are used for reclassification (Table 2).

Among them, the height features are used to measure the height difference between land and sea. In contrast, the morphology features describe the difference in photon rate by calculating the morphological characteristics of gaussian fitting results. The calculation formulas of the morphological features are as follows:

(12)$$Amp = \textrm{max}(pr_i)$$

(13)$${S_k} = \frac{1}{{{e_r} - {e_l}}}\mathop \sum \nolimits_{{e_l}}^{{e_r}} {\left( {\frac{{\textrm{pr}{_i} - {\mathrm{\mu}_{\textrm{pr}}}}}{{{\sigma_{\textrm{pr}}}}}} \right)^3}$$

(14)$${\textrm K_u} = \frac{1}{{{e_r} - {e_l}}}\mathop \sum \nolimits_{{e_l}}^{{e_r}} {\left( {\frac{{\textrm{pr}{_i} - {\mathrm{\mu}_{\textrm{pr}}}}}{{{\sigma_{\textrm{pr}}}}}} \right)^4}$$

where $p{r_i}$ represents the fitting result corresponding to the ${i^{th}}$ height segment. For convenience of calculation, the fitting result is divided every 1 m in the elevation direction. ${\mu _{pr}}$ and ${\sigma _{pr}}$ correspond to the mean and standard deviation of the Gaussian fitting result of the current data bin, respectively. ${e_r}$ and ${e_l}$ are the right and left boundaries of the fitted peak, which are usually set to the elevations where the Gaussian fitting result is equal to 10% of $Amp$.
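One literal reading of Eqs. (12)-(14) is sketched below. It assumes that `pr` holds the fitted values sampled at each height segment between $e_l$ and $e_r$, and that $\mu_{pr}$ and $\sigma_{pr}$ are the mean and standard deviation of those samples; the paper does not spell out these details, so the function should be treated as an interpretation rather than the authors' implementation.

```python
import numpy as np

def morphology_features(pr, e_l, e_r):
    """Return (Amp, S_k, K_u) from sampled fit values pr_i on [e_l, e_r].

    Assumption: mu_pr and sigma_pr are the mean and (population) standard
    deviation of the samples themselves.
    """
    pr = np.asarray(pr, dtype=float)
    width = e_r - e_l
    if width <= 0:
        raise ValueError("e_r must be larger than e_l")
    sd = pr.std()
    if sd == 0:
        raise ValueError("constant samples; skewness and kurtosis undefined")
    z = (pr - pr.mean()) / sd
    amp = pr.max()                        # Eq. (12)
    skew = np.sum(z ** 3) / width         # Eq. (13)
    kurt = np.sum(z ** 4) / width         # Eq. (14)
    return amp, skew, kurt

# Three samples spaced 1 m apart, so e_r - e_l = 2 m (toy values).
amp, skew, kurt = morphology_features([1.0, 2.0, 3.0], e_l=0.0, e_r=2.0)
```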

Random forest is a supervised classifier composed of several decision trees. Each decision tree only extracts a part of the training data to form a sample set and is verified by the remaining data. This classifier refers to multiple results of the decision tree. It uses a voting mechanism to generate the final output, which has been proved to have good generalization ability and prediction ability in the land cover classification of photon-counting LiDAR data. In this study, the number of decision trees is set to 500, and 4 random variables are set in each decision tree.

The classified labels in preliminary classification are used to train the random forest classifier. After training, the random forest is used to predict the categories of each data bin. Therefore, this method is a labor-saving choice to realize the automatic land-sea classification in the nearshore environment without inputting parameters or manual intervention.

4.3 Post-processing enhancement

To make the sea-land classification method have a reliable theoretical basis, a simple post-processing enhancement is designed to remove the salt-and-pepper effects from the reclassification results [22]. This step can keep the classification results consistent with the visual perception, although it will not significantly improve the accuracy.

The enhancement method checks the four bins adjacent to each data bin (two on the right and two on the left). If the adjacent bins all share a label that differs from the label of the current data bin, the difference is attributed to the salt-and-pepper effect, and the label of the current data bin is changed. For example, if all the adjacent bins are labeled as sea and the current bin is labeled as land, the label of the current data bin is changed to sea. After each data bin is checked, the post-processing enhancement is completed.
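The neighbor check described above can be sketched as follows. Two details are assumptions not stated in the text: the first and last two bins are left unchanged because they lack a full set of neighbors, and every check reads the labels from before the enhancement rather than labels already corrected in the same pass.

```python
import numpy as np

def remove_salt_and_pepper(labels):
    """Flip a bin's label when its four neighbors all carry the other label."""
    src = np.asarray(labels)
    out = src.copy()
    for i in range(2, src.size - 2):
        neighbors = src[[i - 2, i - 1, i + 1, i + 2]]
        unanimous = np.all(neighbors == neighbors[0])
        if unanimous and neighbors[0] != src[i]:
            out[i] = neighbors[0]
    return out

# 1 = land, 0 = sea; the isolated 0 at index 2 is surrounded by land bins.
cleaned = remove_salt_and_pepper([1, 1, 0, 1, 1, 0, 0, 0, 0])
```

Only the isolated bin at index 2 is changed; the genuine land-to-sea transition further along the track is left intact because its neighbors are not unanimous.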

5. Results

5.1 Verification of NPRER and preliminary classification

NPRER is designed to measure the distribution of the sea surface appearance. During classification, NPRER is converted into CI, as shown in the formula (4). Figure 4 shows the calculation results from various coastal types. The left side of Fig. 4 shows the CI values of all the photons in various coastal types. In nearshore environments, the photon rate of the sea surface is lower, and the elevation of the sea surface is lower, so the CI values are smaller than the land surface. Therefore, the CI results of the sea are bluer, and the CI results of the land are greener. In the muddy and rocky coast (see Fig. 4 (c) and (e)), the CI results of the sea are the bluest part. Some abnormal results can be observed on the manmade coast (the positions marked by the red arrow in Fig. 4 (a)), where the CI values are larger than the surrounding sea. This is because of the sea surface's specular reflection and wave angle, which makes the photon rates in these positions much higher than in other positions. Generally, the NPRER is an effective index to describe the difference between land and sea, while the CI can classify land and sea in manmade, muddy, or rocky coasts.

Fig. 4. The CI values of all the photons in various coastal types and their corresponding histograms. (a) The CI values of photons in manmade coast; (b) the corresponding CI histogram of manmade coast; (c) the CI values of photons in muddy coast; (d) the corresponding CI histogram of muddy coast; (e) the CI values of photons in rocky coast; (f) the corresponding CI histogram of rocky coast. The data in manmade coast was acquired on the night of January 26th, 2021, and its track number is gt1r. The data in muddy coast was acquired on the night of August 30th, 2021, and its track number is gt2r. The data in rocky coast was acquired on the night of April 28th, 2021, and its track number is gt3r.


On the right side of Fig. 4 are the corresponding CI histograms, in which the gray line is the threshold obtained by the Otsu method, the blue part represents the potential sea, and the green part is the potential land. No matter how the frequency of CI is affected by different coastal types and data collection time, the classification thresholds are always between the peak of the sea and the peak of land, which illustrates the superior performance of the Otsu method.

After calculating the classification threshold by the Otsu method and extracting potential sea bins, the upper boundary of the sea surface can be estimated. The data bins with ${E_i}$ lower than the boundary can be preliminarily classified as sea, and the rest can be labeled as land. Figures 5 (a), 6 (a), and 7 (a) show the preliminary classification results of manmade coast, muddy coast, and rocky coast, respectively, in which the blue dots represent the sea, and the green dots represent the land. The accuracy of each result is shown on the upper left. Figures 5 (b), 6 (b), and 7 (b) show the trajectories on the high-resolution images, and the land-sea boundaries are marked with red lines. The accuracies of the preliminary classification results are above 90%, regardless of the coastal type. In particular, the elevation difference between land and sea on the muddy coast shown in Fig. 6 (a) is not apparent, which makes it difficult to classify and results in the errors marked by the red arrow. However, most of the sea photons are accurately extracted.

Fig. 5. The land-sea classification results in manmade coast. (a) The result after preliminary classification; (b) the trajectory of preliminary classification result; (c) the result after reclassification; (d) the trajectory of reclassification result; (e) the result after post-processing enhancement; (f) the trajectory of post-processing enhancement result. This data was acquired on the night of January 26th, 2021, and its track number is gt1r.


Fig. 6. The land-sea classification results in muddy coast. (a) The result after preliminary classification; (b) the trajectory of preliminary classification result; (c) the result after reclassification; (d) the trajectory of reclassification result; (e) the result after post-processing enhancement; (f) the trajectory of post-processing enhancement result. This data was acquired on the night of August 30th, 2021, and its track number is gt2r.


Fig. 7. The land-sea classification results in rocky coast. (a) The result after preliminary classification; (b) the trajectory of preliminary classification result; (c) the result after reclassification; (d) the trajectory of reclassification result; (e) the result after post-processing enhancement; (f) the trajectory of post-processing enhancement result. This data was acquired on the night of April 28th, 2021, and its track number is gt3r.


Table 3 further shows the classification accuracy of the different coastal types after every step. The overall accuracy after preliminary classification is over 91%. Comparing the preliminary classification accuracy in different coastal types, the accuracy of the rocky coast is the highest, reaching nearly 97%, while that of manmade or muddy coasts is slightly lower. Therefore, the preliminary classification is effective.

Table 3. Classification accuracy of different coastal types


5.2 Verification of reclassification

Although the preliminary classification has achieved good results, the errors marked by the red arrows in Fig. 6 (a) are still common, and the land-sea boundary needs further clarification. Therefore, we verified the performance of reclassification, and the results are shown in Fig. 5 (c), 6 (c), and 7 (c), while their trajectories are shown in Fig. 5 (d), 6 (d) and 7 (d).

We found that the reclassification made few adjustments to the results because of the high accuracy of the preliminary classification. In particular, for the data on the rocky coast (see Fig. 7(c)), reclassification did not change the preliminary classification results, and the accuracy before and after reclassification is the same. Considering that the accuracy is 98.76%, this is acceptable. For the results of manmade or muddy coast, the positions adjusted by reclassification are marked with red arrows in Fig. 5 (c) and 6 (c), and here the reclassification does improve the accuracy. As shown in Fig. 6 (a) and (c), some land locations close to the sea were mistakenly marked as sea during the preliminary classification, but these errors were corrected after reclassification. With the random forest classifier, the land-sea boundary was carefully adjusted during reclassification, as shown by the red arrows in Fig. 5 (c). However, the random forest classifier does not consider the continuity of land cover, so salt-and-pepper effects, as shown in Fig. 6(c), appear in the reclassification results.

It can be seen from Table 3 that although the adjustments made by the reclassification are not visually apparent, they do further improve the classification accuracy. Compared with the preliminary classification results, the relative accuracy of the reclassification is improved by 7.41%, which we consider an accurate land-sea classification. Because the data on the rocky coast had already achieved high accuracy (96.98%) after the preliminary classification, the reclassification did not significantly improve this coastal type's accuracy. Due to the adjustment of the land-sea boundary and the correction of errors, the accuracy of the reclassified results in manmade coast and muddy coast reached 97.68% and 96.88%, respectively, which is close to the accuracy of the rocky coast.

5.3 Effects of post-processing enhancement

The post-processing enhancement aims to remove the salt-and-pepper effects in the results so that the results are consistent with a visual inspection. In most cases, such as the results in Fig. 5 and 7, the effect of this enhancement is difficult to detect, and it can only be found when there are apparent salt-and-pepper effects, as shown in Fig. 6. Table 3 also verifies this conclusion. Compared with the reclassification results, the accuracy after post-processing enhancement is not significantly improved: the overall accuracy only increased by 0.11%. However, the post-processing enhancement step is still necessary to eliminate the salt-and-pepper effect and make the results consistent with those observed by eye.

After preliminary classification, reclassification, and post-processing enhancement, the overall accuracy of land-sea classification has been improved from 90.62% to 97.98%. Therefore, this study's automatic land-sea classification method is effective and can obtain accurate classification results.

6. Discussion

6.1 Effects of coastal type, data collection time and feature sets

Although the proposed automatic land-sea classification method can reach an overall accuracy of nearly 98%, it is still necessary to discuss the effects of different coastal types, data collection time, and classification feature sets on the classification results to know the most suitable processing scenario of the method.

For the effect of coastal types, Table 3 lists the classification accuracy of each coastal type after each step, while Table 4 shows the influence of different feature sets in each coastal type. After preliminary classification, the rocky coast already achieves high accuracy (96.98%), and the results in the rocky coast have little relationship with feature sets. Among all the types, the classification accuracy of muddy coast is always the lowest, which poses the greatest challenge to the classification method. One possible explanation is that the elevation difference partly influences the classification accuracy among manmade, muddy, and rocky coasts. On the rocky coast, the land part is often a hillside, and the elevation difference between land and sea is noticeable, which makes the differences in NPRER and CI between land and sea more significant. During preliminary classification, the estimated upper boundary of the sea surface on the rocky coast can accurately separate the sea from the land. As for the muddy coast, due to the fuzzy border between land and sea, the estimated upper boundary of the sea surface cannot always distinguish the sea from the land. Although the muddy coast is challenging, its classification accuracy is over 97% after all steps, which is only slightly inferior to that of the rocky coast.

Table 4. Overall accuracy using different feature sets in each coastal type


The signal-to-noise ratio (SNR) of data collected at different times is variable. The daytime data is affected by solar noise and contains several times more noise photons than nighttime data. Because the object of land-sea classification is the original ICESat-2 data, the collection time may also affect the classification accuracy. Table 5 shows the accuracy of land-sea classification results at different data collection times. The accuracy when dealing with nighttime data is 98.43%, only slightly higher than that of daytime data. Therefore, using the single-peak Gaussian function to fit the photon elevation distribution [23,28] is little affected by the data collection time. It can accurately estimate the surface elevation and calculate NPRER under different SNRs. In addition, regardless of the collection time, the proposed land-sea classification method achieves an accuracy higher than 95% and performs well.

Table 5. Overall accuracy using different feature sets at different data collection times


The verification of classification results shows that the feature sets can improve the accuracy of land-sea classification. Further analysis of the effect of different feature sets, as shown in Table 4 and 5, indicates that using height features and morphology features at the same time is the best choice. Both height features and morphology features can improve accuracy. Compared with height features, morphology features often contribute several percent more to the improvement of accuracy and are therefore more conducive to classification. Compared with the height percentiles, amplitude, skewness, and kurtosis are less influenced by the coastal type and data collection time, so they describe the data bin more appropriately.

6.2 Innovations, applications, and limitations

To meet the needs of high-resolution land-sea classification of ICESat-2 data, the differences between land and sea in photon rate and elevation have been summarized for the first time by an index called NPRER, and inspired by this, an automatic land-sea classification algorithm is proposed. The results show that NPRER can measure the distribution of sea appearance in the nearshore environment. The proposed land-sea classification algorithm can accurately classify ICESat-2 data with a resolution of 20 m without human participation. The results of each step in the classification method are analyzed in detail, and the effects of different coastal types, data collection time, and feature sets on the classification accuracy are quantitatively evaluated.

It has been more than four years since ICESat-2 entered orbit. This technology can not only be used to obtain accurate land and sea data but also help to understand the changes and laws of land-sea boundaries on a global scale. The classification method proposed in this paper is expected to improve the automation degree of land-sea classification of satellite-based data and help researchers get rid of tedious work.

However, this study has limitations. Although there is always an elevation difference between the land and the sea, NPRER assumes that the sea is lower than the land, which does not always hold. Where the land lies below sea level, as in Amsterdam, Netherlands, one of the coasts most heavily shaped by human activity, the classification results become unreliable. Fortunately, with prior knowledge, NPRER is easy to adjust. Based on the results obtained in this study, we believe that the automatic land-sea classification method is effective in most areas.

7. Conclusion

The nearshore environment is one of the most frequently changing areas on Earth. Classifying ICESat-2 data into land and sea is a prerequisite for producing high-precision data products. In this paper, NPRER is designed to measure the possibility of sea appearance according to the differences in photon rate and elevation in ICESat-2 data. Building on this index, an automatic land-sea classification method consisting of preliminary classification, reclassification, and post-processing enhancement is proposed. To comprehensively evaluate the performance of the method, the accuracy of each step was verified, and the effects of coastal type, data collection time, and classification feature sets on the accuracy were quantitatively evaluated. The results show that the overall accuracy of the proposed method exceeds 97% and that environmental factors have only a minimal effect on the results, which helps increase the degree of automation in land-sea classification of satellite-based data. In the future, we will conduct experiments in challenging scenarios, including areas below sea level, to provide a more adaptive land-sea classification scheme.

Funding

National Natural Science Foundation of China (41371436, 41876105).

Acknowledgments

We thank the National Aeronautics and Space Administration (NASA) for providing ICESat-2 data used in the article.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. V. A. Casasanto, B. Campbell, A. Manrique, K. Ramsayer, T. Markus, and T. Neumann, “Lasers, penguins, and polar bears: Novel outreach and education approaches for NASA's ICESat-2 mission,” Acta Astronaut. 148, 396–402 (2018). [CrossRef]

2. T. Markus, T. Neumann, A. Martino, W. Abdalati, K. Brunt, B. Csatho, S. Farrell, H. Fricker, A. Gardner, D. Harding, M. Jasinski, R. Kwok, L. Magruder, D. Lubin, S. Luthcke, J. Morison, R. Nelson, A. Neuenschwander, S. Palm, S. Popescu, C. K. Shum, B. E. Schutz, B. Smith, Y. Yang, and J. Zwally, “The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2): Science requirements, concept, and implementation,” Remote Sens. Environ. 190, 260–273 (2017). [CrossRef]

3. X. Wang, X. Cheng, P. Gong, H. Huang, Z. Li, and X. Li, “Earth science applications of ICESat/GLAS,” Int. J. Remote Sens. 32(23), 8837–8864 (2011). [CrossRef]

4. T. E. Fatoyinbo and M. Simard, “Height and biomass of mangroves in Africa from ICESat/GLAS and SRTM,” Int. J. Remote Sens. 34(2), 668–681 (2013). [CrossRef]

5. A. Neuenschwander and K. Pitts, “The ATL08 land and vegetation product for the ICESat-2 Mission,” Remote Sens. Environ. 221, 247–259 (2019). [CrossRef]

6. S. C. Popescu, T. Zhou, R. Nelson, A. Neuenschwander, R. Sheridan, L. Narine, and K. M. Walsh, “Photon counting LiDAR: An adaptive ground and canopy height retrieval algorithm for ICESat-2 data,” Remote Sens. Environ. 208, 154–170 (2018). [CrossRef]

7. G. Zhang, S. Xing, Q. Xu, P. Li, D. Wang, X. Zhang, and K. Chen, “Ground Photon Extraction From Photon-Counting LiDAR Data Using Adaptive Cloth Simulation With Terrain Index,” IEEE Geosci. Remote Sensing Lett. 19, 1–5 (2022). [CrossRef]

8. P. Voosen, “Ice monitor delivers a bonus: seafloor maps,” Science 368(6488), 224 (2020). [CrossRef]

9. C. E. Parrish, L. A. Magruder, A. L. Neuenschwander, N. Forfinski-Sarkozi, M. Alonzo, and M. Jasinski, “Validation of ICESat-2 ATLAS Bathymetry and Analysis of ATLAS's Bathymetric Mapping Performance,” Remote Sens. 11(14), 1634–1653 (2019). [CrossRef]

10. S. Nie, C. Wang, X. Xi, S. Luo, G. Li, J. Tian, and H. Wang, “Estimating the vegetation canopy height using micro-pulse photon-counting LiDAR data,” Opt. Express 26(10), A520–A540 (2018). [CrossRef]

11. L. L. Narine, S. C. Popescu, and L. Malambo, “Using ICESat-2 to Estimate and Map Forest Aboveground Biomass: A First Example,” Remote Sens. 12(11), 1824–1840 (2020). [CrossRef]

12. C. S. Gardner, “Ranging performance of satellite laser altimeters,” IEEE Trans. Geosci. Remote Sensing 30(5), 1061–1072 (1992). [CrossRef]

13. C. S. Gardner, “Target signatures for laser altimeters - an analysis,” Appl. Opt. 21(3), 448–453 (1982). [CrossRef]

14. Z. Zhang, Y. Ma, N. Xu, S. Li, J. Sun, and X. H. Wang, “Theoretical background noise rate over water surface for a photon-counting lidar and its application in land and sea cover classification,” Opt. Express 27(20), A1490–A1505 (2019). [CrossRef]

15. G. Zhang, Q. Xu, S. Xing, P. Li, X. Zhang, D. Wang, and M. Dai, “A Noise-Removal Algorithm Without Input Parameters Based on Quadtree Isolation for Photon-Counting LiDAR,” IEEE Geosci. Remote Sensing Lett. 19, 1–5 (2022). [CrossRef]

16. A. J. Brown, “Equivalence relations and symmetries for laboratory, LIDAR, and planetary Müeller matrix scattering geometries,” J. Opt. Soc. Am. A 31, 2789–2794 (2014). [CrossRef]

17. A. J. Brown, T. Michaels, S. Byrne, W. Sun, T. N. Titus, A. Colaprete, M. J. Wolff, G. Videen, and C. J. Grund, “The Science Case for a Modern, Multi-Wavelength, Polarization-Sensitive LIDAR in Orbit around Mars,” J. Quant. Spectrosc. Radiat. Transfer 153, 131–143 (2015). [CrossRef]

18. A. J. Brown and Y. Xie, “Symmetry Relations Revealed in Mueller Matrix Hemispherical Maps,” J. Quant. Spectrosc. Radiat. Transfer 113(8), 644–651 (2012). [CrossRef]

19. B. C. Gao, “NDWI - A normalized difference water index for remote sensing of vegetation liquid water from space,” Remote Sens. Environ. 58(3), 257–266 (1996). [CrossRef]

20. S. K. McFeeters, “The use of the normalized difference water index (NDWI) in the delineation of open water features,” Int. J. Remote Sens. 17(7), 1425–1432 (1996). [CrossRef]

21. W. Y. Yan, A. Shaker, and P. E. LaRocque, “Scan Line Intensity-Elevation Ratio (SLIER): An Airborne LiDAR Ratio Index for Automatic Water Surface Mapping,” Remote Sens. 11(7), 814–834 (2019). [CrossRef]

22. A. Shaker, W. Y. Yan, and P. E. LaRocque, “Automatic land-water classification using multispectral airborne LiDAR data for near-shore and river environments,” ISPRS J. Photogramm. Remote Sens. 152, 94–108 (2019). [CrossRef]

23. T. Neumann, A. Brenner, D. Hancock, J. Robbins, J. Saba, K. Harbeck, A. Gibbons, J. Lee, S. B. Luthcke, and T. Rebold, “ICESat-2 Algorithm Theoretical Basis Document for Global Geolocated Photons (ATL03),” (2021), available online: https://icesat2.gsfc.nasa.gov/sites/default/files/page_files/ICESat2_ATL03_ATBD_r004.pdf (accessed on 21 October 2022).

24. Y. Ma, W. Zhang, J. Sun, G. Li, X. H. Wang, S. Li, and N. Xu, “Photon-Counting Lidar: An Adaptive Signal Detection Method for Different Land Cover Types in Coastal Areas,” Remote Sens. 11(4), 471–489 (2019). [CrossRef]

25. R. Kwok, A. A. Petty, M. Bagnardi, N. T. Kurtz, G. F. Cunningham, A. Ivanoff, and S. Kacimi, “Refining the sea surface identification approach for determining freeboards in the ICESat-2 sea ice products,” Cryosphere 15(2), 821–833 (2021). [CrossRef]

26. T. R. McVicar, T. G. Van Niel, L. T. Li, M. L. Roderick, D. P. Rayner, L. Ricciardulli, and R. J. Donohue, “Wind speed climatology and trends for Australia, 1975-2006: Capturing the stilling phenomenon and comparison with near-surface reanalysis output,” Geophys. Res. Lett. 35(20), L20403 (2008). [CrossRef]

27. H. Xie, Y. Sun, X. Liu, Q. Xu, Y. Guo, S. Liu, X. Xu, S. Liu, and X. Tong, “Shore Zone Classification from ICESat-2 Data over Saint Lawrence Island,” Marine Geodesy 44(5), 454–466 (2021). [CrossRef]

28. R. Kwok, S. Kacimi, T. Markus, N. T. Kurtz, M. Studinger, J. G. Sonntag, S. S. Manizade, L. N. Boisvert, and J. P. Harbeck, “ICESat-2 Surface Height and Sea Ice Freeboard Assessed With ATM Lidar Acquisitions From Operation IceBridge,” Geophys. Res. Lett. 46(20), 11228–11236 (2019). [CrossRef]

29. A. A. Petty, M. Bagnardi, N. T. Kurtz, R. Tilling, S. Fons, T. Armitage, C. Horvat, and R. Kwok, “Assessment of ICESat-2 Sea Ice Surface Classification with Sentinel-2 Imagery: Implications for Freeboard and New Estimates of Lead and Floe Geometry,” Earth Space Sci. 8(3), 1–17 (2021). [CrossRef]

30. M. O. Lammers, M. Castellote, R. J. Small, S. Atkinson, J. Jenniges, A. Rosinski, J. N. Oswald, and C. Garner, “Passive acoustic monitoring of Cook Inlet beluga whales (Delphinapterus leucas),” J. Acoust. Soc. Am. 134(3), 2497–2504 (2013). [CrossRef]

31. A. P. Greeley, T. A. Neumann, N. T. Kurtz, T. Markus, and A. J. Martino, “Characterizing the System Impulse Response Function From Photon-Counting LiDAR Data,” IEEE Trans. Geosci. Remote Sensing 57(9), 6542–6551 (2019). [CrossRef]

32. N. Otsu, “Threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man, Cybern. 9(1), 62–66 (1979). [CrossRef]

Feature set	Name	Symbol	Description
height	RH25	$RH_{25}$	The height features are calculated at the 25th, 50th, 75th, and 98th height percentiles.
	RH50	$RH_{50}$
	RH75	$RH_{75}$
	RH98	$RH_{98}$
morphological	Amplitude	$Amp$	Amplitude is the maximum value of the Gaussian fitting result.
	Skewness	$S_{k}$	Skewness describes the direction and degree of deviation (asymmetry) of the Gaussian fitting result.
	Kurtosis	$K_{u}$	Kurtosis describes the steepness of the fitted Gaussian curve.

Classification accuracy (%) by coastal type
Feature set	Manmade	Muddy	Rocky	Overall
only height features	92.62	91.88	96.99	90.42
only morphology features	97.35	92.17	97.47	95.48
height and morphology features	97.84	97.02	99.42	97.98

Classification accuracy (%) by data collection time
Feature set	Daytime	Nighttime	Overall
only height features	93.31	93.66	93.42
only morphology features	94.20	97.51	95.48
height and morphology features	95.87	98.43	97.98

Optics Express

Coastal type	Collection date	Collection time
manmade	2021-01-26	nighttime
manmade	2021-07-27	daytime
muddy	2021-08-30	nighttime
muddy	2022-02-28	daytime
rocky	2021-04-28	daytime
rocky	2022-04-27	daytime

Classification accuracy (%) after each step
Coastal type	Preliminary classification	Reclassification	Post-processing enhancement
manmade	91.39	97.68	97.84
muddy	91.20	96.88	97.02
rocky	96.98	99.39	99.42
overall	92.62	97.87	97.98