Open Access
12 March 2024
Estimation of shallow bathymetry using Sentinel-2 satellite data and random forest machine learning: a case study for Cheonsuman, Hallim, and Samcheok Coastal Seas
Jae-yeop Kwon, Hye-kyeong Shin, Da-hui Kim, Hyeon-gyu Lee, Jin-kwang Bouk, Jung-hyun Kim, Tae-ho Kim
Abstract

Bathymetry, the measurement of sea depth, has conventionally been conducted using echo-sounders on vessels. However, various factors limit conventional shipborne surveys in coastal regions, including data continuity, geographic obstacles, diplomatic concerns, and marine infrastructures. Remote sensing technology can address these limitations, particularly with the advancement of satellite imaging technology. Indeed, many studies are underway to develop machine learning-based water depth estimation technologies. However, previous studies have focused on clear waters with low turbidity or uniform seabed sediment. Therefore, in this study, we developed a satellite-derived bathymetry (SDB) model using the random forest machine learning algorithm, which was applied to three coastal areas around the Korean Peninsula with distinct characteristics: clear waters (Samcheok), high turbidity (Cheonsuman), and varied seabed sediments (Hallim). We then compared the accuracy of the bathymetric mapping data derived in these three areas. The estimated depth values exhibited the highest accuracy in Samcheok, followed by Hallim and Cheonsuman. In Hallim, WorldView-3 images and on-site surveys confirmed the presence of basalt on the seabed; the dark rock attenuated the remotely sensed reflectance, leading to an overestimation of the depth. In the future, additional satellite images will be applied as training data for the machine learning model to advance the SDB technology using turbidity and seabed sediment distribution data for each area. Ultimately, the SDB results will be applied as depth monitoring data to facilitate safe ship passage in coastal areas, including ports that require periodic and consistent coastal bathymetry. In addition, they can be applied as input data for numerical ocean models, contributing to various fields.

1.

Introduction

The depth of coastal waters is a crucial factor influencing marine environmental management, marine hydrodynamic structures, marine infrastructures, and ship navigation safety. Traditionally, bathymetric surveys have been conducted using shipborne echo-sounders, which calculate ocean depths based on the time it takes for sound waves to reflect off the seabed and return. However, budget constraints and weather conditions limit the ability of this method to survey large maritime expanses in a short period.1–3 Accordingly, the demand for indirect bathymetric survey techniques continues to increase for cases in which direct surveys are not possible due to geographical or diplomatic reasons, high ship traffic, or shallow areas that are inaccessible by ships.4–6 To overcome the limitations of contact-based bathymetry methods, extensive research on satellite-derived bathymetry (SDB) is underway globally. This technique provides low-cost, high-efficiency bathymetric surveys in shallow waters, meeting the requirements for short cycles and timeliness.1–3,5–9

SDB technology estimates water depth from correlations between the remotely sensed reflectance values of satellite imagery observed with optical multispectral sensors and the water depth during image acquisition. While SDB can generally be applied to depths of up to 20 m, it may only be applicable to depths of up to 10 m, depending on the characteristics of the marine region.5,10,11 Indeed, the reflectance of light penetrating the water decreases exponentially with increasing water depth, and the attenuation of remote reflectance varies by wavelength.3,7 The Lyzenga linear band model is a widely used simple SDB model that defines a linear relationship, assuming that the seabed reflectance is linearly related to the depth variation. Meanwhile, the Stumpf logarithmic band ratio model estimates water depth using the reflectance ratio of the seabed to the water based on various experimental parameters.3,8,9,12–14 However, these models adopt empirical methodologies; consequently, the input values for depth estimation vary between marine regions, impeding the construction of a universally applicable model.3 Accordingly, recent research has actively focused on developing depth estimation models using various machine learning algorithms and independent satellite imagery to ensure universality.1–3,5,11,15 Random forest (RF) is a machine learning algorithm that falls under the category of decision tree learning. It is commonly employed for classification and regression tasks.16 In particular, its capacity to readily adjust variables and parameters, combined with its capacity to efficiently handle large amounts of data, makes RF a commonly used method in SDB research for the construction of regression models.17 Indeed, RF models exhibit lower errors than other SDB machine learning techniques, making them advantageous for the generation of accurate models.15,18–26

Most SDB research has been conducted in marine waters where the concentration of suspended solids is low, the annual average phytoplankton concentration is less than 1.0 mg/m³, and the underwater transparency is extremely high, or in atolls and coastal waters with low turbidity.2,3,5,9–11,27–29 The coastal waters of the Korean Peninsula’s West, South, and East Seas differ considerably in marine environmental characteristics, including depth distribution, water turbidity, and sediment composition. The Yellow Sea (West Sea) seabed comprises sand and mud and is characterized by continuous sediment influx from rivers, seabed topography with low-gradient slopes, extensive tidal flats due to a large tidal range, shallower depths, strong tidal influence, and high underwater turbidity due to consistent tidal currents. In contrast, the East Sea has a simple coastline and a narrow continental shelf, leading to rapid depth increases from the coast and relatively clear water with low turbidity. Meanwhile, the South Sea presents characteristics intermediate between the East and Yellow Seas, with a more complex coastline dotted with many small- to medium-sized islands, and a seabed composed primarily of sand and mud.30 In addition, the coastal waters around Jeju Island have a mix of sandy and fine-grained shell sediments alongside basalt reefs.31,32 Domestic and international studies have applied various AI techniques to a single study area with clear waters and determined the most appropriate AI approach,15,33 or have applied an AI technique to multiple study areas with marine characteristics less affected by turbidity.5 However, no study has quantitatively evaluated satellite-based bathymetry results estimated using an SDB model developed with the same methodology for multiple study areas with distinct marine environmental characteristics, such as the West, South, and East Seas of the Korean Peninsula.

In this study, we aimed to develop a model for estimating water depths in three selected coastal areas with distinct marine environmental characteristics, specifically in terms of tide, seabed sediment, and turbidity. To this end, we utilized Sentinel-2 satellite imagery with a 10-m resolution and multibeam bathymetric data provided by the Korea Hydrographic and Oceanographic Agency (KHOA). This model employed the RF machine learning algorithm for training and evaluation datasets. The application of SDB was restricted to areas with depths up to 20 m, and a comparative test was performed using the bathymetric data acquired from KHOA. Furthermore, the potential sources of estimation errors were analyzed by considering the marine environmental characteristics unique to each area, and the feasibility of implementing SDB technology was evaluated.

2.

Materials and Methods

2.1.

Study Area

In this study, the East Sea, Yellow Sea, and South Sea, and the waters around Jeju Island were included in the analyses. To accurately represent the diverse marine environmental characteristics of the three seas surrounding the Korean Peninsula and generate optimal machine learning data along with credible depth estimation results, we selected Samcheok (East Sea), Hallim (South Sea), and Cheonsuman Bay (West Sea) as our training areas (Fig. 1 and Table 1). Specifically, Hallim was selected as an area representative of the South Sea based on the general characteristics of the South Sea coastal waters and the coexistence of sandy and basaltic seabed materials.

Fig. 1

Geographic location of training areas. Sentinel-2A/B RGB images of (a) Cheonsuman, (b) Hallim, and (c) Samcheok. The blue boxes indicate three additional test areas, different from the training areas: Deokjeok in the Yellow Sea, Seongsan in the South Sea, and Sokcho in the East Sea.


Table 1

Geographic coordinates of training areas.

| Area | Cheonsuman (Yellow Sea) | Hallim (South Sea) | Samcheok (East Sea) |
| --- | --- | --- | --- |
| Latitude (WGS-84) | 36.40° N–36.61° N | 33.38° N–33.44° N | 37.39° N–37.48° N |
| Longitude (WGS-84) | 126.37° E–126.50° E | 126.20° E–126.27° E | 129.16° E–129.23° E |

2.2.

Materials

2.2.1.

Satellite imagery

Level-2 multispectral instrument images, which underwent atmospheric correction to obtain surface reflectance (SR) products from the Sentinel-2A/B datasets provided by the European Space Agency, were used in this study. The images are freely available at the Copernicus Open Access Hub.34 The Sentinel-2A/B satellites image the same area at 5-day intervals. For the machine learning model that estimated depths using the Sentinel-2 satellite imagery, we employed five bands: blue (B2), green (B3), red (B4), vegetation red edge (B5), and near-infrared (NIR, B8). Band B5 had a spatial resolution of 20 m and was therefore resampled using a linear method to match the 10-m spatial resolution of the other bands (Table 2).

Table 2

Characteristics of the Sentinel-2 band data used in this study.

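| Band | B2 | B3 | B4 | B5 | B8 |
| --- | --- | --- | --- | --- | --- |
| Name | Blue | Green | Red | Vegetation red edge | NIR |
| Wavelength range (center) (nm) | 458 to 523 (490) | 543 to 578 (560) | 650 to 680 (665) | 698 to 713 (705) | 758 to 899 (842) |
| Resolution (m) | 10 | 10 | 10 | 20 | 10 |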

For the training dataset, we utilized six to seven multi-temporal images per training area. This approach was adopted to mitigate the effect of tide-driven differences in real water depth between satellite images in areas with large tidal variation, which can be a limitation when using a single image.5 We selected images of the training areas captured in 2020, the same year as the nautical chart’s production, focusing on images with minimal cloud coverage (10%) and lesser influences of turbidity and waves (Table 3).

Table 3

Sentinel-2 imagery used as training data for each area.

| Area | Tile number | Date | Time (KST) |
| --- | --- | --- | --- |
| Cheonsuman (Yellow Sea), 7 images | T52SBF | January 18, 2020 | 11:20 |
| | | January 23, 2020 | 11:19 |
| | | March 8, 2020 | 11:16 |
| | | April 4, 2020 | 11:15 |
| | | April 4, 2020 | 11:16 |
| | | October 24, 2020 | 11:17 |
| | | November 8, 2020 | 11:18 |
| Hallim (South Sea), 6 images | T52SBC | March 8, 2020 | 11:16 |
| | | April 7, 2020 | 11:16 |
| | | April 27, 2020 | 11:16 |
| | | May 12, 2020 | 11:15 |
| | | October 14, 2020 | 11:16 |
| | | October 29, 2020 | 11:18 |
| Samcheok (East Sea), 6 images | T52SEG | January 5, 2020 | 11:10 |
| | | January 15, 2020 | 11:10 |
| | | October 26, 2020 | 11:07 |
| | | November 10, 2020 | 11:09 |
| | | November 25, 2020 | 11:10 |
| | | December 20, 2020 | 11:11 |

2.2.2.

Bathymetric data

The depth data provided by KHOA were the water depth values used for the latest nautical chart (echo-sounding in 2020), extracted and edited from echo-sounder data and referenced to the datum level (DL) (Fig. 2). Because the bathymetric data were in vector format, they were resampled to align with the 10-m grid of the satellite imagery. Drawing on prior research regarding the limits of light penetration depth, our study focused solely on areas with depths <20 m.3,5,14 To adjust for the datum, we applied tidal values from the corresponding image timestamps during model training and testing.

Fig. 2

Real-depth in situ echo-sounding bathymetry data from the KHOA of (a) Cheonsuman, (b) Hallim, and (c) Samcheok.


2.2.3.

Tidal data

To obtain tidal data, we employed the NAO99.Jb tidal model provided by the National Astronomical Observatory of Japan.35 The NAO99.Jb tidal model, with a spatial resolution of 1/12 deg (about 9 km), was designed for use in the Northwest Pacific region.36

To ascertain the precision of the NAO99.Jb model, its outputs were systematically compared with observation data. The detailed analysis was performed at three locations in the Yellow Sea with a high tidal range (Incheon, Pyeongtaek, and Anheung) [Fig. 3(a)]. The tidal amplitudes obtained from the tide stations at these locations were compared with the amplitudes predicted by the model. The error at all stations was <5 cm (Table 4). These findings suggest that the NAO99.Jb model can be effectively utilized as tidal calibration data.

Fig. 3

(a) A map showing the locations of Incheon (37.4519° N, 126.5922° E), Pyeongtaek (37.1366° N, 126.5408° E), and Anheung (37.6747° N, 126.1294° E). (b) A time-series graph comparing the observations (black) with the NAO99.Jb output (red) from July 1, 2020, to July 3, 2020. The x axis is time and the y axis is sea surface height.


Table 4

The amplitude of observed values from the three regions was compared with the amplitude of values from the NAO99.Jb model.

| Amplitude (cm) | Incheon | Pyeongtaek | Anheung |
| --- | --- | --- | --- |
| Observation (①) | 464.0 | 465.4 | 354.7 |
| NAO99.Jb (②) | 463.3 | 465.0 | 358.4 |
| Difference (① − ②) | 0.7 | 0.5 | −3.7 |

2.3.

Satellite-Derived Bathymetry Mechanism

2.3.1.

Beer–Lambert law

The total upwelling radiance (Lt) observed from the satellite was defined as the sum of atmospheric path radiance (Lp), specular radiance (Ls), subsurface volumetric radiance (Lv), and bottom radiance (Lb), as expressed by Eq. (1) (Fig. 4):

Eq. (1)

Lt=Lp+Ls+Lv+Lb.

Fig. 4

SDB mechanism for various sediment types.


In Eq. (1), Lp can be removed through atmospheric correction and Ls and Lv can be removed through sun-glint and deep-water corrections, respectively, leaving only Lb, defined by the Beer–Lambert law as per Eq. (2):

Eq. (2)

$$L_b=\int_{\lambda}^{\lambda+\Delta\lambda}\left[\rho(\lambda)\,L_{\lambda}\,e^{-(\sec\theta+\sec\phi)\,\alpha(\lambda)\,Z}\right]d\lambda,$$
where $\lambda$ is the wavelength, $\rho(\lambda)$ is the bottom reflectance, $\alpha(\lambda)$ is the water’s attenuation coefficient, $\theta$ is the viewing angle (from the nadir), $\phi$ is the solar-illumination angle (from the vertical), and $Z$ is the water depth. Equation (2) indicates that reflectance decreases exponentially as the water depth increases; this decrease is more pronounced for longer wavelengths.
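The exponential term of Eq. (2) can be illustrated for a single wavelength with a minimal sketch. The attenuation coefficients and the angles below are hypothetical values chosen only to show the trend (faster decay at longer wavelengths); they are not taken from this study.

```python
import math


def bottom_signal_fraction(depth_m, alpha_per_m, view_deg=0.0, sun_deg=30.0):
    """Single-wavelength exponential term of Eq. (2).

    Returns exp(-(sec(theta) + sec(phi)) * alpha * Z), i.e. the fraction of the
    bottom-reflected signal that survives the two-way path through the water.
    """
    sec_view = 1.0 / math.cos(math.radians(view_deg))
    sec_sun = 1.0 / math.cos(math.radians(sun_deg))
    return math.exp(-(sec_view + sec_sun) * alpha_per_m * depth_m)


# Hypothetical attenuation coefficients (1/m), for illustration only
ALPHA = {"blue": 0.05, "green": 0.08, "red": 0.40}

for depth in (1, 5, 10, 20):
    fractions = {band: round(bottom_signal_fraction(depth, a), 4)
                 for band, a in ALPHA.items()}
    print(f"Z = {depth:2d} m -> {fractions}")
```

Under these assumed coefficients, the red-band signal becomes negligible within a few meters, whereas the blue and green bands still carry a bottom signal at greater depths.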

2.4.

Preprocessing of Satellite Images

2.4.1.

Sun-glint correction

Sun-glint is observed in satellite images when sunlight reflects directly into the sensor due to a tilted surface. This can arise from various factors, including the sea surface, sun’s position, sensor’s viewing angle, and wind. To enhance the accuracy of water reflectance results, sun-glint correction was applied.37 This correction typically involves establishing a linear relationship between the NIR band and other bands, followed by adjustments for outlier pixels.38,39 This procedure was conducted using the Sentinel Application Platform offered by the European Space Agency (Fig. 5).
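The linear NIR-based correction described above can be sketched as follows. This is a generic regression-based deglinting scheme under stated assumptions, not the Sentinel Application Platform implementation used in this study; the function name `deglint`, the sample mask, and the synthetic values are hypothetical.

```python
import numpy as np


def deglint(band, nir, sample_mask):
    """Remove the glint component of `band` predicted from the NIR signal.

    A straight line is fitted between NIR and the visible band over the
    sample pixels (assumed to be optically deep water), and the NIR excess
    above the darkest sample is subtracted, scaled by the fitted slope.
    """
    band = np.asarray(band, dtype=float)
    nir = np.asarray(nir, dtype=float)
    slope, _intercept = np.polyfit(nir[sample_mask], band[sample_mask], 1)
    nir_min = nir[sample_mask].min()  # assumed glint-free NIR level
    return band - slope * (nir - nir_min)


# Synthetic 2x2 scene: constant water signal plus a glint pattern
glint = np.array([[0.00, 0.02], [0.05, 0.01]])
nir = 0.01 + glint
blue = 0.05 + 0.8 * glint
mask = np.ones(glint.shape, dtype=bool)

corrected = deglint(blue, nir, mask)
print(corrected)
```

In this synthetic case the glint is perfectly correlated with NIR, so the corrected band returns to the constant water signal; real scenes would retain residual noise.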

Fig. 5

(a) Sentinel-2 red-green-blue (RGB) image of Cheonsuman in the Yellow Sea, (b) RGB image before sun-glint correction, and (c) RGB image after land masking and sun-glint correction.


2.4.2.

Land masking

In the NIR band, water intensely absorbs light, resulting in pixel reflectance values that approach 0. The normalized difference water index (NDWI) exploits this characteristic to differentiate between water and land by employing the NIR and green bands.40 This relationship is represented by Eq. (3), with NDWI values ranging from −1 to 1. Regions with an NDWI value >0 were categorized as sea, while those with <0 were identified as land [Fig. 5(c)]:

Eq. (3)

$$\mathrm{NDWI}=\frac{\mathrm{Green}-\mathrm{NIR}}{\mathrm{Green}+\mathrm{NIR}}.$$
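A minimal sketch of the NDWI-based land mask follows, assuming the green (B3) and NIR (B8) bands are available as reflectance arrays on the same grid. The function names and sample values are hypothetical, and treating pixels with a zero denominator as non-water is an assumption of this sketch.

```python
import numpy as np


def ndwi(green, nir):
    """Normalized difference water index, (Green - NIR) / (Green + NIR)."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = green + nir
    out = np.full(green.shape, np.nan)
    # Leave NaN where both bands are zero to avoid division by zero
    np.divide(green - nir, denom, out=out, where=denom != 0)
    return out


def water_mask(green, nir):
    """True where NDWI > 0 (sea); land and undefined pixels are False."""
    return ndwi(green, nir) > 0


green = np.array([[0.08, 0.05], [0.00, 0.03]])
nir = np.array([[0.01, 0.30], [0.00, 0.03]])
print(water_mask(green, nir))
```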

2.4.3.

Tidal correction

The depth indicated on the nautical chart is based on the DL standard, while the SDB result reflects the real water depth from the sea surface, influenced by the tide at the time of satellite imaging. Both the Yellow Sea and South Sea of Korea are strongly influenced by tides. Notably, in the Yellow Sea, the tidal range can span up to 10 m.41,42 Consequently, to generate an SDB result aligned with the DL standard, a datum correction accounting for the tidal level during the imaging time is necessary:

Eq. (4)

$$D_s(m)=\mathrm{MSL}+tl_{\mathrm{MSL}}=\mathrm{DL}+(H_m+H_s+H_o+H_k)+tl_{\mathrm{MSL}},$$
where $D_s(m)$ is the depth at the time of satellite imaging, $tl_{\mathrm{MSL}}$ is the tidal level from the mean sea level (MSL), and $H_m$, $H_s$, $H_o$, and $H_k$ denote the semi-range of the tidal constituents M2, S2, O1, and K1, respectively.

The tidal height data during satellite imaging were sourced from the tidal model, using the grid value nearest to the training area. Tidal heights are positive and negative for high and low tides, respectively, relative to the mean sea level. These values were input into Eq. (4) to determine the corrected depth from the nautical chart during imaging. The corrected depth was then utilized as the reference data for SDB model training. The same relation, applied in reverse, converts the depth derived from the SDB model back to the DL reference. Because SDB estimates the depth at the time the satellite image was taken, the depth value adjusted to the nautical chart datum must be used for quantitative assessment.
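The datum conversion of Eq. (4) and its inverse can be sketched as two small functions. The function names, argument names, and numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def chart_to_imaging_depth(chart_depth_m, h_m2, h_s2, h_o1, h_k1, tide_msl_m):
    """Eq. (4): convert a DL-referenced chart depth to the depth at imaging time.

    The sum of the four constituent semi-ranges is the offset between DL and
    MSL; tide_msl_m is the tidal level relative to MSL (negative at low tide).
    """
    datum_offset = h_m2 + h_s2 + h_o1 + h_k1
    return chart_depth_m + datum_offset + tide_msl_m


def imaging_to_chart_depth(sdb_depth_m, h_m2, h_s2, h_o1, h_k1, tide_msl_m):
    """Inverse of Eq. (4): refer an SDB depth back to the DL standard."""
    datum_offset = h_m2 + h_s2 + h_o1 + h_k1
    return sdb_depth_m - datum_offset - tide_msl_m


# Hypothetical semi-ranges (m) and a low-tide level of -1.2 m relative to MSL
constituents = dict(h_m2=2.0, h_s2=0.8, h_o1=0.3, h_k1=0.4)
depth_at_imaging = chart_to_imaging_depth(10.0, tide_msl_m=-1.2, **constituents)
depth_on_chart = imaging_to_chart_depth(depth_at_imaging, tide_msl_m=-1.2, **constituents)
print(depth_at_imaging, depth_on_chart)
```

The first function yields the reference depth used for training; the second is the step that aligns model output with the nautical chart for evaluation.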

2.5.

Depth Estimation Using a Machine Learning Model

2.5.1.

Depth estimation model based on random forest

We constructed a model to estimate water depth using Sentinel-2A/B satellite images, electronic nautical chart data, and the tidal model (Fig. 6). The input data for the machine learning model were extracted from the Sentinel-2A/B images of the training areas after sun-glint correction and land masking. We then applied a mean filter to mitigate noise in the satellite images, replacing the value of each target pixel with the mean of its 3×3-pixel neighborhood. The reference depths were taken from the electronic nautical chart and corrected using the tidal height at the time each satellite image was captured. We matched the preprocessed five-band satellite imagery with these tide-corrected reference depths, creating one matched dataset per multi-temporal image. The band values and depth values were grouped for every pixel at which band values were present in each image. Finally, the training dataset was composed by drawing a random sample from these data. For depth estimation, we used the RF method, which involves training multiple decision trees. This algorithm facilitates easy variable modification and demands high dataset accuracy for training.5 We constructed the RF-based SDB model using the ensemble package in Python’s scikit-learn tool [Fig. 6(a)]. The trained model was then applied to independently observed satellite images to estimate water depths in the training and test areas. Depths estimated from the satellite imagery were adjusted to the DL standard for comparison with the nautical chart data [Fig. 6(b)].
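The filtering and regression steps can be sketched with NumPy and scikit-learn's `RandomForestRegressor`. This is a minimal illustration under stated assumptions, not the authors' code: the edge-replicating padding, the hyperparameters (`n_estimators=100`), and the synthetic reflectance data generated from hypothetical attenuation coefficients are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def mean_filter_3x3(band):
    """Replace each pixel with the mean of its 3x3 neighbourhood.

    Image edges are handled by replicating the border pixels (an assumption).
    """
    padded = np.pad(np.asarray(band, dtype=float), 1, mode="edge")
    rows, cols = padded.shape[0] - 2, padded.shape[1] - 2
    total = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 9.0


# Synthetic stand-in for the matched (five-band reflectance, depth) samples
rng = np.random.default_rng(0)
depth = rng.uniform(0.0, 20.0, size=2000)
alphas = np.array([0.05, 0.08, 0.40, 0.60, 2.00])  # hypothetical, 1/m
reflectance = 0.2 * np.exp(-2.0 * alphas * depth[:, None])
reflectance += rng.normal(0.0, 0.002, size=reflectance.shape)

# Fit on the first 1600 samples, predict the remaining 400
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(reflectance[:1600], depth[:1600])
predicted = model.predict(reflectance[1600:])
rmse = float(np.sqrt(np.mean((predicted - depth[1600:]) ** 2)))
print(f"hold-out RMSE on synthetic data: {rmse:.2f} m")
```

In practice, each row of the feature matrix would hold the filtered B2, B3, B4, B5, and B8 values of one pixel, and the target would be the tide-corrected chart depth of that pixel.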

Fig. 6

Flowchart of the SDB model. (a) Training and (b) predicting processes. The red box is the SDB model created during the training process and used for predicting.


2.5.2.

Depth estimation model training

The SDB model was developed using satellite images and categorized into three groups based on their respective training areas. For each area, datasets were created by combining data from five bands of the satellite images with nautical chart data, specifically targeting pixels representing depths between 5 and 25 m. Given the depth distribution across the training areas and to avoid training bias at certain depths, a consistent number of data points (15,000) was randomly sampled at 5-m intervals to create the training dataset. Of each dataset, 80% of the data points were utilized for model training, and the remaining 20% were randomly selected for validation [Fig. 6(a)].
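The depth-stratified sampling and the 80/20 split can be sketched as below. The function names, bin edges, and sample counts in the usage lines are hypothetical; with 15,000 points in each of four depth segments, the same logic would yield 60,000 samples, of which 12,000 are held out for validation.

```python
import numpy as np


def stratified_sample(depths, bin_edges, n_per_bin, rng):
    """Draw up to n_per_bin random sample indices from each depth interval."""
    depths = np.asarray(depths)
    chosen = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        idx = np.flatnonzero((depths >= lo) & (depths < hi))
        take = min(n_per_bin, idx.size)
        chosen.append(rng.choice(idx, size=take, replace=False))
    return np.concatenate(chosen)


def split_train_validation(indices, train_fraction, rng):
    """Shuffle the indices and split them into training and validation sets."""
    shuffled = rng.permutation(indices)
    cut = int(round(train_fraction * shuffled.size))
    return shuffled[:cut], shuffled[cut:]


rng = np.random.default_rng(1)
depths = rng.uniform(0.0, 20.0, size=10_000)  # synthetic reference depths
sample_idx = stratified_sample(depths, [0, 5, 10, 15, 20], 100, rng)
train_idx, val_idx = split_train_validation(sample_idx, 0.8, rng)
print(sample_idx.size, train_idx.size, val_idx.size)
```

Sampling an equal count per interval keeps depth ranges with many pixels from dominating the fit.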

The accuracy of the satellite (estimated) data in relation to the in situ (actual) data was measured through a Pearson’s correlation coefficient (r) and the root mean square error (RMSE). The calculations for r and RMSE are represented by Eqs. (5) and (6), respectively:

Eq. (5)

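$$r=\frac{\sum_{i=1}^{N}\left(D_i-\bar{D}_i\right)\left(D_s-\bar{D}_s\right)}{(N-1)\,S_i S_s}$$

Eq. (6)

$$\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(D_i-D_s\right)^2},$$
where $D_i$ is the actual depth, $D_s$ is the estimated depth, $\bar{D}_i$ and $\bar{D}_s$ are the averages of $D_i$ and $D_s$, and $S_i$ and $S_s$ are the standard deviations of $D_i$ and $D_s$, respectively.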
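Both metrics can be computed directly from their definitions, as in the following sketch. The function names and sample depths are hypothetical; the sample standard deviations use the N − 1 denominator, matching Eq. (5).

```python
import math


def rmse(actual, estimated):
    """Root mean square error between actual and estimated depths, Eq. (6)."""
    n = len(actual)
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimated)) / n)


def pearson_r(actual, estimated):
    """Pearson's correlation coefficient, Eq. (5)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    # Sample standard deviations (N - 1 in the denominator)
    s_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / (n - 1))
    s_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated) / (n - 1))
    cross = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    return cross / ((n - 1) * s_a * s_e)


actual = [1.0, 2.0, 3.0, 4.0]
estimated = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(actual, estimated), rmse(actual, estimated))
```

The example shows why both metrics are reported: an estimate that is perfectly correlated with the actual depth can still carry a large RMSE when it is systematically biased.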

2.5.3.

Depth estimation model evaluation

To evaluate the performance of the depth estimation model tailored for each training area, we derived depth predictions from satellite images taken on dates distinct from those used in training (Table 5). The satellite images used for testing were preprocessed identically to the training data and then applied to the SDB model of the corresponding training area. Since these predictions represent depths at the precise moment of satellite imaging, they underwent a tidal correction to align with the DL standard. Subsequently, only predictions within the 0 to 20 m range were compared with the real depth.

Table 5

Sentinel-2 imagery used as evaluation data for model verification.

| Area | Tile number | Date | Time (KST) |
| --- | --- | --- | --- |
| Cheonsuman | T52SBF | April 17, 2022 | 11:16 |
| Hallim | T52SBC | October 19, 2020 | 11:16 |
| Samcheok | T52SEG | March 25, 2020 | 11:06 |

3.

Results

3.1.

Depth Estimation Based on Satellite Imagery

The model accuracy was validated using 20% of the training dataset, corresponding to 3000 randomly selected data points per depth segment (12,000 in total). The RMSEs were 5.0526, 4.6616, and 2.0749 m for Cheonsuman, Hallim, and Samcheok, respectively. Samcheok also showed the highest r (0.9152) among the three areas (Table 6). A previous study on waters with clarity similar to that of Samcheok, specifically areas where the annual average chlorophyll-a concentration is less than 1.0 mg/m³, reported similar RMSE values between 1.77 and 1.97 m.5

Table 6

Validation results of three SDB models trained with Sentinel-2.

| Area | Cheonsuman | Hallim | Samcheok |
| --- | --- | --- | --- |
| RMSE (m) | 5.0526 | 4.6616 | 2.0749 |
| r | 0.5289 | 0.6915 | 0.9152 |

3.2.

Quantitative Evaluation

To compare the performance of the SDB model across the three training areas, a quantitative evaluation was conducted using statistical analyses. For objective testing, data points were randomly extracted at 5-m depth intervals from satellite images that were not used during the depth estimation model training process. The estimated depths were then compared with depths from electronic nautical charts (Table 7). Marked differences were observed among the training areas: Samcheok exhibited the highest estimation accuracy (r=0.90, RMSE=2.5861 m), followed by Hallim (r=0.52, RMSE=5.4863 m) and Cheonsuman (r=−0.05, RMSE=6.4603 m). For Samcheok, the estimates were most precise in shallow water, with accuracy diminishing linearly as depth increased to 15 m. Interestingly, the RMSE for depths between 15 and 20 m was lower than that for depths between 5 and 15 m.

Table 7

Sentinel-2 data used as evaluation data.

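| Depth (m) | Metric | Cheonsuman | Hallim | Samcheok |
| --- | --- | --- | --- | --- |
| Total | RMSE (m) | 6.4603 | 5.4863 | 2.5861 |
| Total | r | −0.0496 | 0.5213 | 0.8989 |
| 0 to 5 | RMSE (m) | 6.4013 | 7.8001 | 1.9885 |
| 0 to 5 | r | 0.3383 | 0.2944 | 0.3060 |
| 5 to 10 | RMSE (m) | 3.5024 | 5.9100 | 2.6476 |
| 5 to 10 | r | −0.0330 | 0.3021 | 0.5653 |
| 10 to 15 | RMSE (m) | 4.6383 | 3.0729 | 3.1803 |
| 10 to 15 | r | −0.0854 | 0.0334 | 0.4897 |
| 15 to 20 | RMSE (m) | 9.5875 | 3.6570 | 2.3820 |
| 15 to 20 | r | −0.0747 | 0.0468 | 0.1673 |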

3.3.

Qualitative Evaluation

As in the quantitative evaluation, independent satellite images were used for the evaluation dataset. To investigate the causes of the estimation errors, we conducted a density scatter plot analysis, which allows for the visualization of data distribution and patterns. A density scatter plot was generated to visually compare the real and estimated depths across the three training areas, with the x and y axes representing the real and estimated depths, respectively (Fig. 7). In the density scatter plot, a higher proportion of red indicates a higher density of points, which can be interpreted as the influence of certain factors. A distribution closer to the 1:1 line indicates that the model better represents the real values. For Cheonsuman, the estimated depth values predominantly clustered within the 5- to 12-m range irrespective of the variations in actual depth [Fig. 7(a)]. Meanwhile, higher accuracy was achieved at depths of 10 to 20 m in Hallim. Nevertheless, regions with high-density scatter points appeared in the 0- to 7.5-m depth range [Fig. 7(b)]. Conversely, the data for Samcheok exhibited high prediction accuracy in shallow waters. However, this accuracy decreased with increasing depth [Fig. 7(c)].

Fig. 7

Density scatter plot analysis between the real and estimated depths. (a) Cheonsuman, (b) Hallim, and (c) Samcheok.


Maps were generated for the three training areas showing the estimated depths [Figs. 8(a), 8(d), and 8(g)], 2020 nautical chart depths [Figs. 8(b), 8(e), and 8(h)], and differences between the two [Figs. 8(c), 8(f), and 8(i)]. These maps were utilized to identify regions with significant discrepancies. The smallest error between the real and estimated depths was observed for Samcheok (RMSE = 2.5861 m). A general tendency of overestimation was observed across the entire area, with localized underestimations along the coastal regions [Fig. 8(i)]. In Hallim, results centered around Biyangdo indicated an RMSE of 5.4863 m. Notably, the coastal areas close to Biyangdo exhibited overestimation in the northwest direction and underestimation in the offshore direction [Fig. 8(f)]. The southeastern waters exhibited relatively low error, with certain areas indicating overestimation. Cheonsuman had the highest RMSE value at 6.4603 m. The northern region of this training area saw alternating patterns of over- and underestimations, while the southern region predominantly showed an overestimation trend [Fig. 8(c)]. The causes behind the errors observed in both Hallim and Cheonsuman are discussed in Sec. 4.

Fig. 8

Water depth and differences for (a)–(c) Cheonsuman, (d)–(f) Hallim, and (g)–(i) Samcheok, respectively. (a), (d), (g) Real depth maps and (b), (e), (h) SDB model result maps of the training areas. (c), (f), and (i) Differences in water depth obtained by subtracting the real depth from the SDB model results. Red and blue in (c), (f), and (i) indicate overestimated and underestimated SDB model results, respectively.


4.

Discussion

4.1.

Evaluation of SDB Results and Accuracy Affected by Turbidity

In water bodies such as the Yellow Sea, which are significantly affected by tides and dominated by fine-grained sediment, the seabed sediments are readily resuspended by strong and recurring tidal currents, thus increasing the seawater turbidity.38 The resulting scattering often leads to shallower depth estimates than the actual depths. Cheonsuman, embodying the marine environmental characteristics typical of the Yellow Sea, exhibited a depth estimation accuracy lower than that in the East Sea.

The density scatter plot and correlation coefficient for Cheonsuman likewise showed that the estimates failed to follow the actual depth distribution. In areas with high turbidity, shallow depths are overestimated and deep depths are underestimated, leading to errors.33 This pattern was observed in Cheonsuman. In regions deeper than 10 m, an inversion phenomenon was observed, where the estimated depth decreased as the actual depth increased. In areas with heightened turbidity, underwater suspended matter typically scatters light, resulting in a higher reflectance than in clearer waters.

To address these issues, we conducted an additional analysis using the normalized difference turbidity index (NDTI) to assess the impact of turbidity on the depth estimation model. The NDTI method measures the concentrations of soil sediments, microalgae, and other suspended materials that contribute to water turbidity, utilizing the green (B3) and red (B4) bands, as defined in Eq. (7).43 NDTI values range from −1 to 1, with those nearing −1 indicating less turbidity and clear waters:

Eq. (7)

$$\mathrm{NDTI}=\frac{\mathrm{Red}-\mathrm{Green}}{\mathrm{Red}+\mathrm{Green}}.$$

We calculated the NDTI using the reflectance of pixels from the depth estimation test across the three areas (Fig. 9). Cheonsuman displayed mid-level NDTI values that did not correlate with the depth estimation results [Fig. 9(a)]. In contrast, for Hallim and Samcheok, regions with relatively low turbidity indices, higher accuracy was achieved for the depth estimation [Figs. 9(b) and 9(c)]. However, varied NDTI values were observed in the shallow waters of Hallim at 0 to 7.5 m depth [Fig. 9(b)].

Fig. 9

Scatter plot analysis based on NDTI levels for (a) Cheonsuman, (b) Hallim, and (c) Samcheok.


Upon the addition of NDTI values to the training data, the correlation and accuracy of the results improved compared to those yielded by the original SDB model, with Cheonsuman showing the most significant change. Its correlation shifted from essentially none (r=−0.05) to a low correlation (r=0.3272), and the RMSE decreased by 1.1 m. However, in the 0 to 5 m depth range of Cheonsuman, the RMSE increased. Meanwhile, in the deeper waters (15 to 20 m) of Hallim and Samcheok, the RMSE increased by 0.20 and 0.56 m, respectively, whereas in the shallow waters (0 to 5 m), the RMSE improved slightly, decreasing by 0.20 and 0.08 m, respectively. These results indicate that neglecting turbidity when developing satellite image-based depth estimation models can lead to significant errors, and that accounting for turbidity is necessary to improve accuracy in turbid waters (Fig. 10).
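Appending NDTI to the five band values as a sixth input feature can be sketched as follows. The column order (B2, B3, B4, B5, B8), the function names, and the sample reflectances are assumptions made for illustration; how the authors assembled the augmented feature matrix is not specified here.

```python
import numpy as np


def ndti(red, green):
    """Normalized difference turbidity index, (Red - Green) / (Red + Green)."""
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    denom = red + green
    out = np.full(red.shape, np.nan)
    # Leave NaN where both bands are zero to avoid division by zero
    np.divide(red - green, denom, out=out, where=denom != 0)
    return out


def add_ndti_feature(bands):
    """Append NDTI as an extra column.

    `bands` has shape (n_samples, 5) with columns assumed to be ordered
    B2, B3, B4, B5, B8, so green is column 1 and red is column 2.
    """
    bands = np.asarray(bands, dtype=float)
    return np.column_stack([bands, ndti(bands[:, 2], bands[:, 1])])


samples = np.array([
    [0.05, 0.08, 0.02, 0.01, 0.005],  # green > red: lower NDTI
    [0.06, 0.04, 0.06, 0.03, 0.010],  # red > green: higher NDTI
])
features = add_ndti_feature(samples)
print(features[:, -1])
```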

Fig. 10

Results of the SDB model with additional NDTI data. Note: The number under the r and RMSE values indicates the change from the existing SDB model. Blue and red indicate positively and negatively altered main results, respectively.


4.2.

Impact of Seabed Sediment on the SDB Model

Coastal areas with tides typically experience resuspension of fine sedimentary particles due to recurring tidal currents, which affects underwater turbidity. Despite the partial removal of the turbidity effect in Hallim at 0 to 10 m depth, which caused an approximate 0.2 m reduction in the RMSE, a relatively high error was still observed in the range of 5.7102 to 7.5959 m. To discern the cause of this discrepancy, depth profiles were plotted for the overestimated depth range and areas with relatively low deviation, and a depth trend analysis was conducted (Fig. 11).

Fig. 11

(a)–(c) Water depth on the ①–③ transect lines in (d), respectively. Black and red in (a)–(c) indicate real and SDB model depths, respectively. The location of 0 m in (a)–(c) implies a point close to the coastline in (d).


Figures 11(a) and 11(b) show that the depths estimated by the machine learning model were about 10 m deeper than the actual depths, although the patterns of depth changes aligned well. As verified in Sec. 2.2.3, the tidal model error was within 5 cm, and the tidal range in this region is <4.5 m; such a 10-m discrepancy therefore cannot be attributed solely to tidal influences.

To determine the cause of this overestimation, we created a scatter plot of reflectivity by depth [Fig. 12(a)]. Following the Beer–Lambert law, we distinguished areas where reflectivity decreased exponentially with depth (blue) from those where reflectivity increased with depth (red). This characteristic was observed in all bands, although the slopes differed. Although reflectivity decreases exponentially with depth in most clear-water sea areas,1 the results for Hallim showed a different characteristic, implying the involvement of other factors.

Fig. 12

(a) Scatter plot of the Hallim results, with the actual depth on the x-axis and band-2 reflectance on the y-axis. Black dots represent all points in the area; blue dots indicate points where the difference between the actual and predicted depths is within 3 m, and red dots indicate points where the difference is >3 m. (b) Locations of points at depths within 10 m, classified based on (a).


The locations were confirmed by mapping the points according to these characteristics [Fig. 12(b)], and the areas were visually interpreted using WorldView-3 RGB composite images [R (MS2, 630 nm), G (MS3, 545 nm), and B (MS4, 480 nm); spatial resolution of 1.2 m] (Fig. 13). As observed in Figs. 13(a)①, 13(a)②, and 13(b)③, areas where the model overestimated the depth displayed a noticeably darker seabed than their surroundings. On-site investigations of these regions identified a basaltic seabed interspersed with patches of white sand (Fig. 14).

Fig. 13

WorldView-3 RGB image of Hallim acquired on 2018-11-15 at 11:43 (KST). (a) Areas where depth was overestimated and (b) areas where depth was well estimated. The red box in (b) shows the in-situ area in Fig. 14(a).


Fig. 14

Sites where on-site investigations were conducted at (a) Hallim (yellow circles indicate points where pictures were taken), (b) 33.39376° N, 126.23653° E, (c) 33.39381° N, 126.23635° E, and (d) 33.3941° N, 126.23776° E.


In shallow waters, the seabed color can influence SR because the remote reflectance depends on the color of the seabed materials. Areas with dark seabed materials had low reflectance, and their depths were therefore overestimated. For Hallim Harbor, the overestimation in shallow areas can likely be attributed to the basaltic nature of the seabed. Indeed, we found that reflectance characteristics varied with the seabed materials. Thus, we anticipate that incorporating additional seabed spatial data into the training dataset in the future will enhance model performance. A sediment distribution map created from airborne hyperspectral imaging is scheduled to be provided by KHOA.
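The link between a dark seabed and depth overestimation can be illustrated with a simplified single-band shallow-water reflectance model, R(z) = R_deep + (A_b - R_deep) exp(-2 K z), where A_b is the bottom albedo. The sketch below is not the RF model used in this study, and all parameter values (attenuation coefficient, albedos, depth) are hypothetical. It only shows that inverting the reflectance of a dark bottom while assuming a bright sandy bottom returns a depth greater than the true one.

```python
import math


def modeled_reflectance(depth, bottom_albedo, deep_reflectance, k):
    """Single-band shallow-water reflectance for a given bottom albedo."""
    return deep_reflectance + (bottom_albedo - deep_reflectance) * math.exp(
        -2.0 * k * depth
    )


def inverted_depth(reflectance, assumed_albedo, deep_reflectance, k):
    """Depth recovered from reflectance when the bottom albedo is assumed."""
    ratio = (reflectance - deep_reflectance) / (assumed_albedo - deep_reflectance)
    return -math.log(ratio) / (2.0 * k)


# Hypothetical parameters chosen only for illustration.
K = 0.1              # diffuse attenuation coefficient (1/m)
R_DEEP = 0.02        # reflectance of optically deep water
SAND_ALBEDO = 0.30   # bright sandy bottom
BASALT_ALBEDO = 0.08 # dark basaltic bottom
TRUE_DEPTH = 3.0     # meters

# Reflectance actually observed over the dark bottom at the true depth.
observed = modeled_reflectance(TRUE_DEPTH, BASALT_ALBEDO, R_DEEP, K)

# Depth inferred if the bottom is (wrongly) assumed to be bright sand.
apparent_depth = inverted_depth(observed, SAND_ALBEDO, R_DEEP, K)
```

With these values `apparent_depth` exceeds `TRUE_DEPTH`, because the lower reflectance of the dark bottom is interpreted as additional water-column attenuation. A model trained mostly on brighter bottoms could plausibly behave in a similar way, although that is an interpretation rather than a result from the text.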

4.3.

Application of the SDB Model in Test Area

To assess whether the results of this study could represent different marine environments, we applied the three SDB models to regions with marine characteristics similar to those of the training areas. For this purpose, we selected Deokjeok, Seongsan, and Sokcho, all within approximately 100 km of the corresponding study areas and covered by the latest nautical chart data (Table 8). To determine the appropriateness of applying an SDB model in regions with similar coastal characteristics, we compared predictions for Seongsan using two different SDB models (Fig. 15).

Table 8

Information on regions with characteristics similar to the study area.

Area | Deokjeok (Yellow Sea) | Seongsan (South Sea) | Sokcho (East Sea)
Latitude (WGS-84) | 37.20° N–37.31° N | 33.45° N–33.50° N | 38.15° N–38.22° N
Longitude (WGS-84) | 126.10° E–126.15° E | 126.88° E–126.95° E | 128.57° E–128.65° E
Similar regions (SDB model) | Cheonsuman | Hallim | Samcheok
Distance (from the study area) | 85.10 km | 63.63 km | 100.38 km
Tile number | T52SBG | T52SBC | T52SDH
Date | October 9, 2020 | October 29, 2020 | December 5,
Time (KST) | 11:16 | 11:18 | 11:10

Fig. 15

Evaluation results for Seongsan using different SDB models. (a) Real depth map and (b), (d) SDB model result maps. (c), (e) Density scatter plots of the real and estimated depths. (b), (c) Results obtained with the Hallim SDB model. (d), (e) Results obtained with the Samcheok SDB model.


The results from the Hallim SDB model (r = 0.69 and RMSE = 4.7903 m) were more accurate than those from the Samcheok SDB model (r = 0.40 and RMSE = 5.4924 m) (Fig. 15). Although the Samcheok SDB model achieved higher validation accuracy (Table 6), the more accurate predictions obtained with the Hallim SDB model indicate the effectiveness of applying SDB models trained on areas with similar characteristics.

When predictions were made for the three areas using the SDB models trained on areas with similar oceanic environmental characteristics, the results were also similar to those in Sec. 3 (Fig. 16). The RMSE values were 5.8292, 4.7903, and 3.0220 m for Deokjeok, Seongsan, and Sokcho, respectively. Sokcho showed the highest correlation coefficient (r = 0.8848) among the three locations (Table 9).

Fig. 16

Evaluation results for the test areas. (a)–(c) Deokjeok, (d)–(f) Seongsan, and (g)–(i) Sokcho. (a), (d), (g) Real depth maps and (b), (e), (h) SDB model result maps. (c), (f), (i) Density scatter plots of the real and estimated depths.


Table 9

Evaluation results for the test areas using the SDB models.

Area | Deokjeok | Seongsan | Sokcho
RMSE (m) | 5.8292 | 4.7903 | 3.0220
r | 0.3934 | 0.6908 | 0.8848
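The two accuracy metrics reported in Table 9 can be computed as in the following sketch, which assumes the standard definitions of the RMSE and the Pearson correlation coefficient. The function names are illustrative, and the evaluation code actually used in the study is not given in the text.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between real and estimated depths (m)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between real and estimated depths."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)
```

The two metrics are complementary: r measures how well the estimated depths follow the trend of the real depths, whereas the RMSE also penalizes a constant offset, such as a systematic overestimation that leaves the depth pattern intact.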

Analysis of the depth maps and density scatter plots also revealed similarities to the results in Sec. 3. Notably, the data distribution patterns in the density scatter plots were alike. In Seongsan, the results were similar to those in Hallim, with accurate predictions at shallow depths (0 to 5 m) and some overestimations [Fig. 16(f)]. Sokcho’s results showed an inverse proportional pattern, akin to those in Samcheok [Fig. 16(i)]. The results for Deokjeok were similar to those in Cheonsuman, with overestimation at shallow depths and underestimation in deeper waters [Fig. 16(c)]. Therefore, we infer that predictions for regions not included in the SDB model’s training process yield similar results, indicating that these outcomes may be characteristic of specific coastal environments.

Most SDB models are applicable only to shallow water depths of 0 to 20 m, and depth estimates are affected by external factors that alter the apparent reflectivity of the sea surface and the water depth, such as sunlight, waves, and tides. Therefore, it is important to determine how accurately these external factors can be compensated for through measures such as mean filtering and sunglint correction.
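As an illustration of one such compensation step, the sketch below implements a regression-based sunglint correction in the style of Hedley et al. (Ref. 38 in the reference list): for each visible band, R' = R - b (R_NIR - min R_NIR), where b is the slope of the visible band regressed on the NIR band over a sample of optically deep water. Whether the study used exactly this variant is an assumption, and the function names and sample values are hypothetical.

```python
def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def deglint(visible, nir, deep_visible, deep_nir):
    """Remove the glint contribution from one visible band.

    visible, nir: reflectance of the pixels to correct.
    deep_visible, deep_nir: reflectance of a deep-water sample that
    shows a range of glint, used to estimate the regression slope.
    """
    slope = regression_slope(deep_nir, deep_visible)
    min_nir = min(deep_nir)
    return [v - slope * (n - min_nir) for v, n in zip(visible, nir)]


# Hypothetical deep-water sample in which glint raises both bands together.
sample_nir = [0.01, 0.02, 0.03]
sample_blue = [0.05, 0.07, 0.09]
corrected = deglint([0.09], [0.03], sample_blue, sample_nir)
```

The correction relies on the NIR signal over deep water being dominated by surface glint, so pixels with NIR reflectance at the sample minimum are left unchanged.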

Furthermore, since the quality of the ground truth data determines the model’s accuracy, it is necessary to calibrate the estimated depth using accurate composition information for each sea area. The bathymetry data used as the ground truth data must comprise up-to-date data surveyed at approximately the same time as the satellite image acquisition.

5.

Conclusions

Conventional shipborne bathymetry conducted using an echo sounder has many limitations, especially when simultaneous surveys of wide areas (such as ports) are required or when survey areas are difficult to access due to geographic or diplomatic issues. Despite limitations to the estimable depth of water, remote sensing technology can overcome the drawbacks of shipborne bathymetric methods. In this study, we developed SDB models for three seas surrounding the Korean Peninsula: the East Sea (Samcheok), Yellow Sea (Cheonsuman), and South Sea (Hallim). Sentinel-2 satellite images captured from January to December 2020 were used for machine learning model training: seven, six, and six images were used for Cheonsuman, Hallim, and Samcheok, respectively. Based on these images, an RF-based SDB model was constructed. Samcheok showed the best results (RMSE = 2.5861 m; r = 0.8989), while the RMSE values for Cheonsuman and Hallim were 6.4603 and 5.4863 m, respectively.

The primary causes of errors were determined to be turbidity and seabed sediment, and the depths were re-estimated using an SDB model that incorporated NDTI data. Across the study areas, the RMSE decreased by an average of 0.4873 m, with Cheonsuman showing the most prominent improvement (RMSE decreased by 1.1000 m and r increased by 0.3768). Considering the high turbidity of the Yellow Sea, where Cheonsuman is located, this study demonstrated that spatial information on turbidity can improve depth estimation accuracy. However, given the relatively low accuracy (RMSE = 5.3552 m), further research is needed to determine the feasibility of viable depth estimation. In Hallim, for the 0 to 5 m depth range, the error was notably high (RMSE = 7.5959 m) despite the low seabed turbidity, which typically allows for visual surveys.
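For reference, the turbidity index can be sketched as follows, assuming the usual definition NDTI = (red - green) / (red + green) from Lacaux et al. (Ref. 43). For Sentinel-2, the red and green bands would correspond to B4 and B3; this band assignment and the helper names below are assumptions made for illustration, not details confirmed in this section.

```python
import math


def ndti(red, green):
    """Normalized difference turbidity index for one pixel."""
    total = red + green
    if total == 0:
        return math.nan  # undefined when both reflectances are zero
    return (red - green) / total


def with_ndti(band_values, red_index, green_index):
    """Return a copy of one pixel's band values with NDTI appended.

    band_values: reflectance values of the bands used as model inputs.
    red_index, green_index: positions of the red and green bands.
    """
    features = list(band_values)
    features.append(ndti(band_values[red_index], band_values[green_index]))
    return features
```

Appending the index as an extra input column is one simple way to expose turbidity information to a tree-based regressor; the text does not specify how the NDTI data were combined with the band reflectances.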

Further, high-resolution WorldView-3 satellite images with a spatial resolution of 1.2 m were used, along with on-site survey photos, to investigate the study areas. The remote reflectance was attenuated due to the dark-colored basaltic seabed. Such characteristics can contribute to the overestimation of depth, and future research should aim to incorporate various seabed materials into the training data. Unlike previous studies that presented SDB model results for waters with high transparency, this study developed individual SDB models that can be applied to waters with various characteristics and suggested a method for improved results. The SDB results are expected to be used as depth monitoring data for safe ship passage in coastal areas, including ports that require periodic and consistent coastal bathymetry, or as input data for numerical ocean models, contributing to various fields.

Disclosures

The authors declare no conflicts of interest.

Code and Data Availability

Most satellite data originated from the ESA public Copernicus constellations and are free of charge. However, access to the bathymetry data requires permission and can be obtained upon request from the KHOA.

Acknowledgments

This study was conducted with the support of the “Development of marine satellite image analysis application technology” project hosted by the Korea Institute of Marine Science & Technology Promotion (KIMST) funded by the Ministry of Oceans and Fisheries (Grant No. 20210046). The depth data required for this study were provided by the KHOA National Ocean Satellite Center and the Hydrographic Survey Department. We extend our sincere gratitude for their support.

References

1. M. D. M. Manessa et al., “Satellite-derived bathymetry using random forest algorithm and WorldView-2 imagery,” Geoplann. J. Geomat. Plann., 3 (2), 117–126 https://doi.org/10.14710/geoplanning.3.2.117-126 (2016).

2. S. S. J. D. Mudiyanselage et al., “Satellite-derived bathymetry using machine learning and optimal Sentinel-2 imagery in South-West Florida coastal waters,” GISci. Remote Sens., 59 (1), 1143–1158 https://doi.org/10.1080/15481603.2022.2100597 (2022).

3. Z. Wu et al., “Satellite-derived bathymetry based on machine learning models and an updated quasi-analytical algorithm approach,” Opt. Express, 30 (10), 16773–16793 https://doi.org/10.1364/OE.456094 (2022).

4. J. M. Choi, “Possibility of using global bathymetry (GEBCO) data,” J. Kor. Assoc. Profess. Geogr., 45 (4), 581–590 (2011).

5. T. Sagawa et al., “Satellite derived bathymetry using machine learning and multi-temporal satellite images,” Remote Sens., 11 (10), 1155 https://doi.org/10.3390/rs11101155 (2019).

6. H. J. You, D. S. Kim and H. S. Shin, “Evaluation of depth measurement method based on spectral characteristics using hyperspectrometer,” Kor. J. Remote Sens., 36 (2), 103–119 https://doi.org/10.7780/kjrs.2020.36.2.1.2 (2020).

7. L. Meliala, W. A. Wibowo and J. Amalia, “Satellite derived bathymetry on shallow reef platform: a preliminary result from Semak Daun, Seribu Islands, Java Sea, Indonesia,” KnE Eng., 4 (3), 192–202 https://doi.org/10.18502/keg.v4i3.584 (2019).

8. R. C. N. M. Said, M. R. Mahmud and R. C. Hasan, “Evaluating satellite-derived bathymetry accuracy from Sentinel-2A high-resolution multispectral imageries for shallow water hydrographic mapping,” IOP Conf. Ser.: Earth Environ. Sci., 169 (1), 012069 https://doi.org/10.1088/1755-1315/169/1/012069 (2018).

9. J. Zhu et al., “Determine the Stumpf 2003 model parameters for multispectral remote sensing shallow water bathymetry,” J. Coast. Res., 102 (SI), 54–62 https://doi.org/10.2112/SI102-007.1 (2020).

10. S. Rasheed et al., “An improved gridded bathymetric data set and tidal model for the Maldives archipelago,” Earth Space Sci., 8, e2020EA001207 https://doi.org/10.1029/2020EA001207 (2021).

11. A. P. Yunus et al., “Improved bathymetric mapping of coastal and lake environments using Sentinel-2 and Landsat-8 images,” Sensors, 19 (12), 2788 https://doi.org/10.3390/s19122788 (2019).

12. B. Gabr, M. Ahmed and Y. Marmoush, “PlanetScope and Landsat 8 imageries for bathymetry mapping,” J. Mar. Sci. Eng., 8 (2), 143 https://doi.org/10.3390/jmse8020143 (2020).

13. P. Jagalingam, B. J. Akshaya and A. V. Hegde, “Bathymetry mapping using Landsat 8 satellite imagery,” Proc. Eng., 116, 560–566 https://doi.org/10.1016/j.proeng.2015.08.326 (2015).

14. D. R. Lyzenga, N. P. Malinas and F. J. Tanis, “Multispectral bathymetry using a simple physically based algorithm,” IEEE Trans. Geosci. Remote Sens., 44 (8), 2251–2259 https://doi.org/10.1109/TGRS.2006.872909 (2006).

15. Z. Duan et al., “Satellite-derived bathymetry from Landsat-8 and Sentinel-2A images: assessment of atmospheric correction and depth derivation models for shallow waters,” Opt. Express, 30 (3), 3238–3261 https://doi.org/10.1364/OE.444557 (2022).

16. L. Breiman, “Random forests,” Mach. Learn., 45 (1), 5–32 https://doi.org/10.1023/A:1010933404324 (2001).

17. C. Xie et al., “Satellite-derived bathymetry combined with Sentinel-2 and ICESat-2 datasets using machine learning,” Front. Earth Sci., 11, 1111817 https://doi.org/10.3389/feart.2023.1111817 (2023).

18. Z. Wu, Z. Mao and W. Shen, “Integrating multiple datasets and machine learning algorithms for satellite-based bathymetry in seaports,” Remote Sens., 13 (21), 4328 https://doi.org/10.3390/rs13214328 (2021).

19. A. Knudby and G. Richardson, “Incorporation of neighborhood information improves performance of SDB models,” Remote Sens. Appl. Soc. Environ., 32, 101033 https://doi.org/10.1016/j.rsase.2023.101033 (2023).

20. M. Ashphaq, P. K. Srivastava and D. Mitra, “Review of near-shore satellite derived bathymetry: classification and account of five decades of coastal bathymetry research,” J. Ocean Eng. Sci., 6, 340–359 https://doi.org/10.1016/j.joes.2021.02.006 (2021).

21. G. Casal et al., “Satellite-derived bathymetry in optically complex waters using a model inversion approach and Sentinel-2 data,” Estuarine Coastal Shelf Sci., 241, 106814 https://doi.org/10.1016/j.ecss.2020.106814 (2020).

22. H. Meyer et al., “Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation,” Environ. Model. Softw., 101, 1–9 https://doi.org/10.1016/j.envsoft.2017.12.001 (2018).

23. R. Chénier et al., “Consideration of level of confidence within multi-approach satellite-derived bathymetry,” Int. J. Geo-Inf., 8, 48 https://doi.org/10.3390/ijgi8010048 (2019).

24. P. Ploton et al., “Spatial validation reveals poor predictive performance of large-scale ecological mapping models,” Nat. Commun., 11, 4540 https://doi.org/10.1038/s41467-020-18321-y (2020).

25. W. Zhou et al., “Comparison of machine learning and empirical approaches for deriving bathymetry from multispectral imagery,” Remote Sens., 15, 393 https://doi.org/10.3390/rs15020393 (2023).

26. J. Zhong et al., “Nearshore bathymetry from ICESat-2 LiDAR and Sentinel-2 imagery datasets using deep learning approach,” Remote Sens., 14, 4229 https://doi.org/10.3390/rs14174229 (2022).

27. C. Parente and M. Pepe, “Bathymetry from WorldView-3 satellite data using radiometric band ratio,” Acta Polytech., 58 (2), 109–117 https://doi.org/10.14311/AP.2018.58.0109 (2018).

28. S. Shen, “Simulation study on detecting shallow bathymetry via wavelength,” IOP Conf. Ser.: Earth Environ. Sci., 170 (2), 022055 https://doi.org/10.1088/1755-1315/170/2/022055 (2018).

29. A. Sukmono et al., “The extraction of near-shore bathymetry using Sentinel-2A satellite imagery: algorithms and their modifications,” TEM J., 11 (1), 150 https://doi.org/10.18421/TEM111-17 (2022).

31. J. S. Youn et al., “Seasonal variations of Iho and Hamdeok beach sediments in the Jeju Island, Korea,” Econ. Environ. Geol., 41 (2), 243–252 (2008).

32. J. S. Youn and T. J. Kim, “Seasonal variations of Hamo and Hyeopjae beach sediments in the western part of Jeju Island,” J. Kor. Earth Sci. Soc., 32 (3), 265–275 https://doi.org/10.5467/JKESS.2011.32.3.265 (2011).

33. E. Evagorou et al., “Evaluation of satellite-derived bathymetry from high and medium-resolution sensors using empirical methods,” Remote Sens., 14 (3), 772 https://doi.org/10.3390/rs14030772 (2022).

36. K. Matsumoto, T. Takanezawa and M. Ooe, “Ocean tide models developed by assimilating TOPEX/POSEIDON altimeter data into hydrodynamical model: a global and regional model around Japan,” J. Oceanogr., 56, 567–581 https://doi.org/10.1023/A:1011157212596 (2000).

37. I. Caballero and R. P. Stumpf, “Retrieval of nearshore bathymetry from Sentinel-2A and 2B satellites in South Florida coastal waters,” Estuarine Coastal Shelf Sci., 226 (6), 106277 https://doi.org/10.1016/j.ecss.2019.106277 (2019).

38. J. D. Hedley, A. R. Harborne and P. J. Mumby, “Simple and robust removal of sun glint for mapping shallow-water benthos,” Int. J. Remote Sens., 26 (10), 2107–2112 https://doi.org/10.1080/01431160500034086 (2005).

39. S. G. Kim, W. I. Lee and Y. H. Woo, “An analysis on relative importance and priority of hydrographic survey for major ports in South Korea,” J. Kor. Soc. Marine Environ. Saf., 21 (2), 154–163 https://doi.org/10.7837/kosomes.2015.21.2.154 (2015).

40. S. K. McFeeters, “The use of the normalized difference water index (NDWI) in the delineation of open water features,” Int. J. Remote Sens., 17 (7), 1425–1432 https://doi.org/10.1080/01431169608948714 (1996).

41. T. S. Jung, “Sea level change due to nonlinear tides in coastal region,” J. Kor. Soc. Coastal Ocean Eng., 29 (5), 228–238 https://doi.org/10.9765/KSCOE.2017.29.5.228 (2017).

42. Y. K. Choi and J. N. Kwon, “Seasonal variation of transparency in the southeastern Yellow Sea,” J. Kor. Soc. Fish. Ocean Technol., 31 (3), 323–329 (1998).

43. J. P. Lacaux et al., “Classification of ponds from high-spatial resolution remote sensing: application to Rift Valley Fever epidemics in Senegal,” Remote Sens. Environ., 106 (1), 66–74 https://doi.org/10.1016/j.rse.2006.07.012 (2007).

Biography

Tae-Ho Kim is a director at the Department of Remote Sensing, Underwater Survey Technology 21 Inc. He received his PhD in ocean remote sensing from the University of Science and Technology in 2018. His research interests include the application of multi-sensor satellite images to the ocean environment, target recognition and interpretation of ocean phenomena, and algorithm development using AI models.

Biographies of the other authors are not available.

CC BY: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Jae-yeop Kwon, Hye-kyeong Shin, Da-hui Kim, Hyeon-gyu Lee, Jin-kwang Bouk, Jung-hyun Kim, and Tae-ho Kim "Estimation of shallow bathymetry using Sentinel-2 satellite data and random forest machine learning: a case study for Cheonsuman, Hallim, and Samcheok Coastal Seas," Journal of Applied Remote Sensing 18(1), 014522 (12 March 2024). https://doi.org/10.1117/1.JRS.18.014522
Received: 12 November 2023; Accepted: 26 February 2024; Published: 12 March 2024
KEYWORDS
Data modeling

Satellites

Education and training

Satellite imaging

Water

Earth observing sensors

Turbidity
