Comparing thresholding with machine learning classifiers for mapping complex water

Tsitsi Bangira*, Silvia Maria Alfieri, Massimo Menenti, Adriaan van Niekerk

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

92 Citations (Scopus)
180 Downloads (Pure)

Abstract

Small reservoirs play an important role in mining, industry, and agriculture, but their storage levels and stages are highly dynamic. Accurate and up-to-date maps of surface water storage and distribution are invaluable for informing decisions relating to water security, flood monitoring, and water resources management. Satellite remote sensing is an effective way of monitoring the dynamics of surface waterbodies over large areas. The European Space Agency (ESA) has recently launched the Sentinel-1 (S1) and Sentinel-2 (S2) satellite constellations, carrying a C-band synthetic aperture radar (SAR) and a multispectral imaging radiometer, respectively. These constellations improve global coverage of remotely sensed imagery and enable the development of near real-time operational products. This unprecedented data availability creates an urgent need for fully automatic, feasible, and accurate retrieval methods for mapping and monitoring waterbodies. Waterbody mapping can exploit the synthesis of SAR and multispectral remote sensing data to increase classification accuracy. This study compares automatic thresholding with machine learning for delineating waterbodies with diverse spectral and spatial characteristics. Automatic thresholding was applied to near-concurrent normalized difference water index (NDWI) features (generated from S2 optical imagery) and VH backscatter features (generated from S1 SAR data), while machine learning was applied to a comprehensive set of features derived from the S1 and S2 data. During our field surveys, we observed that the waterbodies visited differed in size and in levels of turbidity, sedimentation, and eutrophication. Five machine learning algorithms (MLAs), namely decision tree (DT), k-nearest neighbour (k-NN), random forest (RF), and two implementations of the support vector machine (SVM), were considered. Several experiments were carried out to better understand the complexities involved in mapping spectrally and spatially complex waterbodies. It was found that combining multispectral indices with SAR data is highly beneficial for classifying complex waterbodies, and that the proposed thresholding approach classified waterbodies with an overall accuracy of 89.3%. However, varying concentrations of suspended sediments (turbidity), dissolved particles, and aquatic plants negatively affected the classification accuracies of the proposed method, whereas the MLAs (SVM in particular) were less sensitive to such variations. The main disadvantage of using MLAs for operational waterbody mapping is the requirement for suitable training samples representing both water and non-water land covers. The dynamic nature of reservoirs (many are depleted at least once a year) makes the re-use of training data infeasible. The study found that aggregating (combining) the thresholding results of two features, the S1 VH polarisation and the S2 NDWI, provided better overall accuracies than thresholding any of the individual features considered. The accuracies of this dual thresholding technique were comparable to those of machine learning and may thus offer a viable solution for the automatic mapping of waterbodies.
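The dual-thresholding idea described above can be made concrete with a short sketch. The abstract does not specify the automatic thresholding algorithm, so the example below assumes Otsu's method (via scikit-image); the input array names, the band choices (S2 green and NIR for NDWI, S1 VH backscatter in dB), and the logical-OR aggregation of the two binary masks are illustrative assumptions, not the authors' exact procedure. NDWI follows the standard definition (Green − NIR) / (Green + NIR), and water is taken where NDWI is high or VH backscatter is low.

```python
import numpy as np
from skimage.filters import threshold_otsu

def ndwi(green: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """NDWI = (Green - NIR) / (Green + NIR), e.g. from Sentinel-2
    band 3 (green) and band 8 (NIR) reflectances."""
    return (green - nir) / (green + nir + 1e-10)  # epsilon guards against /0

def dual_threshold_water_mask(green: np.ndarray,
                              nir: np.ndarray,
                              vh_db: np.ndarray) -> np.ndarray:
    """Illustrative dual thresholding: threshold NDWI and S1 VH
    backscatter (dB) independently with Otsu's method, then aggregate
    the two binary masks. The logical-OR aggregation is an assumption,
    not necessarily the rule used in the paper."""
    idx = ndwi(green, nir)
    water_optical = idx > threshold_otsu(idx)   # high NDWI -> water
    water_sar = vh_db < threshold_otsu(vh_db)   # low VH (smooth surface) -> water
    return water_optical | water_sar

# Quick demonstration on synthetic arrays (stand-ins for real scenes):
rng = np.random.default_rng(0)
green = rng.uniform(0.02, 0.30, (100, 100))
nir = rng.uniform(0.01, 0.40, (100, 100))
vh_db = rng.normal(-18.0, 4.0, (100, 100))
mask = dual_threshold_water_mask(green, nir, vh_db)
print(f"water fraction: {mask.mean():.2%}")
```

On real data the inputs would be co-registered, atmospherically corrected S2 reflectance bands and a calibrated, near-concurrent S1 VH scene rather than the synthetic arrays used here.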

Original language: English
Article number: 1351
Number of pages: 21
Journal: Remote Sensing
Volume: 11
Issue number: 11
DOIs
Publication status: Published - 2019

Keywords

  • Machine learning
  • Optically complex
  • Remote sensing
  • Sentinel-1
  • Sentinel-2
  • Thresholding
  • Waterbody mapping
