Abstract
Analyzing satellite images and remote sensing (RS) data using artificial intelligence (AI) tools and data fusion strategies has recently opened new perspectives for environmental monitoring and assessment. This is mainly due to the advancement of machine learning (ML) and data mining approaches, which facilitate extracting meaningful information at a large scale from geo-referenced and heterogeneous sources. To the best of the authors’ knowledge, this paper presents the first review of AI-based methodologies and data fusion strategies used for environmental monitoring. The first part of the article discusses the main challenges of geographical image analysis. Thereafter, a well-designed taxonomy is introduced to overview the existing frameworks by: (i) detecting different environmental impacts, e.g. land use and land cover (LULC) change, gully erosion susceptibility (GES), waterlogging susceptibility (WLS), and land salinity and infertility (LSI); (ii) analyzing the AI models deployed for extracting the pertinent features from RS images, in addition to the data fusion techniques used for combining images and/or features from heterogeneous sources; (iii) describing existing publicly shared and open-access datasets; (iv) highlighting the most frequent evaluation metrics; and (v) describing the most significant applications of ML and data fusion for RS image analysis. This is followed by an overview of existing works and a discussion highlighting some of their challenges, limitations, and shortcomings. To provide the reader with insight into real-world applications, two case studies illustrate the use of AI for classifying LULC changes and monitoring the environmental impacts of dam construction, where classification accuracies of 98.57% and 97.05% have been reached, respectively. Lastly, recommendations and future directions are drawn.
Introduction
The world is witnessing a significant environmental change driven by the increased world population and human activities, such as urbanization, construction of hydroelectric power (HEP) and nuclear power stations, converting land from forests to agriculture, burning fossil fuels, etc. [1]. Satellite imagery and geographical information systems (GIS) data are critical components for environmental monitoring, used to analyze and predict the impacts different activities have on the environment. For example, changes due to dams’ construction can be predicted from different perspectives, including environment, forestry, hydrology, agriculture, and geology. Moreover, the quantity of remote sensing (RS) data gathered by various airborne and spaceborne sensors is increasing exponentially, which requires the use of efficient AI-based algorithms, big data analytics, data fusion mechanisms, and adequate computing resources for improving the performance of environmental monitoring solutions concerning the application at hand [2]. Therefore, satellite RS has been widely used to generate crop type maps at both high spatial and temporal resolutions to resolve rapid phenological transition at the field scale and predict a wide range of environmental impacts [3], [4].
When conducting environmental monitoring, it is of utmost importance to determine temporal changes in soil and in land use and land cover (LULC), together with gully erosion susceptibility (GES), waterlogging susceptibility (WLS), and land salinity and infertility (LSI). These changes occur for several reasons, such as the construction of hydropower stations. Detecting them becomes possible using RS data collected by different devices, e.g. LiDAR images [5], synthetic aperture radar (SAR) images [2], etc. Furthermore, artificial intelligence (AI) tools, such as deep convolutional neural networks (DCNN), deep autoencoders (DAE), generative adversarial networks (GAN), recurrent neural networks (RNN), etc., are playing a crucial role in collecting images, analyzing them, and extracting relevant information [6]. They have received great attention across various domains, including WLS, GES, and LULC mapping. More specifically, DCNNs are the main class of deep learning (DL) algorithms that has been widely investigated, owing to their use of convolution operations to extract image features.
On the other hand, data fusion methods have been widely employed in various research fields (e.g. smart systems [7], hydraulics [8], energy [9], wireless sensor networks [10], healthcare [11], etc.) to aggregate data from different sensors and hence attain better accuracy in extracting the pertinent features than is possible with data from a single sensor/device [12]. In the same manner, various data fusion techniques have been adopted for analyzing different kinds of RS images and obtaining more precise information, e.g. spatio-temporal fusion [13], spectral fusion [6], multi-source data fusion [14], and pixel-level image fusion [15], [16]. For instance, hyperspectral (HS) images usually take the form of cubes that contain two-dimensional spatial information and valuable spectral information in the third dimension. Fusing the data from the large number of spectral bands with the spatial features of the images (e.g. textures, shapes, geometrical structures, etc.) improves environmental monitoring by making it easier to discriminate between image regions.
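The cube structure and the idea of combining spectral and spatial information can be illustrated with a minimal sketch. It is an assumption-laden toy example rather than any method surveyed here: the function name `spectral_spatial_features`, the choice of local mean and local standard deviation as stand-ins for "texture", and the window size are all hypothetical. It performs simple feature-level fusion by appending two spatial features to each pixel's spectrum.

```python
import numpy as np


def spectral_spatial_features(cube: np.ndarray, window: int = 3) -> np.ndarray:
    """Append simple spatial features to every pixel's spectrum.

    cube: hyperspectral cube of shape (height, width, bands).
    Returns an array of shape (height, width, bands + 2).
    The two extra channels (local mean and local standard deviation of the
    band-averaged intensity) are illustrative choices, not a published recipe.
    """
    if cube.ndim != 3:
        raise ValueError("expected a (height, width, bands) cube")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")

    pad = window // 2
    # Collapse the spectral axis into one intensity image for the spatial part.
    intensity = cube.mean(axis=2)
    padded = np.pad(intensity, pad, mode="reflect")
    # All window x window neighbourhoods: shape (height, width, window, window).
    patches = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    local_mean = patches.mean(axis=(2, 3))
    local_std = patches.std(axis=(2, 3))

    spatial = np.stack([local_mean, local_std], axis=2)
    # Feature-level fusion: concatenate spectral and spatial descriptors.
    return np.concatenate([cube, spatial], axis=2)
```

Each fused pixel vector could then be passed to any per-pixel classifier; in practice, learned spatial features (e.g. from a DCNN) would typically replace the hand-crafted statistics used in this sketch.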
To that end, this paper aims at reviewing AI and data fusion-based frameworks proposed to predict and investigate the environmental impacts of dam construction using RS images. In this regard, a generic taxonomy is first adopted to classify the main environmental issues, including WLS, GES, LULC, and LSI. Moving on, existing AI models are split into three main categories, i.e. supervised, semi-supervised, and unsupervised methods. Accordingly, existing AI-based frameworks are discussed with respect to various sub-categories, such as traditional classifiers, clustering models, DCNN, GAN, RNN, and stochastic classification models (SCM). Next, data fusion frameworks used to aggregate different kinds of images and/or features to obtain more precise information are described. Then, existing datasets used to predict environmental impacts are presented, along with the most frequent metrics used to assess the performance of AI and data fusion-based frameworks. Finally, two case studies are introduced, describing the use of AI models to classify LULC changes and the environmental impacts of constructing HEP stations. The structure and contribution of the presented work are portrayed in Fig. 1. Overall, the main contributions of this paper can be summarized as follows:
- A thorough review of existing environmental monitoring works based on AI and data fusion is presented, in which a generic taxonomy is introduced and used to classify them into different categories based on the challenges of RS image analysis, the type of environmental impacts (e.g. WLS, GES, LULC, and LSI), the ML models deployed to learn the classification and prediction tasks, the data fusion schemes deployed to combine various kinds of data, etc. The survey is performed by searching academic databases for peer-reviewed journal articles and conference publications targeting satellite imagery analysis for environmental monitoring using AI and information fusion.
- The most significant applications of ML and data fusion for RS image analysis are also discussed and their open challenges are identified.
- A detailed analysis is conducted, and an in-depth discussion about the surveyed works is presented by (i) emphasizing the importance of using AI and data fusion in image analysis for environmental monitoring, (ii) highlighting their limitations and drawbacks, and (iii) discussing current challenges.
- Two case studies explaining the use of different ML models for detecting LULC change and the environmental impacts of dam construction are presented after briefly describing existing open-access datasets employed for predicting environmental impacts.
- Future research directions towards improving the efficiency, reliability, and accuracy of detecting environmental impacts using AI and data fusion are described, which revolve around using (i) explainable AI for better understanding of the outcomes of AI models, (ii) point clouds for a better description of the objects and scenes in satellite images, and (iii) advanced ML-based fusion techniques for developing smart data fusion mechanisms.
The rest of this paper is organized as follows. Section 2 briefly explains geographical image analysis and highlights the most critical challenges of RS image analysis used for environmental monitoring. Section 3 overviews existing environmental monitoring frameworks by discussing the different environmental impacts, ML tools, AI-based segmentation, evaluation metrics, and public datasets. Section 4 describes the state-of-the-art data fusion techniques utilized to collect data from different image sources. Section 5 describes the most significant applications of ML and data fusion for RS image analysis. Section 6 focuses on the main challenges, limitations, and drawbacks of the state-of-the-art. Two case studies are then presented in Section 7, illustrating how ML algorithms are used to classify LULC changes and detect the environmental impacts of constructing HEP stations. Section 8 derives a set of recommendations and future directions to improve the performance of AI and data fusion-based solutions. Finally, Section 9 summarizes the main findings of this article.
Challenges of remote sensing image analysis
RS image analysis and processing play an essential role in environmental monitoring [17], [18]. Typically, they aim to generate automated spatial datasets and establish spatial relationships [19], [20] from different satellites, such as Landsat, Sentinel, CORONA, MODIS, ASTER, Meteosat, GeoEye, and Maxar [21]. Digital analysis of RS images is progressing rapidly due to the advance of intelligent automatic image interpretation. Even though it was initially introduced to analyze aerial
Overview of AI for environmental monitoring
Data fusion
The data fusion strategy refers to collecting pertinent information from various data sources and consolidating it into fewer data repositories, generally a single one. For example, to detect environmental changes, data are extracted from multiple images, including spatial (e.g. panchromatic) and spectral (e.g. MS) images generated by different RS devices, and then merged into a single image that is more accurate and informative than any individual source image. Overall, data fusion
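The panchromatic/MS merging described in this section can be sketched with a Brovey-style ratio transform, a classical pixel-level pansharpening scheme. It is chosen here purely for illustration and is not claimed to be the technique the surveyed works use. The function name `brovey_pansharpen`, the nearest-neighbour upsampling, and the `eps` guard are assumptions of this sketch, which also presumes the two images are already co-registered.

```python
import numpy as np


def brovey_pansharpen(ms: np.ndarray, pan: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Fuse a low-resolution MS image with a high-resolution panchromatic one.

    ms:  multispectral image, shape (h, w, bands).
    pan: panchromatic image, shape (H, W), with H = k * h and W = k * w.
    Returns a sharpened image of shape (H, W, bands).
    """
    h, w, _ = ms.shape
    big_h, big_w = pan.shape
    if big_h % h or big_w % w or big_h // h != big_w // w:
        raise ValueError("pan size must be an integer multiple of the MS size")
    scale = big_h // h

    # Nearest-neighbour upsampling keeps the sketch dependency-free;
    # real pipelines usually interpolate (bilinear or bicubic).
    ms_up = np.repeat(np.repeat(ms, scale, axis=0), scale, axis=1).astype(float)

    # Band-averaged intensity of the upsampled MS image.
    intensity = ms_up.mean(axis=2)
    # Rescale every band so the fused intensity follows the panchromatic detail.
    ratio = pan / (intensity + eps)
    return ms_up * ratio[..., np.newaxis]
```

By construction, the band average of the output closely matches `pan`, while the ratios between bands (the spectral signature) are inherited from the MS input.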
Applications of ML and data fusion for RS image analysis
This survey discusses the statistical analysis of ML-based and data-fusion-based environmental monitoring methods using RS images. The comprehensive screening of the state-of-the-art has shown a growing interest in DCNN architectures, where the spectral–spatial features of RS images are used for analysis. Specifically, most research focuses on developing classification applications, such as LULC classification, object detection, and scene classification.
Trends analysis and important findings
After overviewing existing environmental monitoring frameworks, it is essential to conduct trends analysis and discussion. This section focuses mainly on describing the state-of-the-art’s main challenges, limitations, and drawbacks. Proportional to the problem complexity, conducting RS-based environmental monitoring using AI and data fusion is computationally demanding, time-consuming, potentially inefficient, and may result in sub-optimal solutions if the exploration space is not covered
LULC classification
This section presents an example of conducting LULC classification using Sentinel-2 satellite images. Accordingly, the EuroSAT dataset has been used, which covers 13 spectral bands, consists of 10 different classes, and includes up to 27,000 annotated and geo-referenced images. Fig. 9 portrays examples of the RS images pertaining to the 10 different classes of the EuroSAT dataset used to train the ML models.
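Classification results on a multi-class dataset of this kind are commonly summarized through a confusion matrix. The following is a minimal sketch of how overall accuracy and per-class recall can be derived from predicted and true labels; the function names and the toy labels in the usage note are hypothetical and do not reproduce any result reported in this paper.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes: int) -> np.ndarray:
    """Count samples per (true class, predicted class) pair."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def overall_accuracy(cm: np.ndarray) -> float:
    """Fraction of all samples that lie on the diagonal."""
    return float(np.trace(cm) / cm.sum())


def per_class_recall(cm: np.ndarray) -> np.ndarray:
    """Correct predictions per class divided by that class's sample count.

    Classes without any true samples get a recall of 0.0 by convention here.
    """
    support = cm.sum(axis=1).astype(float)
    recall = np.zeros_like(support)
    np.divide(np.diag(cm), support, out=recall, where=support > 0)
    return recall
```

For example, with `y_true = [0, 0, 1, 1, 2, 2]` and `y_pred = [0, 0, 1, 2, 2, 2]`, five of six samples are correct, so the overall accuracy is 5/6, while class 1 has a recall of 0.5. For a 10-class problem, `n_classes` would be set to 10.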
To perform a LULC classification, we train three DCNN models, namely shallow CNN
Explainable AI (XAI)
Although data-driven techniques, especially DL models, are achieving state-of-the-art results in most RS image processing applications (LULC change detection, GES, WLS, etc.), their black-box operation still impedes the comprehension of their decision-making, which can conceal biases and/or other shortcomings in RS datasets and models’ performance [377]. Put differently, existing AI models cannot furnish clear and meaningful cognitive or physical interpretations of the models’ features, RS
Conclusion
The research community has given the subject of environmental monitoring using AI and data fusion for RS image analysis a great deal of attention. Geographical image analysis and mapping, extracting multi-modal features using ML, and fusing these heterogeneous features to develop more informative images are three crucial components of these methodologies. However, environmental monitoring using RS images has exclusive and unique characteristics that make it much different from the
CRediT authorship contribution statement
Yassine Himeur: Conceptualization, Formal analysis, Methodology, Writing – review & editing. Bhagawat Rimal: Conceptualization, Formal analysis, Methodology, Writing – review & editing. Abhishek Tiwary: Funding acquisition, Writing – review & editing, Project administration, Supervision. Abbes Amira: Funding acquisition, Writing – review & editing, Project administration, Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This paper was made possible by Research England, Grant Number: QR GCRF 2020/21. The authors are thankful to Mr. Saurav Shrestha from the Institute of Forestry (IOF), Pokhara Campus, Pokhara. The statements made herein are solely the responsibility of the authors.