1 Introduction
Unmanned aerial vehicles (UAVs) began to be used during World War II to perform functions that put soldiers' lives at risk, such as territory reconnaissance, and they were sometimes also used for attacks [1]. In the past, UAVs mainly played a surveillance role; today, because they can be controlled from a long distance, they play a more active role in many areas such as agriculture and livestock, the wind industry, advertising, civil construction, workplace safety, and border security [1].
UAVs are often employed in high-risk environments. The ability to sense and avoid obstacles and to rebuild their flight paths is therefore an important feature that UAVs should possess, and the corresponding algorithms should be embedded in their guidance and control systems [2]. Among the onboard subsystems, the navigation system has top priority because a fast and well-developed autonomous navigation system makes UAV operators' work easier.
Autonomous navigation requires a robust and reliable self-localization ability. One strategy for UAV autonomous navigation is to combine global navigation satellite systems (GNSSs), such as the global positioning system (GPS), with an onboard inertial navigation system (INS) sensor for pose and attitude estimation. Complementing the INS with satellite positioning information is important because the INS accumulates drift error, which could lead to a great divergence between the estimated position and the actual position [3]. Many techniques can be used to address GNSS-INS data fusion, such as sequential Bayesian filtering, including Kalman filters [4,5], extended Kalman filters [6], and particle filters [7]. Metaheuristic optimization approaches have also been applied to design integrated navigation systems based on GNSS-INS [8]. However, GNSS-INS fusion can suffer reliability issues, mainly if there is any obstacle between the UAV and the emitting satellites or in the presence of intentional or unintentional interference. Furthermore, the most common malicious attacks to disrupt the position, navigation, and time derived from GNSSs are spoofing (allowing the attacker to take control and/or making the receiver calculate a false position) and jamming (overpowering GPS satellite signals locally so that the receiver can no longer operate).
In addition, natural phenomena can interfere with the propagation of GNSS signals. The ionosphere is one such atmospheric layer: ionized by solar radiation, it extends from about 48 km up to 965 km in altitude [9]. It is not, however, a homogeneous layer. Disturbances can appear in it that negatively affect the propagation of electromagnetic waves, generating random fluctuations in their amplitude and phase. This kind of disturbance is called scintillation, and it can fully disrupt the emitted GNSS signal [10]. One permanent ionospheric disturbance is the equatorial ionization anomaly [11], sometimes also named the equatorial fountain, which increases the electron density over the regions north and south of the magnetic equator; another factor associated with ionospheric scintillation is ionospheric bubbles [12–14]. The effects of ionospheric bubbles over the Brazilian territory can extend up to 30 degrees of south latitude, and this is a seasonal phenomenon occurring mainly during the summer period in the Southern Hemisphere [15].
Vision-based localization and vision-aided localization have become the predominant solutions to replace or supplement GNSS-INS fusion in recent years [16]. However, the main disadvantage of these approaches is the computational load due to the vast amount of data to analyze and the complexity of interpreting them. Vision-based UAV localization covers relative visual localization (RVL) and absolute visual localization (AVL), also sometimes called frame-to-frame localization and frame-to-reference localization, respectively [16]. AVL is performed by matching or registering the current view of the UAV against a visual memory built from previously acquired data, a process called image-matching. Immunity to drift is achieved by ensuring complete independence between successive pose estimates. The most used image-matching techniques are template matching, feature point matching, deep learning matching, and visual odometry matching.
In this paper, the AVL concept is studied by using images from a UAV equipped with synthetic aperture radar (SAR) sensors. AI-based image processing techniques in this area are well-known and outperform traditional image-matching methods. Numerous studies have focused on convolutional neural networks [17,18], weightless neural network architectures [19], self-configured neural networks [20,21], radial basis function neural networks (RBF-NNs) [22], deep learning [23,24], training of kernels [25], Siamese networks [26], and others. However, to the best of our knowledge, there have been no reports on machine learning (ML) techniques based on statistical learning theory (SLT).
Here, to address the challenge of image edge detection in an image-matching system, UAV pose estimation based on ML was applied, and the support vector machine (SVM) regression model (also known as SVR) was used to predict "edge" and "non-edge" patterns. The regression model of the proposed method was developed in MATLAB from a binary image database. The prediction phase was designed for a field programmable gate array (FPGA) device, with 18-bit fixed-point data output from the training phase. This design was implemented with dynamic partial reconfiguration (DPR). The SVR prediction phase used DPR in its synchronous datapath to modify the hardware granularity and present an adaptive layout, which has not been reported previously. As a result, the granularity of the reconfigurable region was gradually increased, and three architectures (Architecture N#1, Architecture N#3, and Architecture N#9) dynamically reconfigured in a Zedboard ZYNQ-7 device were realized.
2 Related work
UAV pose estimation techniques based on image edge detection have been widely investigated, including AI-based ones [27,28]. However, only a few studies have applied deep learning-based and bio-inspired methods to realize edge detection [8,29]. Yang and co-authors overviewed the representative edge and object contour detection methods reported in the past two decades [30]. Among them, two methods have gained much attention. One was proposed by Al-Amaren et al. [29], who applied a new visual geometry group network-16 layers (VGG-16)-based deep convolutional neural network (DCNN) for edge detection with residual learning and demonstrated that this methodology outperforms all the existing VGG-16-based techniques with low complexity. The other is the multi-scale representation. A bi-directional pyramid network was constructed by combining two pyramid networks (a down-sampling pyramid network and a lightweight up-sampling pyramid network) with a backbone network [31]. This contributed to a higher training speed and equivalent test accuracy. However, intense computing is a prerequisite for the designed neural network architectures, which indicates that high-performance computing systems are required.
One of the UAV pose estimation techniques integrates INS with GNSS, where metaheuristic algorithms have been proposed to achieve pose estimation [8]. The corresponding simulation results are superior to those of the genetic algorithm (GA) and particle swarm optimization (PSO) in terms of navigation performance.
Due to its superiority, the FPGA has gained increasing interest for real-time, reliable, and low-cost embedded systems, because FPGAs offer higher computation efficiency than central processing units (CPUs) and graphics processing units (GPUs), especially for ML [32]. Its reconfigurability makes the FPGA a promising candidate to realize high-speed operation. It can perform well with less execution time at a low cost and low power consumption. Besides, it has various unique design advantages including reliability, long-term maintenance, and flexibility [33]. In particular, DPR, as a modern capability of FPGAs, enables the user to reconfigure part of the device dynamically while the rest stays in normal operation [34].
In a real-time image filtering and edge detection system, which needs a real-time display of the image, Sun et al. used Gaussian filtering and the Sobel edge processing algorithm to process the image edges [35]. In Ref. [36], the same problem was addressed by image filtering and edge detection, where a look-up table (LUT), instead of a multiplier, and a distributed algorithm were applied. However, there is still not enough information about the FPGA implementation parameters.
Vivo et al. developed a mono-dimensional noise-resistant algorithm for edge detection [37]. Such an algorithm guarantees fast computation, making it very attractive for real-time image processing, remote sensing, and UAV surveillance. Kaur et al. [38] proposed an edge detection technique based on the Riesz fractional derivative (RFD) in the fractional Fourier transform (FrFT) domain and demonstrated that the proposed approach is highly efficient. However, both were only validated in simulation, and no experimental implementation was conducted. Although Zhang et al. applied the Sobel edge detection algorithm in an FPGA-based image processing system [39], it cannot process in real time. Moreover, neither the power and area consumption in the FPGA device nor the time needed to output the processed image was investigated. A quaternion-based improved cuckoo algorithm for faster processing was proposed and experimentally demonstrated, and its processing time and quality were evaluated [40]. However, the edges of the output images were not refined.
Based on image convolution from segmented images, Conte and Doherty used Sobel's algorithm for edge extraction [41]. Similarly, with the same UAV flight experimental data, Braga et al. [42] performed image convolution but adopted an optimized neural network to execute the edge extraction. Here, an optimization problem was solved by calculating the hyperparameters with the metaheuristic called the multi-particle collision algorithm [43]. Such an edge detection method is capable of localizing the UAV more precisely (with a smaller error) and faster than Sobel's algorithm [42]. Braga applied convolution between segmented images/data for UAV positioning estimation [27] and found that the multi-layer perceptron neural network (MLP-NN) identified image edges better than both Sobel's and Canny's algorithms. The neural network was implemented on CPU and FPGA. Two pieces of FPGA-embedded hardware, Raspberry PI Model B-1 (FPGA Spartan-6 LX9) and Zybo Zynq 7000 (FPGA Artix-7), were employed. The results verify that the FPGA Artix-7 processes faster, while the FPGA Spartan-6 LX9 consumes less energy. Employing image convolution for UAV positioning with SAR images, Silva et al. [28] also evaluated the edge detection performance of three different algorithms: Canny's algorithm, RBF-NN, and a fuzzy system. Their results show that RBF-NN performs best, with a positioning error of 34.8 m, smaller than the 63.6 m of Canny's algorithm and the 40.9 m of the fuzzy system. In Ref. [44], a system was developed with only a downward-facing monocular RGB camera on the UAV, where the UAV imagery is compared and aligned with pre-existing satellite imagery of the flight location. This results in an average positioning error smaller than 8 m.
In brief, the representative studies overviewed in this paper are summarized in Table 1, where their applications, characteristics, and methods are analyzed. Evidently, no programmable hardware implementation has yet applied an edge detection algorithm to UAV pose estimation while reducing the cost, power consumption, and response time of the system. Also, no flexible system is available.

Table 1. Overview of the representative related studies.
Applications | Characteristics | Methods
Edge detection using residual learning | A residual deep neural network based on the VGG-16 architecture with deep supervision is developed. | DCNN [29]
Edge detection using two pyramid networks | A down-sampling pyramid network and a lightweight up-sampling pyramid network are constructed to enrich the multi-scale representation from the encoder and decoder, respectively. | Multi-stream learning approach [31]
Real-time image filtering and edge detection | The image information is collected by the camera, Gaussian filtering is applied to remove noise, then Sobel processing is performed, and the image edge processing is finally realized. | Gaussian filtering and Sobel edge processing algorithms implemented on FPGA [35]
Real-time image filtering and edge detection | The image filtering and edge detection are investigated and analyzed, where an LUT is applied instead of a multiplier and a distributed algorithm is used in terms of hardware. | Method based on FPGA [36]
Integrated navigation systems | The proposed metaheuristic algorithms are reviewed and compared with the GA and PSO algorithms. | Metaheuristic algorithms [8]
Edge detection | A real-time data-driven fire propagator is used to support wildfire fighting operations and to facilitate the risk assessment and decision-making process. | Mono-dimensional noise-resistant algorithm [37]
Edge detection | The acquisition, storage, and display of image data are completed by an FPGA-based image processing system, and the Sobel edge detection algorithm is processed and implemented. | Sobel edge detection algorithm implemented on FPGA [39]
Edge detection | The RFD mask used for edge detection is obtained by using various interpolation methods. The mask size is selected based on the figure of merit and edge preservation index. The edges obtained with the proposed approach in the FrFT domain are further used for image enhancement. | RFD in the FrFT domain [38]
Processing of colored UAV images | A novel guiding equation is used to optimize the positions of the improved cuckoo algorithm before the Levi flight, and after the Levi flight, a novel disturbance equation is applied to obtain a varied location for the next step. | Novel quaternion-based improved cuckoo algorithm [40]
Edge extraction | This is the original strategy of applying image convolution from segmented images. | Sobel's algorithm [41]
Image convolution for UAV positioning estimation | In terms of image edge identification, Sobel's and Canny's algorithms are compared with MLP-NN. | Sobel's, Canny's, and MLP-NN algorithms, where the neural network is implemented on both CPU and FPGA [27]
3 Methodology
3.1 Image processing
The UAV coordinates can be identified from the maximum convolution between the segmented images from the UAV (in the thermal infrared band) and the georeferenced Google Earth images. To estimate the UAV position, the procedure shown in Fig. 1 is conducted in this paper; a short code sketch of the pipeline follows the procedure list below.

Figure 1. Flowchart of the procedure to estimate the UAV position with the proposed SVR technique.
1) The RGB-color images are mapped into gray-scale images, according to (1) [45].
$ \mathrm{grayscale}=0.2989R+0.5870G+0.1140B $ (1)
where $ R $, $ G $, and $ B $ are the pixel intensities from the red, green, and blue channels, respectively, and grayscale is the intensity of the obtained gray-scale image after mapping.
2) The obtained gray-scale image is then transformed into a binary image, i.e., each pixel takes one of two values, 0 (black) or 1 (white). To decide whether each grayscale value falls below the threshold or not, the image histogram is first calculated, and then a threshold value is selected to minimize the variance within the grayscale-level classes of the image [46].
3) Based on the binary images, an edge detection scheme, the SVR approach, is applied to produce segmented images, which will be described in subsection 3.2 in detail.
4) Finally, image convolution between two segmented images is computed by
$ c\left(s,\;t\right)=\sum _{x}\sum _{y}f(x,\;y)\,w(x-s,\;y-t) $ (2)
where $ f(x,\;y) $ is the georeferenced image (from Google Earth) and $ w(x-s,\;y-t) $ is the UAV image; $x$ and $y$ are the line and column of the pixel position in the two-dimensional (2D) matrix, respectively; $s$ and $t$ are the line and column displacements, respectively; $ c\left(s,\;t\right) $ is the convolution output related to the correlation of the two input 2D images, whose maximum value identifies the UAV position.
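To make the pipeline concrete, the sketch below implements steps 1), 2), and 4) in Python/NumPy under stated assumptions: step 3 (the SVR edge detector of subsection 3.2) is omitted, the threshold search is our Otsu-style reading of the histogram-based binarization in Ref. [46], and the correlation of (2) is realized with scipy.signal.correlate2d. The paper's own implementation is in MATLAB; all function names here are illustrative.

```python
import numpy as np
from scipy.signal import correlate2d

def to_grayscale(rgb):
    # Step 1), Eq. (1): weighted sum of the R, G, and B channels
    return 0.2989 * rgb[..., 0] + 0.5870 * rgb[..., 1] + 0.1140 * rgb[..., 2]

def binarize(gray):
    # Step 2): pick the threshold maximizing the between-class variance
    # of the histogram (equivalently, minimizing the variance within
    # the two grayscale-level classes)
    hist, _ = np.histogram(gray, bins=256, range=(0.0, 256.0))
    total = gray.size
    mu_total = np.dot(np.arange(256), hist) / total
    best_t, best_var, w_b, mu_b = 0, 0.0, 0.0, 0.0
    for t in range(256):
        w_b += hist[t] / total        # cumulative background weight
        mu_b += t * hist[t] / total   # cumulative background moment
        if w_b == 0.0 or w_b == 1.0:
            continue
        var_between = (mu_total * w_b - mu_b) ** 2 / (w_b * (1.0 - w_b))
        if var_between > best_var:
            best_var, best_t = var_between, t
    return (gray > best_t).astype(np.uint8)  # 0 = black, 1 = white

def estimate_position(georef_seg, uav_seg):
    # Step 4), Eq. (2): the argmax of the correlation surface c(s, t)
    # between the two segmented images identifies the UAV position
    c = correlate2d(georef_seg.astype(float), uav_seg.astype(float), mode="valid")
    return np.unravel_index(np.argmax(c), c.shape)
```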
As a validation, an image of 810×2000 pixels was adopted, and a window of 80×140 pixels was extracted. In the segmentation process, the individual pixels within this window were examined to detect edges. With the image processing method proposed in Ref. [28], a new image was generated in which each pixel was labeled as either 0 or 1. The training dataset was composed of an image separated into 208 frames of 3×3 pixels each. Among them, 24 different examples were selected to represent "edge" patterns and 2 examples to represent "non-edge" patterns, as shown in Fig. 2.

Figure 2. Edge and non-edge image patterns for the training phase input.
The classification phase dataset comprises 88 frames of 3×3 pixels each and is independently and identically distributed, i.e., it is entirely different from the training dataset. The classification phase on hardware was then designed to process frames of 3×3 pixels at a time because images can easily be fitted into this frame size. Moreover, the hardware implementation speed is proportional to how often each frame is processed.
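How the 3×3 frames are tiled out of the segmented window is not fully specified above; a non-overlapping tiling is one plausible reading, sketched below with illustrative names.

```python
import numpy as np

def extract_frames(binary_img, size=3):
    # Tile a binary image into non-overlapping size x size frames; each
    # frame is later flattened into a row of 9 pixels (see (4) below)
    h, w = binary_img.shape
    return [binary_img[r:r + size, c:c + size]
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]
```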
3.2 SVR modeling
In ML, Vapnik's group developed SVM at the end of the 20th century [47,48]. Based on SLT [49] and the Vapnik-Chervonenkis theory [50], SVM is known as one of the most robust generalization algorithms for assigning new examples to one category or another. It is a non-probabilistic binary linear classifier, which builds a maximum-margin hyperplane to separate the two classes in classification. It can solve non-linear problems through kernel functions (RBF, polynomial function, and hyperbolic tangent function) and classify even more than two classes with multiclass techniques.
As one of the SVM versions for the regression problem [51], SVR has also been proven to be an effective tool in real-value function estimation. As a supervised-learning approach, SVR trains using a symmetrical loss function that penalizes high and low misestimates equally. Using Vapnik's ε-insensitive approach, a flexible tube of minimal radius is formed symmetrically around the estimated function, such that absolute errors less than a certain threshold ε are ignored both above and below the estimate. In this manner, points outside the tube are penalized, but those within the tube, either above or below the function, receive no penalty. One of the main advantages of SVR is that its computational complexity is independent of the dimensionality of the input space. Additionally, it has an excellent generalization capability with high prediction accuracy [52].
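As a one-line illustration of the ε-insensitive idea (the function name and the value of ε here are ours, purely for illustration):

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    # Zero inside the tube of radius eps, linear penalty outside it
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

# Errors of 0.05 and 0.30 with eps = 0.1 yield losses of 0.0 and 0.2
```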
In the SVR training, the objective function $ \mathop {{\mathrm{min}} }\limits_{\boldsymbol{\omega}} \;({1}/{2}){\left\| {\boldsymbol{\omega}}\right\|}^{2} $ should be found subject to (3), where $\left\|\boldsymbol{\omega}\right\|$ is the magnitude of the normal vector to the surface being approximated by this minimization and $\boldsymbol{\omega}$ is the weight vector.
$ \left|{y}_{i}-\left\langle{\boldsymbol{\omega}},\;{x}_{i}\right\rangle-b\right| < \epsilon $ (3)
where $ {x}_{i} $ is a training sample with the labeled value $y_i$. The inner product plus intercept, $ \left\langle{\boldsymbol{\omega}},\;{x}_{i}\right\rangle+b $, is the prediction $ f({x}_{i}) $ (technically known as the multivariate regression) for that sample, and $ b $ is a bias.
The SVR algorithm used in this paper consists of two phases: the training phase, which fits an SVR model, and the prediction or classification phase, which returns a vector of predicted class labels for the predictor data.
The SVR accuracy was analyzed for this image edge detection system, and the kernel function with the best response was identified. The training and classification phases were processed in MATLAB to test the linear, RBF (also known as Gaussian), and polynomial kernel functions. The Gaussian function was found to be the best kernel function, yielding as training output 71 support vectors (SVs), 71 α values (α is the Lagrange multiplier that comes from the training phase as part of the SVR mathematical modeling), 9 σ values (σ is the parameter that adjusts the Gaussian curve drawing the separation between the two classes, and it also comes from the training phase), and one bias value.
The input data X with 208 frames of 3×3 pixels each is organized in a matrix, as shown in (4).
$ {{\mathbf{X}}_{208 \times 9}} = \left[ \begin{matrix} F_{1}^{1} & F_{2}^{1} & \cdots & F_{9}^{1} \\ F_{1}^{2} & F_{2}^{2} & \cdots & F_{9}^{2} \\ \vdots & \vdots & \ddots & \vdots \\ F_{1}^{208} & F_{2}^{208} & \cdots & F_{9}^{208} \end{matrix} \right] $ (4)
where each row corresponds to one frame F and each column to one pixel of that frame; F is the representation of each frame, and it has 9 pixels.
Moreover, as supervised learning, the ML method is labeled by a matrix of desired outputs, as shown in (5).
$ {{\mathbf{Y}}_{208 \times 1}} = \left[ \begin{matrix} Y_{{\mathrm{d}}1} \\ Y_{{\mathrm{d}}2} \\ \vdots \\ Y_{{\mathrm{d}}208} \end{matrix} \right] $ (5)
where each element of the matrix, Ydi, i = 1, 2, ···, 208, is equal to '1' or '0'. If it is '1', then the corresponding frame of the matrix X is an "edge" pattern; if it is '0', then the corresponding frame of the matrix X is a "non-edge" pattern. The rows of the two matrices are correlated. For example, the input data $ {F}_{P}^{47} $ is related to the desired output Yd47, where the 47th frame is analyzed, and P indexes the respective vector of nine pixels representing this 47th frame.
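A minimal sketch of assembling (4) and (5) from the 3×3 frames (variable names are illustrative):

```python
import numpy as np

def build_training_matrices(frames, labels):
    # X: one row per frame, one column per pixel, shape (208, 9) as in (4)
    X = np.array([f.reshape(9) for f in frames])
    # Y: desired outputs, 1 for "edge" and 0 for "non-edge", as in (5)
    Y = np.array(labels, dtype=int).reshape(-1, 1)
    assert X.shape[0] == Y.shape[0]  # rows of X and Y stay correlated
    return X, Y
```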
The pseudocode of the SVR prediction phase is shown step by step in Table 2 (a runnable sketch follows the table). The "for" loops of lines 1–3 navigate the matrices SV and Test position by position. The square difference is calculated in lines 5–11, the exponential function is calculated in line 14, and the result is multiplied by α (Alpha in Table 2) in line 15. After the "for" loops over the SV matrix finish, the adder tree in line 18 sums up all the results of the exponential computation, and the bias value is added to the adder tree result in line 19. Line 20 ends the "for" loop over the Test matrix because each SV matrix is wholly processed for each Test datapoint. Finally, the result is compared with '0' in lines 21–26: if it is not smaller than '0', then the respective Test datapoint belongs to the class processed by the respective SV matrix; otherwise, it does not.

Table 2. Description of Algorithm 1.
Algorithm 1: SVR prediction phase
Require: SV; Alpha; Bias; Sigma; Test
1: for cont = 1:size(Test, 1) do
2:   for j = 1:size(SV, 1) do
3:     for i = 1:size(SV, 2) do
4:       if (i ≥ 1) && (i < size(SV, 2)) do
5:         aux = (SV(j, i) – Test(cont, i))²
6:         aux1 = (SV(j, i+1) – Test(cont, i+1))²
7:         SqDiff(j, i) = sqrt(aux + aux1)
8:       else
9:         aux = (SV(j, i) – Test(cont, i))²
10:        aux1 = (SV(j, 1) – Test(cont, 1))²
11:        SqDiff(j, i) = sqrt(aux + aux1)
12:      end if
13:      EXPin(j, i) = –SqDiff(j, i) / Sigma(i)
14:      EXPout(j, i) = exp(EXPin(j, i))
15:      AlphaMult(j, i) = Alpha(j) * EXPout(j, i)
16:    end for
17:  end for
18:  adderTree(cont, 1) = sum(AlphaMult)
19:  BiasSum(cont, 1) = adderTree(cont, 1) + Bias
20: end for
21: for i = 1:size(BiasSum, 1) do
22:   if BiasSum(i, 1) ≥ 0 then
23:     Class(i, 1) = 1
24:   else
25:     Class(i, 1) = 0
26:   end if
27: end for
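For readers who prefer runnable code, below is a direct Python translation of Algorithm 1. It keeps the wrap-around pairing of dimensions from lines 4–12 and interprets the MATLAB-style sum of line 18 as a sum over all entries of AlphaMult; it is a sketch of our reading, not the deployed MATLAB/VHDL code.

```python
import math

def svr_predict(SV, alpha, bias, sigma, test):
    # SV: 71 support vectors of 9 values each; alpha: 71 Lagrange
    # multipliers; bias: scalar b; sigma: 9 per-dimension kernel
    # parameters; test: frames of 3x3 pixels flattened to 9 values
    n_dim = len(SV[0])
    classes = []
    for frame in test:                              # line 1
        acc = 0.0
        for j, sv in enumerate(SV):                 # line 2
            for i in range(n_dim):                  # line 3
                # Lines 4-12: pair dimension i with its neighbor,
                # wrapping the last dimension around to the first
                k = i + 1 if i < n_dim - 1 else 0
                sq_diff = math.sqrt((sv[i] - frame[i]) ** 2 +
                                    (sv[k] - frame[k]) ** 2)
                # Lines 13-15: kernel evaluation scaled by alpha
                acc += alpha[j] * math.exp(-sq_diff / sigma[i])
        bias_sum = acc + bias                       # lines 18-19
        classes.append(1 if bias_sum >= 0 else 0)   # lines 21-26
    return classes
```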
4 FPGA implementation: DPR based on the SVR prediction phase
The mathematical formulation of the proposed SVR datapath design is based on Ref. [53], where the SVM datapath is reformulated for the regression problem; on this basis, the UAV pose estimation system is realized.
To standardize the nomenclature, we refer to the SVR prediction phase as the SVM classification phase because MATLAB itself uses this name equivalence. Furthermore, in hardware, the computation done for the classification phase is similar to that for the prediction phase, with only minor modifications due to the kernel being applied per dimension on the prediction datapath; the components are the same. This can be explained and confirmed by the calculation details of the SVR prediction phase shown in Fig. 3.

Figure 3. Block diagram of the SVR prediction phase (standardized to the SVM classification phase).
According to Fig. 3, all 71 machines are calculated, accumulated, and added to the bias so that the decision-maker SGN can settle to which class the related vector Test (one frame of 3×3 pixels) belongs. SGN denotes the sign function, whose output is 1 or 0 depending on the input condition; based on the most significant bit of the result, the SGN block assigns the class to an "edge" or a "non-edge" pattern. In other words, if the result is 1, then the datapoint representing the input data Test is inside the "prediction maker tube" of the SVR datapath; otherwise, it is outside the tube.
Figs. 4 and 5 describe the internal processes of each machine and each neuron, respectively. Notably, the neurons used in this project differ from those typically found in neural networks because SVM is a non-probabilistic algorithm, as explained in subsection 3.2. However, for simplicity and ease of understanding, these components are referred to as neurons. Each neuron processes one dimension/feature of the input data, because SVM solves non-linear mapping by transforming the input space into a high-dimensional feature space. In other words, each feature represents one dimension, and each datapoint is analyzed inside a coordinate space whose axes represent the data features.

Figure 4. Details of all machines of the SVM classifier.

Figure 5. Designed neuron for the proposed project.
In this case, SVM builds 9 dimensions because each frame is represented by 9 pixels, so there are 9 neurons. Besides, 71 machines are adopted because the function fitrsvm, used in MATLAB for training, finds this to be the optimum number of machines for Bayesian optimization to reduce the generalization error.
The Gaussian kernel function is calculated through the EXP block, which is based on LUTs and parametrized by σ values, according to (6).
$ K\left({\mathbf{X}}_{i},\;{\mathbf{X}}_{j}\right)=\mathrm{exp}\left(-\frac{{\left\|{\mathbf{X}}_{i}-{\mathbf{X}}_{j}\right\|}^{2}}{2{\sigma }^{2}}\right) $ (6)
where X is the input data of the kernel function K, represented in the Cartesian coordinate system by i and j. SV and Test are input through two first-in first-outs (FIFOs) because there are nine values of SV and nine values of Test. The α values, however, are input through one register because they are loaded just once for each machine, as shown in Fig. 6. The signal Load_Alpha depends on the signal Count_Mach, which counts every time a machine is completed; a new α value is then loaded into the circuit (see Figs. 6 and 7). The σ values are not loaded into the architecture because they are intrinsic to the EXP block. The complete architecture of the SVM classifier is thus composed of a finite state machine (FSM) and a pipelined datapath, as shown in Fig. 7. Ten control signals from the FSM control the datapath, each with its respective function, as shown in Table 3. The FSM has seven states, as shown in Fig. 8. The corresponding stages and output variables are shown in Fig. 9, where each stage is responsible for controlling part of the SVM datapath: S0 initializes the entire circuit; S1 loads the FIFOs; S2 calculates the square difference of the neuron (see Fig. 5); S3 calculates the adder (+) and EXP blocks of the neuron (see Fig. 5); S4 calculates the multiplication (×) by the alpha block of the neuron (see Fig. 5); S5 calculates the accumulator block of the datapath (see Fig. 3); S6 calculates the adder (+) and SGN blocks, i.e., the decision-maker block, of the datapath (see Fig. 3).
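Returning to the EXP block: a behavioral sketch of a LUT-based exponential is given below. It follows the datapath form EXPin = –SqDiff/Sigma(i) from Algorithm 1; the number of table entries and the bounded input range are illustrative assumptions, not values taken from the design.

```python
import math

def build_exp_lut(sigma, n_entries=1024, max_in=8.0):
    # Precompute exp(-d / sigma) for quantized distances d in [0, max_in);
    # one table per dimension bakes its sigma value into the LUT contents
    step = max_in / n_entries
    return [math.exp(-(k * step) / sigma) for k in range(n_entries)]

def exp_lut(lut, d, max_in=8.0):
    # Nearest-entry lookup, saturating at the last table entry
    idx = min(int(d / max_in * len(lut)), len(lut) - 1)
    return lut[idx]
```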

Figure 6. Designed circuit used to load the α values.

Figure 7. Proposed SVM architecture (FSM with the pipelined datapath).

Table 3. Description of control signals and the corresponding functions.
Signal | Function
Load_SV | It loads FIFOs with the SV values.
Load_Test | It loads FIFOs with the Test values.
Clear_FIFOs | It is the command to clear all FIFOs.
Load_Square | It loads the D-flip-flop registers of S2: Square difference.
Load_AdderEXP | It loads the D-flip-flop registers of S3: Adder + EXP_function.
Load_AlphaMult | It loads the D-flip-flop registers of S4: Alpha_Mult.
Load_Accum | It loads the D-flip-flop registers of S5: Accumulator.
Load_AdderSGN | It loads the D-flip-flop registers of S6: Adder_Bias + SGN.
Clear_Accum | It clears the accumulator.
Reset_ALL_Regs | It resets all datapath registers.

Figure 8. FSM specification responsible for controlling the SVM classifier datapath.

Figure 9. Stages and output variables of the FSM controller.
The signal Full_FIFOs checks whether both FIFOs are full. The signal Count_Accum checks whether the accumulator counter block is done, i.e., whether it has been processed 71 times, because it depends on the process of each machine, as shown in Fig. 10. The accumulator counter block depends on the machine counter block of Fig. 11, which verifies whether the neuron has processed nine times in Architecture N#1 or three times in Architecture N#3. To summarize, after nine or three neuron processes in each machine (depending on the architecture), Count_Mach rises, and the accumulator starts to count. The accumulator finishes when Count_Mach has risen 71 times, i.e., a total of 71 machines have been processed. In Architecture N#9, the Count_Mach signal does not exist because the nine neurons inside each machine process simultaneously, so the accumulator counter block accumulates after each machine is processed.
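The counter chain can be summarized with a small behavioral model (a software sketch of Figs. 10 and 11, not the VHDL itself):

```python
def count_accum_done(neuron_done_events, neurons_per_machine=9, n_machines=71):
    # Count_Mach rises once every `neurons_per_machine` neuron completions
    # (9 in Architecture N#1, 3 in Architecture N#3); Count_Accum rises
    # after 71 machines, signaling that the accumulator has finished
    neuron_count, machine_count = 0, 0
    for _ in range(neuron_done_events):
        neuron_count += 1
        if neuron_count == neurons_per_machine:   # Count_Mach rises
            neuron_count = 0
            machine_count += 1
            if machine_count == n_machines:       # Count_Accum rises
                return True
    return False

# Architecture N#1 finishes after 9 * 71 neuron completions:
# count_accum_done(9 * 71) -> True
```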

Figure 10. Diagram of the accumulator counter block.

Figure 11. Diagram of the machine counter block in Architecture N#1.
Three architectures are used because the DPR feature of the FPGA is activated, which allows the granularity of the application to be analyzed. These architectures are identical in the static area, while they differ in the reconfigurable region, as shown in Fig. 12. The reconfigurable regions of Architecture N#1, Architecture N#3, and Architecture N#9 have different FIFO circuits and additionally require more repetitions of the neuron circuit.

Figure 12. Illustration of the reconfigurable region and the static area of the three architectures.
In terms of granularity, Architecture N#9 has the biggest grain, with nine neurons and two register bank circuits in the reconfigurable region. Architecture N#3 has three neurons and two shifter circuits in the reconfigurable region. Architecture N#1, with just one neuron and two FIFOs, has the smallest grain.
1) Architecture N#1: In this architecture, FIFOs for the SV and Test values follow the behaviors shown in Fig. 13 (a) and (b), respectively. The Count_Mach signal goes up when each neuron processes nine times, as shown in Fig. 11.

Figure 13. Designed FIFOs used to load (a) SV and (b) Test values into the neurons.
2) Architecture N#3: In this architecture, FIFOs for the SV and Test values follow the behavior shown in Fig. 14. The Count_Mach signal goes up when the three neurons process three times, as shown in Fig. 15.

Figure 14. Designed shifter used to load the SV and Test values into the neurons in Architecture N#3.

Figure 15. Diagram of the machine counter block in Architecture N#3.
3) Architecture N#9: In this architecture, the FIFOs for the SV and Test values follow the behavior shown in Fig. 16, and the machine counter is not required.

Figure 16. Design of the circuit that loads the SV and Test values into the neurons in Architecture N#9.
The features of the hardware implementation in the classification phase are briefly summarized in Table 4.

Table 4. Summarized features of the hardware implementation.
Item | Feature
Input data | 88 frames of 3×3 pixels
Classification type | Frame by frame
Kernel function | Gaussian, using the exponential function
Multi-class technique | One-vs-all
Word size and type | 18-bit fixed-point
Architectures | FSM + pipelined datapath
Result | Binary
Description language | VHDL
Simulation and synthesis | Vivado 2019.1
FPGA device | Xilinx ZYNQ-7 ZC702
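Since the datapath operates on an 18-bit fixed-point word (Table 4), the conversion can be pictured with the sketch below; the split between integer and fractional bits is an assumption for illustration, as it is not specified here.

```python
def to_fixed18(x, frac_bits=12):
    # Quantize a real value to an 18-bit two's-complement word with
    # `frac_bits` fractional bits (the 12-bit fraction is illustrative)
    q = int(round(x * (1 << frac_bits)))
    lo, hi = -(1 << 17), (1 << 17) - 1   # representable 18-bit range
    return max(lo, min(hi, q))

def from_fixed18(q, frac_bits=12):
    # Recover the approximate real value from the quantized word
    return q / (1 << frac_bits)
```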
5 Results and discussion
The performance of UAV positioning for a planned trajectory in Brazil, shown in Fig. 17, was experimentally evaluated. A larger georeferenced satellite image covering the studied region was selected to conduct the experiment. Both the georeferenced and UAV images were pre-processed into gray-scale and then binary images, after which edge extraction was performed. Finally, convolution between the two segmented images was used to estimate the UAV position.

Figure 17. Planned UAV trajectory marked with the red line.
5.1 Software implementation
The simulated results with different kernel functions are compared in Fig. 18. The Gaussian function performs best, with a minimum square error of 0.0146; the corresponding values obtained with the linear and polynomial functions are 0.2801 and 0.0586, respectively. Since a dataset of 88 frames is applied in the simulation, it is feasible to analyze whether the Gaussian kernel function (blue trace) behaves the same as the observed data function (black trace). The blue trace follows the behavior of the black trace, which means that the Gaussian function shows the desired behavior with 100% accuracy. Besides, the response time of the SVM classifier is 0.745 ms on an Intel Core i7-6500 CPU @ 2.50 GHz (64 bits).
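The kernel comparison performed with MATLAB's fitrsvm can be approximated in open-source tooling, e.g., with scikit-learn's SVR as sketched below; the resulting errors will not exactly reproduce the MATLAB figures.

```python
import numpy as np
from sklearn.svm import SVR

def compare_kernels(X, y):
    # X: the (208, 9) frame matrix; y: 0/1 labels, as in Section 3
    for name, model in [("linear", SVR(kernel="linear")),
                        ("rbf", SVR(kernel="rbf")),
                        ("poly", SVR(kernel="poly", degree=3))]:
        model.fit(X, y)
        mse = np.mean((model.predict(X) - y) ** 2)
        print(f"{name}: training MSE = {mse:.4f}")
```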

Figure 18. SVM classification results with different kernel functions.
The SVM classifier was also compiled with MATLAB and processed on a CPU (13th Gen Intel Core i9-13900KF, 24 cores), a GPU (NVIDIA GeForce RTX 4090), and both together. The obtained results are shown in Table 5. They demonstrate that the SVM classifier is processed fastest when the CPU and GPU are applied simultaneously because the CPU uses the GPU as a parallel processor.

Table 5. Comparison of the results from the SVM classifier processed in CPU, GPU, and both.
Parameter | CPU | GPU | CPU + GPU
Frequency | 5020 MHz | 420 MHz | 5372 MHz
Processing time | 86.06 ms | 83.1 s | 64.7 μs
5.2 Hardware implementation
The SVM architecture was described in the VHSIC hardware description language (VHDL) and targeted to the Xilinx ZYNQ-7 ZC702. The same dataset of 88 frames applied for the software implementation in subsection 5.1 was used as test data for the hardware implementation. The test results achieved 84.6% accuracy in classifying "edge" and "non-edge" patterns.
DPR was implemented in the blocks of neuron and FIFO. Thus, there are two partially reconfigured (PR) regions and three reconfigurable modules (RMs) (one neuron and two FIFOs) in Architecture N#1. The accumulator and decision-maker were implemented into a static region. In Architecture N#3, there are two PR regions and five RMs (three neurons and two shifter circuits). In Architecture N#9, there are two PR regions and eleven RMs (nine neurons and two register bank circuits).
Table 6 compares the results of Architecture N#1, Architecture N#3, and Architecture N#9 implemented without and with DPR. The results cover the clock period, latency, area, and power consumption; with DPR, the area is divided into static and reconfigurable regions, whereas without DPR there is only a static area. The circuit area and power consumption can be significantly decreased with DPR. In terms of the static area, Architecture N#1 with DPR is 7.6 times smaller than without DPR; similarly, Architecture N#3 and Architecture N#9 are 17.68 times and 41.83 times smaller, respectively. Regarding power consumption, even though Architecture N#1 with DPR exhibits a 40% increase, Architecture N#3 and Architecture N#9 are reduced to 47.36% and 57.14% of their original values, respectively. However, in Architecture N#3 and Architecture N#9, a drawback emerges in the time response related to latency, because the clock period of the implementation with DPR is two times or more that without DPR. Architecture N#1 with DPR is almost 2000 times faster than its counterpart without DPR, whereas Architecture N#3 and Architecture N#9 with DPR are 2.4 times and 1.97 times slower, respectively.

Table 6. Comparison of the results from Architecture N#1, Architecture N#3, and Architecture N#9 without and with DPR.
Architecture | Feature | Without DPR | With DPR
N#1 | Clock period | 100 ns | 50 ns
N#1 | Latency | 0.19 s | 96 μs
N#1 | LUTs, static area | 969 | 86
N#1 | Flip-flops, static area | 535 | 111
N#1 | LUTs, reconfigurable region | / | 924
N#1 | Flip-flops, reconfigurable region | / | 169
N#1 | Power consumption | 5 mW | 7 mW
N#3 | Clock period | 50 ns | 120 ns
N#3 | Latency | 32.10 μs | 77.04 μs
N#3 | LUTs, static area | 2865 | 88
N#3 | Flip-flops, static area | 637 | 110
N#3 | LUTs, reconfigurable region | / | 927
N#3 | Flip-flops, reconfigurable region | / | 273
N#3 | Power consumption | 19 mW | 9 mW
N#9 | Clock period | 50 ns | 100 ns
N#9 | Latency | 10.95 μs | 21.60 μs
N#9 | LUTs, static area | 8328 | 138
N#9 | Flip-flops, static area | 792 | 80
N#9 | LUTs, reconfigurable region | / | 955
N#9 | Flip-flops, reconfigurable region | / | 275
N#9 | Power consumption | 7 mW | 4 mW
Table 6 also reveals the influence of the grain size of the reconfigurable region with DPR. The findings show an unexpected pattern in the power consumption of Architecture N#3 compared with Architecture N#1: It is higher, due to the clock of Architecture N#3 operating approximately 2.5 times slower, and this discrepancy in the clock speed results in the higher power consumption.
DPR uses the device's internal configuration access port (ICAP). Table 7 shows the total run time and partial bitstream sizes of each architecture. Here, RMs are partially reconfigured 9×71 times, 3×71 times, and 1×71 times in Architecture N#1, Architecture N#3, and Architecture N#9, respectively. Both the total DPR run time and the partial bitstream size are the smallest in Architecture N#9 because its implementation requires fewer reconfiguration repetitions than the others.

Table 7. DPR information of Architecture N#1, Architecture N#3, and Architecture N#9.
Architecture | Total DPR run time | Partial bitstream size
N#1 | 74.10 μs | 875 KB
N#3 | 26.00 μs | 913 KB
N#9 | 7.38 μs | 831 KB
Fig. 19 (a) shows the neuron's RM in Architecture N#1, which has the minimum grain size and is used as a reference. Fig. 19 (b) exhibits the cells used per instance of one neuron in Architecture N#1. Correspondingly, the neuron processes of Architecture N#3 and Architecture N#9 are three times and nine times more extensive, respectively, than that of Architecture N#1.

Figure 19. Neuron's grain size: (a) reference block and (b) report on the neuron's cell usage from Vivado Design Suite.
The block diagrams of the project setup are shown in Fig. 20. Two options are provided, and in each option the user can choose between two architectures using the signal Sel_NeuronMode.

Figure 20. Block diagrams of the proposed setup: (a) Option A where the user can choose between Architecture N#1 and Architecture N#3 and (b) Option B where the user can choose between Architecture N#1 and Architecture N#9.
Option A: If the user sets Sel_NeuronMode = 0, then Architecture N#1 is active, and the goal is to decrease the power consumption; if the user sets Sel_NeuronMode = 1, then Architecture N#3 is active, and the goal is to decrease the response time (Fig. 20 (a)). The areas of Architecture N#1 and Architecture N#3 are almost the same; a slight difference exists between the FIFOs in Architecture N#1 and the shifters in Architecture N#3, where the shifters occupy a larger area than the FIFOs. The power consumption is also higher in Architecture N#3 because its clock frequency is lower than that of Architecture N#1.
Option B: If the user sets Sel_NeuronMode = 0, then Architecture N#1 is active, and the goal is to realize the smallest area occupation; if the user sets Sel_NeuronMode = 1, then Architecture N#9 is active, and the goal is to reduce the power consumption and response time simultaneously (Fig. 20 (b)). Architecture N#9 occupies a larger area than Architecture N#1 because it runs nine neurons at a time, whereas Architecture N#1 runs only one neuron at a time. In addition, the power consumption of Architecture N#1 is larger because it performs nine times more iterations for each machine, while Architecture N#9 performs just one iteration per machine. The clock period of Architecture N#1 is half that of Architecture N#9.
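The two options reduce to a simple selection table, modeled below purely for illustration:

```python
# (option, Sel_NeuronMode) -> (active architecture, design goal), per Fig. 20
ARCHITECTURE_MODES = {
    ("A", 0): ("N#1", "lower power consumption"),
    ("A", 1): ("N#3", "lower response time"),
    ("B", 0): ("N#1", "smallest area occupation"),
    ("B", 1): ("N#9", "lower power and response time"),
}

def select_architecture(option, sel_neuron_mode):
    return ARCHITECTURE_MODES[(option, sel_neuron_mode)]
```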
Therefore, it can be reasonably concluded that the best scenario is Option B because it shows the best solutions for the occupied area (Architecture N#1), latency (Architecture N#9), and power consumption (Architecture N#9), simultaneously.
As mentioned above, the performance of the SVR model applied to an image edge detection system has been investigated in an FPGA device, where the granularity has been analyzed through the device's newest feature, DPR. As far as we know, no reports have addressed this topic except a similar one published recently [54]. In Ref. [54], a real-time edge detection system is implemented in an FPGA (Altera's Cyclone IV E: EP4CE10F17C8) with an improved Canny algorithm, using a gray-scale image of 512×512 pixels from a standard literature database. This is equivalent to around 29127 frames of 3×3 pixels in the system proposed in this paper, so an approximate comparison with our architecture can be made. The comparison focuses on the response time because it is the only information provided in Ref. [54] (see Table 8). The fastest response time of 0.29 s is achieved in the scenario of Architecture N#9 without DPR. Although this value is longer than the 1.231 ms reported in Ref. [54], it is not reasonable to conclude that our proposed method is inferior, because no other criteria, such as power consumption, can be compared; moreover, the applied FPGA devices and clock frequencies are different.

Table 8. Performance comparison of our proposed architecture with the one reported in Ref. [54].
Features | Proposal in Ref. [54] | N#1 with DPR | N#9 without DPR
Clock frequency | 50 MHz | 20 MHz | 20 MHz
Latency | 1.231 ms | 2.79 s | 0.29 s
Image size | 512×512 | 29127 frames of 3×3 | 29127 frames of 3×3
FPGA device | Altera's Cyclone IV E: EP4CE10F17C8 | Xilinx ZYNQ-7 ZC702 | Xilinx ZYNQ-7 ZC702
Edge detection technique | Improved Canny algorithm | ML | ML
6 Conclusion
UAV positioning estimation should be associated with a strategy alternative to the GNSS signal, particularly for critical missions. This estimation is even more relevant over regions under strong effects of ionospheric scintillation. Developing embedded systems for UAV positioning estimation is valuable, and image edge detection is one of its main sub-topics. Here, the SVR algorithm, more precisely its classifier datapath, was used to address this problem and thus obtain "edge" and "non-edge" patterns, implemented in an FPGA with DPR. An FSM controlled the proposed datapath of the SVR classifier, described in VHDL and targeted to the Xilinx Zedboard ZYNQ-7000 using an 18-bit fixed-point word. This SVR algorithm not only achieved accuracy as good as other algorithms that have been applied to image edge detection, but was also simpler to implement on hardware than neural networks, because SVR tends to converge to the optimal response faster with a smaller computational load. During the DPR design process, efficient ways to decrease the power consumption, occupied area, and latency were explored depending on the layout implementation (through Architecture N#1, Architecture N#3, and Architecture N#9), and two options that can effectively balance energy, area, and execution time were offered.
Although the software implementation exhibited a slower processing speed than the hardware implementation in this system, it achieved a higher success rate. This is primarily due to the hardware limitations, including processing power, memory capacity, and computational efficiency (i.e., the hardware's performance dilemma). The software was specifically designed to optimize functionality and achieved a superior success rate within these constraints. Although the execution time is another drawback of the hardware implementation, it can be alleviated by using DPR. The application of DPR successfully reduced both the occupied area and the dynamic power. In detail, as the grain size and circuit complexity increased, the clock frequency and occupied area also increased, while the power consumption and latency decreased. It was also found that as the complexity of the datapath in the RMs increased, the clock period increased.
Much work remains for future study. For example, the classification accuracy might be enhanced by applying an optimizer in the training phase (such as a globally asynchronous locally synchronous scheme in the architecture), although this also leads to reduced latency and higher power consumption. This SVR algorithm can potentially be applied to other data fusion tasks in UAV scenarios, such as i) positioning with the fusion of coordinate estimation by computer vision and INS via SVR and ii) implementing the convolution routine in the FPGA and exploring its processing-system side (the ARM Cortex). Our proposed methodology can also be helpful in other applications. For example, in service-oriented networks (SONs), services are treated as independent entities that can be accessed and utilized by users or other services, and the network infrastructure is built to enable efficient and secure communications between different services, allowing them to interact and exchange data. SONs often utilize web services, application programming interfaces, or service-oriented architectures to facilitate the integration and interoperability of various services. Our methodology could be beneficial for adapting service nodes that are damaged/attacked in real time or for processing similar tasks in the same FPGA area, making it possible to reduce resources while maintaining comparative power efficiency. In Ref. [55], an FPGA-based technology was proposed to implement the embedded architecture of a biologically inspired SON; our reconfigurable FPGA method could transform this proposal into a more robust and self-organized system. Nevertheless, it remains challenging to realize an embedded, low-cost, reliable system for upcoming ML-based hardware implementations.
Disclosures
The authors declare no conflicts of interest.