Application of Deep Learning in Underwater Imaging (Invited)

Jun XIE; Jianglei DI; Yuwen QIN

doi:10.3788/gzxb20225111.1101001

Acta Photonica Sinica, Vol. 51, Issue 11, 1101001 (2022)

Institute of Advanced Photonics Technology, School of Information Engineering, Guangdong Provincial Key Laboratory of Information Photonics Technology, Guangdong University of Technology, Guangzhou 510006, China

Figures & Tables(55)

Fig. 1. Principle of underwater image degradation

Fig. 2. Classification of underwater imaging

Fig. 3. CNN structure

Fig. 4. Effects of different algorithms before and after processing^［23］

Fig. 5. The applications of deep learning in image enhancement

Fig. 6. Image enhancement effect by neural networks

Fig. 7. WaterGAN model structure^［32］

Fig. 8. Image restoration results^［41］

Fig. 9. Neural network for parameter estimation^［56］

Fig. 10. Neural network for image restoration

Fig. 11. Diagram of our two-stage learning^［59］

Fig. 12. Computational polarization difference imaging systems based on Stokes vector^［63］

Fig. 13. The relationship between K(x, y) and ΔD(x, y)^［66］

Fig. 14. Passive underwater polarization imaging detection method in neritic areas^［4］

Fig. 15. Recovery results of different underwater objects^［61］

Fig. 16. Recovery results of different underwater objects^［70］

Fig. 17. Neural network for polarimetric underwater image recovery

Fig. 18. Four kinds of polarization-intensity information confluence models and their comparative versions^［73］

Fig. 19. Comparison between raw images and restoration results of eight models^［73］

Fig. 20. Schematic diagram of ghost imaging

Fig. 21. Structure of CGI^［77］

Fig. 22. Reconstruction results of CSGI and GIDL at different sampling rates^［88］

Fig. 23. Reconstruction results based on DL and CS methods at different concentrations^［87］

Fig. 24. Comparison of simulation results of UGI-GAN, UDLGI, and PDLGI at different sampling rates^［84］

Fig. 25. Hyperspectral image data cube

Fig. 26. HyperDiver UHI system and its components^［95］

Fig. 27. A multi-faced dataset from HyperDiver^［95］

Fig. 28. Color image of the seabed from UHI and SAM classification^［97］

Fig. 29. Underwater spectral imaging with filterwheel^［89］

Fig. 30. A tunable LED-based underwater multispectral imaging system^［98］

Fig. 31. Staring underwater spectral imaging system with optimal waveband subset^［100］

Fig. 32. Self-supervised hyperspectral and multispectral image fusion network^［110］

Fig. 33. The structure of single pixel camera^［115］

Fig. 34. Single-pixel imaging system^［116］

Fig. 35. Reconstruction results by traditional FSI and FSPI^［129］

Fig. 36. Reconstruction results of GAN-FSI and FSI at different sampling rates^［130］

Fig. 37. CS-SRCNN network structure^［133］

Fig. 38. LLS structure

Fig. 39. Principle of streak tube imaging^［147］

Fig. 40. Results of streak tube 3D imaging^［151-153］

Fig. 41. Image of a target at a distance of 20 m in clear water, recorded by the lidar-radar^［158］

Fig. 42. The principle of underwater range-gated imaging system

Fig. 43. Images of underwater target^［169］

Fig. 44. Holographic imaging structure diagram

Fig. 45. Robot-driven DIHM^［198］

Fig. 46. Rapidly extract focused targets from underwater digital holograms^［212］

Table 1. Summary of traditional underwater image enhancement methods and deep learning methods

Methods	Principle	Advantages	Disadvantages	Application
Spatial domain method	Adjusts the gray scale and RGB channels of pixels in the spatial domain	Easy to implement, with obvious effects	Prone to oversaturation and loss of detail; adjustments are somewhat blind	Correcting overall or local over-bright (or over-dark) regions; increasing image contrast
Frequency domain method	Transforms the image to the corresponding domain for filtering	Separates high- and low-frequency information; enhances edge information; suppresses interference noise; processing in the frequency domain is efficient	Limited effect on color distortion and low contrast	Denoising; deblurring
Color constancy method	Estimates the environment information from the relationship between the environment and the target pixels, then restores the raw image under the chosen hypothesis	Good color restoration	Depends on the accuracy of the assumptions; limited denoising effect	Color correction
Deep learning-based method	Restores the degraded image using the mapping between degraded and restored images learned by a neural network	Noise removal, color correction and contrast enhancement can be performed simultaneously; no prior information is required	Network training is time-consuming; heavy dependence on datasets; poor generalization ability	Denoising; color correction; improving contrast
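
As a rough illustration of the color constancy and spatial domain rows of Table 1, the sketch below applies a gray-world white balance followed by a percentile contrast stretch. The gray-world hypothesis, the function names and the percentile limits are assumptions chosen for illustration; they are not taken from the reviewed methods.

```python
import numpy as np

def gray_world(img):
    """Gray-world color constancy: scale each RGB channel so that its mean
    matches the mean over all channels (assumes the scene averages to gray)."""
    img = np.asarray(img, dtype=np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gain = channel_means.mean() / np.maximum(channel_means, 1e-12)
    return np.clip(img * gain, 0.0, 1.0)

def stretch_contrast(img, low=1.0, high=99.0):
    """Spatial-domain contrast stretch between two intensity percentiles."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = np.percentile(img, [low, high])
    return np.clip((img - lo) / max(hi - lo, 1e-12), 0.0, 1.0)

# Synthetic image with attenuated red, mimicking an underwater color cast.
rng = np.random.default_rng(0)
tinted = rng.uniform(0.2, 0.6, size=(32, 32, 3)) * np.array([0.4, 0.8, 1.0])
balanced = gray_world(tinted)
enhanced = stretch_contrast(balanced)
```

The gray-world step illustrates the dependence on assumptions noted in the table: when the scene does not average to gray, the estimated gains are biased.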

Table 2. Summary of image restoration methods based on priors and deep learning

Methods	Principle	Advantages	Disadvantages	Application
Prior-based restoration methods	Water properties and related parameters are estimated under a prior hypothesis, and the image before degradation is recovered through a physical model	Targeted and directed, avoiding blind recovery; results recovered through a physical model look natural	The choice of prior is subjective; model deviation and other limiting factors make the methods hard to apply in complex water environments	Deblurring; color correction; contrast enhancement
Deep learning-based restoration methods	A neural network learns the mapping between the degraded image and the related parameters to estimate the model parameters and restore the degraded image	Avoids the subjective error introduced by manually chosen priors; generalizes to some extent	Heavy reliance on datasets; synthetic datasets differ from the real environment; slower than prior-based methods	Deblurring; color correction; contrast enhancement
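
Both rows of Table 2 end with inverting a physical model once its parameters are known. A minimal sketch follows, assuming the commonly used simplified formation model I = J·t + B·(1 − t), with transmission t and background light B; this section does not state the model explicitly, so the model, names and values are illustrative assumptions.

```python
import numpy as np

def degrade(scene, transmission, background):
    """Simplified formation model (assumed): I = J * t + B * (1 - t)."""
    return scene * transmission + background * (1.0 - transmission)

def restore(observed, transmission, background, t_min=0.1):
    """Invert the model given estimated t and B; t is floored at t_min
    so that noise is not amplified where transmission is very low."""
    t = np.maximum(transmission, t_min)
    return np.clip((observed - background * (1.0 - t)) / t, 0.0, 1.0)

rng = np.random.default_rng(0)
scene = rng.uniform(0.0, 1.0, size=(8, 8, 3))          # clear image J
transmission = rng.uniform(0.3, 0.9, size=(8, 8, 1))   # per-pixel t
background = np.array([0.1, 0.5, 0.6])                 # bluish-green B
observed = degrade(scene, transmission, background)
recovered = restore(observed, transmission, background)
```

In a prior-based method t and B would come from a hand-crafted prior, and in a learning-based method from a network; the inversion step itself is the same.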

Table 3. Summary of underwater polarization imaging methods and deep learning-based methods

Methods	Principle	Advantages	Disadvantages	Application
Polarization difference imaging	Uses the difference in light vibration (polarization) between the target and the background to remove background scattering noise	Simple and effective	Poor restoration of objects with varied polarization and fine details	Deblurring; imaging in scattering media
Passive polarization imaging	Under natural illumination, reconstructs a clear scene image with an underwater light transmission model, based on the difference in polarization characteristics between background scattered light and target light	Distance information is included in the physical model, giving a significant restoration effect on complex scenes	The background area must be selected manually; the model applies only to objects with a low degree of polarization; recovery is poor at high scattering concentrations; a uniform light field is required	Deblurring; imaging in scattering media
Active polarization imaging	Introduces an active, fully polarized light source and removes background scattering noise using the difference in polarization characteristics between the background light and the light reflected by the target	Suitable for low-illumination environments; imaging quality is better than that of underwater passive polarization imaging	Restoration is limited when the target and background degrees of polarization differ little or the target contains multiple degrees of polarization; the model assumes that target light and background scattered light share the same polarization direction, which does not hold in practice	Deblurring; imaging in scattering media
Polarization imaging based on deep learning	Uses the additional information that polarization provides beyond light intensity to improve conventional intensity-based image restoration, recognition, fusion and reconstruction	Better imaging quality and more complete details than conventional imaging	Heavy dependence on datasets; still at a preliminary stage of exploration	Deblurring; imaging in scattering media
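
The quantities behind the polarization rows of Table 3 can be sketched with the standard linear Stokes relations. The four-angle acquisition (0°, 45°, 90°, 135°) and the function names are assumptions for illustration, not the specific systems reviewed.

```python
import numpy as np

def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes parameters from four analyzer orientations."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    return s0, s1, s2

def dolp_aop(s0, s1, s2, eps=1e-12):
    """Degree and angle of linear polarization."""
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, eps)
    aop = 0.5 * np.arctan2(s2, s1)
    return dolp, aop

def polarization_difference(i_par, i_perp):
    """Polarization difference image: light that is equally strong in both
    orthogonal channels (e.g. unpolarized scatter) cancels out."""
    return i_par - i_perp

def analyzer_intensity(s0, p, theta, alpha):
    """Intensity behind a linear analyzer at angle alpha for partially
    polarized light (total intensity s0, degree p, angle theta)."""
    return 0.5 * s0 * (1.0 + p * np.cos(2.0 * (alpha - theta)))

angles = np.deg2rad([0.0, 45.0, 90.0, 135.0])
frames = [analyzer_intensity(2.0, 0.6, np.deg2rad(30.0), a) for a in angles]
s0, s1, s2 = stokes_from_intensities(*frames)
dolp, aop = dolp_aop(s0, s1, s2)
```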

Table 4. Summary of different ghost imaging methods and methods based on deep learning

Methods	Principle	Advantages	Disadvantages	Application
TGI	Reconstructs the target by computing the correlation of light-field intensity fluctuations	Strong anti-interference ability; lensless imaging; wide range of applicability	Requires two optical paths, which complicates the experiment; a large amount of data must be collected and the correlation calculation takes a long time; low signal-to-noise ratio	Denoising; imaging in scattering media
CGI	Obtains the target image by computing the second-order correlation between the calculated intensity distribution and the intensity collected by the detector	A controllable light field is generated by an SLM or DMD, simplifying the experiment to a single optical path; wider imaging perspective	Still requires a large amount of data, and the correlation calculation takes a long time	Denoising; imaging in scattering media
CSGI	Uses compressed sensing for sparse-sampling reconstruction of the ghost image	Reconstructs high-quality images at a low sampling rate and shortens the sampling time; high signal-to-noise ratio	Computationally intensive; signal processing takes a long time	Imaging at a low sampling rate; super-resolution imaging
DIGL	A neural network learns the mapping from a blurred image to a clear image, or from the bucket-detector signal to the reconstructed image	Avoids using the illumination patterns and yields high-quality images at a low sampling rate; reconstructing directly from the bucket-detector signal avoids the complex computation of CS reconstruction and gives better results	Still computationally intensive; heavy reliance on datasets	Imaging at a low sampling rate
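
The second-order correlation in the CGI row of Table 4 can be sketched as G = ⟨B·P⟩ − ⟨B⟩⟨P⟩, where P are the known illumination patterns and B the bucket-detector values. The random binary patterns, the noiseless bucket signal and the function names are illustrative assumptions.

```python
import numpy as np

def simulate_bucket(obj, patterns):
    """Bucket signal: total light transmitted by the object per pattern."""
    return patterns.reshape(len(patterns), -1) @ obj.ravel()

def cgi_reconstruct(patterns, bucket):
    """Second-order correlation <B P> - <B><P>, evaluated per pixel."""
    flat = patterns.reshape(len(patterns), -1)
    g = (bucket - bucket.mean()) @ flat / len(bucket)
    return g.reshape(patterns.shape[1:])

rng = np.random.default_rng(0)
obj = np.zeros((16, 16))
obj[4:12, 5:11] = 1.0                                   # simple binary target
patterns = rng.integers(0, 2, size=(8000, 16, 16)).astype(float)
bucket = simulate_bucket(obj, patterns)
image = cgi_reconstruct(patterns, bucket)
```

This example uses 8000 patterns for 256 pixels, which illustrates the large data volume listed as a disadvantage of correlation-based reconstruction.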

Table 5. Summary of traditional MS and HS fusion methods and the deep learning-based method

Methods	Principle	Advantages	Disadvantages	Application
Matrix factorization	Based on the linear spectral mixing model, an endmember spectral matrix with high spectral resolution and an abundance matrix with high spatial resolution are obtained by alternating non-negative matrix factorization of the HS and MS data; multiplying them gives a fused image with high spatial and high spectral resolution	The model is theoretically simple, easy to implement and close to the actual situation	Requires an iterative solution and heavy computation; model parameters are sensitive and difficult to set; relies on the observation model	HS and MS fusion
Tensor decomposition	The HS image is treated as a three-dimensional tensor and decomposed by Tucker decomposition into three mode factor matrices and a three-dimensional core tensor. The core tensor is extracted from the set of high-resolution MS blocks by tensor sparse coding and multiplied by the factor matrices to obtain an image with high spatial and high spectral resolution	Reconstruction quality is better than that of matrix factorization	Model parameters are sensitive and difficult to set; computationally intensive	HS and MS fusion
Deep learning based	A neural network learns the mapping from the HS and MS images to the fused hyperspectral image	High reconstruction accuracy, high efficiency and good robustness, without iteration	Relies heavily on datasets and generalizes poorly	HS and MS fusion
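
The linear mixing idea in the matrix factorization row of Table 5 can be illustrated with a deliberately simplified, non-iterative stand-in: the spectral subspace is taken from an SVD of the low-resolution HS data, and the high-resolution coefficients are solved from the MS data by least squares. This is not the alternating non-negative factorization described in the table; the known spectral response matrix `srf`, the noiseless data and all names are assumptions.

```python
import numpy as np

def fuse_subspace(hs_lowres, ms_highres, srf, rank):
    """hs_lowres: (bands, n_low) HS pixels at low spatial resolution.
    ms_highres: (ms_bands, n_high) MS pixels at high spatial resolution.
    srf: (ms_bands, bands) spectral response mapping HS bands to MS bands.
    Requires rank <= ms_bands so the least-squares problem is determined."""
    u, _, _ = np.linalg.svd(hs_lowres, full_matrices=False)
    basis = u[:, :rank]                        # spectral subspace from HS data
    coeffs, *_ = np.linalg.lstsq(srf @ basis, ms_highres, rcond=None)
    return basis @ coeffs                      # (bands, n_high) fused image

rng = np.random.default_rng(0)
bands, ms_bands, rank, n_high = 30, 4, 3, 64
endmembers = rng.random((bands, rank))
abundances = rng.dirichlet(np.ones(rank), size=n_high).T   # columns sum to 1
truth = endmembers @ abundances                            # linear mixing
downsample = np.kron(np.eye(16), np.full((4, 1), 0.25))    # average 4 pixels
srf = rng.random((ms_bands, bands))
fused = fuse_subspace(truth @ downsample, srf @ truth, srf, rank)
```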

Table 6. Summary of different SPI reconstruction methods and methods based on deep learning

Methods	Principle	Advantages	Disadvantages	Application
Conventional SPI	Reconstructs the object image by cross-correlating the illumination field modulated by random patterns with the values measured by the single-pixel camera	Strong interference immunity, high single-pixel detection frequency and good weak-light detection capability	Good image quality requires far more samples than the number of reconstructed image pixels	Imaging in scattering media
FSI/HSI	The Fourier/Hadamard basis spectrum of the target image is obtained by modulating the light field with Fourier/Hadamard basis masks, and the target image is then reconstructed by applying the inverse Fourier/Hadamard transform	Strong interference immunity; reconstructs the object image without distortion	High-frequency details are easily lost; image artifacts exist; high-quality reconstruction requires more samples	Imaging in scattering media
Deep learning based	Neural networks learn the mapping from an image or a one-dimensional signal to the reconstructed image	High reconstruction efficiency, good reconstruction quality and some denoising ability	The network is prone to overfitting and takes time to train; high adaptability and robustness of the network are required	Imaging in scattering media
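
The HSI row of Table 6 can be sketched as follows: each Hadamard mask yields one single-pixel measurement, and the inverse transform recovers the image. The Sylvester construction, the idealized ±1 masks (in practice often realized as differential measurements) and the function names are assumptions for illustration.

```python
import numpy as np

def hadamard(n):
    """Sylvester-type Hadamard matrix of order n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), h)
    return h

def measure(image, basis):
    """One single-pixel value per mask: inner product of mask and scene."""
    return basis @ image.ravel()

def reconstruct(measurements, basis, shape):
    """Inverse Hadamard transform, using H^T H = n I."""
    return (basis.T @ measurements / basis.shape[0]).reshape(shape)

rng = np.random.default_rng(0)
scene = rng.random((8, 8))
basis = hadamard(scene.size)           # 64 masks for 64 pixels
recovered = reconstruct(measure(scene, basis), basis, scene.shape)
```

Full sampling needs as many masks as pixels; dropping rows of `basis` lowers the sampling rate at the cost of the detail loss and artifacts noted in the table.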

Table 7. Summary of different underwater laser imaging methods

Methods	Principle	Advantages	Disadvantages	Application
LLS	Exploits the rapid fall-off of water backscatter away from the central axis of illumination to separate target light from scattered light in space	Reduces the influence of scattered light on imaging	The imaging equipment is bulky; the influence of the scattering medium on the transmission optical path cannot be avoided; long imaging times degrade accuracy	Imaging in scattering media
STIL	The deflection module in the streak tube converts time information into distance information to obtain a three-dimensional image	High imaging accuracy, fast imaging speed and large field of view	Not suitable for imaging moving targets; the system has a short imaging time, which cannot meet the needs of long-duration photography	3D imaging
Range-gated imaging	Backscattered light along the transmission path is reduced by adjusting the timing of the laser pulse and the camera gate	Reduces the influence of scattered light on imaging; fast imaging speed	Laser energy is dispersed, so only a small field of view can be imaged; the system is costly, has limited resolution and is complex to operate	Imaging in scattering media; 3D imaging
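
The timing behind the range-gated row of Table 7 follows from the round-trip travel time of light in water. The sketch below assumes a refractive index of 1.33 for water and uses hypothetical function names; the 20 m example distance matches the scale shown in Fig. 41.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s

def gate_delay(distance_m, n_water=1.33):
    """Delay between the laser pulse and gate opening for a target at
    distance_m: round-trip path divided by the speed of light in water."""
    return 2.0 * distance_m * n_water / C_VACUUM

def slice_depth(gate_width_s, n_water=1.33):
    """Depth of the range slice imaged while the gate stays open."""
    return C_VACUUM * gate_width_s / (2.0 * n_water)

delay_ns = gate_delay(20.0) * 1e9      # about 177 ns for a 20 m target
depth_m = slice_depth(10e-9)           # about 1.1 m for a 10 ns gate
```

Backscatter from water closer than the gated slice arrives before the gate opens and is therefore rejected.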

Table 8. Summary of Fourier transform reconstruction and reconstruction based on deep learning

Methods	Principle	Advantages	Disadvantages	Application
Fourier transform reconstruction	The hologram is Fourier transformed to the frequency domain, where the angular offset between the object wave and the other holographic components is used to separate them; the spatial carrier is then removed by an inverse Fourier transform, and the reconstructed image is obtained by computing the diffraction integral	Obtains the amplitude and phase of objects quantitatively and in real time	Computationally intensive and requires prior knowledge; only a single hologram can be processed at a time, so efficiency is low	3D microscopic imaging
Holographic reconstruction based on deep learning	A neural network establishes the mapping between the hologram and the reconstructed image	High imaging efficiency and higher imaging quality; no prior knowledge is required	Relies heavily on datasets; requires a large number of varied samples and models covering a wide range of quantized reconstruction distances	Microbial 3D image reconstruction; 3D particle field reconstruction; microbial identification and classification
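
The Fourier transform reconstruction row of Table 8 can be sketched for an off-axis hologram: the object term is isolated in the frequency domain, shifted to remove the carrier, and inverse transformed. The simulated phase object, the integer carrier frequency and the function names are assumptions, and the final diffraction (propagation) step is omitted.

```python
import numpy as np

def make_hologram(phase, carrier_bins):
    """Interference of a pure-phase object wave with a tilted reference."""
    n = phase.shape[1]
    obj = np.exp(1j * phase)
    ref = np.exp(2j * np.pi * carrier_bins * np.arange(n) / n)[None, :]
    return np.abs(obj + ref) ** 2

def demodulate(hologram, carrier_bins, radius):
    """Isolate the obj * conj(ref) term and remove its spatial carrier."""
    spec = np.fft.fft2(hologram)
    # The wanted term sits at -carrier_bins along x; roll it to zero frequency.
    spec = np.roll(spec, carrier_bins, axis=1)
    fy = np.fft.fftfreq(hologram.shape[0]) * hologram.shape[0]
    fx = np.fft.fftfreq(hologram.shape[1]) * hologram.shape[1]
    mask = (fy[:, None] ** 2 + fx[None, :] ** 2) <= radius ** 2
    return np.fft.ifft2(spec * mask)           # complex object field

n = 128
yy, xx = np.mgrid[0:n, 0:n]
phase = np.exp(-((xx - 64) ** 2 + (yy - 64) ** 2) / (2 * 15.0 ** 2))
field = demodulate(make_hologram(phase, 32), carrier_bins=32, radius=12)
recovered_phase = np.angle(field)
```

The filter radius must stay below the carrier frequency so that the zero-order and conjugate terms are excluded, which is one form of the prior knowledge mentioned in the table.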

Table 9. Application of deep learning in underwater imaging

Application field	Network structure	Input-output	Details	Loss function	Application problems
Underwater Image Enhancement	CNN, GAN	Image-image	Residual connection, Dense connection, Inception, Fusion	L1, LSE, MSE, SSIM, GAN loss	Deblurring［24, 26-27］, Color Correction［25, 29］, Dehazing［28-30］, Image Generation［32-39］
Underwater Image Restoration	CNN, GAN	Image-image, Image-parameters	Dense connection, Residual connection, Skip connection, Fusion, Inception	L1, Perceptual loss, MSE, GAN loss	Color Correction［54-58］, Deblurring［59］, Dehazing［57］, Image Generation［60］
Underwater Polarization Imaging	CNN	Image-image	Residual connection, Dense connection, Skip connection, Fusion	MSE, Perceptual loss	Deblurring［71, 73］
Underwater Ghost Imaging	MLP, CNN, GAN	1D signal-image, Image-image	Residual connection, Dense connection, Fusion, Inception	MSE, Perceptual loss, self-designed	Low Sampling Rate Imaging［84-88］, Deblurring［86, 88］
Underwater Spectral Imaging	CNN	MS image-image	Skip connection	L1	Spectral Fusion［108-110］
Underwater Compressed Sensing Imaging	CNN, GAN	Image-image, 1D signal-image	Skip connection	MSE, GAN loss	Low Sampling Rate Reconstruction［85, 130, 132-133］, Deblurring［130］
Underwater Laser Imaging	—	—	—	—	—
Underwater Holographic Imaging	CNN	Image-3D particle field, Image-classification result	Skip connection, Residual connection, Fusion	Cross Entropy, MSE, L1, Huber loss［213］	Improve Efficiency［204, 207, 212］, 3D Particle Field Reconstruction［207］, Classification［210-212］
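
For reference, three of the pixel-wise losses listed in Table 9 can be written out in a few lines. These are the standard textbook definitions in plain NumPy, not the implementations used in the cited works; the Huber threshold `delta` is an assumed default.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error."""
    return np.mean(np.abs(pred - target))

def mse_loss(pred, target):
    """Mean squared error."""
    return np.mean((pred - target) ** 2)

def huber_loss(pred, target, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    err = np.abs(pred - target)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return np.mean(np.where(err <= delta, quadratic, linear))

pred = np.array([1.0, 3.0])
target = np.array([0.0, 0.0])
values = (l1_loss(pred, target), mse_loss(pred, target), huber_loss(pred, target))
```

MSE penalizes large errors more strongly than L1, while Huber loss behaves like MSE near zero and like L1 for outliers.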