Journal of Infrared and Millimeter Waves, Vol. 43, Issue 6, 775 (2024)

DIFNet: SAR RFI suppression network based on domain invariant features

Wen-Hao LV 2,3, Fu-Ping FANG 1,*, and Yuan-Rong TIAN 1
Author Affiliations
  • 1 School of Electronic Science, National University of Defense Technology, Changsha 410073, China
  • 2 University of Chinese Academy of Sciences, Beijing 100049, China
  • 3 School of Physics and Optoelectronic Engineering, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China

    Synthetic aperture radar (SAR) is a high-resolution two-dimensional imaging radar. During imaging, however, SAR is susceptible to intentional and unintentional interference, of which radio frequency interference (RFI) is the most common type, leading to severe degradation of image quality. Numerous algorithms have been proposed to address this problem. Although inpainting networks have achieved excellent results, their generalization remains unclear, and whether they still work effectively in cross-sensor experiments requires further verification. Through time-frequency analysis of interference signals, this work finds that interference exhibits domain-invariant features across different sensors. The loss function is therefore reconstructed so that the network extracts these domain-invariant features, improving generalization. Ultimately, this work proposes a SAR RFI suppression method based on domain-invariant features and embeds RFI suppression into the SAR imaging process. Compared with traditional notch filtering, the proposed approach not only removes interference but also effectively preserves strong scattering targets. Compared with PISNet, the proposed method extracts domain-invariant features and generalizes better: even in cross-sensor experiments, where training and testing data come from different radar platforms with different parameters, it still achieves excellent results; such experiments therefore provide direct evidence of generalization.


    Introduction

    Synthetic aperture radar is an active microwave sensing system that adopts synthetic aperture and pulse compression techniques to acquire high-resolution images [1]. SAR systems in different wavebands suit different applications: P-band SAR is commonly used for subsurface imaging and vegetation penetration; L-, S-, and C-band SARs are widely used for ocean monitoring and agricultural management; and X- and Ka-band SARs are often employed for high-resolution imaging [2-3]. In addition, laser-based synthetic aperture radar is gradually attracting wide attention [4-5]. During imaging, intentional and unintentional interference is often present, with RFI being a particularly common type; wide-range, high-intensity RFI significantly degrades SAR image quality [6-7].

    To address these issues, numerous interference suppression algorithms have been proposed, broadly categorized into three types: non-parametric, parametric, and semi-parametric methods [8]. Among non-parametric methods, Ref. [9] proposed an eigen-subspace filtering approach that is highly compatible with existing SAR imaging algorithms, and Ref. [10] proposed a generic subspace model characterizing a variety of RFI types together with a block subspace filter that removes RFI from single-look complex (SLC) data. Parametric methods typically solve for the interference parameters iteratively and then filter out the interference [11-12], but they are often constrained by complex environments. Semi-parametric methods have gradually become mainstream owing to their excellent performance, yet they suffer from high computational complexity; common examples include sparse reconstruction [13] and variants of robust PCA [14-15]. Deep learning has been widely deployed in various fields thanks to its strong performance [16-17], and it has naturally been introduced into interference suppression [18-19]. The time-frequency-domain RFI suppression method proposed in Ref. [20] outperforms robust PCA; the networks in Refs. [18, 20] are collectively referred to as image inpainting networks.

    Although image inpainting networks have achieved excellent results, their generalization is unclear, and whether they still work effectively in cross-sensor experiments needs further verification. Moreover, SAR interference suppression faces a significant incomplete-data problem: typically, we obtain either clean data or interfered data, and the two lack a one-to-one correspondence. To solve these problems, this paper proposes an RFI suppression network based on domain-invariant features, with the following contributions:

    (1) Through time-frequency analysis of interference signals, we find that interference exhibits domain-invariant features across different sensors. This paper therefore reconstructs the loss function so that the network extracts these domain-invariant features, improving generalization. We also find that interference has global characteristics on the time-frequency spectrogram, so we adopt a Transformer as the backbone network and reduce the computational complexity by limiting the attention mechanism to local windows.

    (2) Compared with traditional notch filtering, our network avoids mistakenly removing strong scattering targets and achieves a better interference suppression effect. Compared with image inpainting networks, it shows stronger generalization in cross-sensor experiments: even when the training and testing data come from different sensors, the algorithm still achieves excellent results. Moreover, our method requires only the interfered data to perform interference suppression, thereby bypassing the incomplete-data issue.

    The remainder of this paper is organized as follows: Section 1 introduces the signal model and the network, Section 2 presents the experimental results, and Section 3 concludes the paper.

    1 Method

    This paper proposes a network based on domain-invariant features (DIFNet) and embeds RFI suppression into the SAR imaging process. The overall process is illustrated in Fig. 1, and the algorithm is summarized in Table 1 (a code sketch of steps 2-5 follows the table). The first step is to locate the interfered SAR echoes. Because the SAR imaging algorithm acts approximately as a linear transformation from SAR echoes to SAR images, there is a clear correspondence between the two; we therefore first locate the interfered area in the SAR image and then find the corresponding echoes. The second step is to transform the echoes into the time-frequency domain pulse by pulse via the short-time Fourier transform (STFT). The third step is to suppress the interference with the proposed network. The fourth step is to transform the signals back to the original domain by the inverse STFT. The fifth step is to convert the SAR echoes into SAR images with the SAR imaging algorithm. Interference suppression itself consists of three stages: modeling the interference signals and constructing training data from that model, separating the interference from the aliased signals, and converting the cleaned echoes into SAR images.

    • Table 1. DIFNet's pipeline

      Algorithm I: DIFNet's pipeline

      1. Detect RFI in SAR images;
      2. Perform STFT pulse-by-pulse;
      3. Predict RFI by DIFNet;
      4. Subtract RFI;
      5. Perform ISTFT pulse-by-pulse;
      6. Convert SAR echoes into SAR images.
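
    To make the pipeline concrete, below is a minimal Python sketch of steps 2-5, assuming a trained network is available as a callable `difnet` that maps a magnitude spectrogram to the predicted RFI spectrogram; the function name, STFT parameters, and the phase-preserving subtraction are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np
from scipy.signal import stft, istft

def suppress_rfi(echoes, difnet, nperseg=256):
    """Steps 2-5 of Table 1: STFT pulse-by-pulse, predict and subtract
    RFI on the spectrogram, then return to the signal domain."""
    cleaned = np.empty_like(echoes)
    for i, pulse in enumerate(echoes):          # echoes: (n_pulses, n_samples)
        # Step 2: short-time Fourier transform of one pulse.
        _, _, tf = stft(pulse, nperseg=nperseg, return_onesided=False)
        # Step 3: the network predicts the interference on the magnitude spectrogram.
        rfi = difnet(np.abs(tf))
        # Step 4: subtract the predicted RFI, keeping the original phase.
        mag = np.clip(np.abs(tf) - rfi, 0.0, None)
        tf_clean = mag * np.exp(1j * np.angle(tf))
        # Step 5: inverse STFT back to the echo domain.
        _, rec = istft(tf_clean, nperseg=nperseg, input_onesided=False)
        cleaned[i] = rec[: pulse.size]          # trim any ISTFT padding
    return cleaned  # step 6 feeds these echoes to the imaging algorithm
```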

    Figure 1. Flow chart of RFI suppression network based on DIFNet

    1.1 RFI signal models

    Common RFI can be categorized into narrowband interference, chirp-modulated broadband interference, and sinusoidal-modulated broadband interference [6]. The narrowband interference can be expressed as follows:

    $$s_{nbi}(t)=\sum_{n=1}^{N}A_n\,\mathrm{rect}\left(\frac{t}{t_r}\right)\exp\left(j2\pi f_c t+j2\pi f_n t\right),\tag{1}$$

    where $t_r$ is the pulse duration, $f_c$ is the carrier frequency, $A_n$ is the amplitude, $f_n$ is the frequency offset, and $N$ is the number of interference signals. The chirp-modulated interference can be expressed as:

    $$s_{cm}(t)=\sum_{n=1}^{N}B_n\,\mathrm{rect}\left(\frac{t}{t_r}\right)\exp\left(j2\pi f_c t+j2\pi k_n t^2\right),\tag{2}$$

    where $B_n$ is the amplitude and $k_n$ is the FM rate. The sinusoidal-modulated interference can be represented as follows:

    $$s_{sm}(t)=\sum_{n=1}^{N}C_n\,\mathrm{rect}\left(\frac{t}{t_r}\right)\exp\left(j2\pi f_c t+j\beta_n\sin\left(2\pi f_n t\right)\right),\tag{3}$$

    where $C_n$ is the amplitude, $\beta_n$ is the modulation coefficient, and $f_n$ is the modulation frequency. Formulas (1)-(3) can be uniformly expressed as follows:

    $$s_{RFI}(t)=\sum_{n=1}^{N}D_n\,\mathrm{rect}\left(\frac{t}{t_r}\right)\exp\left(j2\pi f_c t+j2\pi k_{RFI}\,t^2\right),\tag{4}$$

    where $D_n$ is the amplitude and $k_{RFI}$ is the FM rate. When $k_{RFI}t_r$ is small, the RFI is narrowband; when $k_{RFI}t_r$ is large, the RFI is broadband. From formula (4), the instantaneous frequency satisfies:

    $$f=k_{RFI}\,t.\tag{5}$$

    Formula (5) shows that the signal has global characteristics on the time-frequency spectrogram, so a Transformer network is well suited. Moreover, in cross-sensor experiments the interference signals do not change with the radar signals; in the signal domain, therefore, interference holds domain-invariant characteristics. This motivates extracting the homogeneous characteristics of interference so that the algorithm can generalize across sensors.
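
    As an illustration of these models, the snippet below simulates baseband versions of Eqs. (1)-(3) with NumPy (the carrier term $e^{j2\pi f_c t}$ is dropped, as after down-conversion); the amplitude and parameter ranges are invented for the example and are not the training parameters of Table 3.

```python
import numpy as np

def simulate_rfi(kind, fs=400e6, tr=10e-6, n_sources=3, seed=0):
    """Sum of n_sources interference terms per Eqs. (1)-(3), at baseband."""
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, tr, 1.0 / fs)
    s = np.zeros_like(t, dtype=complex)
    for _ in range(n_sources):
        a = rng.uniform(0.5, 2.0)                  # amplitude A_n / B_n / C_n
        if kind == "narrowband":                   # Eq. (1): tone at offset f_n
            fn = rng.uniform(-15e6, 15e6)
            s += a * np.exp(1j * 2 * np.pi * fn * t)
        elif kind == "chirp":                      # Eq. (2): quadratic phase, rate k_n
            kn = rng.uniform(1e12, 1.5e13)
            s += a * np.exp(1j * 2 * np.pi * kn * t**2)
        elif kind == "sinusoidal":                 # Eq. (3): sinusoidal phase modulation
            beta = rng.uniform(1.0, 5.0)           # modulation coefficient beta_n
            fn = rng.uniform(1e5, 1e6)             # modulation frequency f_n
            s += a * np.exp(1j * beta * np.sin(2 * np.pi * fn * t))
    return t, s
```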

    1.2 Network

    Fig. 2 illustrates DIFNet, which consists of an encoder and a decoder; the input is the interfered image $I\in\mathbb{R}^{1\times H\times W}$ and the output is the predicted label $O\in\mathbb{R}^{1\times H\times W}$. The input projection layer consists of three CNN layers and a ReLU activation, the number of channels is $C$, and the extracted feature is $X_0\in\mathbb{R}^{C\times H\times W}$. The encoder comprises the input projection layer, multiple Transformer blocks, and down-sampling layers; the decoder comprises multiple Transformer blocks, up-sampling layers, and a projection layer. The internal structures of DIFNet are shown in Fig. 3: each Transformer block consists of a local multi-head attention mechanism layer (Local-MAM) and a CNNs layer. The Local-MAM first divides the input into N windows, then extracts global information within each window, and finally concatenates all windows; the CNNs consists of three CNN layers and a ReLU activation. Both the encoder and the decoder include $L$ stages, each consisting of a Transformer block and a down-sampling or up-sampling layer. A down-sampling layer halves the image size and doubles the number of channels, while an up-sampling layer doubles the image size and halves the number of channels. For a given input $X_0\in\mathbb{R}^{C\times H\times W}$, the output feature map of the $l$-th stage is $X_l\in\mathbb{R}^{2^{l}C\times\frac{H}{2^{l}}\times\frac{W}{2^{l}}}$, and skip connections link the encoder and decoder. To balance performance and computational cost, we set $H\times W=512\times512$, $C=16$, $M\times M=8\times8$, and $L=4$ (Fig. 2). When the image size is too large the computational load grows rapidly, so the input size is fixed at $512\times512$; increasing the channel number $C$ extracts more information, but in our experiments the performance gain diminishes as $C$ grows further; and a window size of $M\times M=8\times8$ effectively balances computation and performance.

    Figure 2. DIFNet diagram

    Figure 3. Internal structures of DIFNet

    1.2.1 Transformer block

    The proposed Transformer block consists of a Local-MAM and a CNNs, with two advantages. First, compared with the traditional Transformer, it significantly reduces computational complexity because the attention calculation is limited to non-overlapping local windows. Second, the proposed block captures both global and local information through the Local-MAM and the CNNs, respectively. The Transformer block can be represented as follows:

    $$X_l'=\text{Local-MAM}\left(\mathrm{LN}\left(X_{l-1}\right)\right)+X_{l-1},\tag{6}$$
    $$X_l=\mathrm{CNNs}\left(\mathrm{LN}\left(X_l'\right)\right)+X_l'.\tag{7}$$

    1.2.2 Local-MAM

    The traditional Transformer has a global receptive field, so its computational cost is particularly high, yet images contain a significant amount of redundant information. We therefore limit the attention mechanism to a local window. We first split $X\in\mathbb{R}^{C\times H\times W}$ into non-overlapping $M\times M$ blocks, so the input to the Local-MAM is $X^i\in\mathbb{R}^{C\times M\times M}$, and then compute multi-head attention within each window as follows:

    $$X=\left\{X^1,X^2,\dots,X^N\right\},\quad N=HW/M^2,\tag{8}$$
    $$Y_k^i=\mathrm{Attention}\left(X^iW_k^Q,\;X^iW_k^K,\;X^iW_k^V\right),\quad i=1,\dots,N,\tag{9}$$
    $$\hat{Y}_k=\left\{Y_k^1,Y_k^2,\dots,Y_k^N\right\}.\tag{10}$$

    Lastly, we concatenate the outputs of all attention heads and apply a linear projection layer to obtain the final result. We also introduce a relative position encoding $B$. The multi-head attention is computed as follows:

    $$\mathrm{Attention}(Q,K,V)=\mathrm{Softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}+B\right)V.\tag{11}$$

    Compared with the traditional Transformer, the computational complexity of the Local-MAM is reduced from $O(H^2W^2C)$ to $O(M^2HWC)$, where $M$ is the window size. Since usually $M\ll\min(H,W)$, this substantially reduces complexity.
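
    A minimal PyTorch sketch of the window partition and per-window attention (Eqs. (8)-(10)) is given below; it reuses `nn.MultiheadAttention` and omits the relative position bias $B$ of Eq. (11), so it illustrates the complexity reduction rather than reproducing the exact layer.

```python
import torch
import torch.nn as nn

class LocalMAM(nn.Module):
    """Multi-head attention restricted to non-overlapping MxM windows."""
    def __init__(self, dim, window=8, heads=4):
        super().__init__()
        self.m = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                              # x: (B, C, H, W)
        B, C, H, W = x.shape
        m = self.m
        # Eq. (8): split into N = HW/M^2 windows of M*M tokens each.
        x = x.view(B, C, H // m, m, W // m, m)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, m * m, C)
        # Eq. (9): attention computed independently inside each window.
        y, _ = self.attn(x, x, x)
        # Eq. (10): concatenate the windows back into a feature map.
        y = y.view(B, H // m, W // m, m, m, C)
        y = y.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return y
```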

    1.2.3 CNNs

    The standard Transformer struggles to capture local contextual information because it treats all pixel pairs as equidistant. Considering the importance of neighboring pixels in image tasks, we introduce a cascaded CNNs into the Transformer block to capture local information; the CNNs consists of three CNN layers and a ReLU activation layer.

    1.2.4 Loss function

    In an inpainting network, the loss can be defined as:

    $$l=\left\|H(X)-Y\right\|^{2}+\varepsilon,\tag{12}$$

    where $H$ is the network's mapping, $X$ is the input, and $Y$ is the label. To extract domain-invariant interference features, $Y$ is redefined as follows:

    $$Y_{pixel}=\begin{cases}1, & pixel\in\mathrm{RFI}\\ 0, & pixel\notin\mathrm{RFI}\end{cases},\tag{13}$$

    where $pixel$ denotes a pixel value of the time-frequency image. Equation (13) ensures that the measured distance to RFI does not change across sensors, thereby inducing the network to learn domain-invariant features.

    The loss can then be expressed as follows:

    $$l=\left\|H(X)_{pixel}-Y_{pixel}\right\|^{2}+\varepsilon.\tag{14}$$

    The loss in Eq. (14) contains two constraints: an interference constraint and a target constraint. In Eq. (13), target pixels are set to 0 and interference pixels to 1. According to Eq. (5), the interference signal holds domain-invariant features, while the target signal varies with radar parameters; under these constraints the network learns domain-invariant features and can be migrated between different sensors. Finally, we subtract the predicted interference from the original time-frequency spectrogram, filtering the interference out.
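
    The following sketch expresses Eqs. (13)-(14) in PyTorch; the network output `pred` and the binary RFI mask `mask` are assumed to share the spectrogram shape, and the constant `eps` stands in for $\varepsilon$. Because the target depends only on where the interference lies, not on the radar's waveform, minimizing this loss pushes the network toward sensor-independent features.

```python
import torch

def dif_loss(pred, mask, eps=1e-3):
    """Eq. (14): regress the network output onto the binary target of
    Eq. (13) (1 on RFI pixels, 0 on target pixels), which is identical
    for every sensor and therefore domain invariant."""
    return torch.mean((pred - mask) ** 2) + eps
```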

    1.3 SAR imaging

    After removing the interference, imaging is performed on the clean echoes. The radar imaging geometry is shown in Fig. 4: the radar moves from point A to point B at speed $v$ while continuously observing the target; the closest distance between radar and target is $R_0$, and the instantaneous slant range is $R(\eta)$.

    Figure 4. SAR imaging model

    During the motion, the clean SAR echoes can be expressed as follows:

    $$s_r(\tau,\eta)=A_0\,w_r\!\left(\tau-\frac{2R(\eta)}{c}\right)w_a\!\left(\eta-\eta_c\right)\exp\!\left(-j\frac{4\pi f_cR(\eta)}{c}\right)\exp\!\left(j\pi K_r\!\left(\tau-\frac{2R(\eta)}{c}\right)^{2}\right),\tag{15}$$

    where $\tau$ is the fast time, $\eta$ is the slow time, $A_0$ is the amplitude, $w_r$ and $w_a$ are rectangular window functions, $\eta_c$ is the Doppler center time, $R(\eta)$ is the range function, $f_c$ is the carrier frequency, and $K_r$ is the range FM rate. The range function can be expressed as follows:

    $$R(\eta)=\sqrt{R_0^2+\left(v\eta\right)^2},\tag{16}$$

    where $R_0$ is the closest distance between the radar and the target and $v$ is the radar speed. After range compression, the signal becomes:

    $$s_{rc}(\tau,\eta)=A_0\,p_r\!\left(\tau-\frac{2R(\eta)}{c}\right)w_a\!\left(\eta-\eta_c\right)\exp\!\left(-j\frac{4\pi f_cR_0}{c}\right)\exp\!\left(-j\pi\frac{2v^2}{\lambda R_0}\eta^2\right),\tag{17}$$

    where $p_r$ is a sinc-shaped pulse. After an azimuth Fourier transform, the signal becomes:

    $$S_{rc}(\tau,f_\eta)=A_0\,p_r\!\left(\tau-\frac{2R(f_\eta)}{c}\right)W_a\!\left(f_\eta-f_{\eta_c}\right)\exp\!\left(-j\frac{4\pi f_cR_0}{c}\right)\exp\!\left(-j\pi\frac{f_\eta^2}{K_a}\right),\tag{18}$$

    where $f_\eta$ is the azimuth frequency and $K_a=\frac{2v^2}{\lambda R_0}$ is the azimuth FM rate. After range cell migration correction, the signal becomes:

    $$S_{rm}(\tau,f_\eta)=A_0\,p_r\!\left(\tau-\frac{2R_0}{c}\right)W_a\!\left(f_\eta-f_{\eta_c}\right)\exp\!\left(-j\frac{4\pi f_cR_0}{c}\right)\exp\!\left(-j\pi\frac{f_\eta^2}{K_a}\right).\tag{19}$$

    After azimuth compression, the signal becomes:

    $$s_{ra}(\tau,\eta)=A_0\,p_r\!\left(\tau-\frac{2R_0}{c}\right)p_a(\eta)\exp\!\left(-j\frac{4\pi f_cR_0}{c}\right)\exp\!\left(j2\pi f_{\eta_c}\eta\right).\tag{20}$$

    Through the above processing the target is focused, and clean SAR images are obtained.
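
    As a small worked example of the step leading to Eq. (17), the sketch below performs range compression by frequency-domain matched filtering with the chirp replica of Eq. (15); it is a textbook sketch under the stated parameters, not the paper's imaging code.

```python
import numpy as np

def range_compress(echoes, kr, fs):
    """Matched-filter range compression: correlate each echo with the
    transmitted chirp exp(j*pi*Kr*t^2), yielding sinc-shaped pulses p_r."""
    n = echoes.shape[-1]
    t = (np.arange(n) - n / 2) / fs                  # centered fast-time axis
    replica = np.exp(1j * np.pi * kr * t**2)
    h = np.conj(np.fft.fft(replica))                 # matched filter in frequency
    return np.fft.ifft(np.fft.fft(echoes, axis=-1) * h, axis=-1)
```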

    2 Experiments

    In our experiments, the training data come from an airborne MiniSAR, with an image size of 512×512 and a total of 2 048 training images. During training, the maximum number of epochs is 100, the batch size is 4, the learning rate is 0.0002, the weight decay is 0.02, and the optimizer is AdamW; the loss curve and pixel accuracy over iterations are shown in Fig. 5, and a sketch of this configuration follows below. The interference parameters used in training are listed in Table 3. In the same-sensor experiments there are 512 testing images; the testing and training data come from different scenes, so they do not overlap. In the cross-sensor experiments there are likewise 512 testing images, captured over the Korean region in 2019 by the public Sentinel-1 satellite, so again there is no overlap with the training data. To validate the proposed method, experiments are conducted on both the MiniSAR and Sentinel-1 datasets: the resolution of MiniSAR is 0.1 m, while that of Sentinel-1 is 5 m × 20 m. All training data come from MiniSAR; the testing data come from MiniSAR and Sentinel-1. The radar parameters of the training and testing data are listed in Table 2, which shows that in the cross-sensor experiments the training and testing data come from different radars with different parameters; these experiments therefore verify generalization.
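
    A minimal training-loop sketch matching the reported hyper-parameters is shown below; `model`, `train_loader`, and the `dif_loss` from the sketch in Section 1.2.4 are assumed to be defined elsewhere.

```python
import torch

# Hyper-parameters as reported: AdamW, lr 0.0002, weight decay 0.02,
# batch size 4 (set in train_loader), at most 100 epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4, weight_decay=0.02)
for epoch in range(100):
    for spectrogram, rfi_mask in train_loader:       # 512x512 spectrograms
        optimizer.zero_grad()
        loss = dif_loss(model(spectrogram), rfi_mask)
        loss.backward()
        optimizer.step()
```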

    Figure 5. Iterative loss curve of the network

    • Table 3. Interference parameters in MiniSAR

      Parameters               Narrowband      Broadband
      Interference bandwidth   <30 MHz         30 MHz~150 MHz
      SIR                      -15 dB~0 dB     -15 dB~0 dB
      Interference sources     2               3~5
    • Table 2. Radar parameters of training and testing data

      Parameters          Training data   Same-sensor experiments   Cross-sensor experiments
      Source              MiniSAR         MiniSAR                   Sentinel-1A
      Band                X-band          X-band                    C-band
      Bandwidth           1.5 GHz         1.5 GHz                   100 MHz
      Polarization mode   HH              HH                        VV/VH

    2.1 Evaluation metrics

    To reasonably evaluate the test results, this paper adopts pixel accuracy (PA), intersection over union (IoU), PSNR [18], and ME [12] as evaluation metrics. PA and IoU evaluate DIFNet, while PSNR and ME evaluate image quality (a code sketch of all four metrics follows at the end of this subsection). PA is defined as follows:

    $$PA=\frac{TP+TN}{T_o},\tag{21}$$

    where true positives ($TP$) are the instances the model predicts as positive whose actual label is also positive, true negatives ($TN$) are the instances predicted as negative whose actual label is also negative, and $T_o$ is the total number of pixels. IoU is defined as follows:

    $$IoU=\frac{\left|A\cap B\right|}{\left|A\cup B\right|},\tag{22}$$

    where $A$ is the predicted interference area and $B$ is the actual interference area. PSNR is commonly used to evaluate image quality:

    $$PSNR\left(X,\hat{X}\right)=20\log_{10}\frac{\mathrm{Max}(X)}{\sqrt{MSE\left(X,\hat{X}\right)}},\quad MSE\left(X,\hat{X}\right)=\frac{1}{HW}\sum_{i=0}^{H-1}\sum_{j=0}^{W-1}\left(X(i,j)-\hat{X}(i,j)\right)^{2},\tag{23}$$

    where $\hat{X}$ is the filtered image, $X$ is the label, MSE is the mean squared error, and $H$ and $W$ are the pixel dimensions. PSNR reflects the noise level: the larger the PSNR, the better the filtering performance.

    ME is defined as follows:

    $$ME=\mathrm{Ent}\left(\hat{X}\right)\times\mathrm{Mean}\left(\hat{X}\right),\tag{24}$$

    where $\mathrm{Ent}(\hat{X})$ is the entropy and $\mathrm{Mean}(\hat{X})$ is the mean value. A smaller entropy indicates that the pixel values concentrate in a narrower range, and a smaller mean indicates a lower amplitude, suggesting that most of the interference has been filtered out. Therefore, a smaller ME indicates a better result.
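
    The four metrics can be computed as below, assuming binary masks for PA and IoU; the product form of ME in Eq. (24) and the histogram-based entropy are our reading of the definitions, not code from the paper.

```python
import numpy as np

def pixel_accuracy(pred, truth):                     # Eq. (21)
    return np.mean(pred == truth)

def iou(pred, truth):                                # Eq. (22)
    inter = np.logical_and(pred, truth).sum()
    return inter / np.logical_or(pred, truth).sum()

def psnr(x, x_hat):                                  # Eq. (23)
    mse = np.mean((np.abs(x) - np.abs(x_hat)) ** 2)
    return 20 * np.log10(np.abs(x).max() / np.sqrt(mse))

def me(x_hat, bins=256):                             # Eq. (24)
    p, _ = np.histogram(np.abs(x_hat), bins=bins)
    p = p / p.sum()
    ent = -np.sum(p[p > 0] * np.log2(p[p > 0]))      # image entropy Ent
    return ent * np.mean(np.abs(x_hat))              # Ent x Mean
```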

    2.2 Same-sensor experiments

    In the MiniSAR experiments, the interference parameters are listed in Table 3. The interference is divided into two types, narrowband and broadband: the bandwidth of narrowband interference is below 30 MHz, while that of broadband interference lies between 30 MHz and 150 MHz. The signal-to-interference ratio of both is -15~0 dB, with 2 narrowband interference sources and 3~5 broadband interference sources.

    The filtering results on the time-frequency spectrogram are shown in Fig. 6. Fig. 6(b) uses the constant false alarm rate (CFAR) method to filter out interference; some strong scattering points have high intensity and may be mistaken for interference, as marked by the red boxes. Comparing Fig. 6(b) and Fig. 6(d), the traditional notch filtering method mistakenly removes some strong scattering targets, whereas DIFNet preserves them well; among the compared methods, the image inpainting network PISNet achieves the best result. Fig. 7 shows the imaging results, with the horizontal axis denoting the azimuth direction and the vertical axis the range direction; the image size is 512×512. The interference covers the full range extent and spans roughly the 100th to the 400th pixel in azimuth. The proposed algorithm preserves more detail and produces a cleaner filtering result than the traditional notch filter. The evaluation results in Table 4 show that, compared with the traditional algorithm, the proposed method improves PA by 1.6%, IoU by 5.99%, and PSNR by 2.02 dB, and reduces ME by 0.05; PISNet still achieves the best result.

    Figure 6. Time-frequency spectrograms: (a) interfered time-frequency spectrogram in the same-sensor experiment; (b) spectrogram after notch filtering; (c) spectrogram after PISNet; (d) spectrogram after DIFNet

    Figure 7. Suppression results on MiniSAR: (a) interfered image; (b) label; (c) result of notch filtering; (d) result of PISNet; (e) result of DIFNet

    • Table 4. Same-sensor results

      Indicators   Interfered image   Notch filtering   PISNet   DIFNet
      PA           /                  94.95%            /        96.55%
      IoU          /                  74.41%            /        80.40%
      PSNR/dB      11.36              21.49             25.05    23.51
      ME           0.24               0.09              0.04     0.04

    2.3 Cross-sensor experiments

    The interfered dataset is obtained from Sentinel-1, captured over the Korean region on February 16, 2019; the image is cropped to 512×512, as shown in Fig. 9(a). The time-frequency spectrogram is shown in Fig. 8, and the filtering results and performance indicators are shown in Fig. 9 and Table 5. For this Sentinel-1 testing set, the training data still come from MiniSAR. In this cross-sensor experiment PISNet fails to work, so its results are not presented, while our method can still suppress the interference, demonstrating good generalization. Because the notch filter relies on intensity differences to detect interference, and the interference power is relatively low at the starting position of the interference in Fig. 8(b), the low-intensity interference goes undetected, as shown in the red boxes, leaving residual components. Comparing Fig. 8(b) and Fig. 8(c), the proposed method achieves a better filtering result; similarly, comparing Fig. 9(b) and Fig. 9(c), the traditional notch filter struggles to remove the low-intensity residual interference, while our method filters it out effectively. Compared with the traditional notch filter, our method improves PA by 1.89% and IoU by 2.60%, and reduces ME by 0.15.

    Figure 8. Time-frequency spectrograms in the cross-sensor experiment: (a) interfered time-frequency spectrogram; (b) notch filtering result; (c) DIFNet filtering result

    Figure 9. Cross-sensor results on Sentinel-1: (a) interfered image, ME=3.40; (b) notch filtering image, ME=2.34; (c) DIFNet filtering image, ME=2.19


    In the cross-sensor experiments, the training and testing data come from different radar platforms with different radar parameters. The above results show that the image inpainting network fails entirely, while our method still achieves excellent results, demonstrating its good generalization.

    3 Conclusions

    SAR is widely deployed as a high-resolution imaging radar, but it is susceptible to intentional and unintentional RFI. Image inpainting networks have achieved excellent results, yet their generalization is unclear. To address this problem, through time-frequency analysis of interference signals we find that interference holds domain-invariant features across different sensors, and we propose a SAR RFI suppression network based on these features. Compared with traditional notch filtering, the proposed method achieves better interference suppression performance. Furthermore, in the cross-sensor experiments, where the training and testing data come from different radars with different resolutions, the image inpainting networks fail while our method still achieves excellent results, demonstrating good performance and generalization. Finally, this method may inspire self-supervised learning: the segmented time-frequency spectrogram forms a masking task that self-supervised networks could repair.

    [4] Zhang H Y, Li F, Xu W M, et al. Improved differential synthetic aperture lidar for vibration suppression[J]. Journal of Infrared and Millimeter Waves.

    [5] Wang B N, Zhao J Y, Li W, et al. Research on high-resolution imaging technology of array laser synthetic aperture radar[J]. Journal of Radar.


    Category: Millimeter Wave and Terahertz Technology

    Received: Mar. 26, 2024

    Published Online: Dec. 13, 2024

    Author Email: FANG Fu-Ping (capkoven@mail.ustc.edu.cn)

    DOI: 10.11972/j.issn.1001-9014.2024.06.008