Journal of Electronic Science and Technology, Volume 22, Issue 2, 100248 (2024)

Machine learning algorithm partially reconfigured on FPGA for an image edge detection system

Gracieth Cavalcanti Batista1,2,*, Johnny Öberg1, Osamu Saotome2, Haroldo F. de Campos Velho3, Elcio Hideiti Shiguemori2,4, and Ingemar Söderquist1,5
Author Affiliations
  • 1Division of Electronic and Embedded Systems, KTH Royal Institute of Technology, Stockholm 164 40, Sweden
  • 2Electronic Engineering Division, Aeronautics Institute of Technology, São José dos Campos SP 12228-900, Brazil
  • 3Laboratory of Applied Computing and Mathematics, National Institute for Space Research, São José dos Campos SP 12227-900, Brazil
  • 4Department of C4ISR, Institute of Advanced Studies, São José dos Campos SP 12228-001, Brazil
  • 5Saab AB, Linköping 581 88, Sweden
    Figures & Tables(28)
    Flowchart of the procedure to estimate the UAV position with the proposed SVR technique.
    Edge and non-edge image patterns for the training phase input.
    Block diagram of the SVR prediction phase (standardized to the SVM classification phase).
    Details of all machines of the SVM classifier.
    Designed neuron for the proposed project.
    Designed circuit used to load the α values.
    Proposed SVM architecture (FSM with the pipelined datapath).
    FSM specification responsible for controlling the SVM classifier datapath.
    Stages and output variables of the FSM controller.
    Diagram of the accumulator counter block.
    Diagram of the machine counter block in Architecture N#1.
    Illustration of the reconfigurable region and the static area of the three architectures.
    Designed FIFOs used to load (a) SV and (b) Test values into the neurons.
    Designed shifter used to load the SV and Test values into the neurons in Architecture N#3.
    Diagram of the machine counter block in Architecture N#3.
    Design of the circuit that loads the SV and Test values into the neurons in Architecture N#9.
    Planned UAV trajectory marked with the red line.
    SVM classification results with different kernel functions.
    Neuron’s grain size: (a) reference block and (b) report on the neuron’s cell usage from Vivado Design Suite.
    Block diagrams of the proposed setup: (a) Option A where the user can choose between Architecture N#1 and Architecture N#3 and (b) Option B where the user can choose between Architecture N#1 and Architecture N#9.
    • Table 1. Overview of the representative related studies.

      Applications | Characteristics | Methods
      Edge detection using residual learning | A residual deep neural network based on the VGG-16 architecture with deep supervision is developed. | DCNN [29]
      Edge detection using two pyramid networks | A down-sampling pyramid network and a lightweight up-sampling pyramid network are constructed to enrich the multi-scale representation from the encoder and decoder, respectively. | Multi-stream learning approach [31]
      Real-time image filtering and edge detection | The image information is collected by the camera, Gaussian filtering is applied to remove noise, Sobel processing is then performed, and the image edge processing is finally realized. | Gaussian filtering and Sobel edge processing algorithms implemented on FPGA [35]
      Real-time image filtering and edge detection | Image filtering and edge detection are investigated and analyzed; a LUT is used instead of a multiplier, and a distributed algorithm is used in hardware. | Method based on FPGA [36]
      Integrated navigation systems | The proposed metaheuristic algorithms are reviewed and compared with GA and PSO algorithms. | Metaheuristic algorithms [8]
      Edge detection | A real-time data-driven fire propagator is used to support wildfire fighting operation and to facilitate the risk assessment and decision-making process. | Mono-dimensional noise-resistant algorithm [37]
      Edge detection | The acquisition, storage, and display of image data are completed by an FPGA-based image processing system, and the Sobel edge detection algorithm is implemented. | Sobel edge detection algorithm implemented on FPGA [39]
      Edge detection | The RFD mask used for edge detection is obtained by using various interpolation methods. The mask size is selected based on the figure of merit and edge preservation index. The edges obtained with the proposed approach in the FrFT domain are further used for image enhancement. | RFD in the FrFT domain [38]
      Processing of colored UAV images | A novel guiding equation is used to optimize the positions of the improved cuckoo algorithm before the Lévy flight. After the Lévy flight, a novel disturbance equation is applied to obtain a varied location for the next iteration. | Novel quaternion-based improved cuckoo algorithm [40]
      Edge extraction | This is the original strategy of applying image convolution from segmented images. | Sobel’s algorithm [41]
      Image convolution for UAV positioning estimation | In terms of image edge identification, Sobel’s and Canny’s algorithms are compared with MLP-NN. | Sobel’s, Canny’s, and MLP-NN algorithms, where the neural network is implemented on both CPU and FPGA [27]
    • Table 2. Description of Algorithm 1.

      Algorithm 1: SVR prediction phase
      Require: SV; Alpha; Bias; Sigma; Test
      for cont = 1:size(Test, 1) do
        for j = 1:size(SV, 1) do
          for i = 1:size(SV, 2) do
            if (i ≥ 1) && (i < size(SV, 2)) then
              aux = (SV(j, i) - Test(cont, i))²
              aux1 = (SV(j, i+1) - Test(cont, i+1))²
              SqDiff(j, i) = sqrt(aux + aux1)
            else
              aux = (SV(j, i) - Test(cont, i))²
              aux1 = (SV(j, 1) - Test(cont, 1))²
              SqDiff(j, i) = sqrt(aux + aux1)
            end if
            EXPin(j, i) = -SqDiff(j, i) / Sigma(i)
            EXPout(j, i) = exp(EXPin(j, i))
            AlphaMult(j, i) = Alpha(j) * EXPout(j, i)
          end for
        end for
        adderTree(cont, 1) = sum(AlphaMult)
        BiasSum(cont, 1) = adderTree(cont, 1) + Bias
      end for
      for i = 1:size(BiasSum, 1) do
        if BiasSum(i, 1) ≥ 0 then
          Class(i, 1) = 1
        else
          Class(i, 1) = 0
        end if
      end for
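
Algorithm 1 can be sketched in Python with NumPy as a software reference model. This is a minimal sketch, not the authors' implementation: the function name `svr_predict` is hypothetical, indices are 0-based instead of 1-based, `Sigma` is treated as a per-feature array because the algorithm indexes it as `Sigma(i)`, and `sum(AlphaMult)` is assumed to mean the sum over all entries of the matrix.

```python
import numpy as np

def svr_predict(sv, alpha, bias, sigma, test):
    """Software model of Algorithm 1 (hypothetical helper, 0-based indices)."""
    sv = np.asarray(sv, dtype=float)
    test = np.asarray(test, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n_sv, n_feat = sv.shape
    bias_sum = np.empty(test.shape[0])
    for cont in range(test.shape[0]):
        alpha_mult = np.empty((n_sv, n_feat))
        for j in range(n_sv):
            for i in range(n_feat):
                # Pair feature i with its right neighbour; the last
                # feature wraps around to the first, as in the else branch.
                nxt = (i + 1) % n_feat
                aux = (sv[j, i] - test[cont, i]) ** 2
                aux1 = (sv[j, nxt] - test[cont, nxt]) ** 2
                sq_diff = np.sqrt(aux + aux1)
                # Exponential kernel term scaled by the support-vector weight.
                alpha_mult[j, i] = alpha[j] * np.exp(-sq_diff / sigma[i])
        # Assumption: the adder tree sums every entry of AlphaMult.
        bias_sum[cont] = alpha_mult.sum() + bias
    # Sign decision: class 1 when the biased sum is non-negative.
    return (bias_sum >= 0).astype(int)
```

Each row of `test` would hold one flattened 3×3 frame, so `sv` and `test` have nine columns in the setting described by the tables.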
    • Table 3. Description of control signals and the corresponding functions.

      Signal | Function
      Load_SV | It loads FIFOs with the SV values.
      Load_Test | It loads FIFOs with the Test values.
      Clear_FIFOs | It is the command to clear all FIFOs.
      Load_Square | It loads the D-flip-flop registers of S2: Square difference.
      Load_AdderEXP | It loads the D-flip-flop registers of S3: Adder + EXP_function.
      Load_AlphaMult | It loads the D-flip-flop registers of S4: Alpha_Mult.
      Load_Accum | It loads the D-flip-flop registers of S5: Accumulator.
      Load_AdderSGN | It loads the D-flip-flop registers of S6: Adder_Bias + SGN.
      Clear_Accum | It clears the accumulator.
      Reset_ALL_Regs | It resets all datapath registers.
    • Table 4. Summarized features of the hardware implementation.

      Item | Feature
      Input data | 88 frames of 3×3 pixels
      Classification type | Frame by frame
      Kernel function | Gaussian (using the exponential function)
      Multi-class technique | One-vs-all
      Word size and type | 18-bit fixed-point
      Architectures | FSM + pipelined datapath
      Result | Binary
      Description language | VHDL
      Simulation and synthesis | Vivado 2019.1
      FPGA device | Xilinx ZYNQ-7 ZC702
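
The 18-bit fixed-point word size in Table 4 can be illustrated with a small quantization sketch. The table gives only the word size, so the signed two's-complement format, the split of 10 fractional bits, and the helper names `to_fixed` and `from_fixed` are all assumptions made for illustration.

```python
def to_fixed(x, word_bits=18, frac_bits=10):
    """Quantize x to a signed fixed-point integer, saturating on overflow.

    frac_bits=10 is a hypothetical choice; the source states only the
    18-bit word size.
    """
    scale = 1 << frac_bits
    lo = -(1 << (word_bits - 1))       # most negative representable code
    hi = (1 << (word_bits - 1)) - 1    # most positive representable code
    q = int(round(x * scale))
    return max(lo, min(hi, q))

def from_fixed(q, frac_bits=10):
    """Convert a fixed-point integer code back to a float."""
    return q / (1 << frac_bits)
```

Under these assumptions the resolution is 2^-10 and values outside roughly ±128 saturate, which is the kind of trade-off a designer would weigh when choosing the fractional width.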
    • Table 5. Comparison of the results from the SVM classifier processed in CPU, GPU, and both.

      Parameter | CPU | GPU | CPU + GPU
      Frequency | 5020 MHz | 420 MHz | 5372 MHz
      Processing time | 86.06 ms | 83.1 s | 64.7 μs
    • Table 6. Comparison of the results from Architecture N#1, Architecture N#3, and Architecture N#9 without and with DPR.

      Architecture | Feature | Without DPR | With DPR
      N#1 | Clock period | 100 ns | 50 ns
      N#1 | Latency | 0.19 s | 96 μs
      N#1 | LUTs (static area) | 969 | 86
      N#1 | Flip-flops (static area) | 535 | 111
      N#1 | LUTs (reconfigurable region) | / | 924
      N#1 | Flip-flops (reconfigurable region) | / | 169
      N#1 | Power consumption | 5 mW | 7 mW
      N#3 | Clock period | 50 ns | 120 ns
      N#3 | Latency | 32.10 μs | 77.04 μs
      N#3 | LUTs (static area) | 2865 | 88
      N#3 | Flip-flops (static area) | 637 | 110
      N#3 | LUTs (reconfigurable region) | / | 927
      N#3 | Flip-flops (reconfigurable region) | / | 273
      N#3 | Power consumption | 19 mW | 9 mW
      N#9 | Clock period | 50 ns | 100 ns
      N#9 | Latency | 10.95 μs | 21.60 μs
      N#9 | LUTs (static area) | 8328 | 138
      N#9 | Flip-flops (static area) | 792 | 80
      N#9 | LUTs (reconfigurable region) | / | 955
      N#9 | Flip-flops (reconfigurable region) | / | 275
      N#9 | Power consumption | 7 mW | 4 mW
    • Table 7. DPR information of Architecture N#1, Architecture N#3, and Architecture N#9.

      Architecture | Total DPR run time | Partial bitstream size
      N#1 | 74.10 μs | 875 KB
      N#3 | 26.00 μs | 913 KB
      N#9 | 7.38 μs | 831 KB
    • Table 8. Performance comparison of our proposed architecture with the one reported in Ref. [54].

      Features | Proposal in Ref. [54] | N#1 with DPR | N#9 without DPR
      Clock frequency | 50 MHz | 20 MHz | 20 MHz
      Latency | 1.231 ms | 2.79 s | 0.29 s
      Image size | 512×512 | 29127 times one frame of 3×3 | 29127 times one frame of 3×3
      FPGA device | Altera’s Cyclone IV E: EP4CE10F17C8 | Xilinx ZYNQ-7 ZC702 | Xilinx ZYNQ-7 ZC702
      Edge detection technique | Improved Canny algorithm | ML | ML
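
Some figures in Table 8 can be cross-checked with simple arithmetic. The sketch below uses hypothetical helper names and makes two assumptions the source does not state: that a 512×512 image is counted as non-overlapping 9-pixel frames by pixel count, and that whole-image latency is the per-frame latency multiplied by the number of frames.

```python
def frames_per_image(width, height, frame_side=3):
    """Count frame_side x frame_side frames by pixel count (rounded down)."""
    return (width * height) // (frame_side * frame_side)

def clock_mhz(period_ns):
    """Clock frequency in MHz for a clock period given in ns."""
    return 1000.0 / period_ns

def total_latency_s(n_frames, per_frame_latency_us):
    """Assumed model: total latency = frames x per-frame latency."""
    return n_frames * per_frame_latency_us * 1e-6

n = frames_per_image(512, 512)   # 29127 frames, as listed in Table 8
f = clock_mhz(50)                # 50 ns period gives 20.0 MHz
t = total_latency_s(n, 96)       # N#1 with DPR, 96 μs per frame
```

Under this model the N#1 with DPR case comes out at about 2.80 s, close to the 2.79 s reported in Table 8; the match is suggestive rather than a confirmation of how the authors computed the figure.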

    Gracieth Cavalcanti Batista, Johnny Öberg, Osamu Saotome, Haroldo F. de Campos Velho, Elcio Hideiti Shiguemori, Ingemar Söderquist. Machine learning algorithm partially reconfigured on FPGA for an image edge detection system[J]. Journal of Electronic Science and Technology, 2024, 22(2): 100248

    Paper Information

    Received: Aug. 15, 2023

    Accepted: Mar. 30, 2024

    Published Online: Aug. 8, 2024

    Corresponding author email: Gracieth Cavalcanti Batista (gracieth@kth.se)

    DOI:10.1016/j.jnlest.2024.100248
