Laser & Optoelectronics Progress, Volume 57, Issue 8, 081021 (2020)

Convolutional Neural Network Based Indoor Microphone Array Sound Source Localization

Chen Jiao*, Tao Zhang, and Jianhong Sun
Author Affiliations
  • School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
    Figures & Tables(6)
    Space cluster classification for SSL
    Flow chart of localization method
    Comparison of classification accuracy of different algorithms in different environments
    • Table 1. CNN structure

      Network layer         | Network parameter
      Input layer           | Dimension: 112×2688
      Convolution layer     | Number of convolution kernels: 8; kernel_size: 5×5; stride: 1; pad: 0
      Pooling layer         | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
      Convolution layer     | Number of convolution kernels: 16; kernel_size: 5×5; stride: 1; pad: 0
      Pooling layer         | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
      Convolution layer     | Number of convolution kernels: 32; kernel_size: 5×5; stride: 1; pad: 0
      Pooling layer         | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
      Convolution layer     | Number of convolution kernels: 64; kernel_size: 5×5; stride: 1; pad: 0
      Pooling layer         | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
      Fully connected layer | Number of neurons: 1024; activation function: ReLU; dropout: 50%
      Output layer          | Activation function: softmax; learning rate: 0.001; iterations: 1000; batch_size: 64
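
The layer parameters in Table 1 determine the size of each feature map. The sketch below traces those sizes through the four convolution and pooling stages. It assumes the usual output-size formula floor((n + 2·pad − kernel) / stride) + 1 and treats the 112×2688 input as a single-channel height×width map; neither assumption is stated on this page, and the function names are hypothetical.

```python
def window_out(size, kernel, stride=1, pad=0):
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_shapes(height=112, width=2688):
    """Return (layer name, channels, height, width) after each stage of Table 1."""
    shapes = []
    for index, channels in enumerate((8, 16, 32, 64), start=1):
        # Convolution: 5x5 kernel, stride 1, pad 0 (values taken from Table 1).
        height, width = window_out(height, 5), window_out(width, 5)
        shapes.append((f"conv{index}", channels, height, width))
        # Max pooling: 3x3 window, stride 1, pad 0 (values taken from Table 1).
        height, width = window_out(height, 3), window_out(width, 3)
        shapes.append((f"pool{index}", channels, height, width))
    return shapes


for name, channels, height, width in trace_shapes():
    print(f"{name}: {channels} x {height} x {width}")

# Number of values that would be flattened into the 1024-neuron layer.
_, last_channels, last_height, last_width = trace_shapes()[-1]
print("flattened features:", last_channels * last_height * last_width)
```

Under these assumptions every stage removes only a few rows and columns, because both the convolution and the pooling windows move with stride 1. The final pooling output is therefore still close to the input size, and the fully connected layer receives a very large flattened vector.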
    • Table 2. Classification accuracy of different algorithms in different environments

      Signal      | Reverberation time /ms | Accuracy of TDOA /% | Accuracy of SVM /% | Accuracy of PNN /% | Accuracy of BP /% | Accuracy of CNN /%
      Clean voice | 0                      | 62.61               | 78.58              | 96.87              | 96.32             | 96.67
      Clean voice | 300                    | 61.64               | 75.28              | 96.30              | 95.53             | 95.03
      Clean voice | 600                    | 60.02               | 74.86              | 93.25              | 92.84             | 93.80
      SNR: 10 dB  | 0                      | 42.92               | 44.93              | 90.16              | 94.08             | 95.83
      SNR: 10 dB  | 300                    | 46.31               | 49.82              | 89.44              | 90.36             | 93.87
      SNR: 10 dB  | 600                    | 36.53               | 46.35              | 88.73              | 87.69             | 92.79
      SNR: 0 dB   | 0                      | 38.61               | 45.57              | 89.89              | 88.81             | 94.32
      SNR: 0 dB   | 300                    | 34.61               | 44.03              | 89.09              | 86.37             | 93.49
      SNR: 0 dB   | 300                    | 24.87               | 44.60              | 88.61              | 85.20             | 90.78
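
One way to read Table 2 is to average each algorithm's accuracy over the nine test conditions and to check which algorithm is most accurate in each condition. The sketch below does this with the values exactly as printed in the table, including the repeated 300 ms label in the last row. The names `ROWS`, `mean_accuracy`, and `best_per_condition` are hypothetical and are not taken from the paper.

```python
ALGORITHMS = ("TDOA", "SVM", "PNN", "BP", "CNN")

# (signal, reverberation time in ms, accuracies in %) copied from Table 2 as printed.
ROWS = [
    ("Clean voice", 0, (62.61, 78.58, 96.87, 96.32, 96.67)),
    ("Clean voice", 300, (61.64, 75.28, 96.30, 95.53, 95.03)),
    ("Clean voice", 600, (60.02, 74.86, 93.25, 92.84, 93.80)),
    ("SNR: 10 dB", 0, (42.92, 44.93, 90.16, 94.08, 95.83)),
    ("SNR: 10 dB", 300, (46.31, 49.82, 89.44, 90.36, 93.87)),
    ("SNR: 10 dB", 600, (36.53, 46.35, 88.73, 87.69, 92.79)),
    ("SNR: 0 dB", 0, (38.61, 45.57, 89.89, 88.81, 94.32)),
    ("SNR: 0 dB", 300, (34.61, 44.03, 89.09, 86.37, 93.49)),
    ("SNR: 0 dB", 300, (24.87, 44.60, 88.61, 85.20, 90.78)),  # label as printed
]


def mean_accuracy(rows):
    """Average accuracy per algorithm over all listed conditions."""
    columns = zip(*(accuracies for _, _, accuracies in rows))
    return {name: sum(col) / len(rows) for name, col in zip(ALGORITHMS, columns)}


def best_per_condition(rows):
    """Name of the most accurate algorithm in each row, in table order."""
    return [
        ALGORITHMS[max(range(len(accuracies)), key=accuracies.__getitem__)]
        for _, _, accuracies in rows
    ]


for name, value in mean_accuracy(ROWS).items():
    print(f"{name}: {value:.2f}%")
print(best_per_condition(ROWS))
```

On these figures the CNN has the highest mean accuracy and is the most accurate method in every noisy condition, while PNN is slightly ahead for clean voice at 0 ms and 300 ms reverberation time.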
    • Table 3. Real-time localization time of different algorithms

      Signal      | Reverberation time /ms | Iterations | Localization time /s | Iterations | Localization time of PNN /s | Iterations | Localization time of BP /s | Iterations | Localization time of CNN /s
      Clean voice | 0                      | 4496       | 7.4                  | 4671       | 6.8                         | 10504      | 10.3                       | 18723      | 11.2
      Clean voice | 300                    | 4215       | 7.8                  | 5387       | 7.1                         | 13661      | 9.8                        | 19025      | 10.8
      Clean voice | 600                    | 4615       | 8.1                  | 6062       | 6.1                         | 22572      | 10.3                       | 29781      | 10.9
      SNR: 10 dB  | 0                      | 4853       | 9.3                  | 5283       | 7.4                         | 21973      | 9.7                        | 33671      | 12.4
      SNR: 10 dB  | 300                    | 49.82      | 8.2                  | 6231       | 7.3                         | 34603      | 9.7                        | 42101      | 11.8
      SNR: 10 dB  | 600                    | 5430       | 9.1                  | 5735       | 7.1                         | 38524      | 10.9                       | 48776      | 11.2
      SNR: 0 dB   | 0                      | 4551       | 9.2                  | 5089       | 6.9                         | 35871      | 10.6                       | 54786      | 11.2
      SNR: 0 dB   | 300                    | 4876       | 9.4                  | 6086       | 8.3                         | 41654      | 10.8                       | 57623      | 10.5
      SNR: 0 dB   | 300                    | 5576       | 9.3                  | 6487       | 8.1                         | 47726      | 10.8                       | 58341      | 11.9
    Get Citation

    Chen Jiao, Tao Zhang, Jianhong Sun. Convolutional Neural Network Based Indoor Microphone Array Sound Source Localization[J]. Laser & Optoelectronics Progress, 2020, 57(8): 081021

    Paper Information

    Category: Image Processing

    Received: Aug. 29, 2019

    Accepted: Sep. 19, 2019

    Published Online: Apr. 3, 2020

    Author Email: Chen Jiao (jiaochen@tju.edu.cn)

    DOI:10.3788/LOP57.081021
