Journal of Electronic Science and Technology, Volume 22, Issue 4, 100287 (2024)

Chinese named entity recognition with multi-network fusion of multi-scale lexical information

Yan Guo1, Hong-Chen Liu1, Fu-Jiang Liu2,*, Wei-Hua Lin2, Quan-Sen Shao1, and Jun-Shun Su3
Author Affiliations
  • 1School of Computer Science, China University of Geosciences, Wuhan, 430078, China
  • 2School of Geography and Information Engineering, China University of Geosciences, Wuhan, 430078, China
  • 3Xining Comprehensive Natural Resources Survey Centre, China Geological Survey (CGS), Xining, 810000, China
Figures & Tables (19)
    An example of nested entities.
    An example of an ambiguous entity word.
Comprehensive architecture of the BCWC model, consisting of four main layers: The embedding layer, feature extraction layer, feature fusion layer, and CRF layer. The embedding layer uses BERT for character embeddings and a word embedding model for word embeddings. In the feature extraction layer, colored dashed boxes mark the ranges of word sequences captured by convolution kernels of different scales; the outputs of these convolutions are concatenated to form the layer's output. The feature fusion layer then weights words with a multi-head attention mechanism and passes the result to the CRF layer for decoding (a hedged code sketch of this pipeline follows the figure list).
Multi-scale IDCNN process diagram. The input consists of word embedding vectors, processed by iterated dilated convolution blocks at two different scales. When the dilation rate is 1, a regular convolution kernel is used; when it is greater than 1, dilated convolution is employed. Each convolution block comprises two stacked layers. The results are concatenated, activated with the ReLU function, and normalized with LayerNorm; the outputs from the different scales are then concatenated to obtain the final output (see the same sketch after this list).
[Figure caption in Chinese in the original]
    Correctly identified named entities of the sample.
    Tokenization and multi-scale convolution kernel capturing process.
    Entity recognition under the same information capture window size.
    Results of different models on the three datasets.
    Quantitative analysis of different structures: Results of IDCNN (a) with different structures on various datasets and (b) with the same structure but different convolutional kernel sizes on various datasets. In the notation m×n, m represents the number of iterative layers and n represents the size of the DCNN blocks.
    Comparison of different structures on three datasets.
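The BCWC and multi-scale IDCNN captions above describe a concrete pipeline: BERT character embeddings plus word embeddings, multi-scale iterated dilated convolutions over the word sequence, multi-head attention fusion, and CRF decoding. Below is a minimal PyTorch sketch of that pipeline, assuming illustrative dimensions, kernel sizes, and a word sequence pre-aligned to the character sequence; `MultiScaleIDCNN` and `BCWCSketch` are hypothetical names, not the authors' code, and a CRF decoder (e.g., the pytorch-crf package) would consume the emitted tag scores.

```python
import torch
import torch.nn as nn

class MultiScaleIDCNN(nn.Module):
    """Iterated dilated convolutions over word embeddings at two kernel
    scales. Per the figure, each block stacks a regular convolution
    (dilation 1) and a dilated convolution (dilation 2); scale outputs
    are ReLU-activated, concatenated, and normalized with LayerNorm.
    The exact activation/normalization ordering here is illustrative."""
    def __init__(self, in_dim: int, hidden: int = 64, kernel_sizes=(3, 5)):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(
                # padding = dilation * (k - 1) / 2 keeps the sequence length fixed
                nn.Conv1d(in_dim, hidden, k, padding=k // 2, dilation=1),
                nn.ReLU(),
                nn.Conv1d(hidden, hidden, k, padding=(k // 2) * 2, dilation=2),
            )
            for k in kernel_sizes
        )
        self.norm = nn.LayerNorm(hidden * len(kernel_sizes))

    def forward(self, word_embs: torch.Tensor) -> torch.Tensor:
        x = word_embs.transpose(1, 2)                    # (B, dim, T) for Conv1d
        outs = [torch.relu(block(x)) for block in self.blocks]
        fused = torch.cat(outs, dim=1).transpose(1, 2)   # (B, T, hidden * scales)
        return self.norm(fused)

class BCWCSketch(nn.Module):
    """Character features (e.g., from BERT-base-Chinese) concatenated with
    multi-scale word features, fused by multi-head attention; the linear
    head emits per-tag scores that a CRF layer would decode."""
    def __init__(self, char_dim=768, word_dim=300, hidden=64, num_tags=21):
        super().__init__()
        self.word_encoder = MultiScaleIDCNN(word_dim, hidden)
        fused_dim = char_dim + 2 * hidden                # 768 + 128 = 896
        self.fusion = nn.MultiheadAttention(fused_dim, num_heads=8,
                                            batch_first=True)
        self.emit = nn.Linear(fused_dim, num_tags)

    def forward(self, char_feats, word_embs):
        # char_feats: (B, T, char_dim); word_embs: (B, T, word_dim),
        # assumed already aligned word-to-character.
        word_feats = self.word_encoder(word_embs)
        h = torch.cat([char_feats, word_feats], dim=-1)
        h, _ = self.fusion(h, h, h)                      # multi-head attention fusion
        return self.emit(h)                              # emission scores for the CRF
```

The number of attention heads (8) follows Table 1 below; all other sizes are placeholders.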
• Table 1. Common hyperparameter settings for the experiments.

Hyperparameter | Value
RNN dimension | 64
BERT learning rate | 2×10⁻⁵
BERT dropout rate | 0.35
RNN (CNN) learning rate | 1×10⁻³
Kernel size | 3
Boundary embedding dimension | 16
BERT model | BERT-base-Chinese
Epoch | 30
Optimizer | AdamW
Multi-head attention heads | 8
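The table lists separate learning rates for BERT and for the RNN/CNN layers. A minimal sketch of how such a split could be wired with AdamW parameter groups in PyTorch follows; the "bert" name-prefix convention is an assumption for illustration, not taken from the paper.

```python
import torch
from torch import nn

def build_optimizer(model: nn.Module) -> torch.optim.AdamW:
    # Assumes the BERT encoder's parameters are registered under a "bert"
    # attribute; everything else (RNN/CNN, fusion, CRF) gets the larger rate.
    bert_params = [p for n, p in model.named_parameters() if n.startswith("bert")]
    rest_params = [p for n, p in model.named_parameters() if not n.startswith("bert")]
    return torch.optim.AdamW([
        {"params": bert_params, "lr": 2e-5},   # BERT learning rate
        {"params": rest_params, "lr": 1e-3},   # RNN (CNN) learning rate
    ])
```

Fine-tuning the pretrained encoder with a rate two orders of magnitude smaller than the randomly initialized layers is a common practice this table is consistent with.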
• Table 2. Statistics of the datasets.

Dataset | Training | Evaluation | Test | Categories | Samples | Characters
CLUENER | 10.74K | 1.34K | 1.34K | 10 | 13.00K | 503K
Weibo | 1.35K | 0.27K | 0.27K | 8 | 1.89K | 103K
Youku | 8.00K | 1.00K | 1.00K | 9 | 10.00K | 170K
• Table 3. Hyperparameter settings for the experiments on the three datasets.

Dataset | max_seq_len | Batch size | max_word_len
CLUENER | 128 | 32 | 25
Weibo | 64 | 16 | 20
Youku | 128 | 16 | 20
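For illustration, these per-dataset settings could be carried in a small config mapping; `DATASET_CONFIG` and its field names are assumptions for the sketch, not the paper's code.

```python
# Per-dataset settings from Table 3 above; keys are illustrative.
DATASET_CONFIG = {
    "CLUENER": {"max_seq_len": 128, "batch_size": 32, "max_word_len": 25},
    "Weibo":   {"max_seq_len": 64,  "batch_size": 16, "max_word_len": 20},
    "Youku":   {"max_seq_len": 128, "batch_size": 16, "max_word_len": 20},
}
```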
• Table 4. Performance on CLUENER.

Model | Precision (%) | Recall (%) | F1-score (%)
BiLSTM-CRF (2020) [32] | 71.06 | 68.97 | 70.00
stkBiGRU-CRF (2021) | 73.41 | 70.02 | 71.68
TextCNN+CRF (2022) | 75.07 | 69.78 | 72.12
Lattice-LSTM (2018) [44] | 74.13 | 73.84 | 74.41
BERT-CRF (2019) | 76.47 | 77.01 | 76.73
BERT (2018) [32] | 77.24 | 80.46 | 78.82
ALBERT (2019) | 78.96 | 64.58 | 71.05
ALBERT-BiLSTM (2020) | 77.25 | 81.28 | 79.22
RoBERTa-WWM-BiLSTM-CRF (2021) | 78.81 | 80.82 | 79.80
DSpERT (2023) | 78.24 | 79.82 | 79.02
Ours | 80.29 | 79.43 | 79.86
• Table 5. Hardware and software environment for the experiments.

Environment | Value
Operating system | Windows 11
Processor | 12th Gen Intel(R) Core(TM) i5-12600KF
Random access memory | 32.0 GB
Graphics processing unit | NVIDIA GeForce RTX 3070 Ti (8 GB)
Python version | 3.8.17
PyTorch version | 2.0.0
• Table 6. Performance on Weibo.

Model | Precision (%) | Recall (%) | F1-score (%)
BiLSTM-CRF (2020) [11] | 60.80 | 52.90 | 56.58
Lattice-LSTM (2018) | 53.04 | 62.25 | 58.79
LR-CNN (2019) | 57.14 | 66.67 | 59.92
LGN (2019) | 56.44 | 64.52 | 60.21
BERT-CRF (2019) [11] | 67.12 | 66.88 | 67.00
FGN (2021) | 69.02 | 73.65 | 71.25
MTL-HWS (2023) | 73.03 | 73.21 | 73.12
DSpERT (2023) | 69.52 | 68.80 | 69.12
KCB-FLAT (2024) | 72.36 | 70.41 | 71.37
LkLi-CNER (2023) | 77.43 | 68.23 | 72.54
Ours | 71.96 | 75.32 | 73.60
• Table 7. Performance on Youku.

Model | Precision (%) | Recall (%) | F1-score (%)
BiLSTM-CRF (2020) | 80.31 | 79.22 | 79.76
BERT (2018) | 85.06 | 76.75 | 80.69
BERT-CRF (2019) [34] | 83.00 | 81.70 | 82.40
Lattice-LSTM (2018) | 84.43 | 81.28 | 82.82
BiLSTM+SSCNN-CRF (2023) | 87.00 | 85.10 | 86.10
DSpERT (2023) | 86.62 | 80.17 | 83.27
Ours | 87.41 | 86.97 | 87.19
• Table 8. F1-score results of the ablation study.

Model | BERT | CLUENER F1 (%) | Weibo F1 (%) | Youku F1 (%)
BCWC | + | 79.86 | 73.60 | 87.19
CM | + | 76.81 (↓3.05) | 69.19 (↓4.41) | 85.79 (↓1.40)
CM | − | 70.00 (↓9.86) | 56.58 (↓17.02) | 79.76 (↓7.43)
WM | + | 76.48 (↓3.38) | 68.75 (↓4.85) | 85.46 (↓1.73)
MSWM | + | 76.72 (↓3.14) | 69.75 (↓3.85) | 85.64 (↓1.55)
CM⊕MSWM⊕FCF | + | 77.63 (↓2.23) | 70.39 (↓3.21) | 86.12 (↓1.07)
CM⊕WM⊕MHAF | + | 79.51 (↓0.35) | 73.07 (↓0.53) | 86.82 (↓0.37)
Citation: Yan Guo, Hong-Chen Liu, Fu-Jiang Liu, Wei-Hua Lin, Quan-Sen Shao, Jun-Shun Su. Chinese named entity recognition with multi-network fusion of multi-scale lexical information[J]. Journal of Electronic Science and Technology, 2024, 22(4): 100287

    Paper Information

    Received: Jun. 23, 2024

    Accepted: Oct. 23, 2024

    Published Online: Jan. 23, 2025

Corresponding author: Fu-Jiang Liu (liufujiang@cug.edu.cn)

DOI: 10.1016/j.jnlest.2024.100287
