Spectroscopy and Spectral Analysis, Volume 45, Issue 4, 1008 (2025)

Traceability Analysis of Penicillin G Acylase Genus Classification Using High-Throughput Infrared Spectroscopy Based on the Weighted KNN

WANG Yan, ZHANG Pei-pei, and ZHAO Yu*
Author Affiliations
  • National Institutes for Food and Drug Control, Chemical Drug Identification Institute, Beijing 102629, China

    β-lactam antibiotics (BAs) are an important class of anti-infective drugs in clinical practice. Penicillin G acylase (PGA) is a key enzyme in the newer enzymatic process for producing BAs. PGAs from different bacterial origins differ in protein sequence and structure, thermal stability, and stereoselectivity; these differences lead to different catalytic activity and are crucial for antibiotic synthesis and production. Proteomics-based mass spectrometry can distinguish proteins at the peptide level, but it is complex to operate. Infrared (IR) spectroscopy can characterize the structure of high-molecular-weight proteins and offers a simple, powerful tool for rapidly characterizing PGAs. This study evaluated ultrafiltration and dried-film preparation methods for the pre-treatment of PGA samples. This pre-treatment purified the raw PGA solutions and removed matrix interference, and it also overcame the low concentration of the PGA solutions, enhancing the IR signal response. In addition, a high-throughput IR method was optimized and established to analyze 11 batches of PGAs from different sources. All IR spectra of the PGAs showed the classical amide absorption peaks in the characteristic region (1 700~1 500 cm⁻¹), while differential absorption peaks were still observed within the 1 200~750 cm⁻¹ fingerprint region. A traceability model was established by selecting the spectral bands with differential absorption peaks in the fingerprint region (830~795, 1 027~1 020, and 1 085~1 080 cm⁻¹). Building on the proteomics mass spectrometry analysis, a weighted k-nearest neighbor (KNN) algorithm was used to classify the PGAs. The 11 batches fell into two classes: PGA 1, PGA 3, PGA 7, and PGA 8 belonged to class Ⅰ and were identified as proteins fermented from E. coli, while the rest (PGA 2, PGA 4~6, and PGA 9~11) belonged to class Ⅱ and were produced from Achromobacter sp. CCM 4824. This result was consistent with the proteomics result and verified the applicability and feasibility of the established traceability model. Three batches of unknown PGAs were then analyzed with the traceability model as an external validation of its accuracy. Finally, the robustness of the model was further validated by examining the 11 batches of PGAs on different days. The results demonstrate that the high-throughput IR method based on the weighted KNN can rapidly trace PGAs from different bacterial origins. The method is simple, accurate, and robust. It provides a new detection tool for the structural characterization and protein classification of the catalytic enzymes used in the enzymatic production of BAs.
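    The classification step described in the abstract can be illustrated with a minimal sketch. It assumes each spectrum is reduced to one feature per selected band (here the mean absorbance, an assumption) and that neighbors vote with inverse-distance weights (also an assumption; the abstract does not state the weighting scheme, the value of k, or the preprocessing). The function names `band_features` and `weighted_knn_predict` are hypothetical, and the numbers in the usage lines are invented for illustration, not data from the study.

```python
import math
from collections import defaultdict

# Fingerprint-region bands (cm^-1) named in the abstract.
BANDS = [(795.0, 830.0), (1020.0, 1027.0), (1080.0, 1085.0)]


def band_features(wavenumbers, absorbances, bands=BANDS):
    """Reduce a spectrum to one feature per band: the mean absorbance in it.

    Using the band mean is an illustrative assumption, not the paper's stated choice.
    """
    features = []
    for low, high in bands:
        values = [a for w, a in zip(wavenumbers, absorbances) if low <= w <= high]
        if not values:
            raise ValueError(f"no spectral points in band {low}-{high} cm^-1")
        features.append(sum(values) / len(values))
    return features


def weighted_knn_predict(train_features, train_labels, query, k=3, eps=1e-12):
    """Distance-weighted KNN: each of the k nearest neighbors votes with weight 1/d.

    Inverse-distance weighting is assumed here; eps avoids division by zero.
    """
    neighbors = sorted(
        (math.dist(f, query), label)
        for f, label in zip(train_features, train_labels)
    )
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)


# Invented feature vectors, purely for illustration (not measurements).
train_features = [
    [0.10, 0.40, 0.30],
    [0.12, 0.42, 0.28],
    [0.30, 0.20, 0.50],
    [0.32, 0.18, 0.52],
]
train_labels = ["class I", "class I", "class II", "class II"]
predicted = weighted_knn_predict(train_features, train_labels, [0.11, 0.41, 0.29], k=3)
print(predicted)
```

    With inverse-distance weights, a single distant neighbor from the other class contributes little to the vote, which is the usual motivation for weighting over a plain majority vote when the number of reference batches is small.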

    WANG Yan, ZHANG Pei-pei, ZHAO Yu. Traceability Analysis of Penicillin G Acylase Genus Classification Using High-Throughput Infrared Spectroscopy Based on the Weighted KNN[J]. Spectroscopy and Spectral Analysis, 2025, 45(4): 1008

    Paper Information

    Received: Jan. 9, 2024

    Accepted: Apr. 24, 2025

    Published Online: Apr. 24, 2025

    The Author Email: ZHAO Yu (zhaoyu@nifdc.org.cn)

    DOI:10.3964/j.issn.1000-0593(2025)04-1008-07
