Surface-enhanced Raman spectroscopy (SERS), a label-free tool with great potential for metabolite detection, offers a strategy for rapid bacterial identification. However, it still lacks experimentally supported spectral interpretation at the metabolite level for complex biosamples. We present a rapid and cost-efficient SERS-based method for reliable profiling of bacterial intracellular metabolites using plasmonic colloids. A convolutional neural network model was constructed to classify eight types of bacteria with an overall accuracy as high as 90.44%, and the key spectral features for classification were identified by Shapley Additive Explanations. Molecule-level interpretation of the SERS metabolic profiles was further realized in combination with laser desorption/ionization mass spectrometry, evidencing the primary contribution of metabolites to the bacterial spectral signatures and molecule-level distinctions among different bacterial types. We provide insights into the mechanism of bacterial identification by label-free SERS and pave the way for interpretable SERS diagnostic tools for various diseases.
Bacterial infections constitute a substantial global public health threat, particularly in healthcare settings.1 Bacterial infection–related diseases have increased markedly since the COVID-19 pandemic, which has exacerbated the general decline in immunity.2 More than 10 million deaths per year have been reported to be associated with bacterial infections.3 Rapid and precise early diagnosis is therefore essential for rational intervention and treatment, especially given how quickly bacteria reproduce in bloodstream infections.4 Mass spectrometry is a widely adopted and commercialized clinical modality for bacterial diagnosis based on peptide mass fingerprinting. Quantitative polymerase chain reaction (qPCR) and next-generation sequencing (NGS) provide alternative identification approaches that significantly reduce culture time by targeting specific gene sequences. Despite this remarkable progress, these techniques depend heavily on substantial investment, rigorous experimental conditions, and well-trained operators.5,6 In addition, because diagnostic services are unevenly deployed, patients in less developed areas are less likely to benefit from advanced early diagnostic devices.7,8
Spectroscopy-based detection techniques such as Fourier transform infrared spectroscopy, optical photothermal infrared spectroscopy, and Raman spectroscopy measure light absorption or scattering by bacteria, uncovering unique biological fingerprints for bacterial phenotypic identification.9–13 Surface-enhanced Raman spectroscopy (SERS) has recently emerged as an ultra-sensitive optical approach for early clinical diagnosis, providing rapid, easy-to-operate, non-destructive vibrational molecular fingerprinting.14–17 In terms of equipment, SERS instruments have been progressively miniaturized and reduced in cost, improving accessibility in various regions including developing countries.18,19 Moreover, beyond peptide and gene sequences, bacterial species exhibit distinct metabolic preferences that are affected by both genetic and environmental factors.20 The correlation between bacterial metabolic phenotypes and the phylogenetic tree has been systematically investigated by untargeted metabolomics.21,22 These metabolic distinctions offer a scientific foundation for differentiating bacterial species.23 Accordingly, a growing number of studies have proposed label-free SERS for rapid bacterial identification, with signals arising mainly from bacterial components, either the bacteria themselves or the metabolites they release.24–27 However, current label-free SERS detection methods still commonly lack identification and detailed analysis of the spectral data from a molecular perspective, so diagnostic decisions based on SERS spectra remain poorly interpretable and less trustworthy. This also makes further metabolic pathway analysis and molecule-level understanding very difficult. In addition, substrates, laser wavelength, and bacterial culture processes may affect SERS measurements differently, resulting in poor experimental reproducibility. These limitations have hindered the practical application of SERS for bacterial identification.
Our very recent work introduced SERSome as a label-free technique using a group of SERS spectra to characterize metabolite profiles in complex biofluid samples, such as serum, urine, and cell culture media.28 This approach enables sensitive and reliable investigation of metabolic changes in biofluids by capturing rare events and low-abundance signals in the SERS measurement process more comprehensively, facilitating the diagnosis of a wide range of diseases through metabolite monitoring.29–31 In this work, we developed a simple and low-cost metabolite-level interpretable SERSome method specialized for bacterial identification with high rapidness (). Specifically, thermal lysis was performed for sample pretreatment, ensuring high efficiency in metabolite release and compatibility with the subsequent SERS-based metabolic profiling. Metallic colloids as effective SERS substrates were particularly applied for reliable profiling, reducing measurement costs under guaranteed reproducibility. This study investigated eight types of common bacterial species (Table S1 in the Supplementary Material), which have been reported to account for of clinical cases.32 We then established a convolutional neural network (CNN)-based diagnostic network and achieved a bacterial identification accuracy of over 90%. With interpretability analysis of the CNN model, we identified dominant features accounting for accurate classification. To further gain a molecular-level insight into the SERS spectra of different bacteria, laser desorption/ionization mass spectrometry (LDI-MS) was utilized to identify the contributing molecular species to the SERS spectral signatures, succeeding in ascertaining a metabolite panel to accurately decompose the bacterial SERS profiles for the specific metabolite contribution. Overall, our findings substantiate that SERS is capable of precise bacterial identification through the analysis of intracellular metabolite profiles. 
This work may provide molecule-level insights for future studies on bacterial identification by label-free SERS methods and promote the development of analytical and diagnostic techniques with high practical value.
2 Materials and Methods
2.1 Materials and Instrumentation
(Aladdin, 99.8%), (Aladdin, 98%), adenine (Macklin, 99.5%), cyclic-AMP (Grat, 99%), glycyl-L-phenylalanine (Adamas, 98%), indolepropionic acid (Adamas, 99%), L-tryptophan (Adamas, 99%), uracil (Aladdin, 98%), guanosine 5′-monophosphate (GMP) (Meryer, 98%), and coenzyme A (CoA) (Yuanye, 85%) were commercially purchased and used directly.
The extinction spectrum of Ag nanoparticles (NPs) was measured by a UV1900 ultraviolet-visible spectrophotometer (Aucybest, Shanghai, China). Transmission electron microscope (TEM) images were collected from a JEM-2100F TEM (JEOL, Tokyo, Japan). The zeta potential and hydrodynamic diameter of the Ag NPs were characterized by a Zetasizer Nano ZSP (Malvern, United Kingdom). Some image icons were created in BioRender. Frost, E. (2025) https://BioRender.com with permission.
2.2 Synthesis and Characterization of Ag NPs
The citrate Ag NPs were synthesized according to Lee and Meisel’s33 method with slight modifications. The precursor solutions were prepared by 18 mg (Aladdin, 99.8%) in 100 mL and 0.1 g (Aladdin, 98%) in 10 mL of . The Ag precursor solution was heated and stirred to boil at . When the solution started boiling, 2 mL solution was added continuously over 30 s. The mixture was kept boiling for another 60 min, with the solution gradually becoming yellow-green. The product was stored at 4°C away from light.
2.3 Bacteria Collection, Culture, and Sample Preparation
All bacterial strains were collected from Shanghai Children’s Medical Center, School of Medicine, Shanghai Jiao Tong University and stored at . Each bacterial strain was identified by 16S rRNA gene sequence (primer pairs: 5′-AGAGTTTGATCMTGGCTCAG-3′ and 5′-TACGGYTACCTTGTTACGACTT-3′). The cultivation of bacteria was carried out under strictly controlled conditions, using blood agar plates incubated at the temperature of 32°C for 12 h.
Bacterial cultures were first harvested from blood agar plates and resuspended in . The optical density at 600 nm () of the suspension was measured using a spectrophotometer. Subsequently, the suspension was diluted to achieve the desired values (e.g., 0.5 and 0.1). The certain concentration bacterial suspension was subjected to heat treatment at 95°C for 5 min. The heat treatment has a similar effect to ultrasonication and freeze-thawing while providing a more economical and time-efficient alternative. Following heat treatment, the suspension was centrifuged to pellet the bacterial cells. The resulting supernatant was carefully collected and used for subsequent analysis.
2.4 SERS Measurement
To prepare the samples, Ag NPs were concentrated by centrifugation at 5000g for 10 min to achieve a 10-fold concentration. Five-microliter lysate supernatant was mixed with of the concentrated Ag NPs, followed by the addition of 1 mol/L NaCl. The resulting mixture was then injected into a quartz capillary (internal diameter: 1 mm and external diameter: 2 mm) for analysis using a confocal Raman spectroscopy system (Horiba, XploRA INV, laser wavelength: 638 nm, laser power: 10.36 mW, and acquisition time: 1 s).
SERSome measurements for each bacterium were performed for 10 independent experiments at different times. In each experiment, 100 spectra were collected for each bacterium. Consequently, a total of 1000 spectra were obtained for each bacterium, resulting in a comprehensive dataset of 8000 spectra encompassing all bacterial types. The average of the 1000 spectra for each type of bacteria was calculated and used for display. The SERS spectra used for principal component analysis (PCA) were derived from the average of the 100 spectra collected in each experiment, resulting in 10 data points per bacteria type.
For the standard SERS spectra of pure metabolites, we used the same detection protocol as the sample: adenine (), cyclic-AMP (), glycyl-L-phenylalanine (), indolepropionic acid (), L-tryptophan (), uracil (), GMP (), and CoA ().
2.5 MS Measurement
To accurately test the metabolites that were actually bound to Ag NPs during SERS detection, the preparation of the lysate (three replications per bacterial strain) and Ag NPs was performed in the same manner as for the SERS analysis. After aggregation, the mixture was centrifuged at for 10 min. The supernatant was collected for further SERS analysis to confirm the complete adsorption of metabolites onto the Ag NPs. The pellet was then resuspended in and sonicated to re-suspend the particles. One-microliter aliquot of the suspension was drop cast on a mass spectrometry plate, and the metabolites were analyzed using a Bruker MALDI-FT spectrometer (SolariX 7.0T, 53 to , Nd:YAG lasers, positive model).
2.6 SERS Spectrum Processing
All spectra were processed following a standardized workflow: (1) cosmic ray removal, (2) smoothing, and (3) baseline correction. The spectral preprocessing was performed using MATLAB R2021a. Spectral smoothing was carried out using the Savitzky–Golay filter method. Baseline correction was performed using an adaptive iteratively reweighted penalized least squares (airPLS) algorithm. During implementation, lambda was set as , the order of the difference of penalties was 3, and the maximum iteration time was 25.
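The smoothing and baseline-correction steps above can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the MATLAB code used in this work: the Savitzky–Golay window length and polynomial order and the airPLS `lam` value are placeholders (they are not specified above), while the difference order of 3 and the maximum of 25 iterations follow the text. Cosmic ray removal is omitted, and the function names are hypothetical.

```python
import numpy as np
from scipy import sparse
from scipy.signal import savgol_filter
from scipy.sparse.linalg import spsolve


def whittaker_fit(y, weights, lam, order):
    """Weighted penalized least-squares fit used as the baseline estimate."""
    m = y.size
    # Finite-difference operator of the given order, shape (m - order, m).
    diff = sparse.csc_matrix(np.diff(np.eye(m), n=order, axis=0))
    w = sparse.diags(weights, 0, shape=(m, m), format="csc")
    system = (w + lam * (diff.T @ diff)).tocsc()
    return spsolve(system, weights * y)


def airpls(y, lam=1e8, order=3, max_iter=25):
    """airPLS baseline estimate; lam is an assumed placeholder value."""
    y = np.asarray(y, dtype=float)
    weights = np.ones(y.size)
    baseline = np.zeros(y.size)
    for i in range(1, max_iter + 1):
        baseline = whittaker_fit(y, weights, lam, order)
        d = y - baseline
        neg = d < 0
        neg_sum = np.abs(d[neg]).sum()
        # Stop once the signal lying below the baseline is negligible.
        if neg_sum <= 1e-3 * np.abs(y).sum():
            break
        # Points above the baseline (likely peaks) get zero weight; points
        # below it are weighted more strongly as the iterations proceed.
        weights[~neg] = 0.0
        weights[neg] = np.exp(i * np.abs(d[neg]) / neg_sum)
        weights[0] = weights[-1] = np.exp(i * d[neg].max() / neg_sum)
    return baseline


def preprocess(spectrum, window=11, polyorder=3, lam=1e8):
    """Savitzky-Golay smoothing followed by airPLS baseline subtraction."""
    smoothed = savgol_filter(spectrum, window_length=window, polyorder=polyorder)
    return smoothed - airpls(smoothed, lam=lam)


# Synthetic check: a sloping background plus one Gaussian band at index 200.
x = np.arange(400)
raw = 5.0 + 0.01 * x + 10.0 * np.exp(-((x - 200) ** 2) / (2 * 8.0 ** 2))
corrected = preprocess(raw)
```

In practice `lam` controls how stiff the baseline is; values that are too small let the baseline creep into broad bands, so it would need tuning against real spectra.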
2.7 Construction of Diagnostic CNN
The CNN model was implemented in PyTorch. The optimal architecture for bacterial diagnosis consists of two one-dimensional convolutional layers, each followed by rectified linear unit (ReLU) activation and a max-pooling layer. The first convolutional layer used 16 filters with a kernel size of 3, and the second convolutional layer employed 32 filters with the same kernel size. The output of the convolutional layers was flattened and passed through three fully connected layers, followed by a softmax output layer to predict the probabilities of each bacterial class. For optimization, the Adam optimizer was used with a learning rate of and a weight decay of to prevent overfitting. The learning rate was dynamically adjusted using a step-wise scheduler by a factor of 0.5 every 20 epochs. The maximum epoch was 100, and the training stopped at minimum validation loss. We further used Shapley Additive Explanations (SHAP) values to interpret the model’s predictions using the testing set. A background set of 100 random spectra from the training set was used.
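A minimal PyTorch sketch of an architecture matching the description in this section (two convolutional blocks, three fully connected layers, softmax output) is given below. Several details are assumptions made only for illustration because they are not stated above: the input length (`n_points=1000`), the convolution padding, the pooling size of 2, the hidden layer widths (128 and 64), and the learning rate and weight decay values. The class name `SpectraCNN` is hypothetical.

```python
import torch
from torch import nn


class SpectraCNN(nn.Module):
    """Two Conv1d + ReLU + MaxPool blocks followed by three linear layers."""

    def __init__(self, n_points=1000, n_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1),   # 16 filters, kernel 3
            nn.ReLU(),
            nn.MaxPool1d(2),                              # assumed pool size
            nn.Conv1d(16, 32, kernel_size=3, padding=1),  # 32 filters, kernel 3
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        flat = 32 * (n_points // 4)  # length is halved twice by pooling
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(flat, 128),    # assumed hidden width
            nn.ReLU(),
            nn.Linear(128, 64),      # assumed hidden width
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        # Returns logits; softmax is applied when probabilities are needed.
        return self.classifier(self.features(x))


model = SpectraCNN()
# Placeholder hyperparameters: the actual values are not given in the text.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# Halve the learning rate every 20 epochs, as described above.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)

batch = torch.randn(4, 1, 1000)             # 4 synthetic spectra, 1 channel
probs = torch.softmax(model(batch), dim=1)  # class probabilities per spectrum
```

Returning logits from `forward` keeps the model compatible with `nn.CrossEntropyLoss`, which applies the log-softmax internally during training.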
2.8 MS Data Processing
The raw mass spectrometry files (.d) from the Bruker mass spectrometer were converted to mzML format using MSconvert.34 The mass spectrometry data were then analyzed using the “MALDIquant v1.22.2” R package.35 The data underwent square root transformation followed by median value correction. Based on the characteristics of the data, the full width at half maximum was set to 8, and a signal-to-noise ratio (SNR) threshold of 7 was applied for further processing. The selected features were queried for metabolites using the “MSbox” R package, with the search conducted on the Human Metabolome Database (HMDB). The search mode was set to “positive” ion mode, with a mass error tolerance of 10 ppm (parts per million). Subsequently, metabolites were filtered via the “MetOrigin” website, identifying those specifically expressed in bacteria.36 The metabolite superclasses were determined based on the classifications on the HMDB website.
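The 10 ppm mass tolerance used in the database query can be illustrated with a short Python sketch. It does not reproduce the "MSbox" package; the function names are hypothetical, and the candidate ion m/z values and the protonated adduct form are assumptions chosen only to make the arithmetic concrete.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_features(observed, candidates, tol_ppm=10.0):
    """Return, for each observed m/z, the candidate ions within tol_ppm."""
    results = []
    for mz in observed:
        hits = [name for name, ref in candidates.items()
                if abs(ppm_error(mz, ref)) <= tol_ppm]
        results.append((mz, hits))
    return results


# Illustrative candidate ions (assumed protonated forms, not from this work).
candidates = {
    "adenine [M+H]+": 136.0618,
    "L-tryptophan [M+H]+": 205.0972,
}
# The first feature is about 5 ppm from adenine; the second is about
# 18.5 ppm from tryptophan and is therefore rejected.
matches = match_features([136.0625, 205.1010], candidates)
```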
2.9 SERS Spectrum Demultiplexing and Fitting
Based on the non-negativity of the metabolite spectral coefficients, we performed coefficient decomposition using least squares to minimize the Euclidean norm between the linear superposition of spectra of multiple substances and the sample spectrum. The similarity between the original spectrum and the fitted spectrum was assessed using cosine similarity. These operations were performed in MATLAB. The spectrum demultiplexing is formulated as
$$\hat{\mathbf{c}} = \arg\min_{\mathbf{c} \ge 0} \lVert \mathbf{M}\mathbf{c} - \mathbf{s} \rVert_2,$$
where $\mathbf{c}$ denotes the spectral fitting coefficients corresponding to the metabolites, $\mathbf{M}$ the metabolic SERS database, and $\mathbf{s}$ the bacterial SERS spectrum. The cosine similarity (cossim) is defined by
$$\mathrm{cossim}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert_2 \, \lVert \mathbf{b} \rVert_2},$$
where $\mathbf{a}$ is the reference spectrum and $\mathbf{b}$ is the fitted spectrum.
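The non-negative least-squares decomposition and the cosine-similarity check described above can be sketched in Python with `scipy.optimize.nnls`. This is an illustration on synthetic Gaussian bands rather than the MATLAB routine used in this work; the function names and the three-component reference library are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls


def demultiplex(library, spectrum):
    """Fit spectrum as a non-negative combination of the library columns."""
    coeffs, _residual_norm = nnls(library, spectrum)
    return coeffs, library @ coeffs


def cosine_similarity(a, b):
    """Cosine of the angle between two spectra treated as vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic reference library: one Gaussian band per hypothetical metabolite,
# stored column-wise with shape (n_points, n_metabolites).
axis = np.arange(300)
centers = (80, 150, 220)
library = np.stack(
    [np.exp(-((axis - c) ** 2) / (2 * 6.0 ** 2)) for c in centers], axis=1
)

true_coeffs = np.array([0.7, 0.0, 0.3])
sample = library @ true_coeffs  # noise-free mixture spectrum

coeffs, fitted = demultiplex(library, sample)
similarity = cosine_similarity(sample, fitted)
```

The non-negativity constraint matters because a negative coefficient would correspond to a metabolite subtracting signal, which has no physical meaning in this additive model.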
2.10 Data Analysis
To calculate the SNR of the SERS spectra, the spectral range between 699 and where presented the strongest peak in most spectra was selected as the peak region, and the mean intensity () was recorded. The spectral range between 1940 and with no characteristic peak was selected as the noise region where the mean intensity () and the standard deviation () were computed. Twenty spectra were selected randomly from the spectra set. The SNR was computed using the following equation:
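As a sketch, assume the common definition SNR = (mean peak intensity − mean noise intensity) / noise standard deviation. This matches the three quantities named above but is not confirmed by the text. The region bounds in the example are placeholders, since the exact limits are not given above, and the function name is hypothetical.

```python
import numpy as np


def spectral_snr(shifts, intensities, peak_range, noise_range):
    """SNR under the assumed definition (mean_peak - mean_noise) / std_noise."""
    peak_mask = (shifts >= peak_range[0]) & (shifts <= peak_range[1])
    noise_mask = (shifts >= noise_range[0]) & (shifts <= noise_range[1])
    peak_mean = intensities[peak_mask].mean()
    noise = intensities[noise_mask]
    return float((peak_mean - noise.mean()) / noise.std())


# Synthetic spectrum: background alternating between 0.9 and 1.1, with a
# flat band of height 5.0 in the assumed peak region.
shifts = np.arange(500, 2100)
intensities = 1.0 + 0.1 * np.where(np.arange(shifts.size) % 2 == 0, 1.0, -1.0)
intensities[(shifts >= 699) & (shifts <= 760)] = 5.0

# Placeholder bounds: (699, 760) and (1940, 1999) are illustrative only.
snr_value = spectral_snr(shifts, intensities, (699, 760), (1940, 1999))
```

Here the noise region has mean 1.0 and standard deviation 0.1, so the sketch yields an SNR of about 40.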
PCA was performed in R. For SERS, the average spectrum of the 216 to region and the fingerprint region (550 to ) from each independent experiment was used as input. For LDI-MS, all identified features were used as input. Differential analysis was performed using the “limma v3.50.3” package,37 and the data were log-transformed. The Spearman correlation of the MS data was computed in R.
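The per-experiment averaging that feeds the SERS PCA can be sketched as follows. The analysis in this work was done in R; this Python version using a singular value decomposition is only an illustration on random data, and the number of spectral points (50) and the function name are arbitrary assumptions.

```python
import numpy as np


def pca_scores(data, n_components=2):
    """PCA via SVD of the mean-centered data matrix (rows are samples)."""
    centered = data - data.mean(axis=0)
    u, s, _vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / (s ** 2).sum()
    return scores, explained[:n_components]


rng = np.random.default_rng(0)
# Synthetic stand-in: 8 bacteria x 10 experiments x 100 spectra x 50 points.
spectra = rng.normal(size=(8, 10, 100, 50))

# Average the 100 spectra of each experiment, giving 10 points per bacterium.
experiment_means = spectra.mean(axis=2).reshape(80, 50)
labels = np.repeat(np.arange(8), 10)  # bacterium index for each averaged row

scores, explained = pca_scores(experiment_means)
```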
3 Results
3.1 SERSome Measurement Workflow for Bacterial Intracellular Metabolites
To achieve rapid, simple, and reliable SERSome measurements, we established an optimized workflow encompassing sample preparation and detection to obtain comprehensive bacterial metabolite profiles as a foundation of accurate bacterium identification [Fig. 1(a)]. In detail, bacteria cultured on blood agar plates were harvested, washed, and resuspended in . The suspension was then heated at 95°C for 5 min to ensure complete bacterial inactivation and metabolite release, using a method widely adopted for its convenience and safety.38,39 As expected, heating notably altered bacterial morphology including a reduction in volume and turning black in color [Fig. S1(a) in the Supplementary Material]. The heated suspension was centrifuged to remove the bacterial debris, and the supernatant, enriched with metabolites, was collected. Then, the supernatant was mixed with plasmonic NPs in a 1:1 volumetric ratio followed by adding 1 mol/L NaCl solution to induce slight NP aggregation and generation of more electromagnetic hotspots for Raman enhancement.40–43 SERSome measurements were implemented using a quartz capillary (inner diameter: 1 mm) loaded with a mixture of bacterial metabolites and NPs [Fig. 1(a)]. Particularly, an aqueous colloidal NP system was selected for its reliability and uniformity in metabolic analysis.28 Citrate-reduced Ag NPs were selected as the colloidal substrate due to their high stability, scalability for mass production, and superior Raman enhancement capability.44 Characterization revealed an extinction peak at 419 nm [Fig. 1(b)], a zeta potential of , and a hydrodynamic diameter of 95 nm () for Ag NPs [Figs. S1(b) and S1(c) in the Supplementary Material].
Figure 1.SERSome acquisition of bacteria lysate. (a) Schematic workflow including thermal lysis of bacteria, removal of debris, addition of Ag NPs and salt ions, injection of the mixture into a capillary tube, and detection using Raman spectroscopy. (b) Extinction spectrum and TEM image of Ag NPs. (c) Heatmaps of SERSomes and (d) the typical single SERS spectra from S. aureus at various concentrations. All spectra were normalized. (e) Spectral SNRs of S. aureus at different concentrations () calculated using the Raman band indicated by yellow in panel (d). values were calculated using the Wilcoxon rank-sum test.
Because heat-induced bacterial inactivation releases not only metabolites but also macromolecules such as proteins and nucleic acids, we explicitly examined the metabolite detection capability of our technique by comparing SERSome measurements with and without ultrafiltration, which effectively removes macromolecules. Here, we used Staphylococcus aureus (S. aureus, SAU) and Escherichia coli (E. coli, ECO) as representative Gram-positive (Gram+) and Gram-negative (Gram−) bacterial species, respectively. The spectra from filtered and unfiltered samples were nearly identical, suggesting that intracellular metabolites are the primary contributors to the spectral signatures in the SERSome of bacteria [Fig. S1(d) in the Supplementary Material].
To enhance the efficiency of our workflow, we investigated the optimal bacterial concentration. Bacterial concentration was determined by measuring optical density (OD) at 600 nm. We compared SERS signals from S. aureus lysates at cell concentrations of 0.5, 0.1, and 0.02 OD.45 One hundred spectra were collected for each sample. As illustrated in Figs. 1(c) and 1(d) and Figs. S2(a) and S2(b) in the Supplementary Material, the suspensions at the two higher concentrations (0.5 and 0.1 OD) yielded stable SERS signals, probably owing to competitive interactions of high-concentration metabolites on the surface of Ag NPs,46 whereas the signals at 0.02 OD fluctuated significantly. For quantitative comparison, 20 spectra were randomly selected from each spectral set to calculate the SNR at the characteristic peak highlighted in yellow in Fig. 1(d) (more details in Sec. 2). The averaged SNRs at 0.5 and 0.1 OD were close to each other and both above 3, whereas the SNR at 0.02 OD was not [Fig. 1(e), Fig. S2(c) in the Supplementary Material]. Therefore, in the subsequent study, we chose a concentration of 0.1 OD, equivalent to per milliliter, and the sample volume required for our SERS measurements was merely 3 to , which indicates that the total bacterial amount needed for our experiments is remarkably low. In addition, given the high consistency across different acquisitions of one sample, a single spectrum was finally acquired for each sample to further accelerate the measurement process.
3.2 SERS Profiling of Eight Common Types of Clinical Pathogenic Bacteria
We collected eight types of bacterial species including A. baumannii (ABA), E. coli (ECO), K. pneumoniae (KPN), P. aeruginosa (PAE), S. aureus (SAU), E. faecium (EFM), E. faecalis (EFS), and S. capitis (SCA) from Shanghai Children’s Medical Center (School of Medicine, Shanghai Jiao Tong University), all of which are typical pathogens associated with bacterial infections. For each species, 10 independent experiments were conducted on separate days, and 100 SERS spectra were acquired for each experiment, resulting in a total of 8000 spectra (detailed in Sec. 2 and Fig. S3 in the Supplementary Material). Figure 2(a) depicted the average SERS spectra shaded with standard deviations of all bacteria. By applying PCA for dimensionality reduction, it can be found that Gram+ (i.e., SAU, EFM, EFS, and SCA) and Gram− bacteria (i.e., ABA, ECO, KPN, and PAE) exhibit distinct spectral distributions, suggesting that SERS could possibly capture underlying biological information [Fig. 2(b)]. PCA was further performed to visualize the distribution of eight bacterial species. Although each species exhibited relatively tight clustering, some overlap was observed among species [Fig. 2(c)]. This suggests that a more sophisticated algorithm is necessary for accurate bacterial identification. It can be noticed that the peak intensity of Gram+ and Gram− bacteria at exhibited distinct characteristics with the relatively higher intensity for the former [Figs. 2(d) and 2(e)]. From the PC loadings, we can also see that the 216 to region has important contributions to the top four PCs (Fig. S4 in the Supplementary Material). Notably, this peak typically corresponds to the Raman signal of the Ag-Cl bond,47 resulting from the addition of ions during the aggregation generation process. A lower intensity of the Ag-Cl peak potentially implies more metabolites on the surface of Ag NPs [Fig. 2(f)]. 
Gram+ bacteria, due to their cell structure, possess a thicker peptidoglycan layer in their cell walls, which may enhance their heat resistance and impede the release of metabolites.48 Moreover, lysates from lower-concentration bacterial samples containing less metabolites exhibit a stronger peak at [Fig. 2(g)]. A similar result was found using the metabolites of CoA and adenine [Fig. 2(h)], which manifest stronger competitive effects compared with .49 Overall, SERS spectra managed to reflect variations in metabolite concentration and composition in the bacterial lysates.
Figure 2.Analysis of SERS spectra of eight bacteria. (a) Average SERS spectra of eight bacteria. The 200 to region was scaled across bacterial species. The fingerprint region was scaled to [0,1]. The shaded area represents the standard deviation (). PCA plot of the spectra of the eight bacteria: (b) points are colored blue for Gram− and red for Gram+ bacteria with the confidence intervals indicating their distributions; (c) points are colored by eight types of bacteria instead with the confidence intervals indicating the distinctive types. (d) Average spectra in the range of 200 to . (e) Box plots for the intensities at of the eight bacteria. For the data in panels (b), (c), and (e), each point indicates the mean value from an independent test (). The color used for each bacterium in panels (c) and (d) can be referred to in panel (e). (f) Schematic illustration of the ligand replacement on the surface of Ag NPs after the addition of NaCl. ions occupy the surface of Ag NPs (left) and metabolites with strong competitiveness displace the ions (right). (g) Spectra (200 to ) for two bacteria (ECO and SAU) and (h) two metabolites (CoA and adenine) at different concentrations.
3.3 Interpretable CNN for Accurate Bacterial Identification
CNNs have been employed for robust noise resistance and classification of SERS spectra by extracting latent features from the raw data.50–53 For accurate bacterial identification, we constructed a CNN with two convolutional layers each followed by ReLU activation and a max-pooling layer [Fig. 3(a)]. The high-level features learned by the convolutional layers were then fed into a stack of four fully connected layers and a softmax layer to output the final prediction on the bacteria. SERS data collected from the eight bacterial species were used to train our CNN (60% for training, 20% for validation, and 20% for testing). Our network achieved an overall diagnostic accuracy of 90.44%, and an area under the curve (AUC) of 0.945 [Fig. S5(a) in the Supplementary Material]. All Gram− bacteria can be precisely recognized with an accuracy of above 95%. For Gram+ bacteria, some misclassifications occurred due to the similarity among SAU, EFM, and EFS [Fig. 3(b)]. Then, we used SHAP to identify and explain the key features primarily contributing to the high performance of the CNN54,55 [Fig. 3(a)]. SHAP is a unified framework for interpreting black-box models and understanding predictions. A positive SHAP value indicates that the corresponding feature contributes to an increase in the likelihood of the predicted outcome. The validation set was used to calculate the SHAP value, according to which, we present heatmaps to show the importance of different wavelengths. These heatmaps are then overlaid onto an average spectrum for visualization, and brighter features are more important for bacterial classification [Fig. 3(c)]. The top 30 features for all bacteria were reported in Figs. S5(b) and S5(c) in the Supplementary Material. 
We explicitly listed the top five most critical peaks by grouping these features in Table 1; according to the literature, they are strongly associated with metabolites containing heterocyclic rings, such as nucleotides (i.e., adenine, guanine, and uracil) and tryptophan56–67 (Table 2). Previous studies by Premasiri et al.25 have also indicated that these nucleotides may contribute substantially to the SERS spectra of bacteria. Tryptophan is an active metabolite in bacterial metabolism and has a large scattering cross-section.68
Figure 3.Bacteria identification using interpretable CNN. (a) Architecture of CNN for bacterial classification. SHAP is used for model interpretation. (b) Confusion matrix on the testing set. (c) SHAP values of eight bacteria species. For clarity, an average spectrum overlaid with SHAP values is also shown at the top.
Table 2. Assignments to some typical SERS bands for important characteristic peaks.
Peak (cm⁻¹) | Vibration mode | Assignment
728 | Ring breathing | Nucleotide and adenine45–47
1261 | H bending and ring stretching | Adenosine48
1025 | CH3 wagging and NH2 rocking | Pyridine49,50
1499 | N═C─C═N51 | -
1142 | C─H52 and N─H bending50 | Adenine50
1671 | C─C stretching53 and C═O stretching50 | Guanine50
765 | Ring breathing50 | Nucleotide, uracil, and tryptophan50
1329 | bending55 and rocking56 | Purine50
3.4 Identification of Metabolites Adsorbed onto Ag NPs by LDI-MS
Despite the preliminary insights from CNN, the metabolic implications of these peaks require more definitive verifications to convince speculations from the literature. LDI-MS, a soft ionization technique with minimal fragmentation, effectively preserves molecular ion information, thereby enabling accurate identification of metabolites in complex biological samples.69 Ag NPs employed as SERS substrates are also widely applied in NP-assisted LDI-MS due to their functions in molecule enrichment and enhancement in desorption and ionization efficiency,70,71 which provides a crucial cross-validation strategy for identifying the molecular origins of the SERS signal. Via the same process in SERS measurements, Ag NPs were incubated with bacterial lysate to adsorb metabolites. The mixture was then centrifuged, and the obtained Ag NPs pellet was resuspended in . The metabolite-adsorbed Ag NPs were subsequently profiled by LDI-MS [Fig. 4(a)]. Before LDI-MS measurement, a control experiment was conducted to validate sufficient metabolite adsorption on the surface of Ag NPs. Following the acquisition of bacterial spectra from the lysate (the first measurement), the Ag NPs were removed via centrifugation. The resulting supernatant was then mixed with fresh Ag NPs, and the spectra were subsequently recorded (the second measurement) [Fig. S6(a) in the Supplementary Material]. The dramatic reduction of SERS signals of ECO and SAU from the second measurement compared with the first measurement indicates the strong affinity of metabolites to NPs [Fig. S6(b) in the Supplementary Material]. Certainly, centrifugation effectively enriched the Ag NP–metabolite complex, which is crucial for ensuring the reliability of subsequent mass spectrometry analysis.
Figure 4.Metabolite identification by Ag NP-assisted LDI-MS. (a) Workflow of Ag NP-assisted LDI-MS. The Ag NP-adsorbed metabolites were separated and dropped on the LDI plate and measured by LDI-MS. (b) Spearman’s correlations among samples (). (c) Different features in Gram+ and Gram− bacteria. (d) PCA plot based on the LDI-MS features ( for Gram+ and Gram−). (e) Distribution of metabolite superclasses and the total number of metabolites identified from HMDB (top) in each bacterium. (f) Boxplot of MS intensities for Gram+ and Gram− bacteria in different superclasses. All intensities are standardized according to the median. The correlation of the intensity at with the (g) total MS intensity of each bacterium and (h) MS intensities of seven metabolite superclasses. (i) Detailed correlation with (I) organoheterocyclic compounds, (II) benzenoids, and (III) nucleosides, nucleotides, and analogs. The point color for each bacterium can be referred to panel (g). (j) Original mass spectra (174.010 to ) (left) and corresponding replicate intensities at (right) in each bacterium.
Each bacterial species was tested in triplicate, generating over data points per sample. Given the data background, a typical SNR threshold of 7 was used,35 resulting in each sample yielding over signals. Spearman correlation analysis was used for the quality assessment of the MS experiment. The results showed that the three replicates of each bacterium exhibited great consistency, and there was a large difference in metabolite profiles between Gram+ and Gram− bacteria [Fig. 4(b)]. A comparison of differential features revealed a greater number of metabolites with higher intensities in Gram− than in Gram+ bacteria [Fig. 4(c)]. We also used PCA to observe the data distribution of the MS results [Fig. 4(d) and Fig. S6(c) in the Supplementary Material]. Similar to the SERS results, we found that Gram− and Gram+ bacteria were clearly distinguished. The tighter clustering of Gram+ bacteria in the PCA [Figs. 4(b) and 4(d)] reinforces the findings from SERS spectra [Figs. 2(a) and 3(b)], all of which point to greater similarity within the Gram+ group and, consequently, more identification mistakes. These distinct LDI-MS profiles consolidated the ability of SERS to differentiate bacterial species. We further aligned the features with metabolites and classified them into seven superclasses and others according to the chemical functional ontology in HMDB.72 Obviously, these metabolic superclasses of Gram+ and Gram- bacteria demonstrated different distributions [Fig. 4(e), Fig. S6(d) in the Supplementary Material]. Subsequently, the entire normalized intensity for each superclass was calculated. Gram− bacteria exhibited consistently higher intensities, indicating a greater release of metabolites compared with Gram+ bacteria [Fig. 4(f)]. This observation aligns with our hypothesis that Gram− bacteria were more susceptible to lysis. The increased metabolic release ultimately contributes to the intensity attenuation of the Raman peak at [Fig. 4(g)], though the . 
We then calculated the correlation between the intensities of the seven superclasses and the Ag-Cl bond intensity [Fig. 4(h)]. The results showed that three superclasses of metabolites, namely (I) organoheterocyclic compounds, (II) benzenoids, and (III) nucleosides, nucleotides, and analogs, are significantly correlated with the peak decrease at (, ), detailed in Fig. 4(i). The significant role of these three metabolite superclasses is probably due to their ability to interact with heavy metal–generated molecule NPs. These interactions may involve coordinate bond formation between nucleotide metabolites or heterocyclic compounds and the NPs, as well as π–metal interactions with benzene derivatives.73–76 As an example, we present the raw mass spectra of adenine, specifically the peaks, from eight different bacteria [Fig. 4(j)]. Gram− bacteria exhibited higher adenine abundance, particularly for PAE, which displayed the lowest SERS peak intensity at [see Fig. 2(e)]. This observation is consistent with previous reports of adenine’s stronger competitive binding than .49
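The Spearman correlation used above for quality assessment can be illustrated with a minimal Python sketch. It assumes the coefficient is computed as the Pearson correlation of average ranks; the function names and the intensity values are hypothetical and are not taken from the measured data. The same function could in principle be applied to comparisons of peak intensity against MS intensity, although the text does not state which coefficient was used for those.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: a peak intensity that falls as total MS intensity rises.
peak_intensity = [5.0, 4.0, 3.5, 2.0]
total_ms_intensity = [1.0, 2.0, 3.0, 9.0]
rho = spearman(peak_intensity, total_ms_intensity)
```

Because only ranks enter the calculation, the coefficient is insensitive to the absolute intensity scale, which suits comparisons between replicates with different overall signal levels.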
3.5 Metabolite-Level Interpretation of the SERS Metabolic Profiles
Based on the metabolites on the Ag NP surface provided by LDI-MS, we then investigated the contribution of each metabolite to the characteristic spectra and evaluated the efficiency of LDI-MS for a deeper understanding of the SERS-based bacterial identification method. We hypothesized that the SERS spectrum of the bacterial lysate could be formulated as a linear combination of individual SERS spectra of multiple metabolites adsorbed onto the NP surface [Fig. 5(a)]. First, we established the SERS panel of pure metabolites using the following screening criteria: (1) metabolites present in more than four bacterial species were included; (2) metabolites with small scattering cross-sections and hence weak SERS signals, such as long-chain fatty acids, were not considered in this study; (3) the metabolites selected for this panel were primarily nucleotides, benzenoids, and organoheterocyclic compounds, based on a literature search of the important Raman shift features (see Table 2) and the related metabolite superclasses identified by LDI-MS [see Fig. 4(h)]. In addition, we employed a common substitution for metabolites with similar structures. For instance, tryptophan was used to replace alanyltryptophan, tryptophyl-tryptophan, and 5-hydroxy-L-tryptophan (Table S2 in the Supplementary Material). Ultimately, our SERS panel containing eight metabolites was used to reconstruct the bacterial SERS spectra [Fig. S7(a) in the Supplementary Material]. Specifically, the non-negative least squares (NNLS) algorithm was utilized to minimize the error between the linear superposition of spectra (fitting spectrum) and the original spectrum, incorporating an additional non-negative coefficient constraint [Fig. 5(a)]. 
The resulting coefficient of each metabolite in the panel describes its relative contribution to the mixed spectrum rather than its absolute concentration, given the large variations in NP enhancement capability and the competitive adsorption among different metabolites on the NP surface.
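As an illustration of the decomposition step, the following Python sketch solves a small non-negative least squares problem by projected gradient descent. This simple solver stands in for a library NNLS routine and is not necessarily the algorithm used in this study; the three-component panel, the five-channel spectra, and all function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def nnls_projected_gradient(panel, mixture, n_iter=500):
    """Minimize 0.5 * ||sum_k c_k * panel[k] - mixture||^2 subject to c_k >= 0.

    panel   : list of reference spectra (one list of intensities per metabolite)
    mixture : measured spectrum on the same channels
    """
    n_comp = len(panel)
    # Gram matrix and right-hand side of the normal equations.
    gram = [[dot(panel[i], panel[j]) for j in range(n_comp)] for i in range(n_comp)]
    rhs = [dot(panel[i], mixture) for i in range(n_comp)]
    # The trace of the (positive semidefinite) Gram matrix bounds its largest
    # eigenvalue from above, so 1 / trace is a safe gradient step size.
    step = 1.0 / sum(gram[i][i] for i in range(n_comp))
    coeffs = [0.0] * n_comp
    for _ in range(n_iter):
        grad = [dot(gram[i], coeffs) - rhs[i] for i in range(n_comp)]
        # Gradient step followed by projection onto the non-negative orthant.
        coeffs = [max(0.0, c - step * g) for c, g in zip(coeffs, grad)]
    return coeffs


def fitted_spectrum(panel, coeffs):
    """Linear superposition of the panel spectra weighted by the coefficients."""
    n_channels = len(panel[0])
    return [sum(c * spec[i] for c, spec in zip(coeffs, panel)) for i in range(n_channels)]


# Hypothetical panel of three pure-metabolite spectra on five channels.
panel = [
    [1.0, 0.0, 0.0, 0.2, 0.0],
    [0.0, 1.0, 0.0, 0.2, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.3],
]
# Synthetic mixture built as 2.0 * panel[0] + 0.5 * panel[2].
mixture = [2.0, 0.0, 0.5, 0.4, 0.15]
coeffs = nnls_projected_gradient(panel, mixture)
fit = fitted_spectrum(panel, coeffs)
```

In this toy case the recovered coefficients approach 2.0, 0.0, and 0.5, reproducing the weights used to build the mixture. With real spectra the fit is only approximate, and the coefficients reflect relative spectral contributions rather than concentrations.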
Figure 5. Metabolite-level interpretation of bacterial SERS spectra. (a) Schematic workflow for spectral decomposition. The metabolite panel was established based on the results of LDI-MS. Coefficients were derived by spectral decomposition based on NNLS. (b) Representative original bacterial spectra (black) and fitting spectra (blue) of the eight bacteria. (c) Box plots based on cosine similarity between the original spectrum and the fitting spectrum. Confusion matrices of (d) random forest and (e) support vector machine for bacterial identification based on coefficients from spectral decomposition.
Comparing the fitting and original spectra, we found that the majority of bacterial species can be fitted with high quality, whereas ABA and PAE showed discrepancies at some critical peaks [Fig. 5(b)]. Quantitatively, the cosine similarity in Fig. 5(c) also demonstrated the strong reconstruction performance based on our database for most bacteria (average similarity ). However, ABA (0.879) and PAE (0.852) had lower similarity scores compared with the other bacteria. Similar to our findings in Fig. 2(e), ABA and PAE exhibited weaker Ag-Cl peak intensities, suggesting a higher degree of metabolite adsorption on the NP surface. Furthermore, LDI-MS results showed a greater variety of metabolites adsorbed on Ag NPs in these two bacterial species. This also suggests that fitting the SERS spectra of ABA and PAE more accurately would require a larger metabolite panel. Taken together, these results indicate that interpreting the SERS spectra at the molecular level using LDI-MS is a reliable approach.
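The cosine similarity used to score the reconstruction can be sketched in a few lines of Python. The spectra below are hypothetical placeholders, and the function name is an assumption made for illustration.

```python
import math


def cosine_similarity(original, fitted):
    """Cosine of the angle between two spectra sampled on the same channels."""
    dot = sum(a * b for a, b in zip(original, fitted))
    norm_original = math.sqrt(sum(a * a for a in original))
    norm_fitted = math.sqrt(sum(b * b for b in fitted))
    return dot / (norm_original * norm_fitted)


# Hypothetical original spectrum and an imperfect fit that misses one peak.
original = [2.0, 0.1, 0.5, 0.4, 0.9]
fitted = [1.9, 0.0, 0.5, 0.4, 0.2]
score = cosine_similarity(original, fitted)
```

The score depends only on spectral shape and not on overall scale, so a fit that reproduces the peak pattern at a different total intensity still scores close to 1, whereas a missing peak lowers the score.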
We then used the fitting coefficients of the metabolites in our panel to perform bacterial classification from the perspective of metabolite characterization of SERS [Fig. S7(b) in the Supplementary Material]. We employed random forest and support vector machine for the classification task, both of which achieved an accuracy of over 90% and an AUC of over 0.99 [Figs. S7(c) and S7(d) in the Supplementary Material]. The confusion matrices in Figs. 5(d) and 5(e) were similar to the result in Fig. 3(b), where the misclassifications among Gram+ bacteria from metabolite characterization were consistent with those from the raw SERS spectra. These findings imply that molecular-level SERSome has the potential to effectively extract information from SERS spectra for bacterial identification and subsequent analysis.
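For readers less familiar with how such results are summarized, the following Python sketch tallies a confusion matrix and the overall accuracy from true and predicted labels. The classifier itself is not reproduced here; the label lists are hypothetical and only reuse the abbreviations ABA and PAE as class names.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows index the true class, columns index the predicted class."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical predictions for six spectra from two classes.
classes = ["ABA", "PAE"]
true_labels = ["ABA", "ABA", "ABA", "PAE", "PAE", "PAE"]
predicted_labels = ["ABA", "ABA", "PAE", "PAE", "PAE", "PAE"]
matrix = confusion_matrix(true_labels, predicted_labels, classes)
accuracy = overall_accuracy(matrix)
```

Off-diagonal entries show which classes are confused with each other, which is the information used above to compare the misclassification patterns of the coefficient-based and raw-spectrum models.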
4 Discussion and Conclusion
In this study, we introduced a robust SERS-based approach for the rapid and economical identification of bacteria with model and molecular interpretability. Bacterial lysis released metabolites that replaced the chemical modifiers on the surface of NPs and were thus enhanced by the SERS substrate. Because of the heterogeneity of bacterial metabolism and the differing lysis susceptibilities of various bacteria, the resulting spectral differences allowed for precise bacterial identification. To reduce experimental variability, we replicated our experiment ten times and used the resulting dataset to build the training, validation, and testing sets. This design underscores the reproducibility of our approach. We then employed a CNN to extract subtle spectral variations, thereby facilitating interpretability in our diagnostic model. The contributing peaks were predominantly linked to nucleic acid-related metabolites. The underlying principle of SERS-based bacterial identification differs from that of existing methods. Molecular diagnostic methods, such as NGS and MALDI, target nucleic acid sequences or ribosomal proteins, which are highly conserved during bacterial evolution, enabling bacterial classification.77 However, the rapid turnover of metabolites and their vulnerability to external stimuli pose limitations to the reliability of SERS for bacterial diagnosis. We implemented rigorous controls over the bacterial culture process, including standardized incubation time and cell density, to ensure consistent bacterial conditions for detection. Our results showed that intrinsic metabolic differences among bacterial strains are sufficient for precise diagnosis under these controlled conditions. The main advantages of this method are lower cost and higher speed. Benefiting from our simple and universal synthesis scheme of Ag NPs, a single synthesis yields enough material for 2000 to 10,000 detections. 
The detection cost per patient of this method can be dozens of times lower than that of qPCR, NGS, and MALDI (∼$50 per patient).78 Compared with qPCR, which requires specific primers,79 the reagents used in SERS detection are universal. In terms of instrument setup, miniaturized Raman spectrometers are far more suitable for deployment in remote areas than mass spectrometers and NGS sequencers.
In addition, we have extensively considered the enhancement mechanism of SERS in this study. By inducing aggregation of Ag NPs using NaCl, we generated more hot spots and facilitated the replacement of citrate ions on the Ag surface with Cl− ions. Our results showed that certain types of metabolites could displace Cl− and bind to the NP surface, as evidenced by both SERS spectroscopy and LDI-MS. The weaker Ag-Cl intensity in SERS spectra of Gram− bacteria, coupled with higher intensity in LDI-MS, suggested that these bacteria released more metabolites to compete for adsorption sites. This observation aligns with the physiological understanding that Gram+ bacteria are more resistant to heat than Gram− bacteria. LDI-MS revealed that the complexity of bacterial spectra was directly correlated with the number of surface-adsorbed metabolites. The integration of surface chemistry with LDI-MS provided a foundation for molecular-level interpretation in SERS-based metabolic studies. When fitting these spectra using a panel of pure metabolite spectra, we found that samples with a higher density of adsorbed metabolites were more difficult to fit accurately, presumably due to the limitations of our spectral panel and the increased complexity of molecular interactions.
Although using SERS for bacterial identification is feasible in principle, some issues still need to be addressed before clinical application. It is essential to establish a more comprehensive database of clinical samples, which would contribute to the development of more standardized and accurate detection methods. Specifically, this involves gathering strains from numerous medical centers and from diverse infection sites, including but not limited to blood, respiratory tract, and urinary tract. In addition, drug-resistant strains should be incorporated into the collection. Our results suggest that careful control of the bacterial and NP concentration ratio is also necessary for the successful implementation of standard methods and applications. In real-world scenarios, our method is currently only applicable to isolated and purified bacterial samples with higher abundance. In complex biological matrices such as blood, bacterial populations are typically mixed, and their detection is easily interfered with by numerous components, such as red blood cells. A potential solution is to combine SERS detection with microfluidic chips for micro-culturing. In the context of urinary tract infections, where bacterial counts are higher and bacteria can be enriched through centrifugation to reduce matrix effects, our technology may allow direct detection.80 Moreover, the establishment of a more standardized SERS database with a broader range of metabolites is essential for providing a more comprehensive molecular-level interpretation.
Haoran Chen is currently a PhD candidate at the School of Biomedical Engineering, Shanghai Jiao Tong University, under the supervision of Prof. Jian Ye. He received his MS degree in immunology from Peking Union Medical College in 2023. His research interests focus on metabolite profiling by surface-enhanced Raman spectroscopy.
Ruike Zhao is a technologist in the Infection Research Laboratory of Shanghai Children’s Medical Center affiliated with the School of Medicine, Shanghai Jiao Tong University. She obtained her MS degree in clinical laboratory diagnostics from Soochow University in 2016. Currently, her research primarily focuses on pathogen microbiology culture and molecular diagnostics.
Xinyuan Bi is currently a PhD candidate at the School of Biomedical Engineering, Shanghai Jiao Tong University, under the supervision of Jian Ye. She received her BE degree in biomedical engineering from Shanghai Jiao Tong University in 2020. Her current research focuses on highly sensitive and multiplexed molecular detection and related diagnostic methods.
Yue Tao received her PhD from the Department of Biology at the University of Science and Technology of China in 2013. From 2013 to 2015, she pursued postdoctoral research at Shanghai Children’s Medical Center affiliated with Shanghai Jiao Tong University. Currently, she is an associate researcher at Shanghai Children’s Medical Center, School of Medicine, Shanghai Jiao Tong University, focusing on molecular diagnosis of infectious diseases and the exploration of novel molecular markers.
Zhou Chen received her BE degree (Hons.) in electronics and electrical engineering from the University of Edinburgh, Edinburgh, UK, and the Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2018, her MS degree in artificial intelligence from the University of Edinburgh in 2019, and her PhD in engineering from the University of Edinburgh in 2023. She is a research associate at the School of Biomedical Engineering, Shanghai Jiao Tong University. Her research interest focuses on machine learning for surface-enhanced Raman spectroscopy.
Jian Ye received his BE and MS degrees from Zhejiang University and his PhD from KU Leuven. He was a material engineer at Intel Products (Shanghai) Ltd. (2003–2005) and a postdoctoral fellow at IMEC (Belgium) supported by Research Foundation–Flanders (FWO) (2010–2013). In 2011, he was at Rice University as a visiting scholar. He joined the School of Biomedical Engineering of Shanghai Jiao Tong University in 2013 and was promoted to full professor in 2017. His research interests focus on plasmonic nanostructures and surface-enhanced Raman spectroscopy for biomedical applications.
Biographies of the other authors are not available.