Journal of Electronic Science and Technology, Volume 23, Issue 1, 100301 (2025)

On large language models safety, security, and privacy: A survey

Ran Zhang, Hong-Wei Li*, Xin-Yuan Qian, Wen-Bo Jiang, and Han-Xiao Chen
Author Affiliations
  • School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
    Figures & Tables (4)
      • Definition of safety, security, and privacy in LLMs.
      • Training and inference phases of LLMs.
      • Overview of safety, security, and privacy issues and their defense methods.
      • Table 1. Classification of safety, security, and privacy issues in LLMs.


      Table 1. Classification of safety, security, and privacy issues in LLMs.

      Safety
        • Toxicity and bias: Inappropriate or prejudiced content that may be generated or amplified by LLMs due to biased training data, which can result in harmful or discriminatory outcomes [12,13].
        • Hallucination: Generation of text that is nonsensical, irrelevant, or factually incorrect, typically arising from the model’s inability to accurately understand or process the context of the input data [14,15].
        • Jailbreak: Exploitation of vulnerabilities to bypass the model’s intended constraints and generate content that violates its operational guidelines or ethical safeguards [16,17].
      Security
        • Backdoor & poisoning attacks: Malicious insertion of hidden triggers or corrupted data during the training process, which can cause the model to produce harmful or targeted outputs when prompted with specific inputs [6,18] (a minimal illustrative sketch follows the table).
        • Adversarial attack: Carefully crafted inputs designed to deceive the model into making errors or generating unintended outputs, often by exploiting the model’s weaknesses or vulnerabilities [19,20].
      Privacy
        • Privacy leakage: Inadvertent disclosure of sensitive or personal information through the model’s responses, due to the model’s exposure to or training on data containing such information [7,21].
        • Inference attack: Exploitation of model responses to deduce sensitive information about the training data or the underlying algorithms, potentially compromising privacy or security [22,23].
        • Extraction attack: Attempts to reverse-engineer or illicitly obtain proprietary information, such as training data or model parameters, by interacting with the model’s outputs [24,25].
    Citation

    Ran Zhang, Hong-Wei Li, Xin-Yuan Qian, Wen-Bo Jiang, Han-Xiao Chen. On large language models safety, security, and privacy: A survey[J]. Journal of Electronic Science and Technology, 2025, 23(1): 100301

    Paper Information

    Received: Sep. 25, 2024

    Accepted: Jan. 9, 2025

    Published Online: Apr. 7, 2025

    Corresponding author: Hong-Wei Li (hongweili@uestc.edu.cn)

    DOI: 10.1016/j.jnlest.2025.100301
