MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

Tiantong Wang, Xinyu Yan, Tiantong Wu, Yurong Hao, Yong Jiang, Fei Huang, Wei Yang Bryan Lim

Published 2026 in Unknown venue

ABSTRACT

Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set. To address this dual non-disclosure constraint, we propose MPU, an algorithm-agnostic privacy-preserving Multiple Perturbed Copies Unlearning framework that primarily introduces two server-side modules: Pre-Process for randomized copy generation and Post-Process for update aggregation. In Pre-Process, the server distributes multiple perturbed and reparameterized model instances, allowing the client to execute unlearning locally on its private forget set without accessing the server's exact original parameters. After local unlearning, the server performs Post-Process by inverting the reparameterization and aggregating updates with a harmonic denoising procedure to alleviate the impact of perturbation. Experiments with seven unlearning algorithms show that MPU achieves unlearning performance comparable to noise-free baselines, with most algorithms' average degradation well below 1% under 10% noise, and can even outperform the noise-free baseline for some algorithms under 1% noise. Code is available at https://github.com/Tristan-SHU/MPU.
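The Pre-Process / local unlearning / Post-Process flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the function names (`make_copies`, `client_unlearn`, `aggregate`) are hypothetical, the reparameterization is modeled as a row permutation of a single weight matrix, the perturbation as additive Gaussian noise, and the client's unlearning algorithm as a toy weight-shrinking step. Plain averaging stands in for the harmonic denoising procedure, whose details the abstract does not give.

```python
import random


def make_copies(weights, num_copies, noise_scale, rng):
    """Server side (sketch): build permuted, noise-perturbed copies.

    Row `pos` of each copy holds a noisy version of row `perm[pos]`
    of the original matrix. The permutations stay on the server.
    """
    copies, perms = [], []
    for _ in range(num_copies):
        perm = list(range(len(weights)))
        rng.shuffle(perm)
        copy = [
            [w + noise_scale * rng.gauss(0.0, 1.0) for w in weights[src]]
            for src in perm
        ]
        copies.append(copy)
        perms.append(perm)
    return copies, perms


def client_unlearn(copy, lr=0.1):
    """Client side (toy stand-in): shrink every weight by a factor (1 - lr).

    A real client would run an unlearning algorithm on its private
    forget set here; this placeholder only makes the sketch runnable.
    """
    return [[w - lr * w for w in row] for row in copy]


def aggregate(updated_copies, perms):
    """Server side (sketch): undo each permutation, then average.

    Plain averaging is an assumed substitute for the paper's
    harmonic denoising step.
    """
    n_rows = len(perms[0])
    n_cols = len(updated_copies[0][0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for copy, perm in zip(updated_copies, perms):
        for pos, src in enumerate(perm):
            for j, w in enumerate(copy[pos]):
                total[src][j] += w  # put row back at its original index
    k = len(updated_copies)
    return [[v / k for v in row] for row in total]


# Usage: four perturbed copies of a hypothetical 3x2 weight matrix.
rng = random.Random(0)
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
copies, perms = make_copies(weights, num_copies=4, noise_scale=0.01, rng=rng)
merged = aggregate([client_unlearn(c) for c in copies], perms)
```

In this sketch the client never sees the unpermuted, noise-free matrix, and the server never sees the forget set; only perturbed copies and their updated versions cross the boundary. Averaging several copies reduces the zero-mean noise that each individual copy carries, which is the intuition behind sending more than one copy.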

PUBLICATION RECORD

Publication year: 2026
Venue: Unknown venue
Publication date: 2026-02-27
Fields of study: Computer Science
Identifiers: arXiv 2602.23798
Source metadata: Semantic Scholar

REFERENCES

LUNAR: LLM Unlearning via Neural Activation Redirection (2025)
A Survey on Unlearning in Large Language Models (2025)
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics (2025)
Balancing Forget Quality and Model Utility: A Reverse KL-Divergence Knowledge Distillation Approach for Better Unlearning in LLMs (2025)
Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning (2025; influential reference)
On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm (2024)
TOFU: A Task of Fictitious Unlearning for LLMs (2024)
Rethinking machine unlearning for large language models (2024)
UNDIAL: Self-Distillation with Adjusted Logits for Robust Unlearning in Large Language Models (2024)
Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning (2024)
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic (2024)
The Llama 3 Herd of Models (2024)
Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models (2024)
Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning (2024; influential reference)
LLM Unlearning via Loss Adjustment with Only Forget Data (2024)
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate (2024)
Qwen2.5 Technical Report (2024)
Multi-Objective Large Language Model Unlearning (2024)
Direct Preference Optimization: Your Language Model is Secretly a Reward Model (2023)
Large Language Model Unlearning (2023)
Knowledge Unlearning for Mitigating Privacy Risks in Language Models (2022; influential reference)
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time (2022)
The Right to be Forgotten in Federated Learning: An Efficient Realization with Rapid Retraining (2022)
Continual Learning and Private Unlearning (2022)
Federated Unlearning (2020)
Machine Unlearning (2019)
Making AI Forget You: Data Deletion in Machine Learning (2019)
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks (2019)
Attention is All you Need (2017)
Understanding Black-box Predictions via Influence Functions (2017)
Towards Making Systems Forget with Machine Unlearning (2015)
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches (2014)
ROUGE: A Package for Automatic Evaluation of Summaries (2004)
Functionally Equivalent Feedforward Neural Networks (1994)
