Authors: Jingxuan Zhang, Zhenhua Xu, Rui Hu, Wenpeng Xing, Xuhong Zhang, Meng Han
Large Language Models (LLMs) have become increasingly prevalent across various sectors, raising critical concerns about model ownership and intellectual property protection. Although backdoor-based fingerprinting has emerged as a promising solution for model authentication, effective attacks for removing these fingerprints remain largely unexplored. We therefore present **M**ismatched **Eraser** (**MEraser**), a novel method for effectively removing backdoor-based fingerprints from LLMs while maintaining model performance. Our approach leverages a two-phase fine-tuning strategy built on carefully constructed mismatched and clean datasets. Through extensive evaluation across multiple LLM architectures and fingerprinting methods, we demonstrate that MEraser achieves complete fingerprint removal while maintaining model performance, using fewer than 1,000 training samples. Furthermore, we introduce a transferable erasure mechanism that enables effective fingerprint removal across different models without repeated training. In conclusion, our approach provides a practical solution for fingerprint removal in LLMs, reveals critical vulnerabilities in current fingerprinting techniques, and establishes comprehensive evaluation benchmarks for developing more resilient model protection methods in the future.
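The erasure phase relies on a mismatched dataset, i.e., prompts paired with deliberately unrelated responses. The sketch below shows one simple way such data could be built by rotating responses across samples; it only illustrates the idea, is not the exact construction used in the paper, and the `instruction`/`output` field names and file paths are assumptions.

```python
# Minimal sketch (not the paper's exact construction): build a "mismatched"
# dataset by rotating responses so every instruction is paired with an
# unrelated answer. Field names and paths are assumptions.
import json

def build_mismatched(clean_path: str, out_path: str) -> None:
    with open(clean_path) as f:
        samples = [json.loads(line) for line in f]  # expects {"instruction": ..., "output": ...}
    # Rotate outputs by one position so no instruction keeps its own response.
    outputs = [s["output"] for s in samples]
    outputs = outputs[1:] + outputs[:1]
    with open(out_path, "w") as f:
        for sample, wrong in zip(samples, outputs):
            f.write(json.dumps({"instruction": sample["instruction"], "output": wrong}) + "\n")

# build_mismatched("clean.jsonl", "mismatched.jsonl")
```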
- [2025/05] Our paper has been accepted by the ACL 2025 Main Conference!
First, ensure you have installed all necessary dependencies.
pip install -r requirements.txt
Before running the pipeline, please modify the path variables in `utf_pipeline.sh` according to your local environment. The example below is for `meta-llama/Llama-2-7b-chat-hf`. Here we only take the UTF fingerprinting method as an example. You can find UTF at xxxx
# -------------llama----------
base_model='meta-llama/Llama-2-7b-chat-hf'
fingerprint_model="<YOUR_PATH>/Llama2-utf-fingerprint-model" # Path to the fingerprinted model
fingerprint_adapter='<YOUR_PATH>/Llama2-utf-fingerprint-adapter' # Path to the fingerprint adapter
test_UTF_fingerprint_dataset="Llama2_utf_dataset.jsonl" # UTF fingerprint dataset used for testing
erase_model_path='<YOUR_PATH>/Llama2-utf-erase-model' # Path to save the erased model
erase_adapter_path='<YOUR_PATH>/Llama2-utf-erase-adapter' # Path to save the erase adapter
recover_adapter_path='<YOUR_PATH>/Llama2-utf-recover-adapter' # Path to save the recover adapter
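Before running the full pipeline, you may want a quick sanity check that the fingerprint actually triggers on the fingerprinted model (this is roughly what the test scripts automate). The sketch below is not the repository's test script; it assumes each JSONL record stores the trigger prompt under `instruction` and the expected fingerprint output under `output`, so adjust it to the actual schema of your dataset.

```python
# Rough sanity check (assumed schema): query the fingerprinted model with each
# trigger prompt and count how often the expected fingerprint string appears.
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "<YOUR_PATH>/Llama2-utf-fingerprint-model"
tok = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto")

hits, total = 0, 0
with open("Llama2_utf_dataset.jsonl") as f:
    for line in f:
        rec = json.loads(line)
        inputs = tok(rec["instruction"], return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
        completion = tok.decode(out[0][inputs["input_ids"].shape[1]:],
                                skip_special_tokens=True)
        hits += int(rec["output"].strip() in completion)
        total += 1

print(f"Fingerprint success rate: {hits}/{total}")
```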
Once configured, execute the pipeline script to start the erasure and recovery process:
bash utf_pipeline.sh
This script will automate the following steps:
- Run Erasure (`cf.py`): Fine-tunes the model with the mismatched dataset to generate an `erase_adapter` (a rough sketch of this step follows the list).
- Test Erasure: Executes `test_uft.py` and `test_ppl_guanaco_adapter.py` to evaluate fingerprint removal and model performance.
- Merge Model (`merge.py`): Merges the erase adapter with the original fingerprinted model.
- Run Recovery (`recover.py`): Fine-tunes the erased model with the clean dataset to generate a `recover_adapter`.
- Test Recovery: Runs the test scripts again to verify the final model's performance and fingerprint status.
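For reference, the erasure step boils down to a short LoRA fine-tune on the mismatched data. The sketch below is a simplified stand-in for `cf.py`, assuming a standard `transformers` + `peft` setup; the hyper-parameters, target modules, dataset fields, and file paths are illustrative assumptions, not the repository's actual values.

```python
# Simplified stand-in for the erasure step (cf.py), assuming transformers + peft.
# Hyper-parameters, target modules, field names, and paths are illustrative only.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "<YOUR_PATH>/Llama2-utf-fingerprint-model"  # fingerprinted model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token

model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"]))

# Mismatched data: prompts paired with deliberately unrelated responses.
data = load_dataset("json", data_files="mismatched.jsonl", split="train")
data = data.map(lambda ex: tok(ex["instruction"] + ex["output"],
                               truncation=True, max_length=512),
                remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="erase_adapter_out", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()

# Save only the LoRA weights; folding them back into the base model is what the merge step does.
model.save_pretrained("<YOUR_PATH>/Llama2-utf-erase-adapter")
```

The merge step can then be reproduced with `PeftModel.from_pretrained(base_model, adapter).merge_and_unload()`, and the recovery step repeats the same kind of fine-tune on the erased model, this time with the clean dataset.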
If you find our work useful for your research, please cite our paper:
@misc{zhang2025merasereffectivefingerprinterasure,
title={MEraser: An Effective Fingerprint Erasure Approach for Large Language Models},
author={Jingxuan Zhang and Zhenhua Xu and Rui Hu and Wenpeng Xing and Xuhong Zhang and Meng Han},
year={2025},
eprint={2506.12551},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2506.12551},
}