DeepLeak: Privacy Enhancing Hardening of Model Explanations Against Membership Leakage

This repository contains the implementation of our IEEE SaTML 2026 paper, DeepLeak: Privacy Enhancing Hardening of Model Explanations Against Membership Leakage. DeepLeak is a system for auditing and mitigating privacy risks in post-hoc explanation methods. It advances the state of the art in three ways: (1) comprehensive leakage profiling: we develop a stronger explanation-aware membership inference attack to quantify how much membership information representative explanation methods leak under their default configurations; (2) lightweight hardening strategies: we introduce practical, model-agnostic mitigations, including sensitivity-calibrated noise, attribution clipping, and masking, that substantially reduce membership leakage while preserving explanation utility; and (3) root-cause analysis: through controlled experiments, we pinpoint the algorithmic properties that drive leakage in ML explanation methods.
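To make the three mitigation families named above concrete, here is a minimal, generic sketch of clipping, noise addition, and masking applied to a flat list of attribution scores. It is an illustration only and is not the repository's implementation: the function names, the `bound`, `scale`, and `keep_fraction` parameters, and the choice to tie the noise scale to the clipping bound are all assumptions.

```python
import random

def clip_attribution(attr, bound):
    """Clamp every attribution value to the range [-bound, bound]."""
    return [max(-bound, min(bound, a)) for a in attr]

def add_calibrated_noise(attr, bound, scale, rng):
    """Clip to [-bound, bound], then add Gaussian noise whose standard
    deviation is proportional to the clipping bound (assumed sensitivity)."""
    clipped = clip_attribution(attr, bound)
    return [a + rng.gauss(0.0, scale * bound) for a in clipped]

def mask_small_attributions(attr, keep_fraction):
    """Zero out all but the largest-magnitude values.

    Roughly keep_fraction of the entries survive; ties at the threshold
    are all kept.
    """
    k = max(1, round(keep_fraction * len(attr)))
    threshold = sorted((abs(a) for a in attr), reverse=True)[k - 1]
    return [a if abs(a) >= threshold else 0.0 for a in attr]

# Toy attribution vector (made-up values, for illustration only).
attr = [0.9, -0.05, 0.3, -1.7, 0.02, 0.6]
rng = random.Random(0)  # fixed seed so the noise is reproducible

clipped = clip_attribution(attr, bound=1.0)
noisy = add_calibrated_noise(attr, bound=1.0, scale=0.1, rng=rng)
masked = mask_small_attributions(attr, keep_fraction=0.5)

print(clipped)  # [0.9, -0.05, 0.3, -1.0, 0.02, 0.6]
print(masked)   # [0.9, 0.0, 0.0, -1.7, 0.0, 0.6]
```

In practice attributions are tensors rather than Python lists, and the parameter values would be chosen by a search such as the optimization phase rather than fixed by hand.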

This repository contains code to reproduce the paper's results on CIFAR-100. The pipeline runs in two phases:

Attack Phase: Select top-k seeds based on the most vulnerable settings per XAI method.
Optimization Phase: Explore parameter configurations that minimize privacy leakage while preserving explanation utility.
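The seed selection in the attack phase amounts to ranking seeds by attack success and keeping the top k. The sketch below shows that ranking step in isolation; the function name `top_k_seeds`, the dictionary layout, and the TPR values are hypothetical and are not taken from the repository.

```python
def top_k_seeds(tpr_by_seed, k):
    """Return the k (seed, tpr) pairs with the highest TPR, highest first."""
    ranked = sorted(tpr_by_seed.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Made-up TPR values for five seeds of a single XAI method.
tprs = {0: 0.12, 1: 0.31, 2: 0.08, 3: 0.27, 4: 0.19}

selected = top_k_seeds(tprs, k=3)
print(selected)  # [(1, 0.31), (3, 0.27), (4, 0.19)]
```

Seeds with a higher TPR are those where the attack succeeds most often, so they represent the most vulnerable settings to carry into the optimization phase.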

Directory Structure

DeepLeak/
│
├── data/                     # Dataset pickle files and attribution outputs
├── datasets/                 # Dataset loading and splitting utilities
├── models/                   # Model definitions and pretrained weights
├── utils/                    # Training and evaluation utilities
├── xai_methods/              # Attribution generators and wrappers
│   ├── captum_wrappers.py    # Captum-specific method modifications
│   ├── attribution_wrappers.py
│   └── generate_attributions.py

How to Run

Step 1: Set Up Environment

Install the required dependencies:

pip install -r requirements.txt

Step 2: Run the Pipeline

XAI method keywords (used with --xai_methods):
SMAP: Saliency Map
GBackProp: Guided BackProp
IG: Integrated Gradients
SHAP: SHAP
LIME: LIME
SmoothGrad: SmoothGrad
VarGrad: VarGrad
DeepLift: DeepLIFT
Occlusion: Occlusion
GGC: GradCam++
GC: GradCAM
KSHAP: K
DCAttr: Deconvolution
INGRAttr: InputXGrad
ProtoDa: ProtoDash
Anchor: Anchors

Attack Phase Only

python main.py --mode attack --xai_methods SMAP SHAP LIME --topk 3 --trials 20

Optimization Phase Only

python main.py --mode optimize

Run Both Phases

python main.py --mode both --xai_methods SMAP SHAP LIME --topk 3 --trials 20
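For readers who want to see how the flags in the commands above fit together, the following is a sketch of an argument parser that accepts the same command lines. It is an assumption about the interface, not a copy of main.py: the defaults, the `required` setting, and the help strings are hypothetical.

```python
import argparse

def build_parser():
    """Build a parser that accepts the command lines shown in this README."""
    parser = argparse.ArgumentParser(description="DeepLeak pipeline (sketch)")
    parser.add_argument("--mode", choices=["attack", "optimize", "both"],
                        required=True, help="which phase(s) to run")
    parser.add_argument("--xai_methods", nargs="+", default=[],
                        help="one or more XAI method keywords, e.g. SMAP SHAP LIME")
    # Default values below are assumptions chosen to match the examples.
    parser.add_argument("--topk", type=int, default=3,
                        help="number of top seeds to keep per XAI method")
    parser.add_argument("--trials", type=int, default=20,
                        help="number of trials to run")
    return parser

# Parse the 'both phases' example command from this README.
args = build_parser().parse_args(
    "--mode both --xai_methods SMAP SHAP LIME --topk 3 --trials 20".split()
)
print(args.mode, args.xai_methods, args.topk, args.trials)
```

Using `nargs="+"` is what lets several method keywords follow a single `--xai_methods` flag.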

Notes on Captum XAI Methods

For the Captum-based XAI methods (guided_backprop_deconvnet, guided_grad_cam, input_x_gradient, saliency), replace the corresponding modules in your installed captum library with the privacy-enhanced versions provided in this repository under Captum with Privacy/captum/attr/_core.
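Before modifying Captum, it helps to know where the installed copy lives. Captum's standard layout keeps its attribution modules under captum/attr/_core, and the sketch below looks up that directory in the active environment without importing the package. The helper name `find_core_dir` is hypothetical, and the function only reports a path; it does not copy or change any files.

```python
import importlib.util
from pathlib import Path

def find_core_dir(package="captum"):
    """Return <package>/attr/_core in the active environment, or None."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    # A missing package yields None; a plain module has no search locations.
    if spec is None or not spec.submodule_search_locations:
        return None
    package_root = Path(list(spec.submodule_search_locations)[0])
    core = package_root / "attr" / "_core"
    return core if core.is_dir() else None

core_dir = find_core_dir()
print(core_dir if core_dir else "captum is not installed in this environment")
```

Backing up the original files in that directory before replacing them makes it easy to restore the stock library later.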

Output

Attack phase results are saved to top_3_tprs.csv, which lists the top seeds and their true positive rate (TPR) scores.
Optimized parameter settings and performance metrics are saved as optuna_results_<XAI>+<seed>.pkl.
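A saved result file can be inspected with Python's pickle module. The sketch below builds a file name following the optuna_results_<XAI>+<seed>.pkl pattern and loads it; the helper names are hypothetical, and the demonstration writes a placeholder object because the actual structure of the pickled results is not documented here.

```python
import pickle
import tempfile
from pathlib import Path

def results_path(xai_method, seed, directory="."):
    """Build the path optuna_results_<XAI>+<seed>.pkl inside a directory."""
    return Path(directory) / f"optuna_results_{xai_method}+{seed}.pkl"

def load_results(path):
    """Unpickle a results file. Only load pickles from a trusted source."""
    with open(path, "rb") as fh:
        return pickle.load(fh)

# Demonstration with a placeholder file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = results_path("SMAP", 1, tmp)
    path.write_bytes(pickle.dumps({"placeholder": True}))
    loaded = load_results(path)

print(path.name)  # optuna_results_SMAP+1.pkl
```

If the pickled object is an Optuna object, Optuna must be installed in the environment for unpickling to succeed.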
