An Empirical Exploration of
Continual Unlearning for Image Generation

The Ohio State University, Michigan State University
ICML 2025 Workshop on Machine Unlearning for Generative AI
[Figure: sequential vs simultaneous unlearning comparison]

In real-world applications, unlearning requests often arrive sequentially, e.g., when artists request the removal of their styles or companies request the removal of copyrighted characters.

[Figure: model degradation from sequential unlearning of styles (left) and objects (right)]

Unlearning a concept from the previously unlearned checkpoint leads to model degradation: the unintentional unlearning of other concepts. This occurs when unlearning both styles (left) and objects (right).

Abstract

Machine unlearning---the ability to remove designated concepts from a pre-trained model---has advanced swiftly. However, existing methods typically assume that unlearning requests arrive all at once, whereas in practice, they often occur sequentially. In this paper, we present the first systematic study of continual unlearning in text-to-image generation. We show that popular unlearning methods suffer from rapid retention failures: after only a few requests, the model drastically forgets retained knowledge and produces degraded images. Our analysis attributes this behavior to cumulative parameter drift, which causes successive unlearned models to progressively diverge from the pre-training manifold. Motivated by this insight, we investigate add-on mechanisms that (1) mitigate drift and (2) crucially, remain compatible with existing unlearning methods. Extensive experiments demonstrate that constraining model updates and merging independently unlearned models are effective solutions, suggesting promising directions for future exploration. Taken together, our study positions continual unlearning as a fundamental problem in image generation, offering insights, accessible baselines, and open challenges to advance safe and accountable generative AI.

Sequential vs Simultaneous

[Figures: unlearning 12 styles (left) and 12 objects (right), sequential vs simultaneous]

Unlearning sequentially leads to faster model degradation than unlearning simultaneously. Sequential unlearning continues from the previous checkpoint, while simultaneous unlearning restarts from the base model and unlearns all requests (previous + current) at once.
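
To make the two protocols concrete, here is a minimal sketch; `unlearn` is a hypothetical stand-in for any unlearning method that updates a model to remove a list of concepts.

import copy

def sequential_unlearning(base_model, requests):
    # Each request continues from the previously unlearned checkpoint.
    model = copy.deepcopy(base_model)
    for concept in requests:
        model = unlearn(model, [concept])  # hypothetical unlearning step
    return model

def simultaneous_unlearning(base_model, requests):
    # Each request restarts from the base model and re-unlearns
    # all previous requests plus the current one.
    for i in range(1, len(requests) + 1):
        model = unlearn(copy.deepcopy(base_model), requests[:i])
    return model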

Training Costs

Simultaneous unlearning requires significantly more resources.

[Figure: parameter updates during unlearning, colored by L2 distance from the base model]

We find that simultaneous unlearning makes much smaller parameter updates. The x-axis enumerates the parameters updated during unlearning, with more red indicating a greater L2 distance from the base model's parameters. This motivates us to explore methods that constrain update magnitude.
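
As a sketch of how this drift can be measured (assuming both checkpoints are PyTorch state dicts with matching keys):

import torch

def l2_drift(base_state, unlearned_state):
    # Per-parameter L2 distance from the base model; aggregating these
    # values gives the overall drift visualized above.
    return {
        name: torch.norm(unlearned_state[name].float() - base_state[name].float()).item()
        for name in base_state
    }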

Exploring Methods to Constrain Update Magnitude

[Figure: overview of the three constrained-update mechanisms]

Methods to constrain update magnitude. We explore three add-ons: (1) L1/L2 regularizers on the drift from the previous checkpoint; (2) Selective Finetuning, which chooses a small subset of parameters to update based on one forward pass of the unlearning loss; and (3) TIES Model Merging, which independently unlearns each request, prunes each checkpoint, and merges the weights.
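
As an illustration, here is a minimal sketch of the regularizer variant; `unlearn_loss` and the strength `lam` are placeholders for the base unlearning objective and a tunable hyperparameter, and the previous checkpoint is a saved state dict.

import torch

def regularized_loss(model, prev_state, unlearn_loss, lam=1e-2, p=2):
    # Penalize drift from the previous checkpoint (p=1 for L1, p=2 for L2).
    drift = sum(
        torch.norm((param - prev_state[name].to(param.device)).flatten(), p=p)
        for name, param in model.named_parameters()
        if param.requires_grad
    )
    return unlearn_loss + lam * drift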

[Figures: unlearning 12 styles (left) and 12 objects (right) with the constrained-update add-ons]

Qualitative Results: Styles

[Figure: qualitative results on styles]

Qualitative Results: Objects

[Figure: qualitative results on objects]

We see strong improvements in retaining cross-domain concepts: when unlearning styles, the model retains objects well, and vice versa. Disentangling concepts within the same domain appears to be the harder challenge.

Developing New Methods

[Figure: text-embedding cosine similarity with the target concept vs. unintentional unlearning]

Unlearning methods typically update only the cross-attention layers to minimize model degradation. Since cross-attention captures the relationship between text and images, we find that a concept's text-embedding cosine similarity to the target concept (e.g., Abstractionism style) is a strong indicator of whether it will be unintentionally unlearned.
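
A minimal sketch of this diagnostic using CLIP's text encoder (the model name and comparison prompts are illustrative):

import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

@torch.no_grad()
def embed(prompt):
    tokens = tokenizer(prompt, return_tensors="pt")
    # Pooled text embedding of the prompt.
    return text_encoder(**tokens).pooler_output.squeeze(0)

target = embed("Abstractionism style")
for concept in ["Cubism style", "Pop Art style", "a photo of a dog"]:
    sim = torch.cosine_similarity(target, embed(concept), dim=0)
    print(f"{concept}: {sim.item():.3f}")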

[Figure: projecting the unlearning gradient orthogonal to the retain-concept subspace]

We develop a new method that builds a text-embedding subspace from the concepts we want to retain, then projects the unlearning gradient to be orthogonal to this subspace. This lets us unlearn the target concept while retaining the concepts we want to keep.
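
A minimal sketch of the projection for a single cross-attention weight gradient; `retain_embeds` stacks the retained concepts' text embeddings as rows, and the shapes are illustrative.

import torch

def project_grad(grad_W, retain_embeds):
    # grad_W: (out_dim, d) gradient of a cross-attention projection weight.
    # retain_embeds: (k, d) text embeddings of concepts to retain.
    Q, _ = torch.linalg.qr(retain_embeds.T)  # (d, k) orthonormal basis of the retain subspace
    # Remove the gradient component acting on the retain subspace, so the
    # update leaves retained concepts' outputs (approximately) unchanged.
    return grad_W - (grad_W @ Q) @ Q.T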

[Figures: unlearning 12 styles (left) and 12 objects (right) with our projection method]

Using our projection method, we see strong improvements in retaining within-domain concepts.

Poster

BibTeX

@inproceedings{lee2025empirical,
  title={An Empirical Exploration of Continual Unlearning for Image Generation},
  author={Lee, Justin and Mai, Zheda and Fan, Chongyu and Chao, Wei-Lun},
  booktitle={ICML 2025 Workshop on Machine Unlearning for Generative AI},
  year={2025}
}