CDL

Abstract

Prompt-based continual learning has shown strong performance in rehearsal-free class-incremental learning by adapting learnable prompts while freezing a pre-trained Vision Transformer (ViT) backbone. However, the effect of backbone scale remains underexplored. We observe that larger ViT backbones consistently yield better continual learning performance, which motivates us to study how to transfer such capability from a larger model to a smaller one. In this paper, we introduce Continual Distillation Learning (CDL), a new setting for knowledge distillation in rehearsal-free prompt-based continual learning. We show that conventional distillation methods provide only limited gains in CDL, mainly because task-specific prompts are forced to encode both continual adaptation and distillation knowledge, while lacking a persistent mechanism for cross-task knowledge transfer. To address this problem, we propose Decoupled Continual Distillation Learning (D-CDL), which introduces persistent Knowledge-Distillation prompts (KD-prompts) and a dedicated KD branch to explicitly decouple distillation from task adaptation. The proposed KD-prompts are propagated across tasks as a global carrier of teacher knowledge, while the original prompts remain responsible for continual learning. D-CDL is simple, general, and can be integrated into various prompt-based continual learning frameworks. Extensive experiments on Split CIFAR-100 and Split ImageNet-R across four representative continual learning methods show that D-CDL consistently outperforms existing distillation baselines and substantially improves student performance under different teacher-student settings.

Decoupled Continual Distillation Learning

we propose Decoupled Continual Distillation Learning (DCDL), which introduces persistent Knowledge-Distillation prompts (KD-prompts) and a dedicated KD branch to explicitly decouple distillation from task adaptation. The proposed KD-prompts are propagated across tasks as a global carrier of teacher knowledge, while the original prompts remain responsible for continual learning.

Experiment Results

Experiment Results: We conducted knowledge distillation experiments on three prompt-based continual learning methods, providing inspiration for future research in CDL, which can be extended to more continual learning methods.

Cifar-100

#	Teacher	Student	Baseline	KD-Method	Task-Number	Accuarcy(%)	Forgetting(%)

ImageNet-R

#	Teacher	Student	Baseline	KD-Method	Task-Number	Accuarcy(%)	Forgetting(%)

References

Code

CDL

The code for CDL.

BibTeX

Please cite CDL if it helps your research:

@misc{2024CDL,
title={Decoupled Prompting for Continual Distillation in Rehearsal-Free Class-Incremental Learning},
author={Qifan Zhang and Yunhui Guo and Yu Xiang},
year={2024},
eprint={2407.13911},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Contact

Send any comments or questions to Qifan Zhang: qifan.zhang@utdallas.edu

Acknowledgements

This work was supported in part by the DARPA Perceptually-enabled Task Guidance (PTG) Program under contract number HR00112220005, the Sony Research Award Program, and the National Science Foundation (NSF) under Grant No. 2346528.