Continual Distillation Learning: Knowledge Distillation in Prompt-based Continual Learning

Abstract

We introduce the problem of continual distillation learning (CDL) in order to use knowledge distillation (KD) to improve prompt-based continual learning (CL) models. The CDL problem is valuable to study since the use of a larger vision transformer (ViT) leads to better performance in prompt-based continual learning. The distillation of knowledge from a large ViT to a small ViT can improve the inference efficiency for prompt-based CL models. We empirically found that existing KD methods such as logit distillation and feature distillation cannot effectively improve the student model in the CDL setup. To this end, we introduce a novel method named Knowledge Distillation based on Prompts (KDP), in which globally accessible prompts specifically designed for knowledge distillation are inserted into the frozen ViT backbone of the student model. We demonstrate that our KDP method effectively enhances the distillation performance in comparison to existing KD methods in the CDL setup.
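
To make the KDP idea above concrete, here is a minimal, illustrative sketch (not the paper's implementation) of distillation prompts that are globally shared and prepended to the token sequence entering a frozen student ViT block. The class and parameter names (KDPrompt, prompt_len, embed_dim) are assumptions for illustration only.

import torch
import torch.nn as nn

class KDPrompt(nn.Module):
    """Globally shared prompts used only for distillation (illustrative)."""
    def __init__(self, prompt_len: int = 8, embed_dim: int = 768):
        super().__init__()
        # A single prompt pool that is not task-specific, so it stays
        # accessible across all tasks and can keep absorbing teacher knowledge.
        self.prompts = nn.Parameter(torch.zeros(prompt_len, embed_dim))
        nn.init.uniform_(self.prompts, -0.02, 0.02)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, embed_dim) entering a frozen ViT block.
        batch = tokens.shape[0]
        prompts = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the distillation prompts so the frozen attention layers
        # can attend to them alongside the original tokens.
        return torch.cat([prompts, tokens], dim=1)

# Example: 4 images, 197 tokens (CLS + 196 patches), ViT-B embedding size.
kdp = KDPrompt(prompt_len=8, embed_dim=768)
x = torch.randn(4, 197, 768)
x_with_prompts = kdp(x)   # shape: (4, 205, 768)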

CDL

We introduce the problem of Continual Distillation Learning (CDL), which considers different KD methods in the prompt-based Continual Learning (CL) setup. The following figure illustrates the overall workflow of our experiments. The blue dashed area highlights our proposed KDP method, which addresses the limitations of other KD approaches and achieves state-of-the-art (SOTA) performance.

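For a concrete picture of the workflow, the sketch below shows one simplified CDL training step: the student is trained on the current task with a classification loss plus a standard temperature-scaled logit-distillation loss from the teacher. Function and variable names are illustrative, and this corresponds to the generic logit-distillation baseline discussed above, not to the KDP method itself.

import torch
import torch.nn.functional as F

def cdl_step(student, teacher, images, labels, optimizer,
             temperature=2.0, kd_weight=1.0):
    # Teacher: larger, already-trained prompt-based CL model.
    # Student: smaller model being trained on the current task.
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(images)
    s_logits = student(images)

    # Classification loss on the current task's labels.
    ce_loss = F.cross_entropy(s_logits, labels)

    # Temperature-scaled logit distillation (KL between softened outputs).
    kd_loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=1),
        F.softmax(t_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    loss = ce_loss + kd_weight * kd_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()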

Experiment Results

We conducted knowledge distillation experiments on three prompt-based continual learning methods. These results provide a reference for future research on CDL, which can be extended to additional continual learning methods.
CIFAR-100
# Teacher Student Baseline KD-Method Task-Number Accuracy(%) Forgetting(%)
ImageNet-R
# Teacher Student Baseline KD-Method Task-Number Accuracy(%) Forgetting(%)
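
For reference, the Accuracy and Forgetting columns follow the metrics commonly reported in class-incremental learning. The sketch below, assuming the standard definitions, shows how they can be computed from a matrix of per-task accuracies; the array layout and function names are illustrative.

import numpy as np

def final_accuracy(acc):
    # acc[t, i]: accuracy on task i measured after training on task t.
    T = acc.shape[0]
    return float(np.mean(acc[T - 1, :T]))

def forgetting(acc):
    # For each earlier task, the drop from its best accuracy at any
    # previous stage to its accuracy after the final task, averaged.
    T = acc.shape[0]
    drops = [np.max(acc[i:T - 1, i]) - acc[T - 1, i] for i in range(T - 1)]
    return float(np.mean(drops))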

References

    Code

    CDL

    The code for CDL.

    BibTeX

    Please cite CDL if it helps your research:
    @misc{2024CDL,
      title={Continual Distillation Learning: Knowledge Distillation in Prompt-based Continual Learning},
      author={Qifan Zhang and Yunhui Guo and Yu Xiang},
      year={2024},
      eprint={2407.13911},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
    }

    Contact

    Send any comments or questions to Qifan Zhang: qifan.zhang@utdallas.edu

    Acknowledgements

    This work was supported in part by the DARPA Perceptually-enabled Task Guidance (PTG) Program under contract number HR00112220005, the Sony Research Award Program, and the National Science Foundation (NSF) under Grant No. 2346528.