DKT

Distributed Knowledge Transfer (DKT) method for Distributed Continual Learning

Master's thesis project at the University of Pisa. This work introduces the Distributed Continual Learning (DCL) research area and the Distributed Knowledge Transfer (DKT) architecture.

Please note that the code uploaded here needs pre-trained models to work; do not hesitate to contact me for further information.

Distributed Continual Learning

In Distributed Continual Learning (DCL), the term "distributed" refers to a complex and highly interconnected environment in which multiple agents work together to improve their performance by exchanging information during the training process. What distinguishes the DCL approach is its fusion with the continual learning setting, whereby models continuously exchange their state with each other at regular intervals, creating a highly dynamic and adaptive training process.

Distributed Knowledge Transfer

The proposed method applies knowledge distillation to the distributed continual scenario. The architecture attaches two distinct classification heads (Fig. 3.2 in the thesis) to a shared feature extractor. The first head, called the continual learning (CL) head, uses cross-entropy loss to optimize the model's performance on the hard targets of the current experience, while the second head, called the student (ST) head, adopts another loss function (typically the KD loss or MSE) using as targets the predictions of another model on the very same experience.
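The following sketch shows one way such a two-head model could be laid out in PyTorch. It is only an illustration of the idea: the class and attribute names (DKTModel, cl_head, st_head) and the use of plain linear heads are assumptions, not the code of this repository.

import torch
import torch.nn as nn

class DKTModel(nn.Module):
    # Minimal sketch: a shared feature extractor feeding two classification heads.
    def __init__(self, feature_extractor: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.feature_extractor = feature_extractor
        # CL head: trained with cross-entropy on the hard targets of the current experience
        self.cl_head = nn.Linear(feat_dim, num_classes)
        # ST head: trained against the predictions of another (teacher) model
        self.st_head = nn.Linear(feat_dim, num_classes)

    def forward(self, x: torch.Tensor):
        feats = self.feature_extractor(x)
        return self.cl_head(feats), self.st_head(feats)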





The loss function is the sum of two head-specific loss functions:

$$\begin{equation} \mathcal{L} = \mathcal{L}_{cl} + \mathcal{L}_{st} \end{equation}$$



The first term of the sum corresponds to the Continual Learning (CL) head. It is the classic cross-entropy between the target $t_i$ and $q_{cl}$, the softmax output of the CL head:

$$\begin{equation} \mathcal{L}_{cl} = \mathcal{L}_{ce}(q_{cl}) = - \sum_{i=1}^{C} t_i \log q_{cl}^{(i)} \end{equation}$$



The second term corresponds to the Student (ST) head. It is the KD loss (in this case, MSE has been used) between $\hat{q}_{tc}$, the soft targets of the teacher model, and $\hat{q}_{st}$, the soft targets of the student head, both distilled at the same temperature:

$$\begin{equation} \mathcal{L}_{st} = \mathcal{L}_{kd}(q_{st}) = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{q}_{tc}^{(i)} - \hat{q}_{st}^{(i)} \right)^2 \end{equation}$$
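As a rough sketch, the combined loss could be computed as below, assuming plain PyTorch logits for both heads and for the teacher; the function name dkt_loss and the temperature value are illustrative assumptions, not taken from the repository.

import torch.nn.functional as F

def dkt_loss(cl_logits, st_logits, teacher_logits, targets, temperature=2.0):
    # L_cl: cross-entropy on the hard targets of the current experience
    loss_cl = F.cross_entropy(cl_logits, targets)
    # Soft targets of the teacher and of the student head, distilled at the same temperature
    q_tc = F.softmax(teacher_logits / temperature, dim=1)
    q_st = F.softmax(st_logits / temperature, dim=1)
    # L_st: MSE between the two soft-target distributions
    loss_st = F.mse_loss(q_st, q_tc)
    return loss_cl + loss_st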



Requirements

The requirements are contained in the requirements.txt file and can be installed via pip:

pip install -r requirements.txt

The project has been developed using the Avalanche Continual Learning library, which is based on PyTorch. Most of the dependencies are already included in the Avalanche installation.

Please note that the requirements were exported directly from the conda environment, so they may need to be pruned.

You can use newer versions of CUDA/PyTorch, as I was limited by the NVIDIA driver of the machine I was working on.

How to run

The thesis consists of three experiments, and each one can be run with its specific script:

Experiment 1 -----> cifar100_training.py
Experiment 2 -----> splitcifar100_pretrained.py
Experiment 3 -----> step_training.py

They can be executed via python:

python cifar100_training.py

Further information

If you want to know more about this project, you can consult my Master's thesis.
