
Can not transfer-learning with different number of "classes" #156

Open
thusinh1969 opened this issue Aug 5, 2021 · 5 comments

Comments


thusinh1969 commented Aug 5, 2021

Describe the bug
I do not seem to be able to do transfer learning from my own pretrained model (both are conditionally trained models). The pretrained model has 20 "conditional classes" and was performing well. I then tried to use the same model for transfer learning on another dataset, but with 34 "conditional classes", and got these errors:

Resuming from "./results/CHECKPOINT/network-snapshot-001400.pkl"
Traceback (most recent call last):
  File "train_GPU_0.py", line 547, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "C:\ProgramData\Anaconda3\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "C:\ProgramData\Anaconda3\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\ProgramData\Anaconda3\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\click\decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "train_GPU_0.py", line 540, in main
    subprocess_fn(rank=0, args=args, temp_dir=temp_dir)
  File "train_GPU_0.py", line 389, in subprocess_fn
    training_loop.training_loop(rank=rank, **args)
  File "D:\AI\Furnitures\dataset_AA\GAN_DATA_for_training\GAN_data_NEW_COMBINED_HOUSE_ROOM\Individual_Style_to_Context_dataset_corrected\Contemporary\StyleGANV2-pytorch\training\training_loop.py", line 163, in training_loop
    misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
  File "D:\AI\Furnitures\dataset_AA\GAN_DATA_for_training\GAN_data_NEW_COMBINED_HOUSE_ROOM\Individual_Style_to_Context_dataset_corrected\Contemporary\StyleGANV2-pytorch\torch_utils\misc.py", line 160, in copy_params_and_buffers
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (34) must match the size of tensor b (20) at non-singleton dimension 1

To Reproduce
I have tried a few times with different datasets; I get the same kind of error each time.

I thought we should be able to do transfer learning regardless of the number of classes, to take advantage of the pretrained weights for most of the network?

Any help is highly appreciated.
Steve
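The failure can be sketched without StyleGAN2 itself: the resume path copies every parameter element-wise, which requires identical shapes, and the class-embedding weight carries the class count in one of its dimensions. A toy, pure-Python sketch (the names and shapes below are illustrative assumptions, not the real network's):

```python
# Toy sketch of why resuming fails (pure Python, not actual StyleGAN2 code).
# Parameters are modeled as name -> (shape, value); names and shapes are
# illustrative, except that the class count sits in dimension 1.

def copy_params(src, dst):
    """Copy src entries into dst in place; shapes must match exactly,
    mirroring what an element-wise tensor.copy_() requires."""
    for name, (shape, _) in dst.items():
        if name in src:
            src_shape, src_val = src[name]
            if src_shape != shape:
                raise RuntimeError(
                    f"The size of tensor a ({shape[1]}) must match the size "
                    f"of tensor b ({src_shape[1]}) for '{name}'")
            dst[name] = (shape, src_val)

# Pretrained model: 20 classes. New model: 34 classes.
src = {"mapping.embed.weight": ((512, 20), "pretrained"),
       "synthesis.b4.conv1.weight": ((512, 512, 3, 3), "pretrained")}
dst = {"mapping.embed.weight": ((512, 34), "random-init"),
       "synthesis.b4.conv1.weight": ((512, 512, 3, 3), "random-init")}

try:
    copy_params(src, dst)
except RuntimeError as e:
    print(e)  # tensor a (34) vs tensor b (20): the embedding cannot be copied
```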

thusinh1969 commented Aug 5, 2021

This is answered in #98 already, but only for non-conditional vs. conditional. Any idea how to do transfer learning between two conditional models?

Would something like this work? Where would we apply this change?
(from Lightning-AI/pytorch-lightning#4690 (comment))

def on_load_checkpoint(self, checkpoint: dict) -> None:
    state_dict = checkpoint["state_dict"]
    model_state_dict = self.state_dict()
    is_changed = False
    for k in state_dict:
        if k in model_state_dict:
            if state_dict[k].shape != model_state_dict[k].shape:
                logger.info(f"Skip loading parameter: {k}, "
                            f"required shape: {model_state_dict[k].shape}, "
                            f"loaded shape: {state_dict[k].shape}")
                state_dict[k] = model_state_dict[k]
                is_changed = True
        else:
            logger.info(f"Dropping parameter {k}")
            is_changed = True

    if is_changed:
        checkpoint.pop("optimizer_states", None)

Thanks,
Steve

Gass2109 commented Aug 6, 2021

We can do transfer learning between two conditional models with different numbers of classes, but in that case we must not copy the parameters of the embedding layer "mapping.embed" in G and D (its shape depends on the number of classes taken as input). For this, you need to modify the function "copy_params_and_buffers" in "torch_utils/misc.py" so that it does not copy all the parameters of the pretrained model.
For example, you can use

if name in src_tensors and "embed" not in name:
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
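A slightly more general variant of the same idea (a pure-Python sketch, not the actual torch_utils/misc.py code) skips any tensor whose shape differs rather than hard-coding the name "embed", so every shape-compatible weight still transfers:

```python
# Shape-based variant of the skip (a sketch; copy_matching and the dict layout
# are illustrative, not the real torch_utils/misc.py). Entries whose shapes
# differ are skipped instead of matching on the name "embed".

def copy_matching(src_state, dst_state):
    """Copy entries whose names AND shapes match; return the skipped names."""
    skipped = []
    for name, (shape, _) in dst_state.items():
        if name not in src_state:
            continue
        src_shape, src_val = src_state[name]
        if src_shape == shape:
            dst_state[name] = (shape, src_val)
        else:
            skipped.append(name)
    return skipped

src = {"mapping.embed.weight": ((512, 20), "pretrained"),
       "synthesis.b4.conv1.weight": ((512, 512, 3, 3), "pretrained")}
dst = {"mapping.embed.weight": ((512, 34), "random-init"),
       "synthesis.b4.conv1.weight": ((512, 512, 3, 3), "random-init")}

print(copy_matching(src, dst))              # ['mapping.embed.weight'] skipped
print(dst["synthesis.b4.conv1.weight"][1])  # 'pretrained': the rest transfers
```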

thusinh1969 commented Aug 6, 2021

https://github.com/Gass2109 I changed the code to what you proposed and got this error:

def copy_params_and_buffers(src_module, dst_module, require_all=False):
    assert isinstance(src_module, torch.nn.Module)
    assert isinstance(dst_module, torch.nn.Module)
    src_tensors = {name: tensor for name, tensor in named_params_and_buffers(src_module)}
    for name, tensor in named_params_and_buffers(dst_module):
        assert (name in src_tensors) or (not require_all)
        if name in src_tensors and "embed" not in name:
            tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)

------------ ERROR ---------------

File ".\training\networks.py", line 602, in forward
y = x.reshape(G, -1, F, c, H, W) # [GnFcHW] Split minibatch N into n groups of size G, and channels C into F groups of size c.
RuntimeError: shape '[8, -1, 1, 512, 4, 4]' is invalid for input of size 163840

Any hint, please?

Thanks,
Steve


thusinh1969 commented Aug 6, 2021


My BAD !!! I had set the batch size to 20, but it must be divisible by 8; I changed it to 24 and it works now, thank you.

Steve
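For anyone hitting the same reshape error: the numbers line up with the layer splitting the minibatch into groups of size G = 8 (the leading 8 in the reported shape '[8, -1, 1, 512, 4, 4]'), so the batch size must be divisible by 8. A quick arithmetic check (`can_group` is an illustrative helper, not repo code):

```python
# Why batch size 20 fails but 24 works: the layer reshapes activations
# [N, C, H, W] into [G, -1, F, C//F, H, W] with G = 8, F = 1, so the total
# element count must split evenly into G groups, i.e. N % G == 0.
# Numbers below match the reported error exactly.

def can_group(N, C, H, W, G, F=1):
    """Return True if [N, C, H, W] can be reshaped to [G, -1, F, C//F, H, W]."""
    total = N * C * H * W
    per_group = G * F * (C // F) * H * W
    return total % per_group == 0 and N % G == 0

print(20 * 512 * 4 * 4)               # 163840, the input size in the error
print(can_group(20, 512, 4, 4, G=8))  # False -> "shape '[8, -1, 1, 512, 4, 4]' is invalid"
print(can_group(24, 512, 4, 4, G=8))  # True  -> batch size 24 works
```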

@MationPlays

> Answer in #98 already but for non-conditioning vs conditioning. Any idea how to transfer learning between 2 conditioning-models ? […]
>
> Thanks, Steve

Have you managed to implement StyleGAN2 in Lightning? Do you have a repo for this? That would be great!
