Significant difference between alpha weight values when testing the provided finetune weights #141

Open
Polaris7777777 opened this issue Jul 26, 2024 · 0 comments

Polaris7777777 commented Jul 26, 2024

Hi,

Thanks for your excellent work! I'm currently testing the provided finetune weights and have encountered an issue with the alpha weights. It appears that the two components of the alpha weights differ significantly in value.

Here is a snippet of my code and the relevant output:

# lib/model/DSTformer.py - forward()
x = x_st * alpha[:,:,0:1] + x_ts * alpha[:,:,1:2]
print("layer_{} alpha_1_max:{}, alpha_2_min:{}".format(idx, torch.max(alpha[:,:,0:1]).item(), torch.min(alpha[:,:,1:2]).item()))

The printed output shows that the weights alpha[:,:,0:1] and alpha[:,:,1:2] differ substantially:

layer_0 alpha_1_max:9.99598737116969e-10, alpha_2_min:1.0
layer_1 alpha_1_max:4.02332503852541e-23, alpha_2_min:1.0
layer_2 alpha_1_max:1.9260750772076562e-09, alpha_2_min:1.0
layer_3 alpha_1_max:1.2977922027508117e-14, alpha_2_min:1.0
layer_4 alpha_1_max:0.0, alpha_2_min:1.0
layer_0 alpha_1_max:2.1676853645402616e-09, alpha_2_min:1.0
layer_1 alpha_1_max:2.673770831954615e-23, alpha_2_min:1.0
layer_2 alpha_1_max:7.599877394071086e-10, alpha_2_min:1.0
layer_3 alpha_1_max:1.0472003852176476e-14, alpha_2_min:1.0
layer_4 alpha_1_max:0.0, alpha_2_min:1.0
layer_0 alpha_1_max:1.7226050585961161e-09, alpha_2_min:1.0
layer_0 alpha_1_max:1.1236861441332735e-09, alpha_2_min:1.0
layer_1 alpha_1_max:9.52468132608953e-24, alpha_2_min:1.0
layer_1 alpha_1_max:7.056281280022704e-23, alpha_2_min:1.0
layer_2 alpha_1_max:1.3182201996642107e-09, alpha_2_min:1.0
layer_2 alpha_1_max:1.484916811733683e-08, alpha_2_min:1.0
layer_3 alpha_1_max:9.93305372849751e-15, alpha_2_min:1.0
layer_3 alpha_1_max:1.0573479246290488e-14, alpha_2_min:1.0
layer_4 alpha_1_max:0.0, alpha_2_min:1.0
layer_4 alpha_1_max:0.0, alpha_2_min:1.0
......

As you can see, one of the weights is always near zero while the other is near one.
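In case it is useful, here is a minimal diagnostic sketch I could add next to the line above (assuming alpha is the softmax output of the spatial/temporal fusion, so its two components should sum to ~1 elementwise); it prints the mean of each component per layer rather than only the max/min:

# Minimal diagnostic sketch (assumption: alpha comes from a softmax over the
# two fusion branches, so alpha[..., 0] + alpha[..., 1] should be ~1).
import torch

def summarize_alpha(alpha, idx):
    a_st = alpha[..., 0]  # weight on x_st (spatial-first branch)
    a_ts = alpha[..., 1]  # weight on x_ts (temporal-first branch)
    print("layer_{} alpha_st_mean:{:.3e} alpha_ts_mean:{:.3e} sum_mean:{:.3f}".format(
        idx, a_st.mean().item(), a_ts.mean().item(), (a_st + a_ts).mean().item()))

If the per-element sums are indeed close to 1, the output above would simply indicate that the network relies almost entirely on the x_ts branch at these layers.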

Could you provide some insight into why this might be happening?

Thank you for your assistance!
