Optimize ScalarMult using endomorphism #8

Merged
merged 1 commit into from
Feb 5, 2015


jimmysong
Contributor

Optimize ScalarMult using endomorphism

This implements a speedup to ScalarMult using the endomorphism available to secp256k1.

Note the constants lambda, beta, a1, b1, a2 and b2 are from here:

https://bitcointalk.org/index.php?topic=3238.0

Preliminary tests indicate a speedup of 23%-28% (BenchScalarMult).

More speedup can probably be achieved once splitK uses something more like what fieldVal uses. Unfortunately, the prime for this math is the order of G (N), not P.

Note the NAF optimization was specifically not done as that's the purview of another issue.

This closes #1
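As a rough illustration of the endomorphism this description refers to, the sketch below checks that multiplying the generator by lambda gives the same point as multiplying its x coordinate by beta. It uses plain affine arithmetic on `math/big` rather than the package's `fieldVal`, and every name in it (`point`, `add`, `scalarMult`, `endomorphism`) is hypothetical. The lambda and beta constants are the ones quoted later in this thread.

```go
package main

import (
	"fmt"
	"math/big"
)

// hexInt parses a hexadecimal constant and panics on malformed input.
func hexInt(s string) *big.Int {
	v, ok := new(big.Int).SetString(s, 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

var (
	// Field prime of secp256k1: 2^256 - 2^32 - 977.
	p = new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), 256), big.NewInt((1<<32)+977))
	// Generator coordinates.
	gx = hexInt("79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798")
	gy = hexInt("483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8")
	// lambda and beta as quoted in the review comments of this thread.
	lambda = hexInt("5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72")
	beta   = hexInt("7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee")
)

// point is an affine point; a nil x marks the point at infinity.
type point struct{ x, y *big.Int }

// add returns a + b on y^2 = x^3 + 7 over the field of p elements.
func add(a, b point) point {
	if a.x == nil {
		return b
	}
	if b.x == nil {
		return a
	}
	var num, den *big.Int
	if a.x.Cmp(b.x) == 0 {
		if a.y.Cmp(b.y) != 0 || a.y.Sign() == 0 {
			return point{} // a + (-a) is the point at infinity
		}
		// Tangent slope 3x^2 / 2y for doubling.
		num = new(big.Int).Mul(a.x, a.x)
		num.Mul(num, big.NewInt(3))
		den = new(big.Int).Lsh(a.y, 1)
	} else {
		// Chord slope (y2 - y1) / (x2 - x1).
		num = new(big.Int).Sub(b.y, a.y)
		den = new(big.Int).Sub(b.x, a.x)
	}
	den.Mod(den, p)
	den.ModInverse(den, p)
	m := num.Mul(num, den)
	m.Mod(m, p)
	x := new(big.Int).Mul(m, m)
	x.Sub(x, a.x).Sub(x, b.x).Mod(x, p)
	y := new(big.Int).Sub(a.x, x)
	y.Mul(y, m).Sub(y, a.y).Mod(y, p)
	return point{x, y}
}

// scalarMult is plain left-to-right double-and-add, with no optimizations.
func scalarMult(k *big.Int, pt point) point {
	acc := point{}
	for i := k.BitLen() - 1; i >= 0; i-- {
		acc = add(acc, acc)
		if k.Bit(i) == 1 {
			acc = add(acc, pt)
		}
	}
	return acc
}

// endomorphism maps (x, y) to (beta*x mod p, y) with one field multiplication.
func endomorphism(pt point) point {
	x := new(big.Int).Mul(beta, pt.x)
	return point{x.Mod(x, p), pt.y}
}

func equal(a, b point) bool {
	return a.x.Cmp(b.x) == 0 && a.y.Cmp(b.y) == 0
}

func main() {
	g := point{gx, gy}
	fmt.Println("lambda*G == (beta*Gx, Gy):", equal(scalarMult(lambda, g), endomorphism(g)))
}
```

The point of the optimization is that the right-hand side costs a single field multiplication, while the left-hand side is a full 256-bit scalar multiplication.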

@coveralls

Coverage Status

Coverage decreased (-0.57%) when pulling 672ab72 on jimmysong:1 into 4ca0daa on conformal:master.

@coveralls

Coverage Status

Coverage decreased (-0.37%) when pulling 672ab72 on jimmysong:1 into 4ca0daa on conformal:master.

@coveralls

Coverage Status

Coverage decreased (-0.57%) when pulling 672ab72 on jimmysong:1 into 4ca0daa on conformal:master.

@coveralls

Coverage Status

Coverage decreased (-0.35%) when pulling 8843597 on jimmysong:1 into 4ca0daa on conformal:master.

@jimmysong
Contributor Author

I figured out why the splitK function needed the +3. It's because fieldVal doesn't take negative values. This optimization makes negative k values possible. However, adding enough to the c_n's at the right time seems to always produce positive numbers.

@jimmysong
Contributor Author

The +3 fudge is gone. Besides the sign not being passed through, there was a second error, related to left-to-right addition, which I have also fixed.

@coveralls

Coverage Status

Coverage increased (+0.12%) when pulling 5daa225 on jimmysong:1 into 4ca0daa on conformal:master.

@jimmysong
Contributor Author

Removed the modulo logic (actually unnecessary) from the splitK function. Gave an 8-9% speed boost.
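For readers unfamiliar with what a split of k looks like, here is a minimal sketch of the textbook balanced decomposition, written with `math/big` and assuming the a1, b1, a2, b2 and lambda constants quoted later in this thread. It is not the PR's splitK: the function names and the rounding helper are hypothetical. Note that k1 and k2 come out as signed integers and are never reduced modulo N.

```go
package main

import (
	"fmt"
	"math/big"
)

// hexInt parses a (possibly signed) hexadecimal constant.
func hexInt(s string) *big.Int {
	v, ok := new(big.Int).SetString(s, 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

var (
	// Order of the secp256k1 generator, written in 32-bit groups.
	n = hexInt("ffffffff" + "ffffffff" + "ffffffff" + "fffffffe" +
		"baaedce6" + "af48a03b" + "bfd25e8c" + "d0364141")
	lambda = hexInt("5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72")
	a1     = hexInt("3086d221a7d46bcde86c90e49284eb15")
	b1     = hexInt("-e4437ed6010e88286f547fa90abfe4c3")
	a2     = hexInt("114ca50f7a8e2f3f657c1108d9d44cfd8")
	b2     = hexInt("3086d221a7d46bcde86c90e49284eb15")
)

// roundDiv returns num/den rounded to the nearest integer, assuming den > 0.
func roundDiv(num, den *big.Int) *big.Int {
	t := new(big.Int).Lsh(num, 1)
	t.Add(t, den)
	return t.Div(t, new(big.Int).Lsh(den, 1))
}

// splitK returns signed k1, k2 with k1 + k2*lambda congruent to k modulo n.
func splitK(k *big.Int) (k1, k2 *big.Int) {
	c1 := roundDiv(new(big.Int).Mul(b2, k), n)
	c2 := roundDiv(new(big.Int).Neg(new(big.Int).Mul(b1, k)), n)

	// k1 = k - c1*a1 - c2*a2
	k1 = new(big.Int).Set(k)
	k1.Sub(k1, new(big.Int).Mul(c1, a1))
	k1.Sub(k1, new(big.Int).Mul(c2, a2))

	// k2 = -c1*b1 - c2*b2
	k2 = new(big.Int).Mul(c1, b1)
	k2.Add(k2, new(big.Int).Mul(c2, b2))
	k2.Neg(k2)
	return k1, k2
}

// recombine returns (k1 + k2*lambda) mod n.
func recombine(k1, k2 *big.Int) *big.Int {
	r := new(big.Int).Mul(k2, lambda)
	r.Add(r, k1)
	return r.Mod(r, n)
}

func main() {
	// An arbitrary scalar chosen only for this demonstration.
	k := hexInt("d74bf844b0862475103d96a611cf2d898447e288d34b360bc885cb8ce7c00575")
	k1, k2 := splitK(k)
	fmt.Println("k1 bits:", k1.BitLen(), "sign:", k1.Sign())
	fmt.Println("k2 bits:", k2.BitLen(), "sign:", k2.Sign())
	fmt.Println("recombines to k:", recombine(k1, k2).Cmp(k) == 0)
}
```

Because either half can be negative, code that consumes k1 and k2 has to carry the sign separately, which matches the sign-handling problems described in the earlier comments.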

@coveralls

Coverage Status

Coverage increased (+0.33%) when pulling 5aecdcf on jimmysong:1 into 4ca0daa on conformal:master.

@coveralls

Coverage Status

Coverage increased (+0.34%) when pulling b4cadcb on jimmysong:1 into 4ca0daa on conformal:master.

@coveralls

Coverage Status

Coverage decreased (-0.07%) when pulling 7425341 on jimmysong:1 into 4ca0daa on conformal:master.

@coveralls

Coverage Status

Coverage increased (+0.14%) when pulling 611c140 on jimmysong:1 into 4ca0daa on conformal:master.

@coveralls

Coverage Status

Coverage increased (+0.14%) when pulling 4f948a9 on jimmysong:1 into 4ca0daa on conformal:master.

@davecgh
Member

davecgh commented Sep 27, 2014

Thanks for this Jimmy.

I'll review it in detail over the weekend, but I briefly looked over it so far and confirmed the choices for λ and β are accurate. That is to say the following equations hold:

β^3 (mod P) = 1
λ^3 (mod N) = 1
λ^2 + λ + 1 (mod N) = 0
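These three identities can be rechecked with `math/big`. The sketch below assumes the λ and β constants quoted later in this thread; the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
)

// hexInt parses a hexadecimal constant and panics on malformed input.
func hexInt(s string) *big.Int {
	v, ok := new(big.Int).SetString(s, 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

var (
	// Field prime P = 2^256 - 2^32 - 977 and group order N of secp256k1.
	p = new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), 256), big.NewInt((1<<32)+977))
	n = hexInt("ffffffff" + "ffffffff" + "ffffffff" + "fffffffe" +
		"baaedce6" + "af48a03b" + "bfd25e8c" + "d0364141")
	lambda = hexInt("5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72")
	beta   = hexInt("7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee")
)

// isNontrivialCubeRoot reports whether v^3 mod m == 1 and v != 1.
func isNontrivialCubeRoot(v, m *big.Int) bool {
	one := big.NewInt(1)
	cube := new(big.Int).Exp(v, big.NewInt(3), m)
	return cube.Cmp(one) == 0 && v.Cmp(one) != 0
}

// satisfiesQuadratic reports whether v^2 + v + 1 mod m == 0.
func satisfiesQuadratic(v, m *big.Int) bool {
	s := new(big.Int).Mul(v, v)
	s.Add(s, v).Add(s, big.NewInt(1)).Mod(s, m)
	return s.Sign() == 0
}

func main() {
	fmt.Println("beta^3 mod P == 1:        ", isNontrivialCubeRoot(beta, p))
	fmt.Println("lambda^3 mod N == 1:      ", isNontrivialCubeRoot(lambda, n))
	fmt.Println("lambda^2+lambda+1 mod N=0:", satisfiesQuadratic(lambda, n))
}
```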

@coveralls

Coverage Status

Coverage increased (+0.32%) when pulling 8848df1 on jimmysong:1 into d694428 on conformal:master.

@coveralls

Coverage Status

Coverage increased (+0.11%) when pulling 386a945 on jimmysong:1 into d694428 on conformal:master.

@coveralls

Coverage Status

Coverage increased (+0.11%) to 97.97% when pulling 2206dff on jimmysong:1 into f9365fd on btcsuite:master.

@jimmysong
Contributor Author

@davecgh I've rebased this one so the merge is smooth. Would it be possible to look at this and #10 soonish? I'd hate for these PRs to just sit there.

@coveralls

Coverage Status

Coverage increased (+0.33%) to 97.25% when pulling 69d8e09 on jimmysong:1 into 9535058 on btcsuite:master.

@jimmysong
Contributor Author

@davecgh, I finally understand how lambda and beta are derived using Fermat's Little Theorem:

http://bitcoin.stackexchange.com/questions/35814/how-do-you-derive-the-lambda-and-beta-values-for-endomorphism-on-the-secp256k1-c/

Might be of interest to you since you were wondering this yourself.

// G^N = 1 and thus any other valid point on the elliptical curve has the
// same order.
func (curve *KoblitzCurve) moduloReduce(k []byte) []byte {
var newK []byte
Member


I think we can do without this additional local here. Inside the first if branch, just return tmpK.Bytes(). Then just eliminate the else branch and return k.

if len(k) > curve.BitSize/8 {
    tmpK := new(big.Int).SetBytes(k)
    tmpK.Mod(tmpK, curve.N)
    return tmpK.Bytes()
}
return k

@jimmysong
Contributor Author

@davecgh @jrick , made changes to address your concerns.

@jimmysong
Contributor Author

@davecgh, @jrick, something is weird with Travis. It doesn't recognize the command "go get". Can you rerun?

@davecgh
Member

davecgh commented Feb 3, 2015

Thanks for updating @jimmysong. It appears Travis just updated their release version of Go, so I had to modify the .travis.yml. If you rebase on top of master, it will build again.

Also, I'll just modify it after the PR goes in to avoid going back and forth, but I absolutely do not like the const curveSize stuff. I know that currently it only does secp256k1, but I want to move the package more towards being able to support other curves, not further away like that change does.

The constant needs to be defined on the curve, so other curves can work as well.

@jimmysong
Contributor Author

@davecgh, rebased and took out the const stuff.

Is there a way to make a const field in a struct in go or is the way I did it acceptable?

@coveralls

Coverage Status

Coverage increased (+0.34%) to 97.05% when pulling 3a91c2f on jimmysong:1 into 46829e8 on btcsuite:master.

@davecgh
Member

davecgh commented Feb 3, 2015

@jimmysong The way you did it is great. Thanks!
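On the question itself: Go has no const struct fields, so a per-curve constant is usually approximated by a field that is set once at construction. The thread does not show which approach the PR took, so the sketch below is only one common pattern, and `curveSketch`, `newCurveSketch` and `ByteSize` are hypothetical names.

```go
package main

import "fmt"

// curveSketch is a hypothetical stand-in for a curve type; it is not the
// package's KoblitzCurve.
type curveSketch struct {
	BitSize int
	// byteSize is derived once in newCurveSketch. Being unexported, it cannot
	// be reassigned from other packages, which is as close to const as a
	// struct field gets in Go.
	byteSize int
}

// newCurveSketch builds a curve description for the given bit size.
func newCurveSketch(bitSize int) *curveSketch {
	return &curveSketch{BitSize: bitSize, byteSize: bitSize / 8}
}

// ByteSize exposes the derived value read-only.
func (c *curveSketch) ByteSize() int { return c.byteSize }

func main() {
	c := newCurveSketch(256)
	fmt.Println(c.ByteSize())
}
```

Keeping the value on the curve, rather than in a package-level const, is what lets a second curve with a different size reuse the same code.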

This implements a speedup to ScalarMult using the endomorphism available to secp256k1.

Note the constants lambda, beta, a1, b1, a2 and b2 are from here:

https://bitcointalk.org/index.php?topic=3238.0

Preliminary tests indicate a speedup of 17%-20% (BenchScalarMult).

More speedup can probably be achieved once splitK uses something more like what fieldVal uses. Unfortunately, the prime for this math is the order of G (N), not P.

Note the NAF optimization was specifically not done as that's the purview of another issue.

Changed both ScalarMult and ScalarBaseMult to take advantage of curve.N to reduce k.
This results in an 80% speedup for large values of k for ScalarBaseMult.
Note the new test BenchmarkScalarBaseMultLarge is how that speedup number can
be checked.

This closes btcsuite#1
@@ -18,8 +18,7 @@ var secp256k1BytePoints = []byte{}
// 0..n-1 where n is the curve's bit size (256 in the case of secp256k1)
// the coordinates are recorded as Jacobian coordinates.
func (curve *KoblitzCurve) getDoublingPoints() [][3]fieldVal {
bitSize := curve.Params().BitSize
Member


The bitSize is used below in the loop too. It should be updated to curve.BitSize as well or the generation code will fail to compile.

You can run rm secp256k1.go; go generate to test.

@davecgh
Member

davecgh commented Feb 4, 2015

Alright, so I'm letting this run every signature on the block chain before merging just to be paranoid, but I've independently derived and double checked all of the math and everything looks accurate. In particular:

The possible values for λ and β are derived with:

λ = 2^((N-1) / 3) (mod N) = ac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce
λ = 3^((N-1) / 3) (mod N) = 5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72
β = 2^((P-1) / 3) (mod P) = 7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee
β = 3^((P-1) / 3) (mod P) = 851695d49a83f8ef919bb86153cbcb16630fb68aed0a766a3ec693d68e6afa40
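The derivation above is a single modular exponentiation per candidate. A minimal sketch with `math/big` follows; `cubeRootOfUnity` is a hypothetical helper name, and the exponentiation is taken modulo N for λ and modulo P for β. By Fermat's Little Theorem the result always cubes to 1, although a base that happens to be a cube would give the trivial root 1.

```go
package main

import (
	"fmt"
	"math/big"
)

// cubeRootOfUnity returns g^((m-1)/3) mod m. It assumes m is prime and that
// 3 divides m-1, which holds for both the field prime and the group order.
func cubeRootOfUnity(g int64, m *big.Int) *big.Int {
	e := new(big.Int).Sub(m, big.NewInt(1))
	e.Div(e, big.NewInt(3))
	return new(big.Int).Exp(big.NewInt(g), e, m)
}

// cubesToOne reports whether v^3 mod m == 1.
func cubesToOne(v, m *big.Int) bool {
	c := new(big.Int).Exp(v, big.NewInt(3), m)
	return c.Cmp(big.NewInt(1)) == 0
}

// secp256k1Moduli returns the field prime P and the group order N.
func secp256k1Moduli() (p, n *big.Int) {
	p = new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), 256), big.NewInt((1<<32)+977))
	n, ok := new(big.Int).SetString("ffffffff"+"ffffffff"+"ffffffff"+"fffffffe"+
		"baaedce6"+"af48a03b"+"bfd25e8c"+"d0364141", 16)
	if !ok {
		panic("bad order constant")
	}
	return p, n
}

func main() {
	p, n := secp256k1Moduli()
	for _, g := range []int64{2, 3} {
		l := cubeRootOfUnity(g, n)
		b := cubeRootOfUnity(g, p)
		fmt.Printf("g=%d lambda=%064x cubes to 1: %v\n", g, l, cubesToOne(l, n))
		fmt.Printf("g=%d beta  =%064x cubes to 1: %v\n", g, b, cubesToOne(b, p))
	}
}
```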

The two possible λ and β values are indeed squares of one another:

λ^2 = ac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce^2 (mod N) = 5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72
λ^2 = 5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72^2 (mod N) = ac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce
β^2 = 7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee^2 (mod P) = 851695d49a83f8ef919bb86153cbcb16630fb68aed0a766a3ec693d68e6afa40
β^2 = 851695d49a83f8ef919bb86153cbcb16630fb68aed0a766a3ec693d68e6afa40^2 (mod P) = 7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee

Benchmarking the available options for λ and β empirically shows that the values chosen in this PR provide the greatest speedup. In particular, the generator chosen for λ is 3, while the generator chosen for β is 2.

The values chosen for the linearly independent vectors used during computation of the endomorphism have been independently derived and verified to satisfy the equation f(v) = a+bλ mod N = 0 for the chosen λ:

a1 = 3086d221a7d46bcde86c90e49284eb15
b1 = -e4437ed6010e88286f547fa90abfe4c3
a2 = 114ca50f7a8e2f3f657c1108d9d44cfd8
b2 = 3086d221a7d46bcde86c90e49284eb15

3086d221a7d46bcde86c90e49284eb15 + -e4437ed6010e88286f547fa90abfe4c3 * λ (mod N) = 0
114ca50f7a8e2f3f657c1108d9d44cfd8 + 3086d221a7d46bcde86c90e49284eb15 * λ (mod N) = 0
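The two congruences above can be rechecked in a few lines of `math/big`. This is only a sketch: `inLattice` is a hypothetical helper, and λ is taken to be the generator-3 value stated above.

```go
package main

import (
	"fmt"
	"math/big"
)

// hexInt parses a (possibly signed) hexadecimal constant.
func hexInt(s string) *big.Int {
	v, ok := new(big.Int).SetString(s, 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

var (
	// Order of the secp256k1 generator, written in 32-bit groups.
	n = hexInt("ffffffff" + "ffffffff" + "ffffffff" + "fffffffe" +
		"baaedce6" + "af48a03b" + "bfd25e8c" + "d0364141")
	lambda = hexInt("5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72")
	a1     = hexInt("3086d221a7d46bcde86c90e49284eb15")
	b1     = hexInt("-e4437ed6010e88286f547fa90abfe4c3")
	a2     = hexInt("114ca50f7a8e2f3f657c1108d9d44cfd8")
	b2     = hexInt("3086d221a7d46bcde86c90e49284eb15")
)

// inLattice reports whether a + b*lambda is congruent to 0 modulo n.
func inLattice(a, b *big.Int) bool {
	s := new(big.Int).Mul(b, lambda)
	s.Add(s, a).Mod(s, n)
	return s.Sign() == 0
}

func main() {
	fmt.Println("a1 + b1*lambda = 0 (mod N):", inLattice(a1, b1))
	fmt.Println("a2 + b2*lambda = 0 (mod N):", inLattice(a2, b2))
}
```

This property is what makes any split built from these vectors recombine to the original scalar: subtracting integer multiples of (a1, b1) and (a2, b2) changes k1 + k2λ only by multiples of N.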

Finally, the following equations hold as required:

λ^3 (mod N) = 1
β^3 (mod P) = 1
λ^2 + λ + 1 (mod N) = 0

@conformal-deploy conformal-deploy merged commit 95b23c2 into btcsuite:master Feb 5, 2015
Successfully merging this pull request may close these issues.

Make use of the secp256k1 endormorphism
5 participants