
Vivado-equivalent implementation of Softmax on Quartus #540

Merged: 5 commits merged into fastmachinelearning:master on May 10, 2022

Conversation

@bo3z (Contributor) commented May 5, 2022

  • Implemented fixed_point_utils.py, which emulates some fixed-point and bit-manipulation operations in Python. These are usually done through HLS, but are needed in Python for generating LUTs. For the reasoning behind the Softmax LUT, see the Vivado implementation in templates/vivado/nnet_utils/nnet_activation.h.
  • Stable and Latency implementations of Softmax. Stable performs well (>98% accuracy). As in Vivado, the Latency approach needs further work to improve accuracy.
  • Expanded the PyTest suite to include Softmax tests on Quartus.

@thesps (Contributor) commented May 5, 2022

How about moving the computation of the table data out of the writer and into a backend optimizer pass? You could attach the table data to the layer in an analogous way to weights for other layers.
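
For concreteness, a minimal sketch of such a pass, assuming the OptimizerPass interface from hls4ml.model.optimizer and the add_weights_variable helper on layers; the table data computed here is a placeholder, not the PR's actual LUT logic:

import numpy as np
from hls4ml.model.optimizer import OptimizerPass

class ProcessSoftmaxTables(OptimizerPass):
    '''Compute the Softmax LUTs once, on the model graph, instead of in the writer.'''

    def match(self, node):
        # Assumed: Softmax layers are identifiable by class name.
        return node.class_name == 'Softmax'

    def transform(self, model, node):
        table_size = node.get_attr('table_size', 1024)
        # Placeholder table data; the real pass would use the
        # FixedPointEmulator logic discussed below.
        exp_table = np.exp(np.linspace(-8.0, 8.0, table_size))
        # Attach the data to the layer the same way weights are attached,
        # so the writer can emit it without recomputing anything.
        node.add_weights_variable(name='exp_table', data=exp_table)
        return False  # graph structure unchanged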

@bo3z (Contributor, Author) commented May 5, 2022

I don't see why not. But then, for consistency's sake, we should do that for all of the other activations as well. I'm more than happy to do that, but shouldn't it be part of a separate, not necessarily Softmax-specific, PR?

@thesps (Contributor) commented May 6, 2022

> for consistency's sake, we should do that for all of the other activations as well. I'm more than happy to do that, but shouldn't it be part of a separate, not necessarily Softmax-specific, PR?

I agree, let's not do that here. But I think it still makes sense to do it for softmax in an optimizer pass. I think these write_exp_table and write_inv_table methods are for softmax specifically, right?

I think the logic using this new FixedPointEmulator is not quite right. For example, for the inversion:

for i in range(table_size):
    f = FixedPointEmulator(fp_bits, fp_integer, signed=fp_signed)
    f.set_msb_bits(uint_to_binary(i, N))
    if f.to_float() != 0:
        real_val = f.inv_float()

It looks essentially like you're using the same type for the address (i) and the data stored at that location (f), by setting f.set_msb_bits(uint_to_binary(i, N)). The logic of the Vivado version is:

for each address:
   get the float value corresponding to the address (given its type)
   apply the float function to that float (exp or inv)
   write the fixed-precision representation of that float to the table at the address

I think what's happening here in the Quartus code is a little different.
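
As a concrete reading of that pseudocode, a Python sketch could look like the following; it assumes the PR's fixed_point_utils API as quoted above (import path assumed), and the addr_t_* variables, standing in for the precision of the value that indexes the table, are hypothetical:

from fixed_point_utils import FixedPointEmulator, uint_to_binary  # path assumed

N = 10                     # number of address bits, so table_size = 2 ** N
table_size = 2 ** N
inv_table = [0.0] * table_size

for i in range(table_size):
    # 1. Get the float value corresponding to address i, given the type
    #    of the data that will index the table (not the table entry type).
    x = FixedPointEmulator(addr_t_bits, addr_t_integer, signed=addr_t_signed)
    x.set_msb_bits(uint_to_binary(i, N))
    real_val = x.to_float()
    # 2. Apply the float function (inv here, exp for the other table).
    # 3. Store the result at the address; conversion to the table's
    #    fixed-point type happens when the generated file is compiled.
    inv_table[i] = 1.0 / real_val if real_val != 0.0 else 0.0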

The pytest for Softmax just checks the accuracy by taking the argmax at the end; could you take a look at the actual values produced to check this (e.g. here)?

@bo3z (Contributor, Author) commented May 6, 2022

> It looks essentially like you're using the same type for the address (i) and the data stored at that location (f), by setting f.set_msb_bits(uint_to_binary(i, N))

Re: the logic behind this is:

  1. Given an integer i (the address or index in this case), extract its bits as an array of length N. The number of bits needed for i never exceeds N, because the index only loops from 0 to table_size - 1 = 2^N - 1; N is usually 10 (a table of size 1024).

  2. Create a fixed-point number with some total number of bits (default 16) and integer bits (default 6), and set its first N bits to the bits of i. So in the default case, the 6 integer bits take the first 6 bits of i, the next 4 fractional bits take the remaining 4 bits of i, and the last 6 bits are unaffected.

  3. This is internally converted to a float before applying any operation (inv or exp).

To my understanding, this is equivalent to the Vivado C++ function below:

template<class data_T, typename CONFIG_T>
inline float softmax_real_val_from_idx(unsigned i){
    // Treat the index as the top N bits
    static constexpr int N = ceillog2(CONFIG_T::table_size); // number of address bits for table
    data_T x(0);
    x(x.width-1, x.width-N) = i;
    return (float) x;
} 

I did also check the outputs with Vivado HLS and a stand-alone Python script to test this functionality, and the values were equal.

  4. Final step: given a float obtained by such bit manipulation (steps 2-3), apply the function. This is done in Python by the two functions exp_float and inv_float. However, this produces a floating-point result, which is saved to an external file. This is fine, because Quartus HLS can inherently handle floating to fixed-point conversion at run-time. (A self-contained sketch of steps 1-4 follows below.)
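
To make steps 1-4 concrete, here is a minimal, self-contained Python illustration (unsigned case only; it is independent of the PR's fixed_point_utils.py, so the function name is mine):

import math

def real_val_from_idx(i, N=10, width=16, integer=6):
    '''Steps 1-3: place the N address bits of i in the MSBs of a width-bit
    fixed-point word with `integer` integer bits, then read that word back
    as a float (unsigned case).'''
    # Step 1: bits of i as an array of length N, MSB first.
    addr_bits = [(i >> (N - 1 - b)) & 1 for b in range(N)]
    # Step 2: the top N bits of the word take the bits of i; the
    # remaining (width - N) LSBs are left at zero.
    word = 0
    for b, bit in enumerate(addr_bits):
        word |= bit << (width - 1 - b)
    # Step 3: scale by 2^-(number of fractional bits) to get the float.
    return word * 2.0 ** -(width - integer)

# Step 4: apply the function and keep the float result for the table file.
exp_table = [math.exp(real_val_from_idx(i)) for i in range(2 ** 10)]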

However, there seems to be a discrepancy in the raw values between Quartus and Vivado, so I will fix that.

@bo3z (Contributor, Author) commented May 6, 2022

> I agree, let's not do that here. But I think it still makes sense to do it for softmax in an optimizer pass. I think these write_exp_table and write_inv_table methods are for softmax specifically, right?

No, this was a recent change. Look-up tables for Quartus need to be generated from Python and written to external files before compile time. Unlike Vivado (which can generate the tables only once and re-use them), this doesn't work on Quartus (or so I've been told). Therefore, all look-up tables (Sigmoid, Softmax, Tanh, Softplus etc.) are generated from quartus_writer.py. This used to be one big method generating all of these tables one by one, but it was recently split up in a PR attempting to improve Sigmoid accuracy. The exact commit is here: 0258f23
So I agree we should definitely make the change, but these methods are not Softmax-specific.
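
For illustration only, generating one such table from Python and writing it to an external file before compilation might look like the sketch below; the file name, sampling range, and text format are assumptions, not the exact output of quartus_writer.py:

import math

def write_tanh_table(path, table_size=1024, lo=-4.0, hi=4.0):
    # Sample tanh uniformly over [lo, hi) and emit a C array of floats;
    # the HLS compiler converts each float to the table's fixed-point
    # type when the generated file is included.
    step = (hi - lo) / table_size
    with open(path, 'w') as f:
        f.write('static const float tanh_table[%d] = {\n' % table_size)
        for i in range(table_size):
            f.write('    %.12f,\n' % math.tanh(lo + i * step))
        f.write('};\n')

write_tanh_table('tanh_table.tb')  # output path assumed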

@thesps (Contributor) commented May 6, 2022


I think the issue is just that, in the Vivado code you showed, the computation is done for data_T (the data type of the layer input), whereas the new Quartus writer code is doing this on exp_table_t and inv_table_t. But in general those three could all be different.

@bo3z (Contributor, Author) commented May 6, 2022

> I think the issue is just that, in the Vivado code you showed, the computation is done for data_T (the data type of the layer input), whereas the new Quartus writer code is doing this on exp_table_t and inv_table_t. But in general those three could all be different.

Exactly, I just realised that an hour ago. So it needs to be changed from exp_table_t and inv_table_t to the precision used by that layer.
Adding on to this: the exponentials should be generated with a fixed-point type of data_T, and the inverses with inv_table_t, as in Vivado. Should be straightforward to fix.
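
Concretely, and under the same assumptions as the earlier sketch in this thread (assumed import path, hypothetical data_t_* placeholders for the layer input precision), the fix is just a change of the type used to decode the address:

from fixed_point_utils import FixedPointEmulator, uint_to_binary  # path assumed

N = 10
table_size = 2 ** N
exp_table = [0.0] * table_size

for i in range(table_size):
    # Before: f = FixedPointEmulator(exp_table_t_bits, exp_table_t_integer, ...)
    # After: decode the address with the layer input precision (data_T),
    # as the Vivado implementation does.
    f = FixedPointEmulator(data_t_bits, data_t_integer, signed=data_t_signed)
    f.set_msb_bits(uint_to_binary(i, N))
    exp_table[i] = f.exp_float()  # float; converted to exp_table_t at compile time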

@bo3z (Contributor, Author) commented May 6, 2022

Addressed the discrepancy between data and tables, as explained above. Manually checked the results against Keras predictions for 5 tests on 5 data points; the results are accurate. The RegEx was removed in favour of attributes. Commit with the main changes: 937cc49

thesps merged commit 6f6e3b2 into fastmachinelearning:master on May 10, 2022
calad0i pushed a commit referencing this pull request to calad0i/hls4ml on Jul 1, 2023: Vivado-equivalent implementation of Softmax on Quartus