
Vivado-equivalent implementation of Softmax on Quartus #540

Merged: 5 commits merged into fastmachinelearning:master on May 10, 2022

Conversation

@bo3z (Contributor) commented May 5, 2022

  • Implemented fixed_point_utils.py, which emulates some fixed-point and bit-manipulation operations in Python. These are usually done through HLS, but are needed in Python for generating LUTs. For the reasoning behind the Softmax LUT, see the Vivado implementation in templates/vivado/nnet_utils/nnet_activation.h.
  • Stable and Latency implementations of Softmax. Stable performs well (>98% accuracy). As in Vivado, the Latency approach needs further work to improve accuracy.
  • Expanded the PyTest suite to include Softmax tests on Quartus.

@thesps (Contributor) commented May 5, 2022

How about moving the computation of the table data out of the writer and into a backend optimizer pass? You could attach the table data to the layer in an analogous way to weights for other layers.
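
For concreteness, a minimal sketch of such a pass, assuming the OptimizerPass interface from hls4ml.model.optimizer and the add_weights_variable helper on layers; the table data computed here is a placeholder, not the PR's actual LUT logic:

import numpy as np
from hls4ml.model.optimizer import OptimizerPass

class ProcessSoftmaxTables(OptimizerPass):
    '''Compute the Softmax LUTs once, on the model graph, instead of in the writer.'''

    def match(self, node):
        # Assumed: Softmax layers are identifiable by class name.
        return node.class_name == 'Softmax'

    def transform(self, model, node):
        table_size = node.get_attr('table_size', 1024)
        # Placeholder table data; the real pass would use the
        # FixedPointEmulator logic discussed below.
        exp_table = np.exp(np.linspace(-8.0, 8.0, table_size))
        # Attach the data to the layer the same way weights are attached,
        # so the writer can emit it without recomputing anything.
        node.add_weights_variable(name='exp_table', data=exp_table)
        return False  # graph structure unchanged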

@bo3z (Contributor, Author) commented May 5, 2022

I don't see why not. But then, for consistency's sake, we should do that for all of the other activations as well. I'm more than happy to do that, but shouldn't it be part of a separate, not necessarily Softmax-specific, PR?

@thesps (Contributor) commented May 6, 2022

> for consistency's sake, we should do that for all of the other activations as well. I'm more than happy to do that, but shouldn't it be part of a separate, not necessarily Softmax-specific, PR?

I agree, let's not do that here. But I think it still makes sense to do it for softmax in an optimizer pass. I think these write_exp_table and write_inv_table methods are for softmax specifically, right?

I think the logic using this new FixedPointEmulator is not quite right. For example, for the inversion:

for i in range(table_size):
    f = FixedPointEmulator(fp_bits, fp_integer, signed=fp_signed)
    f.set_msb_bits(uint_to_binary(i, N))
    if f.to_float() != 0:
        real_val = f.inv_float()

It looks essentially like you're using the same type for the address (i) and the data stored at that location (f), by setting f.set_msb_bits(uint_to_binary(i, N)). The logic of the Vivado version is:

for each address:
   get the float value corresponding to the address (given its type)
   apply the float function to that float (exp or inv)
   write the fixed-precision representation of that float to the table at the address

I think what's happening here in the Quartus code is a little different.
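
As a concrete reading of that pseudocode, a Python sketch could look like the following; it assumes the PR's fixed_point_utils API as quoted above (import path assumed), and the addr_t_* variables, standing in for the precision of the value that indexes the table, are hypothetical:

from fixed_point_utils import FixedPointEmulator, uint_to_binary  # path assumed

N = 10                     # number of address bits, so table_size = 2 ** N
table_size = 2 ** N
inv_table = [0.0] * table_size

for i in range(table_size):
    # 1. Get the float value corresponding to address i, given the type
    #    of the data that will index the table (not the table entry type).
    x = FixedPointEmulator(addr_t_bits, addr_t_integer, signed=addr_t_signed)
    x.set_msb_bits(uint_to_binary(i, N))
    real_val = x.to_float()
    # 2. Apply the float function (inv here, exp for the other table).
    # 3. Store the result at the address; conversion to the table's
    #    fixed-point type happens when the generated file is compiled.
    inv_table[i] = 1.0 / real_val if real_val != 0.0 else 0.0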

The pytest for Softmax just checks the accuracy by taking the argmax at the end; could you take a look at the actual values produced to check this (e.g. here)?

@bo3z (Contributor, Author) commented May 6, 2022

> It looks essentially like you're using the same type for the address (i) and the data stored at that location (f), by setting f.set_msb_bits(uint_to_binary(i, N))

Re: the logic behind this is:

  1. Given an integer i (the address or index in this case), extract its bits as an array of length N. The number of bits needed for i never exceeds N, because the index only loops from 0 to table_size - 1 = 2^N - 1; N is usually 10 (a table of size 1024).

  2. Create a fixed-point number with some total number of bits (default 16) and integer bits (default 6), and set its first N bits to the bits of i. So in the default case, the 6 integer bits take the first 6 bits of i, the next 4 fractional bits take the remaining 4 bits of i, and the last 6 bits are unaffected.

  3. This is internally converted to a float before applying any operation (inv or exp).

To my understanding, this is equivalent to the Vivado C++ function below:

template<class data_T, typename CONFIG_T>
inline float softmax_real_val_from_idx(unsigned i){
    // Treat the index as the top N bits
    static constexpr int N = ceillog2(CONFIG_T::table_size); // number of address bits for table
    data_T x(0);
    x(x.width-1, x.width-N) = i;
    return (float) x;
} 

I did also check the outputs with Vivado HLS and a stand-alone Python script to test this functionality, and the values were equal.

  4. Final step: given a float obtained by such bit manipulation (steps 2-3), apply the function. This is done in Python by the two functions exp_float and inv_float. However, this produces a floating-point result, which is saved to an external file. This is fine, because Quartus HLS can inherently handle floating to fixed-point conversion at run-time. (A self-contained sketch of steps 1-4 follows below.)
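
To make steps 1-4 concrete, here is a minimal, self-contained Python illustration (unsigned case only; it is independent of the PR's fixed_point_utils.py, so the function name is mine):

import math

def real_val_from_idx(i, N=10, width=16, integer=6):
    '''Steps 1-3: place the N address bits of i in the MSBs of a width-bit
    fixed-point word with `integer` integer bits, then read that word back
    as a float (unsigned case).'''
    # Step 1: bits of i as an array of length N, MSB first.
    addr_bits = [(i >> (N - 1 - b)) & 1 for b in range(N)]
    # Step 2: the top N bits of the word take the bits of i; the
    # remaining (width - N) LSBs are left at zero.
    word = 0
    for b, bit in enumerate(addr_bits):
        word |= bit << (width - 1 - b)
    # Step 3: scale by 2^-(number of fractional bits) to get the float.
    return word * 2.0 ** -(width - integer)

# Step 4: apply the function and keep the float result for the table file.
exp_table = [math.exp(real_val_from_idx(i)) for i in range(2 ** 10)]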

However, there seems to be a discrepancy in the raw values between Quartus and Vivado, so I will fix that.

@bo3z (Contributor, Author) commented May 6, 2022

> I agree, let's not do that here. But I think it still makes sense to do it for softmax in an optimizer pass. I think these write_exp_table and write_inv_table methods are for softmax specifically, right?

No, this was a recent change. Look-up tables for Quartus need to be generated from Python and written to external files before compile time. Unlike Vivado (which can generate the tables only once and re-use them), this doesn't work on Quartus (or so I've been told). Therefore, all look-up tables (Sigmoid, Softmax, Tanh, Softplus etc.) are generated from quartus_writer.py. This used to be one big method generating all of these tables one by one, but it was recently split up in a PR attempting to improve Sigmoid accuracy. The exact commit is here: 0258f23
So I agree we should definitely make the change, but these methods are not Softmax-specific.
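
For illustration only, generating one such table from Python and writing it to an external file before compilation might look like the sketch below; the file name, sampling range, and text format are assumptions, not the exact output of quartus_writer.py:

import math

def write_tanh_table(path, table_size=1024, lo=-4.0, hi=4.0):
    # Sample tanh uniformly over [lo, hi) and emit a C array of floats;
    # the HLS compiler converts each float to the table's fixed-point
    # type when the generated file is included.
    step = (hi - lo) / table_size
    with open(path, 'w') as f:
        f.write('static const float tanh_table[%d] = {\n' % table_size)
        for i in range(table_size):
            f.write('    %.12f,\n' % math.tanh(lo + i * step))
        f.write('};\n')

write_tanh_table('tanh_table.tb')  # output path assumed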

@thesps (Contributor) commented May 6, 2022


I think the issue is just that, in the Vivado code you showed, the computation is done for data_T (the data type of the layer input), whereas the new Quartus writer code is doing this on exp_table_t and inv_table_t. But in general those three could all be different.

@bo3z (Contributor, Author) commented May 6, 2022

> I think the issue is just that, in the Vivado code you showed, the computation is done for data_T (the data type of the layer input), whereas the new Quartus writer code is doing this on exp_table_t and inv_table_t. But in general those three could all be different.

Exactly, I just realised that an hour ago. So it needs to be changed from exp_table_t and inv_table_t to the precision used by that layer.
Adding on to this: the exponentials should be generated with a fixed-point type of data_T, and the inverses with inv_table_t, as in Vivado. Should be straightforward to fix.
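
Concretely, and under the same assumptions as the earlier sketch in this thread (assumed import path, hypothetical data_t_* placeholders for the layer input precision), the fix is just a change of the type used to decode the address:

from fixed_point_utils import FixedPointEmulator, uint_to_binary  # path assumed

N = 10
table_size = 2 ** N
exp_table = [0.0] * table_size

for i in range(table_size):
    # Before: f = FixedPointEmulator(exp_table_t_bits, exp_table_t_integer, ...)
    # After: decode the address with the layer input precision (data_T),
    # as the Vivado implementation does.
    f = FixedPointEmulator(data_t_bits, data_t_integer, signed=data_t_signed)
    f.set_msb_bits(uint_to_binary(i, N))
    exp_table[i] = f.exp_float()  # float; converted to exp_table_t at compile time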

@bo3z (Contributor, Author) commented May 6, 2022

Addressed the discrepancy between data and tables, as explained above. Manually checked the results against Keras predictions for 5 tests on 5 data points; the results are accurate. The RegEx was removed in favour of attributes. Commit with the main changes: 937cc49

thesps merged commit 6f6e3b2 into fastmachinelearning:master on May 10, 2022
calad0i pushed a commit referencing this pull request to calad0i/hls4ml on Jul 1, 2023: Vivado-equivalent implementation of Softmax on Quartus