Improve the hash join performance by replacing the RawTable to a simple Vec for JoinHashMap #6910

yahoNanJing · 2023-07-11T07:33:20Z

Is your feature request related to a problem or challenge?

When testing the TPCH q17 on my PC, based on #6800, it costs around 2.4s. Among them, it costs around 800ms for constructing the RawTable of JoinHashMap.

To only find the hash location, the structure, RawTable, we used is too complex and the overhead is too heavy.

Describe the solution you'd like

We can simple use a fixed size of vector with a hash mask to look up the hash location.

After applying the replacement, it only cost less than 100ms for the construction of JoinHashMap. And the overall time cost for the q17 is reduced from 2.4s to 1.8s.

Describe alternatives you've considered

No response

Additional context

No response

The text was updated successfully, but these errors were encountered:

Dandandan · 2023-07-11T07:47:50Z

That's interesting, wonder if you're not getting regressions on other queries (as probing might be slower?) 🤔 At least that's what I got in some earlier experiments.

Q17 is also interesting because it fails to plan/estimate the join correctly, it puts the largest side on the build side.

Dandandan · 2024-08-23T17:58:34Z

I think it makes sense to first work on making equality faster, then testing this approach again #12131

Dandandan · 2024-08-23T18:14:59Z

I am testing this currently to see how this approach is running.

It seems #6724 and #6679 improved this already for quite a bit (maybe not enough)

Dandandan · 2024-08-23T18:22:52Z

Current results (closer to mixed results, but more regressions than wins)

--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ vectorized_hashtable_join ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  74.26ms │                   74.10ms │     no change │
│ QQuery 2     │  15.71ms │                   15.78ms │     no change │
│ QQuery 3     │  25.31ms │                   28.83ms │  1.14x slower │
│ QQuery 4     │  20.70ms │                   26.90ms │  1.30x slower │
│ QQuery 5     │  37.37ms │                   44.74ms │  1.20x slower │
│ QQuery 6     │   5.46ms │                    6.41ms │  1.17x slower │
│ QQuery 7     │  78.58ms │                   93.71ms │  1.19x slower │
│ QQuery 8     │  19.23ms │                   15.31ms │ +1.26x faster │
│ QQuery 9     │  46.49ms │                   50.21ms │  1.08x slower │
│ QQuery 10    │  42.36ms │                   43.06ms │     no change │
│ QQuery 11    │  30.29ms │                   32.25ms │  1.06x slower │
│ QQuery 12    │  23.48ms │                   22.19ms │ +1.06x faster │
│ QQuery 13    │  28.99ms │                   29.42ms │     no change │
│ QQuery 14    │   7.79ms │                    8.03ms │     no change │
│ QQuery 15    │  13.77ms │                   12.72ms │ +1.08x faster │
│ QQuery 16    │  17.16ms │                   16.70ms │     no change │
│ QQuery 17    │  61.71ms │                   54.08ms │ +1.14x faster │
│ QQuery 18    │ 138.43ms │                  186.12ms │  1.34x slower │
│ QQuery 19    │  22.48ms │                   22.02ms │     no change │
│ QQuery 20    │  29.27ms │                   28.51ms │     no change │
│ QQuery 21    │  93.84ms │                  107.18ms │  1.14x slower │
│ QQuery 22    │   9.03ms │                    9.97ms │  1.10x slower │
└──────────────┴──────────┴───────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                        ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (main)                        │ 841.72ms │
│ Total Time (vectorized_hashtable_join)   │ 928.25ms │
│ Average Time (main)                      │  38.26ms │
│ Average Time (vectorized_hashtable_join) │  42.19ms │
│ Queries Faster                           │        4 │
│ Queries Slower                           │       10 │
│ Queries with No Change                   │        8 │
└──────────────────────────────────────────┴──────────┘
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ vectorized_hashtable_join ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 141.17ms │                  140.95ms │     no change │
│ QQuery 2     │  64.63ms │                   65.74ms │     no change │
│ QQuery 3     │  64.88ms │                   71.23ms │  1.10x slower │
│ QQuery 4     │  48.98ms │                   49.53ms │     no change │
│ QQuery 5     │  75.78ms │                   81.51ms │  1.08x slower │
│ QQuery 6     │  30.14ms │                   30.58ms │     no change │
│ QQuery 7     │  98.22ms │                  110.42ms │  1.12x slower │
│ QQuery 8     │  87.66ms │                   83.27ms │ +1.05x faster │
│ QQuery 9     │ 132.67ms │                  134.53ms │     no change │
│ QQuery 10    │ 120.88ms │                  120.26ms │     no change │
│ QQuery 11    │  43.79ms │                   49.48ms │  1.13x slower │
│ QQuery 12    │  82.68ms │                   80.17ms │     no change │
│ QQuery 13    │ 185.70ms │                  185.21ms │     no change │
│ QQuery 14    │  46.67ms │                   46.54ms │     no change │
│ QQuery 15    │  59.51ms │                   59.43ms │     no change │
│ QQuery 16    │  45.77ms │                   49.33ms │  1.08x slower │
│ QQuery 17    │ 122.20ms │                  111.54ms │ +1.10x faster │
│ QQuery 18    │ 160.71ms │                  171.14ms │  1.06x slower │
│ QQuery 19    │  88.44ms │                   87.32ms │     no change │
│ QQuery 20    │  71.72ms │                   72.27ms │     no change │
│ QQuery 21    │ 124.59ms │                  138.07ms │  1.11x slower │
│ QQuery 22    │  36.48ms │                   37.85ms │     no change │
└──────────────┴──────────┴───────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                        ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)                        │ 1933.26ms │
│ Total Time (vectorized_hashtable_join)   │ 1976.38ms │
│ Average Time (main)                      │   87.88ms │
│ Average Time (vectorized_hashtable_join) │   89.84ms │
│ Queries Faster                           │         2 │
│ Queries Slower                           │         7 │
│ Queries with No Change                   │        13 │
└──────────────────────────────────────────┴───────────┘

Dandandan · 2024-08-24T06:05:35Z

I am going to experiment with a slight variation on this this weekend to reduce nr of collisions greatly while still using Vec-based indexing.

yahoNanJing added the enhancement New feature or request label Jul 11, 2023

yahoNanJing mentioned this issue Jul 11, 2023

Improve the hash join performance by replacing the RawTable to a simple Vec for JoinHashMap #6913

Closed

yahoNanJing closed this as completed Jul 12, 2023

Dandandan reopened this Aug 23, 2024

Dandandan added the performance Make DataFusion faster label Aug 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve the hash join performance by replacing the RawTable to a simple Vec for JoinHashMap #6910

Improve the hash join performance by replacing the RawTable to a simple Vec for JoinHashMap #6910

yahoNanJing commented Jul 11, 2023

Dandandan commented Jul 11, 2023 •

edited

Loading

Dandandan commented Aug 23, 2024

Dandandan commented Aug 23, 2024

Dandandan commented Aug 23, 2024 •

edited

Loading

Dandandan commented Aug 24, 2024

Improve the hash join performance by replacing the RawTable to a simple Vec for JoinHashMap #6910

Improve the hash join performance by replacing the RawTable to a simple Vec for JoinHashMap #6910

Comments

yahoNanJing commented Jul 11, 2023

Is your feature request related to a problem or challenge?

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Dandandan commented Jul 11, 2023 • edited Loading

Dandandan commented Aug 23, 2024

Dandandan commented Aug 23, 2024

Dandandan commented Aug 23, 2024 • edited Loading

Dandandan commented Aug 24, 2024

Dandandan commented Jul 11, 2023 •

edited

Loading

Dandandan commented Aug 23, 2024 •

edited

Loading