RFC: Demonstrate new GroupHashAggregate stream approach (runs more than 2x faster!) #6800

Closed
wants to merge 98 commits into from

Commits on Jul 2, 2023

  1. 9b22745
  2. complete accumulator

    alamb committed Jul 2, 2023
    4ce6671
  3. touchups

    alamb committed Jul 2, 2023
    5694190
  4. Add comments

    alamb committed Jul 2, 2023
    a58b006
  5. 73cb33f
  6. factor out accumulate

    alamb committed Jul 2, 2023
    0b5d74f
  7. c30874d
  8. 2370220
  9. 26570f9
  10. update more comments

    alamb committed Jul 2, 2023
    bed990e
  11. 25787a0
  12. more tets

    alamb committed Jul 2, 2023
    8433d6f
  13. more tests

    alamb committed Jul 2, 2023
    7e9b92e
  14. comments

    alamb committed Jul 2, 2023
    bb37e77
  15. Implement fuzz testing

    alamb committed Jul 2, 2023
    add7b36
  16. 53aa18b
  17. Zero copy into array

    alamb committed Jul 2, 2023
    00aac24
  18. fix spelling of indices

    alamb committed Jul 2, 2023
    d760a5f
  19. 8811fa6
  20. Implement filtering

    alamb committed Jul 2, 2023
    93a4e6f
  21. Add null handling in avg

    alamb committed Jul 2, 2023
    966d3d0

Commits on Jul 3, 2023

  1. WIP count

    Daniël Heres committed Jul 3, 2023
    316c781
  2. WIP count

    Daniël Heres committed Jul 3, 2023
    754a9ff
  3. e708723
  4. More new adapter interface

    alamb committed Jul 3, 2023
    677160e
  5. WIP sum

    Daniël Heres committed Jul 3, 2023
    689e51b
  6. WIP sum

    Daniël Heres committed Jul 3, 2023
    7b20155
  7. Use Rows API

    Daniël Heres committed Jul 3, 2023
    6275a9f
  8. Update adapter

    alamb committed Jul 3, 2023
    8902c91
  9. Add docs, refactor

    alamb committed Jul 3, 2023
    6cab205
  10. Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus…

    …ion into alamb/hash_agg_spike
    alamb committed Jul 3, 2023
    587dc0e
  11. 7683350
  12. Merge

    Daniël Heres committed Jul 3, 2023
    52c62ec
  13. WIP count

    Daniël Heres committed Jul 3, 2023
    1684916
  14. WIP count

    Daniël Heres committed Jul 3, 2023
    a94c346
  15. WIP count

    Daniël Heres committed Jul 3, 2023
    1ba625a
  16. WIP count

    Daniël Heres committed Jul 3, 2023
    c2f955d

Commits on Jul 4, 2023

  1. Support sum

    Daniël Heres committed Jul 4, 2023
    9ff91cb
  2. Complete adapter

    alamb committed Jul 4, 2023
    180903b
  3. Instantiate all types

    alamb committed Jul 4, 2023
    5d8bb35
  4. Implement memory accounting

    alamb committed Jul 4, 2023
    51b0243
  5. cleanup memory accounting

    alamb committed Jul 4, 2023
    68f62d1

Commits on Jul 5, 2023

  1. ad6d4f3
  2. Add float support for sum

    Daniël Heres committed Jul 5, 2023
    87b54c9
  3. eb919a9
  4. 917c050
  5. Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus…

    …ion into alamb/hash_agg_spike
    alamb committed Jul 5, 2023
    9eb6822
  6. fix fmt

    alamb committed Jul 5, 2023
    c041ecc
  7. Fix clippy

    alamb committed Jul 5, 2023
    f973a65
  8. Fix docs

    alamb committed Jul 5, 2023
    24abb14
  9. Min/Max for primitives

    Daniël Heres committed Jul 5, 2023
    6e740a4
  10. Min/Max for primitives

    Daniël Heres committed Jul 5, 2023
    9d2c7bf
  11. Min/Max initialization

    Daniël Heres committed Jul 5, 2023
    ecc980d
  12. Min/Max initialization

    Daniël Heres committed Jul 5, 2023
    fede032
  13. Initial min/max support for primitive

    Daniël Heres committed Jul 5, 2023
    5076245
  14. Refactor

    Daniël Heres committed Jul 5, 2023
    8de4ada
  15. Clippy

    Daniël Heres committed Jul 5, 2023
    09b9329
  16. Clippy

    Daniël Heres committed Jul 5, 2023
    ea0ce25
  17. Cleanup

    Daniël Heres committed Jul 5, 2023
    be8a1e2
  18. Fmt

    Daniël Heres committed Jul 5, 2023
    890b517
  19. ffd5cbe
  20. Speed up avg

    Daniël Heres committed Jul 5, 2023
    6846970
  21. Fmt

    Daniël Heres committed Jul 5, 2023
    2f4907a

Commits on Jul 6, 2023

  1. Add clickbench queries to sqllogictest coverage (apache#6836)

    * Add clickbench queries to sqllogictest coverage
    
    * rowsort
    
    * Update datafusion/core/tests/sqllogictests/test_files/clickbench.slt
    
    Co-authored-by: Daniël Heres <danielheres@gmail.com>
    
    * fix typo -- 🤦
    
    * Update queries now that they pass
    
    ---------
    
    Co-authored-by: Daniël Heres <danielheres@gmail.com>
    alamb and Dandandan committed Jul 6, 2023
    7ecf148
  2. feat: implement posgres style encode/decode (apache#6821)

    * feat: add encode, decode functions
    
    * add test
    
    * add licenses
    
    * fix return types
    
    * delete files
    
    * toml fmt
    
    * add logical expr
    
    * fix NULL case, add test for NULL and empty
    
    * add sqllogic tests
    
    * update error msgs, run cargo update in cli dir
    
    * update sqllogictest
    
    * add more tests
    ozgrakkurt authored and alamb committed Jul 6, 2023
    9adcf97
  3. chore(deps): update rstest requirement from 0.17.0 to 0.18.0 (apache#…

    …6847)
    
    Updates the requirements on [rstest](https://github.com/la10736/rstest) to permit the latest version.
    - [Release notes](https://github.com/la10736/rstest/releases)
    - [Changelog](https://github.com/la10736/rstest/blob/master/CHANGELOG.md)
    - [Commits](la10736/rstest@0.17.0...v0.18.0)
    
    ---
    updated-dependencies:
    - dependency-name: rstest
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored and alamb committed Jul 6, 2023
    4aa1656
  4. [minior] support serde for some function (apache#6846)

    * fill some scalar function for serde
    
    * fix fmt
    liukun4515 authored and alamb committed Jul 6, 2023
    c02d4e4
  5. Support fixed_size_list for make_array (apache#6759)

    * support make_array for fixed_size_list
    
    Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
    
    * add arrow-typeof in test
    
    Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
    
    * fix schema mismatch
    
    Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
    
    * cleanup code
    
    Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
    
    * create array data with correct len
    
    Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
    
    ---------
    
    Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
    jayzhan211 authored and alamb committed Jul 6, 2023
    e044b5c
  6. Improve median performance. (apache#6837)

    * Improve median performance.
    
    * Fix formatting.
    
    * Review feedback
    
    * Renamed arrays size.
    vincev authored and alamb committed Jul 6, 2023
    e8d5c17
  7. Mismatch in MemTable of Select Into when projecting on aggregate wind…

    …ow functions (apache#6566)
    
    * Schema check of partitions and input plan is removed for newly registered tables.
    
    * minor changes
    
    * In Select Into queries, aggregate windows are realiased with physical_name()
    
    * debugging
    
    * display_name() output is simplified for window functions
    
    * Windows are displayed in long format
    
    * Window names in tests are edited
    
    * Create table as test is added
    
    ---------
    
    Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai>
    2 people authored and alamb committed Jul 6, 2023
    cf72ea0
  8. feat: column support for array_append, array_prepend, `array_posi…

    …tion` and `array_positions` (apache#6805)
    
    * test: sqllogictests with columns for array_append, array_prepend, array_position and array_positions
    
    * feat: column support for array_append and array_prepend
    
    * feat: column support for array_position and array_positions
    
    * fix: error type
    izveigor authored and alamb committed Jul 6, 2023
    aab9103
  9. MINOR: Fix ordering of the aggregate_source_with_order table (apache#…

    …6852)
    
    * Fix ordering of the source
    
    * rename file path more decriptive
    mustafasrepo authored and alamb committed Jul 6, 2023
    0fb5de7
  10. 5705b3a
  11. Deprecate ScalarValue::and, ScalarValue::or (apache#6842) (apache#6844)

    * Deprecate ScalarValue::and, ScalarValue::or (apache#6842)
    
    * Review feedback
    tustvold authored and alamb committed Jul 6, 2023
    e324e9f
  12. chore(deps): update bigdecimal requirement from 0.3.0 to 0.4.0 (apach…

    …e#6848)
    
    * chore(deps): update bigdecimal requirement from 0.3.0 to 0.4.0
    
    Updates the requirements on [bigdecimal](https://github.com/akubera/bigdecimal-rs) to permit the latest version.
    - [Commits](https://github.com/akubera/bigdecimal-rs/commits)
    
    ---
    updated-dependencies:
    - dependency-name: bigdecimal
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    
    * Update tests for decimal rounding
    
    * Update datafusion-cli
    
    ---------
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
    dependabot[bot] and alamb committed Jul 6, 2023
    dec1b97
  13. 49fc6c1
  14. Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus…

    …ion into alamb/hash_agg_spike
    alamb committed Jul 6, 2023
    b326b68
  15. fix doc comments

    alamb committed Jul 6, 2023
    b137df6
  16. add ticket referece

    alamb committed Jul 6, 2023
    1d3185c

Commits on Jul 7, 2023

  1. d9cca24
  2. Improve aggregate_fuzz output

    alamb committed Jul 7, 2023
    c68c39b
  3. 0127917
  4. Fix and simplify min/max

    alamb committed Jul 7, 2023
    e36a972
  5. a96c3a0
  6. Improve memory accounting

    alamb committed Jul 7, 2023
    b6bde8d
  7. feat: Add graphviz display format for execution plan. (apache#6726)

    * Implement graphviz format for execution plan
    
    * Update cargo.lock
    
    * fix ci
    
    * fix test
    
    * Fix comment
    
    * Resolve conflicts with main
    liurenjie1024 authored and alamb committed Jul 7, 2023
    cb5b8cb
  8. 07f8d77
  9. Implement groups accumulators for bit operations

    simplify
    alamb committed Jul 7, 2023
    4dcac2a
  10. Almost there

    alamb committed Jul 7, 2023
    5d6f815
  11. it compiles

    alamb committed Jul 7, 2023
    60ee2ef

Commits on Jul 8, 2023

  1. Reuse hashes buffer

    Daniël Heres committed Jul 8, 2023
    f2fc450
  2. b781910
  3. Fix doc

    alamb committed Jul 8, 2023
    aebe77f
  4. 7c17638
  5. Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus…

    …ion into alamb/hash_agg_spike
    alamb committed Jul 8, 2023
    0a5a749
  6. clippy

    alamb committed Jul 8, 2023
    f684ae8
  7. e798074
  8. afcab34