fix: Add lowering pass to remove output repacking in convert_method_to_trt_engine calls #1945

Merged 1 commit on May 31, 2023

Commits on May 31, 2023

  1. fix: Add lowering pass to remove output repacking

    - Automatically remove output repacking for
    `convert_method_to_trt_engine` calls, to improve parity between models
    which can be converted directly to TRT engines and models which can be
    fully compiled
    - Add a new internal `CompileSpec` argument for lowering which indicates
    whether the lowering passes originate from a
    `convert_method_to_trt_engine` call or a regular `compile` call; this
    determines whether the lowering pass is applied
    - Regular TorchScript graphs cannot have this pass applied, as it can
    break the output graph: newer versions of Torch disallow graph
    outputs with 0 or 2+ values which are not packed in a struct
    - The current lowering pass detects outputs which are flat Lists or Tuples
    of Tensors and returns the elements as-is (direct from the TRT engine),
    so the entire model can be converted to a single TRT engine
    gs-olive committed May 31, 2023
    62371d9
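
The behavior described in the commit message can be illustrated with a small, self-contained Python sketch. It is not the Torch-TensorRT implementation (the real pass operates on TorchScript IR inside the library) and it does not use the `torch` API. `Value`, `Node`, `Graph`, `remove_output_repacking`, `lower`, and the `converting_to_trt_engine` flag are all hypothetical names invented for this illustration; only the node kinds `prim::TupleConstruct` and `prim::ListConstruct` correspond to real TorchScript operators. The sketch assumes a graph with one packed output and shows the two ideas from the commit: the pass only fires for flat Lists or Tuples of Tensors, and a flag decides whether it runs at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Toy stand-ins for TorchScript IR. These are hypothetical classes,
# NOT the torch::jit or Torch-TensorRT API.


@dataclass(eq=False)
class Value:
    name: str
    type: str  # e.g. "Tensor", "Tuple", "List"
    producer: Optional["Node"] = field(default=None, repr=False)


@dataclass(eq=False)
class Node:
    kind: str  # e.g. "prim::TupleConstruct"
    inputs: List[Value]
    output: Value


@dataclass
class Graph:
    nodes: List[Node] = field(default_factory=list)
    outputs: List[Value] = field(default_factory=list)

    def add_node(self, kind: str, inputs: List[Value],
                 out_name: str, out_type: str) -> Value:
        """Append a node and return the value it produces."""
        out = Value(out_name, out_type)
        node = Node(kind, list(inputs), out)
        out.producer = node
        self.nodes.append(node)
        return out


PACKING_KINDS = {"prim::TupleConstruct", "prim::ListConstruct"}


def remove_output_repacking(graph: Graph) -> bool:
    """Replace a single flat List/Tuple-of-Tensors output with its elements.

    Returns True if the graph was modified.
    """
    if len(graph.outputs) != 1:
        return False
    packed = graph.outputs[0]
    node = packed.producer
    if node is None or node.kind not in PACKING_KINDS:
        return False
    # Only flat collections of Tensors qualify; nested structures are left alone.
    if not node.inputs or any(v.type != "Tensor" for v in node.inputs):
        return False
    # Leave the graph untouched if another node still consumes the packed value.
    if any(packed in n.inputs for n in graph.nodes):
        return False
    graph.outputs = list(node.inputs)
    graph.nodes.remove(node)
    return True


def lower(graph: Graph, converting_to_trt_engine: bool) -> Graph:
    """Run the unpacking pass only for engine conversion.

    The flag is a hypothetical stand-in for the internal CompileSpec argument.
    """
    if converting_to_trt_engine:
        remove_output_repacking(graph)
    return graph


def build_example() -> Graph:
    """Build a graph that returns the tuple (relu(x), sigmoid(x))."""
    g = Graph()
    x = Value("x", "Tensor")
    a = g.add_node("aten::relu", [x], "a", "Tensor")
    b = g.add_node("aten::sigmoid", [x], "b", "Tensor")
    packed = g.add_node("prim::TupleConstruct", [a, b], "out", "Tuple")
    g.outputs = [packed]
    return g


engine_graph = lower(build_example(), converting_to_trt_engine=True)
print([v.name for v in engine_graph.outputs])    # ['a', 'b']

compiled_graph = lower(build_example(), converting_to_trt_engine=False)
print([v.name for v in compiled_graph.outputs])  # ['out']
```

In the engine-conversion case the tuple-construction node is dropped and the two Tensors become the graph outputs directly. In the regular compile case the packed output is preserved, matching the commit's note that TorchScript graphs must keep multiple outputs packed in a struct.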