feat: Client-side input shape/element validation #742

yinggeh · 2024-07-09T01:22:39Z

What does the PR do?

Adds a client-side input size check to ensure that the byte size implied by the input shape matches the byte size of the input data.
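As a rough illustration of the check described above (not the code in this PR), the sketch below compares the byte size implied by a shape and datatype with the length of the data buffer. The helper names `expected_byte_size` and `check_byte_size` and the small datatype table are assumptions made for this example; variable-size `BYTES` inputs are skipped, in line with the caveat about string inputs.

```python
import math

# Byte widths for a few fixed-size Triton datatypes (illustrative subset).
FIXED_DTYPE_SIZES = {
    "BOOL": 1,
    "UINT8": 1,
    "INT32": 4,
    "INT64": 8,
    "FP16": 2,
    "FP32": 4,
    "FP64": 8,
}


def expected_byte_size(shape, datatype):
    """Return the byte size implied by shape and datatype, or None for
    datatypes without a fixed element size (for example BYTES)."""
    element_size = FIXED_DTYPE_SIZES.get(datatype)
    if element_size is None:
        return None
    return math.prod(shape) * element_size


def check_byte_size(name, shape, datatype, data):
    """Raise ValueError when the buffer length disagrees with the shape."""
    expected = expected_byte_size(shape, datatype)
    if expected is None:
        return  # hypothetical choice: skip variable-size inputs
    if len(data) != expected:
        raise ValueError(
            "input '{}' got unexpected byte size {}, expected {}".format(
                name, len(data), expected
            )
        )


# A [2, 8] FP32 tensor needs 2 * 8 * 4 = 64 bytes.
check_byte_size("INPUT0", [2, 8], "FP32", bytes(64))
```

Failing fast on the client avoids a round trip to the server for a request that cannot succeed.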

Checklist

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

feat

Related PRs:

triton-inference-server/server#7427

Where should the reviewer start?

src/c++/library/common.cc
src/python/library/tritonclient/grpc/_infer_input.py
src/python/library/tritonclient/http/_infer_input.py

Test plan:

n/a

CI Pipeline ID:
17202351

Caveats:

Shared memory byte size checks for string inputs are not implemented.

Background

Stop malformed input requests on the client side before they are sent to the server.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Relates to triton-inference-server/server#7171


rmccorm4 · 2024-07-25T03:29:08Z

src/c++/library/CMakeLists.txt

@@ -275,6 +277,10 @@ if(TRITON_ENABLE_CC_HTTP OR TRITON_ENABLE_PERF_ANALYZER)
      http-client-library EXCLUDE_FROM_ALL OBJECT
      ${REQUEST_SRCS} ${REQUEST_HDRS}
  )
+  add_dependencies(
+    http-client-library
+    proto-library


Isn't proto-library used for protobuf<->grpc? Why is it needed for HTTP client?

edit: guessing the requirement is here.

Are there any concerns with introducing the new protobuf dependency to the HTTP client, or any alternatives? CC @GuanLuo @tanmayv25


yinggeh · 2024-07-26T22:01:46Z

There is a known issue with TensorRT (Jira DLIS-6805) that causes TRT tests to fail again at the client (CI job 102924904). There is no way to know the platform of the inference model on the client side. Should we wait until @pskiran1 finishes his change first?
CC @tanmayv25 @GuanLuo @rmccorm4

pskiran1 · 2024-07-27T16:11:42Z

> There is a known issue with TensorRT (Jira DLIS-6805) that causes TRT tests to fail again at the client (CI job 102924904). There is no way to know the platform of the inference model on the client side. Should we wait until @pskiran1 finishes his change first? CC @tanmayv25 @GuanLuo @rmccorm4

@yinggeh, I just merged DLIS-6805 changes, could you please try with the latest code?


rmccorm4 · 2024-07-29T16:12:54Z

src/c++/library/common.cc

@@ -232,6 +236,26 @@ InferInput::SetBinaryData(const bool binary_data)
  return Error::Success;
 }

+Error
+InferInput::ValidateData() const


Moving TRT reformat conversation to a thread 🧵

Yingge:

> There is a known issue with TensorRT (Jira DLIS-6805) that causes TRT tests to fail again at the client (CI job 102924904). There is no way to know the platform of the inference model on the client side. Should we wait until @pskiran1 finishes his change first?
> CC @tanmayv25 @GuanLuo @rmccorm4

Sai:

> @yinggeh, I just merged DLIS-6805 changes, could you please try with the latest code?

> I just merged DLIS-6805 changes, could you please try with the latest code?

Sai's changes will allow the check to work on core side, but probably not on client side, right? @yinggeh

> There is no way to know the platform of inference model at the client side.

You can query the platform/backend through the model config APIs on client side, which would work when inferring on a TensorRT model directly. You can probably even query is_non_linear_format_io from the model config if needed.

For an ensemble model containing one of these TRT models with non-linear inputs, you may need to follow the ensemble definition to find out if it's calling a TRT model with its inputs, which can be a pain. It may be simpler to skip the check on ensemble models and let the core check handle it (but it feels like we're starting to introduce a lot of special checks and cases with this feature).

For a BLS model, I think it's fine and will work as any other python model, then it will trigger the core check internally if the BLS is calling the TRT model.

CC @tanmayv25
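To make the model-config idea above concrete, here is a minimal sketch that decides whether a client-side check is safe, given a model config already fetched as a dictionary (for example, the JSON that the HTTP client's `get_model_config` call is understood to return). The function name `client_check_decision`, the exact dictionary layout, and the policy of skipping ensembles are assumptions for illustration, not behaviour implemented in this PR.

```python
def client_check_decision(model_config):
    """Return "skip" or "check" for the client-side size validation.

    `model_config` is assumed to be a plain dict with a "platform" string
    and an "input" list whose entries may carry "is_non_linear_format_io".
    """
    # Ensembles may route inputs to a TRT model with non-linear IO, so this
    # sketch defers to the core-side check instead of following the steps.
    if model_config.get("platform", "") == "ensemble":
        return "skip"

    # A directly served model declares the non-linear format per input.
    for tensor in model_config.get("input", []):
        if tensor.get("is_non_linear_format_io", False):
            return "skip"

    return "check"


trt_config = {
    "platform": "tensorrt_plan",
    "input": [{"name": "INPUT0", "is_non_linear_format_io": True}],
}
print(client_check_decision(trt_config))
```

The cost of this approach is an extra metadata request per model, which a client could cache.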

Another alternative is to introduce a new flag in client Input/Output tensors to skip the byte size check on the client side.
We can document when we expect the user to provide this option (using non-linear format).
This way the user can be aware of what they are doing.

Pros:

1. Generic API change allows for all the flexibility.
2. Powerful expression for the client-side code.

Cons:

1. Adding a flag to skip these checks seems to be counter-intuitive and makes us question even the requirement of such checks in the first place.
   a. This can be alleviated by an additional check to some degree by validating the skip_byte_size check flag is set for the correct scenario.
2. Breaks backwards compatibility, as the user now has to set a new flag to use models with non-linear tensors.

> For an ensemble model containing one of these TRT models with non-linear inputs, you may need to follow the ensemble definition to find out if it's calling a TRT model with its inputs, which can be a pain.

@rmccorm4 Can you elaborate on this?

> Can you elaborate on this?

If you have an ensemble with ENSEMBLE_INPUT0 where the first step is a TRT model with non-linear IO INPUT0 and a mapping of ENSEMBLE_INPUT0 -> INPUT0, do we require an ensemble config to mention that the ENSEMBLE_INPUT0 is non-linear IO too? Or is it inferred internally?
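For readers unfamiliar with what "following the ensemble definition" would involve, the sketch below traces which ensemble-level inputs are wired to a given input of a composing model. It assumes the ensemble config is available as a dictionary in which each `ensemble_scheduling` step has an `input_map` keyed by the composing model's input name; the function name `ensemble_inputs_feeding` and the model name `trt_model` are hypothetical.

```python
def ensemble_inputs_feeding(ensemble_config, model_name, model_input):
    """Return the ensemble input names mapped to `model_input` of `model_name`."""
    ensemble_inputs = {t["name"] for t in ensemble_config.get("input", [])}
    steps = ensemble_config.get("ensemble_scheduling", {}).get("step", [])

    matches = []
    for step in steps:
        if step.get("model_name") != model_name:
            continue
        # input_map: composing-model input name -> ensemble tensor name
        tensor = step.get("input_map", {}).get(model_input)
        # Only report tensors supplied by the caller, not intermediate ones.
        if tensor in ensemble_inputs:
            matches.append(tensor)
    return matches


ensemble = {
    "platform": "ensemble",
    "input": [{"name": "ENSEMBLE_INPUT0"}],
    "ensemble_scheduling": {
        "step": [
            {
                "model_name": "trt_model",
                "input_map": {"INPUT0": "ENSEMBLE_INPUT0"},
                "output_map": {"OUTPUT0": "ENSEMBLE_OUTPUT0"},
            }
        ]
    },
}
print(ensemble_inputs_feeding(ensemble, "trt_model", "INPUT0"))
```

A client would still need the composing model's own config to learn whether `INPUT0` uses a non-linear format, which is part of why this route was described as painful.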

> Adding a flag to skip these checks seems to be counter-intuitive and makes us question even the requirement of such checks in the first place

+1 that I think this is counter-intuitive to the goal

> This can be alleviated by an additional check to some degree by validating the skip_byte_size check flag is set for the correct scenario.

If we are able to internally determine "the correct scenario" programmatically, isn't this the same as being able to skip internally without user specification?


src/python/library/tritonclient/grpc/_infer_input.py

tanmayv25 · 2024-08-05T23:01:16Z

src/python/library/tritonclient/grpc/_infer_input.py

+        cnt += self._raw_content != None
+        cnt += "shared_memory_region" in self._input.parameters
+        if cnt != 1:
+            return


Shouldn't we return an error when more than one field is specified in the inputs?

This error was handled by the server.
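For comparison, the stricter behaviour the question asks about could look like the sketch below, which raises on the client when more than one data source is populated instead of returning silently. The function names and the `contents` argument are hypothetical stand-ins; only the raw-content and `shared_memory_region` checks mirror the diff above.

```python
def count_data_sources(contents, raw_content, parameters):
    """Count how many mutually exclusive data sources are populated."""
    count = 0
    count += bool(contents)  # hypothetical stand-in for typed contents
    count += raw_content is not None
    count += "shared_memory_region" in parameters
    return count


def validate_single_source(name, contents, raw_content, parameters):
    """Return True if exactly one source is set and the size check may run.

    Raises ValueError when several sources are set at once; returns False
    when no data has been attached yet.
    """
    count = count_data_sources(contents, raw_content, parameters)
    if count > 1:
        raise ValueError(
            "input '{}' has {} data sources set, expected exactly one".format(
                name, count
            )
        )
    return count == 1


print(validate_single_source("INPUT0", None, b"\x00" * 4, {}))
```

Whether to raise here or rely on the server's existing error is the design question under discussion in this thread.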

tanmayv25 · 2024-08-05T23:05:52Z

src/python/library/tritonclient/grpc/_infer_input.py

+        data_num_elements = num_elements(self._data_shape)
+        if expected_num_elements != data_num_elements:
+            raise_error(
+                "input '{}' got unexpected elements count {}, expected {}".format(


Can you also include the respective shapes in the error message?

something like:

input 'XYZ' got unexpected elements count 8 (shape: 8,1), expected 16 (shape: 16,1)

I think you are trying to keep supporting the case where a user might just want to call set_shape() with a different shape but same underlying data.

Are shapes (8,2), (4,4) also valid?
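A small sketch of the suggested message format, assuming the check compares only element counts: under that assumption shapes such as (8,2) and (4,4) both pass, since each holds 16 elements. The names `check_num_elements` and `format_shape` are made up for this example and are not taken from the PR.

```python
import math


def format_shape(shape):
    """Render a shape as comma-separated dims, e.g. [8, 1] -> "8,1"."""
    return ",".join(str(dim) for dim in shape)


def check_num_elements(name, declared_shape, data_shape):
    """Raise ValueError if the data holds a different number of elements
    than the declared shape; the shapes themselves may differ."""
    expected = math.prod(declared_shape)
    actual = math.prod(data_shape)
    if expected != actual:
        raise ValueError(
            "input '{}' got unexpected elements count {} (shape: {}), "
            "expected {} (shape: {})".format(
                name,
                actual,
                format_shape(data_shape),
                expected,
                format_shape(declared_shape),
            )
        )


# Same element count, different shapes: accepted by a count-only check.
check_num_elements("XYZ", [8, 2], [4, 4])
```

Calling `check_num_elements("XYZ", [16, 1], [8, 1])` raises with the message shown in the review comment above.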

yinggeh added 2 commits July 5, 2024 05:37

Add client checks

3178f99

Add C++ client tests

7210d00

yinggeh added the enhancement label Jul 9, 2024

yinggeh requested review from tanmayv25, rmccorm4 and GuanLuo July 9, 2024 01:22

yinggeh self-assigned this Jul 9, 2024

yinggeh mentioned this pull request Jul 9, 2024

test: Client-side input shape/element validation triton-inference-server/server#7427

Draft

11 tasks

Update copyrights

9c2941b

yinggeh changed the title ~~feat: Client Input Byte Size Checks~~ feat: Client input byte size checks Jul 9, 2024

rmccorm4 reviewed Jul 9, 2024

src/c++/library/common.cc (outdated, resolved)

yinggeh added 3 commits July 9, 2024 15:02

Update error msg and build deps

b4c6a17

Update error msg

e5e6b7e

Remove client checks for string inputs

07059a6

yinggeh mentioned this pull request Jul 23, 2024

refactor: Refactor core input size checks triton-inference-server/core#382

Merged

11 tasks

yinggeh requested a review from rmccorm4 July 23, 2024 03:22

rmccorm4 reviewed Jul 25, 2024

src/c++/library/common.cc (outdated, resolved)

rmccorm4 reviewed Jul 25, 2024

src/python/library/tritonclient/http/_infer_input.py (outdated, resolved)

Merge branch 'main' of https://github.com/triton-inference-server/client into yinggeh-DLIS-6657-client-input-byte-size-check

8b699c0

rmccorm4 reviewed Jul 29, 2024


yinggeh added 2 commits July 30, 2024 19:05

Merge branch 'main' of https://github.com/triton-inference-server/client into yinggeh-DLIS-6657-client-input-byte-size-check

60f3f52

Undo C++ client checks and tests

2a5c507

yinggeh force-pushed the yinggeh-DLIS-6657-client-input-byte-size-check branch from 82b8d57 to 2a5c507 Compare July 31, 2024 02:23

Update src/python/library/tritonclient/http/_infer_input.py

6b56c3b

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

yinggeh requested a review from rmccorm4 July 31, 2024 02:25

yinggeh added 2 commits August 1, 2024 14:20

Merge branch 'main' of https://github.com/triton-inference-server/client into yinggeh-DLIS-6657-client-input-byte-size-check

223b9d8

Merge branch 'yinggeh-DLIS-6657-client-input-byte-size-check' of https://github.com/triton-inference-server/client into yinggeh-DLIS-6657-client-input-byte-size-check

a9a2c1c

Workaround with L0_trt_reformat_free by removing shm checks

a584741

yinggeh force-pushed the yinggeh-DLIS-6657-client-input-byte-size-check branch from 2b34379 to a584741 Compare August 2, 2024 21:48

github-advanced-security bot found potential problems Aug 2, 2024

src/python/library/tritonclient/grpc/_infer_input.py (fixed)

Remove unused function

5889b8e

rmccorm4 reviewed Aug 5, 2024

src/python/library/tritonclient/grpc/_infer_input.py (resolved)

rmccorm4 changed the title ~~feat: Client input byte size checks~~ feat: Client-side input shape/element validation Aug 5, 2024

tanmayv25 reviewed Aug 5, 2024


yinggeh requested review from tanmayv25 and rmccorm4 August 6, 2024 17:25

This was referenced Aug 7, 2024

feat: Report histogram metrics to Triton metrics server triton-inference-server/vllm_backend#56

Merged

feat: Add histogram metric type triton-inference-server/python_backend#374

Merged

feat: Add histogram metric type triton-inference-server/core#386

Merged

yinggeh marked this pull request as draft September 18, 2024 18:38
