System Info
Platform: Linux-5.15.0-52-generic-x86_64-with-glibc2.35
Python version: 3.10.12
PyTorch version (GPU?): 2.4.0+cu121 (True)
[TensorRT-LLM] TensorRT-LLM version: 0.12.0
Driver Version: 535.161.08
CUDA Version: 12.5
GPU: A40 single card
Who can help?
@byshiue
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)

Reproduction
Follow the TensorRT-LLM Linux installation tutorial to set up Docker and TensorRT-LLM 0.12.0:
https://nvidia.github.io/TensorRT-LLM/installation/linux.html
Download my own fine-tuned version of Nougat, which has the same architecture as nougat-base 0.1.0; only the model weights differ.

```shell
# Clone the fine-tuned version of the Nougat model
git lfs install
git clone https://huggingface.co/shenzhanyou/table_nougat
```

Then copy the model to examples/multimodal/tmp/hf_models/${MODEL_NAME} to align with the official example script.
Follow the Nougat tutorial and convert the original model above to bfloat16 and float32 versions. (Only the bfloat16 command is shown; replace bfloat16 with float32 to check float32 accuracy.)
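For intuition on what the bfloat16 conversion does numerically: bfloat16 keeps the float32 exponent but only 7 explicit mantissa bits, so roughly 3 decimal digits of precision survive. A minimal, self-contained illustration (plain Python, not TensorRT-LLM code; real converters round to nearest-even, truncation is used here for simplicity):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Simulate float32 -> bfloat16 by truncating the low 16 bits of the
    float32 encoding. (Real converters round to nearest-even; truncation
    is close enough to illustrate the precision loss.)"""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

print(to_bfloat16(3.14159265))  # 3.140625: only ~3 decimal digits survive
```

Note, though, that precision alone does not explain this report: as described below, the float32 engine reproduces the bfloat16 engine's output, not the HF baseline's.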
Only replace the test image in examples/multimodal/run.py with my own image below to check the result.

```shell
python run.py \
    --hf_model_dir tmp/hf_models/${MODEL_NAME} \
    --visual_engine_dir tmp/trt_engines/${MODEL_NAME}/vision_encoder \
    --llm_engine_dir tmp/trt_engines/${MODEL_NAME}/1-gpu/bfloat16
```
Expected behavior

The output of the original version of NougatModel is:
actual behavior

The bfloat16 and float32 TRT engines give the same output below, which differs from the original transformers inference result in the first line:

\begin{tabular}{@{}llcccccccc@{}} (original transformers)

and

\begin{tabular}{@{}lllllllllll@{}} (TRT-LLM engine, bfloat16 and float32)
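The two first lines differ only in the tabular column specification (a mix of `l` and `c` columns from transformers vs. all-`l` columns from the engines). A quick way to extract and compare the specs (plain Python, illustrative only):

```python
import re

def column_spec(line: str) -> str:
    """Extract the column specification from a `\\begin{tabular}{...}` line."""
    m = re.search(r"\\begin\{tabular\}\{(.*)\}", line)
    return m.group(1) if m else ""

hf_line  = r"\begin{tabular}{@{}llcccccccc@{}}"   # original transformers
trt_line = r"\begin{tabular}{@{}lllllllllll@{}}"  # TRT-LLM engine (bf16 and fp32)

print(column_spec(hf_line))                           # @{}llcccccccc@{}
print(column_spec(trt_line))                          # @{}lllllllllll@{}
print(column_spec(hf_line) == column_spec(trt_line))  # False
```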
code for transformers:
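The snippet itself did not survive in this copy of the issue. A typical HF transformers baseline for Nougat inference looks roughly like the sketch below (this is not the issue author's exact code; the model path and image filename are placeholders):

```python
def first_line(text: str) -> str:
    """Return the first generated line, where the two backends' outputs diverge."""
    return text.splitlines()[0] if text else ""

def run_nougat(model_dir: str, image_path: str) -> str:
    """Hedged sketch of HF Nougat generation; requires a GPU and the
    downloaded model, so imports are kept local to this function."""
    from PIL import Image
    from transformers import NougatProcessor, VisionEncoderDecoderModel

    processor = NougatProcessor.from_pretrained(model_dir)
    model = VisionEncoderDecoderModel.from_pretrained(model_dir).to("cuda")

    image = Image.open(image_path).convert("RGB")
    pixels = processor(image, return_tensors="pt").pixel_values.to("cuda")
    out_ids = model.generate(pixel_values=pixels, max_new_tokens=512)
    return processor.batch_decode(out_ids, skip_special_tokens=True)[0]

# Placeholder usage (paths are hypothetical):
# text = run_nougat("tmp/hf_models/table_nougat", "table.png")
# print(first_line(text))
```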
additional notes

Not all images show different results between transformers and TRT-LLM v0.12.0; this image is a weird one.