automatically set max_batch_size according to the device when it is not specified #2434

Merged
lvhan028 merged 5 commits into InternLM:main on Sep 9, 2024

lvhan028
Collaborator

@lvhan028 lvhan028 commented Sep 7, 2024

Set the default value of max_batch_size to None.
When users don't specify this value, LMDeploy sets max_batch_size according to the device.
The rules are:
A100, A800 -> 256
H100, H800 -> 512
Other CUDA devices -> 128
Ascend -> 16
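
For illustration only, the rules above can be sketched as a small lookup. This is not the code merged in this PR: the function names `default_max_batch_size` and `resolve_max_batch_size`, the `device_type` strings, and the substring matching on the device name are all assumptions made for this example. In practice the device name could come from something like `torch.cuda.get_device_name()`.

```python
from typing import Optional


def default_max_batch_size(device_type: str, device_name: str = "") -> int:
    """Pick a default batch size from the device (hypothetical helper).

    device_type: "cuda" or "ascend" (assumed labels for this sketch).
    device_name: the reported device name, e.g. "NVIDIA A100-SXM4-80GB".
    """
    if device_type == "ascend":
        return 16
    if device_type != "cuda":
        raise ValueError(f"unsupported device type: {device_type!r}")

    name = device_name.upper()
    # Substring matching is an assumption; the real check may differ.
    if "A100" in name or "A800" in name:
        return 256
    if "H100" in name or "H800" in name:
        return 512
    # Any other CUDA device falls back to 128.
    return 128


def resolve_max_batch_size(
    max_batch_size: Optional[int], device_type: str, device_name: str = ""
) -> int:
    """Keep an explicit user value; otherwise derive one from the device."""
    if max_batch_size is not None:
        return max_batch_size
    return default_max_batch_size(device_type, device_name)
```

Using None as the default makes it possible to distinguish "not specified" from an explicit user choice, so a value the user passes is never overridden.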

Collaborator

@zhyncs zhyncs left a comment

Overall LGTM.
I'll help verify it on H100 ASAP. I think it'll improve throughput.

@lvhan028 lvhan028 merged commit e6d70a0 into InternLM:main Sep 9, 2024
5 checks passed
4 participants