Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the default Gunicorn API server workers count to one #1454

Merged
merged 5 commits into from
Feb 25, 2021

Conversation

ssheng
Copy link
Collaborator

@ssheng ssheng commented Feb 16, 2021

Description

Update the default API GUnicorn server workers count to one. Originally, the workers count was automatically determined by the number of available CPU cores.

Motivation and Context

The automatic determination was not ideal because users were often unaware the large number of workers and large number of workers could result in additional memory overhead from models. Users will still have the ability to set BentoML to automatically determine the number of workers by explicit configuration.

How Has This Been Tested?

./dev/format.sh
./dev/lint.sh
./ci/unit_test.sh

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature and improvements (non-breaking change which adds/improves functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Code Refactoring (internal change which is not user facing)
  • Documentation
  • Test, CI, or build

Component(s) if applicable

  • BentoService (service definition, dependency management, API input/output adapters)
  • Model Artifact (model serialization, multi-framework support)
  • Model Server (mico-batching, dockerisation, logging, OpenAPI, instruments)
  • YataiService gRPC server (model registry, cloud deployment automation)
  • YataiService web server (nodejs HTTP server and web UI)
  • Internal (BentoML's own configuration, logging, utility, exception handling)
  • BentoML CLI

Checklist:

  • My code follows the bentoml code style, both ./dev/format.sh and
    ./dev/lint.sh script have passed
    (instructions).
  • My change reduces project test coverage and requires unit tests to be added
  • I have added unit tests covering my code change
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

Comment on lines 69 to 80
container = BentoMLContainer()
config = BentoMLConfiguration()
if "BENTOML_GUNICORN_TIMEOUT" in os.environ:
config.override(["api_server", "port"], int(os.environ.get("BENTOML_GUNICORN_TIMEOUT")))
if "BENTOML_GUNICORN_NUM_OF_WORKERS" in os.environ:
config.override(["api_server", "workers"], int(os.environ.get("BENTOML_GUNICORN_NUM_OF_WORKERS")))
container.config.from_dict(config.as_dict())

from bentoml import service, yatai

container.wire(packages=[service, yatai])

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the best way to test this? @parano @yubozhao @bojiang

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have a good way right now. Deployment tests (e2e_tests) are running offline right now.

We need to refactor/improve the sagemaker deployment with better gunicorn supports anyway. I am fine that we continue to do the basic testings on the e2e test

@codecov
Copy link

codecov bot commented Feb 16, 2021

Codecov Report

Merging #1454 (431d5d9) into master (13e5388) will increase coverage by 0.21%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1454      +/-   ##
==========================================
+ Coverage   67.54%   67.75%   +0.21%     
==========================================
  Files         149      149              
  Lines       10031    10008      -23     
==========================================
+ Hits         6775     6781       +6     
+ Misses       3256     3227      -29     
Impacted Files Coverage Δ
bentoml/configuration/containers.py 86.48% <100.00%> (-2.25%) ⬇️
bentoml/server/instruments.py 76.74% <100.00%> (+0.55%) ⬆️
bentoml/yatai/deployment/sagemaker/operator.py 77.03% <0.00%> (-2.64%) ⬇️
bentoml/yatai/deployment/aws_ec2/operator.py 68.81% <0.00%> (-0.38%) ⬇️
bentoml/saved_bundle/config.py 88.80% <0.00%> (-0.09%) ⬇️
bentoml/clipper/__init__.py 0.00% <0.00%> (ø)
bentoml/cli/bento_service.py 82.46% <0.00%> (ø)
bentoml/yatai/deployment/docker_utils.py 94.73% <0.00%> (ø)
bentoml/service/__init__.py 75.42% <0.00%> (+0.42%) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 13e5388...431d5d9. Read the comment docs.

yubozhao
yubozhao previously approved these changes Feb 17, 2021
@ssheng ssheng changed the title Update the default API GUnicorn server workers count to one Update the default Gunicorn API server workers count to one Feb 22, 2021
container = BentoMLContainer()
config = BentoMLConfiguration()
if "BENTOML_GUNICORN_TIMEOUT" in os.environ:
config.override(["api_server", "port"], int(os.environ.get("BENTOML_GUNICORN_TIMEOUT")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a typo? s/port/timeout

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This was a typo. We should think about adding unit test for this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@inject
def _serve(
bento_server_timeout: int = Provide[BentoMLContainer.config.api_server.timeout],
bento_server_workers: int = Provide[BentoMLContainer.api_server_workers],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also remove the CPU core related calculation in BentoMLContainer.api_server_workers right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We we still want to maintain the behavior to automatically determine works if workers is set to None?

@parano
Copy link
Member

parano commented Feb 25, 2021

Thanks, @ssheng, I'm merging this PR now, feel free to address the issue we just discussed in a follow-up PR.

@parano parano merged commit cc104f9 into bentoml:master Feb 25, 2021
@ssheng ssheng deleted the default-workers branch January 9, 2022 11:08
aarnphm pushed a commit to aarnphm/BentoML that referenced this pull request Jul 29, 2022
…1454)

* Update the default API server GUnicorn workers to 1.

* Update dependency injection wire functionality for yatai/deployment/sagemaker/serve.

* Update wire modules in yatai/deployment/sagemaker/serve.

* Use app_server_workers in yatai/deployment/sagemaker/serve.

* Fix typo.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants