[ML] Fix incorrect assumption about minimum ML node size #91694
Conversation
The ML autoscaling code assumed that all ML nodes in Cloud would be at least 1GB. This is not correct: after allowing for logging and metrics collection, ML nodes can be smaller. This PR updates the assumption to 0.5GB.
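The change described above can be sketched as follows. This is a minimal illustration with hypothetical constant and method names, not the actual Elasticsearch autoscaling code: it shows how lowering an assumed minimum node size from 1GB to 0.5GB changes the size attributed to a small ML node.

```java
public class MlNodeSizeAssumption {

    // Hypothetical constants illustrating the change in this PR: the old code
    // assumed every Cloud ML node is at least 1GB; the fix lowers that to 0.5GB.
    static final long OLD_MINIMUM_NODE_SIZE_BYTES = 1024L * 1024 * 1024; // 1GB
    static final long NEW_MINIMUM_NODE_SIZE_BYTES = 512L * 1024 * 1024;  // 0.5GB

    /** Clamp an observed node size up to the assumed minimum. */
    static long clampToMinimum(long observedBytes, long minimumBytes) {
        return Math.max(observedBytes, minimumBytes);
    }

    public static void main(String[] args) {
        // A 600MB ML node: possible in Cloud once logging and metrics
        // collection have taken their share of the machine.
        long small = 600L * 1024 * 1024;

        // Under the old assumption the node is incorrectly rounded up to 1GB;
        // under the new assumption its real size is preserved.
        System.out.println(clampToMinimum(small, OLD_MINIMUM_NODE_SIZE_BYTES));
        System.out.println(clampToMinimum(small, NEW_MINIMUM_NODE_SIZE_BYTES));
    }
}
```

The point of the sketch is that with a 1GB floor, any node smaller than 1GB is mis-sized upward, which throws off autoscaling arithmetic; a 0.5GB floor leaves realistic Cloud node sizes intact.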
Pinging @elastic/ml-core (Team:ML)
LGTM
💔 Backport failed
You can use sqren/backport to manually backport by running
On closer inspection, the code changed quite radically in 8.3, so I think it would be best not to backport this change to 7.x. Doing so might aggravate some of the other ML autoscaling discrepancies that were fixed in 8.3.
…1696) The ML autoscaling code was making an assumption that all ML nodes in Cloud will be at least 1GB. This is not correct. After allowing for logging and metrics collection it is possible for ML nodes to be smaller. This PR updates the assumption to 0.5GB.
…1697) The ML autoscaling code was making an assumption that all ML nodes in Cloud will be at least 1GB. This is not correct. After allowing for logging and metrics collection it is possible for ML nodes to be smaller. This PR updates the assumption to 0.5GB.
This change fixes a discrepancy that has existed for a long time but was revealed by #91694. The ML automatic node/JVM sizing code contained a minimum node size but did not restrict the minimum JVM size to the size that would be chosen on that minimum node size. This could throw off calculations at small scale. Fixes #91728
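The follow-up fix (#91732) can be illustrated with a short sketch. All names and the sizing heuristic here are hypothetical, not Elasticsearch's real JVM sizing logic: it shows why the minimum JVM size should be derived from the minimum node size rather than defined independently.

```java
public class JvmSizeFix {

    // Hypothetical heuristic: give the JVM half of the node's memory.
    // (The real Elasticsearch sizing formula differs; this stands in for it.)
    static long jvmSizeForNode(long nodeBytes) {
        return nodeBytes / 2;
    }

    /**
     * Before the fix: the minimum JVM size was an independent constant, so it
     * could exceed the JVM size that the minimum-sized node would actually get.
     * After the fix: the minimum JVM size is whatever jvmSizeForNode returns
     * for the minimum node size, keeping the two limits consistent.
     */
    static long minimumJvmSize(long minimumNodeBytes) {
        return jvmSizeForNode(minimumNodeBytes);
    }

    public static void main(String[] args) {
        long minNode = 512L * 1024 * 1024; // 0.5GB minimum node, per #91694
        // The JVM minimum is now tied to the node minimum, so small-scale
        // calculations cannot assume a JVM larger than the node could host.
        System.out.println(minimumJvmSize(minNode));
    }
}
```

Deriving one minimum from the other removes the discrepancy at small scale: a stand-alone minimum JVM size could imply a JVM that would not fit on the minimum node at all.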
…ic#91732) This change fixes a discrepancy that has existed for a long time but was revealed by elastic#91694. The ML automatic node/JVM sizing code contained a minimum node size but did not restrict the minimum JVM size to the size that would be chosen on that minimum node size. This could throw off calculations at small scale. Fixes elastic#91728
… (#91742) This change fixes a discrepancy that has existed for a long time but was revealed by #91694. The ML automatic node/JVM sizing code contained a minimum node size but did not restrict the minimum JVM size to the size that would be chosen on that minimum node size. This could throw off calculations at small scale. Fixes #91728
… (#91741) This change fixes a discrepancy that has existed for a long time but was revealed by #91694. The ML automatic node/JVM sizing code contained a minimum node size but did not restrict the minimum JVM size to the size that would be chosen on that minimum node size. This could throw off calculations at small scale. Fixes #91728