Early error throwing #27

Closed
Foixa opened this issue Jun 5, 2024 · 4 comments
Labels
enhancement Possible new feature or improvement request

Comments

@Foixa
Foixa commented Jun 5, 2024

If we provide the parameters incorrectly, RunPod Delay time can go up to 2 minutes. Is there a way to throw an error early to prevent this?

For example:

     "api_name": "inpaint-outpaint",
     "save_meta": false,

Normally save_meta should be like this:

     "api_name": "inpaint-outpaint",
     "save_meta": "false",

NOTE: I have not encountered this error yet in "inpaint-outpaint2".

@davefojtik
Owner

That's weird. I usually get errors almost instantly when a parameter is wrong.

You can set Execution Timeout on your serverless endpoint -> three dots in the right-top corner when inspecting it -> Edit Endpoint -> Enable Execution Timeout. That will set it globally for all requests. Alternatively, you can set the exec-timeout, queue timeout (ttl) and priority in individual requests like this:

{
  "input": {},
  "policy": {
    "executionTimeout": int, // Time in milliseconds. Must be greater than 5 seconds.
    "lowPriority": bool, // Sets the job's priority to low. Default behavior escalates to high under certain conditions.
    "ttl": int // Time in milliseconds. Must be greater than or equal to 10 seconds. Default is 24 hours. Maximum is one week.
  }
}
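As a rough illustration of the per-request policy above, here is a minimal Python sketch that builds such a request body and rejects values outside the limits quoted in the comments. The function name `build_request` and its keyword arguments are hypothetical; only the `input`, `policy`, `executionTimeout`, `lowPriority` and `ttl` keys come from the thread.

```python
# Limits taken from the comments in the request example above (milliseconds).
MIN_EXECUTION_TIMEOUT_MS = 5_000           # must be greater than 5 seconds
MIN_TTL_MS = 10_000                        # must be at least 10 seconds
MAX_TTL_MS = 7 * 24 * 60 * 60 * 1000       # maximum is one week


def build_request(job_input, execution_timeout_ms=None, ttl_ms=None, low_priority=None):
    """Build a request body with an optional "policy" section (hypothetical helper)."""
    policy = {}
    if execution_timeout_ms is not None:
        if execution_timeout_ms <= MIN_EXECUTION_TIMEOUT_MS:
            raise ValueError("executionTimeout must be greater than 5000 ms")
        policy["executionTimeout"] = execution_timeout_ms
    if ttl_ms is not None:
        if not MIN_TTL_MS <= ttl_ms <= MAX_TTL_MS:
            raise ValueError("ttl must be between 10000 ms and one week")
        policy["ttl"] = ttl_ms
    if low_priority is not None:
        policy["lowPriority"] = bool(low_priority)

    body = {"input": job_input}
    if policy:  # only send a policy section when something was set
        body["policy"] = policy
    return body


request_body = build_request(
    {"api_name": "inpaint-outpaint"},
    execution_timeout_ms=60_000,
    ttl_ms=120_000,
)
```

Checking the limits on the client side means an invalid policy fails before the request is ever queued.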

Note that V1 requests should generally have everything as string because of Python's multipart/form-data conversion that has to be done there, while V2 can follow classic JSON standards. But the correctness of the data needs to be checked in your app.
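Since the data has to be checked in the calling app, one possible client-side check is sketched below. It flags every top-level parameter that is not a string before a V1 request is sent. This assumes the "everything as string" rule applies to all top-level fields, which the note above only states as a general guideline; the helper name `find_non_string_params` is made up for this example.

```python
def find_non_string_params(params):
    """Return a description of each top-level value that is not a str.

    Hypothetical helper: assumes V1 endpoints want every top-level
    parameter as a string, as suggested (not guaranteed) in the thread.
    """
    problems = []
    for key, value in params.items():
        if not isinstance(value, str):
            problems.append(f"{key}: expected str, got {type(value).__name__}")
    return problems


# The parameters from the original report: save_meta is a JSON boolean.
bad_params = {"api_name": "inpaint-outpaint", "save_meta": False}
good_params = {"api_name": "inpaint-outpaint", "save_meta": "false"}

bad_report = find_non_string_params(bad_params)
good_report = find_non_string_params(good_params)
```

Running a check like this before submitting lets the app raise its own error immediately instead of waiting on the queue.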

@Foixa
Author

Foixa commented Jun 5, 2024

I get immediate answers to most errors. However, as I explained above, in some exceptional cases it may take up to 2 minutes.

Additionally, the executionTimeout parameter is useless in this case because the application has not been executed yet. It gets stuck on delay.

runpod.serverless.start({"handler": handler})

I think serverless is not started in such cases.

@davefojtik
Owner

I checked with the exact request parameters you mentioned and I see what you mean now. The serverless handler is started and everything goes through just fine. The worker is ended right away and becomes available. The only problem is that the request is not ended and instead "returns" back to the queue for a while. Also, the client does not get any results till the expiration. Quite a weird behaviour from RunPod when the worker is finished, but from a quick assessment I guess it might be because:

Error while returning job result. | Object of type AttributeError is not JSON serializable

probably caused by how I handled the possible errors:

except Exception as e:
      print("multipart/form-data task failed: ", e)
      return e

It seems RunPod expects errors to be returned as JSON-serializable data rather than as a raw exception object.
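A minimal sketch of one way to avoid that serialization failure: convert the exception into a plain dict before returning it. The `{"error": ...}` shape mirrors the error field shown in the later changelog, but this is an illustration and not the actual fix from the repository; `run_task` and `broken_task` are hypothetical names.

```python
import json


def run_task(task, job_input):
    """Run a task and return either its result or a JSON-serializable error dict."""
    try:
        return task(job_input)
    except Exception as e:
        print("multipart/form-data task failed: ", e)
        # Return plain data instead of the exception object itself.
        return {"error": f"{type(e).__name__}: {e}"}


def broken_task(job_input):
    # Dicts have no such attribute, so this raises AttributeError.
    return job_input.missing_attribute


result = run_task(broken_task, {})
serialized = json.dumps(result)  # works, because result is a plain dict

# Returning the exception object directly is what cannot be serialized:
try:
    json.dumps(AttributeError("example"))
    raw_exception_serializable = True
except TypeError:
    raw_exception_serializable = False
```

In a real worker, the returned dict would come from the function passed to `runpod.serverless.start({"handler": handler})`.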

Since it does not make workers occupied nor lose money on execution and actually happens only when not serving correct values, I am not marking this as a critical bug. But it's definitely not right and can make a worse dev experience. I will make sure to take a look into this in the upcoming update and close this issue once fixed. Thank you very much for pointing this out!

@davefojtik davefojtik added the enhancement Possible new feature or improvement request label Jun 6, 2024
davefojtik added a commit that referenced this issue Jun 29, 2024
Update Release compatible with Fooocus-API v0.4.1.0

Changelog:
- Fixed error returns #27
  - RunPod now returns {"delayTime": 0, "error": "message", "executionTime": 0, "id": "runpod-job-id", "status": "FAILED"} on handler (wrong params, images etc.) errors.
- Updated containers and software versions
- The new NSFW filter is tested and working. Try it with "advanced_params": {"black_out_nsfw": true}
See also [Fooocus-API changelog](https://github.com/mrhan1993/Fooocus-API/releases) to find out what's new in the API code and [Fooocus changelog](https://github.com/lllyasviel/Fooocus/releases) to see what's new in the included Fooocus version.

Breaking change:
- This new release introduced two new models: sdxl_hyper_lora and nsfw-checker (~2GB), so the higher 0,12$/hr CPU pod has to be used for network installation since it's more than 20GB total. Standalone is unaffected.
- CUDA version has been updated to 12.1. To prevent unexpected errors, we recommend setting "Allowed CUDA Versions" in your Advanced RunPod endpoint settings to 12.1 and higher
@davefojtik
Owner

Error returns fixed with the new update. Closing this issue as completed. Thanks for your feedback!
