Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sending Grpc Request in Appengine Java 8 Standard Environment Could Fail #3296

Closed
neozwu opened this issue Jul 31, 2017 · 11 comments
Closed
Milestone

Comments

@neozwu
Copy link

neozwu commented Jul 31, 2017

What version of gRPC are you using?

1.4.0

What JVM are you using (java -version)?

Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

Google-cloud-java has received issues (googleapis/google-cloud-java#2150, googleapis/google-cloud-java#2275) regarding sending grpc reqeust in appengine (specifically, using google-cloud-java Pub/Sub client library). When initiating grpc connection in appengine, grpc picks up currentRequestThreadFactory for its thread factory (https://github.com/grpc/grpc-java/blob/master/core/src/main/java/io/grpc/internal/GrpcUtil.java#L482). This could cause appengine runtime to throw NPE (see below for a typical stack trace). A hack to force defaultThreadFactory to be used by Grpc seems to resolve this issue.

Sending Grpc request within Appengine request thread seems to be a valid scenario. Grpc should support it.

java.lang.NullPointerException
	at com.google.apphosting.runtime.ApiProxyImpl$CurrentRequestThreadFactory.newThread(ApiProxyImpl.java:1267)
	at com.google.common.util.concurrent.ThreadFactoryBuilder$1.newThread(ThreadFactoryBuilder.java:162)
	at java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:612)
	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:925)
	at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1587)
	at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:336)
	at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:555)
	at io.grpc.internal.ManagedChannelImpl.rescheduleIdleTimer(ManagedChannelImpl.java:334)
	at io.grpc.internal.ManagedChannelImpl.exitIdleMode(ManagedChannelImpl.java:299)
	at io.grpc.internal.ManagedChannelImpl$4$1.run(ManagedChannelImpl.java:357)
	at io.grpc.internal.ChannelExecutor.drain(ChannelExecutor.java:87)
	at io.grpc.internal.ManagedChannelImpl$4.get(ManagedChannelImpl.java:359)
	at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:218)
	at io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:47)
	at com.google.api.gax.grpc.GrpcHeaderInterceptor$1.start(GrpcHeaderInterceptor.java:62)
	at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:276)
	at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:252)
	at io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:186)
	at com.google.pubsub.v1.PublisherGrpc$PublisherFutureStub.publish(PublisherGrpc.java:538)
	at com.google.cloud.pubsub.v1.Publisher.publishOutstandingBatch(Publisher.java:330)
	at com.google.cloud.pubsub.v1.Publisher.publishAllOutstanding(Publisher.java:304)
	at com.google.cloud.pubsub.v1.Publisher.access$500(Publisher.java:79)
	at com.google.cloud.pubsub.v1.Publisher$5.run(Publisher.java:283)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:295)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

/cc @jbaldassari @cmaan @jabubake @tcoffee-google @garrettjonesgoogle

@jbaldassari
Copy link

I believe @cmaan had the same issue but with the Spanner client if that sheds any light on it

@carl-mastrangelo
Copy link
Contributor

@neozwu Can you clarify when it is okay to start a new thread?

@neozwu
Copy link
Author

neozwu commented Aug 2, 2017

@carl-mastrangelo there are more discussions here. It seems to me grpc handles GAE J7 request threads specifically in order to avoid zombie threads. The same thing should apply to J8. However, the IS_RESTRICTED_APPENGINE flag only sets to true on GAE J7.

@carl-mastrangelo
Copy link
Contributor

@neozwu I don't understand. All gRPC threads are created with Daemon=true. Where do you see that not being the case?

@neozwu
Copy link
Author

neozwu commented Aug 2, 2017

@carl-mastrangelo , thanks for following up. I suspect Daemon=true might be causing the problem. So, on GAE J7, I think Daemon is not set true for currentRequestThreadFactory, since IS_RESTRICTED_APPENGINE is true and it will use this branch. Also, from the comments here, it does suggest grpc realizes to not have zombie threads on GAE J7. I think all these special treatments of threads for GAE J7 should be applied to GAE J8 too. I suspect this is why we are seeing the errors from grpc right now. I'm not familiar with grpc's internal code. My statement is based on the reasoning of the same request thread behavior on GAE J7 and J8, and , the different treatment of request threads on GAE J7 and J8 in grpc.

@garrettjonesgoogle
Copy link
Contributor

@carl-mastrangelo would you be able to respond to @neozwu 's last comment?

@carl-mastrangelo
Copy link
Contributor

I thought the point of GAE J8 was that it didn't have all the painful restrictions, which is why IS_RESTRICTED_APPENGINE is stuck on J7.

OkHttp will use a worse TLS suite (rand I think?) on J7 which we don't want to keep doing on j8. Making IS_RESTRICTED_APPENGINE true on j8 would be conflating grpc thread behavior with grpc TLS behavior.

No one has said specifically why creating a thread doesn't work on G8, and I don't really have time to walk all the previous discussions. Why was an NPE raised? What was actually null? Why isn't this filed under app engine instead of gRPC?

@carl-mastrangelo
Copy link
Contributor

@rrch would you be able to shed light on why GAE raises this?

@garrettjonesgoogle
Copy link
Contributor

Do we need a new configuration, IS_SEMI_RESTRICTED_APPENGINE? :-)

The problem is more fundamentally the thread factory that is chosen. On both G7 and G8, the thread factory chosen while in the app engine request thread has restrictions around the threads it creates (e.g. you can't create threads from it while not in an app engine request thread). So, trying to use that thread factory in other contexts will cause an NPE (internal detail - maybe another exception is more appropriate, but regardless, a different exception would probably be explicitly thrown if the NPE itself was fixed.)

@rrch
Copy link

rrch commented Aug 21, 2017

The "current request thread factory" only works in the context of a request. This is currently true for all App Engine APIs in Java 8: if you invoke them from a thread that is not associated with a request, they will fail. In several other places we've improved the message associated with the exception to make the problem clearer, I'll file a bug to do the same for thread factories.

@ejona86
Copy link
Member

ejona86 commented Aug 21, 2017

I had mentioned that maybe the problem is we are using MoreExecutors.platformThreadFactory() in all cases. Maybe we should only use it when IS_RESTRICTED_APPENGINE == true. IIUC, that would avoid using request threads.

However, I also heard that you can't do AppEngine API calls from "normal" threads. That seems like that would be a be a problem for 1) the callback executor and 2) auth.

ejona86 added a commit to ejona86/grpc-java that referenced this issue Aug 23, 2017
While the code had correctly determined full threads were available, the
call to MoreExecutors returned a request thread factory, which has
limitations.

Note that Async stub users may not be able to call GAE APIs in
callbacks. This is because the threads aren't request threads. They can
override the individual call's executor with
com.google.appengine.api.ThreadManager.currentRequestThreadFactory() in
an interceptor via callOptions.withExecutor().

Fixes grpc#3296
@ejona86 ejona86 removed the usability label Aug 23, 2017
ejona86 added a commit that referenced this issue Aug 24, 2017
While the code had correctly determined full threads were available, the
call to MoreExecutors returned a request thread factory, which has
limitations.

Note that Async stub users may not be able to call GAE APIs in
callbacks. This is because the threads aren't request threads. They can
override the individual call's executor with
com.google.appengine.api.ThreadManager.currentRequestThreadFactory() in
an interceptor via callOptions.withExecutor().

Fixes #3296
ejona86 added a commit to ejona86/grpc-java that referenced this issue Aug 28, 2017
While the code had correctly determined full threads were available, the
call to MoreExecutors returned a request thread factory, which has
limitations.

Note that Async stub users may not be able to call GAE APIs in
callbacks. This is because the threads aren't request threads. They can
override the individual call's executor with
com.google.appengine.api.ThreadManager.currentRequestThreadFactory() in
an interceptor via callOptions.withExecutor().

Fixes grpc#3296
@ejona86 ejona86 added this to the 1.6 milestone Aug 28, 2017
ejona86 added a commit that referenced this issue Aug 28, 2017
While the code had correctly determined full threads were available, the
call to MoreExecutors returned a request thread factory, which has
limitations.

Note that Async stub users may not be able to call GAE APIs in
callbacks. This is because the threads aren't request threads. They can
override the individual call's executor with
com.google.appengine.api.ThreadManager.currentRequestThreadFactory() in
an interceptor via callOptions.withExecutor().

Fixes #3296
@lock lock bot locked and limited conversation to collaborators Sep 21, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants