Really a use for select() ? #19

Open
MagicalTux opened this issue Jul 25, 2014 · 5 comments

Comments

@MagicalTux
Contributor

The way nope.c works, each worker thread will wait for an incoming connection, then process it.

While this new method allows a single worker to process multiple requests at the same time, that is quite unlikely to happen, especially since the process() method will not return until the processing of that query is done.

I would guess the purpose is to have multiple open sockets and wait for incoming data, but if one sender is very slow and sends fragmented request elements (i.e. only "GET " at first), it will block that child even though the child may have accepted other connections.

Anyway I believe it would make sense to either keep nope.c's original simple design (listen, process), or if we really want to do this right we could:

  • listen is only done in parent process
  • each child process has a unix pipe to the parent process and keeps waiting for data to come
  • incoming connections are accepted and data is received async in a temporary buffer (until empty newline or buffer full which would cause an error to be sent)
  • once full buffer received, it is passed to an available child process through the pipe, alongside the client's fd (UNIX domain sockets can pass file descriptor between processes)
  • parent closes its copy of the file descriptor and frees the buffer memory
  • child processes the request, responds directly to the client
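
As a rough illustration of the "receive into a temporary buffer until empty line or buffer full" step, here is a minimal sketch. The names `req_buf`, `req_buf_append` and `REQ_BUF_SIZE` are hypothetical and not part of nope.c, and the sketch assumes the blank line is the usual HTTP `\r\n\r\n` sequence.

```c
#include <stddef.h>
#include <string.h>

#define REQ_BUF_SIZE 4096   /* hypothetical limit, not taken from nope.c */

enum req_status { REQ_INCOMPLETE, REQ_COMPLETE, REQ_OVERFLOW };

/* One of these would exist per accepted connection in the parent. */
struct req_buf {
    char   data[REQ_BUF_SIZE];
    size_t len;
};

/*
 * Append one received fragment and report the state of the request head.
 * REQ_OVERFLOW means the fragment does not fit; the caller would then
 * send an error response and drop the connection.
 */
static enum req_status req_buf_append(struct req_buf *rb,
                                      const char *frag, size_t n)
{
    size_t old_len = rb->len;
    size_t i;

    if (n > REQ_BUF_SIZE - rb->len)
        return REQ_OVERFLOW;

    memcpy(rb->data + rb->len, frag, n);
    rb->len += n;

    /* The terminator may straddle two fragments, so back up 3 bytes. */
    i = old_len >= 3 ? old_len - 3 : 0;
    for (; i + 4 <= rb->len; i++) {
        if (memcmp(rb->data + i, "\r\n\r\n", 4) == 0)
            return REQ_COMPLETE;
    }
    return REQ_INCOMPLETE;
}
```

With this shape, a client that sends only "GET " just leaves its buffer in the `REQ_INCOMPLETE` state while the parent keeps servicing other sockets.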

From there, the parent process could automatically increase the number of child processes should - for example - all children be busy, and do a lot of other interesting things.
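
For reference, passing the client's fd together with the buffered request over a UNIX domain socket could look roughly like the sketch below. It uses the standard `sendmsg()`/`recvmsg()` calls with an `SCM_RIGHTS` control message; the helper names `send_fd` and `recv_fd` are hypothetical, and error handling is kept to a minimum.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized for exactly one file descriptor, suitably aligned. */
union fd_ctrl {
    char           buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/*
 * Parent side: send `len` bytes of request data plus the descriptor `fd`
 * over the UNIX domain socket `chan`. `len` should be at least 1 so the
 * control message has payload to travel with. Returns 0 or -1.
 */
static int send_fd(int chan, int fd, const void *data, size_t len)
{
    struct iovec   iov;
    struct msghdr  msg;
    union fd_ctrl  ctrl;
    struct cmsghdr *cm;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    iov.iov_base = (void *)data;
    iov.iov_len = len;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) < 0 ? -1 : 0;
}

/*
 * Child side: receive request data and, if present, a descriptor.
 * *fd_out is set to -1 when no descriptor arrived.
 * Returns the number of data bytes received, or -1 on error.
 */
static ssize_t recv_fd(int chan, int *fd_out, void *data, size_t cap)
{
    struct iovec   iov;
    struct msghdr  msg;
    union fd_ctrl  ctrl;
    struct cmsghdr *cm;
    ssize_t        n;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    iov.iov_base = data;
    iov.iov_len = cap;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    *fd_out = -1;
    n = recvmsg(chan, &msg, 0);
    if (n < 0)
        return -1;

    cm = CMSG_FIRSTHDR(&msg);
    if (cm != NULL && cm->cmsg_level == SOL_SOCKET &&
        cm->cmsg_type == SCM_RIGHTS &&
        cm->cmsg_len == CMSG_LEN(sizeof(int)))
        memcpy(fd_out, CMSG_DATA(cm), sizeof(int));
    return n;
}
```

The channel would be created with `socketpair(AF_UNIX, SOCK_STREAM, 0, sv)` before forking each child. After `send_fd()` succeeds the parent can close its own copy of the client fd, since the child now holds an independent reference.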

Pros:

  • If someone opens multiple connections but doesn't send anything, it doesn't cause a denial of service
  • Parent process has more control on the distribution of work
  • If a child dies for some reason (bug in the server for example) parent will notice and will be able to respawn some children

Cons:

  • More complex to implement
  • Could potentially cause parent process to block if a child is frozen
  • Things like OpenSSL will have trouble dealing with the fd moving to a different process (this could be solved by having the parent act as a proxy rather than sending over the fd, which would also improve security since child code would not have access to certificates etc.)
@riolet
Collaborator

riolet commented Jul 26, 2014

Thanks for the detailed proposal, Mark. I like it, except, as you say, we will be straying away from the simple design. How about the following model as an alternative:

  • Keep the forking and select() as it is now
  • The first time we find a connection with new data to read, we spawn process() as a new thread, i.e., each connection will have its own worker thread.
  • Each thread has a read and write buffer that the parent thread can see. They read and write as normal to this buffer. As far as they are concerned, they have no clue that they are not writing to a socket.
  • When the parent process finds that connection X has new data to read, it reads it and puts this data in thread X's read buffer.
  • When the parent finds that thread X has written something to its write buffer, the data is then written to connection X
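
One possible way to sketch this relay model is shown below. It swaps the shared read/write buffers for a `socketpair()`, which is an assumption rather than what is proposed above, because that lets the worker use plain `read()`/`write()` without knowing it is not talking to the client socket. `worker` and `relay_once` are hypothetical names, and the fixed response stands in for whatever process() would produce.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-connection request handler. */
static void *worker(void *arg)
{
    static const char resp[] = "HTTP/1.0 200 OK\r\n\r\nnope";
    int     fd = *(int *)arg;
    char    req[256];
    ssize_t n = read(fd, req, sizeof req);   /* looks like a socket read */

    if (n > 0 && write(fd, resp, sizeof resp - 1) < 0) {
        /* parent went away; nothing more to do in this sketch */
    }
    close(fd);                               /* parent sees EOF */
    return NULL;
}

/*
 * Parent side: hand one request to a fresh worker thread and collect the
 * reply into `out`. In a real server the two transfers would be driven by
 * select() alongside the client sockets instead of being done inline.
 * Returns the number of reply bytes, or -1 on error.
 */
static ssize_t relay_once(const char *req, size_t req_len,
                          char *out, size_t cap)
{
    int       sp[2];
    pthread_t tid;
    ssize_t   total = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) < 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, &sp[1]) != 0) {
        close(sp[0]);
        close(sp[1]);
        return -1;
    }

    if (write(sp[0], req, req_len) == (ssize_t)req_len) {
        ssize_t n;

        total = 0;
        while ((size_t)total < cap &&
               (n = read(sp[0], out + total, cap - (size_t)total)) > 0)
            total += n;
    }
    close(sp[0]);
    pthread_join(tid, NULL);
    return total;
}
```

The parent would forward whatever `relay_once()` collected to connection X. Depending on the toolchain, building this may require `-pthread`.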

Pros:

  • Easy to implement and maintains the minimalist design.

Cons:

  • Mixing processes, threads and select()

Please let me know what you think.

@MagicalTux
Contributor Author

Having a proxy (i.e. a thread that will read/write between the socket and buffers, or more likely pipes, passed to the child) is something I thought of. In terms of design this gets closer to FastCGI, for example (the HTTP server receives the request, then passes it to another process that does the actual work).
This way of doing things is most likely the best in order to keep things smooth, but may not suit everyone.
For example machinekit.io (see #5 ) would need an HTTP server that only one person will interact with (as would many embedded systems, actually), which is a use case where blocking behavior is acceptable, while a public website (say http://nopedotc.com/ ) requires that one user's actions on the website do not impact another user's actions.

While both cases could be implemented with appropriate #ifdef (whoever compiles decides what fits best for them), that would mean a lot of differences, which could end up making the code too complex.

As such I would guess that before we decide on the appropriate method to deal with queries, we need to define the intended use for nope.c a bit more precisely.

@riolet
Collaborator

riolet commented Jul 26, 2014

What do you think of the idea of having two versions? The "standard" version aimed at public websites, which was the original intention of doing this, and nope.c-embed for embedded systems.

@MagicalTux
Contributor Author

That could work, and if coded properly, we could just have a single file defining which version you're using (kind of the way Apache's MPMs work, but simpler).

In that case I can see a few patterns:

  • All-in-one-process: no forks, only select()
  • Standard fork: as things were initially
  • Hybrid (fork + select)
  • Proxy (parent doing select, passing data to children through dedicated pipes)
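
A single file selecting among these patterns might look something like the sketch below. Every macro and function name here (`NOPE_MODEL`, `NOPE_MODEL_FORK`, `nope_model_name`, and so on) is hypothetical and only illustrates the idea; defaulting to the standard fork model is also an assumption.

```c
/* Hypothetical model identifiers, one per pattern listed above. */
#define NOPE_MODEL_SINGLE 1   /* all-in-one-process: select() only   */
#define NOPE_MODEL_FORK   2   /* standard fork                       */
#define NOPE_MODEL_HYBRID 3   /* fork + select()                     */
#define NOPE_MODEL_PROXY  4   /* parent select(), children via pipes */

/* Whoever compiles picks a model, e.g. -DNOPE_MODEL=NOPE_MODEL_PROXY. */
#ifndef NOPE_MODEL
#define NOPE_MODEL NOPE_MODEL_FORK
#endif

#if NOPE_MODEL == NOPE_MODEL_SINGLE
#define NOPE_MODEL_NAME  "single"
#define NOPE_USES_FORK   0
#define NOPE_USES_SELECT 1
#elif NOPE_MODEL == NOPE_MODEL_FORK
#define NOPE_MODEL_NAME  "fork"
#define NOPE_USES_FORK   1
#define NOPE_USES_SELECT 0
#elif NOPE_MODEL == NOPE_MODEL_HYBRID
#define NOPE_MODEL_NAME  "hybrid"
#define NOPE_USES_FORK   1
#define NOPE_USES_SELECT 1
#elif NOPE_MODEL == NOPE_MODEL_PROXY
#define NOPE_MODEL_NAME  "proxy"
#define NOPE_USES_FORK   1
#define NOPE_USES_SELECT 1
#else
#error "unknown NOPE_MODEL"
#endif

/* Lets the rest of the code (or a startup banner) report the model. */
static const char *nope_model_name(void)
{
    return NOPE_MODEL_NAME;
}
```

Keeping the conditionals in one header like this would confine the `#if` blocks to a single place, with the rest of the code testing `NOPE_USES_FORK` and `NOPE_USES_SELECT` instead of the model itself.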

By the way we might want to support other methods of waiting for activity such as epoll().
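
To give an idea of what the epoll() variant involves, here is a minimal Linux-only sketch that waits for a single fd to become readable. The helper name `wait_readable_epoll` is hypothetical, and a real server would create the epoll instance once and keep it for its whole lifetime rather than per call.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_readable_epoll(int fd, int timeout_ms)
{
    struct epoll_event ev;
    struct epoll_event out;
    int ep, n;

    ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }

    n = epoll_wait(ep, &out, 1, timeout_ms);
    close(ep);
    if (n < 0)
        return -1;
    return (n > 0 && (out.events & EPOLLIN)) ? 1 : 0;
}
```

Unlike select(), the set of watched descriptors lives in the kernel, so it does not have to be rebuilt before every wait.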

@Charles0429

Hi @riolet @MagicalTux , there are multiple networking patterns as MagicalTux pointed out:

  • All-in-one-process: no forks, only select()
  • Standard fork: as things were initially
  • Hybrid (fork + select)
  • Proxy (parent doing select, passing data to children through dedicated pipes)

Besides the fork method, if we also consider multi-threading, the networking patterns could be as follows:

  • All-in-one-process: no forks, only select
  • Standard fork: fork a process when a client request comes
  • Standard pthread_create: create a thread when a client request comes
  • Pre fork: Pre fork N children processes, each process deals with a request
  • Pre pthread_create: Pre create N threads, each thread deals with a request
  • Hybrid fork: fork + select
  • Hybrid pre fork: pre fork + select
  • Hybrid pthread_create: pthread_create + select
  • Hybrid pre pthread_create: pre pthread_create + select
  • Proxy fork: fork + Proxy
  • Proxy pthread_create: pthread_create() + Proxy
  • Proxy pre fork: pre fork + Proxy
  • Proxy pre pthread_create: pre pthread_create + Proxy
