Friday, October 3, 2014

Avoiding the "NGINX buffers the request of the body when uploading large files" issue

The problem with NGINX is described perfectly by David Moreau Simard in his blog post "A use case of Tengine, a drop-in replacement and fork of nginx". The summary is in these paragraphs: 
I noticed a problem when using nginx as a load balancer in front of servers that are the target of large and numerous uploads. nginx buffers the request body, and this is something that drives a lot of discussion on the nginx mailing lists.
This effectively means that the file is uploaded twice. You upload a file to nginx, which acts as a reverse proxy/load balancer, and nginx waits until the file has finished uploading before sending it to one of the available backends. The buffering happens either in memory or in an actual file, depending on the configuration.
Tengine was recently brought up on the Ceph mailing lists as part of the solution to this problem, so I decided to give it a try and see what kind of impact its unbuffered requests had on performance.
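
For context, here is a minimal sketch of where that buffering shows up in a stock (unpatched) nginx reverse proxy; the upstream address, port and paths are placeholders and the sizes are only examples:

  http {
      # Bodies larger than this in-memory buffer are spooled to a temporary file
      client_body_buffer_size 128k;
      # Hard limit on the size of an uploaded body
      client_max_body_size    2g;
      # Where the spooled request bodies are written
      client_body_temp_path   /var/cache/nginx/client_temp;

      upstream backend {
          server 10.0.0.10:8080;   # placeholder backend
      }

      server {
          listen 80;
          location /upload {
              # Stock nginx reads the whole body (RAM or disk) before proxying it here
              proxy_pass http://backend;
          }
      }
  }
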
Similar issues have been reported in many other mailing lists and forums.

I have made a quick adaptation of Yaoweibin's no_buffer patch for the newer nginx releases.

Weibin Yao (yaoweibin) is one of the developers working on the Tengine project: https://github.com/yaoweibin

Tengine (https://github.com/alibaba/tengine) is a web server originated by Taobao, the largest e-commerce website in Asia. It is based on the Nginx HTTP server and has many advanced features. Tengine has proven to be very stable and efficient on some of the top 100 websites in the world, including taobao.com and tmall.com.

At the moment, it is not possible to avoid the buffering of POST requests in NGINX. If you work with large file uploads to a backend, you know what I mean.

Tengine has a patch (by yaoweibin, I believe) to solve this, and it is listed as a feature on its web page: http://tengine.taobao.org/
  • Sends unbuffered upload directly to HTTP and FastCGI backend servers, which saves disk I/Os.
There is a pending ticket requesting this from the NGINX team (http://trac.nginx.org/nginx/ticket/251), but there is no ETA: http://forum.nginx.org/read.php?2,253626,253705#msg-253705

Finally, I chose to adapt Yaoweibin's patches (http://yaoweibin.cn/patches/) to nginx 1.7.6.

For me, it is working perfectly.

A CentOS RPM package is available in our repo: http://repo.enetres.net/repoview/nginx.html

The new options in the conf file are:
  • client_body_buffers
  • client_body_postpone_size
  • proxy_request_buffering
  • fastcgi_request_buffering
The description of these new options is on this Tengine page: http://tengine.taobao.org/document/http_core.html
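
As a rough sketch of how these directives fit together in the patched build (the upstream address, the PHP location and the sizes are placeholders/examples; see the Tengine page above for the exact syntax and defaults):

  http {
      upstream backend {
          server 10.0.0.10:8080;   # placeholder backend
      }

      server {
          listen 80;
          client_max_body_size 2g;

          location /upload {
              # Stream the request body to the backend instead of buffering it
              proxy_request_buffering   off;
              # Buffers used while relaying the unbuffered body
              client_body_buffers       16 64k;
              # Start relaying once this much data has arrived from the client
              client_body_postpone_size 64k;
              proxy_pass http://backend;
          }

          location ~ \.php$ {
              # The same idea for a FastCGI backend
              fastcgi_request_buffering off;
              fastcgi_pass 127.0.0.1:9000;
              include      fastcgi_params;
          }
      }
  }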

UPDATE
This patch is no longer necessary from nginx 1.7.11 onwards, which added request-buffering control natively.
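
With stock nginx 1.7.11 or later you only need the official proxy_request_buffering / fastcgi_request_buffering directives; a minimal sketch, again with a placeholder upstream:

  location /upload {
      # Official directive since nginx 1.7.11: stream the body to the backend
      proxy_request_buffering off;
      # HTTP/1.1 towards the backend, needed if the client sends a chunked body
      proxy_http_version      1.1;
      proxy_pass http://backend;
  }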

6 comments:

  1. I tested the patch today on Solaris, and it seems to kind of work. The upload gets proxied directly as expected. Unfortunately, when taking a closer look, nginx burns a full CPU during the upload. It seems that the event handling is completely messed up. The write event in the upstream direction is set all the time, even when no data to relay is present. This leads to a quasi busy waiting loop, eating a full CPU. Can you tell me where the write event is supposed to be deleted?

    Replies
    1. Hello Arne, thanks for writing.

      I would like to help you, but as you read in the post I just took the patch from Tengine (by Yaoweibin) and made a version/adaptation for nginx 1.7.6. Nothing else.

      For your information, I was profiling it on Linux and I don't see anything weird there, so could it be a bottleneck in the FastCGI (or similar) backend? It is possible that the backend is not consuming all the data coming from nginx. Is that possible?



    2. Hello Vicente,

      I was hoping that you had a deeper understanding of the event internals of nginx with regard to this patch. Never mind, I'll figure out what's going on sooner or later...
      The problem I see is with proxy_pass, not FastCGI. The backend consumes the data very quickly.
      Anyway, thanks for putting together the patch. Hopefully nginx will ship this feature natively very soon; I read something about this quarter.

    3. Yes sir, indeed, I hope so. This quarter seems to be the ETA.

      Thanks for answering and sorry for my "not-help" :(

    4. Pay attention to nginx version 1.7.11, guys!

    5. 1.7.11 and 1.7.12 are working perfectly. No more patches :P
