'''ngx_chunkin''' - HTTP 1.1 chunked-encoding request body support for Nginx.

''This module is not distributed with the Nginx source.'' See [[#Installation|the installation instructions]].

This module is considered production ready.

This document describes chunkin-nginx-module [http://github.com/agentzh/chunkin-nginx-module/tags v0.22] released on October 14, 2011.
A typical configuration looks like this:

 chunkin on;
 error_page 411 = @my_411_error;
 location @my_411_error {
     chunkin_resume;
 }
 # your fastcgi_pass/proxy_pass/set/if and
 # any other config directives go here...

With the experimental keep-alive support turned on:

 chunkin on;
 error_page 411 = @my_411_error;
 location @my_411_error {
     chunkin_resume;
 }
 chunkin_keepalive on;  # WARNING: too experimental!
 # your fastcgi_pass/proxy_pass/set/if and
 # any other config directives go here...

This module adds [http://tools.ietf.org/html/rfc2616#section-3.6.1 HTTP 1.1 chunked] input support for Nginx without patching the Nginx core.

Behind the scenes, it registers an access-phase handler that eagerly reads and decodes incoming request bodies when a <code>Transfer-Encoding: chunked</code> header triggers a <code>411</code> error page in Nginx. For requests that do not use the <code>chunked</code> transfer encoding, this module is a no-op.
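To make the trigger condition concrete, here is a minimal sketch of what such a request looks like on the wire. The helper name <code>build_chunked_request</code>, the path, and the host are illustrative assumptions, not part of this module; the framing follows the chunked transfer coding in RFC 2616.

```python
def build_chunked_request(path, host, chunks):
    """Serialize a POST request whose body uses the chunked transfer coding.

    Hypothetical helper for illustration only; it is not part of ngx_chunkin.
    """
    head = (
        "POST %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n" % (path, host)
    ).encode("ascii")
    # Each chunk is: hex size, CRLF, data, CRLF. Empty chunks are skipped
    # because a zero-size chunk marks the end of the body.
    body = b"".join(b"%x\r\n%s\r\n" % (len(c), c) for c in chunks if c)
    # The last-chunk ("0") followed by an empty trailer section.
    return head + body + b"0\r\n\r\n"


request = build_chunked_request("/main", "localhost", [b"hello", b", world"])
print(request.decode("ascii"))
```

Note that the request carries no <code>Content-Length</code> header, which is the situation that makes Nginx answer with <code>411</code>.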

To enable it, turn on the [[#chunkin|chunkin]] config option and define a custom <code>411 error_page</code> using [[#chunkin_resume|chunkin_resume]], like this:

 chunkin on;
 error_page 411 = @my_411_error;
 location @my_411_error {
     chunkin_resume;
 }

No other modification is required in your nginx.conf file, and everything should work out of the box, including the standard [[HttpProxyModule|proxy module]] (apart from the [[#Known Issues|known issues]]). Note that the [[#chunkin|chunkin]] directive is not allowed in <code>location</code> blocks, while the [[#chunkin_resume|chunkin_resume]] directive is allowed only in <code>location</code> blocks.

The core module's [[HttpCoreModule#client_body_buffer_size|client_body_buffer_size]], [[HttpCoreModule#client_max_body_size|client_max_body_size]], and [[HttpCoreModule#client_body_timeout|client_body_timeout]] directive settings are honored. Note that the "body sizes" here always refer to the chunked-encoded body, not the data that has already been decoded. The chunked-encoded body is always slightly larger than the original, unencoded data.
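As a rough illustration of that size difference, the following sketch counts the bytes a payload occupies once it is chunk-encoded. The function name and the fixed chunk length are assumptions for the example; it ignores chunk extensions and trailers.

```python
def chunked_size(payload_len, chunk_len):
    """Bytes on the wire for payload_len bytes sent in chunks of chunk_len.

    Illustrative helper only; assumes no chunk extensions and no trailers.
    """
    total = 0
    remaining = payload_len
    while remaining > 0:
        n = min(chunk_len, remaining)
        # hex size line + CRLF + chunk data + CRLF
        total += len(format(n, "x")) + 2 + n + 2
        remaining -= n
    # last-chunk "0" + CRLF, then the CRLF that ends the body
    return total + len(b"0\r\n\r\n")


raw = 10 * 1024
encoded = chunked_size(raw, 1024)
print(raw, encoded, encoded - raw)
```

With 1 KB chunks the framing adds 7 bytes per chunk plus 5 bytes for the terminator, so a limit that is sized exactly for the decoded payload would be slightly too small.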

The [[HttpCoreModule#client_body_in_file_only|client_body_in_file_only]] and [[HttpCoreModule#client_body_in_single_buffer|client_body_in_single_buffer]] settings are followed only partially. See [[#Known Issues|Known Issues]].

This module is not supposed to be merged into the Nginx core because I've used [http://www.complang.org/ragel/ Ragel] to generate the chunked encoding parser for joy :)

Nginx explicitly checks for a chunked <code>Transfer-Encoding</code> header combined with an absent <code>Content-Length</code> header very early in request processing, as early as the <code>ngx_http_process_request_header</code> function. So this module takes a rather tricky approach: it uses an output filter to intercept the <code>411 Length Required</code> error page response issued by <code>ngx_http_process_request_header</code>, fixes things up, and finally issues an internal redirect to the current location, thus starting again from those phases we all know and love, this time bypassing the horrible <code>ngx_http_process_request_header</code> function.

In the <code>rewrite</code> phase of the newly created request, this module eagerly reads in the chunked request body in a way similar to that of the standard <code>ngx_http_read_client_request_body</code> function, but using its own chunked parser generated by Ragel. The decoded request body will be put into <code>r->request_body->bufs</code> and a corresponding <code>Content-Length</code> header will be inserted into <code>r->headers_in</code>.
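The decoding step can be pictured with a small sketch. This is plain Python written for illustration, not the module's Ragel-generated C parser, and it assumes the whole body is already in memory. The function name <code>decode_chunked</code> is hypothetical. Chunk extensions are skipped and trailers are rejected.

```python
def decode_chunked(data):
    """Return (decoded_body, leftover_bytes) for a complete chunked body.

    Illustrative sketch only: not the module's parser, and it does not
    handle trailers or input that arrives incrementally.
    """
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # Drop any chunk-extension after ';' and parse the hex size.
        size_field = data[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        chunk = data[pos:pos + size]
        if len(chunk) != size or data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("bad chunked body near offset %d" % pos)
        body += chunk
        pos += size + 2
    if data[pos:pos + 2] != b"\r\n":
        raise ValueError("trailers are not handled in this sketch")
    return bytes(body), data[pos + 2:]


wire = b"5\r\nhello\r\n7;ext=1\r\n, world\r\n0\r\n\r\nGET / HTTP/1.1\r\n"
decoded, leftover = decode_chunked(wire)
content_length = str(len(decoded))  # the value a Content-Length header would carry
print(decoded, content_length, leftover)
```

The leftover bytes after the terminating chunk are what a pipelined follow-up request would consist of.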

Modules that use the standard <code>ngx_http_read_client_request_body</code> function to read the request body will just work out of the box, because <code>ngx_http_read_client_request_body</code> returns immediately when it sees that <code>r->request_body->bufs</code> already exists.

Special efforts have been made to reduce data copying and dynamic memory allocation.
= Directives =

== chunkin ==
'''syntax:''' ''chunkin on|off''

'''default:''' ''off''

'''context:''' ''http, server''

'''phase:''' ''access''

Enables or disables this module's hooks.

== chunkin_resume ==
'''syntax:''' ''chunkin_resume''

'''default:''' ''no''

'''context:''' ''location''

'''phase:''' ''content''

This directive must be used in your custom <code>411 error page</code> location to help this module work correctly. For example:

 error_page 411 = @my_error;
 location @my_error {
     chunkin_resume;
 }

For the technical reason why this directive is necessary, please read the <code>nginx-devel</code> thread [http://nginx.org/pipermail/nginx-devel/2009-December/000041.html Content-Length is not ignored for chunked requests: Nginx violates RFC 2616].

This directive was first introduced in the [[#v0.17|v0.17]] release.
== chunkin_max_chunks_per_buf ==
'''syntax:''' ''chunkin_max_chunks_per_buf <number>''

'''default:''' ''512''

'''context:''' ''http, server, location''

Sets the maximum chunk count threshold for the buffer whose size is determined by the [[HttpCoreModule#client_body_buffer_size|client_body_buffer_size]] directive. If the average chunk size is <code>1 KB</code> and your [[HttpCoreModule#client_body_buffer_size|client_body_buffer_size]] setting is 1 megabyte, then you should set this threshold to <code>1024</code> or <code>2048</code>.
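The arithmetic behind that recommendation can be sketched as follows. The helper name and the <code>headroom</code> factor are assumptions made for this example, not settings of the module.

```python
def suggested_max_chunks(buffer_bytes, avg_chunk_bytes, headroom=1):
    """Chunks that fit into the buffer, optionally multiplied by a safety factor.

    Hypothetical helper; 'headroom' is an illustrative knob, not a directive.
    """
    per_buffer = -(-buffer_bytes // avg_chunk_bytes)  # ceiling division
    return headroom * per_buffer


one_megabyte = 1024 * 1024
print("chunkin_max_chunks_per_buf %d;" % suggested_max_chunks(one_megabyte, 1024))
print("chunkin_max_chunks_per_buf %d;" % suggested_max_chunks(one_megabyte, 1024, headroom=2))
```

The first value is the number of average-sized chunks that fill the buffer exactly; the doubled value leaves room for bodies whose chunks are smaller than average.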

When the raw body size exceeds [[HttpCoreModule#client_body_buffer_size|client_body_buffer_size]] ''or'' the chunk counter exceeds this <code>chunkin_max_chunks_per_buf</code> setting, the decoded data is temporarily buffered into disk files, the main buffer is cleared, and the chunk counter is reset to 0 (or <code>1</code> if there is a "pending chunk").
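To see how the two limits interact, here is a deliberately simplified simulation. It is an assumption-laden model, not the module's code: it tracks only chunk data sizes, treats the chunk that triggers a flush as the first chunk of the fresh buffer, and ignores chunks larger than the buffer itself.

```python
def count_flushes(chunk_sizes, buffer_bytes, max_chunks):
    """Count how often the buffer would be flushed to disk.

    Simplified model for illustration; names and behavior are assumptions.
    """
    flushes = 0
    buffered = 0   # bytes currently held in the main buffer
    chunks = 0     # chunks currently held in the main buffer
    for size in chunk_sizes:
        if buffered + size > buffer_bytes or chunks + 1 > max_chunks:
            flushes += 1
            buffered = 0
            chunks = 0
        buffered += size
        chunks += 1
    return flushes


body = [512] * 2048            # 2048 chunks of 512 bytes, 1 MB in total
print(count_flushes(body, 1024 * 1024, 512))
print(count_flushes(body, 1024 * 1024, 2048))
```

In this model, many small chunks hit the chunk-count limit long before the buffer is full, which is why raising the threshold can avoid disk buffering for such bodies.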

This directive was first introduced in the [[#v0.17|v0.17]] release.

== chunkin_keepalive ==
'''syntax:''' ''chunkin_keepalive on|off''

'''default:''' ''off''

'''context:''' ''http, server, location, if''

Turns HTTP 1.1 keep-alive and HTTP 1.1 pipelining support on or off.

Keep-alive without pipelining should be quite stable, but pipelining support is very preliminary, limited, and almost untested.

This directive was first introduced in the [[#v0.07|v0.07 release]].

'''Technical note on the HTTP 1.1 pipelining support'''

The basic idea is to copy the bytes left by my chunked parser in <code>r->request_body->buf</code> over into <code>r->header_in</code> so that nginx's <code>ngx_http_set_keepalive</code> and <code>ngx_http_init_request</code> functions will pick them up for the subsequent pipelined requests. When the request body is small enough to be completely preread into the <code>r->header_in</code> buffer, no data copy is needed: just setting <code>r->header_in->pos</code> correctly will suffice.

The only issue that remains is how to enlarge <code>r->header_in</code> when the data left in <code>r->request_body->buf</code> is too large to be held in the remaining room between <code>r->header_in->pos</code> and <code>r->header_in->end</code>. For now, this module will just give up and simply turn off <code>r->keepalive</code>.
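The decision described above can be modeled in a few lines. This is a toy model under stated assumptions: the <code>Buf</code> class only mimics the <code>pos</code>, <code>last</code>, and <code>end</code> fields of an nginx buffer, and <code>carry_over</code> is a hypothetical name, not a function in the module.

```python
from dataclasses import dataclass


@dataclass
class Buf:
    """Toy stand-in for an nginx buffer (illustrative only)."""
    data: bytearray   # backing storage
    pos: int          # index of the first unread byte
    last: int         # index one past the last valid byte

    @property
    def end(self):
        return len(self.data)


def carry_over(header_in, leftover):
    """Copy leftover pipelined bytes into header_in if they fit.

    Returns the resulting keep-alive flag: False means "give up".
    """
    room = header_in.end - header_in.pos
    if len(leftover) > room:
        return False  # not enough room, so keep-alive is turned off
    header_in.data[header_in.pos:header_in.pos + len(leftover)] = leftover
    header_in.last = header_in.pos + len(leftover)
    return True


roomy = Buf(bytearray(16), pos=4, last=4)
tight = Buf(bytearray(16), pos=12, last=12)
print(carry_over(roomy, b"GET / HTTP"), carry_over(tight, b"GET / HTTP"))
```

The second call fails because only 4 bytes remain between <code>pos</code> and <code>end</code>, which mirrors the case where the module disables keep-alive.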

I know we can always use exactly the remaining room in <code>r->header_in</code> as the buffer size when reading data from <code>c->recv</code>, but that's suboptimal when the remaining room in <code>r->header_in</code> happens to be very small while <code>r->request_body->buf</code> is quite large.

I haven't fully grokked all the details among <code>r->header_in</code>, <code>c->buffer</code>, busy/free lists and those so-called "large header buffers". Is there a clean and safe way to reallocate or extend the <code>r->header_in</code> buffer?
= Installation =

Grab the nginx source code from [http://nginx.org/ nginx.org], for example, version 1.0.8 (see [[#Compatibility|nginx compatibility]]), and then build the source with this module:

 wget 'http://nginx.org/download/nginx-1.0.8.tar.gz'
 tar -xzvf nginx-1.0.8.tar.gz
 cd nginx-1.0.8/
 # Here we assume you would install your nginx under /opt/nginx/.
 ./configure --prefix=/opt/nginx \
     --add-module=/path/to/chunkin-nginx-module
 make
 make install

Download the latest version of the release tarball of this module from [http://github.com/agentzh/chunkin-nginx-module/tags chunkin-nginx-module file list].

The chunked parser is generated by [http://www.complang.org/ragel/ Ragel]. If you want to regenerate the parser's C file, i.e., [http://github.com/agentzh/chunkin-nginx-module/blob/master/src/chunked_parser.c src/chunked_parser.c], use the following command from the root of the chunkin module's source tree:

 $ ragel -G2 src/chunked_parser.rl
= Packages from users =

== Fedora 13 RPM files ==

The following source and binary rpm files are contributed by Ernest Folch, with nginx 0.8.54, ngx_chunkin v0.21 and ngx_headers_more v0.13:

* [http://agentzh.org/misc/nginx/ernest/nginx-0.8.54-1.fc13.src.rpm nginx-0.8.54-1.fc13.src.rpm]
* [http://agentzh.org/misc/nginx/ernest/nginx-0.8.54-1.fc13.x86_64.rpm nginx-0.8.54-1.fc13.x86_64.rpm]
= Compatibility =

The following versions of Nginx should work with this module:

* '''1.1.x''' (last tested: 1.1.5)
* '''1.0.x''' (last tested: 1.0.8)
* '''0.8.x''' (last tested: 0.8.54)
* '''0.7.x >= 0.7.21''' (last tested: 0.7.67)

Earlier versions of Nginx like 0.6.x and 0.5.x will ''not'' work.

If you find that any particular version of Nginx above 0.7.21 does not work with this module, please consider [[#Report Bugs|reporting a bug]].
= Report Bugs =

Although a lot of effort has been put into testing and code tuning, there may still be serious bugs lurking somewhere in this module. So whenever you are bitten by any quirks, please don't hesitate to

# send a bug report or even patches to <agentzh@gmail.com>,
# or create a ticket on the [http://github.com/agentzh/chunkin-nginx-module/issues issue tracking interface] provided by GitHub.

= Source Repository =

Available on GitHub at [http://github.com/agentzh/chunkin-nginx-module agentzh/chunkin-nginx-module].
= ChangeLog =

* now we remove the request header Transfer-Encoding completely because at least Apache will complain about the empty-value <code>Transfer-Encoding</code> request header. thanks hoodoos and Sandesh Kotwal.
* now we allow DELETE requests with chunked request bodies per hoodoos's request.
* now we use the 2-clause BSD license.

* applied a patch from Gong Kaihui (龚开晖) to always call <code>post_handler</code> in <code>ngx_http_chunkin_read_chunked_request_body</code>.

* fixed a bug that may read incomplete chunked body. thanks Gong Kaihui (龚开晖).
* fixed various memory issues in the implementation which may cause nginx processes to crash.
* added support for chunked PUT requests.
* now we always require "error_page 411 @resume" and no default (buggy) magic any more. thanks Gong Kaihui (龚开晖).

* we now use ragel -G2 to generate the chunked parser and we're 36% faster.
* we now eagerly read the data octets in the chunked parser and we're 43% faster.

* added support for <code>chunk-extension</code> to the chunked parser as per [http://tools.ietf.org/html/rfc2616#section-3.6.1 RFC 2616], but we just ignore them (if any) because we don't understand them.
* added more diagnostic information for certain error messages.

* implemented the [[#chunkin_max_chunks_per_buf|chunkin_max_chunks_per_buf]] directive to allow overriding the default <code>512</code> setting.
* we now bypass nginx's [http://nginx.org/pipermail/nginx-devel/2009-December/000041.html discard request body bug] by requiring our users to define explicit <code>411 error_page</code> with [[#chunkin_resume|chunkin_resume]] in the error page location. Thanks J for reporting related bugs.
* fixed <code>r->phase_handler</code> in our post read handler. our handler may run one more time before :P
* the chunkin handler now returns <code>NGX_DECLINED</code> rather than <code>NGX_OK</code> when our <code>ngx_http_chunkin_read_chunked_request_body</code> function returns <code>NGX_OK</code>, to avoid bypassing other access-phase handlers.

* turned off ddebug in the previous release. thanks J for reporting it.

* fixed a regression that <code>ctx->chunks_count</code> never incremented in earlier versions.

* now we no longer skip those operations between the (interrupted) <code>ngx_http_process_request_header</code> and the server rewrite phase. this fixed the security issues regarding the [[HttpCoreModule#internal|internal]] directive as well as SSL sessions.
* try to ignore CR/LF/SP/HT at the beginning of the chunked body.
* now we allow HT as padding spaces and ignore leading CRLFs.
* improved diagnostic info in the error.log messages when parse failures occur.

* added a random valid-chunked-request generator in t/random.t.
* fixed a new connection leak issue caught by t/random.t.

* fixed a serious bug in the chunked parser grammar: there would be ambiguity when CRLF appears in the chunked data sections. Thanks J for reporting it.

* fixed gcc compilation errors on x86_64, thanks J for reporting it.
* used the latest Ragel 6.6 to generate the <code>chunked_parser.c</code> file in the source tree.

* marked the discarded 411 error page's output chain bufs as consumed by setting <code>buf->pos = buf->last</code>. (See [http://nginx.org/pipermail/nginx-devel/2009-December/000025.html this nginx-devel thread] for more details.)
* added the [[#chunkin_keepalive|chunkin_keepalive]] directive which can enable HTTP 1.1 keep-alive and HTTP 1.1 pipelining, and defaults to <code>off</code>.
* fixed the <code>alphtype</code> bug in the Ragel parser spec, which caused rejection of non-ascii octets in the chunked data. Thanks J for his bug report.
* added <code>Test::Nginx::Socket</code> to test our nginx module on the socket level. Thanks J for his bug report.
* rewrote the bufs recycling part and preread-buf-to-rb-buf transition part, also refactored the Ragel parser spec, thus eliminating lots of serious bugs.
* provided better diagnostics in the error log message for "bad chunked body" parse failures in the chunked parser. For example:

 2009/12/02 17:35:52 [error] 32244#0: *1 bad chunked body (offset 7, near "4^M
 ", marked by " <-- HERE ").
 , client: 127.0.0.1, server: localhost, request: "POST /main
 HTTP/1.1", host: "localhost"

* added some code to let the chunked parser handle special 0-size chunks that are not the last chunk.
* fixed a connection leak bug regarding incorrect <code>r->main->count</code> reference counter handling for nginx 0.8.11+ (well, the <code>ngx_http_read_client_request_body</code> function in the nginx core also has this issue, I'll report it later.)

* minor optimization: we won't traverse the output chain link if the chain count is not large enough.
= Test Suite =

This module comes with a Perl-driven test suite. The [http://github.com/agentzh/chunkin-nginx-module/tree/master/test/t/ test cases] are [http://github.com/agentzh/chunkin-nginx-module/blob/master/test/t/sanity.t declarative] too, thanks to the [http://search.cpan.org/perldoc?Test::Base Test::Base] module in the Perl world.

To run it on your side:

 $ PATH=/path/to/your/nginx-with-chunkin-module:$PATH prove -r t

You need to terminate any Nginx processes before running the test suite if you have changed the Nginx server binary.

At the moment, [http://search.cpan.org/perldoc?LWP::UserAgent LWP::UserAgent] is used by the [http://github.com/agentzh/chunkin-nginx-module/blob/master/test/lib/Test/Nginx/LWP.pm test scaffold] for simplicity.

Because a single nginx server (by default, <code>localhost:1984</code>) is used across all the test scripts (<code>.t</code> files), it's meaningless to run the test suite in parallel by specifying <code>-jN</code> when invoking the <code>prove</code> utility.

Some parts of the test suite require the [[HttpProxyModule|proxy]] and [[HttpEchoModule|echo]] modules to be enabled as well when building Nginx.
= Known Issues =

* May not work with certain 3rd party modules like the [http://www.grid.net.ru/nginx/upload.en.html upload module] because it implements its own request body reading mechanism.
* "client_body_in_single_buffer on" may ''not'' be obeyed for short contents and fast networks.
* "client_body_in_file_only on" may ''not'' be obeyed for short contents and fast networks.
* HTTP 1.1 pipelining may not fully work yet.

= TODO =

* make the chunkin handler run at the end of the <code>access phase</code> rather than at the beginning.
* add support for <code>trailers</code> as specified in [http://tools.ietf.org/html/rfc2616#section-3.6.1 RFC 2616].
* fix the [[#Known Issues|known issues]].

You are very welcome to submit patches to the [[#Author|author]] or just ask for a commit bit to the [[#Source Repository|source repository]] on GitHub.

= Author =

Zhang "agentzh" Yichun (章亦春) ''<agentzh@gmail.com>''

This wiki page is also maintained by the author himself, and everybody is encouraged to improve this page as well.
= Copyright & License =

The basic client request body reading code is based on the <code>ngx_http_read_client_request_body</code> function and its utility functions in the Nginx 0.8.20 core. This part of the code is copyrighted by Igor Sysoev.

Copyright (c) 2009, Taobao Inc., Alibaba Group ( http://www.taobao.com ).

Copyright (c) 2009, 2010, 2011, Zhang "agentzh" Yichun (章亦春) <agentzh@gmail.com>.

This module is licensed under the terms of the BSD license.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the Taobao Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
= See Also =

* The original thread on the Nginx mailing list that inspired this module's development: [http://forum.nginx.org/read.php?2,4453,20543 "'Content-Length' header for POSTs"].
* The original announcement thread on the Nginx mailing list: [http://forum.nginx.org/read.php?2,22967 "The chunkin module: Experimental chunked input support for Nginx"].
* The original [http://agentzh.spaces.live.com/blog/cns!FF3A735632E41548!481.entry blog post] about this module's initial development.
* The thread discussing chunked input support on the nginx-devel mailing list: [http://nginx.org/pipermail/nginx-devel/2009-December/000021.html "Chunked request body and HTTP header parser"].
* The [[HttpEchoModule|echo module]] for Nginx module's automated testing.
* [http://tools.ietf.org/html/rfc2616#section-3.6.1 RFC 2616 - Chunked Transfer Coding].