Sharing data when download the same object from upstream
Anatoli Marinov
a.marinov at ucdn.com
Fri Aug 30 08:31:26 UTC 2013
Hello,
On Wed, Aug 28, 2013 at 7:56 PM, Alex Garzão <alex.garzao at azion.com> wrote:
> Hello Anatoli,
>
> Thanks for your reply. I would really appreciate your help :-)
>
> I'm trying to fix the code with the following requirements in mind:
>
> 1) We have upstreams/downstreams with good (and bad) links; in
> general, upstream speed is higher than downstream speed but, in some
> situations, the downstream speed is much higher than the
> upstream speed;
>
I think this is asynchronous: if the upstream is faster than the
downstream, it saves the data to the cache file faster, and the downstream
gets the data from the file instead of from the memory buffers.
> 2) I'm trying to decouple the upstream speed from the downstream
> speed. The first request (the request that actually connects to the
> upstream) downloads data to the temp file, but no longer sends data to
> the downstream. I disabled this because, in my understanding, if the
> first request has a slow downstream, all other downstreams will wait
> for data to be sent to this slow downstream.
>
I think this is not necessary.
>
> My first question is: do I need to worry about downstream/upstream speed?
>
No.
> Well, I will try to explain what I did in the code:
>
> 1) I created an rbtree (current_downloads) that keeps the current
> downloads (one rbtree per upstream). Each node keeps the first request
> (the request that actually connects to the upstream) and a list
> (download_info_list) whose entries hold two fields: (a) a request
> waiting for data from the temp file and (b) the file offset already
> sent from the temp file (last_offset);
>
>
I have the same, but in an ordered array (a simple implementation). Anyway,
the rbtree will do the same job. But this structure should be in shared
memory, because all workers should know which files are currently being
downloaded from the upstream. They should exist in the tmp directory.
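
A minimal sketch of such an ordered array, keyed by a hash of the cache
key. All names here (dl_table_t, dl_find, dl_insert, dl_remove, DL_MAX) are
hypothetical, and the sketch uses a plain struct: in nginx the table would
have to be placed in a shared memory zone and every access would have to be
protected by the zone mutex, which is not shown.

```c
#include <stdint.h>
#include <string.h>

#define DL_MAX 1024                 /* hypothetical capacity */

typedef struct {
    uint32_t  key;                  /* hash of the cache key (assumption) */
} dl_entry_t;

typedef struct {
    size_t      n;                  /* number of entries in use */
    dl_entry_t  items[DL_MAX];      /* kept sorted by key */
} dl_table_t;

/* Binary search: index of the first entry whose key is >= key. */
static size_t
dl_lower_bound(const dl_table_t *t, uint32_t key)
{
    size_t  lo = 0, hi = t->n, mid;

    while (lo < hi) {
        mid = lo + (hi - lo) / 2;
        if (t->items[mid].key < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }

    return lo;
}

/* Returns 1 if a download with this key is in progress, 0 otherwise. */
static int
dl_find(const dl_table_t *t, uint32_t key)
{
    size_t  i = dl_lower_bound(t, key);

    return i < t->n && t->items[i].key == key;
}

/* Returns 0 if inserted, 1 if already present, -1 if the table is full. */
static int
dl_insert(dl_table_t *t, uint32_t key)
{
    size_t  i = dl_lower_bound(t, key);

    if (i < t->n && t->items[i].key == key) {
        return 1;
    }

    if (t->n == DL_MAX) {
        return -1;
    }

    /* shift the tail one slot to the right to keep the array sorted */
    memmove(&t->items[i + 1], &t->items[i],
            (t->n - i) * sizeof(dl_entry_t));
    t->items[i].key = key;
    t->n++;

    return 0;
}

/* Returns 0 if removed, -1 if the key was not present. */
static int
dl_remove(dl_table_t *t, uint32_t key)
{
    size_t  i = dl_lower_bound(t, key);

    if (i == t->n || t->items[i].key != key) {
        return -1;
    }

    memmove(&t->items[i], &t->items[i + 1],
            (t->n - i - 1) * sizeof(dl_entry_t));
    t->n--;

    return 0;
}
```

A worker would call dl_insert() before connecting to the upstream: a return
value of 1 means another request is already downloading the object, so the
new request should wait on the temp file instead.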
> 2) In ngx_http_upstream_init_request(), when the object isn't in the
> cache, before connecting to the upstream, I check whether the object is
> in the rbtree (current_downloads);
>
> 3) When the object isn't in current_downloads, I add a node that
> contains the first request (equal to the current request) and I add the
> current request to the download_info_list. Beyond that, I create a
> timer event (polling) that checks all requests in
> download_info_list and verifies whether there is data in the temp file
> that has not yet been sent to the downstream. I create one timer event
> per object [1].
>
> 4) When the object is in current_downloads, I add the request to
> download_info_list and leave ngx_http_upstream_init_request() (I
> just return without executing ngx_http_upstream_finalize_request());
>
> 5) I have disabled (in ngx_event_pipe) the code that sends data to
> downstream (requirement 2);
>
> 6) In the polling event, I get the current temp file offset
> (first_request->upstream->pipe->temp_file->offset) and, for each entry
> in download_info_list, I check whether it is greater than last_offset.
> If so, I send more data to the downstream with
> ngx_http_upstream_cache_send_partial (code below);
>
> 7) In the polling event, when pipe->upstream_done ||
> pipe->upstream_eof || pipe->upstream_error, and all data were sent to
> downstream, I execute ngx_http_upstream_finalize_request for all
> requests;
>
> 8) I added a bit flag (first_download_request) to the ngx_http_request_t
> struct to prevent the first request from being finished before all
> requests have completed. In ngx_http_upstream_finalize_request() I
> check this flag. But, really, I'm not sure whether it is necessary to
> avoid this situation...
>
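
Steps 6 and 7 above could be sketched roughly as follows. This is a
standalone illustration, not nginx code: waiter_t, send_fn and poll_waiters
are hypothetical names, and the callback stands in for the place where a
real implementation would call ngx_http_upstream_cache_send_partial().

```c
#include <stddef.h>
#include <sys/types.h>              /* off_t */

typedef struct waiter_s  waiter_t;

struct waiter_s {
    off_t      last_offset;         /* temp file bytes already sent */
    int        done;                /* set once the request is finished */
    waiter_t  *next;
};

/* Hypothetical callback: send [offset, offset + len) to one downstream. */
typedef void (*send_fn)(waiter_t *w, off_t offset, off_t len, int last);

/*
 * One polling pass. file_offset is how far the temp file has been
 * written; upstream_done is non-zero once the upstream has finished.
 * Returns the number of waiters that still need another pass.
 */
static size_t
poll_waiters(waiter_t *head, off_t file_offset, int upstream_done,
    send_fn send)
{
    off_t      len;
    size_t     pending = 0;
    waiter_t  *w;

    for (w = head; w != NULL; w = w->next) {
        if (w->done) {
            continue;
        }

        len = file_offset - w->last_offset;
        if (len < 0) {
            len = 0;                /* defensive: never send backwards */
        }

        /* send new data, or an empty final buffer when upstream is done */
        if (len > 0 || upstream_done) {
            if (send != NULL) {
                send(w, w->last_offset, len, upstream_done);
            }
            w->last_offset += len;
        }

        if (upstream_done) {
            w->done = 1;            /* step 7: finalize this request */
        } else {
            pending++;
        }
    }

    return pending;
}
```

The sketch sends everything that is pending in a single call per waiter; it
does not handle a downstream that cannot accept the data yet.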
>
> Below you can see the ngx_http_upstream_cache_send_partial code:
>
>
> /////////////
> static ngx_int_t
> ngx_http_upstream_cache_send_partial(ngx_http_request_t *r,
>     ngx_temp_file_t *file, off_t offset, off_t bytes, unsigned last_buf)
> {
>     ngx_buf_t         *b;
>     ngx_chain_t        out;
>     ngx_http_cache_t  *c;
>
>     c = r->cache;
>
>     /* we need to allocate all before the header would be sent */
>
>     b = ngx_pcalloc(r->pool, sizeof(ngx_buf_t));
>     if (b == NULL) {
>         return NGX_HTTP_INTERNAL_SERVER_ERROR;
>     }
>
>     b->file = ngx_pcalloc(r->pool, sizeof(ngx_file_t));
>     if (b->file == NULL) {
>         return NGX_HTTP_INTERNAL_SERVER_ERROR;
>     }
>
>     /* FIX: need to run ngx_http_send_header(r) once... */
>
>     /*
>      * file_pos and file_last are absolute file offsets, so this
>      * assumes "bytes" is the end offset of the range; if it is a
>      * length, file_last should be offset + bytes
>      */
>     b->file_pos = offset;
>     b->file_last = bytes;
>
>     b->in_file = 1;
>     b->last_buf = last_buf;
>     b->last_in_chain = 1;
>
>     b->file->fd = file->file.fd;
>     b->file->name = file->file.name;
>     b->file->log = r->connection->log;
>
>     out.buf = b;
>     out.next = NULL;
>
>     return ngx_http_output_filter(r, &out);
> }
> ////////////
>
> My second question is: could I just fix ngx_event_pipe to send to all
> requests (instead of sending to one request)? And, if so, can
> ngx_http_output_filter be used to send a big chunk the first time
> (300 MB or more) and small chunks after that?
>
>
Use smaller chunks.
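
As an illustration only, the size of the next chunk could be capped like
this. next_chunk_len and CHUNK_MAX are hypothetical names, and the 1 MiB
limit is an arbitrary assumption rather than a recommended value.

```c
#include <sys/types.h>              /* off_t */

#define CHUNK_MAX  ((off_t) 1024 * 1024)    /* hypothetical 1 MiB cap */

/*
 * Returns how many bytes to send next: the temp file data between
 * "sent" and "available", limited to chunk_max bytes per call.
 */
static off_t
next_chunk_len(off_t sent, off_t available, off_t chunk_max)
{
    off_t  pending;

    if (available <= sent || chunk_max <= 0) {
        return 0;                   /* nothing new to send */
    }

    pending = available - sent;

    return pending < chunk_max ? pending : chunk_max;
}
```

With a 300 MB temp file and nothing sent yet, the first call yields one
CHUNK_MAX-sized chunk instead of a single 300 MB buffer, and the caller
advances its offset by the returned length before asking again.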
> Thanks in advance for your attention :-)
>
> [1] I know that a "polling event" is a bad approach with NGINX, but I
> don't know how to fix this. For example, the upstream download can be
> very quick, and it is possible that I need to send data to the
> downstream in small chunks. Upstream (in NGINX) is socket-event based,
> but, when the download from upstream has finished, which event can I
> expect?
>
> Regards.
> --
> Alex Garzão
> Projetista de Software
> Azion Technologies
> alex.garzao (at) azion.com
>
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx-devel
>
You are on the right track. Just keep digging. Do not forget to turn this
feature off for flv or mp4 seeks, partial (range) requests, and any
Content-Encoding other than identity, because otherwise you will send
broken files to the browsers.
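
A rough sketch of such a check, written against a simplified request
description rather than the real ngx_http_request_t. req_info_t,
has_arg and can_share_download are hypothetical names, and treating a
"start" query argument as the flv/mp4 seek marker is an assumption about
how seeking is requested.

```c
#include <string.h>
#include <strings.h>                /* strcasecmp */

typedef struct {
    const char  *range;             /* Range request header, or NULL */
    const char  *content_encoding;  /* upstream Content-Encoding, or NULL */
    const char  *args;              /* query string, or NULL */
} req_info_t;                       /* hypothetical, simplified */

/* Returns 1 if the query string contains an argument called "name". */
static int
has_arg(const char *args, const char *name)
{
    size_t       len = strlen(name);
    const char  *p = args;

    while (p != NULL && *p != '\0') {
        if (strncmp(p, name, len) == 0 && p[len] == '=') {
            return 1;
        }

        p = strchr(p, '&');         /* jump to the next argument */
        if (p != NULL) {
            p++;
        }
    }

    return 0;
}

/* Returns 1 if the request may share an in-progress download. */
static int
can_share_download(const req_info_t *r)
{
    if (r->range != NULL) {
        return 0;                   /* partial request */
    }

    if (r->content_encoding != NULL
        && strcasecmp(r->content_encoding, "identity") != 0)
    {
        return 0;                   /* encoded body, e.g. gzip */
    }

    if (r->args != NULL && has_arg(r->args, "start")) {
        return 0;                   /* assumed flv/mp4 seek */
    }

    return 1;
}
```

Requests that fail the check would go through the normal upstream path.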