[PATCH] Add optional "&exact=1" CGI arg to show video between keyframes

Tracey Jaquith tracey at archive.org
Sat Jun 12 23:01:05 UTC 2021


Hello Roman,

> On Jun 9, 2021, at 5:10 AM, Roman Arutyunyan <arut at nginx.com> wrote:
> 
> Hello Tracey.
> 
> Thanks for your patch, it looks very interesting.

Yay! :) 

> 
> On Thu, Jun 03, 2021 at 07:54:49PM +0000, Tracey Jaquith wrote:
>> # HG changeset patch
>> # User Tracey Jaquith <tracey at archive.org>
>> # Date 1622678642 0
>> #      Thu Jun 03 00:04:02 2021 +0000
>> # Node ID 5da9c62fa61016600f2c59982ae184e2811be427
>> # Parent  5f765427c17ac8cf753967387562201cf4f78dc4
>> Add optional "&exact=1" CGI arg to show video between keyframes.
> 
> I think this can be a directive (like mp4_buffer_size/mp4_max_buffer_size)
> rather than an argument.  Is it possible than we need both exact and non-exact
> in the same location?  It feels like we don't need to give clients control over
> this.  The exact mode introduces some overhead and should only be used when
> needed.

That’s a fine idea.  
So far, we’ve been using “exact clipping” at archive.org <http://archive.org/> both ways — but there’s some interest from our TV team to just always do the “exact” mode, anyway.
So that could work for us.
That something I should look into for a revised / 2nd patch, sounds like?

> 
>> archive.org has been using mod_h264_streaming with an "&exact=1" patch from me since 2013.
>> We just moved to nginx mp4 module and are using this patch.
>> The technique is to find the keyframe just before the desired "start" time, and send
>> that down the wire so video playback can start immediately.
>> Next calculate how many video samples are between the keyframe and desired "start" time
>> and update the STTS atom where those samples move the duration from (typically) 1001 to 1.
>> This way, initial unwanted video frames play at ~1/30,000s -- so visually the
>> video & audio start playing immediately.
>> 
>> You can see an example before/after here (nginx binary built with mp4 module + patch):
>> 
>> https://pi.archive.org/0/items/CSPAN_20160425_022500_2011_White_House_Correspondents_Dinner.mp4?start=12&end=30
>> https://pi.archive.org/0/items/CSPAN_20160425_022500_2011_White_House_Correspondents_Dinner.mp4?start=12&end=30&exact=1
>> 
>> Tested on linux and macosx.
>> 
>> I realize two var declarations should style-wise get moved "up" in the lower hooked in method.
>> However, to minimize the changeset/patch, I figured we should at least start here and see what you folks think.
> 
> Syle-wise there are a few minor issues.  We'll figure these out later.

Perfect thanks.  I figured _at least_ those var decls but am sure there are probably some more since I’m stepping into your house :)

> 
> Below are my comments on ways to simplify the code.
> 
>> (this is me: https://github.com/traceypooh )
>> 
>> diff -r 5f765427c17a -r 5da9c62fa610 src/http/modules/ngx_http_mp4_module.c
>> --- a/src/http/modules/ngx_http_mp4_module.c	Tue Jun 01 17:37:51 2021 +0300
>> +++ b/src/http/modules/ngx_http_mp4_module.c	Thu Jun 03 00:04:02 2021 +0000
>> @@ -2045,6 +2045,109 @@
>>     u_char    duration[4];
>> } ngx_mp4_stts_entry_t;
>> 
>> +typedef struct {
>> +    uint32_t              speedup_samples;
>> +    ngx_uint_t            speedup_seconds;
>> +} ngx_mp4_exact_t;
>> +
>> +static void
>> +exact_video_adjustment(ngx_http_mp4_file_t *mp4, ngx_http_mp4_trak_t *trak, ngx_mp4_exact_t *exact)
>> +{
>> +    // will parse STTS -- time-to-sample atom
>> +    ngx_str_t             value;
>> +    ngx_buf_t            *stts_data;
>> +    ngx_buf_t            *atom;
>> +    ngx_mp4_stts_entry_t *stts_entry, *stts_end;
>> +    uint32_t              count, duration, j, n, sample_keyframe, sample_num;
>> +    uint64_t              sample_time, seconds, start_seconds_closest_keyframe;
>> +    uint8_t               is_keyframe;
>> +
>> +    exact->speedup_samples = 0;
>> +    exact->speedup_seconds = 0;
>> +
>> +    // if '&exact=1' CGI arg isn't present, do nothing
>> +    if (!(ngx_http_arg(mp4->request, (u_char *) "exact", 5, &value) == NGX_OK)) {
>> +        return;
>> +    }
>> +
>> +    if (!trak->sync_samples_entries) {
>> +        // Highly unlikely video STSS got parsed and _every_ sample is a keyframe.
>> +        // However, if the case, we don't need to adjust the video at all.
>> +        return;
>> +    }
>> +
>> +    // check HDLR atom to see if this trak is video or audio
>> +    atom = trak->out[NGX_HTTP_MP4_HDLR_ATOM].buf;
>> +    // 'vide' or 'soun'
>> +    if (!(atom->pos[16] == 'v' &&
>> +          atom->pos[17] == 'i' &&
>> +          atom->pos[18] == 'd' &&
>> +          atom->pos[19] == 'e')) {
>> +        return; // do nothing if not video
>> +    }
> 
> Do we really need to check this?  If a track has an stss atom, this seems
> enough to do the work.

With the .mp4 that we make at archive.org <http://archive.org/>, I am seeing STSS consistently for the audio track.
That’s pointing out the start times of the audio samples, so that sort of makes sense to me.
So I found we needed a way to do nothing for any audio tracks.

I guess… we _could_ maybe run the same logic on any audio track(s) since, presumably,
each audio sample is a “keyframe” (and so theoretically we should do nothing below)?
If you’d prefer something like that, I’d need to do a little digging/testing.


> 
>> +    stts_data = trak->out[NGX_HTTP_MP4_STTS_DATA].buf;
>> +    stts_entry = (ngx_mp4_stts_entry_t *) stts_data->pos;
>> +    stts_end   = (ngx_mp4_stts_entry_t *) stts_data->last;
>> +
>> +    sample_num = 0; // they start at one <shrug>
>> +    sample_time = 0;
>> +    start_seconds_closest_keyframe = 0;
>> +    while (stts_entry < stts_end) {
>> +        // STTS === time-to-sample atom
>> +        //    each entry is 4B and [sample count][sample duration]  (since durations can vary)
>> +        count = ngx_mp4_get_32value(stts_entry->count);
>> +        duration = ngx_mp4_get_32value(stts_entry->duration);
>> +
>> +        for (j = 0; j < count; j++) {
>> +            sample_num++;
>> +
>> +            // search STSS sync sample entries to see if this sample is a keyframe
>> +            is_keyframe = (trak->sync_samples_entries ? 0 : 1);
> 
> Considering the above, trak->sync_samples_entries is always non-zero here.

Oh, excellent eyes, thanks!
I will update that to 
is_keyframe = 0;


>> +            for (n = 0; n < trak->sync_samples_entries; n++) {
>> +                // each one of this these are a video sample number keyframe
>> +                sample_keyframe = ngx_mp4_get_32value(trak->stss_data_buf.pos + (n * 4));
>> +                if (sample_keyframe == sample_num) {
>> +                    is_keyframe = 1;
>> +                    break;
>> +                }
>> +                if (sample_keyframe > sample_num) {
>> +                    break;
>> +                }
>> +            }
>> +
>> +            seconds = sample_time * 1000 / trak->timescale;
>> +            sample_time += duration;
>> +
>> +            if (seconds > mp4->start) {
>> +                goto found;
>> +            }
>> +
>> +            if (is_keyframe) {
>> +                start_seconds_closest_keyframe = seconds;
>> +                exact->speedup_samples = 0;
>> +            } else {
>> +                exact->speedup_samples++;
>> +            }
>> +        }
>> +
>> +        stts_entry++;
>> +    }
>> +
>> +    found:
>> +
>> +    ngx_log_debug2(NGX_LOG_DEBUG_HTTP, mp4->file.log, 0,
>> +                   "exact_video_adjustment() new keyframe start: %d, speedup first %d samples",
>> +                   start_seconds_closest_keyframe,
>> +                   exact->speedup_samples);
>> +
>> +    // NOTE: begin 1 start position before keyframe to ensure first video frame emitted is always
>> +    // a keyframe
>> +    exact->speedup_seconds = mp4->start - start_seconds_closest_keyframe
>> +                             - (start_seconds_closest_keyframe ? 1 : 0);
>> +}
>> +
>> 
>> static ngx_int_t
>> ngx_http_mp4_read_stts_atom(ngx_http_mp4_file_t *mp4, uint64_t atom_data_size)
>> @@ -2164,10 +2267,14 @@
>>     ngx_buf_t             *data;
>>     ngx_uint_t             start_sample, entries, start_sec;
>>     ngx_mp4_stts_entry_t  *entry, *end;
>> +    ngx_mp4_exact_t        exact;
>> 
>>     if (start) {
>>         start_sec = mp4->start;
>> 
>> +        exact_video_adjustment(mp4, trak, &exact);
>> +        start_sec -= exact.speedup_seconds;
> 
> Here you find the closest key frame from the past and adjust start_sec to
> match it.  But what if we don't do this here, but continue with the original
> start_sec.  When we find the right (probably non-key) frame, we'll have its
> number in start_sample.  All we'll need to do at that point is to search for
> the closest key frame to that sample number - just a simple loop.  When we
> find the number of samples into the past until the most recent key frame,
> we'll just add one entry to the stts array with (this number, 1) and then
> decrease start_sample accordingly.  Sounds simpler.

Ah, ooh, that’s very interesting!  I like the sound of it.
So we’d use
trak->start_sample
Sounds like, and go… either forwards or backwards in a simple loop over the 
trak->sync_samples_entries
until we find the the keyframe entry number that is just less than or equal to the 
trak->start_sample
Does that sound right?

And I think we could be adding up to 300 STTS entries compared to the non-exact method
(depending on video, but assuming up to 10s between keyframes and 30fps).


> 
>> +
>>         ngx_log_debug1(NGX_LOG_DEBUG_HTTP, mp4->file.log, 0,
>>                        "mp4 stts crop start_time:%ui", start_sec);
>> 
>> @@ -2230,6 +2337,42 @@
>> 
>>     if (start) {
>>         ngx_mp4_set_32value(entry->count, count - rest);
>> +
>> +        if (exact.speedup_samples) {
>> +            // We're going to prepend an entry with duration=1 for the frames we want to "not see".
>> +            // MOST of the time, we're taking a single element entry array and making it two.
>> +            uint32_t current_count = ngx_mp4_get_32value(entry->count);
>> +            ngx_mp4_stts_entry_t* entries_array = ngx_palloc(mp4->request->pool,
>> +                (1 + entries) * sizeof(ngx_mp4_stts_entry_t));
>> +            if (entries_array == NULL) {
>> +                return NGX_ERROR;
>> +            }
> 
> I believe there are situations when there's an entry from the past in the old
> array, that can be reused.

So this is indeed a _very_ interesting point!
Most of the time, certainly for our created .mp4, we’re talking about constant frame rates for the video.
(The audio is indeed often a huge list of entries w/ minor variance in their durations).
So for the video, I’m mostly only ever seeing a single entry list with duration “1001”
(For the 29.97 fps typical TV we see in US where it’s 30000 / 1001 => 29.97).

I did start trying to make the logical list split anywhere it might be found for 2+ entries,
but it _really_ was making my head hurt since there were so many computations to do
while trying to figure out where all the frames were, and where we might be splitting.

In the end, I found I have almost entirely single-entry arrays, splitting into two.
(Though the current patch doesn’t care what the number of array elements is — it just
 prepends a new entry (of duration 1 instead of typical 1001) via a logical `realloc()`)

If you’d prefer something for 2+ entries case, that tries to see if we’re dropping enough entries from 
the front — and to just reuse one (and not realloc — though we’d <still> need a realloc case for
corner cases anyway, _i think_) then perhaps I could use some help or thoughts on that since
it was getting super complicated when I was trying to diagram it out and was fretting and
nearly paralyzed with how to proceed cleanly and safely :-p 

Thanks so much for the detailed feedback and time for your thoughts!
Super appreciated and I’m very upbeat about this all.

-Tracey

> 
>> +            ngx_copy(&(entries_array[1]), entry, entries * sizeof(ngx_mp4_stts_entry_t));
>> +
>> +            ngx_log_debug1(NGX_LOG_DEBUG_HTTP, mp4->file.log, 0,
>> +                           "exact split in 2 video STTS entry from count:%d", current_count);
>> +
>> +            if (current_count <= exact.speedup_samples)
>> +                return NGX_ERROR;
>> +
>> +            entry = &(entries_array[1]);
>> +            ngx_mp4_set_32value(entry->count, current_count - exact.speedup_samples);
>> +            ngx_log_debug2(NGX_LOG_DEBUG_HTTP, mp4->file.log, 0,
>> +                           "exact split new[1]: count:%d duration:%d",
>> +                           ngx_mp4_get_32value(entry->count),
>> +                           ngx_mp4_get_32value(entry->duration));
>> +            entry--;
>> +            ngx_mp4_set_32value(entry->count, exact.speedup_samples);
>> +            ngx_mp4_set_32value(entry->duration, 1);
>> +            ngx_log_debug1(NGX_LOG_DEBUG_HTTP, mp4->file.log, 0,
>> +                           "exact split new[0]: count:%d duration:1",
>> +                           ngx_mp4_get_32value(entry->count));
>> +
>> +            data->last = (u_char *) (entry + 2);
>> +
>> +            entries++;
>> +        }
>> +
>>         data->pos = (u_char *) entry;
>>         trak->time_to_sample_entries = entries;
>>         trak->start_sample = start_sample;
>> _______________________________________________
>> nginx-devel mailing list
>> nginx-devel at nginx.org
>> http://mailman.nginx.org/mailman/listinfo/nginx-devel
> 
> -- 
> Roman Arutyunyan
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx-devel

-Tracey
@tracey_pooh
TV Architect  https://archive.org/tv <https://archive.org/tv>





-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.nginx.org/pipermail/nginx-devel/attachments/20210612/543013ac/attachment-0001.htm>


More information about the nginx-devel mailing list