video - What is the correct way to fix keyframes in FFmpeg for DASH?

Sunday 16 September 2018

When conditioning a stream for DASH playback, random access points must be at the exact same source stream time in all streams. The usual way to do this is to force a fixed frame rate and fixed GOP length (i.e. a keyframe every N frames).

In FFmpeg, fixed frame rate is easy (-r NUMBER).

But for fixed keyframe locations (GOP length), there are three methods...which one is "correct"? The FFmpeg documentation is frustratingly vague on this.

Method 1: messing with libx264's arguments

-c:v libx264 -x264opts keyint=GOPSIZE:min-keyint=GOPSIZE:scenecut=-1

There seems to be some debate about whether scenecut should be turned off, as it is unclear whether the keyframe "counter" restarts when a scene cut happens.
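As a small illustration of Method 1, the GOPSIZE value can be derived from the frame rate and the desired GOP duration rather than typed by hand. This is only a sketch: the helper names `gop_frames` and `x264opts` are made up for this example, and it assumes the GOP duration must be a whole number of frames at the chosen rate.

```python
from fractions import Fraction


def gop_frames(fps: str, gop_seconds: int) -> int:
    """Return the GOP length in frames for a rate such as '30' or '30000/1001'."""
    frames = Fraction(fps) * gop_seconds
    if frames.denominator != 1:
        # A fractional GOP cannot be expressed as a keyframe every N frames.
        raise ValueError(f"{gop_seconds}s at {fps} fps is not a whole number of frames")
    return int(frames)


def x264opts(fps: str, gop_seconds: int) -> str:
    """Format the Method 1 option string with keyint and min-keyint set equal."""
    n = gop_frames(fps, gop_seconds)
    return f"keyint={n}:min-keyint={n}:scenecut=-1"


print(x264opts("30", 6))
```

At 30 fps with a 6-second GOP this yields 180 frames, matching the figures used later in this post.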

Method 2: setting a fixed GOP size:

-g GOP_LEN_IN_FRAMES

This is unfortunately only documented in passing in the FFmpeg documentation, and thus the effect of this argument is very unclear.

Method 3: insert a keyframe every N seconds (Maybe?):

-force_key_frames expr:gte(t,n_forced*GOP_LEN_IN_SECONDS)

This is explicitly documented. But it is still not immediately clear if the "time counter" restarts after every key frame. For instance, in an expected 5-second GOP, if there is a scenecut keyframe injected 3 seconds in by libx264, would the next keyframe be 5 seconds later or 2 seconds later?
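One way to reason about the expression is to simulate it. The sketch below assumes, following the documentation's description, that `t` is the timestamp of the current frame and that `n_forced` counts only the frames forced by this option so far. The function name `forced_frames` is hypothetical, and this models the expression only, not the encoder.

```python
from fractions import Fraction


def forced_frames(fps: str, gop_seconds: int, total_frames: int) -> list:
    """Return the frame indices where gte(t, n_forced*gop_seconds) is true."""
    rate = Fraction(fps)
    n_forced = 0  # assumption: incremented only when this expression forces a frame
    forced = []
    for n in range(total_frames):
        t = n / rate  # exact timestamp of frame n at a constant frame rate
        if t >= n_forced * gop_seconds:
            forced.append(n)
            n_forced += 1
    return forced


print(forced_frames("30", 5, 360))
```

Under that assumption the threshold depends only on how many frames have been forced, so a scenecut keyframe inserted by libx264 at 3 seconds would not move the next forced keyframe: it would still land at 5 seconds, i.e. 2 seconds later.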

In fact, the FFmpeg documentation differentiates between this and the -g option, but it doesn't really say how these two options above are the least bit different (obviously, -g is going to require a fixed frame rate).

Which is right?

It would seem that -force_key_frames would be superior, as it would not require a fixed frame rate. However, this requires that:

it conforms to GOP specifications in H.264 (if any)

it GUARANTEES that there would be a keyframe in fixed cadence, irrespective of libx264 scenecut keyframes.

It would also seem that -g could not work without forcing a fixed frame rate (-r), as there is no guarantee that multiple runs of ffmpeg with different codec arguments would provide the same instantaneous frame rate in each resolution. Fixed frame rates may reduce compression performance (IMPORTANT in a DASH scenario!).

Finally, the keyint method just seems like a hack. I hope against hope that this isn't the correct answer.

Answer

The answer therefore seems to be:

Method 1 is verified to work, but is libx264-specific, and comes at the cost of eliminating the very useful scenecut option in libx264.

Method 3 works as of the FFmpeg version of April 2015, but you should verify your results with the script included at the bottom of this post, as the FFmpeg documentation is unclear as to the effect of the option. If it works, it is the superior of the two options.

DO NOT USE Method 2: -g appears to be deprecated. It does not appear to work, it is not explicitly defined in the documentation, it is not listed in the help output, and it does not appear to be used in the code. Code inspection suggests that the -g option is likely meant for MPEG-2 streams (there are even code stanzas referring to PAL and NTSC!).

Also:

Files generated with Method 3 may be slightly larger than Method 1, as interstitial I frames (keyframes) are allowed.

You should explicitly set the "-r" flag in both cases, even though Method 3 places an I frame at the next frame slot on or after the time specified. Failing to set the "-r" flag leaves you at the mercy of the source file, which may have a variable frame rate. Incompatible DASH transitions may result.

Despite the warnings in the FFmpeg documentation, Method 3 is NOT less efficient than the others. In fact, tests show that it might be slightly MORE efficient than Method 1.
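To keep the renditions aligned, every encode should receive exactly the same -r and -force_key_frames values. The sketch below builds one argument list per rendition from a shared template. The ladder, bitrates, and file names are hypothetical, and the -vf scale and -b:v settings are illustrative choices, not part of the methods compared above.

```python
def rendition_command(src, out, height, bitrate, fps=30, gop_seconds=6):
    """Build an ffmpeg argument list using Method 3 with an explicit frame rate."""
    return [
        "ffmpeg", "-i", src,
        "-r", str(fps),                # same fixed frame rate for every rendition
        "-c:v", "libx264",
        "-vf", f"scale=-2:{height}",   # illustrative scaling, keeps aspect ratio
        "-b:v", bitrate,
        "-force_key_frames", f"expr:gte(t,n_forced*{gop_seconds})",
        out,
    ]


# Hypothetical bitrate ladder: (height, video bitrate)
ladder = [(360, "800k"), (720, "2500k")]
commands = [
    rendition_command("input.mp4", f"out_{h}p.mp4", h, b) for h, b in ladder
]
for cmd in commands:
    print(" ".join(cmd))
```

Passing each list directly to something like `subprocess.run(cmd, check=True)` avoids having to quote the parentheses in the expression for a shell.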

Script for the `-force_key_frames` option

Here is a short Perl program I used to verify I-frame cadence based on the output of slhck's ffprobe suggestion. It seems to verify that the -force_key_frames method will also work, and that method has the added benefit of allowing for scenecut frames. I have absolutely no idea how FFmpeg makes this work, or if I just lucked out somehow because my streams happen to be well-conditioned.

In my case, I encoded at 30fps with an expected GOP size of 6 seconds, or 180 frames. Passing 180 as the gopsize argument to this program verified an I frame at each multiple of 180, but setting it to 181 (or any other number not a multiple of 180) made it complain.

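#!/usr/bin/perl
# Verify that an I frame occurs at every multiple of GOPSIZE frames.
# Usage: perl thisscript.pl GOPSIZE FILE
use strict;
use warnings;
my $gopsize = shift(@ARGV);
my $file = shift(@ARGV);
die "Usage: $0 GOPSIZE FILE\n"
        unless defined $file && $gopsize =~ /^\d+$/ && $gopsize > 0;
print "GOPSIZE = $gopsize\n";
my $linenum = 0;   # index of the current frame (one ffprobe line per frame)
my $expected = 0;  # frame index at which the next I frame is due
# List-form open keeps the file name away from the shell.
open(my $pipe, '-|', 'ffprobe', '-v', 'error', '-i', $file,
     '-select_streams', 'v', '-show_frames', '-of', 'csv',
     '-show_entries', 'frame=pict_type')
        or die "Cannot run ffprobe: $!";
while (<$pipe>) {
  if ($linenum > $expected) {
    # Won't catch all the misses. But even one is good enough to fail.
    print "Missed IFrame at $expected\n";
    $expected = (int($linenum/$gopsize) + 1)*$gopsize;
  }
  if (m/,I\s*$/) {
    if ($linenum < $expected) {
      # Don't care term, just an extra I frame. Snore.
      #print "Free IFrame at $linenum\n";
    } else {
      #print "IFrame HIT at $expected\n";
      $expected += $gopsize;
    }
  }
  $linenum += 1;
}
close($pipe);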
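
For anyone who wants to test the check without running ffprobe, here is a rough Python sketch of a similar cadence check written as pure functions. It is not a line-for-line port of the Perl above: it simply looks at every multiple of the GOP size and reports each one that is not an I frame. The function names are made up, and it assumes one CSV line per frame in the form `frame,I`, which is what the ffprobe options used above appear to produce.

```python
def parse_pict_types(csv_lines):
    """Extract the picture type from ffprobe CSV lines such as 'frame,I'."""
    return [line.strip().split(",")[-1] for line in csv_lines if line.strip()]


def missing_keyframes(pict_types, gopsize):
    """Return the frame indices at multiples of gopsize that are not I frames."""
    return [
        i for i in range(0, len(pict_types), gopsize) if pict_types[i] != "I"
    ]


# Synthetic input: an I frame at 0 and at 180, P frames everywhere else.
lines = ["frame,I"] + ["frame,P"] * 179 + ["frame,I"] + ["frame,P"] * 10
types = parse_pict_types(lines)
print(missing_keyframes(types, 180))  # cadence of 180 is satisfied
print(missing_keyframes(types, 181))  # cadence of 181 is not
```

Extra I frames between the expected positions are ignored, just as in the Perl version.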
