Audio/Video output length disparity between version 2.7.5 and later

Started by hebeguess, October 08, 2020, 03:05:11 pm

hebeguess

Another attempt to post this; all previous attempts were blocked by the spam filter.
I had to resort to PM to raise the issue with eumagga0x2a, and we exchanged several messages.

Here is roughly what happened:
https://gitlab.com/mbunkus/mkvtoolnix/-/issues/2931

hebeguess

Here are some replies I received from eumagga0x2a via PM:

Quote:
Default frame duration in MKV is used only if presentation timestamps are missing, thus its value is meaningless in most cases.

Stutter and gaps at joining points depend on the properties of the video stream. In general, if the video stream has no B-frames, there should not be any irregularities after appending. If B-frames are present but all GOPs are closed, there might be (should be) small gaps. If streams are H.264 or HEVC encoded in Open GOP mode, you better use Avidemux to append as it checks whether cut points are viable or not (the keyword is "POC going back").

Earlier versions of Avidemux were calculating the duration of the last frame wrong, the fix for this issue might be related to the behavior you see in case Avidemux doesn't cut away audio for the duration of the B-frame delay (I guess it keeps it).

Quote:
You should be aware of 2.7.5 and some earlier releases introducing irregular, jittery timestamps in perfectly regular video streams due to rounding errors. This is fixed in 2.7.6 and is IMHO one of sufficient reasons not to consider 2.7.5 for usage.

If mkvtoolnix recalculates all timestamps making all frame durations exactly the same (not just a multiple of a timebase) and doesn't take dropped early B-frames into account, Avidemux can't be responsible for such behavior. mediainfo always interprets gaps from dropped early B-frames as irregular fps, even if all timestamps match a multiple of the timebase.

I guess the core of the problem is different handling of B-frame delay as Avidemux uses unsigned values for timestamps and has to delay all timestamps to avoid the first few decode timestamps going negative. Cutting audio short to match expectations of other applications will probably result in even larger gaps when appended in Avidemux.
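
To illustrate the shift described above, here is a minimal Python sketch. It is not Avidemux's actual code: the function name, the frame pattern and the microsecond values are assumptions chosen for illustration. It shows how a B-frame reorder delay pushes the first decode timestamps below zero, and how delaying every track by the same amount keeps everything non-negative while preserving A/V sync.

```python
def shift_to_non_negative(tracks):
    """Delay all tracks by one common amount so no PTS or DTS is negative.

    `tracks` maps a track name to a list of (pts, dts) pairs in microseconds.
    Returns the applied shift and the shifted tracks.
    """
    earliest = min(
        min(pts, dts) for frames in tracks.values() for pts, dts in frames
    )
    shift = max(0, -earliest)
    shifted = {
        name: [(pts + shift, dts + shift) for pts, dts in frames]
        for name, frames in tracks.items()
    }
    return shift, shifted


FRAME_US = 41708  # roughly one video frame at 24000/1001 fps (assumed)
AUDIO_US = 32000  # one audio frame, e.g. 1536 samples at 48 kHz (assumed)

tracks = {
    # Decode order I P B B: two frames of reorder delay put the first DTS below zero.
    "video": [(0, -2 * FRAME_US), (3 * FRAME_US, -FRAME_US),
              (FRAME_US, 0), (2 * FRAME_US, FRAME_US)],
    "audio": [(0, 0), (AUDIO_US, AUDIO_US)],
}

shift, shifted = shift_to_non_negative(tracks)
print("shift applied to every track (us):", shift)
print("first video (pts, dts):", shifted["video"][0])
print("first audio (pts, dts):", shifted["audio"][0])
```

In this toy case the first video PTS and the first audio PTS both end up at the same positive value, so the delay itself does not break sync; any tool that later trims or re-bases only one of the tracks would.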

hebeguess

Another message I sent via PM a few days ago, because I had been blacklisted by CleanTalk again.

Quote:
I wish I could switch to 2.7.6 now, as the DTS-HD MA retention is a nice addition for me.
However, there is a fairly high chance that I will run into the issue if I use avidemux 2.7.6 or later to trim and join the clips.

For the last few days I have continued trying to identify the root cause, and I think I may have found it.
It is related to the mkv container and negative timestamps, combined with the negative audio timestamps in avidemux output.
One thing to note: Avidemux 2.7.x always produces negative timestamps when the output is an mkv container, unless you cut from the start of the source video.

Here is mkvtoolnix's behavior as I observed it:

A. If you do a simple drop and remux, it doesn't touch the timestamps.
B. If you manually specify a forced FPS duration on a source file which contains video and audio tracks,
mkvtoolnix will shift the timestamps to the right so the audio tracks start with a positive timestamp.
C. If you mux from separate sources, using video and audio from separate files,
it will drop the leading packets, as documented in their wiki.


I had been told that the audio length calculation in avidemux 2.7.5 was erroneous.
I can verify this by simply dropping a clip in and saving it straight away as mkv: the audio length becomes shorter and shorter with each repetition.
The error was fixed in 2.7.6, so the audio is now a few frames longer than in 2.7.5, provided you're trimming exactly the same segment.
That's what causes the issue I'm talking about, because the longer audio creeps more easily into the start of the succeeding video segment.
(Assuming that I'm using mkvtoolnix to append the clips while also manually keying in the FPS duration to prevent minor stutter.)

hebeguess

Second half of the previous message.

Quote:
Here are the different strategies I tried:

1) Using avidemux 2.7.5 to selectively trim a source video into multiple mkv clips, then using mkvtoolnix's appending mode to join the clips into a single mkv with a manually specified FPS duration.

Result: The preceding audio may creep slightly into the succeeding segment; it's not perfect but okay.
At least I don't have to put up with the minor video stutter and small audio pop.
This is how I have been doing it for the last few years.

2) Using avidemux 2.7.6 to trim, then appending in mkvtoolnix with a forced FPS duration.

Result: Audio creeps into the next segment easily, sometimes by up to 500ms.

An easy way to spot the issue: pick a leading video that ends with loud audio, then append a relatively silent video. Use the 'one chapter for each appended file' setting in mkvtoolnix to generate a chapter right at the joining point. Now play and then pause the video, skip to the chapter, press play again and listen to the sound.


How I 'solved' it with extra steps:

3) Using avidemux 2.7.5 to trim, then putting the clips into mkvtoolnix with a manually specified forced FPS duration and remuxing them one by one.
Then appending the remuxed clips in mkvtoolnix, again with a forced FPS duration.

Result: Better than (1), with the leading audio creeping only very little into the next segment.

What I reckon happens: the extra step of remuxing each clip in mkvtoolnix first shifts the start of the whole audio track from a negative value to 0ms or greater (behavior B from above).
It's true that the audio track produced by avidemux 2.7.5 ends a few frames early, but the total length just happens to be the 'right' length.
[Because in appending mode mkvtoolnix simply stitches the audio tracks together without taking extra steps to ensure their length matches the video.]


4) Using avidemux 2.7.6 to trim, then extracting the audio tracks from those clips using gMKVExtractGUI (the file names will include delay timestamps identifiable by mkvtoolnix; "example_track2_[rus]_DELAY -66ms.eac3").
Put the clips into mkvtoolnix with a manually specified forced FPS duration and remux them one by one, remembering to deselect their original audio track and replace it with the extracted track.
Follow up by appending the remuxed files in mkvtoolnix, again with a forced FPS duration.

Result: Pretty much in line with (3).

What I reckon happens: with the extra step of remuxing each clip in mkvtoolnix first with a separate audio track, mkvtoolnix recognizes the delay and drops the leading audio frames of the clip (behavior C from above),
making the audio tracks of these clips the 'right' length for appending.
If you check the A/V lengths of these clips, they're fairly close. The positive audio timestamps alleviate the 'dumb' audio stitching method used by mkvmerge.


==

It looks to me like it could be alleviated if avidemux changed how it handles the leading audio frames?
I remember that in earlier days I tried switching to mp4 output. It largely avoided the audio creeping issue.
Is it because the mp4 container has to start at 0ms?
The limited audio format support in mp4 was holding me back, so I went back to mkv.

I much appreciate all your effort on avidemux; I use it almost daily.

hebeguess

Another reply from eumagga0x2a via PM:

Quote from: eumagga0x2a on October 09, 2020, 10:03:02 am
Quote from: hebeguess on October 08, 2020, 05:31:00 pm
It looks to me like it could be alleviated if avidemux changed how it handles the leading audio frames?

If this is necessary to address audio desync when appended in Avidemux, the handling of audio preceding the first keyframe in the first segment will need to be fixed, yes.

Quote from: hebeguess on October 08, 2020, 05:31:00 pm
I remember that in earlier days I tried switching to mp4 output. It largely avoided the audio creeping issue.
Is it because the mp4 container has to start at 0ms?

AFAIK, mp4 specifies frame duration (DTS of a frame is the sum of all previous durations), composition offset (PTS is DTS + composition offset), which may be negative, and edit lists, which tell whether a part of the stream needs to be skipped. In my understanding, both specifying a negative PTS for a frame and specifying a non-zero positive minimum PTS are technically valid. In all cases Avidemux would need to shift all timestamps in all tracks to ensure PTS >= 0 && DTS >= 0 && PTS >= DTS.
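
The timing model described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the 24000 timescale, the four-sample I P B B pattern and the offsets are made up for the example, edit lists are ignored, and delaying the PTS is just one possible way to satisfy the PTS >= DTS constraint.

```python
from itertools import accumulate


def mp4_timestamps(durations, composition_offsets):
    """Return (dts, pts) lists for samples given in decode order."""
    # DTS of sample i is the sum of the durations of samples 0..i-1.
    dts = [0] + list(accumulate(durations))[:-1]
    # PTS is DTS plus the (possibly negative) composition offset.
    pts = [d + offset for d, offset in zip(dts, composition_offsets)]
    return dts, pts


def delay_for_pts_ge_dts(dts, pts):
    """Smallest delay to add to every PTS so that PTS >= DTS holds everywhere."""
    return max(0, max(d - p for d, p in zip(dts, pts)))


# Hypothetical samples in decode order I P B B, timescale 24000, 1001 ticks each.
durations = [1001, 1001, 1001, 1001]
offsets = [0, 2002, -1001, -1001]  # negative offsets let B-frames display "early"

dts, pts = mp4_timestamps(durations, offsets)
delay = delay_for_pts_ge_dts(dts, pts)
shifted_pts = [p + delay for p in pts]

print("DTS:", dts)
print("PTS:", pts)
print("delay needed:", delay)
print("shifted PTS:", shifted_pts)
```

With negative composition offsets the first PTS can stay at 0, but two samples then have PTS < DTS. Delaying every PTS by one frame restores PTS >= DTS, and the same delay would have to be applied to the audio track to keep sync.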

In earlier days, the MP4 demuxer in Avidemux interpreted edit lists wrong, which resulted in generally wrong A/V sync. I can imagine that the combination of various bugs by chance improved compatibility to mkvtoolnix.

Quote from: hebeguess on October 08, 2020, 05:31:00 pm
The limited audio format support in mp4 was holding me back, so I went back to mkv.

Now we have a MOV muxer in Avidemux, supporting these codecs, and the MP4 codec compatibility has been significantly extended too. In general, Windows Media and Opus are entirely incompatible with mp4/mov, almost everything else can be used.

hebeguess

Quote from: eumagga0x2a on October 09, 2020, 10:03:02 am
Now we have a MOV muxer in Avidemux, supporting these codecs, and the MP4 codec compatibility has been significantly extended too. In general, Windows Media and Opus are entirely incompatible with mp4/mov, almost everything else can be used.

Very nice additions.

FYI, if I output the same segment as MKV/MP4/MOV/MPEG-TS from the avidemux 2.7.7 nightly, drag each file into mkvtoolnix and simply remux it,
roughly the same negative audio offset is present in the mkvtoolnix remuxed output, so I assume the offset is present across all of avidemux's outputs.
So jumping between containers will not magically solve the issue, but sometimes one works a little better than the others.

==

I've been through another round of investigation and gained some more insights.

I found a bug which is the main factor causing the stutter I'm talking about.
I'm using the nightly here, but the bug has probably been present since 2.6.x.
The sample was H264 video and EAC3 audio in an mkv container.
Both the video and audio PTS timestamps start at 0ms.

Drag sample00.mkv into avidemux and save it right away as sample01.mkv.
Check the timestamps of sample01.mkv: 42ms has been prepended to both the audio and video tracks.

Drag sample01.mkv into an empty avidemux and save right away again as sample02.mkv.
The audio and video timestamps still start at 42ms.

Now back to sample00.mkv: select a random segment and cut it out as sample03.mkv.
The first video timestamp of sample03.mkv is at 42ms, while the first audio timestamp is at 22ms.
Mediainfo: 'Delay relative to video : -20 ms'

The length of time prepended to the output is not fixed; it usually varies from clip to clip.
I've seen up to 120ms. Not sure where it comes from.

TBD..

hebeguess

Quote:
I can verify this by simply dropping a clip in and saving it straight away as mkv: the audio length becomes shorter and shorter with each repetition.

I revisited this part and rewrote it with more info and clarity.

It occurs in 2.7.5 when the first video frame of an mkv input clip starts at 0ms:
drag it in and save right away, and the audio track becomes shorter.
Per my previous post, output clips from avidemux don't have their first video frame at 0ms.
If you do it again using the output clip, 2.7.5 will not cut its audio short.

avidemux 2.7.7 200917 nightly doesn't exhibit the behaviour.

hebeguess

Stupid CleanTalk, assumed long post == spam.

I found a regression related to audio length when avidemux 'auto redesignates' a marker B placed on a B-frame to a safe-to-cut P-frame.
When the nightly reverts to a safe video frame, it forgets to adjust the audio track length as well. Audio was handled correctly in 2.7.5.

Quote:
example01.mkv

Video
Format                      : AVC
Frame rate                  : 23.976 (24000/1001) FPS

Audio
Format                      : E-AC-3
Frame rate                  : 31.250 FPS (1536 SPF)


last couple of frames:
...PBBBBPBBBPI
...XXXXX4XX321


avidemux 2.7.7 (200917)
Delay relative to video    : -70 ms


Marker B set at 1 (I)
Video Duration                    : 9 s 843 ms
Audio Duration                    : 9 s 888 ms

Marker B set at 2 (P)
Video Duration                    : 9 s 802 ms
Audio Duration                    : 9 s 856 ms

Marker B set at 3 (B)
Video Duration                    : 9 s 635 ms
Audio Duration                    : 9 s 824 ms

Marker B set at 4 (P)
Video Duration                    : 9 s 635 ms
Audio Duration                    : 9 s 696 ms


avidemux 2.7.5
Delay relative to video    : -110 ms

Marker B set at 1 (I)
Video Duration                    : 9 s 843 ms
Audio Duration                    : 9 s 856 ms

Marker B set at 2 (P)
Video Duration                    : 9 s 802 ms
Audio Duration                    : 9 s 824 ms

Marker B set at 3 (B)
Video Duration                    : 9 s 635 ms
Audio Duration                    : 9 s 632 ms

Marker B set at 4 (P)
Video Duration                    : 9 s 635 ms
Audio Duration                    : 9 s 632 ms

Quote:
example02.mkv

Video
Format                      : AVC
Frame rate                  : 23.976 (24000/1001) FPS

Audio
Format                      : DTS
Frame rate                  : 93.750 FPS (512 SPF)

last couple of frames:
...PBBPBBPBPI
...XXXXXX4321


avidemux 2.7.7 (200917)
Delay relative to video    : -64 ms

Marker B set at 1 (I)
Video Duration                    : 5 s 47 ms
Audio Duration                    : 5 s 66 ms

Marker B set at 2 (P)
Video Duration                    : 5 s 5 ms
Audio Duration                    : 5 s 23 ms

Marker B set at 3 (B)
Video Duration                    : 4 s 921 ms
Audio Duration                    : 4 s 980 ms

Marker B set at 4 (P)
Video Duration                    : 4 s 921 ms
Audio Duration                    : 4 s 938 ms



avidemux 2.7.5
Delay relative to video    : -32 ms

Marker B set at 1 (I)
Video Duration                    : 5 s 47 ms
Audio Duration                    : 5 s 55 ms

Marker B set at 2 (P)
Video Duration                    : 5 s 6 ms
Audio Duration                    : 5 s 13 ms

Marker B set at 3 (B)
Video Duration                    : 4 s 922 ms
Audio Duration                    : 4 s 927 ms

Marker B set at 4 (P)
Video Duration                    : 4 s 921 ms
Audio Duration                    : 4 s 927 ms

You can also see that the A/V lengths produced by 2.7.5 stay close to each other.
On the nightly, the audio became a few frames longer than the video.
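
To put numbers on "a few frames", here is a small Python sketch that converts the example01.mkv durations listed above into audio excess, expressed in milliseconds and in E-AC-3 frames. The function and variable names are made up for this illustration. The 32 ms audio frame duration follows from the reported 31.250 FPS (1536 samples per frame, which works out to 48 kHz).

```python
def audio_excess(video_ms, audio_ms, audio_frame_ms):
    """Return how much longer the audio is than the video, in ms and in audio frames."""
    excess = audio_ms - video_ms
    return excess, excess / audio_frame_ms


EAC3_FRAME_MS = 1000 / 31.25  # 31.250 audio frames per second -> 32 ms per frame

# (video duration, audio duration) in ms for example01.mkv, keyed by marker B position.
example01 = {
    "2.7.7 (200917)": {"1 (I)": (9843, 9888), "2 (P)": (9802, 9856),
                       "3 (B)": (9635, 9824), "4 (P)": (9635, 9696)},
    "2.7.5": {"1 (I)": (9843, 9856), "2 (P)": (9802, 9824),
              "3 (B)": (9635, 9632), "4 (P)": (9635, 9632)},
}

for version, rows in example01.items():
    for marker, (video_ms, audio_ms) in rows.items():
        excess, frames = audio_excess(video_ms, audio_ms, EAC3_FRAME_MS)
        print(f"{version:15} marker B at {marker}: {excess:+4d} ms ({frames:+.2f} audio frames)")
```

The marker B at 3 (B) row is the regression case: the video is cut back to the same length as marker 4 (P), but on the nightly the audio keeps almost six extra frames, while 2.7.5 trims it to match.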