Concatenating: timestamps wrong?

Started by CryGuy, July 04, 2018, 12:46:12 PM

CryGuy

Hi again,

If I concatenate two videos, the PTS values in the output file are wrong. I've put a lot of work into analyzing this. "Wrong" means: the video is not seamless, and A/V is no longer in sync. Then again, maybe I'm missing something and I'm completely mistaken. I can provide the link to the videos in a PM (input files, output file, admlog.txt), so you can evaluate it yourself. In the meantime, here's a link to a PDF file illustrating the situation as well as I can.

https://mega.nz/#!KwY2xKAY!D_2E6AdodR2hbSwujssKtY5kCFg08YetxcP52xNd1Ow

If something's missing, please let me know.

Thank you!

eumagga0x2a

Yes, please provide the samples. Seamless append is possible only with videos which don't contain B-frames.

eumagga0x2a

Thank you for the samples. Apart from A/V sync, which is impossible to judge precisely enough based on the content, they make it possible to understand what is happening. WRT the PTS of the first frame, the commit "[mp4/demux] Unit in elst is in track timescale, not movie timescale" states that the first frame PTS is 160/25000 = 6.4 ms instead of 160/1000 = 160 ms. According to that, the delay specified for the audio track is 181/48000 = approx. 4 ms, so the video should be roughly 2 ms behind the audio. That said, Avidemux adds 2 ms to all video timestamps in the file, both DTS and PTS (I was wrong in telling you that the DTS offset is always set to zero). These 2 ms are what you see in the 0.082 first frame PTS, while 80 ms are required to account for B-frames.

If FFmpeg assumes 160 units of track timescale are 160 ms, it is most likely wrong.
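For reference, the two readings of the edit list values can be compared with a small calculation. This is only a sketch of the arithmetic quoted above: the helper name ticks_to_ms and the constant names are made up for illustration, the numbers (160, 181, 25000, 48000, 1000) are the ones mentioned in this thread, and nothing here reflects the actual demuxer code.

```python
from fractions import Fraction

def ticks_to_ms(ticks: int, timescale: int) -> Fraction:
    """Convert a tick count to milliseconds; timescale is ticks per second."""
    return Fraction(ticks, timescale) * 1000

# Values as quoted in this thread (names are hypothetical).
VIDEO_TRACK_TIMESCALE = 25000
AUDIO_TRACK_TIMESCALE = 48000
MOVIE_TIMESCALE = 1000  # assumed from the "160/1000" reading

# Reading 1: elst values are in track timescale units.
video_track_ms = ticks_to_ms(160, VIDEO_TRACK_TIMESCALE)  # 6.4 ms
audio_track_ms = ticks_to_ms(181, AUDIO_TRACK_TIMESCALE)  # approx. 3.8 ms
offset_track_ms = video_track_ms - audio_track_ms         # roughly 2 to 3 ms

# Reading 2: elst values are in movie timescale units (milliseconds here).
video_movie_ms = ticks_to_ms(160, MOVIE_TIMESCALE)        # 160 ms
audio_movie_ms = ticks_to_ms(181, MOVIE_TIMESCALE)        # 181 ms

print(f"track timescale: video {float(video_track_ms):.2f} ms, "
      f"audio {float(audio_track_ms):.2f} ms, "
      f"difference {float(offset_track_ms):.2f} ms")
print(f"movie timescale: video {float(video_movie_ms):.0f} ms, "
      f"audio {float(audio_movie_ms):.0f} ms")
```

The two readings differ by a factor of 25 for the video track, which is why the choice of timescale matters so much for the first frame PTS.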

I didn't have time to look into appending yet, I'll check this ASAP.

CryGuy

First, thanks for looking into it.

Quote: states that the first frame PTS is 160/25000 = 6.4 ms
Here, libav returns 4000 as the raw PTS value for the first video frame, which means 0.16 s because the track timescale is 25000. It's 8688 for the first audio frame, which is 0.181 s (8688/48000). The 160 is in milliseconds; it is not a raw PTS value.
But I'll wait and see what you're gonna find out.
Thanks so far.
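A quick way to check the conversion described above, as a sketch only: the function name pts_to_seconds is hypothetical, and the raw values (4000 and 8688) and timescales (25000 and 48000) are simply the ones reported in this post, not read from the file here.

```python
from fractions import Fraction

def pts_to_seconds(raw_pts: int, timescale: int) -> Fraction:
    """Convert a raw PTS (in track timescale ticks) to seconds."""
    return Fraction(raw_pts, timescale)

# Raw values as reported in this post.
video_pts_s = pts_to_seconds(4000, 25000)  # 0.16 s
audio_pts_s = pts_to_seconds(8688, 48000)  # 0.181 s

# Going the other way: 160 ms and 181 ms expressed in track timescale ticks.
video_ticks = Fraction(160, 1000) * 25000  # 4000 ticks
audio_ticks = Fraction(181, 1000) * 48000  # 8688 ticks

print(float(video_pts_s), float(audio_pts_s))
print(int(video_ticks), int(audio_ticks))
```

Both directions agree, which is consistent with reading 160 and 181 as milliseconds rather than as raw tick counts.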

eumagga0x2a

160 is the value read from the edit list atom. It really looks like we do something wrong in the MP4 demuxer: enabling an audio shift of +1000 ms (video at 25 fps), saving in copy mode as mp4 and then loading the resulting file ends up with audio delayed by only 22 ms. Playing the file with mpv confirms that audio is really delayed by 1 second.

Thank you for your valuable input.