News:

--

Main Menu

2.6/2.7 AudioCopy Duping

Started by KoolAidMan, December 18, 2012, 05:24:27 AM

Previous topic - Next topic

KoolAidMan

The following patch adds duping to the audiocopy in the case where an audio stream is being shifted beyond the beginning boundary of the stream.

Previously, corrected (int64_t) was being compared to startTime (uint64_t) causing an implicit cast where the corrected dts was being interpreted as a huge positive uint64_t even though the corrected dts was actually negative. This was causing an invalid branch, and ultimately caused dts to be set to a huge number and the encoded video would not have any audio. Even when the invalid branch was corrected, the logic was still incorrect because it did not support duplication of the audio stream in the case where the corrected dts is set to a negative value.

The code below will duplicate a portion of the audio stream in the case where it is being shifted beyond the beginning boundary of the stream (negative corrected dts). Once the corrected dts is positive, we rewind the stream and start copying the audio stream as we normally would.


Index: avidemux/common/ADM_audioFilter/src/audiocopy.cpp
===================================================================
--- avidemux/common/ADM_audioFilter/src/audiocopy.cpp (revision 8315)
+++ avidemux/common/ADM_audioFilter/src/audiocopy.cpp (working copy)
@@ -34,6 +34,7 @@
                         ADM_audioStream *in;
                         uint64_t        startTime;
                         int64_t         shift;
+                        bool            dupe;
         public:

                        ADM_audioStreamCopy(ADM_audioStream *input,uint64_t startTime, int64_t shift); 
@@ -70,15 +71,25 @@
}
uint8_t         ADM_audioStreamCopy::getPacket(uint8_t *buffer,uint32_t *size, uint32_t sizeMax,uint32_t *nbSample,uint64_t *dts)
{
-again:
     if(false==in->getPacket(buffer,size,sizeMax,nbSample,dts)) return false;
     if(*dts!=ADM_NO_PTS)
     {
-       
         int64_t corrected=*dts;
         corrected+=shift;
-        if(corrected<startTime) goto again; // cant have <0 dts
-        *dts=corrected-startTime;
+        if(corrected<0) //duping
+        {
+            dupe=true;
+        }
+        else
+        {
+            *dts=corrected-startTime;
+            if(dupe)
+            {
+                shift=0;
+                in->goToTime(*dts);
+                dupe=false;
+            }
+        }
     }
     return true;

mean

Thanks
But you might end up with irregular DTS (non monotonic) , no ?

KoolAidMan

Interesting point. Yes, that is true. The dts will repeat itself after the duplication is complete. I am not familiar enough with your codebase to be able to know the side-effects of a dts which could fluctuate-- you obviously know better than I. I've maybe been inside of your code for 2 or 3 days now :).

That being said, there is obviously still a bug in this class. At the very least, we could do this--

Change

         if(corrected<startTime) goto again; // cant have <0 dts


to

         if(corrected<(int64_t)startTime) goto again; // cant have <0 dts


While that won't add duping to the copyaudio, at least it will solve some headaches where the corrected dts as a negative number is being cast to a huge unsigned dts.

mean

Yes, you are right
Non monotonic DTS will cause an internal error for all lavcodec based muxer
and create weird sync outputs on the others

KoolAidMan

#4
Hey! Sorry for the delay-- I was celebrating the holidays with my family. Here is my second attempt at the audio copy shift duping patch.

When shifting audio in a positive direction or when not shifting audio at all (corrected >= 0), the code executes exactly the same as it did before.

This patch fixes the edge case where audio is being shifted beyond the beginning boundary of the stream (ie. a negative corrected DTS) by adding duping logic where we copy a portion of the stream and then rewind the stream. Unlike the previous version of the patch, I am keeping the DTS monotonic this time. Prior to rewinding the audio stream, I double the DTS to account for the impending rewinding of the DTS. Since I doubled it, it stays monotonic-- the dts I just added compensates. Also, in order to ensure we do not lose any time precision, I am adding the (most likely small) value of the corrected DTS to the real DTS, and am rewinding the stream to the corrected DTS.

I have tested this patch at the beginning, center, and end of the video for shifting audio in both directions as well as not shifting audio at all, and it works great for me.


Index: avidemux/common/ADM_audioFilter/src/audiocopy.cpp
===================================================================
--- avidemux/common/ADM_audioFilter/src/audiocopy.cpp (revision 8339)
+++ avidemux/common/ADM_audioFilter/src/audiocopy.cpp (working copy)
@@ -34,6 +34,7 @@
                         ADM_audioStream *in;
                         uint64_t        startTime;
                         int64_t         shift;
+                        bool            dupe;
         public:

                        ADM_audioStreamCopy(ADM_audioStream *input,uint64_t startTime, int64_t shift); 
@@ -53,6 +54,7 @@
     this->startTime=startTime;
     in->goToTime(startTime);
     this->shift=shift;
+    this->dupe=false;
}

bool ADM_audioStreamCopy::isCBR()
@@ -74,14 +76,31 @@
     if(false==in->getPacket(buffer,size,sizeMax,nbSample,dts)) return false;
     if(*dts!=ADM_NO_PTS)
     {
-       
         int64_t corrected=*dts;
         corrected+=shift;
-        if(corrected<startTime) goto again; // cant have <0 dts
-        *dts=corrected-startTime;
+
+        if(corrected<0)
+        {
+            dupe=true;
+        }
+        else if(dupe)
+        {
+            shift=0;
+            *dts+=*dts+corrected;
+            in->goToTime(corrected);
+            dupe=false;
+        }
+        else if(corrected<startTime)
+        {
+            goto again;
+        }
+        else
+        {
+            *dts=corrected-startTime;
+        }
+
     }
     return true;
-
}
/**
         \fn audioCreateCopyStream

mean

Thanks
If compensation is big enough to notice you wil create a delay between audio & video, no ?
Additionnaly, the first packet will have its dts uncorrected ?

KoolAidMan

#6
Quote from: mean on December 29, 2012, 08:20:04 AM
If compensation is big enough to notice you wil create a delay between audio & video, no ?
Hmm... I'm not sure.

When I first started writing this patch I created an infinite loop because whenever I would rewind the stream the next pass it would set corrected back to a negative number and I would be duping yet again. Although confusing, the compensation was added to fix this problem.

Quote from: mean on December 29, 2012, 08:20:04 AM
Additionnaly, the first packet will have its dts uncorrected ?

I added the following (really noisy) logs to my local copy to check.


    ADM_info("dts before getPacket: %d\n", *dts);
    if(false==in->getPacket(buffer,size,sizeMax,nbSample,dts)) return false;
    ADM_info("dts after getPacket: %d\n", *dts);


Here was the results. Note that in my example I was shifting by 500ms which equates to a shift of -500000.

 [getPacket]  dts before getPacket: 606097
 [getPacket]  dts after getPacket: 504633
 [goToTime]   go to time 0.00 secs
 [goToTime]  => seg 0, rel time 0.00 secs
 [resetAfterSeek]   flushing after seek
[AudioStream] Warning skew in dts =276000,
[AudioStream] Warning skew lastDts= 00:00:00,004 
[AudioStream] Warning skew newDts= 00:00:00,280   
 [fillAudio]  [AviMuxer] Audio skew!
[audioClock] Drift detected :638097 vs 1142730, delta=504633
 [getPacket]  dts before getPacket: 1142730
 [getPacket]  dts after getPacket: 280633
 [fillAudio]  [AviMuxer] Audio skew!
[audioClock] Drift detected :1174730 vs 414097, delta=-760633
 [getPacket]  dts before getPacket: 414097
 [getPacket]  dts after getPacket: 312633

Argh! So it looks like the DTS is still non-monotonic... The compensation was enough to stop the infinite loop, but not enough to keep the dts monotonic. What is drifting the DTS -760633? I was under the impression that the DTS was drifting because I was rewinding the audio stream.

Anyway, I need to go to bed, but I will look into this tomorrow. I have the day off (I'm a Java and C# programmer in my professional life). I haven't written code in C++ since college, but it sure has been refreshing :). If you have any comments, I will be sure to read them. I want to eventually get the audio shift duping correct. Thanks!

KoolAidMan

All right mean-- here is my third attempt at this. Note that the patch that I have attached does not have the verbose logging which is in the code sample below.


Index: avidemux/common/ADM_audioFilter/src/audiocopy.cpp
===================================================================
--- avidemux/common/ADM_audioFilter/src/audiocopy.cpp (revision 8339)
+++ avidemux/common/ADM_audioFilter/src/audiocopy.cpp (working copy)
@@ -34,6 +34,7 @@
                         ADM_audioStream *in;
                         uint64_t        startTime;
                         int64_t         shift;
+                        bool            dupe;
         public:

                        ADM_audioStreamCopy(ADM_audioStream *input,uint64_t startTime, int64_t shift); 
@@ -53,6 +54,7 @@
     this->startTime=startTime;
     in->goToTime(startTime);
     this->shift=shift;
+    this->dupe=false;
}

bool ADM_audioStreamCopy::isCBR()
@@ -74,14 +76,31 @@
     if(false==in->getPacket(buffer,size,sizeMax,nbSample,dts)) return false;
     if(*dts!=ADM_NO_PTS)
     {
-       
         int64_t corrected=*dts;
         corrected+=shift;
-        if(corrected<startTime) goto again; // cant have <0 dts
-        *dts=corrected-startTime;
+
+        if(corrected<0)
+        {
+            dupe=true;
+        }
+        else if(dupe)
+        {
+            shift=*dts;
+            in->goToTime(corrected);
+            dupe=false;
+        }
+        else if(corrected<startTime)
+        {
+            goto again;
+        }
+        else
+        {
+            *dts=corrected-startTime;
+        }
+
+        ADM_info("out dts: %u\n", *dts);
     }
     return true;
-
}
/**
         \fn audioCreateCopyStream


This time I added a log-- this is the results of that log when shifting audio 5000ms on a test video.

 [getPacket]  out dts: 4946000
 [getPacket]  out dts: 4969000
 [getPacket]  out dts: 4992000
 [goToTime]   go to time 0.02 secs
 [goToTime]  => seg 0, rel time 0.02 secs
 [resetAfterSeek]  Resetting faad
 [initFaad]  [FAAD] using 2 bytes of extradata
 [initFaad]  12   [initFaad]  10   [initFaad] 
 [initFaad]  [FAAD]Found :44100 rate 2 channels
 [getPacket]  out dts: 5016000
 [getPacket]  out dts: 5016000
 [getPacket]  out dts: 5039000
 [getPacket]  out dts: 5062000

mean

Let me have a look tomorrow, it's almost ok

KoolAidMan


mean

I'm not sure i'm getting it
If the corrected is <0 it means that the audio frame is in the past so we dont want it
The duping should happen if corrected >0, it means it is in the future and we lack some frames in the present, we could duplicate frames from the future and alter the DTS

KoolAidMan

#11
Quote from: mean on December 30, 2012, 12:07:53 PM
I'm not sure i'm getting it
If the corrected is <0 it means that the audio frame is in the past so we dont want it
The duping should happen if corrected >0, it means it is in the future and we lack some frames in the present, we could duplicate frames from the future and alter the DTS

I have added comments to the code below.

If the corrected dts is <0 that does not mean the audio frame is in the past. That means the audio frame is beyond the beginning boundary of the audio stream, in which case we need to preform audio frame duplication to compensate. If corrected dts >0 then we do not need to dupe, as we have already seeked to the correct location in the audio.

Here is a real-world example of shifting audio 5000ms using the audio copy stream.

In gui_savenew, we do the following in bool admSaver::setupAudio():


            }else // copy mode...
            {
                ADM_info("[audioTrack %d] Creating audio encoding stream, starttime %s(copy)\n",i,ADM_us2plain(startAudioTime));
                int32_t shift=0;
                if(ed->audioEncodingConfig.shiftEnabled)
                {
                    shift=ed->audioEncodingConfig.shiftInMs;
                    ADM_info("Using shift of %d ms\n",(int)shift);
                }
                access=audioCreateCopyStream(startAudioTime,shift,ed->edTrack); //Get audio copy access with shift.
            }


In audioCopy, the following executes in ADM_audioStream *audioCreateCopyStream(uint64_t startTime, int32_t shift, ADM_audioStream *input). Basically we are seeing if we can do any seeking to compensate for the audio shift. Note that a positive audio shift at the beginning of a stream creates a negative dts.


ADM_audioStream *audioCreateCopyStream(uint64_t startTime,int32_t shift,ADM_audioStream *input)
{
  shift*=-1000; // ms -> us
  // fixup startTime and shift
  if(shift>0)
  {
        startTime+=shift;
        shift=0;
  }
  else //positive audio shift
  {
      int64_t comp=-shift;
      if(comp<startTime)
      {
          startTime-=comp;
          shift=0;
      }else //shifting beyond the beginning boundary of the audio stream
      {
          shift-=startTime; //do whatever you can to reduce the amount of shift
          startTime=0; //start at the beginning of the stream
      }
  }
  ADM_info("Creating audio stream copy with compensation : startTime=%s\n",ADM_us2plain(startTime));
  ADM_info("and shift =%s\n",ADM_us2plain(shift));
  return new ADM_audioStreamCopy(input,startTime,shift); //this will be the audio access
}


At the end of this code segment, we return an ADM_audioStreamCopy with the following parameters
startTime: 0
shift: -5000000

The values are then set in the constructor ADM_audioStreamCopy::ADM_audioStreamCopy(ADM_audioStream *input, uint64_t startTime, int64_t shift) : ADM_audioStream(NULL,NULL). Note that is a funny inheritance-- I can see why we are doing it; however, the underlying audioAccess is set to null which we could run into trouble later if we end up calling any methods on the base class.


ADM_audioStreamCopy::ADM_audioStreamCopy(ADM_audioStream *input,uint64_t startTime, int64_t shift) :
                    ADM_audioStream(NULL,NULL)
{
    in=input;
    this->startTime=startTime;
    in->goToTime(startTime);
    this->shift=shift;
    this->dupe=false;
}


Now in order to explain the changes I made in uint8_t ADM_audioStreamCopy::getPacket(uint8_t *buffer,uint32_t *size, uint32_t sizeMax, uint32_t *nbSample, uint64_t *dts), we should look at the diff.


Index: avidemux/common/ADM_audioFilter/src/audiocopy.cpp
===================================================================
--- avidemux/common/ADM_audioFilter/src/audiocopy.cpp (revision 8339)
+++ avidemux/common/ADM_audioFilter/src/audiocopy.cpp (working copy)
@@ -34,6 +34,7 @@
                         ADM_audioStream *in;
                         uint64_t        startTime;
                         int64_t         shift;
+                        bool            dupe; //need to know if we are currently duping
         public:

                        ADM_audioStreamCopy(ADM_audioStream *input,uint64_t startTime, int64_t shift); 
@@ -53,6 +54,7 @@
     this->startTime=startTime;
     in->goToTime(startTime);
     this->shift=shift;
+    this->dupe=false; //always assume we are not currently duping on construction
}

bool ADM_audioStreamCopy::isCBR()
@@ -74,14 +76,31 @@
     if(false==in->getPacket(buffer,size,sizeMax,nbSample,dts)) return false;
     if(*dts!=ADM_NO_PTS)
     {
-       
         int64_t corrected=*dts;
         corrected+=shift;
-        if(corrected<startTime) goto again; // cant have <0 dts
-        *dts=corrected-startTime;
+
+        if(corrected<0) //dts is beyond the beginning boundary of the audio stream
+        {
+            dupe=true; //continue duping
+        }
+        else if(dupe) //we were duping, but now dts is large enough to read from the audio stream without duping.
+        {
+            shift=*dts; //add positive shift in order to compensate for the rewinding stream.
+            in->goToTime(corrected); //go to the (likely small) corrected dts in the audio stream.
+            dupe=false; //we are no longer duping.
+        }
+        else if(corrected<startTime) //all of this code below behaves the same as before.
+        {
+            goto again;
+        }
+        else
+        {
+            *dts=corrected-startTime;
+        }
+
+        ADM_info("out dts: %u\n", *dts); //I will paste the output of this log below
     }
     return true;
-
}
/**
         \fn audioCreateCopyStream


Here is the output of the log:

 [getPacket]  out dts: 4946000
 [getPacket]  out dts: 4969000
 [getPacket]  out dts: 4992000
 [goToTime]   go to time 0.02 secs
 [goToTime]  => seg 0, rel time 0.02 secs
 [resetAfterSeek]  Resetting faad
 [initFaad]  [FAAD] using 2 bytes of extradata
 [initFaad]  12   [initFaad]  10   [initFaad]
 [initFaad]  [FAAD]Found :44100 rate 2 channels
 [getPacket]  out dts: 5016000
 [getPacket]  out dts: 5016000
 [getPacket]  out dts: 5039000
 [getPacket]  out dts: 5062000

mean

#12
If you alter DTS you might create a/v sync issue

Case 1 : Positive shift / AudioStreamCopy called with a negative shift
=====================================================
- Let's say we are saving from 400 to 800
- Let's say we call ADM_audioStreamCopy with shift=-20 i.e.sending audio in the past

When the first audio packet is obtained, it has a DTS of 400

+ shift = 380
- timeStart = -20 => It is in the past, drop it

Case 1 : Negative shift / AudioStreamCopy called with a positive shift
=====================================================
- Let's say we are saving from 400 to 800
- Let's say we call ADM_audioStreamCopy with shift=20 i.e.sending audio in the future
When the first audio packet is obtained, it has a DTS of 400
+shift= 420
-timeStart=20

So the first packet starts at 20, whereas it could start at 0 if we rewinded a little bit / duped the first 20 ms

Do we agree ?


KoolAidMan

#13
Sorry mean-- I don't want to tell you how your code is supposed to work, but I think you are wrong.

Quote from: mean on December 31, 2012, 07:22:05 AM
If you alter DTS you might create a/v sync issue

Case 1 : Positive shift / AudioStreamCopy called with a negative shift
=====================================================
- Let's say we are saving from 400 to 800
- Let's say we call ADM_audioStreamCopy with shift=-20 i.e.sending audio in the past

When the first audio packet is obtained, it has a DTS of 400

+ shift = 380
- timeStart = -20 => It is in the past, drop it
In this case, the shift=0 and timeStart=380. We stop saving at 780. corrected dts will never = -20 because startTime and shift are already modified by the audioCreateCopyStream function. Shift has also already been set to 0 by the audioCreateCopyStream function. No dropping of audio is required for this action. As a matter of fact, if you do drop the first packet, then you are undoing the shift (and then some).

It is important to note that the outDTS of the getPacket function will be 0 - 400. The DTS is only internally being shifted.

Quote from: mean on December 31, 2012, 07:22:05 AM
Case 2 : Negative shift / AudioStreamCopy called with a positive shift
=====================================================
- Let's say we are saving from 400 to 800
- Let's say we call ADM_audioStreamCopy with shift=20 i.e.sending audio in the future
When the first audio packet is obtained, it has a DTS of 400
+shift= 420
-timeStart=20

So the first packet starts at 20, whereas it could start at 0 if we rewinded a little bit / duped the first 20 ms.

Do we agree ?
In this case, once again the shift=0 and timeStart=420. We stop saving at 820. This is because they both have already been modified by the audioCreateCopyStream function. No duplication is required for this stream because the audio is being shifted forward and you will either fetch audio from the future (what the shift is supposed to do) or end the stream early.

Once again the outDTS is from 0 - 400.

I should note that my code does not change either of the examples above from what the copyStream has previously been doing. The changes I made will only affect the following case:

Case 3 : Positive shift / AudioStreamCopy called with a negative shift from the beginning of the stream
========================================================================
- Let's say we are saving from 0 to 400
- Let's say we call ADM_audioStreamCopy with shift=-20 i.e.sending audio in the past
When the first audio packet is obtained, it has a DTS of 0
+shift=-20
-timeStart=-20

In this case, the corrected dts = -20, so we dupe until it is >0. The outDTS during the first packet is 0. Let's say that  during the next getPacket the corrected dts is 5 and the dts is 25. We set the shift to the dts (25) and rewind the audio stream to the corrected dts (5). The outDTS here is 25. Because I set the shift to 25 earlier, when the getPacket is next called, let's say the DTS is now 50. The corrected DTS will be set to 75 and will be the outDTS (because the startTime is 0). The only instance where audio frame duplication is required is when shifting beyond the beginning boundary of the audio stream.

mean

For the 2 first use cases you are right, long time since i worked on that part

But i'm still not getting it for the 3rd

Let's say you have video like this
XXXXXXXABCDEFGxxxxxxx
and audio like this
YXXXXXXXABCDEFGxxxxxxx

So what you want it to send the audio a little bit in the past to  resync audio and video, you end up with shift=-20 in audioCopy
The current code is dropping "Y", so that it matches


Now if it was like this

Video : XXXXXXXABCDEFGxxxxxxx
Audio : YXXXXABCDEFGxxxxxxx

i.e. ending with a positive shift, to send audio in the future, i would agree
we need to duplicate Y to end up with

Video : XXXXXXXABCDEFGxxxxxxx
Audio : YYYXXXXABCDEFGxxxxxxx

Sorry to be thick; but i'm still stuck at the sign part. There must be something obvious i'm missing
(and thanks for your patience)