"Same as source" FPS Improvements

Archive of historical development discussions
Discussions / Development has moved to GitHub
Forum rules
*******************************
Please be aware we are now using GitHub for issue tracking and feature requests.
- This section of the forum is now closed to new topics.

*******************************
sdm
Bright Spark User
Posts: 194
Joined: Mon Feb 19, 2007 4:53 pm

Re: "Same as source" FPS Improvements

Post by sdm »

Hi Van,

If I applied the patch correctly (which I'm not very sure about), it didn't work on this sample.
Any suggestions?

--sdm.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

van wrote:During testing I noticed that vfr on progressive content (Star Wars Ep.4) seems to do bad things. The frames are mostly 3004 ticks with a few at 4504. Since init_delay is based on a max duration of 3754, if one of these long frames gets reordered it will screw up the decoder.
Hmm...odd. I padded it out to be safe.. The delay is already way longer than it should be anyway; it should be 3754*1 (or 2 with b-pyramid) but instead it's 3754*2 (or 3). Can't hurt to bump it up a bit more, though, I suppose.
van
Veteran User
Posts: 417
Joined: Wed Aug 29, 2007 6:35 am

Re: "Same as source" FPS Improvements

Post by van »

jbjjake,
I encoded the first 5 chapters of Star Wars Ep.4 with vfr on. Here are the durations & render offsets (in 90KHz ticks) of the first 50 mp4 frames. I used "mp4videoinfo -r" to print the frame info and piped it to an awk script to convert frame times to durations & convert from the 48KHz to a 90KHz clock.

Code: Select all

1       dur 3001        ro 7507
2       dur 4503        ro 15015
3       dur 4503        ro 3003
4       dur 1501        ro 4503
5       dur 4503        ro 15015
6       dur 3003        ro 4503
7       dur 6005        ro 3003
8       dur 3001        ro 15015
9       dur 4503        ro 3003
10      dur 1501        ro 4503
11      dur 4503        ro 15016
12      dur 3003        ro 4503
13      dur 6011        ro 3000
14      dur 3000        ro 15015
15      dur 4503        ro 3003
16      dur 1501        ro 4503
17      dur 4503        ro 15015
18      dur 3003        ro 4503
19      dur 6005        ro 3003
20      dur 3001        ro 15015
21      dur 4503        ro 3003
22      dur 1501        ro 4503
23      dur 4503        ro 15016
24      dur 3003        ro 4503
25      dur 6011        ro 3000
26      dur 3000        ro 15015
27      dur 4503        ro 3003
28      dur 1501        ro 4503
29      dur 4503        ro 15015
30      dur 3003        ro 4503
31      dur 6005        ro 3003
32      dur 3001        ro 15015
33      dur 4503        ro 3003
34      dur 1501        ro 4503
35      dur 4503        ro 15016
36      dur 3003        ro 4503
37      dur 6011        ro 3000
38      dur 3000        ro 15015
39      dur 4503        ro 3003
40      dur 1501        ro 4503
41      dur 4503        ro 15015
42      dur 3003        ro 4503
43      dur 6005        ro 3003
44      dur 3001        ro 15015
45      dur 4503        ro 3003
46      dur 1501        ro 4503
47      dur 4503        ro 15016
48      dur 3003        ro 4503
49      dur 6011        ro 3000
50      dur 3000        ro 15015
Here are the settings:

Code: Select all

[14:25:14] starting job
[14:25:14]  + device /Users/van/Movies/DVD/A_NEW_HOPE
[14:25:14]  + title 1, chapter(s) 1 to 5
[14:25:14]  + 720x480 -> 720x364, crop 56/60/0/0
[14:25:14] work: VFR mode -- Switching FPS to 29.97 and detelecining.
[14:25:14]  + filters
[14:25:14]    + Detelecine (pullup) (default settings)
[14:25:14]  + 29.970 fps, video quality 0.50
[14:25:14]  + PixelRatio: 1, width:720, height: 364
[14:25:14]  + encoder x264
[14:25:14]    + x264 options: ref=2:bframes=2:subq=5:me=umh
[14:25:14]  + audio 160 kbps, 48000 Hz
[14:25:14]  + encoder faac
[14:25:14]  + dynamic range compression: 1.000000
[14:25:14]    + 80bd, English (AC3) (5.1 ch)
[14:25:14]      + Requested mixdown: Dolby Pro Logic II (HB_AMIXDOWN_DOLBYPLII)
[14:25:14]      + Actual mixdown: Dolby Pro Logic II (HB_AMIXDOWN_DOLBYPLII)
This is probably worst case for a stateless ivtc since the initial scenes are very black & very similar. The 2*3754 = 7508 init_delay (which you can see on the render offset of the first frame) is less than half what's needed . I can't make much sense of the pattern of durations but I haven't spent any time looking at detelecine.c. I did add code to encx264.c to hb_log the first 1000 incoming frame durations & these are the durations (+-1 for sample rate conversion errs). This part of the trace only shows durations up to 6000 (2 frames merged?) but later on they get up to 9000 (3 frames merged?).

I don't know if there's a bug here. I would never expect ivtc, with or without vfr, to work on progressive content since it's not designed for that. I'm amazed it works as well as it does. But if people want to use it that way we should probably bump init_delay way up.

BTW, here's the very quick-and-dirty awk script:

Code: Select all

/^sampleId/ {
    t = substr($6, 1, index($6,"(")-1) * 90000 / 48000
    s = substr($2, 1, length($2)-1)
    if (s+0 > 1)
        printf("%d\tdur %d\tro %d\n", s-1, t-l, r)
    l = t
    r = $7 * 90000 / 48000
}
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Using the right frame durations

Post by jbrjake »

So...I've only recently realized just how much my earlier VFR code depended upon the sync.c code van removed.

What I'm doing in render.c assumes that frames are always 3003. This was an ideal state...I didn't know how naive I was being. Sync.c was ensuring that the frames were 3003 -- they weren't, in the source.

NightStorm's alerteed me to issues with the first episode of Futurama, so I've been using that as test material. It has both progressive *and* non-progressive frames with durations as close as 3004 ticks and as far off as 4508 ticks, and has hard telecining in *progressive* frames. Between this and what van's reported from Star Wars, it's obvious something's got to change.

My first thought was to just ignore pullup's requests to drop frames, when they were progressive. Then I realized the hard telecined frames had erratic durations as well. So I tried adding a lost_time counter to the private struct in render.c that's the true stop-start duration for the dropped frame, and every time it's 3003/4 long, add it to an outgoing frame's duration. Still I had to disable dropping progressive frames; not only are they large, sometimes pullup is dropping more than 1 out of 5 and things go out of sync.

Anyway, as an interim fix, this encoded the Futurama pilot okay, but with some interlacing artifacts in the hard telecined progressive bits.

Finally what I've hit upon is this:

Keep a lost_time array with four slots. Every time a frame is dropped, add 1/4th its duration to 3 of those lost_time slots, and whatever's left to the 4th. Then, every time there's something in the 4th, add it to an outgoing frame's duration. Then, rotate the lost_time array so what was the 3rd is now the 4th, etc, and zero out the 1st.

Code: Select all

Index: libhb/render.c
===================================================================
--- libhb/render.c	(revision 1234)
+++ libhb/render.c	(working copy)
@@ -20,11 +20,13 @@
     hb_buffer_t        * buf_scale;
     hb_fifo_t          * subtitle_queue;
     hb_fifo_t          * delay_queue;
-    int                  frames_to_extend;
     int                  dropped_frames;
     int                  extended_frames;
     uint64_t             last_start[4];
     uint64_t             last_stop[4];
+    uint64_t             lost_time[4];
+    uint64_t             total_lost_time;
+    uint64_t             total_gained_time;
 };
 
 int  renderInit( hb_work_object_t *, hb_job_t * );
@@ -280,9 +282,22 @@
             {
                 if( job->vfr )
                 {
-                    pv->frames_to_extend += 4;
+                    /* We need to compensate for the time lost by dropping this frame.
+                       Spread its duration out in quarters, because usually dropped frames
+                       maintain a 1-out-of-5 pattern and this spreads it out amongst the remaining ones.
+                       Store these in the lost_time array, which has 4 slots in it.
+                       Because not every frame duration divides evenly by 4, and we can't lose the
+                       remainder, we have to go through an awkward process to preserve it in the 4th array index. */
+                    uint64_t temp_duration = buf_tmp_out->stop - buf_tmp_out->start;
+                    pv->lost_time[0] += (temp_duration / 4);
+                    pv->lost_time[1] += (temp_duration / 4);
+                    pv->lost_time[2] += (temp_duration / 4);
+                    pv->lost_time[3] += ( temp_duration - (temp_duration / 4) - (temp_duration / 4) - (temp_duration / 4) );
+                    
+                    pv->total_lost_time += temp_duration;
                     pv->dropped_frames++;
-                    hb_fifo_get( pv->subtitle_queue );
+                    
+                    hb_fifo_get( pv->subtitle_queue );                    
                     buf_tmp_in = NULL;
                 }
                 else
@@ -385,13 +400,13 @@
 
     if( *buf_out )
     {
-        if( pv->frames_to_extend )
+        if( pv->lost_time[3] > 0 )
         {
             /*
              * A frame's been dropped by VFR detelecine.
              * Gotta make up the lost time. This will also
-             * slow down the video to 23.976fps.
-             * The dropped frame ran for 3003 ticks, so
+             * slow down the video.
+             * The dropped frame's has to be accounted for, so
              * divvy it up amongst the 4 frames left behind.
              * This is what the delay_queue is for;
              * telecined sequences start 2 frames before
@@ -401,25 +416,37 @@
              */
             ivtc_buffer = *buf_out;
             
-            /* The 4th cached frame will be the to use. */
+            /* We can't use the given time stamps. Previous frames
+               might already have been extended, throwing off the
+               raw values fed to render.c. Instead, their
+               stop and start times are stored in arrays.
+               The 4th cached frame will be the to use. */
             ivtc_buffer->start = pv->last_start[3];
             ivtc_buffer->stop = pv->last_stop[3];
+           
+           /* Extend the frame's duration by the value in the
+              4th slot of the lost_time array. */
+            ivtc_buffer->stop += pv->lost_time[3];
+            pv->total_gained_time += pv->lost_time[3];
 
-            if (pv->frames_to_extend % 4)
-                ivtc_buffer->stop += 751;
-            else
-                ivtc_buffer->stop += 750;
-            
-            /* Set the 3rd cached frame to start when this one stops,
-               and to stop 3003 ticks later -- a normal 29.97fps
-               length frame. If it needs to be extended as well to
-               make up lost time, it'll be handled on the next
-               loop through the renderer.                            */
+            /* Set the 3rd cached frame to start when this one stops.
+               If it needs to be extended as well to make up lost time,
+               it'll be handled on the next loop through the renderer.  */
             int temp_duration = pv->last_stop[2] - pv->last_start[2];
             pv->last_start[2] = ivtc_buffer->stop;
             pv->last_stop[2] = ivtc_buffer->stop + temp_duration;
             
-            pv->frames_to_extend--;
+            /* We've pulled the 4th value from the lost_time array
+               and added it to the outgoing frame. Now, rotate the
+               array so the 4th slot now hold's the 3rd's value, and
+               so on down the line, and set the 0 indez to a value of 0. */
+            int i;
+            for( i=2; i >=  0; i--)
+            {
+                pv->lost_time[i+1] = pv->lost_time[i];
+            }
+            pv->lost_time[0] = 0;
+            
             pv->extended_frames++;
         }
 
@@ -431,11 +458,11 @@
 void renderClose( hb_work_object_t * w )
 {
     hb_work_private_t * pv = w->private_data;   
-        
-    hb_log("render: dropped frames: %i (%i ticks)", pv->dropped_frames, (pv->dropped_frames * 3003) );
-    hb_log("render: extended frames: %i (%i ticks)", pv->extended_frames, ( ( pv->extended_frames / 4 ) * 3003 ) );
-    hb_log("render: Lost time: %i frames (%i ticks)", (pv->dropped_frames * 4) - (pv->extended_frames), (pv->dropped_frames * 3003) - ( ( pv->extended_frames / 4 ) * 3003 ) );
 
+    hb_log("render: lost time: %lld (%i frames)", pv->total_lost_time, pv->dropped_frames);
+    hb_log("render: gained time: %lld (%i frames) (%lld not accounted for)", pv->total_gained_time, pv->extended_frames, pv->total_lost_time - pv->total_gained_time);
+    hb_log("render: average dropped frame duration: %i", pv->total_lost_time / pv->dropped_frames);
+    
     /* Cleanup subtitle queue */
     if( pv->subtitle_queue )
     {
@@ -476,12 +503,17 @@
     /* Setup FIFO queue for subtitle cache */
     pv->subtitle_queue = hb_fifo_init( 8 );    
     pv->delay_queue = hb_fifo_init( 8 );
-    pv->frames_to_extend = 0;
     pv->dropped_frames = 0;
     pv->extended_frames = 0;
     pv->last_start[0] = 0;
     pv->last_stop[0] = 0;
-    
+    pv->total_lost_time = 0;
+    pv->total_gained_time = 0;
+    pv->lost_time[0] = 0;
+    pv->lost_time[1] = 0;
+    pv->lost_time[2] = 0;
+    pv->lost_time[3] = 0;
+
     /* Setup filters */
     /* TODO: Move to work.c? */
     if( job->filters )
This encodes the Futurama pilot fine, and does a passable job on rhester's ipod-killer vob (the beginning of From the Earth to the Moon). NightStorm has successfully tested it on some other Futurama eps.

I'd still like it if some more people tested it before I checked it in. Cavalicious, this should work well on those mixed interlaced/progressive streams I was warning you off of in prior weeks. Just maybe a really quick stutter as it goes from one to the other.

And remember: since van's b-frame patch from earlier in this thread hasn't been committed yet, this will still produce very bad results if you use VFR with b-frames.
cvk_b
Veteran User
Posts: 527
Joined: Sun Mar 18, 2007 2:11 am

applied

Post by cvk_b »

music video before

Code: Select all

[14:28:21] render: dropped frames: 204 (612612 ticks)
[14:28:21] render: extended frames: 816 (612612 ticks)
[14:28:21] render: Lost time: 0 frames (0 ticks)
music video after

Code: Select all

[10:05:42] render: lost time: 626151 (204 frames)
[10:05:42] render: gained time: 626151 (792 frames) (0 not accounted for)
[10:05:42] render: average dropped frame duration: 0
no b-frames used. the outcome improved in this case. fwiw, I have the futurama season 1 set also.
nightstrm
Veteran User
Posts: 1887
Joined: Fri Mar 23, 2007 5:43 am

Re: "Same as source" FPS Improvements

Post by nightstrm »

The latest patch (from jbrjake's post above) also fixed a/v sync issues with The Simpsons Season 1 DVDs I noticed with the earlier VFR patches. I've got more animation encodes queued up (got to encode something while I wait for Apple to show us how the AppleTV will handle anamorphic and maybe 5.1) and will post anything interesting.

None of my testing includes b-frame options (as I'm looking for multiple-device compatibility).
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

Nice results =)

As another example, I ran it on the entirely progressive film Run Ronnie Run.

Code: Select all

[13:03:22] render: lost time: 94603 (25 frames)
[13:03:22] render: gained time: 94603 (97 frames) (0 not accounted for)
[13:03:22] render: average dropped frame duration: 3784
Now, without this new patch, it would have dropped 25 frames still, but it wouldn't have summed up the true total dropped time to 94603. Instead it would have calculated the total lost time as 25*3003. So instead of extending the duration by 94603 ticks, it would have extended by 75075 ticks.

Also, note how it extended 97 frames, not 100 (25*4). That indicates pullup, once or twice, dropped more than 1 frame out of 5. The old code wouldn't have handled that, either.

Now it's time to turn again to b-frames...
sdm
Bright Spark User
Posts: 194
Joined: Mon Feb 19, 2007 4:53 pm

Re: "Same as source" FPS Improvements

Post by sdm »

good results (smooth, in-order, not combed, a/v sync) for my test source (Sopranos season 6) without b-frames. Sorry I don't have exact details like you provided.

Yes, B-frames too please!
--sdm.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

van wrote:There's a patch at http://pastebin.ca/870061. I don't have any telecined content handy to test with. I've tested with & without vfr on progressive content & it seems to work.
Well, van, I applied it, and increased the size of the initial delay (to 10000, which should be okay if the frames never get larger than 9009), and I can't say it's worked for me :(

Whenever you're testing these b-frame things, it's important to stress it out. That means using the most intensive b-frame settings possible from a frame-reordering point of view:

b-frames=16:b-pyramid:no-b-adapt

I get a blank white screen when I try to play it in QuickTime, which doesn't ever go away, even if I skip 10s to the 2nd keyframe. The audio plays fine. VLC will play a few seconds, then the video hangs while the audio plays on.

If I only encode a very short segment, it plays fine when I lower the number of bframes to 14. But if I encode a whole 8 minutes of video with 14 b-frames, it still blanks out.
van
Veteran User
Posts: 417
Joined: Wed Aug 29, 2007 6:35 am

Re: "Same as source" FPS Improvements

Post by van »

Sorry jbrjake. My simpler test case was actually doing a good job of showing this problem but I had somehow convinced myself that it was due to the vfr code so my brain was stuck in "it's not my bug" mode. As soon as I took another look with a slightly more open mind it was obvious that the vfr code was fine and, in fact, it was entirely my bug. Ugh.

The problem was that the way the h.264 & mp4 renderOffset works assumes that the video frame times are continuous and dense (i.e., frame N's start is always equal to frame N-1's stop) in *DTS* order. But the old dts_start / dts_stop arrays gave a timing that was continuous in PTS order (since the frames coming into the encoder are in presentation order & we grabbed their time stamps on input). My patch half fixed this - it mapped the durations to DTS order but used PTS ordered start times (from dts_start). The result was frames with weird overlaps and inconsistent stop/start times that caused muxmp4 to go bananas. The attached patch fixes this.

Things seem to work great with your new render code with a variety of settings for b-frames, b-pyramid & b-adapt, including "b-frames=16:b-pyramid:no-b-adapt". The only problem I observed was what looks like some kind of start-up issue in render.c. I added a debugging check in encx264.c to verify that the incoming frame times were continuous (current frame's start was the same as the previous frame's stop). I noticed that I would get this "continuity error" message in the log on the fifth frame when encoding progressive content with vfr. So I added code to render.c to print the start & stop time stamps of the incoming buffers & outgoing buffers of the first 50 frames. For an encode of "Star Wars, ep. 4" chapter 5 here are the timestamps of the first 10 frames going into render.c:

Code: Select all

[22:16:36] render-in strt 0 stop 4504 dur 4504
[22:16:36] render-in strt 4504 stop 7507 dur 3003
[22:16:36] render-in strt 7507 stop 12011 dur 4504
[22:16:36] render-in strt 12011 stop 15014 dur 3003
[22:16:36] render-in strt 15014 stop 19518 dur 4504
[22:16:36] render-in strt 19518 stop 22521 dur 3003
[22:16:36] render-in strt 22521 stop 27025 dur 4504
[22:16:37] render-in strt 27025 stop 30028 dur 3003
[22:16:37] render-in strt 30028 stop 34532 dur 4504
[22:16:37] render-in strt 34532 stop 37535 dur 3003
Here are the first 10 frames coming out of render.c:

Code: Select all

[22:16:36] render-out strt 0 stop 753 dur 753
[22:16:36] render-out strt 753 stop 6007 dur 5254
[22:16:37] render-out strt 6007 stop 11261 dur 5254
[22:16:37] render-out strt 11261 stop 15014 dur 3753
[22:16:37] render-out strt 19518 stop 22521 dur 3003
[22:16:37] render-out strt 22521 stop 27025 dur 4504
[22:16:37] render-out strt 27025 stop 30028 dur 3003
[22:16:37] render-out strt 30028 stop 34532 dur 4504
[22:16:37] render-out strt 34532 stop 37535 dur 3003
[22:16:37] render-out strt 37535 stop 42039 dur 4504
The input is a simple alternation of 3003 & 4504 (for an average of 3753.5). The first three frames output have strange durations, the fourth has the 3753 duration then the fifth input frame (starting time 15014) gets dropped. Starting with the fifth frame output (starting time 19518) through the end of the encode, all the output times are exactly the same as the input times (which is what I would expect for progressive content with your new code). So the only issue seems to be with the first five frames.

Anyway, here's the new patch:

http://pastebin.ca/887364

Code: Select all

Index: libhb/encx264.c
===================================================================
--- libhb/encx264.c	(revision 1244)
+++ libhb/encx264.c	(working copy)
@@ -25,6 +25,24 @@
 
 #define DTS_BUFFER_SIZE 32 
 
+/*
+ * The frame info struct remembers information about each frame across calls
+ * to x264_encoder_encode. Since frames are uniquely identified by their
+ * timestamp, we use some bits of the timestamp as an index. The LSB is
+ * chosen so that two successive frames will have different values in the
+ * bits over any plausible range of frame rates. (Starting with bit 9 allows
+ * any frame rate slower than 175fps.) The MSB determines the size of the array.
+ * It is chosen so that two frames can't use the same slot during the 
+ * encoder's max frame delay (set by the standard as 16 frames) and so
+ * that, up to some minimum frame rate, frames are guaranteed to map to
+ * different slots. (An MSB of 16 which is 2^(16-9+1) = 256 slots guarantees
+ * no collisions down to a rate of 1.4 fps).
+ */
+#define FRAME_INFO_MAX2 (9)     // 2^9 = 512; 90000/512 = 175 frames/sec
+#define FRAME_INFO_MIN2 (16)    // 2^16 = 65536; 90000/65536 = 1.4 frames/sec
+#define FRAME_INFO_SIZE (1 << (FRAME_INFO_MIN2 - FRAME_INFO_MAX2 + 1))
+#define FRAME_INFO_MASK (FRAME_INFO_SIZE - 1)
+
 struct hb_work_private_s
 {
     hb_job_t       * job;
@@ -32,15 +50,15 @@
     x264_picture_t   pic_in;
     uint8_t         *x264_allocated_pic;
 
-    // Internal queue of DTS start/stop values.
-    int64_t        dts_start[DTS_BUFFER_SIZE];
-    int64_t        dts_stop[DTS_BUFFER_SIZE];
+    int64_t        dts_next;    // DTS start time value for next output frame
+    int64_t        last_stop;   // Debugging - stop time of previous input frame
     int64_t        init_delay;
-
-    int64_t        dts_write_index;
-    int64_t        dts_read_index;
     int64_t        next_chap;
 
+    struct {
+        int64_t duration;
+    } frame_info[FRAME_INFO_SIZE];
+
     char             filename[1024];
 };
 
@@ -247,8 +265,7 @@
 
     pv->x264_allocated_pic = pv->pic_in.img.plane[0];
 
-    pv->dts_write_index = 0;
-    pv->dts_read_index = 0;
+    pv->dts_next = -1;
     pv->next_chap = 0;
 
     if (job->areBframes)
@@ -268,7 +285,7 @@
            will use the duration of frames running at 23.976fps instead.. */
         if (job->vfr)
         {
-            pv->init_delay = 3754;
+            pv->init_delay = 7506;
         }
    
         /* The delay is 2 frames for regular b-frames, 3 for b-pyramid.
@@ -296,6 +313,22 @@
     /* TODO */
 }
 
+/*
+ * see comments in definition of 'frame_info' in pv struct for description
+ * of what these routines are doing.
+ */
+static void save_frame_info( hb_work_private_t * pv, hb_buffer_t * in )
+{
+    int i = (in->start >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    pv->frame_info[i].duration = in->stop - in->start;
+}
+
+static int64_t get_frame_duration( hb_work_private_t * pv, int64_t pts )
+{
+    int i = (pts >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    return pv->frame_info[i].duration;
+}
+
 int encx264Work( hb_work_object_t * w, hb_buffer_t ** buf_in,
                   hb_buffer_t ** buf_out )
 {
@@ -330,6 +363,16 @@
                 job->height / 4;
         }
 
+        if( pv->dts_next == -1 )
+        {
+            /* we don't have a start time yet so use the first frame's 
+             * start. All other frame times will be determined by the
+             * sum of the prior output frame durations in *DTS* order
+             * (not by the order they arrive here). This timing change is
+             * essential for VFR with b-frames, but a complete nop * otherwise.
+             */
+            pv->dts_next = in->start;
+        }
         if( in->new_chap && job->chapter_markers )
         {
             /* chapters have to start with an IDR frame so request that this
@@ -351,11 +394,22 @@
         }
         pv->pic_in.i_qpplus1 = 0;
 
-        // Remember current PTS value, use as DTS later
-        pv->dts_start[pv->dts_write_index & (DTS_BUFFER_SIZE-1)] = in->start;
-        pv->dts_stop[pv->dts_write_index & (DTS_BUFFER_SIZE-1)]  = in->stop;
-        pv->dts_write_index++;
+        /* XXX this is temporary debugging code to check that the upstream
+         * modules (render & sync) have generated a continuous, self-consistent
+         * frame stream with the current frame's start time equal to the
+         * previous frame's stop time.
+         */
+        if( pv->last_stop != in->start )
+        {
+            hb_log("encx264 input continuity err: last stop %lld  start %lld",
+                    pv->last_stop, in->start);
+        }
+        pv->last_stop = in->stop;
 
+        // Remember info about this frame that we need to pass across 
+        // the x264_encoder_encode call (since it reorders frames).
+        save_frame_info( pv, in );
+
         /* Feed the input DTS to x264 so it can figure out proper output PTS */
         pv->pic_in.i_pts = in->start;
 
@@ -386,17 +440,15 @@
         /* Should be way too large */
         buf        = hb_buffer_init( 3 * job->width * job->height / 2 );
         buf->size  = 0;
-        buf->start = in->start;
-        buf->stop  = in->stop;
         buf->frametype   = 0;
 
-        int64_t dts_start, dts_stop;
-
         /* Get next DTS value to use */
-        dts_start = pv->dts_start[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        dts_stop  = pv->dts_stop[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        pv->dts_read_index++;
+        int64_t dts_start = pv->dts_next;
 
+        /* compute the stop time based on the original frame's duration */
+        int64_t dts_stop  = dts_start + get_frame_duration( pv, pic_out.i_pts );
+        pv->dts_next = dts_stop;
+
         for( i = 0; i < i_nal; i++ )
         {
             int size, data;
@@ -469,11 +521,7 @@
 
                     /* Store the output presentation time stamp
                        from x264 for use by muxmp4 in off-setting
-                       b-frames with the CTTS atom.
-                       For now, just add 1000000 to the offset so that the
-                       value is pretty much guaranteed to be positive.  The
-                       muxing code will minimize the renderOffset at the end. */
-
+                       b-frames with the CTTS atom. */
                     buf->renderOffset = pic_out.i_pts - dts_start + pv->init_delay;
 
                     /* Send out the next dts values */

jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

Awesome, van. New patch works great =)

So, for testing purposes, here's a two-fer-one combo: van's bframe changes and my changes to the vfr render code:
http://pastebin.ca/887722

Code: Select all

Index: libhb/encx264.c
===================================================================
--- libhb/encx264.c	(revision 1244)
+++ libhb/encx264.c	(working copy)
@@ -25,6 +25,24 @@
 
 #define DTS_BUFFER_SIZE 32 
 
+/*
+ * The frame info struct remembers information about each frame across calls
+ * to x264_encoder_encode. Since frames are uniquely identified by their
+ * timestamp, we use some bits of the timestamp as an index. The LSB is
+ * chosen so that two successive frames will have different values in the
+ * bits over any plausible range of frame rates. (Starting with bit 9 allows
+ * any frame rate slower than 175fps.) The MSB determines the size of the array.
+ * It is chosen so that two frames can't use the same slot during the 
+ * encoder's max frame delay (set by the standard as 16 frames) and so
+ * that, up to some minimum frame rate, frames are guaranteed to map to
+ * different slots. (An MSB of 16 which is 2^(16-9+1) = 256 slots guarantees
+ * no collisions down to a rate of 1.4 fps).
+ */
+#define FRAME_INFO_MAX2 (9)     // 2^9 = 512; 90000/512 = 175 frames/sec
+#define FRAME_INFO_MIN2 (16)    // 2^16 = 65536; 90000/65536 = 1.4 frames/sec
+#define FRAME_INFO_SIZE (1 << (FRAME_INFO_MIN2 - FRAME_INFO_MAX2 + 1))
+#define FRAME_INFO_MASK (FRAME_INFO_SIZE - 1)
+
 struct hb_work_private_s
 {
     hb_job_t       * job;
@@ -32,15 +50,15 @@
     x264_picture_t   pic_in;
     uint8_t         *x264_allocated_pic;
 
-    // Internal queue of DTS start/stop values.
-    int64_t        dts_start[DTS_BUFFER_SIZE];
-    int64_t        dts_stop[DTS_BUFFER_SIZE];
+    int64_t        dts_next;    // DTS start time value for next output frame
+    int64_t        last_stop;   // Debugging - stop time of previous input frame
     int64_t        init_delay;
-
-    int64_t        dts_write_index;
-    int64_t        dts_read_index;
     int64_t        next_chap;
 
+    struct {
+        int64_t duration;
+    } frame_info[FRAME_INFO_SIZE];
+
     char             filename[1024];
 };
 
@@ -247,8 +265,7 @@
 
     pv->x264_allocated_pic = pv->pic_in.img.plane[0];
 
-    pv->dts_write_index = 0;
-    pv->dts_read_index = 0;
+    pv->dts_next = -1;
     pv->next_chap = 0;
 
     if (job->areBframes)
@@ -268,7 +285,7 @@
            will use the duration of frames running at 23.976fps instead.. */
         if (job->vfr)
         {
-            pv->init_delay = 3754;
+            pv->init_delay = 7506;
         }
    
         /* The delay is 2 frames for regular b-frames, 3 for b-pyramid.
@@ -296,6 +313,22 @@
     /* TODO */
 }
 
+/*
+ * see comments in definition of 'frame_info' in pv struct for description
+ * of what these routines are doing.
+ */
+static void save_frame_info( hb_work_private_t * pv, hb_buffer_t * in )
+{
+    int i = (in->start >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    pv->frame_info[i].duration = in->stop - in->start;
+}
+
+static int64_t get_frame_duration( hb_work_private_t * pv, int64_t pts )
+{
+    int i = (pts >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    return pv->frame_info[i].duration;
+}
+
 int encx264Work( hb_work_object_t * w, hb_buffer_t ** buf_in,
                   hb_buffer_t ** buf_out )
 {
@@ -330,6 +363,16 @@
                 job->height / 4;
         }
 
+        if( pv->dts_next == -1 )
+        {
+            /* we don't have a start time yet so use the first frame's 
+             * start. All other frame times will be determined by the
+             * sum of the prior output frame durations in *DTS* order
+             * (not by the order they arrive here). This timing change is
+             * essential for VFR with b-frames, but a complete nop * otherwise.
+             */
+            pv->dts_next = in->start;
+        }
         if( in->new_chap && job->chapter_markers )
         {
             /* chapters have to start with an IDR frame so request that this
@@ -351,11 +394,22 @@
         }
         pv->pic_in.i_qpplus1 = 0;
 
-        // Remember current PTS value, use as DTS later
-        pv->dts_start[pv->dts_write_index & (DTS_BUFFER_SIZE-1)] = in->start;
-        pv->dts_stop[pv->dts_write_index & (DTS_BUFFER_SIZE-1)]  = in->stop;
-        pv->dts_write_index++;
+        /* XXX this is temporary debugging code to check that the upstream
+         * modules (render & sync) have generated a continuous, self-consistent
+         * frame stream with the current frame's start time equal to the
+         * previous frame's stop time.
+         */
+        if( pv->last_stop != in->start )
+        {
+            hb_log("encx264 input continuity err: last stop %lld  start %lld",
+                    pv->last_stop, in->start);
+        }
+        pv->last_stop = in->stop;
 
+        // Remember info about this frame that we need to pass across 
+        // the x264_encoder_encode call (since it reorders frames).
+        save_frame_info( pv, in );
+
         /* Feed the input DTS to x264 so it can figure out proper output PTS */
         pv->pic_in.i_pts = in->start;
 
@@ -386,17 +440,15 @@
         /* Should be way too large */
         buf        = hb_buffer_init( 3 * job->width * job->height / 2 );
         buf->size  = 0;
-        buf->start = in->start;
-        buf->stop  = in->stop;
         buf->frametype   = 0;
 
-        int64_t dts_start, dts_stop;
-
         /* Get next DTS value to use */
-        dts_start = pv->dts_start[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        dts_stop  = pv->dts_stop[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        pv->dts_read_index++;
+        int64_t dts_start = pv->dts_next;
 
+        /* compute the stop time based on the original frame's duration */
+        int64_t dts_stop  = dts_start + get_frame_duration( pv, pic_out.i_pts );
+        pv->dts_next = dts_stop;
+
         for( i = 0; i < i_nal; i++ )
         {
             int size, data;
@@ -469,11 +521,7 @@
 
                     /* Store the output presentation time stamp
                        from x264 for use by muxmp4 in off-setting
-                       b-frames with the CTTS atom.
-                       For now, just add 1000000 to the offset so that the
-                       value is pretty much guaranteed to be positive.  The
-                       muxing code will minimize the renderOffset at the end. */
-
+                       b-frames with the CTTS atom. */
                     buf->renderOffset = pic_out.i_pts - dts_start + pv->init_delay;
 
                     /* Send out the next dts values */
Index: libhb/render.c
===================================================================
--- libhb/render.c	(revision 1244)
+++ libhb/render.c	(working copy)
@@ -20,11 +20,13 @@
     hb_buffer_t        * buf_scale;
     hb_fifo_t          * subtitle_queue;
     hb_fifo_t          * delay_queue;
-    int                  frames_to_extend;
     int                  dropped_frames;
     int                  extended_frames;
     uint64_t             last_start[4];
     uint64_t             last_stop[4];
+    uint64_t             lost_time[4];
+    uint64_t             total_lost_time;
+    uint64_t             total_gained_time;
 };
 
 int  renderInit( hb_work_object_t *, hb_job_t * );
@@ -280,9 +282,22 @@
             {
                 if( job->vfr )
                 {
-                    pv->frames_to_extend += 4;
+                    /* We need to compensate for the time lost by dropping this frame.
+                       Spread its duration out in quarters, because usually dropped frames
+                       maintain a 1-out-of-5 pattern and this spreads it out amongst the remaining ones.
+                       Store these in the lost_time array, which has 4 slots in it.
+                       Because not every frame duration divides evenly by 4, and we can't lose the
+                       remainder, we have to go through an awkward process to preserve it in the 4th array index. */
+                    uint64_t temp_duration = buf_tmp_out->stop - buf_tmp_out->start;
+                    pv->lost_time[0] += (temp_duration / 4);
+                    pv->lost_time[1] += (temp_duration / 4);
+                    pv->lost_time[2] += (temp_duration / 4);
+                    pv->lost_time[3] += ( temp_duration - (temp_duration / 4) - (temp_duration / 4) - (temp_duration / 4) );
+                    
+                    pv->total_lost_time += temp_duration;
                     pv->dropped_frames++;
-                    hb_fifo_get( pv->subtitle_queue );
+                    
+                    hb_fifo_get( pv->subtitle_queue );                    
                     buf_tmp_in = NULL;
                 }
                 else
@@ -385,13 +400,13 @@
 
     if( *buf_out )
     {
-        if( pv->frames_to_extend )
+        if( pv->lost_time[3] > 0 )
         {
             /*
              * A frame's been dropped by VFR detelecine.
              * Gotta make up the lost time. This will also
-             * slow down the video to 23.976fps.
-             * The dropped frame ran for 3003 ticks, so
+             * slow down the video.
+             * The dropped frame's has to be accounted for, so
              * divvy it up amongst the 4 frames left behind.
              * This is what the delay_queue is for;
              * telecined sequences start 2 frames before
@@ -401,25 +416,37 @@
              */
             ivtc_buffer = *buf_out;
             
-            /* The 4th cached frame will be the to use. */
+            /* We can't use the given time stamps. Previous frames
+               might already have been extended, throwing off the
+               raw values fed to render.c. Instead, their
+               stop and start times are stored in arrays.
+               The 4th cached frame will be the to use. */
             ivtc_buffer->start = pv->last_start[3];
             ivtc_buffer->stop = pv->last_stop[3];
+           
+           /* Extend the frame's duration by the value in the
+              4th slot of the lost_time array. */
+            ivtc_buffer->stop += pv->lost_time[3];
+            pv->total_gained_time += pv->lost_time[3];
 
-            if (pv->frames_to_extend % 4)
-                ivtc_buffer->stop += 751;
-            else
-                ivtc_buffer->stop += 750;
-            
-            /* Set the 3rd cached frame to start when this one stops,
-               and to stop 3003 ticks later -- a normal 29.97fps
-               length frame. If it needs to be extended as well to
-               make up lost time, it'll be handled on the next
-               loop through the renderer.                            */
+            /* Set the 3rd cached frame to start when this one stops.
+               If it needs to be extended as well to make up lost time,
+               it'll be handled on the next loop through the renderer.  */
             int temp_duration = pv->last_stop[2] - pv->last_start[2];
             pv->last_start[2] = ivtc_buffer->stop;
             pv->last_stop[2] = ivtc_buffer->stop + temp_duration;
             
-            pv->frames_to_extend--;
+            /* We've pulled the 4th value from the lost_time array
+               and added it to the outgoing frame. Now, rotate the
+               array so the 4th slot now hold's the 3rd's value, and
+               so on down the line, and set the 0 indez to a value of 0. */
+            int i;
+            for( i=2; i >=  0; i--)
+            {
+                pv->lost_time[i+1] = pv->lost_time[i];
+            }
+            pv->lost_time[0] = 0;
+            
             pv->extended_frames++;
         }
 
@@ -431,11 +458,11 @@
 void renderClose( hb_work_object_t * w )
 {
     hb_work_private_t * pv = w->private_data;   
-        
-    hb_log("render: dropped frames: %i (%i ticks)", pv->dropped_frames, (pv->dropped_frames * 3003) );
-    hb_log("render: extended frames: %i (%i ticks)", pv->extended_frames, ( ( pv->extended_frames / 4 ) * 3003 ) );
-    hb_log("render: Lost time: %i frames (%i ticks)", (pv->dropped_frames * 4) - (pv->extended_frames), (pv->dropped_frames * 3003) - ( ( pv->extended_frames / 4 ) * 3003 ) );
 
+    hb_log("render: lost time: %lld (%i frames)", pv->total_lost_time, pv->dropped_frames);
+    hb_log("render: gained time: %lld (%i frames) (%lld not accounted for)", pv->total_gained_time, pv->extended_frames, pv->total_lost_time - pv->total_gained_time);
+    hb_log("render: average dropped frame duration: %lld", (pv->total_lost_time / pv->dropped_frames) );
+    
     /* Cleanup subtitle queue */
     if( pv->subtitle_queue )
     {
@@ -476,12 +503,17 @@
     /* Setup FIFO queue for subtitle cache */
     pv->subtitle_queue = hb_fifo_init( 8 );    
     pv->delay_queue = hb_fifo_init( 8 );
-    pv->frames_to_extend = 0;
     pv->dropped_frames = 0;
     pv->extended_frames = 0;
     pv->last_start[0] = 0;
     pv->last_stop[0] = 0;
-    
+    pv->total_lost_time = 0;
+    pv->total_gained_time = 0;
+    pv->lost_time[0] = 0;
+    pv->lost_time[1] = 0;
+    pv->lost_time[2] = 0;
+    pv->lost_time[3] = 0;
+
     /* Setup filters */
     /* TODO: Move to work.c? */
     if( job->filters )
Now, that should be the holy grail right there: one-touch vfr that correctly handles both hard *and* soft telecine, including heavily mixed content, while also allowing the heaviest frame-reordering that h.264 allows. =)
nightstrm
Veteran User
Posts: 1887
Joined: Fri Mar 23, 2007 5:43 am

Re: "Same as source" FPS Improvements

Post by nightstrm »

Awesome, I'll check it out when I get home.

I ran the last patch through some more heavily-mixed animation content (Simpsons, Season 2 - very reminiscent of the first Futurama episode), and it appears to have performed very well. I'm still going through the encodes, and will post if I find something out of the ordinary.
sdm
Bright Spark User
Posts: 194
Joined: Mon Feb 19, 2007 4:53 pm

Re: "Same as source" FPS Improvements

Post by sdm »

I think this is relevant testing on my part, but let me know if not.

I applied the new patch and tested with detlecine+vfr+b-frames on the same source as previous in this thread.

The result is that the frames-out-of-order issue has moved further in by a few.

before:
Image

after patch:
Image
notice the frame order is wrong, but in a different place. And notice the frame numbering has 16899 twice and no 16900

Hope this helps
Thanks
--sdm.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

Okay, I think I've fixed those timestamp continuity errors you were seeing, van. Stupidity on my part in how I was keeping track of durations in render.c.

Here's an updated holistic patch:
http://pastebin.ca/887931

Code: Select all

Index: libhb/encx264.c
===================================================================
--- libhb/encx264.c	(revision 1244)
+++ libhb/encx264.c	(working copy)
@@ -25,6 +25,24 @@
 
 #define DTS_BUFFER_SIZE 32 
 
+/*
+ * The frame info struct remembers information about each frame across calls
+ * to x264_encoder_encode. Since frames are uniquely identified by their
+ * timestamp, we use some bits of the timestamp as an index. The LSB is
+ * chosen so that two successive frames will have different values in the
+ * bits over any plausible range of frame rates. (Starting with bit 9 allows
+ * any frame rate slower than 175fps.) The MSB determines the size of the array.
+ * It is chosen so that two frames can't use the same slot during the 
+ * encoder's max frame delay (set by the standard as 16 frames) and so
+ * that, up to some minimum frame rate, frames are guaranteed to map to
+ * different slots. (An MSB of 16 which is 2^(16-9+1) = 256 slots guarantees
+ * no collisions down to a rate of 1.4 fps).
+ */
+#define FRAME_INFO_MAX2 (9)     // 2^9 = 512; 90000/512 = 175 frames/sec
+#define FRAME_INFO_MIN2 (16)    // 2^16 = 65536; 90000/65536 = 1.4 frames/sec
+#define FRAME_INFO_SIZE (1 << (FRAME_INFO_MIN2 - FRAME_INFO_MAX2 + 1))
+#define FRAME_INFO_MASK (FRAME_INFO_SIZE - 1)
+
 struct hb_work_private_s
 {
     hb_job_t       * job;
@@ -32,15 +50,15 @@
     x264_picture_t   pic_in;
     uint8_t         *x264_allocated_pic;
 
-    // Internal queue of DTS start/stop values.
-    int64_t        dts_start[DTS_BUFFER_SIZE];
-    int64_t        dts_stop[DTS_BUFFER_SIZE];
+    int64_t        dts_next;    // DTS start time value for next output frame
+    int64_t        last_stop;   // Debugging - stop time of previous input frame
     int64_t        init_delay;
-
-    int64_t        dts_write_index;
-    int64_t        dts_read_index;
     int64_t        next_chap;
 
+    struct {
+        int64_t duration;
+    } frame_info[FRAME_INFO_SIZE];
+
     char             filename[1024];
 };
 
@@ -247,8 +265,7 @@
 
     pv->x264_allocated_pic = pv->pic_in.img.plane[0];
 
-    pv->dts_write_index = 0;
-    pv->dts_read_index = 0;
+    pv->dts_next = -1;
     pv->next_chap = 0;
 
     if (job->areBframes)
@@ -268,7 +285,7 @@
            will use the duration of frames running at 23.976fps instead.. */
         if (job->vfr)
         {
-            pv->init_delay = 3754;
+            pv->init_delay = 7506;
         }
    
         /* The delay is 2 frames for regular b-frames, 3 for b-pyramid.
@@ -296,6 +313,22 @@
     /* TODO */
 }
 
+/*
+ * see comments in definition of 'frame_info' in pv struct for description
+ * of what these routines are doing.
+ */
+static void save_frame_info( hb_work_private_t * pv, hb_buffer_t * in )
+{
+    int i = (in->start >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    pv->frame_info[i].duration = in->stop - in->start;
+}
+
+static int64_t get_frame_duration( hb_work_private_t * pv, int64_t pts )
+{
+    int i = (pts >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    return pv->frame_info[i].duration;
+}
+
 int encx264Work( hb_work_object_t * w, hb_buffer_t ** buf_in,
                   hb_buffer_t ** buf_out )
 {
@@ -330,6 +363,16 @@
                 job->height / 4;
         }
 
+        if( pv->dts_next == -1 )
+        {
+            /* we don't have a start time yet so use the first frame's 
+             * start. All other frame times will be determined by the
+             * sum of the prior output frame durations in *DTS* order
+             * (not by the order they arrive here). This timing change is
+             * essential for VFR with b-frames, but a complete nop * otherwise.
+             */
+            pv->dts_next = in->start;
+        }
         if( in->new_chap && job->chapter_markers )
         {
             /* chapters have to start with an IDR frame so request that this
@@ -351,11 +394,22 @@
         }
         pv->pic_in.i_qpplus1 = 0;
 
-        // Remember current PTS value, use as DTS later
-        pv->dts_start[pv->dts_write_index & (DTS_BUFFER_SIZE-1)] = in->start;
-        pv->dts_stop[pv->dts_write_index & (DTS_BUFFER_SIZE-1)]  = in->stop;
-        pv->dts_write_index++;
+        /* XXX this is temporary debugging code to check that the upstream
+         * modules (render & sync) have generated a continuous, self-consistent
+         * frame stream with the current frame's start time equal to the
+         * previous frame's stop time.
+         */
+        if( pv->last_stop != in->start )
+        {
+            hb_log("encx264 input continuity err: last stop %lld  start %lld",
+                    pv->last_stop, in->start);
+        }
+        pv->last_stop = in->stop;
 
+        // Remember info about this frame that we need to pass across 
+        // the x264_encoder_encode call (since it reorders frames).
+        save_frame_info( pv, in );
+
         /* Feed the input DTS to x264 so it can figure out proper output PTS */
         pv->pic_in.i_pts = in->start;
 
@@ -386,17 +440,15 @@
         /* Should be way too large */
         buf        = hb_buffer_init( 3 * job->width * job->height / 2 );
         buf->size  = 0;
-        buf->start = in->start;
-        buf->stop  = in->stop;
         buf->frametype   = 0;
 
-        int64_t dts_start, dts_stop;
-
         /* Get next DTS value to use */
-        dts_start = pv->dts_start[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        dts_stop  = pv->dts_stop[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        pv->dts_read_index++;
+        int64_t dts_start = pv->dts_next;
 
+        /* compute the stop time based on the original frame's duration */
+        int64_t dts_stop  = dts_start + get_frame_duration( pv, pic_out.i_pts );
+        pv->dts_next = dts_stop;
+
         for( i = 0; i < i_nal; i++ )
         {
             int size, data;
@@ -469,11 +521,7 @@
 
                     /* Store the output presentation time stamp
                        from x264 for use by muxmp4 in off-setting
-                       b-frames with the CTTS atom.
-                       For now, just add 1000000 to the offset so that the
-                       value is pretty much guaranteed to be positive.  The
-                       muxing code will minimize the renderOffset at the end. */
-
+                       b-frames with the CTTS atom. */
                     buf->renderOffset = pic_out.i_pts - dts_start + pv->init_delay;
 
                     /* Send out the next dts values */
Index: libhb/render.c
===================================================================
--- libhb/render.c	(revision 1244)
+++ libhb/render.c	(working copy)
@@ -20,11 +20,13 @@
     hb_buffer_t        * buf_scale;
     hb_fifo_t          * subtitle_queue;
     hb_fifo_t          * delay_queue;
-    int                  frames_to_extend;
     int                  dropped_frames;
     int                  extended_frames;
     uint64_t             last_start[4];
     uint64_t             last_stop[4];
+    uint64_t             lost_time[4];
+    uint64_t             total_lost_time;
+    uint64_t             total_gained_time;
 };
 
 int  renderInit( hb_work_object_t *, hb_job_t * );
@@ -280,9 +282,22 @@
             {
                 if( job->vfr )
                 {
-                    pv->frames_to_extend += 4;
+                    /* We need to compensate for the time lost by dropping this frame.
+                       Spread its duration out in quarters, because usually dropped frames
+                       maintain a 1-out-of-5 pattern and this spreads it out amongst the remaining ones.
+                       Store these in the lost_time array, which has 4 slots in it.
+                       Because not every frame duration divides evenly by 4, and we can't lose the
+                       remainder, we have to go through an awkward process to preserve it in the 4th array index. */
+                    uint64_t temp_duration = buf_tmp_out->stop - buf_tmp_out->start;
+                    pv->lost_time[0] += (temp_duration / 4);
+                    pv->lost_time[1] += (temp_duration / 4);
+                    pv->lost_time[2] += (temp_duration / 4);
+                    pv->lost_time[3] += ( temp_duration - (temp_duration / 4) - (temp_duration / 4) - (temp_duration / 4) );
+                    
+                    pv->total_lost_time += temp_duration;
                     pv->dropped_frames++;
-                    hb_fifo_get( pv->subtitle_queue );
+                    
+                    hb_fifo_get( pv->subtitle_queue );                    
                     buf_tmp_in = NULL;
                 }
                 else
@@ -304,8 +319,11 @@
             pv->last_start[i] = pv->last_start[i-1];
             pv->last_stop[i] = pv->last_stop[i-1];
         }
-        pv->last_start[0] = in->start;
-        pv->last_stop[0] = in->stop;
+        
+        /* In order to make sure we have continuous time stamps, store
+           the current frame's duration as starting when the last one stopped. */
+        pv->last_start[0] = pv->last_stop[1];
+        pv->last_stop[0] = pv->last_start[0] + (in->stop - in->start);
     }
     
     /* Apply subtitles */
@@ -385,13 +403,17 @@
 
     if( *buf_out )
     {
-        if( pv->frames_to_extend )
+        /* The current frame exists. That means it hasn't been dropped by a filter.
+           Make it accessible as ivtc_buffer so we can edit its duration if needed. */
+        ivtc_buffer = *buf_out;
+
+        if( pv->lost_time[3] > 0 )
         {
             /*
-             * A frame's been dropped by VFR detelecine.
+             * A frame's been dropped earlier by VFR detelecine.
              * Gotta make up the lost time. This will also
-             * slow down the video to 23.976fps.
-             * The dropped frame ran for 3003 ticks, so
+             * slow down the video.
+             * The dropped frame's has to be accounted for, so
              * divvy it up amongst the 4 frames left behind.
              * This is what the delay_queue is for;
              * telecined sequences start 2 frames before
@@ -399,29 +421,53 @@
              * ones you need a 2 frame delay between
              * reading input and writing output.
              */
-            ivtc_buffer = *buf_out;
+                                   
+            /* We want to extend the outputted frame's duration by the value
+              stored in the 4th slot of the lost_time array. Because we need
+              to adjust all the values in the array so they're contiguous,
+              extend the duration inside the array first, before applying
+              it to the current frame buffer. */
+            pv->last_stop[3] += pv->lost_time[3];
             
-            /* The 4th cached frame will be the to use. */
-            ivtc_buffer->start = pv->last_start[3];
-            ivtc_buffer->stop = pv->last_stop[3];
-
-            if (pv->frames_to_extend % 4)
-                ivtc_buffer->stop += 751;
-            else
-                ivtc_buffer->stop += 750;
+            /* Log how much time has been added back in to the video. */
+            pv->total_gained_time += pv->lost_time[3];
             
-            /* Set the 3rd cached frame to start when this one stops,
-               and to stop 3003 ticks later -- a normal 29.97fps
-               length frame. If it needs to be extended as well to
-               make up lost time, it'll be handled on the next
-               loop through the renderer.                            */
-            int temp_duration = pv->last_stop[2] - pv->last_start[2];
-            pv->last_start[2] = ivtc_buffer->stop;
-            pv->last_stop[2] = ivtc_buffer->stop + temp_duration;
+            /* We've pulled the 4th value from the lost_time array
+               and added it to the last_stop array's 4th slot. Now, rotate the
+                lost_time array so the 4th slot now holds the 3rd's value, and
+               so on down the line, and set the 0 index to a value of 0. */
+            int i;
+            for( i=2; i >=  0; i--)
+            {
+                pv->lost_time[i+1] = pv->lost_time[i];
+            }
+            pv->lost_time[0] = 0;
             
-            pv->frames_to_extend--;
+            /* Log how many frames have had their durations extended. */
             pv->extended_frames++;
         }
+        
+        /* We can't use the given time stamps. Previous frames
+           might already have been extended, throwing off the
+           raw values fed to render.c. Instead, their
+           stop and start times are stored in arrays.
+           The 4th cached frame will be the to use.
+           If it needed its duration extended to make up
+           lost time, it will have happened above. */
+        ivtc_buffer->start = pv->last_start[3];
+        ivtc_buffer->stop = pv->last_stop[3];
+        
+        /* Set the 3rd cached frame to start when this one stops,
+           and so on down the line. If any of them need to be
+           extended as well to make up lost time, it'll be handled
+           on the next loop through the renderer.  */
+        int i;   
+        for (i = 2; i >= 0; i--)
+        {
+            int temp_duration = pv->last_stop[i] - pv->last_start[i];
+            pv->last_start[i] = pv->last_stop[i+1];
+            pv->last_stop[i] = pv->last_start[i] + temp_duration;
+        }
 
     }
 
@@ -431,11 +477,11 @@
 void renderClose( hb_work_object_t * w )
 {
     hb_work_private_t * pv = w->private_data;   
-        
-    hb_log("render: dropped frames: %i (%i ticks)", pv->dropped_frames, (pv->dropped_frames * 3003) );
-    hb_log("render: extended frames: %i (%i ticks)", pv->extended_frames, ( ( pv->extended_frames / 4 ) * 3003 ) );
-    hb_log("render: Lost time: %i frames (%i ticks)", (pv->dropped_frames * 4) - (pv->extended_frames), (pv->dropped_frames * 3003) - ( ( pv->extended_frames / 4 ) * 3003 ) );
 
+    hb_log("render: lost time: %lld (%i frames)", pv->total_lost_time, pv->dropped_frames);
+    hb_log("render: gained time: %lld (%i frames) (%lld not accounted for)", pv->total_gained_time, pv->extended_frames, pv->total_lost_time - pv->total_gained_time);
+    hb_log("render: average dropped frame duration: %lld", (pv->total_lost_time / pv->dropped_frames) );
+    
     /* Cleanup subtitle queue */
     if( pv->subtitle_queue )
     {
@@ -476,12 +522,17 @@
     /* Setup FIFO queue for subtitle cache */
     pv->subtitle_queue = hb_fifo_init( 8 );    
     pv->delay_queue = hb_fifo_init( 8 );
-    pv->frames_to_extend = 0;
     pv->dropped_frames = 0;
     pv->extended_frames = 0;
     pv->last_start[0] = 0;
     pv->last_stop[0] = 0;
-    
+    pv->total_lost_time = 0;
+    pv->total_gained_time = 0;
+    pv->lost_time[0] = 0;
+    pv->lost_time[1] = 0;
+    pv->lost_time[2] = 0;
+    pv->lost_time[3] = 0;
+
     /* Setup filters */
     /* TODO: Move to work.c? */
     if( job->filters )
You know how I said above that the code was doing a passable job on rhester's ipod killer sample? Now it's doing a damn fine job. Those discontinuities were adding a bit of jerkiness in a few troublesome spots.

sdm, I'll be curious if this fixes your out of order frames issue. And good lord, how long does it take you to make all those animated gifs?!
van
Veteran User
Posts: 417
Joined: Wed Aug 29, 2007 6:35 am

Re: "Same as source" FPS Improvements

Post by van »

Very nice!

It's doing a beautiful job on mixed content and I now get no continuity errors & no dropped frames on pure progressive content. The latter actually causes a minor problem. You might want to change

Code: Select all

hb_log("render: average dropped frame duration: %lld", (pv->total_lost_time / pv->dropped_frames) );
to

Code: Select all

    if( pv->dropped_frames )
    {
        hb_log("render: average dropped frame duration: %lld", (pv->total_lost_time / pv->dropped_frames) );
    }
to avoid an occasional divide-by-zero fault right at the end of a perfect encode.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

van wrote:avoid an occasional divide-by-zero
Hehe, oops, yeah, probably should avoid that ;-)
sdm
Bright Spark User
Posts: 194
Joined: Mon Feb 19, 2007 4:53 pm

Re: "Same as source" FPS Improvements

Post by sdm »

jbrjake wrote: sdm, I'll be curious if this fixes your out of order frames issue. And good lord, how long does it take you to make all those animated gifs?!
If I applied the patch correctly... (copied/pasted into text edit, made sure there was a carriage return at the end, saved as text file (encoding unicode (UTF-8)) with the name patchfile.diff, downloaded current svn, in terminal ./configure then ./jam)
then, no, the patch didn't fix the out of order issue.

Also tested detelecine/vfr on American Beauty (because there are some interlaced frames at the beginning of chapter 11) and noticed frames out of order, and also on Superbad.

Good lord, those animated gifs do take a couple of minutes, but a picture is worth 1000 words they say.
And speaking of that, here's one for American Beauty:
Image
Notice how the beer is being raised, but one frame has the beer lower than the preceding frame.

When encoded without detelecine+vfr, the motion is smooth (and there are combed frames in at least one place).

Keep up the great work.
I think you almost have it nailed?

--sdm.

edit:
should mention this usually occurs right after a chapter break.
Also notice the problem in other areas of american beauty.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

sdm wrote:If I applied the patch correctly... (copied/pasted into text edit, made sure there was a carriage return at the end, saved as text file (encoding unicode (UTF-8)) with the name patchfile.diff, downloaded current svn, in terminal ./configure then ./jam)
then, no, the patch didn't fix the out of order issue.
Err...but did you apply the patch before jamming? Or did you really just save it to a file named patchfile.diff?
should mention this usually occurs right after a chapter break.
Are you using chapter markers?
sdm
Bright Spark User
Posts: 194
Joined: Mon Feb 19, 2007 4:53 pm

Re: "Same as source" FPS Improvements

Post by sdm »

I really did just save it to a file named patchfile.diff in with the rest of the SVN stuff.
Sorry. um, I know I'm supposed to know what I'm doing, if I'm in this thread, but I don't.

Yes chapter markers are on.

--sdm.

[edit: found instructions on how to apply a patch. I may be able to do it!]
sdm
Bright Spark User
Posts: 194
Joined: Mon Feb 19, 2007 4:53 pm

Re: "Same as source" FPS Improvements

Post by sdm »

I've got the patch applied correctly and tested on two test sources (mostly progressive with soft telecine, but tiny amounts of hard telecined frames close to chapter breaks).

It appears like everything is handled properly! One problem so far though - on my test encode of Superbad, HB encoded the entire movie but only inserted 21 chapter markers, when there should be 28, and the 21 chapter markers aren't where they should be either.

I'll encode something else and see what happens with chapter markers.

I'm very pleased to see this progress!
--sdm.
van
Veteran User
Posts: 417
Joined: Wed Aug 29, 2007 6:35 am

Re: "Same as source" FPS Improvements

Post by van »

Oops. Yes, when detelecine drops frames it will occasionally drop frames with chapter markers & I don't see any code in render.c to move them to the next un-dropped frame. 7 chapters of 28 lost is exactly what you'd expect if the chapter marks occur on a random frame & one frame in five is dropped. But the marks that aren't dropped should be in the correct place. When you say they "aren't where they should be" do you mean that the remaining marks are in a different place in the content than they are on the dvd or that the labels are screwed up? (It's a known bug that if a chapter mark is dropped then labels from that point on will be messed up. E.g., if 3 is dropped then chapter 4 will be labeled "Chapter 3", 5 will be "Chapter 4", etc. I hope to fix this soon but it probably won't get into 0.9.2.)

[EDIT]
jbrjake, here's a possible fix to libhb/render.c for the chapter mark dropping. It just remembers any chapter mark on an incoming frame & applies it to the first outgoing frame that starts at or after the time of the mark. I've tested this against a couple of hard-telecined dvds. Like sdm, I found that without the fix 20% of the marks were lost. With the fix they were all there. This is the diff against the libhb/render.c from your last two-fer patch (I don't expect you to apply this; it's just to show you what changed):

Code: Select all

--- libhb/render.c.jbr	2008-02-02 23:07:55.000000000 -0800
+++ libhb/render.c	2008-02-02 23:21:00.000000000 -0800
@@ -27,6 +27,8 @@
     uint64_t             lost_time[4];
     uint64_t             total_lost_time;
     uint64_t             total_gained_time;
+    int64_t              chapter_time;
+    int                  chapter_val;
  };
 
 int  renderInit( hb_work_object_t *, hb_job_t * );
@@ -235,6 +237,14 @@
     {
         hb_fifo_push( pv->subtitle_queue,  hb_buffer_init(0) );
     }
+
+    /* If there's a chapter mark remember it in case we delay or drop its frame */
+    if( in->new_chap )
+    {
+        pv->chapter_time = in->start;
+        pv->chapter_val = in->new_chap;
+        in->new_chap = 0;
+    }
     
     /* Setup render buffer */
     hb_buffer_t * buf_render = hb_buffer_init( 3 * job->width * job->height / 2 );  
@@ -469,6 +479,14 @@
             pv->last_stop[i] = pv->last_start[i] + temp_duration;
         }
 
+        /* If we have a pending chapter mark and this frame is at
+           or after the time of the mark, mark this frame & clear
+           our pending mark. */
+        if( pv->chapter_time && pv->chapter_time <= ivtc_buffer->start )
+        {
+            ivtc_buffer->new_chap = pv->chapter_val;
+            pv->chapter_time = 0;
+        }
     }
 
     return HB_WORK_OK;

The actual patch against svn head is at http://pastebin.ca/889865
sdm
Bright Spark User
Posts: 194
Joined: Mon Feb 19, 2007 4:53 pm

Re: "Same as source" FPS Improvements

Post by sdm »

van wrote:"When you say they "aren't where they should be" do you mean that the remaining marks are in a different place in the content than they are on the dvd or that the labels are screwed up? (It's a known bug that if a chapter mark is dropped then labels from that point on will be messed up. E.g., if 3 is dropped then chapter 4 will be labeled "Chapter 3", 5 will be "Chapter 4", etc. I hope to fix this soon but it probably won't get into 0.9.2.)
Yes, upon looking again, it seems like you are describing in correctly. I encoded two movies and checked against previous encodes.
Its easier to see near the beginning of the movie - in one vfr encode, the real chapter 7's time was missed, but the real chapter 8's time was marked, but labeled chapter 7.

I'll see how I do with your patch.

Thanks again
--sdm.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

Thanks, van. I don't use chapter markers often, so that stuff just slips me by.

Here's a new, unified diff that's got the vfr duration changes, the b-frame changes, and the chapter marker repositioning all in one, against a recent SVN head:

http://pastebin.ca/890380

Code: Select all

Index: libhb/encx264.c
===================================================================
--- libhb/encx264.c	(revision 1244)
+++ libhb/encx264.c	(working copy)
@@ -25,6 +25,24 @@
 
 #define DTS_BUFFER_SIZE 32 
 
+/*
+ * The frame info struct remembers information about each frame across calls
+ * to x264_encoder_encode. Since frames are uniquely identified by their
+ * timestamp, we use some bits of the timestamp as an index. The LSB is
+ * chosen so that two successive frames will have different values in the
+ * bits over any plausible range of frame rates. (Starting with bit 9 allows
+ * any frame rate slower than 175fps.) The MSB determines the size of the array.
+ * It is chosen so that two frames can't use the same slot during the 
+ * encoder's max frame delay (set by the standard as 16 frames) and so
+ * that, up to some minimum frame rate, frames are guaranteed to map to
+ * different slots. (An MSB of 16 which is 2^(16-9+1) = 256 slots guarantees
+ * no collisions down to a rate of 1.4 fps).
+ */
+#define FRAME_INFO_MAX2 (9)     // 2^9 = 512; 90000/512 = 175 frames/sec
+#define FRAME_INFO_MIN2 (16)    // 2^16 = 65536; 90000/65536 = 1.4 frames/sec
+#define FRAME_INFO_SIZE (1 << (FRAME_INFO_MIN2 - FRAME_INFO_MAX2 + 1))
+#define FRAME_INFO_MASK (FRAME_INFO_SIZE - 1)
+
 struct hb_work_private_s
 {
     hb_job_t       * job;
@@ -32,15 +50,15 @@
     x264_picture_t   pic_in;
     uint8_t         *x264_allocated_pic;
 
-    // Internal queue of DTS start/stop values.
-    int64_t        dts_start[DTS_BUFFER_SIZE];
-    int64_t        dts_stop[DTS_BUFFER_SIZE];
+    int64_t        dts_next;    // DTS start time value for next output frame
+    int64_t        last_stop;   // Debugging - stop time of previous input frame
     int64_t        init_delay;
-
-    int64_t        dts_write_index;
-    int64_t        dts_read_index;
     int64_t        next_chap;
 
+    struct {
+        int64_t duration;
+    } frame_info[FRAME_INFO_SIZE];
+
     char             filename[1024];
 };
 
@@ -247,8 +265,7 @@
 
     pv->x264_allocated_pic = pv->pic_in.img.plane[0];
 
-    pv->dts_write_index = 0;
-    pv->dts_read_index = 0;
+    pv->dts_next = -1;
     pv->next_chap = 0;
 
     if (job->areBframes)
@@ -268,7 +285,7 @@
            will use the duration of frames running at 23.976fps instead.. */
         if (job->vfr)
         {
-            pv->init_delay = 3754;
+            pv->init_delay = 7506;
         }
    
         /* The delay is 2 frames for regular b-frames, 3 for b-pyramid.
@@ -296,6 +313,22 @@
     /* TODO */
 }
 
+/*
+ * see comments in definition of 'frame_info' in pv struct for description
+ * of what these routines are doing.
+ */
+static void save_frame_info( hb_work_private_t * pv, hb_buffer_t * in )
+{
+    int i = (in->start >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    pv->frame_info[i].duration = in->stop - in->start;
+}
+
+static int64_t get_frame_duration( hb_work_private_t * pv, int64_t pts )
+{
+    int i = (pts >> FRAME_INFO_MAX2) & FRAME_INFO_MASK;
+    return pv->frame_info[i].duration;
+}
+
 int encx264Work( hb_work_object_t * w, hb_buffer_t ** buf_in,
                   hb_buffer_t ** buf_out )
 {
@@ -330,6 +363,16 @@
                 job->height / 4;
         }
 
+        if( pv->dts_next == -1 )
+        {
+            /* we don't have a start time yet so use the first frame's 
+             * start. All other frame times will be determined by the
+             * sum of the prior output frame durations in *DTS* order
+             * (not by the order they arrive here). This timing change is
+             * essential for VFR with b-frames, but a complete nop * otherwise.
+             */
+            pv->dts_next = in->start;
+        }
         if( in->new_chap && job->chapter_markers )
         {
             /* chapters have to start with an IDR frame so request that this
@@ -351,11 +394,22 @@
         }
         pv->pic_in.i_qpplus1 = 0;
 
-        // Remember current PTS value, use as DTS later
-        pv->dts_start[pv->dts_write_index & (DTS_BUFFER_SIZE-1)] = in->start;
-        pv->dts_stop[pv->dts_write_index & (DTS_BUFFER_SIZE-1)]  = in->stop;
-        pv->dts_write_index++;
+        /* XXX this is temporary debugging code to check that the upstream
+         * modules (render & sync) have generated a continuous, self-consistent
+         * frame stream with the current frame's start time equal to the
+         * previous frame's stop time.
+         */
+        if( pv->last_stop != in->start )
+        {
+            hb_log("encx264 input continuity err: last stop %lld  start %lld",
+                    pv->last_stop, in->start);
+        }
+        pv->last_stop = in->stop;
 
+        // Remember info about this frame that we need to pass across 
+        // the x264_encoder_encode call (since it reorders frames).
+        save_frame_info( pv, in );
+
         /* Feed the input DTS to x264 so it can figure out proper output PTS */
         pv->pic_in.i_pts = in->start;
 
@@ -386,17 +440,15 @@
         /* Should be way too large */
         buf        = hb_buffer_init( 3 * job->width * job->height / 2 );
         buf->size  = 0;
-        buf->start = in->start;
-        buf->stop  = in->stop;
         buf->frametype   = 0;
 
-        int64_t dts_start, dts_stop;
-
         /* Get next DTS value to use */
-        dts_start = pv->dts_start[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        dts_stop  = pv->dts_stop[pv->dts_read_index & (DTS_BUFFER_SIZE-1)];
-        pv->dts_read_index++;
+        int64_t dts_start = pv->dts_next;
 
+        /* compute the stop time based on the original frame's duration */
+        int64_t dts_stop  = dts_start + get_frame_duration( pv, pic_out.i_pts );
+        pv->dts_next = dts_stop;
+
         for( i = 0; i < i_nal; i++ )
         {
             int size, data;
@@ -469,11 +521,7 @@
 
                     /* Store the output presentation time stamp
                        from x264 for use by muxmp4 in off-setting
-                       b-frames with the CTTS atom.
-                       For now, just add 1000000 to the offset so that the
-                       value is pretty much guaranteed to be positive.  The
-                       muxing code will minimize the renderOffset at the end. */
-
+                       b-frames with the CTTS atom. */
                     buf->renderOffset = pic_out.i_pts - dts_start + pv->init_delay;
 
                     /* Send out the next dts values */
Index: libhb/render.c
===================================================================
--- libhb/render.c	(revision 1244)
+++ libhb/render.c	(working copy)
@@ -20,11 +20,15 @@
     hb_buffer_t        * buf_scale;
     hb_fifo_t          * subtitle_queue;
     hb_fifo_t          * delay_queue;
-    int                  frames_to_extend;
     int                  dropped_frames;
     int                  extended_frames;
     uint64_t             last_start[4];
     uint64_t             last_stop[4];
+    uint64_t             lost_time[4];
+    uint64_t             total_lost_time;
+    uint64_t             total_gained_time;
+    int64_t              chapter_time;
+    int                  chapter_val;
 };
 
 int  renderInit( hb_work_object_t *, hb_job_t * );
@@ -233,7 +237,15 @@
     {
         hb_fifo_push( pv->subtitle_queue,  hb_buffer_init(0) );
     }
-    
+
+    /* If there's a chapter mark remember it in case we delay or drop its frame */
+    if( in->new_chap )
+    {
+        pv->chapter_time = in->start;
+        pv->chapter_val = in->new_chap;
+        in->new_chap = 0;
+    }
+
     /* Setup render buffer */
     hb_buffer_t * buf_render = hb_buffer_init( 3 * job->width * job->height / 2 );  
     
@@ -280,9 +292,22 @@
             {
                 if( job->vfr )
                 {
-                    pv->frames_to_extend += 4;
+                    /* We need to compensate for the time lost by dropping this frame.
+                       Spread its duration out in quarters, because usually dropped frames
+                       maintain a 1-out-of-5 pattern and this spreads it out amongst the remaining ones.
+                       Store these in the lost_time array, which has 4 slots in it.
+                       Because not every frame duration divides evenly by 4, and we can't lose the
+                       remainder, we have to go through an awkward process to preserve it in the 4th array index. */
+                    uint64_t temp_duration = buf_tmp_out->stop - buf_tmp_out->start;
+                    pv->lost_time[0] += (temp_duration / 4);
+                    pv->lost_time[1] += (temp_duration / 4);
+                    pv->lost_time[2] += (temp_duration / 4);
+                    pv->lost_time[3] += ( temp_duration - (temp_duration / 4) - (temp_duration / 4) - (temp_duration / 4) );
+                    
+                    pv->total_lost_time += temp_duration;
                     pv->dropped_frames++;
-                    hb_fifo_get( pv->subtitle_queue );
+                    
+                    hb_fifo_get( pv->subtitle_queue );                    
                     buf_tmp_in = NULL;
                 }
                 else
@@ -304,8 +329,11 @@
             pv->last_start[i] = pv->last_start[i-1];
             pv->last_stop[i] = pv->last_stop[i-1];
         }
-        pv->last_start[0] = in->start;
-        pv->last_stop[0] = in->stop;
+        
+        /* In order to make sure we have continuous time stamps, store
+           the current frame's duration as starting when the last one stopped. */
+        pv->last_start[0] = pv->last_stop[1];
+        pv->last_stop[0] = pv->last_start[0] + (in->stop - in->start);
     }
     
     /* Apply subtitles */
@@ -385,13 +413,17 @@
 
     if( *buf_out )
     {
-        if( pv->frames_to_extend )
+        /* The current frame exists. That means it hasn't been dropped by a filter.
+           Make it accessible as ivtc_buffer so we can edit its duration if needed. */
+        ivtc_buffer = *buf_out;
+
+        if( pv->lost_time[3] > 0 )
         {
             /*
-             * A frame's been dropped by VFR detelecine.
+             * A frame's been dropped earlier by VFR detelecine.
              * Gotta make up the lost time. This will also
-             * slow down the video to 23.976fps.
-             * The dropped frame ran for 3003 ticks, so
+             * slow down the video.
+             * The dropped frame's has to be accounted for, so
              * divvy it up amongst the 4 frames left behind.
              * This is what the delay_queue is for;
              * telecined sequences start 2 frames before
@@ -399,30 +431,63 @@
              * ones you need a 2 frame delay between
              * reading input and writing output.
              */
-            ivtc_buffer = *buf_out;
+                                   
+            /* We want to extend the outputted frame's duration by the value
+              stored in the 4th slot of the lost_time array. Because we need
+              to adjust all the values in the array so they're contiguous,
+              extend the duration inside the array first, before applying
+              it to the current frame buffer. */
+            pv->last_stop[3] += pv->lost_time[3];
             
-            /* The 4th cached frame will be the to use. */
-            ivtc_buffer->start = pv->last_start[3];
-            ivtc_buffer->stop = pv->last_stop[3];
-
-            if (pv->frames_to_extend % 4)
-                ivtc_buffer->stop += 751;
-            else
-                ivtc_buffer->stop += 750;
+            /* Log how much time has been added back in to the video. */
+            pv->total_gained_time += pv->lost_time[3];
             
-            /* Set the 3rd cached frame to start when this one stops,
-               and to stop 3003 ticks later -- a normal 29.97fps
-               length frame. If it needs to be extended as well to
-               make up lost time, it'll be handled on the next
-               loop through the renderer.                            */
-            int temp_duration = pv->last_stop[2] - pv->last_start[2];
-            pv->last_start[2] = ivtc_buffer->stop;
-            pv->last_stop[2] = ivtc_buffer->stop + temp_duration;
+            /* We've pulled the 4th value from the lost_time array
+               and added it to the last_stop array's 4th slot. Now, rotate the
+                lost_time array so the 4th slot now holds the 3rd's value, and
+               so on down the line, and set the 0 index to a value of 0. */
+            int i;
+            for( i=2; i >=  0; i--)
+            {
+                pv->lost_time[i+1] = pv->lost_time[i];
+            }
+            pv->lost_time[0] = 0;
             
-            pv->frames_to_extend--;
+            /* Log how many frames have had their durations extended. */
             pv->extended_frames++;
         }
+        
+        /* We can't use the given time stamps. Previous frames
+           might already have been extended, throwing off the
+           raw values fed to render.c. Instead, their
+           stop and start times are stored in arrays.
+           The 4th cached frame will be the to use.
+           If it needed its duration extended to make up
+           lost time, it will have happened above. */
+        ivtc_buffer->start = pv->last_start[3];
+        ivtc_buffer->stop = pv->last_stop[3];
+        
+        /* Set the 3rd cached frame to start when this one stops,
+           and so on down the line. If any of them need to be
+           extended as well to make up lost time, it'll be handled
+           on the next loop through the renderer.  */
+        int i;   
+        for (i = 2; i >= 0; i--)
+        {
+            int temp_duration = pv->last_stop[i] - pv->last_start[i];
+            pv->last_start[i] = pv->last_stop[i+1];
+            pv->last_stop[i] = pv->last_start[i] + temp_duration;
+        }
 
+        /* If we have a pending chapter mark and this frame is at
+           or after the time of the mark, mark this frame & clear
+           our pending mark. */
+        if( pv->chapter_time && pv->chapter_time <= ivtc_buffer->start )
+        {
+            ivtc_buffer->new_chap = pv->chapter_val;
+            pv->chapter_time = 0;
+        }
+
     }
 
     return HB_WORK_OK;
@@ -431,11 +496,12 @@
 void renderClose( hb_work_object_t * w )
 {
     hb_work_private_t * pv = w->private_data;   
-        
-    hb_log("render: dropped frames: %i (%i ticks)", pv->dropped_frames, (pv->dropped_frames * 3003) );
-    hb_log("render: extended frames: %i (%i ticks)", pv->extended_frames, ( ( pv->extended_frames / 4 ) * 3003 ) );
-    hb_log("render: Lost time: %i frames (%i ticks)", (pv->dropped_frames * 4) - (pv->extended_frames), (pv->dropped_frames * 3003) - ( ( pv->extended_frames / 4 ) * 3003 ) );
 
+    hb_log("render: lost time: %lld (%i frames)", pv->total_lost_time, pv->dropped_frames);
+    hb_log("render: gained time: %lld (%i frames) (%lld not accounted for)", pv->total_gained_time, pv->extended_frames, pv->total_lost_time - pv->total_gained_time);
+    if (pv->dropped_frames)
+        hb_log("render: average dropped frame duration: %lld", (pv->total_lost_time / pv->dropped_frames) );
+    
     /* Cleanup subtitle queue */
     if( pv->subtitle_queue )
     {
@@ -476,12 +542,23 @@
     /* Setup FIFO queue for subtitle cache */
     pv->subtitle_queue = hb_fifo_init( 8 );    
     pv->delay_queue = hb_fifo_init( 8 );
-    pv->frames_to_extend = 0;
+
+    /* VFR IVTC needs a bunch of time-keeping variables to track
+      how many frames are dropped, how many are extended, what the
+      last 4 start and stop times were (so they can be modified),
+      how much time has been lost and gained overall, how much time
+      the latest 4 frames should be extended by, and where chapter
+      markers are (so they can be saved if their frames are dropped.) */
     pv->dropped_frames = 0;
     pv->extended_frames = 0;
     pv->last_start[0] = 0;
     pv->last_stop[0] = 0;
-    
+    pv->total_lost_time = 0;
+    pv->total_gained_time = 0;
+    pv->lost_time[0] = 0; pv->lost_time[1] = 0; pv->lost_time[2] = 0; pv->lost_time[3] = 0;
+    pv->chapter_time = 0;
+    pv->chapter_val  = 0;
+
     /* Setup filters */
     /* TODO: Move to work.c? */
     if( job->filters )
So what's left before checking this in? I figure I should probably excise that line in work.c that forces job->vrate_base to 29.97. Obviously it doesn't affect encoding anymore after the sync.c improvements, but it seems to confuse HB's time estimates with material closer to 23.976.
van
Veteran User
Posts: 417
Joined: Wed Aug 29, 2007 6:35 am

Re: "Same as source" FPS Improvements

Post by van »

I can't think of anything else that needs to be done before checkin. It seems to be working well for everyone that's tried it.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: "Same as source" FPS Improvements

Post by jbrjake »

Okay, checked in the render.c and work.c changes as:
http://trac.handbrake.fr/changeset/1249

Van, please commit your b-frame tracking improvements for encx264.c at your convenience.

NOTE for those playing along at home: that means that rev1249 will not do VFR entirely right when B-Frames are used. That will happen after the x264 changes are checked in.
Post Reply