Struggling with IVTC
So, I've been trying all sorts of different methods.
1) Keep track of whether any of the last 4 frames were dropped. If so, extend the duration of the current frame by 751 ticks.
2) Keep track of the frame durations detelecine assigns. If one of the last two frames was a 3 or a 1, extend the duration of the current frame by 751 ticks.
3) Like method 1, except only extend the frame directly after the dropped one, by a full 3003 ticks. It'd be the same effect as not duping frames (a slight stutter) but without the bitrate costs of duped frames.
4) Compare the buf->data and when one is the same as the last one's, discard it.
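For anyone wondering where the 751 and 3003 in the list above come from, here is a minimal sketch of the arithmetic, assuming the 90 kHz clock and the usual NTSC rates of 30000/1001 and 24000/1001 fps. The helper name `frame_ticks` is made up for illustration and is not anything in libhb.

```c
#include <assert.h>

#define CLOCK_HZ 90000.0  /* timestamps are in 90 kHz ticks */

/* Duration of one frame, in 90 kHz ticks, for a rate of fps_num/fps_den.
   Hypothetical helper, for illustration only. */
static double frame_ticks(double fps_num, double fps_den)
{
    assert(fps_num > 0.0 && fps_den > 0.0);
    return CLOCK_HZ * fps_den / fps_num;
}

/* How much longer a film-rate frame is than a video-rate frame. */
static double film_minus_video_ticks(void)
{
    return frame_ticks(24000.0, 1001.0) - frame_ticks(30000.0, 1001.0);
}
```

A 29.97 frame comes out at 3003 ticks and a 23.976 frame at 3753.75, so the difference is 750.75, which rounds to the 751 used above. Four of those add up to exactly one 3003-tick frame.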
1) doesn't work because sometimes more than 1 frame out of 4 is dropped so then you're left with an excess of time to make up.
2) doesn't work because sometimes the durations don't go steady 323232 and you start losing time when it goes back to interlaced sections.
3) might work if I stared at it some more but it seems stupid.
4) doesn't work because apparently there are slight differences between the "duped" frames that the eye doesn't see, but make them not equal, numerically. Or I'm reading them wrong. Or something.
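To illustrate the problem with 4), here is a rough sketch contrasting a byte-exact comparison with a tolerant one. It is not something tried in the diff below: the function names and the idea of a mean-absolute-difference threshold are assumptions for illustration, and a usable threshold would have to be found by testing on real frames.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Byte-exact test, as in method 4: any single differing byte fails it. */
static int frames_identical(const uint8_t *a, const uint8_t *b, size_t size)
{
    return memcmp(a, b, size) == 0;
}

/* Tolerant test (hypothetical): treat frames as duplicates when the mean
   absolute difference per byte is at or below max_mean_diff. */
static int frames_nearly_equal(const uint8_t *a, const uint8_t *b,
                               size_t size, double max_mean_diff)
{
    uint64_t sum = 0;
    size_t i;

    assert(size > 0);
    for (i = 0; i < size; i++)
        sum += (uint64_t)abs((int)a[i] - (int)b[i]);

    return (double)sum / (double)size <= max_mean_diff;
}
```

A frame that differs from its neighbor by one grey level in one pixel fails the first test and passes the second, which is the kind of invisible difference that would make "duped" frames compare unequal.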
Then I've tried massaging all those methods in various ways, like adjusting their sensitivities (like to only 'notice' 29.97 material after it had been going on for a dozen frames).
Here's a messy diff with a bunch of different attempts, most commented out:
http://pastebin.ca/742806
But finally I figured out how to do it, I think. It's so simple I can't believe I've been working on this so long:
Keep a running count of dropped frames. Keep a running count of extended frames. When dropped * 4 > extended, extend the current frame. Each dropped frame takes 3003 ticks with it and each extension gives back about a quarter of that, hence the 4. This solves the problem with method 1: whenever the numbers are out of sync, the difference is made up at the next available opportunity (like a static shot of a wall, when ivtc didn't see anything to discard), instead of accumulating into a growing A/V desync.
Sure, we're still ~1500 ticks off when telecined sections start and end, and making up that lost time will push the end out even more, but, IMO, it'll settle the desync within a reasonable window. It will get even more reasonable when someone smarter than I am figures out how to buffer a couple of frames in the render or muxing pipeline, so the extended durations can begin 2 frames before the dropped frame like they should.
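Boiled down, the bookkeeping looks something like the sketch below. It is a standalone illustration, not the actual render.c change: the struct and function names are invented, and the 750 is what (90090000 / 120000) comes to under C integer division.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_TICKS  3003  /* one 29.97 frame on the 90 kHz clock */
#define EXTEND_TICKS 750   /* (90090000 / 120000) in integer arithmetic */

/* Hypothetical container for the two running counts. */
typedef struct {
    uint32_t dropped;   /* frames discarded by detelecine so far */
    uint32_t extended;  /* kept frames lengthened so far */
} vfr_state_t;

/* Call when the filter drops a frame. */
static void vfr_note_drop(vfr_state_t *s)
{
    assert(s != 0);
    s->dropped++;
}

/* Call for each kept frame; returns the ticks to add to its stop time. */
static int vfr_extension_for_kept_frame(vfr_state_t *s)
{
    assert(s != 0);
    if (s->dropped * 4 > s->extended) {
        s->extended++;
        return EXTEND_TICKS;
    }
    return 0;
}

/* Time removed by drops that extensions have not yet given back. */
static int64_t vfr_lost_ticks(const vfr_state_t *s)
{
    assert(s != 0);
    return (int64_t)s->dropped * FRAME_TICKS
         - (int64_t)s->extended * EXTEND_TICKS;
}
```

If two frames get dropped back to back, the next eight kept frames are each extended and then the counts are level again, which is exactly the case method 1 couldn't handle. With the truncated 750, each dropped frame still leaves 3 ticks unrecovered.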
Now, this works great with 29.97 material. But I'm getting a weird result with soft telecined 23.976 stuff. To wit, it won't drop frames. I get duped frames on output. It drops the frames fine from hard telecined material, but ignores soft telecined. Weirder still, detelecine.c *does* recognize that the material has varying frame lengths, and by virtue of the duped frames being present, it *is* correctly reading and applying the repeat flags, rendering that extra material.
So I don't know what's up with that.
What I was doing earlier in this thread -- disabling pullup's ability to see soft telecine markers -- is a Bad Idea, I've found. It doesn't seem to get the durations quite right, so things get a little jerky and out of sync. But I hope to figure out something. In the meantime, my long-term goal of replacing "Same as source" with always-on IVTC and variable frame rates just ain't gonna happen.
On the other hand, what I have now is pretty cool: variable frame-rate for purely NTSC video sources. Any part of the video that runs at video speed does so. Any part that ivtc does its magic on runs at film speed. It all stays reasonably within sync, most of the time. Are the video parts interlaced? Then throw a deinterlacing filter on as well. It'll work fine. You can feed in film stuff too -- it just doesn't get frames dropped like it should, so those parts run at 29.97.
Obviously, this code still has a long ways to go. Anyway, here's what I've got...
http://pastebin.ca/742803
Index: libhb/detelecine.c
===================================================================
--- libhb/detelecine.c (revision 1022)
+++ libhb/detelecine.c (working copy)
@@ -975,7 +975,7 @@
}
else
{
- goto output_frame;
+ goto discard_frame;
}
}
@@ -987,7 +987,7 @@
if (!frame)
{
- goto output_frame;
+ goto discard_frame;
}
if( frame->length < 2 )
{
@@ -995,19 +995,19 @@
if( !(buf_in->flags & PIC_FLAG_REPEAT_FIRST_FIELD) )
{
- goto output_frame;
+ goto discard_frame;
}
frame = pullup_get_frame( ctx );
if( !frame )
{
- goto output_frame;
+ goto discard_frame;
}
if( frame->length < 2 )
{
pullup_release_frame( frame );
- goto output_frame;
+ goto discard_frame;
}
}
}
@@ -1034,6 +1034,15 @@
output_frame:
*buf_out = pv->buf_out;
return FILTER_OK;
+
+/* This and all discard_frame calls shown above are
+ the result of me restoring the functionality in
+ pullup that huevos_rancheros disabled because
+ HB couldn't handle it. */
+discard_frame:
+ *buf_out = pv->buf_out;
+ return FILTER_DROP;
+
}
Index: libhb/muxmp4.c
===================================================================
--- libhb/muxmp4.c (revision 1022)
+++ libhb/muxmp4.c (working copy)
@@ -380,7 +380,7 @@
/* Because we use the audio samplerate as the timescale,
we have to use potentially variable durations so the video
doesn't go out of sync */
- duration = ( buf->stop * job->arate / 90000 ) - m->sum_dur;
+ duration = ( ( buf->stop * job->arate / 90000 ) - ( buf->start * job->arate / 90000 ) );
m->sum_dur += duration;
}
else
Index: libhb/render.c
===================================================================
--- libhb/render.c (revision 1022)
+++ libhb/render.c (working copy)
@@ -9,6 +9,10 @@
#include "ffmpeg/avcodec.h"
#include "ffmpeg/swscale.h"
+/* Used for keeping track of when frames are dropped by detelecine. */
+uint32_t dropped_frames = 0;
+uint32_t extended_durations = 0;
+
struct hb_work_private_s
{
hb_job_t * job;
@@ -225,6 +229,9 @@
}
else if( result == FILTER_DROP )
{
+ /* A drop means detelecine's latched its teeth onto
+ something and has duped frames to discard. */
+ dropped_frames++;
hb_fifo_get( pv->subtitle_queue );
buf_tmp_in = NULL;
break;
@@ -271,6 +278,24 @@
/* Set output to render buffer */
(*buf_out) = buf_render;
+ if ( buf_tmp_in )
+ {
+ /* Pass VFR NTSC IVTC frame durations on through the workflow.
+ I totally wish I could slip some cryptic abbreviations into
+ the last sentence.
+
+ (90090000 / 120000) stands in for 750.75, the number of ticks
+ by which a 23.976 frame is longer than a 29.97 frame. Integer
+ division truncates it to 750, though. God I hate fractions.
+ */
+
+ if ( (dropped_frames) &&( dropped_frames * 4 ) > extended_durations )
+ {
+ buf_render->stop = buf_render->stop + (90090000 / 120000);
+ extended_durations++;
+ }
+ }
+
if( buf_tmp_in == NULL )
{
/* Teardown and cleanup buffers if we are emitting NULL */
@@ -297,6 +322,10 @@
void renderClose( hb_work_object_t * w )
{
+ hb_log("RENDER: dropped frames: %i (%i ticks)", dropped_frames, (dropped_frames * 3003) );
+ hb_log("RENDER: extended frames: %i (%i ticks)", extended_durations, (extended_durations * (90090000 / 120000) ) );
+ hb_log("RENDER: Lost time: %i ticks", (dropped_frames * 3003) - (extended_durations * (90090000 / 120000) ) );
+
hb_work_private_t * pv = w->private_data;
/* Cleanup subtitle queue */
I'm planning on cleaning this stuff up soon so it can all be optional and only run when a job->vfr boolean's true. Then, I'd like to commit it. Not because it's ready for regular usage yet, but because it needs more testing than I can do alone.
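In case the muxmp4.c change above looks arbitrary, here is a small sketch of the two duration formulas side by side. It assumes the render change only moves buf->stop and nothing downstream shifts the next frame's start to match; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 90 kHz timestamp to the audio-samplerate timescale (truncating). */
static int64_t to_timescale(int64_t ticks_90k, int64_t arate)
{
    assert(arate > 0);
    return ticks_90k * arate / 90000;
}

/* Old formula: duration is whatever gets the running sum up to stop. */
static int64_t duration_cumulative(int64_t stop, int64_t arate, int64_t *sum_dur)
{
    int64_t duration = to_timescale(stop, arate) - *sum_dur;
    *sum_dur += duration;
    return duration;
}

/* Patched formula: duration comes from the frame's own start and stop. */
static int64_t duration_per_frame(int64_t start, int64_t stop, int64_t arate)
{
    return to_timescale(stop, arate) - to_timescale(start, arate);
}
```

Take three frames at 48000 Hz where the second one has its stop pushed out by 750 ticks: (0, 3003), (3003, 6756), (6006, 9009). The old formula gives 1601, 2002, 1201, so the frame after the extended one gets shortened and the extension is cancelled out. The patched formula gives 1601, 2002, 1601, so the extra time actually survives into the mux.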
If you have audio sync problems, see if this code from eddyg works as a temporary fix to at least hide some of the messages:
http://pastebin.ca/742810