https://reviews.handbrake.fr/reviewboard.fcgi/r/50/
https://reviews.handbrake.fr/reviewboard.fcgi/r/61/
Camillo wrote: Each sample in an MPEG4 text track fixes the displayed text for the duration of the sample. I think a good way to handle this is to treat each line in the source srt/ssa track as specifying two events: an "add text" event when it begins and a "remove text" event when it ends. Then each interval between consecutive events corresponds to a sample in the text track, whose text is the concatenation of the list of active subtitles. The problem is that we can only output a sample when we are sure that no more events will occur in the middle of it, because such an event would require splitting it into two or more smaller samples. And we would like to determine this as soon as possible, to avoid holding up the pipeline.
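To make the event idea concrete, here is a minimal offline sketch in Python. It is not HandBrake code: the function name `flatten_subtitles` and the `(start, end, text)` tuple layout are assumptions made for illustration, and it sees all lines at once, so it ignores the "when can we safely output" problem entirely. Gaps with no active subtitle produce no sample here; a real muxer may need an empty sample for them.

```python
def flatten_subtitles(lines):
    """Turn possibly overlapping (start, end, text) lines into
    non-overlapping (start, end, text) samples.

    Hypothetical helper for illustration only; times are in seconds.
    """
    events = []
    for idx, (start, end, _text) in enumerate(lines):
        if end <= start:
            continue  # ignore empty or inverted lines
        events.append((start, 1, idx))  # "add text" event
        events.append((end, 0, idx))    # "remove text" event
    # At equal times, removals (0) sort before additions (1).
    events.sort()

    active = {}      # line index -> text of currently displayed lines
    samples = []
    prev_time = None
    for time, kind, idx in events:
        # The interval since the previous event is one sample,
        # provided something was on screen during it.
        if active and time > prev_time:
            text = "\n".join(active[i] for i in sorted(active))
            samples.append((prev_time, time, text))
        if kind == 1:
            active[idx] = lines[idx][2]
        else:
            active.pop(idx, None)
        prev_time = time
    return samples


samples = flatten_subtitles([(10, 90, "one"), (40, 120, "two")])
```

With two lines covering 10s to 90s and 40s to 120s, this yields three samples: the first line alone, both lines together, then the second line alone.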
I think we can assume that the lines in the source track are sorted by their starting time. Therefore, as soon as we see a line beginning at time T, we can output all samples up to that time. However, if a movie has very sparse subtitles, and the subtitle track is interleaved with the AV tracks, we may have to read and buffer a lot of data before we see the next subtitle packet; we may even have to wait until the end of the file. Now, I'm not very familiar with video, so correct me if I'm wrong, but I think we can also assume that, in a multiplexed container, the packets of all tracks are sorted by starting time; if that's the case, as soon as we see any packet starting at time T, we can output all text samples that end by that time.
Bottom line: I think the subtitle filter needs to be informed whenever the current timestamp of the movie advances, so that it can output pending subtitles as early as possible. I need someone more familiar with the internals of HandBrake (I had never looked at it before writing this patch) to determine the best way of doing that.
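A streaming variant of the same idea might look like the sketch below. Again this is only an illustration under stated assumptions, not HandBrake's API: the class `SubtitleResequencer` and its methods `push` and `advance` are hypothetical names. It assumes lines arrive sorted by start time and that `advance(now)` is called only with timestamps before which no new line can start (for example, the start time of any packet in a well-interleaved stream).

```python
import heapq


class SubtitleResequencer:
    """Hypothetical streaming resequencer; all names are illustrative."""

    def __init__(self):
        self._active = []  # min-heap of (end, arrival_order, text)
        self._seq = 0      # arrival counter, keeps display order stable
        self._cursor = 0   # start time of the next sample to emit

    def _text(self):
        # Concatenate active lines in arrival order.
        ordered = sorted(self._active, key=lambda e: e[1])
        return "\n".join(text for _end, _seq, text in ordered)

    def advance(self, now):
        """Emit every sample that ends at a 'remove text' event <= now."""
        out = []
        while self._active and self._active[0][0] <= now:
            boundary = self._active[0][0]
            if boundary > self._cursor:
                out.append((self._cursor, boundary, self._text()))
                self._cursor = boundary
            # Drop every line that ends at this boundary.
            while self._active and self._active[0][0] <= boundary:
                heapq.heappop(self._active)
        return out

    def push(self, start, end, text):
        """Feed one source line; assumes start times never decrease."""
        out = self.advance(start)
        # The new line's "add text" event closes the current sample.
        if self._active and start > self._cursor:
            out.append((self._cursor, start, self._text()))
        self._cursor = start
        heapq.heappush(self._active, (end, self._seq, text))
        self._seq += 1
        return out


r = SubtitleResequencer()
first = r.push(10, 90, "one")    # nothing can be emitted yet
idle = r.advance(30)             # an AV packet at 30s: still nothing
second = r.push(40, 120, "two")  # closes the sample 10s to 40s
later = r.advance(100)           # closes the sample 40s to 90s
```

Note that `advance` never cuts a sample at `now` itself; it only emits samples whose end is already fixed by a "remove text" event, so no sample is split merely because the clock moved forward.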
JohnAStebbins wrote: Regarding the proposal to resequence overlapping subtitles (e.g. 1,2 -> 1,1.2,2): it is true that the data in containers is interleaved, but some containers do more interleaving than others. There is a certain granularity or chunk size that is often used, so you will not always get fine-grained interleaving. Also, the start times of overlapping subtitles could differ by large amounts, e.g. subtitle 1 start 10sec end 90sec, subtitle 2 start 40sec end 120sec. That's an extreme example, but it demonstrates that there are some cases we just won't be able to handle under any circumstances. So I'm starting to wonder if it is worth the trouble.