Concatenate overlapping UTF-8 subtitles

davidfstr
Enlightened
Posts: 149
Joined: Sun Apr 12, 2009 7:41 pm

Concatenate overlapping UTF-8 subtitles

Post by davidfstr »

This topic was moved from the ReviewBoard system.

https://reviews.handbrake.fr/reviewboard.fcgi/r/50/
Camillo wrote:Each sample in an MPEG4 text track fixes the displayed text for the duration of the sample. I think a good way to handle this is to treat each line in the source srt/ssa track as specifying two events, an "add text" event when it begins and a "remove text" event when it ends. Then each interval between subsequent events corresponds to a sample in the text track, whose text is the concatenation of the list of active subtitles. The problem is that we can only output a sample when we are sure that no more events will occur in the middle, because that would require breaking it down into two or more smaller samples. And we would like to determine this as soon as possible, to avoid holding up the pipes.

I think we can assume that the lines in the source track are sorted by their starting time. Therefore, as soon as we see a line beginning at time T, we can output all samples until that time. However, if a movie has very sparse subtitles, and the subtitle track is interleaved with the AV tracks, we may have to read and buffer a lot of data before we see the next subtitle packet; we may even have to wait until the end of the file. Now, I'm not very familiar with video, so correct me if I'm wrong, but I think we can also assume that, in a multiplexed container, the packets are also sorted by starting time; if that's the case, as soon as we see any packet starting at time T, we can output all text samples until that time.

Bottom line: I think the subtitle filter needs to be informed as the current timestamp of the movie advances, so it knows when to output pending subtitles as soon as possible. I need someone more familiar with the internals of Handbrake (I had never looked at it before writing this patch) to determine the best way of doing that.
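
The idea in the quote above can be sketched as a small streaming class. This is only an illustration under stated assumptions, not HandBrake code: the class name `StreamingSubtitleMerger`, its methods `add` and `advance`, and the newline used to join overlapping texts are all hypothetical. It assumes subtitles arrive sorted by start time and that `advance(now)` is only called once no subtitle starting before `now` can still arrive.

```python
from typing import List, Tuple

Sample = Tuple[float, float, str]  # (start, end, text)


class StreamingSubtitleMerger:
    """Hypothetical sketch: turns overlapping subtitles into text samples."""

    def __init__(self) -> None:
        self.cursor = 0.0  # end time of the last sample emitted
        self.active: List[Tuple[float, str]] = []  # (end, text), arrival order

    def _text(self) -> str:
        # Assumption: overlapping subtitles are joined with a newline.
        return "\n".join(text for _, text in self.active)

    def _emit_until(self, limit: float, out: List[Sample]) -> None:
        # Close every sample that ends where an active subtitle ends,
        # as long as that end is not later than `limit`.
        while self.active:
            boundary = min(end for end, _ in self.active)
            if boundary > limit:
                break
            if boundary > self.cursor:
                out.append((self.cursor, boundary, self._text()))
                self.cursor = boundary
            self.active = [(e, t) for e, t in self.active if e > boundary]

    def advance(self, now: float) -> List[Sample]:
        """The movie clock reached `now`; emit whatever is now final."""
        out: List[Sample] = []
        self._emit_until(now, out)
        return out

    def add(self, start: float, end: float, text: str) -> List[Sample]:
        """A new subtitle line arrived; emit everything before its start."""
        if start < self.cursor:
            raise ValueError("subtitle starts before already-emitted output")
        out: List[Sample] = []
        self._emit_until(start, out)
        if start > self.cursor:
            # The new start is itself an event, so it closes a sample.
            out.append((self.cursor, start, self._text()))
            self.cursor = start
        self.active.append((end, text))
        return out
```

Note that `advance` only closes samples at the end of an active subtitle. It never splits a sample at an arbitrary clock tick, so a long subtitle still becomes a single sample if nothing overlaps it.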
https://reviews.handbrake.fr/reviewboard.fcgi/r/61/
JohnAStebbins wrote:Regarding the proposal to resequence overlapping subtitles (e.g. 1,2 -> 1,1.2,2). It is true that the data in containers is interleaved. But some containers do more interleaving than others. There is a certain granularity or chunk size that is often used. So you will not always get fine grained interleaving. Also, the start time of overlapped subtitles could differ by large amounts. e.g. subtitle 1 start 10sec end 90sec, subtitle 2 start 40sec end 120sec. That's an extreme example, but demonstrates that there are some cases we just won't be able to handle under any circumstances. So I'm starting to wonder if it is worth the trouble.
davidfstr
Enlightened
Posts: 149
Joined: Sun Apr 12, 2009 7:41 pm

Re: Concatenate overlapping UTF-8 subtitles

Post by davidfstr »

If I recall correctly, the subtitle pipeline isn't synchronized as aggressively as the video/audio pipelines.

I think I remember running into some code that just assumed that the subtitles for a given frame were decoded by the time that the appropriate video frame rolls around. That is, it assumes that subtitles are always decoded faster than the associated video. There is already a bug that I suspect is a manifestation of this: https://trac.handbrake.fr/ticket/142

Anyway, that sounded like a tangent, but I think getting aggressive synchronization of the pipelines will be necessary first if you want to get overlapping subtitles to work as expected.

If you want to learn more about the HandBrake architecture in general, then you may want to look at our Technical Documents on Trac. In particular you may find the HandBrake Architecture Guide useful. Pipeline innards are documented on this subpage.

As for people resources, I'm probably the most knowledgeable person for the subtitle pipeline. Unfortunately my life is also rather busy at the moment, so I probably won't be able to assist you directly for the next few months. John is the wizard of the Core Library, of which the pipeline is a part. Ritsuka may also be a good resource for subtitle questions.
Camillo
Posts: 3
Joined: Sun Apr 03, 2011 7:00 pm

Re: Concatenate overlapping UTF-8 subtitles

Post by Camillo »

JohnAStebbins wrote:Regarding the proposal to resequence overlapping subtitles (e.g. 1,2 -> 1,1.2,2). It is true that the data in containers is interleaved. But some containers do more interleaving than others. There is a certain granularity or chunk size that is often used. So you will not always get fine grained interleaving. Also, the start time of overlapped subtitles could differ by large amounts. e.g. subtitle 1 start 10sec end 90sec, subtitle 2 start 40sec end 120sec. That's an extreme example, but demonstrates that there are some cases we just won't be able to handle under any circumstances. So I'm starting to wonder if it is worth the trouble.
At 10s we put subtitle 1 on the list of active subs, and output the sample <0s, 10s, no text>. At 40s we see that sub 2 starts: we output a sample <10s, 40s, text of sub 1>, then add subtitle 2 to the active list. At 90s we can output the second sample <40s, 90s, text of sub 1 + text of sub 2>, then we drop subtitle 1 from the active list. At 120s we output the third sample <90s, 120s, text of sub 2>, then we drop subtitle 2 from the active list.
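
The walk-through above can be checked offline with a short batch version. This is a minimal sketch and not the patch under review: the names `Subtitle` and `to_samples` are made up for illustration, and joining overlapping texts with a newline is an assumption. It needs all subtitles up front, so it sidesteps the buffering question entirely.

```python
from typing import List, NamedTuple, Tuple


class Subtitle(NamedTuple):
    start: float
    end: float
    text: str


Sample = Tuple[float, float, str]  # (start, end, text)


def to_samples(subs: List[Subtitle]) -> List[Sample]:
    """Split the timeline at every start/end event and list the active text."""
    # Every subtitle contributes two event times; time 0 opens the track.
    times = sorted({0.0} | {t for s in subs for t in (s.start, s.end)})
    samples: List[Sample] = []
    for begin, end in zip(times, times[1:]):
        # No event falls inside (begin, end), so a subtitle that is active
        # at `begin` stays active for the whole interval.
        active = [s.text for s in subs if s.start <= begin < s.end]
        samples.append((begin, end, "\n".join(active)))
    return samples
```

Calling `to_samples([Subtitle(10, 90, "sub 1"), Subtitle(40, 120, "sub 2")])` yields the four samples from the walk-through: an empty one for 0s to 10s, then sub 1 alone, then both, then sub 2 alone.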

The algorithm is not complex; the only problem, as far as I can see, is letting the subtitle muxer know when the movie has reached a given mark (which allows it to assume that no further subtitles will start before that mark). The 10s and 40s marks are already noticed by reading the subtitle pipe. For the 90s and 120s marks, we need to look at the entire stream somehow. This is where interleaving becomes significant.

The worst case is when we have no interleaving at all, i.e. the container has the tracks laid out one after the other. But in that case, you have to read the entire file anyway before you can start writing properly interleaved output (assuming we read the file sequentially), so the pipes are already being held up as much as possible. I'm not even sure if Handbrake handles this kind of abuse, but in any case, handling subtitles does not introduce any additional problems.

Then let's say we have a properly interleaved file. The interleaving may be more or less fine-grained, but in any case, there has to be some mechanism that lets the player know "you've got the data for all the tracks up to time 90s at this point, so you can play it". Otherwise it would have to read until the end just to make sure that there isn't a subtitle packet at the very end with a display time of 0s, and that would make interleaving moot, wouldn't it?

In other words, a player has to be able to know exactly which subtitle lines to display at a given time, otherwise it wouldn't be able to play the movie; but then we can use the same knowledge to control output of the text stream. Isn't that right?
Camillo
Posts: 3
Joined: Sun Apr 03, 2011 7:00 pm

Re: Concatenate overlapping UTF-8 subtitles

Post by Camillo »

Thanks David, I'll have a look at those resources and report back.
JohnAStebbins
HandBrake Team
Posts: 5712
Joined: Sat Feb 09, 2008 7:21 pm

Re: Concatenate overlapping UTF-8 subtitles

Post by JohnAStebbins »

Camillo wrote:
JohnAStebbins wrote:Regarding the proposal to resequence overlapping subtitles (e.g. 1,2 -> 1,1.2,2). It is true that the data in containers is interleaved. But some containers do more interleaving than others. There is a certain granularity or chunk size that is often used. So you will not always get fine grained interleaving. Also, the start time of overlapped subtitles could differ by large amounts. e.g. subtitle 1 start 10sec end 90sec, subtitle 2 start 40sec end 120sec. That's an extreme example, but demonstrates that there are some cases we just won't be able to handle under any circumstances. So I'm starting to wonder if it is worth the trouble.
At 10s we put subtitle 1 on the list of active subs, and output the sample <0s, 10s, no text>. At 40s we see that sub 2 starts: we output a sample <10s, 40s, text of sub 1>, then add subtitle 2 to the active list.
Let me see if I am following things up to this point correctly. Up to the 40s mark, no subtitles have been written yet. In the meantime, we have written 40s of audio and video. Then you want to write a subtitle that starts at the 10s mark? I really don't think this is going to work. No demuxer is going to look ahead in the stream by 30s worth of data to see if there happens to be a subtitle that is not interleaved properly. The other alternative is that you are assuming that we can look ahead in our input stream to the subtitle that starts at 40s *before* we output audio or video between the 10s and 40s mark. Neither of these cases is likely.
Camillo
Posts: 3
Joined: Sun Apr 03, 2011 7:00 pm

Re: Concatenate overlapping UTF-8 subtitles

Post by Camillo »

Good point. At 10s, the subtitle converter needs to be able to tell the muxer to hold up the other streams until it can output the pending subtitle packet (which happens at 40s). Otherwise, the converter can output the subtitle sample right away, but then it needs the ability to go back and adjust its duration. It depends on whether we can expect to be able to seek in the output file. The third option is to read all of the subtitle packets upfront, then go back and process the other tracks, but this requires the ability to seek in the input.
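
To make the second option concrete (output right away, then go back and adjust), here is a rough sketch. Everything in it is hypothetical: `PatchingWriter` and its in-memory `samples` list merely stand in for a seekable output's sample table, and texts are joined with a newline by assumption. Note that it does slightly more than patch a duration: provisional samples that begin at or after a newly arrived start time are discarded and rewritten. Subtitles are assumed to arrive sorted by start time.

```python
from typing import List, Tuple

Sample = Tuple[float, float, str]  # (start, end, text)


class PatchingWriter:
    """Hypothetical sketch: write samples eagerly, revise them on overlap."""

    def __init__(self) -> None:
        self.samples: List[Sample] = []  # stand-in for a seekable sample table
        self.subs: List[Tuple[float, float, str]] = []

    def add(self, start: float, end: float, text: str) -> None:
        kept: List[Sample] = []
        for s, e, t in self.samples:
            if e <= start:
                kept.append((s, e, t))      # already final
            elif s < start:
                kept.append((s, start, t))  # go back and shorten it
            # Samples starting at or after `start` are rewritten below.

        # Fill any gap before the new subtitle with an empty sample.
        last_end = kept[-1][1] if kept else 0.0
        if last_end < start:
            kept.append((last_end, start, ""))

        # Rewrite the tail using every subtitle still visible after `start`.
        self.subs.append((start, end, text))
        live = [(s, e, t) for s, e, t in self.subs if e > start]
        times = sorted({start} | {e for _, e, _ in live})
        for a, b in zip(times, times[1:]):
            kept.append((a, b, "\n".join(t for _, e, t in live if e > a)))
        self.samples = kept
```

With John's example, adding sub 1 writes a provisional sample covering 10s to 90s; adding sub 2 then shortens it to end at 40s and appends the samples for 40s to 90s and 90s to 120s.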

I don't see another way to implement mpeg4 text tracks correctly, so choose your poison. :-)
Ritsuka
HandBrake Team
Posts: 1650
Joined: Fri Jan 12, 2007 11:29 am

Re: Concatenate overlapping UTF-8 subtitles

Post by Ritsuka »

In mp4 the demuxer knows the position of every sample, so it would work. Not the best for progressive download, but nothing that an optimize can't fix.
JohnAStebbins
HandBrake Team
Posts: 5712
Joined: Sat Feb 09, 2008 7:21 pm

Re: Concatenate overlapping UTF-8 subtitles

Post by JohnAStebbins »

Ritsuka wrote:In mp4 the demuxer knows the position of every sample, so it would work. Not the best for progressive download, but nothing that an optimize can't fix.
Ah, thanks for the input. So this would work with a simple algorithm that just delayed each subtitle packet till the next one arrived. Full steam ahead, Camillo!