lpcm in m2ts

Post by **JohnAStebbins** » Wed Oct 22, 2008 1:13 am

This is pretty much an fyi. Hoping someone can add to this.

Is anyone familiar with lpcm audio in transport streams? From browsing the code and inspecting the contents of an m2ts file, it looks like our current ts parsing will not handle what is done in m2ts files. We currently seem to expect a header containing id, sample rate, bitrate, channels, etc. at the top of each PES that contains lpcm. The m2ts doesn't seem to have that header, and I'm having no luck figuring out where they've gone and stashed that information. This problem affects capture of the ac3 tracks as well, because the random data in the payload of the lpcm track is sometimes interpreted as ac3 in demuxmpeg and handed to deca52, causing a52 sync errors.

Here's what the top of an lpcm pes looks like from an m2ts:

Code:

start code prefix - 00 00 01 
stream id -         fd 
extension -         0a 0b 84 81 08 
pts -               21 00 3f ff e1 
pes extension -     01
pes extension 2 -   81 76 
payload -           0b 77 e9 84 24 30 e1 df fd ca d9 24 a0 00 07 d2
                    ab c3 0c 30 c3 fe 75 7c f9 f3 e7 cf 9f 3e 7c f9
                    f3 e7 cf 9f 3e 7c f9 f3 e7 cf 9f 3e 7c f9 f3 e7
...

PES extension 2 provides the extended_stream_id.
Note that the front of the payload makes no sense if you try to interpret it as the lpcm header we are expecting.
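A minimal sketch of how the extended_stream_id could be pulled out of a header like the one above by walking the optional PES fields instead of assuming a fixed offset. The function name is hypothetical (it is not the code in the patch), and the field order is my reading of the 13818-1 PES syntax, so treat it as an assumption to check against the spec.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the 7-bit stream_id_extension of a PES packet, or -1 if the
 * header has none or is truncated. `pes` points at the 00 00 01 prefix.
 * Hypothetical helper, not taken from the HandBrake sources.
 */
static int pes_stream_id_extension(const uint8_t *pes, size_t len)
{
    if (len < 9 || pes[0] != 0x00 || pes[1] != 0x00 || pes[2] != 0x01)
        return -1;

    uint8_t flags = pes[7];
    size_t hdr_end = 9 + (size_t)pes[8];    /* index of first payload byte */
    if (!(flags & 0x01) || hdr_end > len)   /* no PES extension, or short buffer */
        return -1;

    size_t pos = 9;
    if (flags & 0x80) pos += 5;             /* PTS */
    if (flags & 0x40) pos += 5;             /* DTS */
    if (flags & 0x20) pos += 6;             /* ESCR */
    if (flags & 0x10) pos += 3;             /* ES rate */
    if (flags & 0x08) pos += 1;             /* DSM trick mode */
    if (flags & 0x04) pos += 1;             /* additional copy info */
    if (flags & 0x02) pos += 2;             /* previous PES CRC */
    if (pos >= hdr_end)
        return -1;

    uint8_t ext = pes[pos++];               /* PES extension flag byte */
    if (ext & 0x80) pos += 16;              /* PES private data */
    if (ext & 0x40) {                       /* pack header field */
        if (pos >= hdr_end)
            return -1;
        pos += 1 + (size_t)pes[pos];
    }
    if (ext & 0x20) pos += 2;               /* program packet sequence counter */
    if (ext & 0x10) pos += 2;               /* P-STD buffer */
    if (!(ext & 0x01) || pos + 2 > hdr_end) /* no PES extension 2 */
        return -1;

    uint8_t field_len = pes[pos] & 0x7f;    /* PES_extension_field_length */
    uint8_t id = pes[pos + 1];
    if (field_len < 1 || (id & 0x80))       /* top bit clear means the id is present */
        return -1;
    return id & 0x7f;
}
```

Fed the 17 header bytes from the dump above, this walks past the PTS, reads the extension flag byte 01, then the 81 76 pair, and returns 0x76.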

EDIT: Umm, during my drive home, it occurred to me that the first 2 bytes of that payload look suspiciously like ac3 sync. I may have captured the wrong pid there. I'll have to double check it. If I have the right pid, then it looks like the PMT is lying to us.

The PMT identifies the pid as lpcm, but then has an ac3 audio descriptor for it. Here's the audios and descriptor information from the PMT. Note channels = 0 for the lpcm pid.

Code:

pid 1100 (lpcm)
desc id 5
desc id 81
len 5
sample rate code 0
bsid 6
bit rate code 18
surround mode 0
bsmod 0
num channels 0
full svc 0
langcod 0

pid 1101 (ac3)
desc id 5
desc id 81
len 4
sample rate code 0
bsid 8
bit rate code 18
surround mode 0
bsmod 0
num channels 7
full svc 0
langcod 0

pid 1102 (ac3)
desc id 5
desc id 81
len 4
sample rate code 0
bsid 8
bit rate code 18
surround mode 0
bsmod 0
num channels 7
full svc 0
langcod 0
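For anyone wanting to reproduce a dump like this, here is a rough sketch of unpacking the body of a descriptor 0x81. The bit layout is my understanding of the ATSC AC-3 audio descriptor and should be treated as an assumption; the struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading fields of an ATSC AC-3 audio descriptor (hypothetical struct). */
typedef struct {
    uint8_t sample_rate_code;   /* 3 bits */
    uint8_t bsid;               /* 5 bits */
    uint8_t bit_rate_code;      /* 6 bits */
    uint8_t surround_mode;      /* 2 bits */
    uint8_t bsmod;              /* 3 bits */
    uint8_t num_channels;       /* 4 bits */
    uint8_t full_svc;           /* 1 bit  */
    uint8_t langcod;            /* 8 bits */
} ac3_desc_t;

/* `body` is the descriptor payload after the tag and length bytes.
 * Returns 0 on success, -1 if the body is too short. */
static int parse_ac3_descriptor(const uint8_t *body, size_t len, ac3_desc_t *d)
{
    if (len < 4)
        return -1;
    d->sample_rate_code = body[0] >> 5;
    d->bsid             = body[0] & 0x1f;
    d->bit_rate_code    = body[1] >> 2;
    d->surround_mode    = body[1] & 0x03;
    d->bsmod            = body[2] >> 5;
    d->num_channels     = (body[2] >> 1) & 0x0f;
    d->full_svc         = body[2] & 0x01;
    d->langcod          = body[3];
    return 0;
}
```

Only the first four bytes are decoded; any bytes after langcod are ignored here.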

Post by **JohnAStebbins** » Wed Oct 22, 2008 4:40 pm

Well, this is interesting. The data I captured for the lpcm pid is correct, but incomplete. There are pes packets on that pid with a different extended_stream_id. They look like this:

Code:

start code prefix - 00 00 01 
stream id -         fd 
extension -         00 95 84  81 08 
pts -               21 00 3f ff e1 
pes extension -     01
pes extension 2 -   81 72
payload -           10 45 ff d8 f8 72 6f ba 00 57 a0 0f b7 52 00 00 
                    00 00 86 d5 20 3c 03 80 45 56 e3 06 e3 00 2f 30
                    b0 18 03 f0 a0 31 03 f0 f1 ea 00 00 01 10 00 00

The extended_stream_id for the packet in my previous post is 0x76. This one is 0x72.
There are groups of these packets between each of the other (ac3?) packets. There appears to be a sequence counter in the payload (byte 2 of payload). It's a full byte, not 5 bits as we expect for lpcm. And it doesn't increment for each pes packet. It spans packets. So these also don't look like the kind of lpcm we expect. Very strange.

The normal ac3 pes packets on the other audio pids both have the same extended_stream_id of 0x71.

Post by **JohnAStebbins** » Thu Oct 23, 2008 9:15 pm

I've had some success at encoding this source now. That's not saying I'm doing everything correctly. There's still a lot about the stream that I don't have enough knowledge about. It seems the audio track that is tagged in the PMT with stream type 0x83 is not lpcm. It's some sort of multiplex with ac3 audio on one of the substreams and god-knows-what on the other substreams. After substantial hacking, I'm able to process it. The hacks to do so are:
1. change stream2codec table entry for 0x83 from lpcm to ac3
2. separate processing of stream id 0xbd from 0xfd. this might not be necessary, but it simplified the problem.
3. make the "id" that buffers are tagged with a composite of the pid and pes stream id. tag the buffer returned from hb_ts_stream_decode with this id. This prevents the accidental mixing of stream data that happened in demuxmpeg when it constructed the id from data that didn't follow the expected audio data format.
4. change hb_ts_stream_decode so it doesn't break a single long pes from the transport into several shorter pes with continuations. This was just another simplification of the logic to make my work easier.
5. make hb_demux_ps do special processing of pes packet with stream id 0xfd. Finds the stream_id_extension and only processes packets with specific values that I identified in my stream as ac3.

There's a lot to do to make this right. And I don't yet have enough information. But at least I can now encode my Ironman Blu-ray.

Here's the patch if anyone is interested in taking a peek.
http://handbrake.fr/pastebin/pastebin.php?show=148

Post by **JohnAStebbins** » Fri Oct 24, 2008 12:17 am

Score! This is what I've been looking for:

ffmpeg-devel madshi wrote: To my best knowledge this is the right way to demux TS and m2ts files:

0x01: MPEG2 video
0x02: MPEG2 video
0x03: MP2 audio (MPEG-1 Audio Layer II)
0x04: MP2 audio (MPEG-2 Audio Layer II)
0x06: private data (can be AC3, DTS or something else)
0x0F: AAC audio (MPEG-2 Part 7 Audio)
0x11: AAC audio (MPEG-4 Part 3 Audio)
0x1B: h264 video
0x80: MPEG2 video or PCM audio
0x81: AC3 audio
0x82: DTS audio
0x83: TrueHD/AC3 interweaved audio
0x84: E-AC3 audio
0x85: DTS-HD High Resolution Audio
0x86: DTS-HD Master Audio
0x87: E-AC3 audio
0xA1: secondary E-AC3 audio
0xA2: secondary DTS audio
0xEA: VC-1 video

There are two problematic situations:

(1) 0x80 can be either MPEG2 video or LPCM audio. In the M2TS
container 0x80 should always be LPCM audio. In the TS container
0x80 is normally MPEG2 video. BUT - if people convert an M2TS
stream to TS, 0x80 can be PCM. Also if people convert a TS stream
to M2TS, 0x80 could be MPEG2 video. The most reliable way to
figure out what it really is is checking the descriptors. If there's a
descriptor 0x05 with the format_identifier "HDMV" then the track
originates from a Blu-Ray and thus is PCM.

(2) 0x06 can be AC3, DTS or something else. You need to check
the descriptors to find out what it really is. If there's a descriptor
0x05 with the format_identifier of "DTS1", "DTS2", "DTS3" or
"AC-3" then you know what the track is. If there's a descriptor
0x05 with the format_identifier "BSSD" then it's an old style (non
Blu-Ray) LPCM track. I don't know how to handle such a track.
Have no sample for that. Finally, if there's no descriptor 0x05
which tells you which format this track is, you can look for
descriptors 0x6a (DVB AC3), 0x73 (DVB DTS) or 0x81 (ATSC
AC3). If any of these 3 descriptors is present, again you know
what the 0x06 track contains.

Edit: Note that 0x83 can also be LPCM. Would have to look at descriptors to distinguish.
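The two problematic cases in the quoted post boil down to a walk over the PMT's ES descriptor loop. Here is a small sketch of that logic, covering only stream types 0x80 and 0x06. The enum and function names are hypothetical, and falling back to MPEG2 video for an untagged 0x80 is just one reading of "normally" in the quote.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { KIND_UNKNOWN, KIND_MPEG2_VIDEO, KIND_LPCM, KIND_AC3, KIND_DTS } kind_t;

/* Return the body of the first descriptor with tag `want` in an ES
 * descriptor loop, or NULL. *body_len receives the body length. */
static const uint8_t *find_desc(const uint8_t *d, size_t len, uint8_t want,
                                size_t *body_len)
{
    size_t pos = 0;
    while (pos + 2 <= len) {
        size_t dlen = d[pos + 1];
        if (pos + 2 + dlen > len)
            break;                          /* truncated descriptor */
        if (d[pos] == want) {
            *body_len = dlen;
            return d + pos + 2;
        }
        pos += 2 + dlen;
    }
    return NULL;
}

static kind_t classify(uint8_t stream_type, const uint8_t *d, size_t len)
{
    size_t n = 0;
    const uint8_t *reg = find_desc(d, len, 0x05, &n);
    int have_fmt = reg != NULL && n >= 4;   /* format_identifier is 4 bytes */

    if (stream_type == 0x80) {
        if (have_fmt && memcmp(reg, "HDMV", 4) == 0)
            return KIND_LPCM;               /* originates from a Blu-ray */
        return KIND_MPEG2_VIDEO;            /* assumed default for plain TS */
    }
    if (stream_type == 0x06) {
        if (have_fmt && memcmp(reg, "AC-3", 4) == 0)
            return KIND_AC3;
        if (have_fmt && memcmp(reg, "DTS", 3) == 0 &&
            reg[3] >= '1' && reg[3] <= '3')
            return KIND_DTS;
        /* no usable registration descriptor: try DVB/ATSC descriptors */
        if (find_desc(d, len, 0x6a, &n) || find_desc(d, len, 0x81, &n))
            return KIND_AC3;
        if (find_desc(d, len, 0x73, &n))
            return KIND_DTS;
    }
    return KIND_UNKNOWN;                    /* includes "BSSD" LPCM */
}
```

A "BSSD" track ends up as KIND_UNKNOWN, matching the quoted author's note that he has no sample for it.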

Post by **JohnAStebbins** » Fri Oct 24, 2008 9:19 pm

Given the information in my last post, I've come up with a better way of handling the ac-3/TrueHD interleaved stream.
If the stream type is 0x83, I check the format_identifier from the registration descriptor for the stream. If it is "AC-3", then it's ac-3/truehd, and the id assigned is the composition of PID, stream id, and the stream id extension for the ac-3 substream. Otherwise it's lpcm and the assigned id is, as for other stream types, the composition of PID and stream id. Then, when generating a pes packet (function generate_output_data()), if the stream type of the pes is 0x83 and the stream is a multiplexed stream, the id assigned to the buffer is the composition of the PID, stream id and stream id extension of the PES being generated.

There is one remaining question. The stream id extension for the ac-3 substream in my sample is 0x76. I don't know if this is a fixed value for all ac-3 substreams, or if this can change from stream to stream. I haven't found any directory in the mpeg structures that tells me the substream's id, so it either has to be fixed or you have to scan the audio stream for ac-3 headers to determine which is which. For now, I'm treating it as a fixed value.

EDIT: Found the source to tsremux which can read and write these streams. The values 0x76 for the ac-3 substream and 0x72 for the TrueHD substream are hardcoded there as well. It's not authoritative, but it's a data point in favour.
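To make the id scheme concrete, here is a minimal sketch. The bit layout of the composite id and all names are assumptions for illustration (the patch may pack things differently), and it treats 0x83 as the PMT stream type and 0xfd as the PES stream id.

```c
#include <stdint.h>
#include <string.h>

/* Values seen in one sample and hardcoded in tsremux; not confirmed by a spec. */
#define EXT_AC3    0x76
#define EXT_TRUEHD 0x72

/* Hypothetical packing: PID in the high bits, then stream id, then extension. */
static uint32_t make_id(uint16_t pid, uint8_t stream_id, uint8_t ext)
{
    return ((uint32_t)pid << 16) | ((uint32_t)stream_id << 8) | ext;
}

/* Id assigned to the audio track when the PMT is parsed. */
static uint32_t track_id(uint16_t pid, uint8_t stream_type, uint8_t stream_id,
                         const char *format_identifier)
{
    if (stream_type == 0x83 && format_identifier != NULL &&
        strcmp(format_identifier, "AC-3") == 0)
        return make_id(pid, stream_id, EXT_AC3);   /* ac-3/truehd multiplex */
    return make_id(pid, stream_id, 0);             /* lpcm and everything else */
}

/* Id tagged onto each generated PES buffer. */
static uint32_t pes_buffer_id(uint16_t pid, uint8_t stream_id,
                              int multiplexed, uint8_t ext)
{
    return make_id(pid, stream_id, multiplexed ? ext : 0);
}
```

With this, only PES packets carrying extension 0x76 get an id equal to the track's id, so the TrueHD packets on the same PID never reach the ac3 decoder.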

Here's my current patch:
http://handbrake.fr/pastebin/pastebin.php?show=152

Post by **JohnAStebbins** » Sun Oct 26, 2008 6:24 pm

While working on getting blu-ray encoding working, I've run across a few more glitches with transport stream processing.
1. when the program list in the pat contains a network PID, the pmt parsing code doesn't properly skip this entry. this causes attempts to interpret a pat as a pmt since the pid entry in the table is left initialized to 0.
2. pmt parsing always waits till the start of the second pmt section before parsing the first. if for some reason there were only one pmt in the stream, it would never parse the pmt. I've changed it to parse the pmt as soon as all the necessary bytes are collected.
3. ts_nextpcr was incremented by an arbitrary 10 ticks for every pes packet. This produces an inaccurate pcr for packets between real pcr's. It would produce a certain amount of jitter in the system clock. I don't know exactly what the effect would be, but it can't be good. I modified it to only pass up pcr values derived from the stream (either real pcr's or dts's).
4. when looking for the first video frame of h.264 video, we allow non-IDR frames to pose as i-frames. This was done because some sources seem to have few or no IDR frames. But these non-IDR frames make for very messy previews and break autocropping. I added a threshold. First check for only IDR. If the first N frames checked are all non-IDR, start again and allow non-IDR.
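As an illustration of item 1, here is a rough sketch of walking the PAT program loop and skipping the network PID entry (program_number 0). The function name and calling convention are hypothetical and this is not the code in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Collect PMT PIDs from the program loop of a PAT section.
 * `loop` points at the first 4-byte entry (after the section header) and
 * `len` excludes the trailing CRC_32. Returns the number of PIDs stored.
 */
static size_t pat_pmt_pids(const uint8_t *loop, size_t len,
                           uint16_t *pids, size_t max)
{
    size_t n = 0;
    for (size_t pos = 0; pos + 4 <= len && n < max; pos += 4) {
        uint16_t program = (uint16_t)((loop[pos] << 8) | loop[pos + 1]);
        uint16_t pid = (uint16_t)(((loop[pos + 2] & 0x1f) << 8) | loop[pos + 3]);
        if (program == 0)
            continue;       /* network PID: skip it, it is not a PMT */
        pids[n++] = pid;
    }
    return n;
}
```

Skipping the entry outright, rather than leaving a zeroed slot in the table, avoids later treating PID 0 (the PAT itself) as a PMT.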

Let me know yay or nay what you think of these changes, as I would like to get them main-lined at some point.
http://handbrake.fr/pastebin/pastebin.php?show=154

Post by **van** » Mon Oct 27, 2008 2:22 pm

Nice detective work John & it's good to see others digging into the stream stuff. But I'm a little concerned about the scope of the changes. Titer provided an elegant & very fast PS demuxer in libhb/demuxmpeg. I'd like to convince myself that it's absolutely necessary to push the complexity of parsing all the 13818-1(2007) extended ID cruft into it. So far it's gotten by with minimal knowledge of PES hdrs and only the 13818-1(2000) info that's accepted by everything (the 2007 extensions can't appear on DVDs). Since the PS packets are built by stream.c, the philosophy so far has been to keep all the complexity in it (for example there's code in hb_ts_stream_set_audio_id_and_codec to handle extended IDs that's taken care of everything so far, except apparently Iron Man) and send demuxmpeg only what it needs to do its job.

I'm still dealing with a backlog of work, life & HB stuff that piled up during my trip and I'm completely tied up with work stuff for the next 24 hours. As soon as I get time I'd like to go through your patch carefully to understand what's there.

Post by **JohnAStebbins** » Mon Oct 27, 2008 4:44 pm

Van, thanks for having a look. I can stand in line and wait my turn for you to catch up.

That extra complexity I added to demuxmpeg doesn't have to be there. I added it to reduce code duplication. The stream code needs to access the stream id extension to distinguish ac3 from truehd. There's no existing code that does that. I could move that new function I created (hb_parse_pes) into stream.c and leave hb_demux_ps pretty much alone. However, I really feel that the processing overhead of this is going to be in the noise.

And after saying that, I thought it would be a good idea to do some measurements. I set up an encode with settings that would make it as fast as possible so that any demux overhead would be a larger percentage of the total. Used ffmpeg abr, ac3 passthru, no filters, mkv, no chapter markers. Encoded a single chapter of a dvd multiple times. There's some variability in the times, but both versions look to be equivalent in speed.

Code:

Modified:
[09:25:46] work: average encoding speed for job is 240.474472 fps
[09:26:32] work: average encoding speed for job is 241.360687 fps
[09:27:24] work: average encoding speed for job is 241.416595 fps
[09:28:09] work: average encoding speed for job is 243.228729 fps
[09:28:55] work: average encoding speed for job is 242.706223 fps
[09:29:44] work: average encoding speed for job is 242.519684 fps
[09:30:31] work: average encoding speed for job is 241.130600 fps

Orig:
[09:22:16] work: average encoding speed for job is 242.870621 fps
[09:23:08] work: average encoding speed for job is 241.265686 fps
[09:23:52] work: average encoding speed for job is 242.795944 fps
[09:24:37] work: average encoding speed for job is 242.640778 fps
[09:31:45] work: average encoding speed for job is 239.752441 fps
[09:32:27] work: average encoding speed for job is 242.597168 fps
[09:33:19] work: average encoding speed for job is 240.778275 fps

Post by **JohnAStebbins** » Wed Oct 29, 2008 4:13 pm

The patch I posted has a lot of things mixed together, some of which probably need more thought. So I'm splitting things up so they can be reviewed and worked on independently. Here are the 2 that I'm really most interested in.

Interleaved truehd/ac3 parsing
http://handbrake.fr/pastebin/pastebin.php?show=162

pmt parsing
http://handbrake.fr/pastebin/pastebin.php?show=163

Post by **JohnAStebbins** » Sat Nov 08, 2008 6:02 pm

Found another glitch in m2ts parsing. It's trying to interpret a DigiCipher PID as ac3 audio. deca52 eventually crashes badly on the DigiCipher data. In stream.c, we mark pretty much all PIDs that we don't handle as "unknown", even when the type is known, such as DigiCipher. Then we expect the scan to weed out the bad ones. But LookForAudio tries too hard to validate the data as the expected type. It fails hundreds of times to validate the data, then one accidental success allows it to pass. There are a couple of ways I see to solve this. We can mark known but unhandled types as "ignore" and/or tighten up the audio scan. Tightening up the audio scan is probably a good idea in any case, so I think we should at least do this and optionally add the former.

EDIT: I had something like this in mind for a stricter audio scan.
http://handbrake.fr/pastebin/pastebin.php?show=202

Post by **van** » Sun Nov 09, 2008 9:00 am

John,

Sorry for the long delay in replying. The reason I'm uncomfortable with the scope of this change has to do with a future direction possibility that you couldn't know about (because it's not done & I haven't talked about it). Let me give that background:

When awk wrote the original stream code, HB only supported the DVD subset of mpeg video - program streams with fixed length, 2048 byte frames using a strictly defined subset of the ISO-13818-1 2000 spec with DVD-specific extensions such as the encapsulation headers for AC-3, DTS, LPCM & subtitles in a private-stream-1 PES. He wrote the TS code to make the right things happen given this infrastructure. Since that time the 2048 byte limit got removed to support Tivo files then the reader was generalized to handle multiple container formats and multiple demuxers for ffmpeg inputs.

The reader front end is now general enough that Transport Streams can be handled directly rather than going through the translation step. This is a big performance win since right now our highest volume (HD) content has to go through *four* memory-to-memory copies before it ever gets to the decoder: i/o buf -> TS buf -> stream's PES buf -> reader's PS buf -> reader's PES buf. I've been working on a rewrite of the transport stream code that should get this down to one copy. It demuxes on PID, as per the spec, and decapsulates the PES header on the fly to end up with just the payload in a buf, all set to be sent downstream to the decoder. This is currently limping (I shelved it to smash 0.9.3 bugs). I don't want to make big changes to the architecture of the current TS-to-PS code because I hope it's about to go away. I really don't want to make any changes to the PS demuxer to support ISO-13818-1 2007 TS features because the TS's are going to take a completely different path.

As regards scanning all the 'unknown' PIDs, there are two reasons why we don't currently look at the descriptors in the PMT and scan all the Elementary Streams to see what they contain:

The descriptors were frequently not there: when ATSC first rolled out the same companies made both the head-end encoder boxes and customer decoder boxes. Like Microsoft, these companies viewed standard conformance as a bug not a feature since interoperability destroys your lock-in. So, for example, jbrjake's cableco was (is?) sending AC-3 audio on a PID marked as type 6 with no descriptors at all.

The descriptors were frequently misleading or wrong: Until the 2007 rev of the mpeg2 standard there were no standard descriptors for things like AC-3, DTS or PCM. Since the cableco's wanted to broadcast surround sound, both ATSC & DVB came up with their own descriptors to describe these formats but, since they did their work on different sides of an ocean, in a few important cases the same code points got used to mean completely different things. It would have been possible to resolve this by first deciding if we had DVB or ATSC then decode the descriptors using system-specific parsing routines but, given the previous problem, it seemed easier to kill two birds with one stone & just look at each PES to figure out what was in it (since the PES formats were governed by a single, uniform standard until bluray decided to really muck things up).

In the new code I'm trying to change this. The descriptors are used to type PIDs and we only look into the PES if we don't have an audio and/or video stream.

As far as mis-typing audio, both the AC-3 & DTS standards talk at length about alias issues with their sync pattern. Both standards have a CRC for the entire frame (AC-3 has two CRCs - one for 5/8th of the frame & one for the entire thing). Their suggested resync algorithm is: If you've lost or never had sync find a sync pattern then, if and only if the frame CRC verifies, declare sync. If you have sync, you can just check that the next frame starts with a sync pattern (but it really doesn't cost anything to check CRC since you have to haul the bytes into your cache to decode them). When I saw eddy's 'check sync twice' addition to scan, I added a 'check audio frame crc' item to my todo list in the hope that we can eventually validate sync the way the standard suggests.
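A rough sketch of the "declare sync only if the frame CRC verifies" check for AC-3. As far as I recall the A/52 spec, the CRC polynomial is x^16 + x^15 + x^2 + 1 and running every byte after the sync word (including the trailing CRC word) through it leaves zero for a good frame. The frame length is passed in here as a simplifying assumption; a real scanner would derive it from the fscod/frmsizecod fields in the frame header. The function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16 with polynomial 0x8005, MSB first, initial value 0. */
static uint16_t ac3_crc16(const uint8_t *p, size_t len)
{
    uint16_t crc = 0;
    while (len--) {
        crc ^= (uint16_t)(*p++ << 8);
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8005)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* 1 if `frame` starts with the 0b 77 sync word and the whole-frame CRC
 * verifies, 0 otherwise. `frame_bytes` is the full frame length. */
static int ac3_frame_ok(const uint8_t *frame, size_t frame_bytes)
{
    if (frame_bytes < 4 || frame[0] != 0x0b || frame[1] != 0x77)
        return 0;
    return ac3_crc16(frame + 2, frame_bytes - 2) == 0;
}
```

A stray 0b 77 in some other stream's payload passes the sync test but is very unlikely to pass the CRC as well, which is the point of the standard's resync rule.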

As far as the PCR increment, the PCR is only used to detect clock discontinuities. As long as it increments at a rate that doesn't cause it to pass the real PCR (time going backward is always regarded as a discontinuity but it has to jump forward by more than 500ms to be considered a discontinuity & PCR's have to show up at least every 100ms) no jitter is introduced because the old-to-new timing offset is derived from some media stream's PTS differences. The increment wasn't arbitrary - in the STD the PCR is incremented by the arriving bits of the data stream. Since the transmitter is fixed rate and that rate is announced in the adaptation header that contains the PCR, the receiver puts that rate into its VCO then feeds its PLL with the leading edge of the arriving bits. Since the PCR advances by a fixed amount for a fixed number of bits & we're generating (roughly) fixed length 2048 byte PS packets, you end up with a fixed increment to the PCR for each PS packet generated. But, all that being said, your way of handling the PCR is much simpler & cleaner - it never occurred to me that the PACK header could simply be left off when we didn't have a PCR - very nice!

Anyway, my preference for the upcoming release would be to put in your PAT, PMT & PCR fixes but do a simpler version of the type 0x83 interleaved stream handling that leaves the ps muxer as it is. To just extract the AC-3, I think most of the runtime work could be done with a small addition to hb_ts_stream_decode to not accumulate PES data for this type until it got 'start' with an 0x76 in the stream ID extension. (I'll try to code this up to make it clearer.)

Post by **JohnAStebbins** » Sun Nov 09, 2008 4:33 pm

Van, thanks for taking the time to give such a detailed critique. I agree with it all.
I'm very glad to hear that you had planned on separating TS demuxing from PS. I don't know how closely you looked at my code changes to handle the interleaved truehd/ac3. What I was trying to do was move as much of the logic into stream.c as possible. The changes to the ps demuxer were all tweaks to bypass processing that isn't really needed for the transport stream. The buffer id's are all determined in stream.c and the pid is one component of the id. As you said, the PS demuxer is really just copy overhead.

Regarding scanning unknown PIDs, that patch I posted is really awful. After sleeping on it I realized it's based on a really bad assumption that there will be more successful sync checks than unsuccessful checks. Since buffers can be arbitrarily small and audio frames can span buffers, that assumption doesn't hold. Checking the CRC would be ideal, but I think that would require keeping context between calls to deca52BSInfo and then only returning success when a good CRC is found.

Sounds like you've got a lot on your plate. Is there anything I can help with? I'd be glad to take a whack at some of it with your direction.

Post by **JohnAStebbins** » Tue Nov 11, 2008 7:25 pm

I took a crack at implementing multiplexed truehd/ac3 handling per van's suggestion. Everything is self-contained in stream.c. The stream is marked as an ac3 stream. The truehd portions are dropped.

Also, there seem to be a significant number of blu-rays that have DigiCipher streams (stream id 0x80). We choke and die on these. I added an "ignore" kind to stream2codec_t and set streams with stream id 0x80 to ignore. We've discussed this before and concluded that a good way to handle this is to check the CRC's of (potential) ac3 streams. But that's not done yet and this is a cheap substitute. If this causes known problems, just say so and I'll remove it.

http://handbrake.fr/pastebin/pastebin.php?show=219

Post by **van** » Wed Nov 12, 2008 8:53 am

Looks like we were working in parallel John. I tried to fold your older code into my new ts stream architecture. Thanks to your great detective work it seems to handle type 0x83 just fine and so far has converted Iron Man & Crystal Skull with no problem. I haven't finished the performance work but I did do a bit more on finding random access points (all the m2ts content has them, it's hit or miss with the NZ ts files - some have RAP marks but they aren't actually restart points & there are no IDR frames at all).

Anyway, the RAP search plus forcing a search for true IDR frames in m2ts has meant that all the preview frames in my two test cases were perfect so cropping, etc., worked.

The patch is at http://handbrake.fr/pastebin/pastebin.php?show=224. Note that it's a work in progress but the more testing the better.

Post by **JohnAStebbins** » Wed Nov 12, 2008 3:51 pm

Very cool. I'll give it some testing.

I did that other patch because I got the impression that it would take you longer to crank that out. I thought it would make a good compromise till the real work could be done. I should have known better.

Post by **JohnAStebbins** » Wed Nov 12, 2008 7:37 pm

After looking over your patch, I've got a couple comments. When you look up the stream extension id, you're assuming it's at pes[pes[8]+8]. This works only if the pes extension doesn't have any extra bytes. The extension has a length field and could have additional bytes after the stream extension id.

Also, you're assuming that the entire pes header will fit in a single transport packet. The adaptation field could push the header out across the boundary of 2 packets.

I know that these are only theoretical possibilities and they may be impossible due to limitations dictated by other specifications. I'm just wondering if you know something I don't, or you're just hoping we don't bump into these corner cases.
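To illustrate the second corner case, here is a small sketch that checks whether a PES header starting in a transport packet lies entirely within that packet once the adaptation field is accounted for. The names are hypothetical and the packet layout is my reading of 13818-1.

```c
#include <stddef.h>
#include <stdint.h>

#define TS_PACKET_SIZE 188

/* Offset of the payload within a TS packet, or -1 if there is no payload
 * or the packet is malformed. */
static int ts_payload_offset(const uint8_t *pkt)
{
    if (pkt[0] != 0x47)
        return -1;                          /* lost sync */
    int afc = (pkt[3] >> 4) & 0x03;         /* adaptation_field_control */
    if (!(afc & 0x01))
        return -1;                          /* adaptation field only */
    int off = 4;
    if (afc & 0x02)
        off += 1 + pkt[4];                  /* length byte plus the field */
    return off < TS_PACKET_SIZE ? off : -1;
}

/* 1 if a PES header starts in this packet and fits in it completely,
 * 0 if it spills into the next packet or no PES starts here. */
static int pes_header_fits(const uint8_t *pkt)
{
    if (!(pkt[1] & 0x40))
        return 0;                           /* payload_unit_start not set */
    int off = ts_payload_offset(pkt);
    if (off < 0 || off + 9 > TS_PACKET_SIZE)
        return 0;                           /* fixed part already split */
    return off + 9 + pkt[off + 8] <= TS_PACKET_SIZE;
}
```

When this returns 0 for a start packet, the header bytes would have to be accumulated across packets before any offset into the header can be trusted.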

Post by **van** » Wed Nov 12, 2008 8:00 pm

JohnAStebbins wrote: After looking over your patch, I've got a couple comments. When you look up the stream extension id, you're assuming it's at pes[pes[8]+8]. This works only if the pes extension doesn't have any extra bytes. The extension has a length field and could have additional bytes after the stream extension id.

Also, you're assuming that the entire pes header will fit in a single transport packet. The adaptation field could push the header out across the boundary of 2 packets.

Yes, that's just a quick-and-dirty placeholder. To get rid of the memcpy & just do a buffer swap copy I need to strip the PES header off the data when we first see it rather than when we're about to return it. There's going to be a routine to parse and strip the header, moving the pts & dts into the buf. That was going to deal with robustly locating the extension id & the possibility of a split header.

Post by **JohnAStebbins** » Thu Nov 13, 2008 2:07 am

I have a question about discontinuities. On some blu-rays (e.g. Ratatouille), the title is divided into several segments (an m2ts file per segment). Each m2ts has its own time base. If I try to concatenate the segments together, hb gets "video time went backwards" errors at the segment boundaries (and skips very large chunks of the movie). Is there any code that tries to handle this type of discontinuity? I read through the scr_offset stuff, but it doesn't look like it's meant to take care of this kind of thing.

If we don't currently handle this, do you think it would be a difficult thing to do?

I've deciphered enough of the blu-ray playlist format (mpls files) that I can create a proper list of the needed m2ts files and extract other interesting factoids (video and audio info like duration, framerate, language). So I'm trying to figure out a way to feed handbrake the resulting files.

Post by **van** » Thu Nov 13, 2008 8:21 am

JohnAStebbins wrote: I have a question about discontinuities. On some blu-rays (e.g. Ratatouille), the title is divided into several segments (an m2ts file per segment). Each m2ts has its own time base. If I try to concatenate the segments together, hb gets "video time went backwards" errors at the segment boundaries (and skips very large chunks of the movie). Is there any code that tries to handle this type of discontinuity? I read through the scr_offset stuff, but it doesn't look like it's meant to take care of this kind of thing.

This kind of discontinuity is exactly what the scr_offset stuff takes care of. It didn't work in this case because of a slightly subtle problem in the stream code: Since we're collecting all the pieces of each PES before returning it to reader, the packet order that reader sees is not the transport stream order. This is a problem because the PCR is defined relative to the transport stream order. Consider a big Video packet followed by and interleaved with a much smaller audio packet. Say there's a PCR change just before the audio starts. So the video is referenced to the old PCR while the audio is referenced to the new. What reader sees is the audio (since it finishes before the video) and processes the pcr change then sees the video, which has a timestamp from mars since it was referenced to the old pcr, and gets confused. That's more-or-less what was happening (for reasons too complicated to explain it was actually an audio getting put after a video which caused the correction to get screwed up rather than just a packet drop because we defer the correction to the first audio after a discontinuity).

I've redone the way PCRs are handled in the stream code to correct this. A new patch with this fix, a minor bug fix in the continuity error checking and a fix for the problem of dropping the leading PPS & SPS is at http://handbrake.fr/pastebin/pastebin.php?show=234.

JohnAStebbins wrote: I've deciphered enough of the blu-ray playlist format (mpls files) that I can create a proper list of the needed m2ts files and extract other interesting factoids (video and audio info like duration, framerate, language). So I'm trying to figure out a way to feed handbrake the resulting files.

Great! My next project after the transport stream cleanup was to try and redo the 'content reader' in a cleaner way that would allow us to handle sources that are clip lists, arrays of A-to-B edit points, collections of titles from a multi-episode DVD, etc. This might be a framework that makes what you're trying to do a lot easier. I've got the code sketched out but it's very late & I'm too tired to write about it tonight. I'll try to post the ideas in a new development thread tomorrow or friday.

Post by **JohnAStebbins** » Thu Nov 13, 2008 4:27 pm

Gave the new patch a try. It helped, but still has occasional problems. At the beginning of the 5th segment, I got another "video time went backwards" and it dropped about 1200 frames. Upon playback audio continues playing cleanly, and I get a frozen video frame for what appears to be about the duration of the 1200 dropped frames (50 sec).

Code:

[07:44:34] sync: video time didn't advance - dropped 1272 frames (delta 53011 ms, current 52442524, next 52446277, dur 3753)

For anyone that's interested, here's what I've figured out about blu-ray playlists so far.
http://handbrake.fr/pastebin/pastebin.php?show=239

Post by **van** » Thu Nov 13, 2008 5:30 pm

Hmm. I made two changes to help the problem. One was to send up PES packets as soon as we hit the end rather than waiting to the start of the next packet of that media stream. This reduces the window of vulnerability. The other was to explicitly detect the reordering & smash the timestamps so it couldn't mess up the scr logic. Turns out I totally screwed this up. In generate_output_data there's a conditional that looks like:

Code:

    if ( stream->ts_buf[curstream]->cur &&
         stream->ts_buf[curstream]->cur < stream->ts_pcr_out &&
         (uint64_t)(stream->ts_pcr - stream->ts_pcrhist[stream->ts_pcr_out & 3]) > 200*90LL )

It should really be:

Code:

    // check if this packet was referenced to an older pcr and if that
    // pcr was significantly different than the one we're using now.
    // (the reason for the uint cast on the pcr difference is that the
    // difference is significant if it advanced by more than 200ms or if
    // it went backwards by any amount. The negative numbers look like huge
    // unsigned ints so the cast allows both conditions to be checked at once.)
    int bufpcr = stream->ts_buf[curstream]->cur;
    int curpcr = stream->ts_pcr_out;
    if ( bufpcr && bufpcr < curpcr &&
         (uint64_t)(stream->ts_pcrhist[curpcr & 3] - stream->ts_pcrhist[bufpcr & 3]) > 200*90LL )

An updated patch is at http://handbrake.fr/pastebin/pastebin.php?show=240
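The unsigned-cast trick described in the comment can be checked in isolation. A minimal sketch, with a hypothetical function name and with the PCR values assumed to be signed 64-bit counts of 90 kHz ticks:

```c
#include <stdint.h>

#define PCR_JUMP_TICKS (200 * 90LL)     /* 200 ms at 90 kHz */

/*
 * 1 if the pcr advanced by more than 200 ms or went backwards by any
 * amount. A negative difference becomes a huge unsigned value after the
 * cast, so a single comparison covers both cases.
 */
static int pcr_changed_significantly(int64_t old_pcr, int64_t new_pcr)
{
    return (uint64_t)(new_pcr - old_pcr) > (uint64_t)PCR_JUMP_TICKS;
}
```

For example, a step of exactly 18000 ticks (200 ms) is not flagged, 18001 ticks is, and a step back of a single tick is.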

Post by **JohnAStebbins** » Thu Nov 13, 2008 7:06 pm

Oooo, so close. But no cigar. Dropping 1 frame at several of the boundaries. Causes a video glitch in playback. Missing IDR frame kind of thing.

Code:

[10:45:53] sync: video time didn't advance - dropped 1 frames (delta 740 ms, current 61629832, next 61633586, dur 3754)
[10:45:53] sync: video time didn't advance - dropped 1 frames (delta 740 ms, current 61633586, next 61637340, dur 3754)
[10:45:54] sync: video time didn't advance - dropped 1 frames (delta 740 ms, current 61637340, next 61641094, dur 3754)
[10:45:54] sync: video time didn't advance - dropped 1 frames (delta 740 ms, current 61641094, next 61644847, dur 3753)

Note that I also have to comment out the code that stops after the expected frame count is reached since the title duration is based on the last timestamp in the source. This is something that your clip list plans will solve, so I'm pleased to hear about that.

Post by **van** » Thu Nov 13, 2008 9:16 pm

750ms is about how much audio lags video in typical HD content. What's probably happening is that since we zap the first video timestamp to re-sync on the audio, we lose information on the relative offset between audio & video at the splice then end up correcting the next time we see a video timestamp. There may be a way to get this down from 750ms to something closer to a frame time but I was just concerned that HB survive the transition without dropping lots of content (since this problem can appear in OTA TS streams due to loss) - when we move to a clip-based model this is a non-issue since each clip is read independently.
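
For a sense of scale, the figures in the log can be converted between milliseconds and ticks. This sketch assumes the log's `dur` values are in 90 kHz ticks, which is consistent with the `200*90LL` threshold in the patch; the function names are made up for illustration.

```c
#include <stdint.h>

#define TICKS_PER_MS 90 /* 90 kHz clock: 90 ticks per millisecond */

/* Convert a duration in 90 kHz ticks to milliseconds. */
static double ticks_to_ms(int64_t ticks)
{
    return (double)ticks / TICKS_PER_MS;
}

/* Convert a duration in milliseconds to 90 kHz ticks. */
static int64_t ms_to_ticks(int64_t ms)
{
    return ms * TICKS_PER_MS;
}

/* How many frames of the given duration (in ticks) fit in delta_ms. */
static double frames_in_delta(int64_t delta_ms, int64_t frame_dur_ticks)
{
    return (double)ms_to_ticks(delta_ms) / (double)frame_dur_ticks;
}
```

With the logged values, a frame of 3754 ticks lasts about 41.7 ms, so the 740 ms delta is roughly 17.7 frame times.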

JohnAStebbins wrote:Note that I also have to comment out the code that stops after the expected frame count is reached since the title duration is based on the last timestamp in the source. This is something that your clip list plans will solve, so I'm pleased to hear about that.

Um, no, duration isn't based on the last timestamp - it's based on a moderately sophisticated non-linear robust regression on rate samples taken throughout the file. Take a look at the block of comments titled "hb_stream_duration" around line 867 in libhb/stream.c. This code was tuned for mpeg2 content which has a much smaller coefficient of variation than mpeg4 and also will have a problem if NDURSAMPLES is smaller than the number of clips you're concatenating. You could try upping it from 16 to 32 (or some number greater than the number of clips) and see if it computes the correct duration.
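
To illustrate the general idea of estimating duration from rate samples rather than from the last timestamp, here is a deliberately simplified stand-in. It is not the regression that hb_stream_duration performs; the struct, the function names, and the choice of a plain median are all assumptions for this sketch. It takes (byte offset, timestamp) samples, computes a bytes-per-tick rate between adjacent samples, skips pairs where the timestamp went backwards, and divides the file size by the median rate.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sample: a byte offset and the timestamp (90 kHz ticks) found there. */
typedef struct {
    int64_t pos;
    int64_t pts;
} dur_sample_t;

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Estimate total duration in ticks as file_size / median(bytes per tick).
 * Returns -1 if no adjacent pair of samples is usable. */
static int64_t estimate_duration(const dur_sample_t *s, int n, int64_t file_size)
{
    double rates[64];
    int nrates = 0;

    for (int i = 1; i < n && nrates < 64; i++) {
        int64_t dpos = s[i].pos - s[i - 1].pos;
        int64_t dpts = s[i].pts - s[i - 1].pts;
        if (dpos <= 0 || dpts <= 0)
            continue; /* timestamps reset or went backwards: skip this pair */
        rates[nrates++] = (double)dpos / (double)dpts;
    }
    if (nrates == 0)
        return -1;

    qsort(rates, (size_t)nrates, sizeof(rates[0]), cmp_double);
    return (int64_t)((double)file_size / rates[nrates / 2]);
}
```

Because a pair that straddles a timestamp reset is simply dropped, one discontinuity does not skew the estimate in this toy version.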

Post by **JohnAStebbins** » Thu Nov 13, 2008 10:24 pm

Oops. Now that I think about it, it was the ffmpeg code that calculates the duration that way. Bumping NDURSAMPLES to 32 did the trick as far as the duration problem goes.

For now, I guess I'll have to write up a script that encodes the segments independently and splices them together with mkvmerge.

I found some more information on avchd (blu-ray) specs. European patent application number 07011150.5 has a substantial amount of the syntax for all the files in its list of figures.
https://publications.european-patent-of ... 0895&ki=A1

Post by **bse** » Sun Nov 16, 2008 5:20 am

I hope you don't mind me posting here. I'm not a handbrake developer, but I've been looking at the m2ts format and found this thread through a google search. I noticed the comments regarding blu-rays with multiple .m2ts files (such as Ratatouille), and from my understanding you may not be looking at the problem correctly.

My understanding (and please correct me if I'm wrong), is that you are dealing with "seamless branching", where a movie has alternate scenes. I Am Legend handles this by having two complete copies of the movie on the disc, even though only the last few minutes are different. Ratatouille, on the other hand, has multiple .mpls files where different .m2ts files are run depending on which version you choose. So two versions may point to the same movie start, then cut to one of two files for the next scene, then jump back to the same file for the following scene.

While there may be problems with the PCR (which I think resets for each file and has no relation to PCRs in other files), I think the real problem is syncing the audio with the video. The video will be at a different rate than the audio, so at the end of a file, the audio will last longer (or shorter - but most likely not the same duration) than the video. The only program I know of that handles this is eac3to, and I see that the developer (madshi) has already provided some insight on this thread. My guess is that the programs like PDVD and TMT that play these movies sync to the video, and when a new file starts, they stop the "old" movie and start the "new" one, where the stop occurs at the end of the last video frame. This will, in effect, throw away some audio, but that's fine since the "new" movie has its own audio that starts with the new video.

In converting to a new (combined) stream, this isn't an option (right?). eac3to (I guess, from looking at the results) will drop some of the audio data to try to get it to line up more closely (within a millisecond or two, vs. 5-30 milliseconds otherwise) with the video. With several files, the errors will grow until the audio is noticeably out-of-sync if uncorrected, and probably isn't noticeable if corrected.
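
A rough sketch of how the mismatch arises and adds up, under stated assumptions: the audio in each clip is taken to be whole frames that just cover the video (so it only overhangs, never falls short), and the struct and function names are hypothetical. The numbers in the usage note use an AC3 frame at 48 kHz (1536 samples, 32 ms, 2880 ticks of a 90 kHz clock) and the 3754-tick video frame from the log earlier in the thread.

```c
#include <stdint.h>

/* Hypothetical per-clip durations, in 90 kHz ticks. */
typedef struct {
    int64_t video_dur;
    int64_t audio_dur;
} clip_t;

/* Audio overhang at the end of one clip, assuming the audio consists of the
 * smallest whole number of frames that covers the video. */
static int64_t audio_overhang(int64_t video_dur, int64_t audio_frame_dur)
{
    int64_t frames = (video_dur + audio_frame_dur - 1) / audio_frame_dur;
    return frames * audio_frame_dur - video_dur;
}

/* Audio-minus-video drift after concatenating n clips with no correction. */
static int64_t cumulative_drift(const clip_t *clips, int n)
{
    int64_t drift = 0;
    for (int i = 0; i < n; i++)
        drift += clips[i].audio_dur - clips[i].video_dur;
    return drift;
}
```

For example, 100 video frames of 3754 ticks need 131 audio frames of 2880 ticks, leaving an overhang of 1880 ticks (about 21 ms). Three clips with overhangs of 1800, 900 and 2700 ticks drift by 5400 ticks (60 ms) if joined without dropping any audio.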

I'm not sure what the best solution is, but I hope the above info is useful.

BSE
