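ftyp: | File type box. It is a leaf box at the start of the file, not a container; mdat and the other boxes follow it at the top level. Example: 00 00 00 14 66 74 79 70 69 73 6F 6D 00 00 02 00 6D 70 34 31 Decoded: ftyp: Major brand: isom Minor version: 512 Compatible brand: mp41 |
free/skip: | Free-space box whose content is ignored. It can sit at the top level or inside a container box. Example: 00 00 00 08 66 72 65 (the fourth type byte, 65, is cut off in this dump) Decoded: free: (null) |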
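As a minimal sketch of how the ftyp example bytes map to the decoded fields, the following reads the 8-byte box header (32-bit big-endian size plus 4-character type) and then the ftyp payload. The function names are illustrative, and the sketch assumes the common 32-bit size form only (no 64-bit `largesize`, no size 0).

```python
import struct

def parse_box_header(buf, offset=0):
    """Return (size, type) of the box that starts at offset.

    Assumes the plain 32-bit size form; size values 0 and 1 are not handled.
    """
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    return size, box_type.decode("ascii")

def parse_ftyp(buf):
    """Decode major brand, minor version and compatible brands of an ftyp box."""
    size, box_type = parse_box_header(buf)
    if box_type != "ftyp":
        raise ValueError("not an ftyp box")
    major, minor = struct.unpack_from(">4sI", buf, 8)
    # Everything after the minor version is a list of 4-byte compatible brands.
    brands = [buf[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major.decode("ascii"), minor, brands

# The ftyp example bytes from the table.
FTYP_BYTES = bytes.fromhex("00 00 00 14 66 74 79 70 69 73 6F 6D 00 00 02 00 6D 70 34 31")
print(parse_ftyp(FTYP_BYTES))  # ('isom', 512, ['mp41'])
```

The same header reader applies to every other box in the table, since each one starts with the same size and type pair.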
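moov: | Container box with rich contents. Example: 00 00 07 63 6D 6F 6F 76 (size 0x763 = 1891 bytes, type moov). It has no fields of its own; everything after the header is its child boxes. |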
moov:mvhd: | Movie header, the header of moov. Example: 00 00 00 6C 6D 76 68 64 00 00 00 00 7C 25 B0 80 7C 25 B0 80 00 00 03 E8 00 00 06 14 00 01 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 Decoded: creation_time:2082844800 modification_time:2082844800 //seconds since 1904-01-01 timescale:1000 //number of time units per second; 1000 means one unit is 1 ms, and this setting matters a great deal duration:1556 //in timescale units, so 1.556 s rate:10000 //hex, 16.16 fixed point, meaning 1.0 volume:100 //hex, 8.8 fixed point, meaning 1.0 (full volume) reserved:0 reserved[0]:0 reserved[1]:0 Matrix[0]:10000 Matrix[1]:0 Matrix[2]:0 Matrix[3]:0 Matrix[4]:10000 Matrix[5]:0 Matrix[6]:0 Matrix[7]:0 Matrix[8]:40000000 //matrix values are hex Predefined[0]:0 Predefined[1]:0 Predefined[2]:0 Predefined[3]:0 Predefined[4]:0 Predefined[5]:0 next_track_ID:3 |
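The timing fields of mvhd can be turned into wall-clock values with a little arithmetic. The sketch below assumes the ISO base media convention that times count seconds from 1904-01-01 UTC; the helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# MP4 timestamps count seconds from this epoch (UTC).
MP4_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)

def mp4_time(seconds):
    """Convert an MP4 creation/modification time to a datetime."""
    return MP4_EPOCH + timedelta(seconds=seconds)

def duration_seconds(duration, timescale):
    """Convert a duration in timescale units to seconds."""
    return duration / timescale

print(mp4_time(2082844800))          # 1970-01-01 00:00:00+00:00
print(duration_seconds(1556, 1000))  # 1.556
```

The value 2082844800 is exactly the gap between the 1904 and 1970 epochs, which suggests the muxer wrote a Unix time of 0.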
moov:trak: tkhd: | Track header. Example 1: 00 00 00 5C 74 6B 68 64 00 00 00 0F 7C 25 B0 80 7C 25 B0 80 00 00 00 01 00 00 00 00 00 00 06 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 02 40 00 00 01 E0 00 00 Decoded: creation_time:2082844800 modification_time:2082844800 track_ID:1 //ID of the first track. This value is critical: track IDs must be unique, and if the audio and video tracks both use track_ID 1 the player cannot play the file. reserved_1:0 duration:1544 //important to players; expressed in the mvhd timescale, so 1544 units of 1 ms each is 1544 ms volume:0 //video track, so no sound reserved_2[0]:0 reserved_2[1]:0 layer:0 //our video has a single layer, so this is always 0 alternate_group:0 reserved_3:0 Matrix[0]:10000 Matrix[1]:0 Matrix[2]:0 Matrix[3]:0 Matrix[4]:10000 Matrix[5]:0 Matrix[6]:0 Matrix[7]:0 Matrix[8]:40000000 width:2400000 height:1e00000 //hex, 16.16 fixed point; shift right by 16 bits to get 576 and 480 Example 2, decoded (audio track): creation_time:2082844800 modification_time:2082844800 track_ID:2 reserved_1:0 duration:1556 //important to players; gives the track length in the mvhd timescale volume:100 //audio track, full volume (hex 0x0100 = 1.0) reserved_2[0]:0 reserved_2[1]:0 layer:0 //only meaningful for video alternate_group:0 //always 0 reserved_3:0 Matrix[0]:10000 Matrix[1]:0 Matrix[2]:0 Matrix[3]:0 Matrix[4]:10000 Matrix[5]:0 Matrix[6]:0 Matrix[7]:0 Matrix[8]:40000000 width:0 height:0 |
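Several header fields are fixed-point numbers that the dump prints in hex, which makes them easy to misread. A small sketch of the conversions, using the last eight bytes of the tkhd example (width and height) and the rate and volume values seen in the headers; the helper names are illustrative.

```python
import struct

def fixed_16_16(raw):
    """Interpret a 32-bit value as 16.16 fixed point."""
    return raw / 65536

def fixed_8_8(raw):
    """Interpret a 16-bit value as 8.8 fixed point."""
    return raw / 256

# Last eight bytes of the tkhd example: width then height.
TKHD_TAIL = bytes.fromhex("02 40 00 00 01 E0 00 00")
raw_width, raw_height = struct.unpack(">II", TKHD_TAIL)

# Shifting right by 16 keeps the integer part of the 16.16 value.
width = raw_width >> 16
height = raw_height >> 16
print(width, height)            # 576 480
print(fixed_16_16(0x00010000))  # 1.0  (rate)
print(fixed_8_8(0x0100))        # 1.0  (volume)
```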
moov:trak: mdia: | Container box with no fields of its own; it lives inside trak. |
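moov:trak: mdia: mdhd | Media header. Example 1 (video): creation_time:2082844800 modification_time:2082844800 timescale:24000 //sets the time unit for this track's media; duration divided by timescale (37037 / 24000 ≈ 1.543 s) agrees with the 1544 ms duration in tkhd duration:37037 pad:0 Language[0]:21 Language[1]:14 Language[2]:4 //each code plus 0x60 gives an ASCII letter, here "und" pre_defined:0 Example 2 (audio): creation_time:2082844800 modification_time:2082844800 timescale:44100 duration:68608 pad:0 //padding bit with no meaning; it rounds the three 5-bit language codes up to 16 bits Language[0]:21 Language[1]:14 Language[2]:4 pre_defined:0 |
moov:trak: mdia: hdlr | Handler reference box. Example 1 (video): pre_defined:0 handler_type:vide //apart from handler_type, the other fields appear to carry no meaning reserved:0 reserved:0 reserved:0 Example 2 (audio): pre_defined:0 handler_type:soun reserved:0 reserved:0 reserved:0 |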
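The mdhd language field packs three 5-bit codes behind a single pad bit, and each code is an ASCII letter minus 0x60. A short sketch of the decoding, plus the duration check for both tracks; the function names are illustrative.

```python
def unpack_language(packed):
    """Split the 16-bit pad+language field into three 5-bit codes."""
    return [(packed >> shift) & 0x1F for shift in (10, 5, 0)]

def decode_language(codes):
    """Turn three 5-bit codes into an ISO 639-2 language string."""
    return "".join(chr(code + 0x60) for code in codes)

print(decode_language([21, 14, 4]))  # und
print(unpack_language(0x55C4))       # [21, 14, 4]

# Media durations in seconds: duration / timescale.
print(round(37037 / 24000, 3))  # 1.543  (video)
print(round(68608 / 44100, 3))  # 1.556  (audio)
```

The code "und" stands for an undetermined language.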
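moov:trak: mdia: minf: vmhd | Video media header. Example: graphicsmode:0 //composition mode of the video track; 0 means copy over the existing image opcolor:0 opcolor:0 opcolor:0 //red, green and blue values available to graphics modes; unused when graphicsmode is 0 |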
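moov:trak: mdia: minf: dinf: dref | Data reference box. Example: entry_count:1 //a single entry url: //present but empty; in the test file neither the audio nor the video track carries a location string, which normally means the media data is in the same file |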
moov:trak: mdia: minf: stbl: stsd | stsd: Sample Description Box. It is a table that holds one or more entries. Example: entry_count:1 //video, one entry VisualSampleEntry: the video entries stored in stsd. Example: data_reference_index:1 pre_defined[0]:0 pre_defined[1]:0 pre_defined[2]:0 width:576 height:480 horizresolution:480000 vertresolution:480000 //hex constant 0x00480000, i.e. 72 dpi in 16.16 fixed point reserved:0 frame_count:1 compressorname: //empty depth:24 //color depth pre_defined:-1 AudioSampleEntry: the audio entries stored in stsd. Example: reserved[0]:0 reserved[1]:0 channelcount:2 samplesize:16 pre_defined:0 reserved_2:0 samplerate:ac440000 //hex; shift right by 16 bits to get 0xAC44 = 44100 Hz |
moov:trak: mdia: minf: stbl: stsd :mp4a | mp4a: the AAC sample entry. This box simply extends the AudioSampleEntry box. reserved[0]:0 reserved[1]:0 channelcount:2 samplesize:16 pre_defined:0 reserved_2:0 samplerate:56220000 //hex; shift right by 16 bits to get 0x5622 = 22050 Hz |
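moov:trak: mdia: minf: stbl: stsd : esds | esds: elementary stream descriptor box, contained in mp4a. length:3 ES_ID:6400 streamDependenceFlag:0 URL_Flag=0 reserved=0 streamPriority:1 streamDependenceFlag:0 dependsOn_ES_ID:52685 m_iData_Size:23 //length of the data, computed Data[23] //an opaque block of bytes that runs up to stts. It is critical because it determines whether the decoder can decode the stream; the descriptors are defined in 14496-1. //In practice this data depends on the sample rate: 44100 needs one set of bytes, 22050 another, and 48000 yet another, although the 44100 set also worked for 48000. |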
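One plausible reason the esds data varies with the sample rate is that, for AAC, the descriptor chain usually ends in an AudioSpecificConfig (defined in 14496-3) whose first two bytes encode the object type, a sampling-frequency index and the channel count. The table does not show the 23 data bytes, so the input bytes below are hypothetical examples, not values from the test file, and the sketch handles only the basic two-byte form.

```python
# Sampling-frequency index table for AudioSpecificConfig.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]

def parse_audio_specific_config(data):
    """Decode the first two bytes of an AudioSpecificConfig.

    Layout: 5 bits object type, 4 bits frequency index, 4 bits channels.
    Extended forms (object type 31, frequency index 15) are not handled.
    """
    bits = int.from_bytes(data[:2], "big")
    object_type = bits >> 11
    freq_index = (bits >> 7) & 0x0F
    channels = (bits >> 3) & 0x0F
    if object_type == 31 or freq_index >= len(SAMPLE_RATES):
        raise ValueError("extended or reserved form not handled in this sketch")
    return object_type, SAMPLE_RATES[freq_index], channels

# Hypothetical inputs: AAC LC stereo at 44100 Hz and at 22050 Hz.
print(parse_audio_specific_config(bytes([0x12, 0x10])))  # (2, 44100, 2)
print(parse_audio_specific_config(bytes([0x13, 0x90])))  # (2, 22050, 2)
```

If the data does follow this layout, a config written for one sample rate would describe the stream incorrectly at another, which would fit the observation that each rate needs its own byte set.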
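moov:trak: mdia: minf: stbl: stts | stts: Time to Sample Box. Example 1 (video): entry_count:1 sample_count:37 //the duration above already gives the total length of the video in mdat; 37 is the number of samples, which equals the number of chunks here because each chunk holds one sample sample_delta:1001 //sample_delta * sample_count = duration: 1001 * 37 = 37037 Example 2 (audio): entry_count:1 sample_count:67 //the audio is split into 67 samples sample_delta:1024 //same rule: 1024 * 67 = 68608 |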
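The stts entries are run-length pairs of (sample_count, sample_delta), so the track duration and each sample's start time follow from a running sum. A minimal sketch using the values from the table; the function names are illustrative.

```python
def total_duration(stts_entries):
    """Sum of sample_count * sample_delta over all entries."""
    return sum(count * delta for count, delta in stts_entries)

def sample_start_times(stts_entries):
    """Start time of every sample, in media timescale units."""
    times = []
    now = 0
    for count, delta in stts_entries:
        for _ in range(count):
            times.append(now)
            now += delta
    return times

video_stts = [(37, 1001)]
audio_stts = [(67, 1024)]

print(total_duration(video_stts))          # 37037
print(total_duration(audio_stts))          # 68608
print(sample_start_times(video_stts)[:3])  # [0, 1001, 2002]
print(round(24000 / 1001, 3))              # 23.976 frames per second
```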
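moov:trak: mdia: minf: stbl: stss | stss: Sync Sample Box. It lists the samples that are sync samples (key frames). Example 1 (video): entry_count:1 sample_number:1 The audio track has no such box. This box is very important because it determines whether the MP4 file can be seeked. In testing, when the box had only one entry, seeking drove the CPU to 100%; when the box was absent, seeking worked without hitting 100% but with a short wait. A common approach is to write around 100 entries. |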
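To illustrate why the number of stss entries affects seeking, here is a sketch of how a player might pick the sync sample to start decoding from. This is an assumed strategy, not taken from any particular player, and the four-entry list is hypothetical.

```python
from bisect import bisect_right

def seek_sync_sample(sync_samples, target):
    """Return the sync sample at or before target (1-based sample numbers).

    sync_samples is the sorted stss list, or None when the box is absent,
    in which case every sample counts as a sync sample.
    """
    if sync_samples is None:
        return target
    index = bisect_right(sync_samples, target)
    if index == 0:
        raise ValueError("no sync sample at or before target")
    return sync_samples[index - 1]

print(seek_sync_sample([1], 30))              # 1
print(seek_sync_sample([1, 13, 25, 37], 30))  # 25
print(seek_sync_sample(None, 30))             # 30
```

With a single entry, every seek falls back to sample 1, so a decoder following this strategy has to decode everything up to the target, which would fit the CPU load described above.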
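moov:trak: mdia: minf: stbl: stsc | stsc: Sample To Chunk Box. This box is very important: each entry gives, starting from a given chunk, how many samples every following chunk contains, until the next entry takes over. One sample is one frame. Example 1 (video): entry_count:1 first chunk: 1, samples per chunk: 1, sample description index: 1 Example 2 (audio): entry_count:1 first chunk: 1, samples per chunk: 1, sample description index: 1 |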
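Because stsc is run-length encoded, a reader usually expands it into a per-chunk sample count before locating samples. A minimal sketch; the function name is illustrative and the second input is a hypothetical table with two entries.

```python
def samples_per_chunk(stsc_entries, chunk_count):
    """Expand stsc entries into a list with one sample count per chunk.

    Each entry is (first_chunk, samples_per_chunk, sample_description_index),
    with 1-based chunk numbers. An entry applies until the next entry's
    first_chunk, or to the last chunk for the final entry.
    """
    counts = []
    for i, (first_chunk, per_chunk, _desc_index) in enumerate(stsc_entries):
        if i + 1 < len(stsc_entries):
            next_first = stsc_entries[i + 1][0]
        else:
            next_first = chunk_count + 1
        counts.extend([per_chunk] * (next_first - first_chunk))
    return counts

# The table's video track: one entry, 37 chunks, one sample each.
print(samples_per_chunk([(1, 1, 1)], 37) == [1] * 37)  # True

# Hypothetical table: chunks 1-3 hold 3 samples, chunks 4-5 hold 1.
print(samples_per_chunk([(1, 3, 1), (4, 1, 1)], 5))    # [3, 3, 3, 1, 1]
```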
moov:trak: mdia: minf: stbl: stsz | stsz: Sample Size Box. This is one of the most important boxes: it gives the size of every sample. Example 1 (video): sample size: 0 //0 means the sizes differ and are listed one by one sample count: 37 5127 855 830 2327 2742 2373 2716 2365 3061 2170 1888 2427 2578 2218 2084 2138 2319 2586 2728 2322 3505 2624 1551 2725 2502 2072 1720 1382 2653 2177 1323 1492 1801 1765 1985 5028 3467 Example 2 (audio): sample size: 0 sample count: 65 219 205 207 182 213 194 195 194 212 188 159 179 186 189 184 184 190 188 190 186 195 196 182 197 182 186 182 182 185 182 193 186 184 187 175 173 170 185 171 181 178 178 185 192 188 187 175 167 177 182 167 173 177 175 176 174 170 168 169 180 164 167 176 170 The mdat box is divided into many chunks; because each chunk here holds exactly one sample, these values are also the chunk sizes. |
moov:trak: mdia: minf: stbl: stco | stco: Chunk Offset Box. Equally important: it gives the file offset at which each chunk starts. Example 1 (video): entry_count:37 0x24 0x15d3 0x1aaf 0x1f84 0x2a20 0x35aa 0x404a 0x4c53 0x5705 0x6470 0x6da6 0x767c 0x8174 0x8d00 0x9725 0xa003 0xa9c9 0xb447 0xbfdc 0xcbf7 0xd5b8 0xe4c0 0xf064 0xf7da 0x103ea 0x10e70 0x117ff 0x1200d 0x126da 0x1328b 0x13bbd 0x14247 0x14973 0x151cd 0x15a0a 0x16272 0x17770 Example 2 (audio): entry_count:65 0x142b 0x1506 0x192a 0x19f9 0x1ded 0x1ec2 0x289b 0x295e 0x34d6 0x3eef 0x3fab 0x4ae6 0x4b99 0x5590 0x564d 0x62fa 0x63b2 0x6cea 0x7506 0x75be 0x7ff7 0x80b1 0x8b86 0x8c4a 0x95aa 0x966f 0x9f49 0xa85d 0xa913 0xb2d8 0xb391 0xbe61 0xbf22 0xca84 0xcb3c 0xd509 0xe369 0xe416 0xef00 0xefb9 0xf673 0xf728 0x1027f 0x10331 0x10db0 0x11688 0x11744 0x11eb7 0x11f66 0x12573 0x12624 0x13137 0x131de 0x13b0c 0x140e8 0x14197 0x1481b 0x148c9 0x1507c 0x15124 0x158b2 0x15966 0x161cb 0x17616 0x176c6 |
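With one sample per chunk, a sample's bytes are simply the range from its stco offset to that offset plus its stsz size. The sketch below takes the first few video and audio entries from the two tables and checks that the ranges tile the file without gaps, which shows how the two tracks are interleaved in mdat. The function names are illustrative.

```python
def sample_ranges(offsets, sizes, label):
    """Pair chunk offsets with sample sizes as (start, end, label) tuples."""
    return [(off, off + size, label) for off, size in zip(offsets, sizes)]

def is_contiguous(ranges):
    """True when each range ends exactly where the next one starts."""
    ordered = sorted(ranges)
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))

# First entries of the stco and stsz tables.
video = sample_ranges([0x24, 0x15d3, 0x1aaf], [5127, 855, 830], "video")
audio = sample_ranges([0x142b, 0x1506, 0x192a, 0x19f9],
                      [219, 205, 207, 182], "audio")

layout = sorted(video + audio)
print([label for _start, _end, label in layout])
# ['video', 'audio', 'audio', 'video', 'audio', 'audio', 'video']
print(is_contiguous(layout))  # True
```

The first offset, 0x24 = 36, is consistent with a 20-byte ftyp, an 8-byte free box and an 8-byte mdat header preceding the media data, although the table does not state the file layout explicitly.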
moov:trak: mdia: minf: stbl: smhd | smhd: Sound media header. Example: balance:0 //stereo balance; 0 is center reserved:0 |
avcC: AVC descriptor box | avcC: the AVC decoder configuration box is very important because it stores the SPS and PPS. It is defined in 14496-15. Example: configurationVersion:1 AVCProfileIndication:66 profile_compatibility:192 AVCLevelIndication:31 reserved_1:63 lengthSizeMinusOne:3 reserved_2:7 numOfSequenceParameterSets:1 numOfPictureParameterSets:1 SPS length: 24 //length of the first SPS; further SPS entries follow in the same way PPS length: 4 //length of the first PPS; further PPS entries follow in the same way aligned(8) class AVCDecoderConfigurationRecord { unsigned int(8) configurationVersion = 1; unsigned int(8) AVCProfileIndication; unsigned int(8) profile_compatibility; unsigned int(8) AVCLevelIndication; bit(6) reserved = '111111'b; unsigned int(2) lengthSizeMinusOne; bit(3) reserved = '111'b; unsigned int(5) numOfSequenceParameterSets; for (i=0; i< numOfSequenceParameterSets; i++) { unsigned int(16) sequenceParameterSetLength ; bit(8*sequenceParameterSetLength) sequenceParameterSetNALUnit; } unsigned int(8) numOfPictureParameterSets; for (i=0; i< numOfPictureParameterSets; i++) { unsigned int(16) pictureParameterSetLength; bit(8*pictureParameterSetLength) pictureParameterSetNALUnit; } } See http://www.nhzjj.com/asp/admin/editor/newsfile/201011314552121.pdf |
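The AVCDecoderConfigurationRecord syntax above translates fairly directly into a byte parser. The following is a sketch under the assumption that the input is the box payload (after the 8-byte box header); the SPS and PPS contents in the sample record are placeholder bytes, not real parameter sets, and only their lengths match the example.

```python
import struct

def parse_avcc(buf):
    """Parse an AVCDecoderConfigurationRecord (payload of the avcC box)."""
    version, profile, compat, level, b4, b5 = struct.unpack_from(">6B", buf, 0)
    record = {
        "configurationVersion": version,
        "AVCProfileIndication": profile,
        "profile_compatibility": compat,
        "AVCLevelIndication": level,
        # Low 2 bits hold lengthSizeMinusOne; the upper 6 bits are reserved.
        "nal_length_size": (b4 & 0x03) + 1,
        "sps": [],
        "pps": [],
    }
    pos = 6
    # Low 5 bits hold the SPS count; the upper 3 bits are reserved.
    for _ in range(b5 & 0x1F):
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        record["sps"].append(buf[pos:pos + length])
        pos += length
    pps_count = buf[pos]
    pos += 1
    for _ in range(pps_count):
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        record["pps"].append(buf[pos:pos + length])
        pos += length
    return record

# Record mirroring the example fields; SPS/PPS bodies are placeholders.
SAMPLE_AVCC = (bytes([1, 66, 192, 31, 0xFF, 0xE1])
               + struct.pack(">H", 24) + bytes(24)
               + bytes([1])
               + struct.pack(">H", 4) + bytes(4))

info = parse_avcc(SAMPLE_AVCC)
print(info["AVCProfileIndication"], info["AVCLevelIndication"])  # 66 31
print(info["nal_length_size"], len(info["sps"][0]), len(info["pps"][0]))  # 4 24 4
```

A lengthSizeMinusOne of 3 means every NAL unit in the samples is prefixed with a 4-byte length rather than a start code.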
btrt: bit rate box | btrt: bit rate box. bufferSizeDB:7858 //presumably tells the decoder how large a buffer to allocate maxBitrate:413432 //maximum bit rate avgBitrate:371960 //average bit rate |
avc1 | class AVCSampleEntry() extends VisualSampleEntry ('avc1'){ AVCConfigurationBox config; MPEG4BitRateBox (); // optional MPEG4ExtensionDescriptorsBox (); // optional } class AVCConfigurationBox extends Box('avcC') { AVCDecoderConfigurationRecord() AVCConfig; } class MPEG4BitRateBox extends Box('btrt'){ unsigned int(32) bufferSizeDB; unsigned int(32) maxBitrate; unsigned int(32) avgBitrate; } class MPEG4ExtensionDescriptorsBox extends Box('m4ds') { Descriptor Descr[0 .. 255]; } |