H.264 muxed to MP4 using libavformat not playing back


Problem description

I am trying to mux H.264 data into an MP4 file. There appear to be no errors in saving this H.264 Annex B data out to an MP4 file, but the file fails to play back.
I've done a binary comparison of the files, and the issue seems to be somewhere in what is being written to the footer (trailer) of the MP4 file.
I suspect it has something to do with the way the stream is being created.
Init:
    AVOutputFormat* fmt = av_guess_format(0, "out.mp4", 0);
    oc = avformat_alloc_context();
    oc->oformat = fmt;
    strcpy(oc->filename, filename);
Part of this prototype app creates a PNG file for each I-frame. So when the first I-frame is encountered, I create the video stream and write the AV header etc.:
    void addVideoStream(AVCodecContext* decoder)
    {
        videoStream = av_new_stream(oc, 0);
        if (!videoStream)
        {
            cout << "ERROR creating video stream" << endl;
            return;
        }
        vi = videoStream->index;
        videoContext = videoStream->codec;
        videoContext->codec_type = AVMEDIA_TYPE_VIDEO;
        videoContext->codec_id = decoder->codec_id;
        videoContext->bit_rate = 512000;
        videoContext->width = decoder->width;
        videoContext->height = decoder->height;
        videoContext->time_base.den = 25;
        videoContext->time_base.num = 1;
        videoContext->gop_size = decoder->gop_size;
        videoContext->pix_fmt = decoder->pix_fmt;

        if (oc->oformat->flags & AVFMT_GLOBALHEADER)
            videoContext->flags |= CODEC_FLAG_GLOBAL_HEADER;

        av_dump_format(oc, 0, filename, 1);

        if (!(oc->oformat->flags & AVFMT_NOFILE))
        {
            if (avio_open(&oc->pb, filename, AVIO_FLAG_WRITE) < 0) {
                cout << "Error opening file" << endl;
            }
            avformat_write_header(oc, NULL);
        }
    }
I write packets out:
    unsigned char* data = block->getData();
    unsigned char videoFrameType = data[4];
    int dataLen = block->getDataLen();

    // store pps
    if (videoFrameType == 0x68)
    {
        if (ppsFrame != NULL)
        {
            delete[] ppsFrame; ppsFrameLength = 0; ppsFrame = NULL;
        }
        ppsFrameLength = block->getDataLen();
        ppsFrame = new unsigned char[ppsFrameLength];
        memcpy(ppsFrame, block->getData(), ppsFrameLength);
    }
    else if (videoFrameType == 0x67)
    {
        // sps
        if (spsFrame != NULL)
        {
            delete[] spsFrame; spsFrameLength = 0; spsFrame = NULL;
        }
        spsFrameLength = block->getDataLen();
        spsFrame = new unsigned char[spsFrameLength];
        memcpy(spsFrame, block->getData(), spsFrameLength);
    }

    if (videoFrameType == 0x65 || videoFrameType == 0x41)
    {
        videoFrameNumber++;
    }
    if (videoFrameType == 0x65)
    {
        decodeIFrame(videoFrameNumber, spsFrame, spsFrameLength, ppsFrame, ppsFrameLength, data, dataLen);
    }

    if (videoStream != NULL)
    {
        AVPacket pkt = { 0 };
        av_init_packet(&pkt);
        pkt.stream_index = vi;
        pkt.flags = 0;
        pkt.pts = pkt.dts = 0;

        if (videoFrameType == 0x65)
        {
            // combine the SPS PPS & I frames together
            pkt.flags |= AV_PKT_FLAG_KEY;
            unsigned char* videoFrame = new unsigned char[spsFrameLength+ppsFrameLength+dataLen];
            memcpy(videoFrame, spsFrame, spsFrameLength);
            memcpy(&videoFrame[spsFrameLength], ppsFrame, ppsFrameLength);
            memcpy(&videoFrame[spsFrameLength+ppsFrameLength], data, dataLen);

            // overwrite the start code (00 00 00 01 with a 32-bit length)
            setLength(videoFrame, spsFrameLength-4);
            setLength(&videoFrame[spsFrameLength], ppsFrameLength-4);
            setLength(&videoFrame[spsFrameLength+ppsFrameLength], dataLen-4);
            pkt.size = dataLen + spsFrameLength + ppsFrameLength;
            pkt.data = videoFrame;
            av_interleaved_write_frame(oc, &pkt);
            delete[] videoFrame; videoFrame = NULL;
        }
        else if (videoFrameType != 0x67 && videoFrameType != 0x68)
        {
            // Send other frames except pps & sps which are caught and stored
            pkt.size = dataLen;
            pkt.data = data;
            setLength(data, dataLen-4);
            av_interleaved_write_frame(oc, &pkt);
        }
    }
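The constants 0x65, 0x41, 0x67 and 0x68 tested above are whole NAL header bytes. The NAL unit type is only the low 5 bits of that byte, so a stream whose nal_ref_idc bits differ (for example 0x21 instead of 0x41 for a non-IDR slice) would not match those comparisons. A minimal sketch of a more tolerant check, assuming each buffer starts with a 4-byte Annex B start code as in the listing above; the helper name nalTypeAfterStartCode is hypothetical and not part of the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// nal_unit_type values defined by the H.264 specification.
enum NalType : uint8_t {
    NAL_SLICE_NON_IDR = 1,  // e.g. header byte 0x41 or 0x21
    NAL_SLICE_IDR     = 5,  // e.g. header byte 0x65
    NAL_SPS           = 7,  // e.g. header byte 0x67
    NAL_PPS           = 8   // e.g. header byte 0x68
};

// Hypothetical helper: returns the nal_unit_type of a buffer that begins
// with a 4-byte Annex B start code (00 00 00 01).
inline uint8_t nalTypeAfterStartCode(const uint8_t* data, size_t len) {
    assert(data != nullptr && len >= 5);
    // Low 5 bits are nal_unit_type; the upper 3 bits hold
    // forbidden_zero_bit and nal_ref_idc.
    return data[4] & 0x1F;
}
```

With this, a test such as videoFrameType == 0x65 could be written as nalTypeAfterStartCode(data, dataLen) == NAL_SLICE_IDR.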
Finally to close the file off:
    av_write_trailer(oc);
    int i = 0;
    for (i = 0; i < oc->nb_streams; i++)
    {
        av_freep(&oc->streams[i]->codec);
        av_freep(&oc->streams[i]);
    }

    if (!(oc->oformat->flags & AVFMT_NOFILE))
    {
        avio_close(oc->pb);
    }
    av_free(oc);
If I take the H.264 data alone and convert it:
    ffmpeg -i recording.h264 -vcodec copy recording.mp4
All but the "footer" of the files are the same.
Output from my program:
    readrec recording.tcp out.mp4
    ** START * 01-03-2013 14:26:01 180000
    Output #0, mp4, to 'out.mp4':
    Stream #0:0: Video: h264, yuv420p, 352x288, q=2-31, 512 kb/s, 90k tbn, 25 tbc
    * END ** 01-03-2013 14:27:01 102000
    Wrote 1499 video frames.
If I try to use ffmpeg to convert the MP4 file created by my code:
    ffmpeg -i out.mp4 -vcodec copy out2.mp4
    ffmpeg version 0.11.1 Copyright (c) 2000-2012 the FFmpeg developers
      built on Mar 7 2013 12:49:22 with suncc 0x5110
      configuration: --extra-cflags=-KPIC -g --disable-mmx --disable-protocol=udp --disable-encoder=nellymoser --cc=cc --cxx=CC
      libavutil 51. 54.100 / 51. 54.100
      libavcodec 54. 23.100 / 54. 23.100
      libavformat 54. 6.100 / 54. 6.100
      libavdevice 54. 0.100 / 54. 0.100
      libavfilter 2. 77.100 / 2. 77.100
      libswscale 2. 1.100 / 2. 1.100
      libswresample 0. 15.100 / 0. 15.100
    [h264 @ 12eaac0] no frame!
        Last message repeated 1 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 23 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 74 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 64 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 34 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 49 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 24 times
    [h264 @ 12eaac0] Partitioned H.264 support is incomplete
    [h264 @ 12eaac0] no frame!
        Last message repeated 23 times
    [h264 @ 12eaac0] sps_id out of range
    [h264 @ 12eaac0] no frame!
        Last message repeated 148 times
    [h264 @ 12eaac0] sps_id (32) out of range
        Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
        Last message repeated 33 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 128 times
    [h264 @ 12eaac0] sps_id (32) out of range
        Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
        Last message repeated 3 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 3 times
    [h264 @ 12eaac0] slice type too large (0) at 0 0
    [h264 @ 12eaac0] decode_slice_header error
    [h264 @ 12eaac0] no frame!
        Last message repeated 309 times
    [h264 @ 12eaac0] sps_id (32) out of range
        Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
        Last message repeated 192 times
    [h264 @ 12eaac0] Partitioned H.264 support is incomplete
    [h264 @ 12eaac0] no frame!
        Last message repeated 73 times
    [h264 @ 12eaac0] sps_id (32) out of range
        Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
        Last message repeated 99 times
    [h264 @ 12eaac0] sps_id (32) out of range
        Last message repeated 1 times
    [h264 @ 12eaac0] no frame!
        Last message repeated 197 times
    [mov,mp4,m4a,3gp,3g2,mj2 @ 12e3100] decoding for stream 0 failed
    [mov,mp4,m4a,3gp,3g2,mj2 @ 12e3100] Could not find codec parameters (Video: h264 (avc1 / 0x31637661), 393539 kb/s)
    out.mp4: could not find codec parameters
I really do not know where the issue is, except it has to be something to do with the way the streams are being set up. I've looked at bits of code from where other people are doing a similar thing, and tried to use this advice in setting up the streams, but to no avail!
The final code which gave me an H.264/AAC muxed (synced) file is as follows. First, a bit of background information. The data is coming from an IP camera. The data is presented via a 3rd-party API as video/audio packets. The video packets are presented as the RTP payload data (no header) and consist of NALUs that are reconstructed and converted to H.264 video in Annex B format. AAC audio is presented as raw AAC and is converted to ADTS format to enable playback. These packets have been put into a bitstream format that allows the transmission of the timestamp (64-bit milliseconds since Jan 1 1970) along with a few other things.
This is more or less a prototype and is not clean in any respect. It probably leaks badly. I do, however, hope this helps anyone else trying to achieve something similar.
Globals:
    AVFormatContext* oc = NULL;
    AVCodecContext* videoContext = NULL;
    AVStream* videoStream = NULL;
    AVCodecContext* audioContext = NULL;
    AVStream* audioStream = NULL;
    AVCodec* videoCodec = NULL;
    AVCodec* audioCodec = NULL;
    int vi = 0;  // Video stream
    int ai = 1;  // Audio stream

    uint64_t firstVideoTimeStamp = 0;
    uint64_t firstAudioTimeStamp = 0;
    int audioStartOffset = 0;

    char* filename = NULL;

    Boolean first = TRUE;

    int videoFrameNumber = 0;
    int audioFrameNumber = 0;
Main:
    int main(int argc, char* argv[])
    {
        if (argc != 3)
        {
            cout << argv[0] << " <stream playback file> <output mp4 file>" << endl;
            return 0;
        }
        char* input_stream_file = argv[1];
        filename = argv[2];

        av_register_all();

        fstream inFile;
        inFile.open(input_stream_file, ios::in);

        // Used to store the latest pps & sps frames
        unsigned char* ppsFrame = NULL;
        int ppsFrameLength = 0;
        unsigned char* spsFrame = NULL;
        int spsFrameLength = 0;

        // Setup MP4 output file
        AVOutputFormat* fmt = av_guess_format(0, filename, 0);
        oc = avformat_alloc_context();
        oc->oformat = fmt;
        strcpy(oc->filename, filename);

        // Setup the bitstream filter for AAC in adts format. Could probably also achieve
        // this by stripping the first 7 bytes!
        AVBitStreamFilterContext* bsfc = av_bitstream_filter_init("aac_adtstoasc");
        if (!bsfc)
        {
            cout << "Error creating adtstoasc filter" << endl;
            return -1;
        }

        while (inFile.good())
        {
            TcpAVDataBlock* block = new TcpAVDataBlock();
            block->readStruct(inFile);
            DateTime dt = block->getTimestampAsDateTime();
            switch (block->getPacketType())
            {
                case TCP_PACKET_H264:
                {
                    if (firstVideoTimeStamp == 0)
                        firstVideoTimeStamp = block->getTimeStamp();
                    unsigned char* data = block->getData();
                    unsigned char videoFrameType = data[4];
                    int dataLen = block->getDataLen();

                    // pps
                    if (videoFrameType == 0x68)
                    {
                        if (ppsFrame != NULL)
                        {
                            delete[] ppsFrame; ppsFrameLength = 0;
                            ppsFrame = NULL;
                        }
                        ppsFrameLength = block->getDataLen();
                        ppsFrame = new unsigned char[ppsFrameLength];
                        memcpy(ppsFrame, block->getData(), ppsFrameLength);
                    }
                    else if (videoFrameType == 0x67)
                    {
                        // sps
                        if (spsFrame != NULL)
                        {
                            delete[] spsFrame; spsFrameLength = 0;
                            spsFrame = NULL;
                        }
                        spsFrameLength = block->getDataLen();
                        spsFrame = new unsigned char[spsFrameLength];
                        memcpy(spsFrame, block->getData(), spsFrameLength);
                    }

                    if (videoFrameType == 0x65 || videoFrameType == 0x41)
                    {
                        videoFrameNumber++;
                    }
                    // Extract a thumbnail for each I-Frame
                    if (videoFrameType == 0x65)
                    {
                        decodeIFrame(h264, spsFrame, spsFrameLength, ppsFrame, ppsFrameLength, data, dataLen);
                    }
                    if (videoStream != NULL)
                    {
                        AVPacket pkt = { 0 };
                        av_init_packet(&pkt);
                        pkt.stream_index = vi;
                        pkt.flags = 0;
                        pkt.pts = videoFrameNumber;
                        pkt.dts = videoFrameNumber;
                        if (videoFrameType == 0x65)
                        {
                            pkt.flags = 1;  // AV_PKT_FLAG_KEY

                            unsigned char* videoFrame = new unsigned char[spsFrameLength+ppsFrameLength+dataLen];
                            memcpy(videoFrame, spsFrame, spsFrameLength);
                            memcpy(&videoFrame[spsFrameLength], ppsFrame, ppsFrameLength);
                            memcpy(&videoFrame[spsFrameLength+ppsFrameLength], data, dataLen);
                            pkt.size = spsFrameLength + ppsFrameLength + dataLen;  // SPS + PPS + I-frame
                            pkt.data = videoFrame;
                            av_interleaved_write_frame(oc, &pkt);
                            delete[] videoFrame; videoFrame = NULL;
                        }
                        else if (videoFrameType != 0x67 && videoFrameType != 0x68)
                        {
                            pkt.size = dataLen;
                            pkt.data = data;
                            av_interleaved_write_frame(oc, &pkt);
                        }
                    }
                    break;
                }

                case TCP_PACKET_AAC:
                    if (firstAudioTimeStamp == 0)
                    {
                        firstAudioTimeStamp = block->getTimeStamp();
                        uint64_t millseconds_difference = firstAudioTimeStamp - firstVideoTimeStamp;
                        audioStartOffset = millseconds_difference * 16000 / 1000;
                        cout << "audio offset: " << audioStartOffset << endl;
                    }

                    if (audioStream != NULL)
                    {
                        AVPacket pkt = { 0 };
                        av_init_packet(&pkt);
                        pkt.stream_index = ai;
                        pkt.flags = 1;
                        pkt.pts = audioFrameNumber*1024;
                        pkt.dts = audioFrameNumber*1024;
                        pkt.data = block->getData();
                        pkt.size = block->getDataLen();
                        pkt.duration = 1024;

                        AVPacket newpacket = pkt;
                        int rc = av_bitstream_filter_filter(bsfc, audioContext,
                            NULL,
                            &newpacket.data, &newpacket.size,
                            pkt.data, pkt.size,
                            pkt.flags & AV_PKT_FLAG_KEY);

                        if (rc >= 0)
                        {
                            //cout << "Write audio frame" << endl;
                            newpacket.pts = audioFrameNumber*1024;
                            newpacket.dts = audioFrameNumber*1024;
                            audioFrameNumber++;
                            newpacket.duration = 1024;

                            av_interleaved_write_frame(oc, &newpacket);
                            av_free_packet(&newpacket);
                        }
                        else
                        {
                            cout << "Error filtering aac packet" << endl;
                        }
                    }
                    break;

                case TCP_PACKET_START:
                    break;

                case TCP_PACKET_END:
                    break;
            }
            delete block;
        }
        inFile.close();

        av_write_trailer(oc);
        int i = 0;
        for (i = 0; i < oc->nb_streams; i++)
        {
            av_freep(&oc->streams[i]->codec);
            av_freep(&oc->streams[i]);
        }

        if (!(oc->oformat->flags & AVFMT_NOFILE))
        {
            avio_close(oc->pb);
        }

        av_free(oc);

        delete[] spsFrame; spsFrame = NULL;
        delete[] ppsFrame; ppsFrame = NULL;

        cout << "Wrote " << videoFrameNumber << " video frames." << endl;

        return 0;
    }
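The comment in the listing suggests that stripping the first 7 bytes might replace the aac_adtstoasc filter. As far as I know, an ADTS header is 7 bytes only when its protection_absent bit is set; with a CRC present it is 9 bytes. A minimal sketch of a check for this follows. The helper name adtsHeaderSize is hypothetical. Stripping the header does not by itself supply the AudioSpecificConfig extradata that the bitstream filter generates, so this only illustrates the header layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the ADTS header size in bytes (7 or 9),
// or 0 if the buffer does not start with the 12-bit ADTS syncword 0xFFF.
inline size_t adtsHeaderSize(const uint8_t* data, size_t len) {
    assert(data != nullptr);
    if (len < 7) return 0;
    if (data[0] != 0xFF || (data[1] & 0xF0) != 0xF0) return 0;
    // The lowest bit of the second byte is protection_absent:
    // 1 means no CRC (7-byte header), 0 means a 2-byte CRC follows (9 bytes).
    const bool protectionAbsent = (data[1] & 0x01) != 0;
    return protectionAbsent ? 7 : 9;
}
```

A caller would skip adtsHeaderSize(...) bytes of each packet rather than a hard-coded 7.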
The streams/codecs are added and the header is created in a function called addVideoAndAudioStream(). This function is called from decodeIFrame(), so there are a few assumptions (which aren't necessarily good):
1. A video packet comes first
2. AAC is present
decodeIFrame was a separate prototype in which I was creating a thumbnail for each I-frame. The code to generate thumbnails was from: https://gnunet.org/svn/Extractor/src/plugins/thumbnailffmpeg_extractor.c
The decodeIFrame function passes an AVCodecContext into addVideoAndAudioStream:
    void addVideoAndAudioStream(AVCodecContext* decoder = NULL)
    {
        videoStream = av_new_stream(oc, 0);
        if (!videoStream)
        {
            cout << "ERROR creating video stream" << endl;
            return;
        }
        vi = videoStream->index;
        videoContext = videoStream->codec;
        videoContext->codec_type = AVMEDIA_TYPE_VIDEO;
        videoContext->codec_id = decoder->codec_id;
        videoContext->bit_rate = 512000;
        videoContext->width = decoder->width;
        videoContext->height = decoder->height;
        videoContext->time_base.den = 25;
        videoContext->time_base.num = 1;
        videoContext->gop_size = decoder->gop_size;
        videoContext->pix_fmt = decoder->pix_fmt;

        audioStream = av_new_stream(oc, 1);
        if (!audioStream)
        {
            cout << "ERROR creating audio stream" << endl;
            return;
        }
        ai = audioStream->index;
        audioContext = audioStream->codec;
        audioContext->codec_type = AVMEDIA_TYPE_AUDIO;
        audioContext->codec_id = CODEC_ID_AAC;
        audioContext->bit_rate = 64000;
        audioContext->sample_rate = 16000;
        audioContext->channels = 1;

        if (oc->oformat->flags & AVFMT_GLOBALHEADER)
        {
            videoContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
            audioContext->flags |= CODEC_FLAG_GLOBAL_HEADER;
        }

        av_dump_format(oc, 0, filename, 1);

        if (!(oc->oformat->flags & AVFMT_NOFILE))
        {
            if (avio_open(&oc->pb, filename, AVIO_FLAG_WRITE) < 0) {
                cout << "Error opening file" << endl;
            }
        }

        avformat_write_header(oc, NULL);
    }
As far as I can tell, a number of assumptions didn't seem to matter, for example:
1. Bit rate. The actual video bit rate was ~262k whereas I specified 512 kbit.
2. AAC channels. I specified mono, although the actual output was stereo, from memory.
You would still need to know what the frame rate (time base) is for the video & audio.
Contrary to what a lot of other examples suggest, the way I first set pts & dts on the video packets did not produce a playable file. I needed to know the time base (25 fps) and then set the pts & dts according to that time base, i.e. first frame = 0 (PPS, SPS, I), second frame = 1 (intermediate frame, whatever it's called ;)).
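To make the time base arithmetic concrete: frame n in a 1/25 time base sits at n * 3600 ticks in the 1/90000 time base that the av_dump_format output above reports as "90k tbn". The following is a minimal sketch of that conversion only. The names Rational and rescaleTs are hypothetical, and libavutil's av_rescale_q() does the same job with rounding and overflow protection. Whether packets need rescaling before writing depends on the libavformat version and on which time base the muxer expects, so this is not a claim about what the code above requires.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AVRational, to keep the sketch self-contained.
struct Rational { int num; int den; };

// Converts timestamp ts from time base 'from' to time base 'to',
// truncating toward zero. No overflow protection.
inline int64_t rescaleTs(int64_t ts, Rational from, Rational to) {
    assert(from.den > 0 && to.num > 0);
    return ts * from.num * to.den / (static_cast<int64_t>(from.den) * to.num);
}
```

For example, rescaleTs(videoFrameNumber, {1, 25}, {1, 90000}) yields the frame's timestamp in 90 kHz ticks.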
For AAC I also had to assume a sample rate of 16000 Hz. I used 1024 samples per AAC packet (you can also have AAC with 960 samples per packet, I think) to determine the audio "offset". I added this to the pts & dts, so the pts/dts are the sample number at which the packet is to be played back. You also need to make sure that a duration of 1024 is set in the packet before writing.
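The audio timestamp arithmetic described in this paragraph can be sketched as follows, assuming 16000 Hz audio, 1024 samples per AAC packet, and millisecond capture timestamps. The function names are hypothetical. The main() listing above computes audioStartOffset but does not add it to the pts, whereas this sketch follows the prose and does add it.

```cpp
#include <cassert>
#include <cstdint>

// Offset of the first audio packet relative to the first video packet,
// expressed in audio samples. Timestamps are in milliseconds.
inline int64_t audioOffsetSamples(uint64_t firstAudioMs, uint64_t firstVideoMs,
                                  int sampleRate) {
    assert(sampleRate > 0);
    // Convert to signed before subtracting so that audio arriving before
    // video gives a negative offset instead of wrapping around.
    const int64_t diffMs = static_cast<int64_t>(firstAudioMs) -
                           static_cast<int64_t>(firstVideoMs);
    return diffMs * sampleRate / 1000;
}

// pts/dts of audio packet number frameNumber, in samples (time base 1/sampleRate).
inline int64_t audioPts(int64_t frameNumber, int64_t offsetSamples,
                        int samplesPerFrame = 1024) {
    return offsetSamples + frameNumber * samplesPerFrame;
}
```

Each packet would then carry pts = dts = audioPts(n, offset) and duration = 1024.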
--
I have found additionally today that Annex B isn't really compatible with any other player so AVCC format should really be used.
These URLs helped:
Problem to Decode H264 video over RTP with ffmpeg (libavcodec)
http://aviadr1.blogspot.com.au/2010/05/h264-extradata-partially-explained-for.html
When constructing the video stream, I filled out the extradata & extradata_size:
    // Extradata contains PPS & SPS for AVCC format
    int extradata_len = 8 + spsFrameLen-4 + 1 + 2 + ppsFrameLen-4;
    videoContext->extradata = (uint8_t*)av_mallocz(extradata_len);
    videoContext->extradata_size = extradata_len;
    videoContext->extradata[0] = 0x01;
    videoContext->extradata[1] = spsFrame[4+1];
    videoContext->extradata[2] = spsFrame[4+2];
    videoContext->extradata[3] = spsFrame[4+3];
    videoContext->extradata[4] = 0xFC | 3;
    videoContext->extradata[5] = 0xE0 | 1;
    int tmp = spsFrameLen - 4;
    videoContext->extradata[6] = (tmp >> 8) & 0x00ff;
    videoContext->extradata[7] = tmp & 0x00ff;
    int i = 0;
    for (i=0;i<tmp;i++)
        videoContext->extradata[8+i] = spsFrame[4+i];
    videoContext->extradata[8+tmp] = 0x01;
    int tmp2 = ppsFrameLen-4;
    videoContext->extradata[8+tmp+1] = (tmp2 >> 8) & 0x00ff;
    videoContext->extradata[8+tmp+2] = tmp2 & 0x00ff;
    for (i=0;i<tmp2;i++)
        videoContext->extradata[8+tmp+3+i] = ppsFrame[4+i];
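The same byte layout can be written as a standalone function, which makes each field easier to see and to unit-test without libavformat. This is a sketch under stated assumptions and not the author's code: buildAvcC is a hypothetical name, and it takes the SPS and PPS with the 4-byte start codes already removed, whereas the listing above indexes past them with spsFrame[4+i].

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds an avcC (AVCDecoderConfigurationRecord) blob
// from one SPS and one PPS NAL unit, both WITHOUT Annex B start codes.
inline std::vector<uint8_t> buildAvcC(const std::vector<uint8_t>& sps,
                                      const std::vector<uint8_t>& pps) {
    assert(sps.size() >= 4 && sps.size() <= 0xFFFF);
    assert(!pps.empty() && pps.size() <= 0xFFFF);

    std::vector<uint8_t> out;
    out.reserve(11 + sps.size() + pps.size());
    out.push_back(0x01);    // configurationVersion
    out.push_back(sps[1]);  // AVCProfileIndication
    out.push_back(sps[2]);  // profile_compatibility
    out.push_back(sps[3]);  // AVCLevelIndication
    out.push_back(0xFC | 3);  // reserved bits + lengthSizeMinusOne (4-byte lengths)
    out.push_back(0xE0 | 1);  // reserved bits + number of SPS (1)
    out.push_back(static_cast<uint8_t>((sps.size() >> 8) & 0xFF));  // SPS length, big-endian
    out.push_back(static_cast<uint8_t>(sps.size() & 0xFF));
    out.insert(out.end(), sps.begin(), sps.end());
    out.push_back(0x01);    // number of PPS
    out.push_back(static_cast<uint8_t>((pps.size() >> 8) & 0xFF));  // PPS length, big-endian
    out.push_back(static_cast<uint8_t>(pps.size() & 0xFF));
    out.insert(out.end(), pps.begin(), pps.end());
    return out;
}
```

The result would be copied into a buffer allocated with av_mallocz and assigned to extradata, with extradata_size set to its length, as the listing above does.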
When writing out the frames, don't prepend the SPS & PPS frames, just write out the I Frame & P frames. In addition, replace the Annex B start code contained in the first 4 bytes (0x00 0x00 0x00 0x01) with the size of the I/P frame.
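The start code replacement described above can be sketched as a small in-place helper. It assumes each buffer holds exactly one NAL unit preceded by a 4-byte start code, as the camera data in this question appears to. Buffers with 3-byte start codes or several NAL units would need a scanning approach instead. The name startCodeToLength is hypothetical, and it may or may not match what the setLength() function in the question did.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: overwrites the 4-byte Annex B start code at the
// front of a single-NAL buffer with the NAL payload size as a 32-bit
// big-endian integer (the AVCC framing that matches lengthSizeMinusOne = 3).
inline void startCodeToLength(uint8_t* nal, size_t len) {
    assert(nal != nullptr && len > 4);
    assert(nal[0] == 0 && nal[1] == 0 && nal[2] == 0 && nal[3] == 1);
    const uint32_t payload = static_cast<uint32_t>(len - 4);
    nal[0] = static_cast<uint8_t>((payload >> 24) & 0xFF);
    nal[1] = static_cast<uint8_t>((payload >> 16) & 0xFF);
    nal[2] = static_cast<uint8_t>((payload >> 8) & 0xFF);
    nal[3] = static_cast<uint8_t>(payload & 0xFF);
}
```

The packet size passed to the muxer stays the same, since the 4-byte prefix is overwritten rather than removed.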
Solution
Please let me sum it up: the problem with your (original) code was that the input to av_interleaved_write_frame() should not start with the packet length. The file may still be playable if you don't strip the 00 00 00 01 start codes, but that IMHO is a resilience behavior of the player, and I would not count on this.
