Problem description
I have an app that streams video using Kickflip and ButterflyTV libRTMP.
The app works fine 99% of the time, but from time to time I get a native segmentation fault that I am not able to debug, because the messages are too cryptic:
01-24 10:52:25.576 199-199/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-24 10:52:25.576 199-199/? A/DEBUG: Build fingerprint: 'google/hammerhead/hammerhead:6.0.1/M4B30Z/3437181:user/release-keys'
01-24 10:52:25.576 199-199/? A/DEBUG: Revision: '11'
01-24 10:52:25.576 199-199/? A/DEBUG: ABI: 'arm'
01-24 10:52:25.576 199-199/? A/DEBUG: pid: 14302, tid: 14382, name: MuxerThread >>> tv.myapp.broadcast.dev <<<
01-24 10:52:25.576 199-199/? A/DEBUG: signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x9fef1000
01-24 10:52:25.636 199-199/? A/DEBUG: Abort message: 'Setting to ready!'
01-24 10:52:25.636 199-199/? A/DEBUG: r0 9c6f9500 r1 9c6f94fc r2 9fee900c r3 00007ff4
01-24 10:52:25.636 199-199/? A/DEBUG: r4 9fee9010 r5 9fef0ffd r6 00007ff1 r7 9fef0d88
01-24 10:52:25.636 199-199/? A/DEBUG: r8 cfe40980 r9 9e0a6900 sl 00007ff4 fp 9c6f94fc
01-24 10:52:25.636 199-199/? A/DEBUG: ip 9c6f9058 sp 9c6f94dc lr 000000e9 pc b3a33cb6 cpsr 800f0030
01-24 10:52:25.650 199-199/? A/DEBUG: backtrace:
01-24 10:52:25.651 199-199/? A/DEBUG: #00 pc 00004cb6 /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so
01-24 10:52:25.651 199-199/? A/DEBUG: #01 pc 00005189 /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so (rtmp_sender_write_video_frame+28)
01-24 10:52:25.651 199-199/? A/DEBUG: #02 pc 00005599 /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so (Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo+60)
01-24 10:52:25.651 199-199/? A/DEBUG: #03 pc 014e84e7 /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (int net.butterflytv.rtmp_client.RTMPMuxer.writeVideo(byte[], int, int, int)+122)
01-24 10:52:25.651 199-199/? A/DEBUG: #04 pc 014dbd55 /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix.writeThread()+2240)
01-24 10:52:25.651 199-199/? A/DEBUG: #05 pc 014d8c41 /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix.access$000(io.kickflip.sdk.av.muxer.RtmpMuxerMix)+60)
01-24 10:52:25.651 199-199/? A/DEBUG: #06 pc 014d819f /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix$1.run()+98)
01-24 10:52:25.651 199-199/? A/DEBUG: #07 pc 721e78d1 /data/dalvik-cache/arm/system@framework@boot.oat (offset 0x1ed6000)
Again, in a 2-hour stream this might never happen, or it might happen 10 minutes into the stream. It is very hard to debug because I cannot force the bug to happen.
Is there any way to improve the debugging information I get? What exactly does SEGV_ACCERR mean? I've read that it "means you tried to access an address that you don't have permission to access", but I am unsure what that means in practice, as I can stream for hours without the bug happening.
Is there any way to catch the signal and just continue?
EDIT: To add more information, this is the part of the native library where the app crashes (found using ndk-stack):
    JNIEXPORT jint JNICALL
    Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo(JNIEnv *env, jobject instance,
                                                           jbyteArray data_, jint offset, jint length,
                                                           jint timestamp) {
        jbyte *data = (*env)->GetByteArrayElements(env, data_, NULL);
        jint result = rtmp_sender_write_video_frame(data, length, timestamp, 0, 0);
        (*env)->ReleaseByteArrayElements(env, data_, data, 0);
        return result;
    }

    int rtmp_sender_write_video_frame(uint8_t *data,
                                      int size,
                                      uint64_t dts_us,
                                      int key,
                                      uint32_t abs_ts)
    {
        uint8_t * buf;
        uint8_t * buf_offset;
        int val = 0;
        int total;
        uint32_t ts;
        uint32_t nal_len;
        uint32_t nal_len_n;
        uint8_t *nal;
        uint8_t *nal_n;
        char *output ;
        uint32_t offset = 0;
        uint32_t body_len;
        uint32_t output_len;

        buf = data;
        buf_offset = data;
        total = size;
        ts = (uint32_t)dts_us;
        //ts = RTMP_GetTime() - start_time;
        offset = 0;

        nal = get_nal(&nal_len, &buf_offset, buf, total);
        (...)
    }

    static uint8_t * get_nal(uint32_t *len, uint8_t **offset, uint8_t *start, uint32_t total)
    {
        uint32_t info;
        uint8_t *q ;
        uint8_t *p = *offset;
        *len = 0;

        if ((p - start) >= total)
            return NULL;

        while(1) {
            info = find_start_code(p, 3);
            if (info == 1)
                break;
            p++;
            if ((p - start) >= total)
                return NULL;
        }
        q = p + 4;
        p = q;
        while(1) {
            info = find_start_code(p, 3);
            if (info == 1)
                break;
            p++;
            if ((p - start) >= total)
                //return NULL;
                break;
        }

        *len = (p - q);
        *offset = p;
        return q;
    }

    static uint32_t find_start_code(uint8_t *buf, uint32_t zeros_in_startcode)
    {
        uint32_t info;
        uint32_t i;

        info = 1;
        if ((info = (buf[zeros_in_startcode] != 1)? 0: 1) == 0)
            return 0;

        for (i = 0; i < zeros_in_startcode; i++)
            if (buf[i] != 0)
            {
                info = 0;
                break;
            };

        return info;
    }
The crash happens at buf[zeros_in_startcode] in find_start_code. I have removed a few android_log lines as well (I don't think this matters?).
To my understanding, this buffer should be accessible, so it makes no sense that it crashes only "sometimes".
PS. This is where I call the native code from Java:
    private void writeThread() {
        while (true) {
            Frame frame = null;
            synchronized (mBufferLock) {
                if (!mConfigBuffer.isEmpty()) {
                    frame = mConfigBuffer.peek();
                } else if (!mBuffer.isEmpty()) {
                    frame = mBuffer.remove();
                }
                if (frame == null) {
                    try {
                        mBufferLock.wait();
                    } catch (InterruptedException e) {
                    }
                }
            }
            if (frame == null) {
                continue;
            } else if (frame instanceof Sentinel) {
                break;
            }

            int writeResult = 0;
            synchronized (mWriteFence) {
                if (!mConnected) {
                    debug(WARN, "Skipping frame due to disconnection");
                    continue;
                }
                if (frame.getFrameType() == Frame.VIDEO_FRAME) {
                    writeResult = mRTMPMuxer.writeVideo(frame.getData(), frame.getOffset(), frame.getSize(), frame.getTime());
                } else if (frame.getFrameType() == Frame.AUDIO_FRAME) {
                    writeResult = mRTMPMuxer.writeAudio(frame.getData(), frame.getOffset(), frame.getSize(), frame.getTime());
                }
                if (writeResult < 0) {
                    mRtmpListener.onDisconnected();
                    mConnected = false;
                } else {
                    //Now we remove the config frame, only if sending was successful!
                    if (frame.isConfig()) {
                        synchronized (mBufferLock) {
                            mConfigBuffer.remove();
                        }
                    }
                }
            }
        }
    }
Note that the crash happens even when I don't send audio at all.
Solution
See https://developer.android.com/training/articles/perf-jni.html
Analysis
Some musings and things to try:
- The code where it falls over is very generic, so there is probably no bug there.
- It must be that the frame data has been removed, damaged, locked, or moved. Has the Java garbage collector removed or relocated the data?
- You could write detailed debug output to a file, overwriting it on every frame, so you only have a small log with the last debug info.
- Send a local copy of the frame data (using a direct ByteBuffer) to mRTMPMuxer.writeVideo. Unlike regular byte arrays, the storage of a direct ByteBuffer is not allocated on the managed heap and can always be accessed directly from native code.
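The "small log with the last debug info" idea can be sketched on the native side as follows. This is only an illustration: log_last_frame is a hypothetical helper, not part of libRTMP or Kickflip, and the log path is assumed to be a file the app is allowed to write (for example somewhere under its private files directory). The file is reopened with "w" on every call, so it only ever holds the frame that was being processed when the process died.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of libRTMP): overwrite `path` with a one-line
 * summary of the frame about to be handed to the muxer.
 * Returns 0 on success and -1 on failure. */
int log_last_frame(const char *path, const uint8_t *data,
                   uint32_t size, uint32_t ts)
{
    FILE *f = fopen(path, "w");   /* "w" truncates: only the latest frame is kept */
    if (f == NULL)
        return -1;

    /* The pointer value can later be compared with the fault address. */
    fprintf(f, "size=%u ts=%u ptr=%p tail=",
            (unsigned)size, (unsigned)ts, (const void *)data);

    /* Dump at most the last 4 bytes of the frame as hex. */
    uint32_t start = size > 4 ? size - 4 : 0;
    for (uint32_t i = start; i < size; i++)
        fprintf(f, "%02x", (unsigned)data[i]);
    fputc('\n', f);

    /* Closing hands the data to the kernel before the risky native call runs,
     * so it is not lost in a stdio buffer if the process is killed. */
    return fclose(f) == 0 ? 0 : -1;
}
```

It would be called at the top of rtmp_sender_write_video_frame, before get_nal. Opening a file per frame has a cost, so this is something to enable only while hunting the crash.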
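Implementation
On the Java side, copy the frame into a direct buffer and pass that instead of the byte[] (the native method declaration of writeVideo has to be changed to take a ByteBuffer):
    // Allocate memory from the native heap
    ByteBuffer data = ByteBuffer.allocateDirect(frame.getData().length);
    data.clear();
    // Copy the frame bytes into the direct buffer
    data.put(frame.getData(), 0, frame.getData().length);
    // Alternative: data = (frame.getData() == null) ? null : frame.getData().clone();
    int offset = frame.getOffset();
    int size = frame.getSize();
    int time = frame.getTime();
    writeResult = mRTMPMuxer.writeVideo(data, offset, size, time);
On the native side:
    JNIEXPORT jint JNICALL
    Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo(JNIEnv *env, jobject instance,
                                                           jobject data_, // not jbyteArray data_
                                                           jint offset, jint length,
                                                           jint timestamp) {
        // GetDirectBufferAddress, NOT GetByteArrayElements
        jbyte *data = (*env)->GetDirectBufferAddress(env, data_);
        if (data == NULL)
            return -1; // not a direct buffer
        jint result = rtmp_sender_write_video_frame((uint8_t *) data, length, timestamp, 0, 0);
        // No ReleaseByteArrayElements call is needed for a direct buffer
        return result;
    }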
Debugging
Some code from the SO question "Catching exceptions thrown from native code". It requires the file to be compiled as C++ and a JNIEnv *env to be in scope. It also only intercepts C++ exceptions: a SIGSEGV is a signal, not an exception, so try/catch will not stop a bad memory read by itself.
    static uint32_t find_start_code(uint8_t *buf, uint32_t zeros_in_startcode) {
        //...
        try {
            if ((info = (buf[zeros_in_startcode] != 1) ? 0 : 1) == 0) return 0; // your code
        }
        // You can catch std::exception for more generic error handling
        catch (const std::exception &e) {
            throwJavaException(env, e.what()); // see method below
        }
        //...
    }
Then a new method:
    void throwJavaException(JNIEnv *env, const char *msg)
    {
        // You can put your own exception here
        jclass c = env->FindClass("java/lang/RuntimeException");
        if (NULL == c)
        {
            // Plan B: null pointer ...
            c = env->FindClass("java/lang/NullPointerException");
        }
        env->ThrowNew(c, msg);
    }
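Because a bad read cannot be intercepted with try/catch, a more direct guard is to make sure the scan never reads past the end of the frame. In the question's get_nal, find_start_code(p, 3) reads p[3], while the loop only stops once (p - start) >= total, so the last three positions can read beyond the buffer. The register dump shows r5 = 0x9fef0ffd, exactly 3 bytes below the fault address 0x9fef1000, which is at least consistent with such a read crossing into a protected page. That is a guess, not a confirmed diagnosis. The sketch below is a bounded variant under that assumption; next_nal and is_start_code are hypothetical names, not libRTMP functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the 4-byte start code 00 00 00 01 begins at p, 0 otherwise.
 * Never reads at or beyond `end`. */
static int is_start_code(const uint8_t *p, const uint8_t *end)
{
    if (end - p < 4)                  /* fewer than 4 readable bytes left */
        return 0;
    return p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 1;
}

/* Bounded counterpart of get_nal (hypothetical, for illustration).
 * Scans buf[*pos .. total) for the next unit, stores its payload length in
 * *len, advances *pos to where the following scan should resume, and returns
 * a pointer to the payload, or NULL if no start code is found. */
const uint8_t *next_nal(const uint8_t *buf, size_t total,
                        size_t *pos, size_t *len)
{
    const uint8_t *end = buf + total;

    *len = 0;
    if (*pos >= total)
        return NULL;
    const uint8_t *p = buf + *pos;

    /* Find the start code that opens the unit. */
    while (p < end && !is_start_code(p, end))
        p++;
    if (p == end)
        return NULL;

    /* is_start_code succeeded, so p + 4 <= end is guaranteed here. */
    const uint8_t *q = p + 4;
    p = q;

    /* The payload runs to the next start code or to the end of the buffer. */
    while (p < end && !is_start_code(p, end))
        p++;

    *len = (size_t)(p - q);
    *pos = (size_t)(p - buf);
    return q;
}
```

The behaviour matches the original for well-formed input (the payload of the last unit extends to the end of the buffer); the only difference is that every read is checked against end first.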
Don't get too hung up on SEGV_ACCERR: you have a segmentation fault, SIGSEGV (caused by a program trying to read or write an illegal memory location; a read in your case).
From siginfo.h:
SEGV_MAPERR means you tried to access an address that doesn't map to anything.
SEGV_ACCERR means you tried to access an address that you don't have permission to access.
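To see these two codes in practice, and as a partial answer to "can I catch the signal", here is a minimal Linux/POSIX sketch. probe_read is a hypothetical helper: it installs a temporary SIGSEGV handler with SA_SIGINFO, tries to read one byte, and reports the si_code the kernel delivered. It is meant as a diagnostic experiment, not as a way to keep streaming after a crash: jumping out of a handler like this leaves the interrupted code in an unknown state. On Android the runtime installs its own SIGSEGV handling as well, so a real handler would have to chain to the previous one, which is not shown here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for MAP_ANONYMOUS on some libcs */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf probe_env;
static volatile sig_atomic_t last_si_code;   /* SEGV_MAPERR or SEGV_ACCERR */
static void *volatile last_si_addr;          /* faulting address */

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    last_si_code = info->si_code;
    last_si_addr = info->si_addr;
    siglongjmp(probe_env, 1);                /* abandon the faulting read */
}

/* Hypothetical helper: read one byte at addr.
 * Returns 0 if the read succeeded, the si_code of the SIGSEGV if it faulted,
 * or -1 if the handler could not be installed. */
int probe_read(const volatile unsigned char *addr)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;                /* ask for the siginfo_t details */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    int code = 0;
    if (sigsetjmp(probe_env, 1) == 0) {
        unsigned char byte = *addr;          /* may raise SIGSEGV */
        (void)byte;
    } else {
        code = last_si_code;                 /* we arrived here via siglongjmp */
    }

    sigaction(SIGSEGV, &old, NULL);          /* restore the previous handler */
    return code;
}
```

Probing a page that is mapped with PROT_NONE should report SEGV_ACCERR (the address exists but reading it is not permitted), while an address with no mapping at all should report SEGV_MAPERR. In the handler of a real app, logging si_code and si_addr next to the frame pointer and size is usually more useful than trying to continue.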
Other
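This may be of interest:
Q: I noticed that there was RTMP support, but a patch removing RTMP has been merged. Can you tell me why?
A: We don't think RTMP serves the mobile broadcasting use case as well as HLS, so we don't want to dedicate our limited resources to supporting it.
See: https://github.com/Kickflip/kickflip-android-sdk/issues/33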
I suggest you register an issue with:
https://github.com/Kickflip/kickflip-android-sdk/issues
https://github.com/ButterflyTV/LibRtmp-Client-for-Android/issues