Problem description
I ran into something very odd in the Android NDK.
I have a loop:
#include <chrono>
#include <android/log.h>
#include <vector>
while (true)
{
    const int sz = 2048*2048*3;
    std::vector<unsigned char> v;
    {
        auto startTime = std::chrono::system_clock::now();
        v.resize(sz);
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now() - startTime);
        __android_log_print(ANDROID_LOG_ERROR, "READFILE 1", "v.resize(%d) time : %lld\n", sz, duration.count());
    }
    {
        auto startTime = std::chrono::system_clock::now();
        v.resize(0);
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now() - startTime);
        __android_log_print(ANDROID_LOG_ERROR, "READFILE 2", "v.resize(0) time : %lld\n", duration.count());
    }
    {
        auto startTime = std::chrono::system_clock::now();
        v.resize(sz);
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now() - startTime);
        __android_log_print(ANDROID_LOG_ERROR, "READFILE 3", "v.resize(%d) time : %lld\n", sz, duration.count());
    }
}
I get this log (times are in microseconds):
34.4171: v.resize(12582912) time : 845977
34.9682: v.resize(0) time : 550995
35.5293: v.resize(12582912) time : 561165
36.6121: v.resize(12582912) time : 530845
37.1612: v.resize(0) time : 548528
37.7183: v.resize(12582912) time : 556559
38.7811: v.resize(12582912) time : 515162
39.3312: v.resize(0) time : 550630
39.8883: v.resize(12582912) time : 556319
40.9711: v.resize(12582912) time : 530739
41.5182: v.resize(0) time : 546654
42.0733: v.resize(12582912) time : 554924
43.1321: v.resize(12582912) time : 511659
43.6802: v.resize(0) time : 547084
44.2373: v.resize(12582912) time : 557001
45.3201: v.resize(12582912) time : 530313
So, first of all:
- As you can see, resize(0) alone takes about 550 milliseconds. It should take 1 microsecond at most, not hundreds of milliseconds.
- Secondly, why does resize(size) take about 550 milliseconds again if the capacity of the vector hasn't changed?
These are two very odd behaviors.
You are welcome to take this snippet and check for yourself if you don't believe me :) But check it on the Android NDK, not in a Visual Studio project, because there it works as it should.
It really looks like a bug...
Or am I doing something wrong?
EDIT
I found that if you step down into the resize() method, you arrive at this loop:
template <class _Tp, class _Allocator>
inline _LIBCPP_INLINE_VISIBILITY
void
__vector_base<_Tp, _Allocator>::__destruct_at_end(pointer __new_last) _NOEXCEPT
{
    pointer __soon_to_be_end = __end_;
    while (__new_last != __soon_to_be_end)
        __alloc_traits::destroy(__alloc(), _VSTD::__to_raw_pointer(--__soon_to_be_end));
    __end_ = __new_last;
}
So this means there is a loop that goes over every element in the resized range and calls destroy on it.
That is no problem if you hold non-trivial objects that have a destructor. But if the vector holds trivial objects that have no destructor (like the integers in my case), this is very strange behavior: how can you call a destructor on an object that doesn't actually have one?
Does it look like a compiler bug?
Recommended answer
Adding to Maciej's answer and Andy's comment, let's check the code that is generated.
Using this Makefile:
CXX = $(NDKPATH)/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android29-clang++
CC = $(NDKPATH)/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android29-clang++
INC = -I$(NDKPATH)/cxx-stl/llvm-libc++/include/
LIB = -L$(NDKPATH)/cxx-stl/llvm-libc++/lib/
CXXFLAGS = -ggdb -O$(OPTLEVEL)
.PHONY: all clean dump
all: dump
dump: test
	$(NDKPATH)/toolchains/llvm/prebuilt/linux-x86_64/aarch64-linux-android/bin/objdump -d -C test | gawk '/<big|<small|::resize/ {p=1} /^$$/ {p=0} {if (p) print $0}'
clean:
	$(RM) test.o test
test: test.o
...and a very simple test.cpp:
#include <vector>
using std::vector;
void big(vector<int>& v) {
    v.resize(10000000);
}
void small(vector<int>& v) {
    v.resize(0);
}
int main() {
    return 0;
}
Compiling without optimization (-O0), note how both big() and small() call resize(), which does a whole bunch of stuff in a loop (as you've also found in the source code).
ndk-vector-speed$ export NDKPATH=~/.androidsdk/ndk-bundle
ndk-vector-speed$ make clean && OPTLEVEL=0 make dump
rm -f test.o test
/home/snild/.androidsdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android29-clang++ -ggdb -O0 -c -o test.o test.cpp
/home/snild/.androidsdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android29-clang++ test.o -o test
/home/snild/.androidsdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/aarch64-linux-android/bin/objdump -d -C test | gawk '/<big|<small|::resize/ {p=1} /^$/ {p=0} {if (p) print }'
0000000000000f04 <big(std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)>:
f04: d10083ff sub sp, sp, #0x20
f08: a9017bfd stp x29, x30, [sp,#16]
f0c: 910043fd add x29, sp, #0x10
f10: d292d001 mov x1, #0x9680 // #38528
f14: f2a01301 movk x1, #0x98, lsl #16
f18: f90007e0 str x0, [sp,#8]
f1c: f94007e0 ldr x0, [sp,#8]
f20: 94000013 bl f6c <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)>
f24: a9417bfd ldp x29, x30, [sp,#16]
f28: 910083ff add sp, sp, #0x20
f2c: d65f03c0 ret
0000000000000f30 <small(std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)>:
f30: d10083ff sub sp, sp, #0x20
f34: a9017bfd stp x29, x30, [sp,#16]
f38: 910043fd add x29, sp, #0x10
f3c: d2800001 mov x1, #0x0 // #0
f40: f90007e0 str x0, [sp,#8]
f44: f94007e0 ldr x0, [sp,#8]
f48: 94000009 bl f6c <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)>
f4c: a9417bfd ldp x29, x30, [sp,#16]
f50: 910083ff add sp, sp, #0x20
f54: d65f03c0 ret
0000000000000f6c <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)>:
f6c: d100c3ff sub sp, sp, #0x30
f70: a9027bfd stp x29, x30, [sp,#32]
f74: 910083fd add x29, sp, #0x20
f78: f81f83a0 stur x0, [x29,#-8]
f7c: f9000be1 str x1, [sp,#16]
f80: f85f83a0 ldur x0, [x29,#-8]
f84: f90003e0 str x0, [sp]
f88: 94000020 bl 1008 <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::size() const>
f8c: f90007e0 str x0, [sp,#8]
f90: f94007e0 ldr x0, [sp,#8]
f94: f9400be1 ldr x1, [sp,#16]
f98: eb01001f cmp x0, x1
f9c: 1a9f27e8 cset w8, cc
fa0: 37000048 tbnz w8, #0, fa8 <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)+0x3c>
fa4: 14000007 b fc0 <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)+0x54>
fa8: f9400be8 ldr x8, [sp,#16]
fac: f94007e9 ldr x9, [sp,#8]
fb0: eb090101 subs x1, x8, x9
fb4: f94003e0 ldr x0, [sp]
fb8: 9400001e bl 1030 <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::__append(unsigned long)>
fbc: 14000010 b ffc <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)+0x90>
fc0: f94007e8 ldr x8, [sp,#8]
fc4: f9400be9 ldr x9, [sp,#16]
fc8: eb09011f cmp x8, x9
fcc: 1a9f97ea cset w10, hi
fd0: 3700004a tbnz w10, #0, fd8 <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)+0x6c>
fd4: 1400000a b ffc <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::resize(unsigned long)+0x90>
fd8: b27e03e8 orr x8, xzr, #0x4
fdc: f94003e9 ldr x9, [sp]
fe0: f9400129 ldr x9, [x9]
fe4: f9400bea ldr x10, [sp,#16]
fe8: 9b0a7d08 mul x8, x8, x10
fec: 8b080128 add x8, x9, x8
ff0: f94003e0 ldr x0, [sp]
ff4: aa0803e1 mov x1, x8
ff8: 94000054 bl 1148 <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::__destruct_at_end(int*)>
ffc: a9427bfd ldp x29, x30, [sp,#32]
1000: 9100c3ff add sp, sp, #0x30
1004: d65f03c0 ret
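On the question of how a destructor can be called on an int at all: in generic code, an explicit destructor call on a scalar type is a legal "pseudo-destructor" call that does nothing, so the loop is valid C++ and only costs time when the optimizer is not allowed to remove it. The following is a minimal sketch of that idea, not the libc++ implementation; destructAtEnd is a hypothetical name that only mirrors the shape of __destruct_at_end.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// int has no user-visible destructor, and the type trait confirms it is trivial.
static_assert(std::is_trivially_destructible<int>::value,
              "int is trivially destructible");

// Hypothetical helper (not part of libc++): walks backwards from oldEnd to
// newLast and "destroys" each element, returning how many elements it visited.
template <class T>
std::size_t destructAtEnd(T* newLast, T* oldEnd) {
    std::size_t visited = 0;
    while (newLast != oldEnd) {
        --oldEnd;
        oldEnd->~T();  // for T = int this is a pseudo-destructor call: legal, no effect
        ++visited;
    }
    return visited;
}
```

Without optimization the loop body still runs once per element, which is consistent with the long -O0 code path above. With optimization enabled, a compiler is free to drop a loop whose body has no effect.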
With -O2, the compiler can do lots of optimization for us.
First of all, resize() is completely gone; once its body has been inlined into big() and small(), nothing calls it anymore, so it has been removed.
big() has inlined what it needs from resize(), calling __append() directly instead, and looks generally simpler than the full resize() function we called before. Since I haven't run this code, I can't make any claims about how much this helps with speed.
small() now has no function calls, no loops, and only five instructions (which I've annotated manually below). It has essentially become if (v.begin != v.end) v.end = v.begin. This will of course be very fast.
ndk-vector-speed$ make clean && OPTLEVEL=2 make dump
rm -f test.o test
/home/snild/.androidsdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android29-clang++ -ggdb -O2 -c -o test.o test.cpp
/home/snild/.androidsdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android29-clang++ test.o -o test
/home/snild/.androidsdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/aarch64-linux-android/bin/objdump -d -C test | gawk '/<big|<small|::resize/ {p=1} /^$/ {p=0} {if (p) print }'
0000000000000e64 <big(std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)>:
e64: a9402408 ldp x8, x9, [x0]
e68: 5292d00a mov w10, #0x9680 // #38528
e6c: 72a0130a movk w10, #0x98, lsl #16
e70: cb080129 sub x9, x9, x8
e74: 9342fd2b asr x11, x9, #2
e78: eb0a017f cmp x11, x10
e7c: 54000062 b.cs e88 <big(std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)+0x24>
e80: cb0b0141 sub x1, x10, x11
e84: 14000011 b ec8 <std::__ndk1::vector<int, std::__ndk1::allocator<int> >::__append(unsigned long)>
e88: 528b400a mov w10, #0x5a00 // #23040
e8c: 72a04c4a movk w10, #0x262, lsl #16
e90: eb0a013f cmp x9, x10
e94: 540000a0 b.eq ea8 <big(std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)+0x44>
e98: 528b4009 mov w9, #0x5a00 // #23040
e9c: 72a04c49 movk w9, #0x262, lsl #16
ea0: 8b090108 add x8, x8, x9
ea4: f9000408 str x8, [x0,#8]
ea8: d65f03c0 ret
0000000000000eac <small(std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)>:
eac: a9402408 ldp x8, x9, [x0] // load the first two values (begin and end) from v
eb0: eb08013f cmp x9, x8 // compare them
eb4: 54000040 b.eq ebc <small(std::__ndk1::vector<int, std::__ndk1::allocator<int> >&)+0x10>
// skip to 'ret' if they were equal
eb8: f9000408 str x8, [x0,#8] // write v.begin to v.end
ebc: d65f03c0 ret // return.
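To make the five annotated instructions concrete, here is a rough C++ rendering of what the optimized small() does. It is only a sketch under the assumption that the vector stores a begin pointer followed by an end pointer, as the ldp/str pair above suggests; VecHeader and smallEquivalent are hypothetical names, not libc++ types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the pointer members of a vector<int>.
// The first two fields correspond to the two values loaded by the ldp.
struct VecHeader {
    int* begin;
    int* end;
    int* capacityEnd;
};

// Equivalent of the optimized small(): no loop, no call, just a pointer store.
inline void smallEquivalent(VecHeader& v) {
    if (v.end != v.begin) {  // cmp + b.eq: skip the store when already empty
        v.end = v.begin;     // str: size becomes 0, capacity is untouched
    }
}
```

Because only the end pointer moves, the allocation is kept, which is why a later resize() back to the old size does not need to reallocate.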
Conclusion: Maciej and Andy are correct; you're not building with optimizations enabled.
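One way to catch this kind of mistake is to have the benchmark report whether it was compiled with optimization. The sketch below assumes a GCC or Clang toolchain, both of which predefine the __OPTIMIZE__ macro in optimized builds; builtWithOptimization and timeResizeMicros are hypothetical helper names, and std::chrono::steady_clock is swapped in for the system_clock used in the question because it is monotonic.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Reports whether this translation unit was compiled with optimization.
// Assumption: GCC/Clang, which define __OPTIMIZE__ at -O1 and above.
inline bool builtWithOptimization() {
#ifdef __OPTIMIZE__
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: times a single resize() call and returns microseconds.
inline long long timeResizeMicros(std::vector<unsigned char>& v, std::size_t newSize) {
    const auto start = std::chrono::steady_clock::now();
    v.resize(newSize);
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}
```

Logging builtWithOptimization() next to each timing makes it obvious when numbers come from a debug build and should not be compared with release figures.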