Problem description
I have created simple, frame-independent, variable-time-step linear movement in Direct3D9 using ID3DXSprite, and it stutters. Most users don't notice it, but on some computers (including mine) it happens often, and sometimes it stutters a lot.
- Stuttering occurs with VSync both enabled and disabled.
- I found that the same happens in an OpenGL renderer.
- It's not a floating point problem.
- The problem seems to exist only in AERO Transparent Glass windowed mode (it is fine, or at least much less noticeable, in fullscreen, in a borderless full-screen window, or with Aero disabled), and it gets even worse when the window loses focus.
EDIT:
It seems my frame delta time measurement log code was bugged. I have fixed it now.
- Normally, with VSync enabled, a frame renders in 17ms, but sometimes (probably when stuttering happens) it jumps to 25-30ms.
(I dump the log only once at application exit, not while running or rendering, so it does not affect performance.)
device->Clear(0, 0, D3DCLEAR_TARGET, D3DCOLOR_ARGB(255, 255, 255, 255), 0, 0);
device->BeginScene();
sprite->Begin(D3DXSPRITE_ALPHABLEND);
QueryPerformanceCounter(&counter);
float time = counter.QuadPart / (float) frequency.QuadPart;
float deltaTime = time - currentTime;
currentTime = time;
position.x += velocity * deltaTime;
if (position.x > 640)
velocity = -250;
else if (position.x < 0)
velocity = 250;
position.x = (int) position.x;
sprite->Draw(texture, 0, 0, &position, D3DCOLOR_ARGB(255, 255, 255, 255));
sprite->End();
device->EndScene();
device->Present(0, 0, 0, 0);
Fixed timer, thanks to Eduard Wirch and Ben Voigt (although it doesn't fix the initial problem):
float time()
{
static LARGE_INTEGER start = {0};
static LARGE_INTEGER frequency;
if (start.QuadPart == 0)
{
QueryPerformanceFrequency(&frequency);
QueryPerformanceCounter(&start);
}
LARGE_INTEGER counter;
QueryPerformanceCounter(&counter);
return (float) ((counter.QuadPart - start.QuadPart) / (double) frequency.QuadPart);
}
EDIT #2:
So far I have tried three update methods:
1) Variable time step
x += velocity * deltaTime;
2) Fixed time step
x += 4;
3) Fixed time step + Interpolation
accumulator += deltaTime;
float updateTime = 0.001f;
while (accumulator > updateTime)
{
previousX = x;
x += velocity * updateTime;
accumulator -= updateTime;
}
float alpha = accumulator / updateTime;
float interpolatedX = x * alpha + previousX * (1 - alpha);
All methods work pretty much the same. The fixed time step looks better, but depending on the frame rate is not really an option, and it doesn't solve the problem completely (it still jumps (stutters) from time to time, though rarely).
So far, disabling AERO Transparent Glass or going full screen is the only significant positive change.
I am using the latest NVIDIA drivers (GeForce 332.21 Driver) and Windows 7 x64 Ultimate.
Part of the solution was a simple data type precision problem. Replace the speed calculation with a constant, and you'll see extremely smooth movement. Analysing the calculation showed that you're storing the result from QueryPerformanceCounter() inside a float. QueryPerformanceCounter() returns a number which looks like this on my computer: 724032629776. This number requires at least 5 bytes to be stored. However, a float uses 4 bytes (and only 24 bits for the actual number) to store the value. So precision is lost when you convert the result of QueryPerformanceCounter() to float. And sometimes this leads to a deltaTime of zero, causing stuttering.
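The loss can be reproduced without any Direct3D code. The following is a minimal sketch, assuming a hypothetical counter frequency of 3,000,000 ticks per second (the real value comes from QueryPerformanceFrequency() and differs between machines). The helper names are made up for illustration.

```cpp
#include <cstdint>

// Seconds computed the way the question's render loop does it:
// the 64-bit counter value is converted to float before dividing.
float secondsAsFloat(std::int64_t counter, std::int64_t frequency) {
    return static_cast<float>(counter) / static_cast<float>(frequency);
}

// The same computation carried out in double precision.
double secondsAsDouble(std::int64_t counter, std::int64_t frequency) {
    return static_cast<double>(counter) / static_cast<double>(frequency);
}

// Time between two counter readings, using float time stamps.
float deltaAsFloat(std::int64_t earlier, std::int64_t later, std::int64_t frequency) {
    return secondsAsFloat(later, frequency) - secondsAsFloat(earlier, frequency);
}

// Time between two counter readings, using double time stamps.
double deltaAsDouble(std::int64_t earlier, std::int64_t later, std::int64_t frequency) {
    return secondsAsDouble(later, frequency) - secondsAsDouble(earlier, frequency);
}
```

With the counter value 724032629776 and a second reading 1000 ticks later, deltaAsFloat returns exactly zero: floats of that magnitude are 65536 apart, so both readings round to the same value. deltaAsDouble returns roughly 0.00033 seconds for the same pair.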
This partly explains why some users do not experience the problem. It all depends on whether the result of QueryPerformanceCounter() fits into a float.
The solution for this part of the problem is: use double (or, as Ben Voigt suggested, store the initial performance counter and subtract it from new values before converting to float. This gives you more headroom, but you might eventually hit the float resolution limit again when the application runs for a long time, depending on the growth speed of the performance counter).
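To get a feeling for how much headroom a float time stamp in seconds has, you can ask how far apart neighbouring float values are at a given elapsed time. This is a small sketch using std::nextafter from the standard library; the function name floatResolutionAt is hypothetical.

```cpp
#include <cmath>
#include <limits>

// Gap between a float value and the next larger representable float.
// For a time stamp in seconds, this is the smallest step the clock can express.
float floatResolutionAt(float seconds) {
    return std::nextafter(seconds, std::numeric_limits<float>::infinity()) - seconds;
}
```

One second after start the resolution is about 0.00000012 seconds, but after 16384 seconds (roughly four and a half hours) it is already about 0.00195 seconds, which is no longer small compared to a 16.7ms frame. A double keeps sub-microsecond resolution far longer.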
After fixing this, the stuttering was much reduced but did not disappear completely. Analyzing the runtime behaviour showed that a frame is skipped now and then. The application's GPU command buffer is flushed by Present, but the present command remains in the application context queue until the next vsync (even though Present was invoked long before vsync (14ms)). Further analysis showed that a background process (f.lux) told the system to set the gamma ramp once in a while. This command required the complete GPU queue to run dry before it was executed, probably to avoid side effects. This GPU flush was started just before the 'present' command was moved to the GPU queue. The system blocked video scheduling until the GPU ran dry, which took until the next vsync. So the present packet was not moved to the GPU queue until the next frame. The visible effect of this: stutter.
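A cheap first check for skipped frames is to scan a log of frame delta times, like the one dumped at application exit in the question, for frames that took noticeably longer than one refresh interval. The sketch below is an assumption-laden helper, not part of any API: the name countLongFrames and the 1.5x cut-off are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Counts frames whose duration exceeds 1.5 times the expected frame interval.
// The 1.5 factor is an assumed cut-off; tune it for your own logs.
std::size_t countLongFrames(const std::vector<double>& frameSeconds, double expectedSeconds) {
    const double threshold = 1.5 * expectedSeconds;
    std::size_t count = 0;
    for (double seconds : frameSeconds) {
        if (seconds > threshold) {
            ++count;
        }
    }
    return count;
}
```

At 60Hz the threshold is 25ms, so the 17ms frames from the question pass and a 30ms frame is counted. This only tells you that a frame was late, not why; finding the cause still needs a GPU-level trace.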
It's unlikely that you're running f.lux on your computer too. But you're probably experiencing a similar background intervention. You'll need to look for the source of the problem on your system yourself. I've written a blog post about how to diagnose frame skips: Diagnose frame skips and stutter in DirectX applications. You'll also find the whole story of diagnosing f.lux as the culprit there.
But even if you find the source of your frame skips, I doubt that you'll achieve a stable 60fps while DWM window composition is enabled. The reason is that you're not drawing to the screen directly; instead, you draw to a shared surface of DWM. Since it's a shared resource, it can be locked by others for an arbitrary amount of time, making it impossible to keep the frame rate of your application stable. If you really need a stable frame rate, go full screen or disable window composition (on Windows 7; Windows 8 does not allow disabling window composition):
#include <dwmapi.h>
...
HRESULT hr = DwmEnableComposition(DWM_EC_DISABLECOMPOSITION);
if (!SUCCEEDED(hr)) {
// log message or react in a different way
}