Problem description
What is the exact range of (contiguous) integers that can be expressed as a double (resp. float)? I ask because I am curious, for questions such as this one, when a loss of accuracy will occur.
That is:
- What is the least positive integer m such that m+1 cannot be precisely expressed as a double (resp. float)?
- What is the greatest negative integer -n such that -n-1 cannot be precisely expressed as a double (resp. float)? (May be the same as the above.)
This means that every integer between -n and m has an exact floating-point representation. I'm basically looking for the range [-n, m] for both floats and doubles.
Let's limit the scope to the standard IEEE 754 32-bit and 64-bit floating point representations. I know that the float has 24 bits of precision and the double has 53 bits (both with a hidden leading bit), but due to the intricacies of the floating point representation I'm looking for an authoritative answer for this. Please don't wave your hands!
(An ideal answer would prove that all the integers from 0 to m are expressible, and that m+1 is not.)
Since you're asking about IEEE floating-point types, the language does not matter.
#include <iostream>
using namespace std;
int main(){
    float f0 = 16777215.; // 2^24 - 1
    float f1 = 16777216.; // 2^24
    float f2 = 16777217.; // 2^24 + 1, rounds to 2^24 when stored as a float
    cout << (f0 == f1) << endl; // distinct values
    cout << (f1 == f2) << endl; // same value after rounding
    double d0 = 9007199254740991.; // 2^53 - 1
    double d1 = 9007199254740992.; // 2^53
    double d2 = 9007199254740993.; // 2^53 + 1, rounds to 2^53 when stored as a double
    cout << (d0 == d1) << endl; // distinct values
    cout << (d1 == d2) << endl; // same value after rounding
}
Output:
0
1
0
1
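So the limit for float is 2^24 and the limit for double is 2^53: every integer in [-2^24, 2^24] (resp. [-2^53, 2^53]) is exactly representable, and 2^24 + 1 (resp. 2^53 + 1) is the first positive integer that is not. Negatives are the same, since the only difference is the sign bit.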