This article looks at how a 32-decimal-digit literal is stored in float, double, and long double in C/C++.

Question

In a .c file written by someone else, I saw this:

const float c = 0.70710678118654752440084436210485f;

where he wanted to avoid computing sqrt(1/2).

Can this really be stored somehow in plain C/C++, that is, without losing precision? It seems impossible to me.

I am using C++, but I do not believe that the precision difference between the two languages is significant (if there is any), which is why I did not test it in C.

So I wrote these few lines to have a look at the behaviour of the code:

#include <iomanip>
#include <iostream>

int main() {
    std::cout << "Number:    0.70710678118654752440084436210485\n";

    // f suffix: the literal itself is a float
    const float f = 0.70710678118654752440084436210485f;
    std::cout << "float:     " << std::setprecision(32) << f << std::endl;

    const double d = 0.70710678118654752440084436210485; // no f extension
    std::cout << "double:    " << std::setprecision(32) << d << std::endl;

    // float literal stored in a double
    const double df = 0.70710678118654752440084436210485f;
    std::cout << "doublef:   " << std::setprecision(32) << df << std::endl;

    // double literal (no suffix) stored in a long double
    const long double ld = 0.70710678118654752440084436210485;
    std::cout << "l double:  " << std::setprecision(32) << ld << std::endl;

    const long double ldl = 0.70710678118654752440084436210485l; // l suffix!
    std::cout << "l doublel: " << std::setprecision(32) << ldl << std::endl;
}

The output is this:

                   *       ** ***
                   v        v v
Number:    0.70710678118654752440084436210485    // 32 decimal digits
float:     0.707106769084930419921875            // 24 decimal digits
double:    0.70710678118654757273731092936941
doublef:   0.707106769084930419921875            // same as float
l double:  0.70710678118654757273731092936941    // same as double
l doublel: 0.70710678118654752438189403651592    // suffix l

where * marks the last accurate digit of float, ** the last accurate digit of double, and *** the last accurate digit of long double.

The double output has 32 decimal digits because I set the precision of std::cout to that value.

The float output has 24, as expected, as stated here:

float has 24 binary bits of precision, and double has 53.

I would expect the doublef output to be the same as the double output, i.e. that the f suffix would not prevent the number from becoming a double. I think that when I write this:

const double df = 0.70710678118654752440084436210485f;

what happens is that the number first becomes a float and is then stored as a double, so after the 24th decimal digit it has zeroes, and that is why the double precision stops there.

Am I right?

From this answer I found some relevant information:

float x = 0 has an implicit typecast from int to float.
float x = 0.0f does not have such a typecast.
float x = 0.0 has an implicit typecast from double to float.

About __float128: it is not standard, so it is out of the competition. See more here.

Recommended answer

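From the standard:

There are three floating point types: float, double, and long double.
  The type double provides at least as much precision as float, and
  the type long double provides at least as much precision as double. The
  set of values of the type float is a subset of the set of values of the
  type double; the set of values of the type double is a subset of the
  set of values of the type long double. The value representation of
  floating-point types is implementation-defined.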

So you can see the issue with this question: the standard doesn't actually say how precise floats are.

In terms of standard implementations, you need to look at IEEE 754, which means the other two answers from Irineau and Davidmh are perfectly valid approaches to the problem.

As to suffix letters to indicate type, again looking at the standard:

The type of a floating literal is double unless explicitly specified by
  a suffix. The suffixes f and F specify float, the suffixes l and L specify
  long double.

So your attempt to create a long double will just have the same precision as the double literal you are assigning to it unless you use the L suffix.

I understand that some of these answers may not seem satisfactory, but there is a lot of background reading to be done on the relevant standards before you can dismiss answers. This answer is already longer than intended, so I won't try to explain everything here.

And as a final note: since the precision is not clearly defined, why not have a constant that's longer than it needs to be? It seems to make sense to always define a constant that is precise enough to be representable regardless of type.

