Precision浮点数据转换为预期值

Precision浮点数据转换为预期值

本文介绍了IBM Single Precision浮点数据转换为预期值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要从二进制文件中读取值,数据格式为IBM单精度浮点数(4字节十六进制指数数据),并将该值用作十进制数.我有从文件读取并取出每个字节并像这样存储的C ++代码

I need to read values from a binary file, the data format is IBM single Precision Floating Point (4-byte Hexadecimal Exponent Data) and use the value as a decimal number. I have C++ code that reads from the file and takes out each byte and stores it like so

 unsigned char buf[BUF_LEN];

        for (long position = 0; position < fileLength; position += BUF_LEN) {
            file.read((char* )(&buf[0]), BUF_LEN);

           // printf("\n%8ld:  ", pos);

            for (int byte = 0; byte < BUF_LEN; byte++) {
               // printf(" 0x%-2x", buf[byte]);
            }
        }

这会打印出每个字节的十六进制值.

This prints out the hexadecimal values of each byte.

此图指定了IBM单精度浮点数 IBM单精度浮点数我不明白什么是24位正二进制分数.我确实知道如何在hex dec binary之间进行转换,所以我的基本理解是将所有q都当作一个非常长的二进制段,将Q24(2)^(23)用作最大的二进制段.值,同时将所有前面的值加在一起,然后将数字乘以10 ^ -24.但是我的直觉告诉我这是错误的.弄清什么是小数点或最高有效位将是有帮助的.

this picture specifies IBM single precision floating pointIBM single precision floating pointI do not understand what a 24-bit positive binary fraction is. I do know how to convert between hex<->dec<->binary so my basic understanding would be to take all the q's and treat them as one very long binary segment that would use Q24(2)^(23) as the largest value while adding all the preceding values together, and then just times the number by 10^-24 . But my intuition tells me this is wrong. Clarification on what a radix point or MSB is would be helpful.

推荐答案

该格式实际上非常简单,并且与IEEE 754 binary32格式没有什么特别的区别(实际上更简单,不支持任何魔术" NaN/Inf值,并且没有次正规数,因为此处的尾数在左侧具有隐式0,而不是隐式1).

The format is actually quite simple, and not particularly different than IEEE 754 binary32 format (it's actually simpler, not supporting any of the "magic" NaN/Inf values, and having no subnormal numbers, because the mantissa here has an implicit 0 on the left instead of an implicit 1).

维基百科所述,

如果我们假设您读取的字节位于uint8_t b[4]中,则结果值应类似于:

If we imagine that the bytes you read are in a uint8_t b[4], then the resulting value should be something like:

uint32_t mantissa = (b[1]<<16) | (b[2]<<8) | b[3];
int exponent = (b[0] & 127) - 64;
double ret = mantissa * exp2(-24 + 4*exponent);
if(b[0] & 128) ret *= -1.;

请注意,这里我在double中计算了结果,因为IEEE 754 float的范围不足以表示相同大小的IBM单精度值(也相反).另外,请记住,由于字节顺序问题,您可能必须还原上述代码中的索引.

Notice that here I calculated the result in a double, as the range of a IEEE 754 float is not enough to represent the same-sized IBM single precision value (also the opposite holds). Also, keep in mind that, due to endian issues, you may have to revert the indexes in my code above.

编辑: @Eric Postpischil 正确地指出,如果您有C99或POSIX 2001,则应使用ldexp(mantissa, -24 + 4*exponent)而不是mantissa * exp2(-24 + 4*exponent).在各个实现中更加精确(甚至可能更快).

Edit: @Eric Postpischil correctly points out that, if you have C99 or POSIX 2001 available, instead of mantissa * exp2(-24 + 4*exponent) you should use ldexp(mantissa, -24 + 4*exponent), which should be more precise (and possibly faster) across implementations.

这篇关于IBM Single Precision浮点数据转换为预期值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-15 18:12