This article looks at the question "Why are 8 and 256 such important numbers in computer science?" and the answer it received. It should serve as a useful reference for anyone wondering the same thing; read on to learn more.

Problem Description

I don't know very well about RAM and HDD architecture, or how electronics deals with chunks of memory, but this has always triggered my curiosity: why did we choose to stop at 8 bits for the smallest element in a computer value?

My question may look very dumb, because the answer seems obvious, but I'm not entirely sure...

Is it because 2^3 allows it to fit perfectly when addressing memory? Are electronics especially designed to store chunks of 8 bits? If so, why not use wider words? Is it because 8 divides 32, 64 and 128, so that processor words can be given several of them? Is it just convenient to have 256 values in such a tiny space?

What do you think ?

My question is a little too metaphysical, but I want to make sure it's just an historical reason and not a technological or mathematical reason.

As an aside, I was also thinking about the ASCII standard, in which most of the first characters are useless with stuff like UTF-8; I'm also trying to think of a tinier, faster character encoding...

Solution

Historically, bytes haven't always been 8 bits in size (for that matter, computers don't have to be binary either, but non-binary computing has seen much less action in practice). It is for this reason that IETF and ISO standards often use the term octet - they avoid byte because they don't want to assume it means 8 bits when it might not.

Indeed, when the term byte was coined it was defined as a unit of 1 to 6 bits. Byte sizes in use throughout history include 7, 9 and 36 bits, as well as machines with variable-sized bytes.

Eight won out through a mixture of commercial success, its being a convenient enough number for the people thinking about it (the two would have fed into each other), and no doubt other reasons I'm completely ignorant of.

The ASCII standard you mention assumes a 7-bit byte, and was based on earlier 6-bit communication standards.


Edit: It may be worth adding to this, as some are insisting that those who say bytes are always octets are confusing bytes with words.

An octet is a name given to a unit of 8 bits (from the Latin for eight). If you are using a computer (or, at a higher abstraction level, a programming language) where bytes are 8-bit, then working in octets is easy; otherwise you need some conversion code (or conversion in hardware). The concept of octet comes up more in networking standards than in local computing, because in being architecture-neutral it allows for the creation of standards that can be used for communication between machines with different byte sizes, hence its use in IETF and ISO standards (incidentally, ISO/IEC 10646 uses octet where the Unicode Standard uses byte for what is essentially - with some minor extra restrictions on the latter - the same standard, though the Unicode Standard does spell out that by byte it means octet, even though bytes may be different sizes on different machines). The concept of octet exists precisely because 8-bit bytes are common (hence the choice of using them as the basis of such standards) but not universal (hence the need for another word to avoid ambiguity).

Historically, a byte was the size used to store a character, a matter which in turn builds on practices, standards and de-facto standards which pre-date computers used for telex and other communication methods, starting perhaps with Baudot in 1870 (I don't know of any earlier, but am open to corrections).

This is reflected by the fact that in C and C++ the unit for storing a byte is called char, whose size in bits is defined by CHAR_BIT in the standard limits.h header. Different machines would use 5, 6, 7, 8, 9 or more bits to define a character. These days, of course, we define characters as 21-bit and use different encodings to store them in 8-, 16- or 32-bit units (and ways not authorised by Unicode, like UTF-7, for other sizes), but historically that was the way it was.
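
As a minimal illustration of the above, the C snippet below simply prints what a given implementation reports for CHAR_BIT and the char type; on almost any machine you are likely to build it on today it will report 8 bits, but the language itself only guarantees that CHAR_BIT is at least 8.

/* Report what this implementation uses as its byte: CHAR_BIT comes from
   <limits.h>, and sizeof(char) is 1 by definition, so a C "byte" is
   whatever CHAR_BIT says it is. */
#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT     = %d\n", CHAR_BIT);        /* at least 8 per the standard */
    printf("sizeof(char) = %zu\n", sizeof(char));   /* always 1, by definition */
    printf("char range   = %d..%d\n", CHAR_MIN, CHAR_MAX);
    return 0;
}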

In languages which aim to be more consistent across machines, rather than reflecting the machine architecture, byte tends to be fixed in the language, and these days this generally means it is defined in the language as 8-bit. Given the point in history when they were made, and that most machines now have 8-bit bytes, the distinction is largely moot, though it's not impossible to implement a compiler, run-time, etc. for such languages on machines with different sized bytes, just not as easy.

A word is the "natural" size for a given computer. This is less clearly defined, because it affects a few overlapping concerns that would generally coïncide, but might not. Most registers on a machine will be this size, but some might not. The largest address size would typically be a word, though this may not be the case (the Z80 had an 8-bit byte and a 1-byte word, but allowed some doubling of registers to give some 16-bit support including 16-bit addressing).
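
There is no portable way to ask C directly for the word size, but the pointer and size_t widths usually track it, so a rough probe (imperfect, as the Z80 example shows) might look like the sketch below.

/* Rough probe of the "natural" sizes on this machine/ABI: pointer and
   size_t widths usually track the word size, though they don't have to. */
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

int main(void)
{
    printf("sizeof(void *) = %zu bytes (%zu bits)\n",
           sizeof(void *), sizeof(void *) * (size_t)CHAR_BIT);
    printf("sizeof(size_t) = %zu bytes (%zu bits)\n",
           sizeof(size_t), sizeof(size_t) * (size_t)CHAR_BIT);
    return 0;
}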

Again we see this reflected in C and C++, where int is defined in terms of the word size and long is defined to take advantage of a processor which has a "long word" concept, should such exist, though it may well be identical to int in a given case. The minimum and maximum values are again in the limits.h header. (Indeed, as time has gone on, int may be defined as smaller than the natural word size, as a combination of consistency with what is common elsewhere, reduction in memory usage for an array of ints, and probably other concerns I don't know of.)
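
For example, the following prints the widths and limits.h ranges a particular compiler chose for int and long; on a typical 64-bit Linux target you would usually see a 4-byte int and an 8-byte long, but the standard only pins down minimum ranges.

/* int and long are only required to cover minimum ranges; the actual
   widths and limits are the implementation's choice, exposed via <limits.h>. */
#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("sizeof(int)  = %zu, range %d..%d\n",
           sizeof(int), INT_MIN, INT_MAX);
    printf("sizeof(long) = %zu, range %ld..%ld\n",
           sizeof(long), LONG_MIN, LONG_MAX);
    return 0;
}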

Java and the .NET languages take the approach of defining int and long as fixed across all architectures, and leaving the differences as an issue for the runtime (particularly the JITter) to deal with. Notably though, even in .NET the size of a pointer (in unsafe code) will vary depending on the architecture to match the underlying word size, rather than a language-imposed word size.

Hence octet, byte and word are all quite independent of each other, even though the relationships octet == byte and word == a whole number of bytes (and a round binary number of them, like 2, 4 or 8) are common today.

That concludes this article on why 8 and 256 are so important in computer science. We hope the answer above is helpful - thanks for reading and for your continued support!
