Is it possible to generate random numbers from physical sensors?

Problem description


I've heard about people using light sensors, Geiger counters, and other physical sensors to generate random numbers, but I'm skeptical. Is there really a way to generate random numbers from taking measurements of the physical world (using an Arduino or any other microcontroller)? If so, would these numbers ever be really random?

To clarify: the question is about the feasibility of using microcontroller-gathered data to generate random numbers that can be applied soundly to cryptography, as an alternative to relying on a device's entropy.

Solution

Taking analog "real world" measurements is usually a good source of entropy (a.k.a. real random data). Analog sources always have some unpredictable noise superimposed which can be "harvested". However, as was stated before, the measured data is rarely unbiased.

Analog electrical measurements may also be more or less susceptible to uncontrollable influence or even attacks from outside, e.g. by causing saturation of the sensor(s). EMI is also likely to interfere with the measurement; a regular cell phone placed reasonably close to the circuit during a call will most likely wreak havoc on any analog signals.
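As an illustration, a minimal Arduino-style sketch along these lines might sample the noisy least-significant bit of the ADC on a floating input. The pin choice (A0) and the use of the LSB only are assumptions for the example; the raw output is biased and would still need the de-biasing discussed below:

    // Harvest raw bits from ADC noise on a floating analog pin.
    // A0 is assumed to be left unconnected; the least-significant bit
    // of each reading carries the most noise. The raw output is biased
    // and must be post-processed before any serious use.
    const uint8_t NOISE_PIN = A0;

    uint8_t gatherNoiseByte() {
      uint8_t b = 0;
      for (uint8_t i = 0; i < 8; i++) {
        b = (b << 1) | (analogRead(NOISE_PIN) & 1);  // keep only the LSB
      }
      return b;
    }

    void setup() {
      Serial.begin(9600);
    }

    void loop() {
      Serial.println(gatherNoiseByte(), HEX);
      delay(100);
    }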

Bias

Unbiased, uniformly distributed numbers of high entropy are usually what one wants, because their properties (not their values) are somewhat normalized and can therefore be predicted more reliably.

When measuring analog input with, say, 10-bit resolution, ideally the range of numbers gathered from the measurement will cover all 1024 values from 0 to 1023, and each value will occur with the same frequency (or probability) as any other value from that range.

In reality, those values will often be (more or less) normally distributed (Gaussian) around some average value with some characteristic standard deviation, for example around 500 @ 10 bits per sample.
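One simple way to check this for a given source is to bucket a large number of raw samples into a coarse histogram and print it. In the sketch below, pin A0 and the bin count are arbitrary choices for the example; a noisy analog input will typically show a narrow, roughly Gaussian peak rather than a flat line:

    // Bucket raw 10-bit ADC samples into 32 bins of 32 values each and
    // print the histogram to the serial console.
    const uint8_t NOISE_PIN = A0;
    uint32_t bins[32];

    void setup() {
      Serial.begin(9600);
      for (uint16_t i = 0; i < 10000; i++) {
        uint16_t v = analogRead(NOISE_PIN);  // sample: 0..1023
        bins[v >> 5]++;                      // 1024 values / 32 bins
      }
      for (uint8_t i = 0; i < 32; i++) {
        Serial.print(i * 32);                // lower edge of the bin
        Serial.print('\t');
        Serial.println(bins[i]);
      }
    }

    void loop() {}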

De-biasing

So, in order to generate random values with the desired properties (see above), some de-biasing needs to be done: A randomness extractor of some kind is needed.
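The classic example of such an extractor is the von Neumann de-biaser: input bits are read in pairs, a 0/1 pair emits 0, a 1/0 pair emits 1, and 0/0 and 1/1 pairs are discarded. It assumes the input bits are independent with constant bias (adjacent ADC samples may violate this), and it costs at least 75% of the raw bandwidth. A minimal sketch, with getRawBit() standing in for any 1-bit raw source:

    // Von Neumann extractor: removes constant bias from independent
    // input bits by keeping only discordant pairs.
    bool getRawBit() {
      return analogRead(A0) & 1;  // placeholder 1-bit source (biased!)
    }

    uint8_t vonNeumannByte() {
      uint8_t b = 0, n = 0;
      while (n < 8) {
        bool b1 = getRawBit();
        bool b2 = getRawBit();
        if (b1 != b2) {              // discard 0/0 and 1/1 pairs
          b = (b << 1) | (b1 ? 1 : 0);
          n++;
        }
      }
      return b;
    }

    void setup() { Serial.begin(9600); }

    void loop() { Serial.println(vonNeumannByte(), HEX); }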

Using cryptographic functions, like (one-way) hash functions or cipher algorithms, usually produces the desired result easily as well; this comes at a cost though: the "mixing" of those functions is by design extremely strong, which makes it impossible to determine the quality (= randomness) of the source data after the transformation. Even the simplest deterministic sequence of values, like {1,2,3,4,5...}, when hashed produces data that will most likely pass any and all statistical randomness tests, although it is not "random" at all.

Generating entropy on a µC

Timing

In microcontroller environments a good source of entropy that is seldom thought of is the timing of events. Using a high-speed timer, true entropy can be gathered by reading the timer value in response to some asynchronous event. Such an event, which is uncorrelated with the running timer, may be the push of a button by a user, the start of communication initiated by another (sub-)system or IC, or basically any other event not triggered by the µC (or any synchronous subsystem) itself.
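For instance, a minimal sketch might timestamp button presses with the free-running micros() counter and keep only the low bits, which a human cannot consciously control. The pin number, the crude debounce, and the assumption of 4 usable bits per press are illustrative choices:

    // Collect entropy from the timing of asynchronous events.
    // Assumes a push button wired between pin 2 and GND.
    const uint8_t BUTTON_PIN = 2;

    void setup() {
      Serial.begin(9600);
      pinMode(BUTTON_PIN, INPUT_PULLUP);
    }

    void loop() {
      if (digitalRead(BUTTON_PIN) == LOW) {        // button pressed
        uint8_t entropyBits = micros() & 0x0F;     // keep 4 low timer bits
        Serial.println(entropyBits, HEX);
        while (digitalRead(BUTTON_PIN) == LOW) {}  // wait for release
        delay(20);                                 // crude debounce
      }
    }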

In fact, entropy can even be harvested from just two independent clock sources; for instance by timing cycles of one clock via the other clock. This opens a couple of very interesting possibilities on Atmel AVR µCs (which are used in the Arduino) depending on the µC's capabilities:

  • Most AVRs have internal EEPROM memory. Write operations to this memory are timed by a dedicated timer which is independent of the main system clock (edit: note that in some AVRs, ATTiny25/45/85 for example, the EEPROM timing is derived from the internal RC oscillator, so that no entropy can be gathered when this oscillator is also selected as the system clock source); this may depend on the main clock source (internal R/C vs. external crystal/resonator). Therefore, there is some (truly random) jitter to be expected in the time it takes to write to the EEPROM with respect to the main system clock, which again can be measured by a high-speed timer/counter.

  • Newer AVRs have the capability to let the watchdog timer generate a software interrupt instead of a hardware reset. The watchdog timer is by design controlled by its own independent clock source, which yields relative jitter one can measure (see the sketch following this list).

  • Many AVRs have the capability to have a dedicated timer/counter be clocked from an external 32kHz crystal for improved accuracy of real-time measurements. This external crystal is another source of events uncorrelated with the main clock. (Otherwise there would be no use in the extra crystal in the first place...)
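As a concrete illustration of the watchdog technique from the list above, here is a minimal sketch for an ATmega328P-class AVR. The register-level setup follows the datasheet's timed WDT configuration sequence, and the assumption that about 3 low bits per capture are usable is mine, not a measured fact. The WDT runs from its own on-chip ~128 kHz oscillator and fires an interrupt roughly every 16 ms, while Timer1 counts system clock cycles:

    // Harvest jitter between the watchdog oscillator and the main clock.
    #include <avr/interrupt.h>

    volatile uint8_t sample;
    volatile bool    ready = false;

    ISR(WDT_vect) {
      sample = TCNT1 & 0xFF;   // low byte of the free-running Timer1
      ready  = true;
    }

    void setup() {
      Serial.begin(9600);
      TCCR1A = 0;
      TCCR1B = _BV(CS10);             // Timer1: system clock, no prescaler
      cli();
      MCUSR &= ~_BV(WDRF);            // clear watchdog reset flag
      WDTCSR = _BV(WDCE) | _BV(WDE);  // timed sequence to unlock WDT config
      WDTCSR = _BV(WDIE);             // interrupt mode, shortest (~16 ms) period
      sei();
    }

    void loop() {
      if (ready) {
        ready = false;
        Serial.println(sample & 0x07, HEX);  // keep a few low bits only
      }
    }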

The last of these options, timing the 32kHz crystal, seems promising for its potential of relatively high bandwidth: When timing each 32kHz clock cycle with a system timer running significantly faster (a factor of 600+ can be achieved on current AVRs @ 20 MHz!) and conservatively assuming only 1 bit of entropy per measurement, this results in 32000+ bits of entropy per second - far more than a µC will ever consume by itself.
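A sketch of this idea for an ATmega328P/1284-class AVR with a 32.768 kHz watch crystal on the TOSC1/TOSC2 pins follows. To keep it simple it captures on each Timer2 overflow (every 256 crystal cycles, i.e. 128 times per second) rather than on every single cycle, and it assumes only the two lowest bits of each capture are worth keeping:

    // Time Timer2 (clocked asynchronously from a 32 kHz crystal)
    // against Timer1 (clocked from the main system clock).
    #include <avr/interrupt.h>

    volatile uint16_t capture;
    volatile bool     ready = false;

    ISR(TIMER2_OVF_vect) {
      capture = TCNT1;       // sample the fast system-clocked timer
      ready   = true;
    }

    void setup() {
      Serial.begin(9600);
      TCCR1A = 0;
      TCCR1B = _BV(CS10);    // Timer1: system clock, no prescaler
      ASSR   = _BV(AS2);     // Timer2: clock from the 32 kHz crystal
      TCCR2A = 0;
      TCCR2B = _BV(CS20);    // no prescaler: overflow every 256/32768 s
      // wait for the asynchronous register updates to complete
      while (ASSR & (_BV(TCN2UB) | _BV(TCR2AUB) | _BV(TCR2BUB))) {}
      TIMSK2 = _BV(TOIE2);   // enable Timer2 overflow interrupt
      sei();
    }

    void loop() {
      if (ready) {
        noInterrupts();
        uint16_t c = capture;  // 16-bit read must not race the ISR
        ready = false;
        interrupts();
        Serial.println(c & 0x03, HEX);  // assume ~2 useful low bits
      }
    }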

EDIT: Meanwhile, I have conducted some simple tests of the 32kHz timer approach, and the short-term results seem to be of pretty low quality. The upper boundary on the generated entropy per sample seems to be really low, while I have not even tested the samples for non-obvious patterns originating from more or less regular phase shifts. This result may be due to the fact that my DUT had its main clock driven by an external crystal, which may be expected to be (within the precision of the measurements) as stable in frequency as the 32kHz quartz when observed over a limited time range. Extending the time between taking two samples (minutes?) will probably return good entropy, yet at a very low bandwidth. (N.b.: The jitter measured may also be partly due to varying interrupt latency depending on the machine instruction executed right at the time the interrupt is triggered.)

EDIT #2: It appears that the internal RC oscillator of my DUT (ATmega1284) produces significant frequency jitter (several kHz/s); running on this oscillator indeed seems to produce quite a lot of entropy (kbit/s) when timed by the external 32kHz crystal.

In a little experiment I recently investigated the former two methods. On my DUT the EEPROM timing would generally be advantageous over the WDT:

Timing the EEPROM write produced about 4.82 bits of entropy per write operation, while the watchdog timer seems more stable frequency-wise, yielding about 3.92 bits per watchdog timeout. Additionally, the EEPROM's write times seem much more smoothly Gauss-distributed, whereas the WDT's distribution seems somewhat asymmetric and with a lot of aberrations.
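For reference, a measurement of this kind might be arranged as in the sketch below (avr-libc EEPROM routines; address 0, the varying data byte, and raw Timer1 ticks as output are choices made for the example). Keep in mind that EEPROM cells are only rated for roughly 100k write cycles, so a real harvester should not hammer a single address indefinitely:

    // Measure the duration of an EEPROM write in system-clock ticks.
    // The EEPROM programming time comes from a separate on-chip
    // oscillator, so the count jitters relative to the main clock.
    // (Interrupt latency from the Arduino millis() tick adds some
    // non-random jitter on top.)
    #include <avr/eeprom.h>
    #include <avr/io.h>

    uint16_t timeEepromWrite(uint8_t value) {
      uint16_t start = TCNT1;
      eeprom_write_byte((uint8_t*)0, value);  // start a write to address 0
      eeprom_busy_wait();                     // block until it completes
      return TCNT1 - start;                   // duration in Timer1 ticks
    }

    void setup() {
      Serial.begin(9600);
      TCCR1A = 0;
      TCCR1B = _BV(CS10);  // Timer1 free-running at the system clock
    }

    void loop() {
      // vary the data byte so a write is actually performed each time
      Serial.println(timeEepromWrite(micros() & 0xFF));
      delay(500);
    }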

N.b.: Aggregating multiple "random" events for a single entropy measurement may actually degrade the entropy obtained: Fast, random fluctuations in the source may partially compensate each other, yielding result values with lower deviation from the mean value. So, instead of timing, for instance, one real-time second (32k cycles of the RTC crystal), much more entropy can be expected from taking 32k timings (one for each cycle of the crystal) during the same time.

Uninitialized RAM

Avr-gcc compiled applications usually have the whole on-chip RAM cleared to 0x00 before executing user code, i.e. main(). Putting code into an early .init section provides access to the raw uninitialized RAM content before it is overwritten by gcc's initialization routines.

Due to minuscule variances in the RAM's physical storage cells (bits), and depending on some truly random thermal noise (and other effects), not every cell will initialize itself to the same known state when power is (re-)applied to the chip. Combining the contents of the chip's RAM right after power-up with some function can yield significant amounts of entropy to be used later. - The downside of this is that it will only work reliably when power has been turned off for some time and is then turned on again. A normal chip reset, by hardware, software, or external signal, will preserve the RAM's previous content and is therefore not (always) a good source of entropy. However, since the state of the whole system (RAM) at the time of the reset can hardly be predicted in a reasonably complex application, some entropy may be gathered immediately after a reset anyway.
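With avr-gcc (and thus the Arduino toolchain), the simplest way to reach such memory without writing .init-section code is the .noinit section, which the startup routines leave untouched. A minimal illustration follows; the pool size and the XOR/rotate fold are arbitrary choices, and the fold is only a placeholder for a proper randomness extractor:

    // Derive seed material from uninitialized RAM after a power-up.
    // Variables in .noinit are skipped by avr-gcc's startup code, so
    // they hold whatever the SRAM cells settled to at power-on.
    #include <stdint.h>

    uint8_t ramPool[256] __attribute__((section(".noinit")));

    uint32_t foldRamPool() {
      uint32_t h = 0;
      for (uint16_t i = 0; i < sizeof(ramPool); i++) {
        h = (h << 5) | (h >> 27);  // rotate left by 5
        h ^= ramPool[i];           // mix in one raw byte
      }
      return h;
    }

    void setup() {
      uint32_t seed = foldRamPool();  // read before ramPool is reused
      Serial.begin(9600);
      Serial.println(seed, HEX);
    }

    void loop() {}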

Bandwidth requirements

The quality of an entropy source has to be seen in relation to its bandwidth and to the bandwidth at which the application consumes entropy. Some methods of entropy gathering may not yield more than one bit of entropy over several seconds, while others (not really on µCs...) may produce 100 kbit/s or more.

It must be noted that one cannot algorithmically "create" new entropy from existing entropy! - One bit of entropy cannot be computationally transformed to two bits of entropy.

Thus, one cannot (on average) consume more (real) entropy per time unit than what is gathered from the entropy source(s) in the same time.

Proposal

When in need of strong random numbers, it is not uncommon to combine one or more sources of real entropy with a strong PRNG, using the entropy gathered to (re-)seed the PRNG each time new entropy is available.

The PRNG can be used to generate much more basically unpredictable data than the entropy source would actually provide in the same time.

As a special kind of PRNG, cryptographic cipher functions can be used, where entropy is used to initialize and update the cipher's key.

Linux's /dev/urandom is commonly implemented this way.
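The general shape of this pattern, a pool that collects raw entropy and periodically re-seeds a fast generator, might look like the sketch below. The xorshift32 generator and the toy-sized pool are stand-ins chosen only to keep the example short; they are NOT cryptographically secure, and a real design would use a cipher- or hash-based CSPRNG as described above:

    // Entropy-pool + PRNG pattern: raw events are stirred into a pool,
    // and the pool periodically re-seeds the PRNG state.
    #include <stdint.h>

    static uint32_t pool  = 0xDEADBEEF;  // entropy pool (toy-sized)
    static uint32_t state = 1;           // PRNG state, must stay nonzero

    void poolMixIn(uint8_t rawBits) {    // stir raw entropy into the pool
      pool = ((pool << 7) | (pool >> 25)) ^ rawBits;
    }

    void reseed() {                      // transfer pool into PRNG state
      state ^= pool;
      if (state == 0) state = 1;         // xorshift must never hit zero
    }

    uint32_t xorshift32() {              // fast, NON-cryptographic PRNG
      state ^= state << 13;
      state ^= state >> 17;
      state ^= state << 5;
      return state;
    }

    void setup() { Serial.begin(9600); }

    void loop() {
      poolMixIn(analogRead(A0) & 1);     // any raw entropy source will do
      reseed();
      Serial.println(xorshift32(), HEX);
      delay(100);
    }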

Conclusion

As discussed above, it is quite feasible to generate truly random numbers on a common microcontroller. As for all other sources of entropy, one needs to analyze the raw numbers provided by the entropy source(s) for the amount of real entropy they contain and for the amount of entropy generated per time unit to determine if the source is suitable for the use case or not.

The combination of a true entropy source and a strong PRNG is the approach that is commonly implemented and which should be used on a microcontroller as well.

EDIT:

The PRNG approach may not be the best choice for cryptographic key generation. For that, one should only use truly random bits to produce a secure key. Gathering this amount of entropy may take some time (seconds maybe), but since key generation is usually not performed very frequently on a µC this may well be acceptable. (On a heavily loaded server with hundreds or more SSL (HTTPS) connections per second this will be quite another issue...)

To produce quality high entropy bitstreams suitable for key generation a randomness extractor as mentioned above should be employed.

(On the other hand, if the amount of entropy in the source's output can be measured or estimated, one may simply scale the key length by a factor of 1/(entropy per sampled bit) and then use the raw low-entropy data from the entropy source directly to generate this longer key, which will then have the same overall entropy as a fully random key of the original length. For example, at an estimated 0.25 bits of entropy per sampled bit, matching the strength of a 128-bit fully random key requires 128/0.25 = 512 raw bits. Whether this really does the trick depends, however, on how the cipher handles keys of different lengths.)

