sem_post(), signal handlers, and undefined behavior

Question

Does this use of sem_post() in a signal handler rely on undefined behavior?

/*
 * excerpted from the 2017-09-15 Linux man page for sem_wait(3)
 * http://man7.org/linux/man-pages/man3/sem_wait.3.html
 */
...
sem_t sem;
...
static void
handler(int sig)
{
    write(STDOUT_FILENO, "sem_post() from handler\n", 24);
    if (sem_post(&sem) == -1) {
        write(STDERR_FILENO, "sem_post() failed\n", 18);
        _exit(EXIT_FAILURE);
    }
}

The semaphore sem has static storage duration. While the call to sem_post() is async-signal-safe, the POSIX.1-2008 treatment of signal actions seems to disallow referencing that semaphore itself.

Recommended answer

Technically, yes; there are situations where the behaviour is undefined.

I use this pattern quite a lot myself, and so do almost all the signal-aware programs I've looked at. It is expected to work in practice, and to be portable across systems, even if no standard dictates it.

The POSIX.1 standard defines it as Undefined Behaviour, not because it expects programs to avoid such access, but because defining the safe access situations would be too complicated and could limit future implementations, for little to no gain: there is a well-known workaround for all such accesses, namely a dedicated thread catching the signals.

Added on 2018-06-21:

Let's first summarize the cases where the sem_post(&sem) access is valid in a signal handler (i.e., where one can refer to objects with static storage duration, for example via any async-signal-safe function), based on POSIX.1-2018:

  • When the process has only one thread, the signal handler is executed as a result of a thread in that same process calling abort(), raise(), kill(), pthread_kill(), or sigqueue(), and the signal is/was not blocked in the thread that was used to execute the handler.

  • When the process has only one thread, the signal was blocked when it became pending, and it was delivered before the call that unblocked the signal returns.
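The second case can be sketched as follows. This is a minimal illustration, not part of the original answer: the helper name blocked_delivery_demo() and the choice of SIGUSR1 are assumptions, and the code assumes it runs in a single-threaded process on a system where unnamed semaphores (sem_init()) are supported.

```c
#define _POSIX_C_SOURCE 200809L
#include <semaphore.h>
#include <signal.h>
#include <string.h>

static sem_t sem;   /* static storage duration, as in the question */

static void handler(int sig)
{
    (void)sig;
    sem_post(&sem);  /* async-signal-safe call on a static object */
}

/* Hypothetical helper: returns 1 if the handler ran only once the signal
 * was unblocked, 0 otherwise, and -1 if setup failed. */
int blocked_delivery_demo(void)
{
    struct sigaction sa;
    sigset_t set;
    int before, after;

    if (sem_init(&sem, 0, 0) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &set, NULL) == -1)
        return -1;

    /* The signal is generated while blocked, so it only becomes pending. */
    if (raise(SIGUSR1) != 0)
        return -1;
    before = (sem_trywait(&sem) == 0);

    /* The pending signal is delivered before this call returns. */
    if (sigprocmask(SIG_UNBLOCK, &set, NULL) == -1)
        return -1;
    after = (sem_trywait(&sem) == 0);

    sem_destroy(&sem);
    return (!before && after) ? 1 : 0;
}
```

Calling blocked_delivery_demo() from main() of a single-threaded program exercises exactly the situation in the second bullet; in a multithreaded process the same code falls outside the listed cases.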

This leaves out the most common cases: multithreaded processes, and handlers for signals generated externally to the process (for example, SIGINT when the process runs in the foreground and the user presses Ctrl+C, or SIGHUP when the session the process is running in is closed).

My understanding of the situation is that everybody expects the following: that signal handlers which refer to objects with static storage duration via async-signal-safe functions will not trigger undefined behaviour on any sane POSIXy architecture; that multithread-safe (MT-safe) async-signal-safe functions used on objects with static storage duration work exactly the same in a multithreaded process as in a single-threaded one; that signals triggered by alarm(), setitimer(), and timer_settime() behave the same as those triggered by raise() or sigqueue(); and that signals sent by other processes behave the same as those triggered by raise() or sigqueue() in the target process, the only difference being that some fields in the siginfo structure have different values.

There is even a small possibility that the wording should have been "accesses" instead of "refers to". That would indeed allow passing the address of any object with static storage duration to async-signal-safe functions like sem_post(), even in multithreaded processes, as Carlo Wood's answer posits.

However, I believe that the reason for this wording is more subtle, and involves differences in hardware implementations regarding concurrent accesses and the contexts signal handlers are executed in: the behaviour in cases where some POSIX OSes might behave differently was too complicated to standardize, so it was simply left Undefined instead.

The rest of my answer attempts to describe those, for developers who wish to produce reliable, robust programs that work on all POSIXy systems, and who do not understand the subtleties of the current wording in the POSIX.1 spec.

The issue of exactly which objects a signal handler can access safely is complex. Rather than open up the whole can of worms, the POSIX standard drafters just punted on it, and declared the behaviour undefined.

The hardest part to define would be the details related to concurrent access and trap representations, not just by other threads in the same process, but also by the kernel. (Because we are considering only objects with static storage duration, we can avoid shared memory and all the associated complexity there.) In particular, if an object has trap representations, and the object is modified non-atomically, it is possible that the intermediate stages of the assignment cause a trap. That trap itself may cause a signal to be raised, although there may be hardware limitations on some architectures.

So, anything related to trap representations is basically too complicated to resolve in the standard.

Okay, let's assume the standard would limit safe read access to objects with static storage duration that are not being concurrently modified by the interrupted thread, any other thread in the process, or the kernel; and safe write access to objects with static storage duration that are not being concurrently read or modified by the interrupted thread, any other thread in the process, or the kernel. Let's also assume that the object being accessed has no trap representations at all.

We still have a few hardware-specific signals to consider: SIGSEGV, SIGBUS, SIGILL, and SIGFPE at least. Unfortunately, some architectures may have additional signals not known at this time, so we'd need to define the type of signal affected: signals that are raised by the kernel when memory is accessed (SIGFPE only if the architecture raises it when loading the value, and not just when doing arithmetic etc. on such values). If the access to an object with static storage duration may raise one of these signals, then the access is not safe, as it can lead to a cascade of signal handlers. (Because standard POSIX signals are not queued, only the first signal of each kind gets to execute, and the process state can be lost, forcing the kernel to kill the process.)

From the POSIX C compiler point of view, the entire situation gets much more complicated if you consider a signal handler that obtains a pointer as payload (si_value.sival_ptr in the siginfo_t): does the access lead to Undefined Behaviour depending on whether the target has static storage duration or not?
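To make the payload case concrete, here is a minimal sketch in which a handler receives a pointer through si_value.sival_ptr and posts a semaphore that has automatic, not static, storage duration. The names payload_handler() and payload_demo() and the use of SIGRTMIN are assumptions made for illustration, and the sketch assumes a single-threaded process with the signal unblocked, so that the signal is delivered before sigqueue() returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <semaphore.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* The handler never names a static object; it only follows the payload. */
static void payload_handler(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    sem_post((sem_t *)info->si_value.sival_ptr);
}

/* Hypothetical helper: returns 1 if the handler posted the local
 * semaphore, 0 if it did not, and -1 if setup failed. */
int payload_demo(void)
{
    sem_t local;              /* automatic storage duration */
    struct sigaction sa;
    union sigval value;
    int posted;

    if (sem_init(&local, 0, 0) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = payload_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGRTMIN, &sa, NULL) == -1)
        return -1;

    /* Queue the signal to this process with the pointer as payload. */
    value.sival_ptr = &local;
    if (sigqueue(getpid(), SIGRTMIN, value) == -1)
        return -1;

    posted = (sem_trywait(&local) == 0);
    sem_destroy(&local);
    return posted;
}
```

Whether the wording about objects with static storage duration covers this access at all is exactly the open question raised above; the sketch only shows what such a handler looks like.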

On all current POSIXy systems, accessing an object with static storage duration is safe in a POSIX realtime signal handler, or in a handler for a POSIX signal that was not raised by a memory access, provided that the access goes through atomic built-ins, or that the object is not being read or modified by any other thread or the kernel and its intermediate storage forms do not cause a signal to be raised. This is likely, but not guaranteed, to remain true in the future. That is at the core of why the POSIX standard does not standardize it.
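As an illustration of the atomic route, the sketch below counts signals in a static object that the handler touches only through atomic operations. It uses C11 <stdatomic.h> rather than compiler-specific built-ins, which is an assumption on my part, as are the names count_handler() and atomic_count_demo(); it also assumes atomic_int is lock-free on the target and that the process is single-threaded with SIGUSR1 unblocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

/* Static object that the handler only ever touches atomically. */
static atomic_int hits;

static void count_handler(int sig)
{
    (void)sig;
    atomic_fetch_add(&hits, 1);
}

/* Hypothetical helper: raises SIGUSR1 n times and returns how many
 * deliveries the handler counted, or -1 if setup failed. */
int atomic_count_demo(int n)
{
    struct sigaction sa;
    int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    atomic_store(&hits, 0);
    for (i = 0; i < n; i++) {
        /* With the signal unblocked, raise() does not return until
         * the handler has returned. */
        if (raise(SIGUSR1) != 0)
            return -1;
    }
    return atomic_load(&hits);
}
```

This works on current systems for the reasons given above, but POSIX itself does not promise it.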

The cold fact is, there is a POSIX-compliant workaround for all the patterns requiring access to an object with static storage duration: a separate thread, dedicated to handling signals via sigwaitinfo(), with all those signals blocked in all other threads. That thread is not limited to using async-signal-safe functions, nor do the other signal handler limitations apply to it. (If we consider the interaction between signal delivery and the code it interrupts, even with handlers installed with the SA_RESTART flag, one could argue that the thread-based approach is the better of the two.)
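A minimal sketch of that workaround follows. The names signal_thread() and signal_thread_demo(), and the use of SIGUSR1 to post and SIGTERM to stop the thread, are assumptions chosen for illustration. The signals are blocked before the thread is created so that every thread inherits the mask, and the code assumes no other thread in the process has them unblocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static sem_t sem;
static sigset_t handled;   /* signals reserved for the dedicated thread */

/* Runs as ordinary thread code, so it is not restricted to
 * async-signal-safe functions. */
static void *signal_thread(void *arg)
{
    (void)arg;
    for (;;) {
        siginfo_t info;
        int sig = sigwaitinfo(&handled, &info);
        if (sig == -1)
            continue;          /* interrupted; wait again */
        if (sig == SIGUSR1)
            sem_post(&sem);
        else if (sig == SIGTERM)
            return NULL;       /* hypothetical shutdown request */
    }
}

/* Hypothetical helper: returns 1 once the dedicated thread has posted
 * the semaphore and been joined, -1 if setup failed. */
int signal_thread_demo(void)
{
    pthread_t tid;

    if (sem_init(&sem, 0, 0) == -1)
        return -1;

    sigemptyset(&handled);
    sigaddset(&handled, SIGUSR1);
    sigaddset(&handled, SIGTERM);

    /* Block first, so the new thread inherits the blocked mask. */
    if (pthread_sigmask(SIG_BLOCK, &handled, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_thread, NULL) != 0)
        return -1;

    /* Process-directed signal; only sigwaitinfo() will accept it. */
    if (kill(getpid(), SIGUSR1) == -1)
        return -1;
    while (sem_wait(&sem) == -1 && errno == EINTR)
        ;                      /* retry if interrupted */

    if (kill(getpid(), SIGTERM) == -1)
        return -1;
    pthread_join(tid, NULL);
    sem_destroy(&sem);
    return 1;
}
```

Because no signal handler is installed at all, none of the restrictions on what a handler may refer to come into play.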

Simply put: because known workarounds exist, and defining the safe access cases would be too complicated and would limit future implementations, the POSIX standard does not standardize this traditional use case at all. That is not because it is expected not to work. Quite the opposite: it works fine on all current POSIXy systems. It is because defining the safe access cases (other than errno and volatile sig_atomic_t, which both require and have support from POSIX C compilers) is not worth the complexity and possible limitations.
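For comparison, the one static-object access that the standard does sanction looks like this. It is a minimal sketch: the names flag_handler() and flag_demo() and the choice of SIGUSR1 are assumptions, and it assumes a single-threaded process with the signal unblocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Assigning to a volatile sig_atomic_t is the sanctioned way for a
 * handler to touch an object with static storage duration. */
static volatile sig_atomic_t got_signal;

static void flag_handler(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Hypothetical helper: returns 1 if the handler set the flag,
 * 0 if it did not, and -1 if setup failed. */
int flag_demo(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = flag_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    got_signal = 0;
    if (raise(SIGUSR1) != 0)
        return -1;
    return got_signal;
}
```

The main program then polls the flag and does the real work, such as calling sem_post(), outside the handler.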
