
Code shows different behaviour when optimized

Published on 2020-04-08 23:37:27

Summary

I tried to write a Monte Carlo simulation that forks into up to as many processes as there are cores. After a certain amount of time the parent sends SIGUSR1 to all children, which should then stop calculating and send their results back to the parent.

When I compile without any optimization (clang thread_stop.c) the behavior is as expected. When I try to optimize the code (clang -O1 thread_stop.c) the signals are caught, but the children do not stop.

Code

I cut the code down to the smallest piece which behaves the same:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>  /* pid_t */
#include <sys/mman.h>   /* mmap */

#define MAX 1           /* Max time to run */

static int a=0; /* int to be changed when signal arrives */

void sig_handler(int signo) {
    if (signo == SIGUSR1){
        a=1;
        printf("signal caught\n");
    }
}

int main(void){

    int * comm;
    pid_t pid;

    /* map to allow child processes access same array */
    comm = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    *comm = 0;
    pid=fork();
    if(pid == 0){ /* child process */
        signal(SIGUSR1, sig_handler); /* catch signal */ 

        do {
            /* do things */
        } while(a == 0);

        printf("Child exit(0)\n");
        *comm = 2;
        exit(0); /* exit for child process */
    } /* if(pid == 0) - code below is parent only */

    printf("Started child process, sleeping %d seconds\n", MAX);
    sleep(MAX);
    printf("Send signal to child\n");
    kill(pid, SIGUSR1); /* send SIGUSR1 */
    while(*comm != 2) usleep(10000);
    printf("Child process ended\n");

/* clean up */

    munmap(comm, sizeof(int));
    return 0;
}

System

clang shows this behavior on Termux (clang 9.0.1) and Lubuntu (clang 6.0.0-lubuntu2).

Questioner
Florian
M.M 2020-02-01 08:59

There are restrictions on what you can do in a signal handler that is called asynchronously. In your code this happens because kill is called from a separate process.

In ISO C the only permitted observable action is to modify a variable of type sig_atomic_t.

In POSIX there is a bit more leniency:

the behavior is undefined if the signal handler refers to any object other than errno with static storage duration other than by assigning a value to an object declared as volatile sig_atomic_t, or if the signal handler calls any function defined in this standard other than one of the functions listed in the following table.

The following table defines a set of functions that shall be async-signal-safe. Therefore, applications can call them, without restriction, from signal-catching functions. Note that, although there is no restriction on the calls themselves, for certain functions there are restrictions on subsequent behavior after the function is called from a signal-catching function (see longjmp).

The printf function is not in the table, so your program causes undefined behaviour when the signal handler is executed (which means unexpected results may follow).
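
If the "signal caught" message is still wanted, one option is write(), which does appear in the POSIX table of async-signal-safe functions. The following is only a minimal sketch: the helper name install_handler is made up for illustration, and the message is kept in a local array so that the handler does not read any object with static storage duration.

```c
#include <signal.h>
#include <unistd.h>

/* Reports the signal with write(2), which is async-signal-safe,
 * instead of printf, which is not. */
static void sig_handler(int signo)
{
    if (signo == SIGUSR1) {
        /* Local (automatic) array: no static-storage object is read here. */
        const char msg[] = "signal caught\n";
        /* sizeof(msg) - 1 leaves out the terminating NUL byte. */
        (void)write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    }
}

/* Hypothetical helper: installs the handler for SIGUSR1.
 * Returns 0 on success, -1 on failure. */
int install_handler(void)
{
    return signal(SIGUSR1, sig_handler) == SIG_ERR ? -1 : 0;
}
```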


So you will need to stop calling printf in the signal handler, and also change a to have type volatile sig_atomic_t.
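
A minimal sketch of those two changes, reusing the names a and sig_handler from the question; the function work_until_sigusr1 is a made-up wrapper around the child's loop. Because the flag is volatile, the compiler has to reload it on every iteration, so the loop cannot be turned into an unconditional infinite loop when optimizing.

```c
#include <signal.h>

/* volatile: must be re-read on every access;
 * sig_atomic_t: can be assigned atomically from a signal handler. */
static volatile sig_atomic_t a = 0;

static void sig_handler(int signo)
{
    if (signo == SIGUSR1)
        a = 1;  /* the handler's only action: set the flag */
}

/* Hypothetical wrapper for the child's work loop.
 * Installs the handler, then loops until SIGUSR1 has been delivered.
 * Returns 0 on success, -1 if the handler could not be installed. */
int work_until_sigusr1(void)
{
    if (signal(SIGUSR1, sig_handler) == SIG_ERR)
        return -1;

    while (a == 0) {
        /* do things */
    }
    return 0;
}
```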

There is also a race condition on the memory location *comm. One process reads it while another may simultaneously write it, with no synchronization. However, I haven't been able to find in the POSIX documentation what the consequences of this are.