Sorry for the butchered title; I didn't have enough space to ask the full question. Suppose you run a Python program and fork the process using os.fork(). When you fork, it isn't easy to determine when all the forked children have finished. I need a way for the program to know that every fork has completed, because at that point I want to shut the machine down with sudo shutdown -h now. This is because I'll be using gcloud, which costs $5 an hour, and if the process finishes while I'm asleep I don't want to waste $35.
I tried using a global variable: after each fork finished, Python would record in a dictionary that that fork was done. Even though the variable was global, its values were not shared between the forks, so I think global variables are out. The workaround I'm now considering is to periodically read the output of the top command: once a fork completes, its process terminates and no longer shows up in top, so when only one Python process remains, all forks must be done. Figuring that out should take about 30 minutes, so I'd rather not use that method if someone already knows something easier.
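For what it's worth, the global-dictionary approach can't work because os.fork() gives each child a private copy of the parent's memory, so a child's writes are never visible to the parent. A minimal demonstration (the status dictionary here is my own illustration, not your code):

```python
import os

status = {"done": False}  # shared only up until the fork

pid = os.fork()
if pid == 0:
    # Child: this mutates the child's private copy only.
    status["done"] = True
    os._exit(0)

os.waitpid(pid, 0)      # wait for the child to finish
print(status["done"])   # still False in the parent
```

The parent prints False even though the child set the flag, which is exactly the behaviour you observed.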
I should also note that this will run on a Google gcloud virtual instance. So if gcloud already provides a way to determine whether a forked process has completed, please let me know.
If you insist on using os.fork, I believe the best way is to use the signal module to trap SIGCHLD (the signal sent to the parent process whenever a child process terminates). Increment a counter every time you fork; decrement it and check for zero every time you catch SIGCHLD. Since the variable lives only in the main thread of the parent process, there are no synchronisation problems.
A better way, if it works for you, is to use the multiprocessing module instead of os.fork, as it will do a lot of bookkeeping for you (as well as solve a lot of other recurring problems, like interprocess communication).
EDIT: Here's a toy example for the os.fork case.
```python
import os
import sys
import random
import signal
import time

def do_something(i):
    print(f"Process {i} starting")
    time.sleep(random.uniform(1, 10))
    print(f"Process {i} ending")
    sys.exit()

running = 0

for i in range(10):
    if os.fork():  # in parent fork (we got the child PID)
        running += 1
    else:          # in child fork (we got zero)
        do_something(i)

def chld_handler(signo, frame):
    global running
    # Reap every child that has exited; several SIGCHLDs can coalesce
    # into a single delivery, so loop until there is nothing to reap.
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break
        if pid == 0:
            break
        running -= 1
        print(f"A child died. {running} remain.")
    if not running:
        print("All dead. Exiting.")
        sys.exit()

signal.signal(signal.SIGCHLD, chld_handler)

while True:
    time.sleep(1)  # idle until the handler exits for us
```
Thanks for the help, but I can't figure out how to do what you're recommending. Also, I can't use multiprocessing because it does not result in an increase in speed, whereas forking does. So when I use the code pid = os.fork(), what do I do with the pid? Hold on, I think I got it.
What do you mean, multiprocessing does not result in an increase of speed whereas forking does? multiprocessing should do the same thing fork does (if you don't mess up); it just helps you do it safely and easily. Are you confusing multiprocessing with threading? It is very hard to give specific answers, as you posted no minimal reproducible example. Regarding pid: the result of os.fork is always zero in the child and non-zero in the parent, but you don't need it for anything in this question. I'll respond in 90 minutes; on the road now.
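That said, if you do keep the PIDs, there is an even simpler pattern than a SIGCHLD handler: collect each child's PID as you fork, then os.waitpid on each one, which blocks until that child exits. A sketch (the child count and sleep times here are arbitrary):

```python
import os
import random
import time

pids = []
for i in range(5):
    pid = os.fork()
    if pid == 0:
        # Child: do some work, then exit without running the rest.
        time.sleep(random.uniform(0, 0.5))
        os._exit(0)
    pids.append(pid)  # parent remembers each child's PID

# Block until every child has exited, in any order.
for pid in pids:
    os.waitpid(pid, 0)

print("all children finished")  # safe to shut down here
```

This also reaps the children, so no zombies are left behind.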
Actually, looking at my notes, multiprocessing does speed up the program. However, it looks like it would take 30 minutes to rewrite my program using multiprocessing, and I don't need to do that, given that forking already works. My new problem is that I cannot check whether a process exists: I kill the program once, and I should get an error when I kill it a second time, but that's not happening. It doesn't really matter that much, though, since I mainly want to run the program on Linux and right now I'm using a Mac.
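On the "check if a process exists" problem: the usual POSIX trick is to send signal 0 with os.kill, which performs the permission and existence checks without actually delivering a signal. Note that a child that has exited but was never waited on (a zombie) still occupies its PID, so a second kill can succeed rather than raise an error — which may be what you're seeing. A sketch (pid_exists is my own helper name):

```python
import os

def pid_exists(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is sent
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # exists, but owned by another user
    return True

print(pid_exists(os.getpid()))  # the current process certainly exists
```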