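A process is the smallest unit to which the operating system allocates resources. Processes are isolated from one another, and an application usually corresponds to one process. A process can contain multiple threads, and threads within the same process share part of that process's resources. Because processes are isolated, concurrent programming with processes focuses on communication (that is, sharing resources) between them, while multithreaded programming focuses on thread synchronization (coordinated execution).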
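Python offers three start methods for creating child processes: spawn, fork, and forkserver. Which ones are available depends on the operating system, and the resources a child process shares with its parent differ between them.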
The parent process starts a fresh python interpreter process. The child process will only inherit those resources necessary to run the process object’s run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver.
Available on Unix and Windows. The default on Windows and macOS.
import multiprocessing as mp
# ---------------------------------------------------------
# Raises NameError: name 'name' is not defined. The guarded block below is not re-run in the child, so `name` never exists there.
def f():
    print(name)
if __name__ == "__main__":
    mp.set_start_method("spawn")
    name = "123"
    p = mp.Process(target=f)  # `name` is not copied into the child; it has to be passed in as an argument
    p.start()
# ------------------------------------------------------------
# Pass the required resource to the child process explicitly
def f(name):
    print(name)
    print(f"id of name: {id(name)}")
if __name__ == "__main__":
    mp.set_start_method("spawn")
    name = "123"
    p = mp.Process(target=f, args=(name,))
    p.start()
    print(f"parent process's id of name: {id(name)}")  # The arguments are pickled in the parent and rebuilt in the child, so the child works on its own copy; the two ids typically differ.
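The copy semantics noted above can be reproduced without starting a process. The sketch below assumes that passing arguments under spawn amounts to a pickle round trip; `send_like_spawn` is a hypothetical helper, not part of multiprocessing.

```python
import pickle


def send_like_spawn(obj):
    """Round-trip obj through pickle, roughly what happens to Process args under spawn."""
    return pickle.loads(pickle.dumps(obj))


parent_data = {"name": "123", "tags": ["a", "b"]}
child_data = send_like_spawn(parent_data)

# Mutating the rebuilt copy leaves the original untouched,
# just as changes made in a spawned child do not reach the parent.
child_data["tags"].append("c")

print(child_data is parent_data)  # False
print(parent_data["tags"])        # ['a', 'b']
```

Any change the child should report back therefore has to travel through an explicit channel such as a Queue or Pipe.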
# File objects and thread locks cannot be pickled, so they cannot be passed to a spawned child
import threading
def f(x):
    print(x)
if __name__ == "__main__":
    mp.set_start_method("spawn")
    fb = open("test.txt", "wt")
    lock = threading.Lock()
    p1 = mp.Process(target=f, args=(fb,))
    p1.start()  # TypeError: cannot serialize '_io.TextIOWrapper' object (exact wording varies by Python version)
    p2 = mp.Process(target=f, args=(lock,))
    p2.start()  # TypeError: can't pickle _thread.lock objects
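Whether an object can be handed to a spawned child can be checked up front. The following is a minimal sketch, assuming that serializing with `multiprocessing.reduction.ForkingPickler` approximates what `Process.start()` does to its arguments under spawn; `can_send_to_child` is a hypothetical helper name.

```python
import os
import threading
from multiprocessing.reduction import ForkingPickler


def can_send_to_child(obj):
    """Return True if obj survives the pickling step applied to Process arguments."""
    try:
        ForkingPickler.dumps(obj)
        return True
    except Exception:
        # TypeError for files and thread locks; other objects may raise other errors.
        return False


if __name__ == "__main__":
    with open(os.devnull, "w") as fb:
        print(can_send_to_child("123"))             # True
        print(can_send_to_child(fb))                # False
        print(can_send_to_child(threading.Lock()))  # False
```

Some multiprocessing objects (for example its own locks) can only be pickled while a process is actually being spawned, so this check may be stricter than the real start-up path.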
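Under spawn, child processes must be created under the “if __name__ == "__main__":” guard.
On Windows, calling Flask's run method with the multi-process model raises “ValueError: Your platform does not support forking.”, which indicates that Flask's multi-process mode relies on fork, a mode Windows does not support.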
The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.
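Available on Unix only. The default on Unix, except on macOS, where spawn has been the default since Python 3.8; starting with Python 3.14 the default on other POSIX platforms is forkserver.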
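import multiprocessing as mp
import time
def f():
    time.sleep(30)
if __name__ == "__main__":
    mp.set_start_method("fork")
    p2 = mp.Process(target=f)
    p2.start()
    time.sleep(60)  # `ps -ef | grep python` shows two Python interpreters during the first 30 seconds and only one during the last 30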
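The child receives a copy of all of the parent process's resources.
Objects such as file objects and thread locks can be passed as arguments to the child process's run method.
Child processes can be created anywhere in the program.
If the parent process is multithreaded, fork is not safe. For this reason Flask supports only multiple single-threaded processes or a single multithreaded process.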
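The first point above can be illustrated by letting a forked child report a global that the parent only sets at runtime. This is a sketch under the assumption that the platform offers fork; `report` and `child_view_of_global` are hypothetical names, and the function returns None where fork is unavailable.

```python
import multiprocessing as mp

name = None  # assigned at runtime in the parent, never at import time


def report(conn):
    # Runs in the child: send back whatever the child sees in the global.
    conn.send(name)
    conn.close()


def child_view_of_global(value):
    """Set the global in the parent, fork, and return the value the child saw."""
    global name
    if "fork" not in mp.get_all_start_methods():
        return None  # e.g. Windows, where fork does not exist
    name = value
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    p = ctx.Process(target=report, args=(child_conn,))
    p.start()
    child_conn.close()  # the parent keeps only its own end of the pipe
    seen = parent_conn.recv()
    p.join()
    return seen


if __name__ == "__main__":
    print(child_view_of_global("123"))
```

With spawn the same child would see `name` as None, because the module is re-imported there instead of being copied from the parent's memory.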
os.fork returns a process id, but the return value is special: in the parent process it is the child's process id, while in the child process it is 0.
import os
import time
pid = os.fork()  # Unix only
# Everything below runs in both the parent and the child
print(f"pid: {pid}")
print("Bobby")
if pid == 0:
    print(f"child process: {os.getpid()}, parent process: {os.getppid()}")
else:
    print(f"parent process: {os.getpid()}, child process: {pid}")
time.sleep(2)
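The two return values can also be captured in a single place by having the child send its own value back over a pipe. This is a minimal Unix-only sketch; `fork_return_values` is a hypothetical helper, and the `os.waitpid` call is an addition that reaps the child once it has exited.

```python
import os


def fork_return_values():
    """Return (value os.fork() gave the parent, value os.fork() gave the child)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report the value it received from os.fork(), then exit immediately.
        os.close(read_fd)
        os.write(write_fd, str(pid).encode())
        os._exit(0)
    # Parent: read the child's report and wait for the child to finish.
    os.close(write_fd)
    seen_by_child = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return pid, seen_by_child


if __name__ == "__main__":
    parent_value, child_value = fork_return_values()
    print(parent_value > 0)  # True: the parent received the child's pid
    print(child_value)       # 0
```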
When the program starts and selects the forkserver start method, a server process is started. From then on, whenever a new process is needed, the parent process connects to the server and requests that it fork a new process. The fork server process is single threaded so it is safe for it to use os.fork(). No unnecessary resources are inherited.
Available on Unix platforms which support passing file descriptors over Unix pipes.
Windows supports only spawn; Unix supports fork, spawn, and (on some systems) forkserver. Call “multiprocessing.set_start_method” at most once, under the “if __name__ == "__main__":” guard of the project's main module.
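The "at most once" rule can be observed directly: a second call to `set_start_method` raises RuntimeError. The sketch below wraps the call in a hypothetical helper, `try_set_start_method`, and also lists the start methods the current platform offers.

```python
import multiprocessing as mp


def try_set_start_method(method):
    """Return True if the global start method was set, False if it was already fixed."""
    try:
        mp.set_start_method(method)
        return True
    except RuntimeError:
        return False


if __name__ == "__main__":
    print(mp.get_all_start_methods())     # platform dependent
    print(try_set_start_method("spawn"))
    print(try_set_start_method("spawn"))  # False: the second call is rejected
```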
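multiprocessing.get_context returns a context object that exposes the same interface as the multiprocessing module, which makes it possible to use a specific start method without changing the global one.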
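A context object can be used wherever the module itself would be used. The following sketch runs a small job under whichever start method is requested; `square` and `run_with` are hypothetical names chosen for illustration.

```python
import multiprocessing as mp


def square(queue, x):
    # Runs in the child: put the result on the queue for the parent.
    queue.put(x * x)


def run_with(method, x):
    """Compute x * x in a child process created with the given start method."""
    ctx = mp.get_context(method)  # has Process, Queue, Pipe, ... like the module
    queue = ctx.Queue()
    p = ctx.Process(target=square, args=(queue, x))
    p.start()
    result = queue.get()
    p.join()
    return result
```

Saved in a script and called under the “if __name__ == "__main__":” guard, `run_with("spawn", 7)` should return 49, and the same holds for "fork" or "forkserver" where the platform supports them. The global start method is left untouched.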