Multiprocessing, threading, multithreading
Multiprocessing
Pros
- Separate memory space
- Code is usually straightforward
- Takes advantage of multiple CPUs & cores
- Avoids GIL limitations of CPython
- Eliminates most needs for synchronization primitives unless you use shared memory (instead, it's a communication model for IPC)
- Child processes are interruptible/killable
- Python multiprocessing module includes useful abstractions with an interface much like threading.Thread
- A must with CPython for CPU-bound processing
- Multiprocessing achieves true parallelism, so it is the right tool for CPU-bound tasks (see the Pool sketch after this list)
  - Multithreading cannot achieve this, because the GIL prevents threads from running Python bytecode in parallel
  - Multithreading is concurrent (not parallel) and is used for I/O-bound tasks
Cons
- IPC is a little more complicated and adds overhead (communication model vs. shared memory/objects)
- Larger memory footprint
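Below is a minimal sketch of CPU-bound work spread across cores with multiprocessing.Pool; the function name cpu_heavy and the work sizes are made up for illustration, and Pool.map is just one of several interfaces the module offers.

from multiprocessing import Pool, cpu_count

def cpu_heavy(n):
    # purely CPU-bound work: sum of squares up to n
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    inputs = [10000000] * 4
    # one worker process per core; each call runs in its own interpreter,
    # so one process's GIL does not block the others
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(cpu_heavy, inputs)
    print(results)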
Threading
Pros
- Lightweight - low memory footprint
- Shared memory - makes access to state from another context easier
- Allows you to easily make responsive UIs
- CPython C extension modules that properly release the GIL will run in parallel
- Great option for I/O-bound applications
Cons
- CPython - subject to the GIL
- Not interruptible/killable
- If you don't follow a command-queue/message-pump model (using the queue module), manual use of synchronization primitives becomes a necessity, and you have to decide on the granularity of locking (a minimal lock example follows this list)
- Code is usually harder to understand and to get right - the potential for race conditions increases dramatically
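Below is a minimal sketch of the race-condition problem mentioned above and the locking it forces; the counter, thread count, and iteration count are made up for illustration.

from threading import Thread, Lock

counter = 0
lock = Lock()

def increment(n):
    global counter
    for _ in range(n):
        # without the lock, the read-modify-write of counter from several
        # threads can interleave and lose updates
        with lock:
            counter += 1

threads = [Thread(target=increment, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000 with the lock; often less without it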
[NOTE] The GIL only applies while a thread is executing on the CPU; while one thread is off doing I/O after its CPU work, another thread can run on the CPU at the same time. So for programs that are I/O-heavy rather than CPU-heavy, multithreading alone can give a significant performance gain.
reference: https://monkey3199.github.io/develop/python/2018/12/04/python-pararrel.html
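A minimal sketch of the note above, using time.sleep() to stand in for a blocking I/O call: the total elapsed time stays close to the longest single wait, because the GIL is released while a thread is blocked on I/O.

import time
from threading import Thread

def fake_io(delay):
    time.sleep(delay)  # stands in for a blocking I/O call; the GIL is released while waiting

start = time.time()
threads = [Thread(target=fake_io, args=(1,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('elapsed: {:.2f}s'.format(time.time() - start))  # ~1s, not ~5s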
Template Code
Multi-processing
from multiprocessing import Process, Queue

def add(queue, num1, num2):
    result = num1 + num2
    queue.put(result)

if __name__ == '__main__':  # process-spawning code must be guarded by this
    num_sets = ((1, 1), (2, 2), (3, 3), (4, 4))
    queue = Queue()
    procs = []
    for num_set in num_sets:
        # make sure to put a , (comma) at the end so args stays a tuple
        proc = Process(target=add, args=(queue, num_set[0], num_set[1],))
        procs.append(proc)
        proc.start()

    for p in procs:
        p.join()  # wait until every child process has finished

    # check the results in the queue
    print(queue.qsize())
    for i in range(queue.qsize()):
        print(queue.get())
Single threading
from threading import Thread
import time

def logger(result, fname, delay):
    time.sleep(delay)
    with open(fname, 'w') as f:
        f.write('{}\n'.format(result))
    print('* logging is finished.')

num1 = 1
num2 = 2
result = num1 + num2

# log with a thread
thd = Thread(target=logger, args=(result, 'C:/temp01/logger.txt', 5))
thd.start()

# while the thread is logging, the script goes on
print(result)
Multi threading
from threading import Thread
import time

def logger(result, fname, delay):
    time.sleep(delay)
    with open(fname, 'w') as f:
        f.write('{}\n'.format(result))
    print('* logging is finished.')

num1 = 1
num2 = 2
result = num1 + num2

# log with two threads
thd1 = Thread(target=logger, args=(result, 'C:/temp01/logger1.txt', 3))
thd2 = Thread(target=logger, args=(result, 'C:/temp01/logger2.txt', 5))
thd1.start()
thd2.start()
# call thd1.join() and thd2.join() here if the main script should wait
# until both threads have finished

# while the threads are logging, the script goes on
print(result)
Multi threading: fetching thousands of images from a website
Attached notebook: Get+imgs+on+the+internet+with+multi-threading.ipynb (timing comparison: single threading vs. multi threading with 16 threads)
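The notebook itself is not reproduced here; below is a minimal sketch of the same idea using concurrent.futures.ThreadPoolExecutor. The URL list is a placeholder, and max_workers=16 simply mirrors the thread count in the caption.

from concurrent.futures import ThreadPoolExecutor
import urllib.request

urls = ['https://example.com/img1.jpg', 'https://example.com/img2.jpg']  # placeholder URLs

def fetch(url):
    # each download mostly waits on the network, so threads overlap well
    with urllib.request.urlopen(url) as resp:
        return url, len(resp.read())

with ThreadPoolExecutor(max_workers=16) as executor:
    for url, size in executor.map(fetch, urls):
        print(url, size, 'bytes')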