Python Pool Multiprocessing with functions
Okay, I've been playing with some code, partly to get a better understanding of Python and partly to scrape some data from the web. Part of what I want to learn is how to use Python multiprocessing and Pool.
I've got the basics working. However, because I wrote the procedure single-threaded first and then moved it to use Pool to parallelize the process, I have both global variables and calls to globally defined functions. I'm guessing both of these are bad, but when I search the web, things seem to get complicated very fast or don't answer my questions.
Firstly, can someone confirm that global variables are bad and will lead to problems? To me this makes sense, because two threads could access the same variable at the same time, hence problems.
Secondly, if I have a globally defined function that, for the sake of argument, processes a string and returns it using standard string functions, is it okay to call it from within the Pool process?
Multithreading and multiprocessing are quite different when it comes to how variables and functions can be accessed. Separate processes (multiprocessing) have different memory spaces and therefore cannot access the same instances of functions or variables; each process works on its own copy, so the concept of a shared global variable doesn't really exist. Sharing data between processes has to be done via pipes or queues that pass the data for you. Both the main process and the child process can have access to the same queue, though, so in a way you could think of that as a type of global variable.
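As a rough sketch of the queue approach described above, the parent can hand work to a child process through one queue and read results back from another. The names `worker` and `demo` and the doubling step are made up for illustration; real work would replace them.

```python
import multiprocessing as mp


def worker(tasks, results):
    """Child process: read items until the None sentinel, send back results."""
    while True:
        item = tasks.get()
        if item is None:        # sentinel: the parent has no more work
            break
        results.put(item * 2)   # hypothetical stand-in for real processing


def demo(values):
    """Parent side: feed values to one child and collect its answers."""
    tasks = mp.Queue()
    results = mp.Queue()
    child = mp.Process(target=worker, args=(tasks, results))
    child.start()

    for value in values:
        tasks.put(value)
    tasks.put(None)

    # Read the results before joining so the child can flush its queue.
    collected = [results.get() for _ in values]
    child.join()
    return collected


if __name__ == "__main__":
    print(demo([1, 2, 3]))
```

With a single child the results come back in the order the tasks were sent. With several children reading the same queue, the order would no longer be guaranteed.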
With multithreading you can access global variables, and it can be a good way to program if your program is simple. For example, a child thread may read the value of a variable set by the main thread and use it as a flag in the child thread's function. You need to be aware of thread-safe operations, however; as you say, complex operations by multiple threads on the same object can result in conflicts. In that case you need to use thread locking or some other safe method. Many operations are naturally atomic and therefore thread-safe, for instance reading a single variable. There's a list of thread-safe operations and thread syncing on this page.
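For the case where threads do more than read a flag, here is a minimal sketch of thread locking. The names `counter`, `add_many` and `run` are hypothetical; the point is only that a read-modify-write such as `counter += 1` is wrapped in a `threading.Lock`.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with counter_lock:      # only one thread at a time may update counter
            counter += 1


def run(num_threads=4, n=10000):
    """Start num_threads threads and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run())
```

Because every increment happens under the lock, the final value is always `num_threads * n`.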
Generally with multiprocessing and multithreading you have a time-consuming function that you pass to the thread or process, but you won't be rerunning the same instance of that function. Below is an example that shows a valid use case for multiple threads atomically accessing a global variable. The separate processes won't be able to do that.
    import multiprocessing as mp
    import threading
    import time

    work_flag = True

    def worker_func():
        global work_flag
        while True:
            if work_flag:
                # do stuff
                time.sleep(1)
                print(mp.current_process().name, 'working, work_flag =', work_flag)
            else:
                time.sleep(0.1)

    def main():
        global work_flag

        # processes can't access the same "instance" of work_flag!
        process = mp.Process(target=worker_func)
        process.daemon = True
        process.start()

        # threads can safely read the global work_flag
        thread = threading.Thread(target=worker_func)
        thread.daemon = True
        thread.start()

        while True:
            time.sleep(3)
            # changing this flag will stop the thread, but not the process
            work_flag = False

    if __name__ == '__main__':
        main()
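On the second question, a globally defined function that only uses its arguments and returns a value is the normal thing to hand to a Pool. A minimal sketch, assuming a hypothetical string-cleaning function `clean_text` and helper `clean_all`:

```python
import multiprocessing as mp


def clean_text(text):
    """Module-level function that depends only on its argument, not on globals."""
    return text.strip().lower()


def clean_all(texts, workers=2):
    """Run clean_text over texts in a pool of worker processes."""
    with mp.Pool(processes=workers) as pool:
        # map() returns the results in the same order as the inputs
        return pool.map(clean_text, texts)


if __name__ == "__main__":
    print(clean_all(["  Hello ", "WORLD  "]))
```

Pool sends the function to the workers by name, so it has to be defined at the top level of a module rather than inside another function. The `if __name__ == "__main__":` guard keeps worker processes from re-running the startup code when they import the module.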