I am looking for a definitive answer to MATLAB's parfor for Python (Scipy, Numpy).
Is there a solution similar to parfor? If not, what is the complication for creating one?
UPDATE: Here is a typical numerical computation code that I need speeding up
import numpy as np N = 2000 output = np.zeros([N,N]) for i in range(N): for j in range(N): output[i,j] = HeavyComputationThatIsThreadSafe(i,j)
An example of a heavy computation function is:
import scipy.optimize def HeavyComputationThatIsThreadSafe(i,j): n = i * j return scipy.optimize.anneal(lambda x: np.sum((x-np.arange(n)**2)), np.random.random((n,1)))[0,0]
There are many Python frameworks for parallel computing. The one I happen to like most is IPython, but I don't know too much about any of the others. In IPython, one analogue to parfor would be
client.MultiEngineClient.map() or some of the other constructs in the documentation on quick and easy parallelism.
The one built-in to python would be
multiprocessing docs are here. I always use
multiprocessing.Pool with as many workers as processors. Then whenever I need to do a for-loop like structure I use
As long as the body of your function does not depend on any previous iteration then you should have near linear speed-up. This also requires that your inputs and outputs are
pickle-able but this is pretty easy to ensure for standard types.
UPDATE: Some code for your updated function just to show how easy it is:
from multiprocessing import Pool from itertools import product output = np.zeros((N,N)) pool = Pool() #defaults to number of available CPU's chunksize = 20 #this may take some guessing ... take a look at the docs to decide for ind, res in enumerate(pool.imap(Fun, product(xrange(N), xrange(N))), chunksize): output.flat[ind] = res