Parfor for Python


I am looking for a definitive equivalent of MATLAB's parfor in Python (SciPy, NumPy).

Is there a solution similar to parfor? If not, what makes creating one difficult?

UPDATE: Here is a typical numerical computation that I need to speed up:

import numpy as np

N = 2000
output = np.zeros([N,N])
for i in range(N):
    for j in range(N):
        output[i,j] = HeavyComputationThatIsThreadSafe(i,j)

An example of a heavy computation function is:

import scipy.optimize

def HeavyComputationThatIsThreadSafe(i,j):
    n = i * j

    return scipy.optimize.anneal(lambda x: np.sum((x-np.arange(n)**2)), np.random.random((n,1)))[0][0,0]
5/6/2015 11:15:44 AM

Accepted Answer

There are many Python frameworks for parallel computing. The one I happen to like most is IPython, but I don't know too much about any of the others. In IPython, an analogue to parfor can be found among the constructs in the documentation on quick and easy parallelism.

5/6/2015 4:31:05 PM

The one built into Python is multiprocessing (see its documentation). I always use multiprocessing.Pool with as many workers as processors. Then, whenever I need a for-loop-like structure, I use Pool.imap.

As long as the body of your function does not depend on any previous iteration, you should see near-linear speed-up. This also requires that your inputs and outputs be picklable, but this is pretty easy to ensure for standard types.
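To make the pickling requirement concrete, here is a minimal sketch using only the standard library. The names `square` and `is_picklable` are hypothetical helpers made up for illustration. The idea is that a function defined at module level can normally be sent to worker processes, while a lambda generally cannot, so the function handed to the pool should be a plain named function.

```python
import pickle


def square(x):
    """Module-level function: pickled by name, so workers can look it up."""
    return x * x


def is_picklable(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(is_picklable(square))           # named module-level function
    print(is_picklable(lambda x: x * x))  # anonymous function
    print(is_picklable((3, 4)))           # plain tuple of ints
```

Running a check like this on your function and on a sample input before starting the pool is a cheap way to catch pickling problems early.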

UPDATE: Some code for your updated function just to show how easy it is:

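import numpy as np
from multiprocessing import Pool
from itertools import product

# Fun is assumed to take a single (i, j) tuple, since imap passes one argument per task
output = np.zeros((N,N))
pool = Pool() # defaults to the number of available CPUs
chunksize = 20 # this may take some guessing; take a look at the docs to decide
# imap preserves input order, so ind is the flat index of the (i, j) pair
for ind, res in enumerate(pool.imap(Fun, product(range(N), range(N)), chunksize)):
    output.flat[ind] = res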
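For reference, the same pattern can be written as a complete script. This is only a sketch under some assumptions: `heavy` is a cheap hypothetical stand-in for the real computation, `heavy_from_pair` and `parallel_grid` are made-up helper names, and the result is stored in nested lists so the example needs nothing beyond the standard library. The wrapper exists because `imap` passes exactly one argument per task, and the `__main__` guard keeps worker processes from re-running the pool setup on platforms that start workers by importing the script.

```python
from itertools import product
from multiprocessing import Pool


def heavy(i, j):
    """Hypothetical stand-in for the real thread-safe computation."""
    return sum(k * k for k in range(i + j))


def heavy_from_pair(pair):
    """imap passes one argument per task, so unpack the (i, j) tuple here."""
    i, j = pair
    return heavy(i, j)


def parallel_grid(pool, n, chunksize=20):
    """Fill an n-by-n grid (a list of rows) using an ordered parallel map."""
    output = [[0] * n for _ in range(n)]
    pairs = product(range(n), range(n))
    # imap yields results in input order, so ind maps back to (row, column).
    for ind, res in enumerate(pool.imap(heavy_from_pair, pairs, chunksize)):
        output[ind // n][ind % n] = res
    return output


if __name__ == "__main__":
    with Pool() as pool:  # defaults to the number of available CPUs
        grid = parallel_grid(pool, 50)
    print(grid[3][4])
```

With NumPy, the loop body can instead write into `output.flat[ind]`, as in the snippet above. If the order of results does not matter, `Pool.imap_unordered` is an option, but then each task has to return its own index along with the result.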

Licensed under: CC-BY-SA with attribution