Slow data analysis code can be a real drag. There are numerous ways to accelerate bottleneck NumPy code, such as compiling expressions with NumExpr or Pythran. However, if you are calling a third-party module, you may not be able to use these approaches. In that case, your best option might be to loop through the array in parallel, calling a function on each iteration.
Trying to get your head around libraries for parallel processing in Python can be bewildering; there are so many to get to grips with. In this repository I've set out an example notebook, numpyParallelSimple.ipynb, demonstrating some of the best libraries for this task.
If you don't want to run the notebook, you can test the core idea in a Python kernel that runs in your browser using WebAssembly!
```python
import numpy as np
from joblib import Parallel, delayed

xyLength = 3
timesteps = 5
arr = np.random.standard_normal(size=(xyLength, xyLength, timesteps))

# Return the transformed 2D slice along with its time index,
# so results can be sorted back into order after parallel execution
def timestepFunc(array2D, timeIndex):
    return np.exp(array2D), timeIndex
```
The joblib function is then:
```python
def joblibProcessing(arr: np.ndarray, backend="threading", nJobs: int = -1):
    # Iterate through the third dimension of the array in parallel
    resultList = Parallel(backend=backend, n_jobs=nJobs)(
        delayed(timestepFunc)(arr[:, :, timestep], timestep)
        for timestep in range(arr.shape[2])
    )
    # Sort the results back into their original order using the returned time index
    resultList = sorted(resultList, key=lambda x: x[1])
    resultList = [el[0] for el in resultList]
    # Convert the list of 2D results back into a three-dimensional numpy array
    return np.stack(resultList, axis=2)

# Run the function with threading and check that the outputs are
# the same as for the serial processing
outputJoblib = joblibProcessing(arr=arr)
```
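The comment above mentions checking against serial processing, which isn't shown. A minimal serial baseline might look like the sketch below (`serialProcessing` is an illustrative name, not something from the repository):

```python
import numpy as np

def timestepFunc(array2D, timeIndex):
    return np.exp(array2D), timeIndex

def serialProcessing(arr: np.ndarray):
    # Apply timestepFunc to each 2D slice in order, keeping only the arrays
    resultList = [timestepFunc(arr[:, :, t], t)[0] for t in range(arr.shape[2])]
    # Convert the list of 2D results back into a three-dimensional numpy array
    return np.stack(resultList, axis=2)

arr = np.random.standard_normal(size=(3, 3, 5))
outputSerial = serialProcessing(arr)

# The parallel output can then be verified against this baseline, e.g.:
# assert np.allclose(outputSerial, joblibProcessing(arr=arr))
```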
You can then test the performance for different backends:
```python
%timeit -n 1 -r 1 joblibProcessing(arr=arr, backend="threading")
%timeit -n 1 -r 1 joblibProcessing(arr=arr, backend="multiprocessing")
```
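Among the many libraries mentioned earlier, the standard library's `concurrent.futures` offers a similar thread pool without the joblib dependency. A sketch of the same loop (`futuresProcessing` is an illustrative name):

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def timestepFunc(array2D, timeIndex):
    return np.exp(array2D), timeIndex

def futuresProcessing(arr: np.ndarray, maxWorkers=None):
    # executor.map preserves input order, so no re-sorting is needed
    with ThreadPoolExecutor(max_workers=maxWorkers) as executor:
        results = list(executor.map(
            timestepFunc,
            (arr[:, :, t] for t in range(arr.shape[2])),
            range(arr.shape[2]),
        ))
    # Keep only the transformed 2D slices and stack them back into 3D
    return np.stack([el[0] for el in results], axis=2)

arr = np.random.standard_normal(size=(3, 3, 5))
outputFutures = futuresProcessing(arr)
```

As with joblib's `"threading"` backend, this only helps if `timestepFunc` releases the GIL (as many NumPy operations do) or is I/O-bound.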