matrix - Serialize iterator object to be passed between processes in Python


I have a Python script that calculates the eigenvalues of a list of matrices and inserts these eigenvalues into a collection in the same order as the original matrices, spawning multiple processes to do the work.

Here is the code:

    import time
    import collections
    import numpy as np
    from scipy import linalg as la
    from joblib import Parallel, delayed

    def computeEigenV(unit_of_work):
        current_index = unit_of_work[0]
        current_matrix = unit_of_work[1]

        e_vals, e_vecs = la.eig(current_matrix)
        # sort the eigenvalues and reverse into descending order
        finished_unit = (current_index, np.sort(e_vals)[::-1])
        return finished_unit

    def run(work_list):
        pool = Parallel(n_jobs=-1, verbose=1, pre_dispatch='all')
        results = pool(delayed(computeEigenV)(unit_of_work) for unit_of_work in work_list)
        return results

    if __name__ == '__main__':
        # create the original list of matrices
        original_matrix_list = []
        work_list = []

        # basic set so I can run a test
        for i in range(0, 100):
            # generate a matrix
            matrix = np.random.random_integers(0, 100, (500, 500))
            original_matrix_list.append(matrix)

        # pair each matrix with its index to form a unit of work
        for i, matrix in enumerate(original_matrix_list):
            unit_of_work = [i, matrix]
            work_list.append(unit_of_work)

        work_result = run(work_list)

So work_result should hold the eigenvalues of each matrix after the processes finish. The iterator I am using is unit_of_work, a list containing the index of the matrix (from original_matrix_list) and the matrix itself.

The weird thing is: if I run the code by doing python matrix.py, it works perfectly. But when I use auto (a program for calculations with differential equations) to run the script, typing auto matrix.py gives me the following error:

    Traceback (most recent call last):
      File "matrix.py", line 50, in <module>
        work_result = run(work_list)
      File "matrix.py", line 27, in run
        results = pool(delayed(computeEigenV)(unit_of_work) for unit_of_work in work_list)
      File "/Library/Python/2.7/site-packages/joblib/parallel.py", line 805, in __call__
        while self.dispatch_one_batch(iterator):
      File "/Library/Python/2.7/site-packages/joblib/parallel.py", line 658, in dispatch_one_batch
        tasks = BatchedCalls(itertools.islice(iterator, batch_size))
      File "/Library/Python/2.7/site-packages/joblib/parallel.py", line 69, in __init__
        self.items = list(iterator_slice)
      File "matrix.py", line 27, in <genexpr>
        results = pool(delayed(computeEigenV)(unit_of_work) for unit_of_work in work_list)
      File "/Library/Python/2.7/site-packages/joblib/parallel.py", line 162, in delayed
        pickle.dumps(function)
    TypeError: expected string or Unicode object, NoneType found

Note: to run with auto I had to change if __name__ == '__main__': to if __name__ == '__builtin__':

I looked up the error, and it seems the iterator unit_of_work is not being serialized correctly when passed around between the different processes. I have tried serialized_unit_of_work = pickle.dumps(unit_of_work), passing that around instead, and calling pickle.loads when I need to use the iterator, but I still get the same error.
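For reference, my pickling attempt looked roughly like this (a sketch, not the exact code):

    import pickle

    # serialize the unit of work before handing it off
    serialized_unit_of_work = pickle.dumps(unit_of_work)

    # ...and deserialize it again where it is needed
    index, matrix = pickle.loads(serialized_unit_of_work)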

Can someone please point me in the right direction on how to fix this? I hesitate to use pickle.dump(obj, file[, protocol]) because eventually this will run to calculate the eigenvalues of thousands of matrices, and I don't want to create that many files to store the serialized iterators if possible.

Thanks!! :)

You can't pickle an iterator in Python 2.7 (but you can from 3.4 onward).
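A minimal illustration of that version difference, using a plain built-in list iterator (generators still can't be pickled on either version):

    import pickle

    it = iter([1, 2, 3])

    # Python 2.7: raises "TypeError: can't pickle listiterator objects"
    # Python 3.4+: built-in iterators are picklable, so this round-trips
    data = pickle.dumps(it)
    restored = pickle.loads(data)
    print(list(restored))  # [1, 2, 3]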

Also, pickling works differently in __main__ than when not in __main__, and it would seem that auto is doing something odd with __main__. One thing you can observe: when pickling fails on a particular object, if instead of running the script with the object defined in it directly, you run a main script that imports the portion of the script with the "difficult-to-serialize" object, pickling will succeed. This is because the object then pickles by reference, at the namespace level above where the "difficult" object lives… so it's never directly pickled.

So, you can likely get away with pickling what you want by adding a reference layer… a file import or a class, as sketched below. But if you want to pickle an iterator itself, you are out of luck unless you move to at least Python 3.4.
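A minimal sketch of that reference layer (the module name eigen_worker is hypothetical): move the worker function into its own module, so the main script only ever pickles a reference to it:

    # eigen_worker.py -- holds the "difficult-to-serialize" object
    import numpy as np
    from scipy import linalg as la

    def computeEigenV(unit_of_work):
        current_index, current_matrix = unit_of_work
        e_vals, e_vecs = la.eig(current_matrix)
        return (current_index, np.sort(e_vals)[::-1])

    # matrix.py -- imports the worker, so it pickles by reference
    from joblib import Parallel, delayed
    from eigen_worker import computeEigenV

    def run(work_list):
        pool = Parallel(n_jobs=-1, verbose=1, pre_dispatch='all')
        return pool(delayed(computeEigenV)(u) for u in work_list)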

