python - Ignore nested structures in numpy's array creation


I want to write a vlen HDF5 dataset, using h5py.Dataset.write_direct to speed up the process. Suppose I have a list of numpy arrays (e.g. as returned by cv2.findContours), and the dataset:

dataset = h5file.create_dataset('dataset',
                                shape=...,
                                dtype=h5py.special_dtype(vlen='int32'))
# contours is a list of numpy arrays: [numpy array, ...]

To write contours to a destination given by the slice dest, I must first convert contours to a numpy array of numpy arrays:

contours = numpy.array(contours)  # shape=(len(contours),); dtype=object
dataset.write_direct(contours, None, dest)

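But this only works if the numpy arrays in contours have different shapes. If they all share the same shape, numpy builds a regular multidimensional array instead, e.g.: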

contours = [np.zeros((10,), 'int32'), np.zeros((10,), 'int32')]
contours = numpy.array(contours)  # shape=(2,10); dtype='int32'
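The difference between the two cases can be checked directly. This is a minimal sketch that assumes only NumPy; the variable names are illustrative and not part of the original question. Note that newer NumPy releases refuse to build an array from differently shaped inputs unless dtype=object is passed explicitly, so the sketch passes it.

```python
import numpy as np

# Same shape: NumPy stacks the arrays into one 2-D int32 array.
same = [np.zeros((10,), 'int32'), np.zeros((10,), 'int32')]
stacked = np.array(same)
print(stacked.shape, stacked.dtype)  # (2, 10) int32

# Different shapes: with dtype=object the result is a 1-D array of arrays.
ragged = [np.zeros((10,), 'int32'), np.zeros((5,), 'int32')]
as_objects = np.array(ragged, dtype=object)
print(as_objects.shape, as_objects.dtype)  # (2,) object
```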

The question is: how can I tell numpy to create an array of objects?


Possible solutions:

Manual creation:

contours_np = np.empty((len(contours),), dtype=object)
for i, contour in enumerate(contours):
    contours_np[i] = contour

But loops are super slow, so I am using map instead:

# Python 2 only: tuple-unpacking lambda and eager map
map(lambda (i, contour): contours_np.__setitem__(i, contour),
    enumerate(contours))

I have tested a second option, which is twice as fast as the above, but also super ugly:

contours = np.array(contours + [None])[:-1]

Here are the micro benchmarks:

l = [np.random.normal(size=100) for _ in range(1000)]

Option 1:

$ start = time.time(); l_array = np.zeros(shape=(len(l),), dtype='O'); map(lambda (i, c): l_array.__setitem__(i, c), enumerate(l)); end = time.time(); print("%fms" % ((end - start) * 10**3))
0.950098ms

Option 2:

$ start = time.time(); np.array(l + [None])[:-1]; end = time.time(); print("%fms" % ((end - start) * 10**3))
0.409842ms

This looks kind of ugly. Any other suggestions?
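The two options can also be timed with timeit, which repeats each call instead of relying on a single time.time() measurement. The following is only a sketch for Python 3: the function names and the `arrays` variable are hypothetical, map is wrapped in list() because it is lazy there, and dtype=object is passed explicitly in the second option because newer NumPy releases require it for inputs of mixed shape. No timings are claimed here; they depend on the machine and the library versions.

```python
import timeit

import numpy as np

# Hypothetical test data, mirroring the list `l` used in the question.
arrays = [np.random.normal(size=100) for _ in range(1000)]


def fill_with_map(items):
    """Option 1: fill a preallocated object array element by element."""
    out = np.empty((len(items),), dtype=object)
    # map is lazy in Python 3, so list() forces the assignments to run.
    list(map(out.__setitem__, range(len(items)), items))
    return out


def append_none(items):
    """Option 2: a trailing None keeps NumPy from stacking the arrays."""
    return np.array(items + [None], dtype=object)[:-1]


for func in (fill_with_map, append_none):
    seconds = timeit.timeit(lambda: func(arrays), number=100) / 100
    print("%s: %fms" % (func.__name__, seconds * 10**3))
```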

In this version

contours_np = np.empty((len(contours),), dtype=object)
for i, contour in enumerate(contours):
    contours_np[i] = contour

you can replace the loop with a single statement:

contours_np[...] = contours 
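Put together, the suggestion can be tried on the same-shape case from the question. This is a minimal sketch with made-up input data, assuming a reasonably recent NumPy. Because the destination array is one-dimensional with dtype=object, each inner array is stored as a single element rather than being stacked.

```python
import numpy as np

# Two arrays of identical shape, the case that np.array() would stack.
contours = [np.zeros((10,), 'int32'), np.ones((10,), 'int32')]

# Preallocate a 1-D object array, then fill every slot in one statement.
contours_np = np.empty((len(contours),), dtype=object)
contours_np[...] = contours

print(contours_np.shape, contours_np.dtype)  # (2,) object
print(contours_np[1].shape, contours_np[1].dtype)  # (10,) int32
```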
