python - Ignore nested structures in numpy's array creation
I want to write a vlen HDF5 dataset, using h5py.Dataset.write_direct to speed up the process. Suppose I have a list of numpy arrays (e.g. as returned by cv2.findContours) and the dataset:
    dataset = h5file.create_dataset('dataset',
                                    shape=...,
                                    dtype=h5py.special_dtype(vlen='int32'))
    contours = [numpy array, ...]

To write contours to a destination given by a slice dest, I must first convert contours to a numpy array of numpy arrays:
    contours = numpy.array(contours)  # shape=(len(contours),); dtype=object
    dataset.write_direct(contours, None, dest)

But this only works if the numpy arrays in contours have different shapes. If they all share a shape, numpy builds an ordinary multi-dimensional array instead, e.g.:
    contours = [np.zeros((10,), 'int32'), np.zeros((10,), 'int32')]
    contours = numpy.array(contours)  # shape=(2,10); dtype='int32'

The question is: how can I tell numpy to create an array of objects?
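To make the problem concrete, here is a small sketch (plain NumPy, no h5py needed) of how np.array treats a list of equally shaped arrays. The variable names same_shape and stacked are only for illustration:

```python
import numpy as np

# Two contours that happen to have the same length.
same_shape = [np.zeros((10,), 'int32'), np.zeros((10,), 'int32')]

# NumPy stacks them into one regular 2-D integer array
# instead of a 1-D array holding two array objects.
stacked = np.array(same_shape)
print(stacked.shape, stacked.dtype)  # prints: (2, 10) int32
```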
Possible solutions:
Manual creation:
    contours_np = np.empty((len(contours),), dtype=object)
    for i, contour in enumerate(contours):
        contours_np[i] = contour

But loops are super slow, so using map:
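    # Python 2 only: tuple-unpacking lambda parameters were removed in Python 3
    map(lambda (i, contour): contours_np.__setitem__(i, contour),
        enumerate(contours))

I have also tested a second option, which is twice as fast as the above but also super ugly: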
    contours = np.array(contours + [None])[:-1]

Here are some micro benchmarks:
    l = [np.random.normal(size=100) for _ in range(1000)]

Option 1:
    $ start = time.time(); l_array = np.zeros(shape=(len(l),), dtype='O'); map(lambda (i, c): l_array.__setitem__(i, c), enumerate(l)); end = time.time(); print("%fms" % ((end - start) * 10**3))
    0.950098ms

Option 2:
    $ start = time.time(); np.array(l + [None])[:-1]; end = time.time(); print("%fms" % ((end - start) * 10**3))
    0.409842ms

This looks kind of ugly. Any other suggestions?
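The timings above use Python 2 syntax. A rough Python 3 equivalent of Option 1, timed with timeit rather than time.time(), might look like the sketch below. The helper names fill_with_loop and fill_with_map are made up for this example, and no timings are claimed here since they depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

l = [np.random.normal(size=100) for _ in range(1000)]

def fill_with_loop(arrays):
    # Explicit loop: assign each array into its own object slot.
    out = np.empty((len(arrays),), dtype=object)
    for i, a in enumerate(arrays):
        out[i] = a
    return out

def fill_with_map(arrays):
    # Python 3 form of the map variant: map over indices and items together.
    out = np.empty((len(arrays),), dtype=object)
    # list(...) forces evaluation, because map is lazy in Python 3.
    list(map(out.__setitem__, range(len(arrays)), arrays))
    return out

for fn in (fill_with_loop, fill_with_map):
    # Average over 100 runs to smooth out timer noise.
    seconds = timeit.timeit(lambda: fn(l), number=100) / 100
    print("%s: %fms" % (fn.__name__, seconds * 10**3))
```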
In the version

    contours_np = np.empty((len(contours),), dtype=object)
    for i, contour in enumerate(contours):
        contours_np[i] = contour

you can replace the loop with a single statement:

    contours_np[...] = contours
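Wrapped as a small helper, the single-statement version might look like this. The function name to_object_array is hypothetical, and this is only a sketch that assumes a reasonably recent NumPy and a flat list of arrays going into a 1-D target:

```python
import numpy as np

def to_object_array(arrays):
    # Allocate one object slot per input array, then fill all slots at once.
    out = np.empty((len(arrays),), dtype=object)
    out[...] = arrays
    return out

# Same-shaped inputs, the case that np.array() would stack into 2-D.
contours = [np.zeros((10,), 'int32'), np.zeros((10,), 'int32')]
contours_np = to_object_array(contours)
print(contours_np.shape, contours_np.dtype)
```

Because the target array is allocated with its final shape first, the result should keep one element per contour, which is the form passed to dataset.write_direct(contours_np, None, dest) in the question.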