python - Find the speed of download for a progressbar
I'm writing a script to download videos from a website. I've added a report hook to show download progress; so far it shows the percentage and the size of the downloaded data. I thought it would be interesting to add the download speed and an ETA.
The problem is, if I use a simple speed = chunk_size/time, the speeds shown are accurate enough, but they jump around like crazy. So I've used a history of the times taken to download the individual chunks, like speed = chunk_size*n/sum(n_time_history).

This shows a stable download speed, but it is wrong, because the value comes out at a few bits/s while the downloaded file visibly grows at a much faster pace.

Can someone tell me where I'm going wrong?
Here's the code:
```python
import sys
import time

# unitsize() and format_time() are the poster's own helpers.
def dlprogress(count, blocksize, totalsize):
    global init_count
    global time_history
    try:
        time_history.append(time.monotonic())
    except NameError:
        time_history = [time.monotonic()]
    try:
        init_count
    except NameError:
        init_count = count
    percent = count * blocksize * 100 / totalsize
    dl, dlu = unitsize(count * blocksize)  # returns size in KB, MB, GB, etc.
    tdl, tdlu = unitsize(totalsize)
    count -= init_count  # because continuation of partial downloads is supported
    if count > 0:
        n = 5  # length of time history to consider
        _count = n if count > n else count
        time_history = time_history[-_count:]
        time_diff = [i - j for i, j in zip(time_history[1:], time_history[:-1])]
        speed = blocksize * _count / sum(time_diff)
    else:
        speed = 0
    n = int(percent // 4)
    try:
        eta = format_time((totalsize - blocksize * (count + 1)) // speed)
    except Exception:  # speed == 0, or the ETA is absurdly large
        eta = '>1 day'
    speed, speedu = unitsize(speed, True)  # returns speed in B/s, KB/s, MB/s, etc.
    sys.stdout.write("\r" + str(percent) + "% |" + "#" * n + " " * (25 - n)
                     + "| " + str(dl) + dlu + "/" + str(tdl) + tdlu + " "
                     + str(speed) + speedu + " " + eta)
    sys.stdout.flush()
```
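For reference, the moving-window averaging can be sketched on its own, without the global state; `make_speed_meter` is an illustrative name, not part of the post's code:

```python
import time
from collections import deque

def make_speed_meter(window=5):
    """Average download speed (bytes/s) over the last `window` chunk arrivals."""
    times = deque(maxlen=window)  # deque drops old timestamps automatically

    def update(chunk_size):
        times.append(time.monotonic())
        if len(times) < 2:
            return 0.0
        elapsed = times[-1] - times[0]
        # len(times) - 1 chunks of `chunk_size` bytes arrived in `elapsed` seconds
        return chunk_size * (len(times) - 1) / elapsed if elapsed > 0 else 0.0

    return update
```

Calling `update(blocksize)` from inside the report hook returns the smoothed speed, and the `deque(maxlen=...)` replaces the manual `time_history[-_count:]` trimming.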
Edit:

I corrected the logic, and the download speed shown is much better now.

But if I increase the length of the history used to calculate the speed, stability increases, but sudden changes in speed (if the download stalls, etc.) aren't shown.

How do I make it stable, yet sensitive to large changes?

I realize this question is now more math-oriented, but it would be great if someone could help me out or point me in the right direction.

Also, please tell me if there's a more efficient way to accomplish this.
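One common answer to this stability-vs-responsiveness trade-off (not from the original post) is an exponential moving average, where a single smoothing factor replaces the history window; `make_ema_speed` is an illustrative name:

```python
def make_ema_speed(alpha=0.3):
    """Exponentially weighted speed estimate.

    High `alpha` -> responsive but jittery; low `alpha` -> stable but slow
    to reflect a stalled download. The trade-off becomes a single knob.
    """
    ema = None

    def update(instant_speed):
        nonlocal ema
        # blend the newest instantaneous speed into the running average
        ema = instant_speed if ema is None else alpha * instant_speed + (1 - alpha) * ema
        return ema

    return update
```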
```python
_count = n if count > n else count
time_history = time_history[-_count:]
time_weights = list(range(1, len(time_history)))  # just simple linear weights
time_diff = [(i - j) * k for i, j, k in zip(time_history[1:], time_history[:-1], time_weights)]
speed = blocksize * sum(time_weights) / sum(time_diff)
```
To make it more stable and not react when the download speed spikes up or down, add this as well:
```python
_count = n if count > n else count
time_history = time_history[-_count:]
time_history.remove(min(time_history))
time_history.remove(max(time_history))
time_weights = list(range(1, len(time_history)))  # just simple linear weights
time_diff = [(i - j) * k for i, j, k in zip(time_history[1:], time_history[:-1], time_weights)]
speed = blocksize * sum(time_weights) / sum(time_diff)
```
This removes the highest and lowest spike from time_history, which makes the displayed number more stable. If you want to be picky, you could generate the weights before the removal, and filter the mapped values using time_diff.index(min(time_diff)).
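A sketch of that "picky" variant, under the assumption that the extreme gaps are dropped from the diff list itself so the weights stay aligned; `weighted_speed` is an illustrative name:

```python
def weighted_speed(time_history, blocksize):
    """Linearly weighted average speed over chunk-arrival timestamps,
    dropping the single fastest and slowest inter-chunk gap first."""
    diffs = [j - i for i, j in zip(time_history[:-1], time_history[1:])]
    if len(diffs) > 2:
        # drop the extreme gaps by index, so the weights stay aligned afterwards
        diffs.pop(diffs.index(min(diffs)))
        diffs.pop(diffs.index(max(diffs)))
    weights = list(range(1, len(diffs) + 1))  # simple linear weights
    weighted_sum = sum(d * w for d, w in zip(diffs, weights))
    return blocksize * sum(weights) / weighted_sum if weighted_sum > 0 else 0.0
```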
Also, using a non-linear function (like sqrt()) for the weight generation would give better results. Oh, and as I said in the comments: adding statistical methods to filter the times should be marginally better, but I suspect it's not worth the overhead it adds.
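For illustration, a hypothetical weight generator comparing sqrt with linear weights: with sqrt, the newest sample only outweighs the oldest by sqrt(n):1 instead of n:1, so the average skews more gently toward the latest chunks.

```python
import math

def make_weights(n, fn=math.sqrt):
    """Weights for n samples, oldest to newest, shaped by fn."""
    return [fn(k) for k in range(1, n + 1)]
```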