Python: Downloading and caching XML files - how to handle the encoding declaration?


    from urllib.request import urlopen
    from lxml import objectify

I am trying to write a program that downloads XML files, caches them, and then opens them using objectify. If I download the files using urlopen(), I can read them in using objectify.fromstring() just fine:

    r = urlopen(my_url)
    o = objectify.fromstring(r.read())

However, if I download them and write them to a file, I end up with an encoding declaration at the top of the file that objectify doesn't like. To wit:

    # download the file
    my_file = 'foo.xml'
    r = urlopen(my_url)

    # save locally
    with open(my_file, 'wb') as fp:
        fp.write(r.read())

    # open the saved copy
    with open(my_file, 'r') as fp:
        o1 = objectify.fromstring(fp.read())

This results in ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

If I use objectify.parse(fp) it works fine, so I could go through and change the client code to use parse() instead, but I feel that is not the right approach. I have other XML files stored locally for which .fromstring() works fine; based on a cursory review, they appear to have UTF-8 encoding.

I don't know what the right resolution is here. Should I change the encoding when I save the file? Should I strip the encoding declaration? Should I fill the code with try.. except ValueError clauses? Please advise.

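The file needs to be opened in binary mode rather than text mode. In text mode, fp.read() returns an already-decoded str, and lxml refuses a str that still carries an encoding declaration; in binary mode it returns bytes, which lxml decodes itself according to that declaration.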

    with open(my_file, 'rb') as fp:  # 'b' stands for binary
        o1 = objectify.fromstring(fp.read())

As suggested by the exception: ... Please use bytes input ...
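Putting the download, cache, and read steps together, a minimal sketch might look like the following. The helper name load_cached_xml and its cache_path parameter are hypothetical, chosen only for illustration; the point is that the file is written and read back as bytes, so the encoding declaration never needs to be stripped or changed.

```python
from pathlib import Path
from urllib.request import urlopen

from lxml import objectify


def load_cached_xml(url, cache_path):
    """Return an objectify root for `url`, downloading only if not cached.

    Hypothetical helper: the name and parameters are illustrative.
    """
    cache_path = Path(cache_path)
    if not cache_path.exists():
        # Save the response body exactly as received (raw bytes,
        # encoding declaration included).
        with urlopen(url) as response:
            cache_path.write_bytes(response.read())
    # Read the cached copy back as bytes so that lxml can apply the
    # encoding declaration itself instead of rejecting a decoded str.
    return objectify.fromstring(cache_path.read_bytes())
```

With a helper along these lines, client code can keep calling fromstring() on the result of a single function, and no try/except ValueError clauses are needed around the parse.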

