How to input a line word by word in Python?


I have multiple files, each containing a single line of roughly 10 million numbers. I want to check each file and print 0 if the file has repeated numbers and 1 if it doesn't.

I am using a list to count frequencies. Because of the large amount of numbers per line, I want to update the frequency after accepting each number and break as soon as I find a repeated number. While this is simple in C, I have no idea how to do it in Python.

How do I input a line in a word-by-word manner without storing (or taking as input) the whole line?

EDIT: I also need a way of doing this with live input rather than a file.

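Read the line, split it, and copy the resulting array into a set. If the size of the set is less than the size of the array, the file contains repeated elements.

    with open('filename', 'r') as f:
        for line in f:
            numbers = line.split()
            # print 1 if all numbers are distinct, 0 if any number repeats
            print(1 if len(set(numbers)) == len(numbers) else 0)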

To read the file word by word, try this:

    import itertools

    def readwords(file_object):
        word = ""
        # Read one character per call until read(1) returns "" at EOF.
        # (Python 3: map is lazy; on Python 2 use itertools.imap instead.)
        for ch in itertools.takewhile(bool, map(file_object.read, itertools.repeat(1))):
            if ch.isspace():
                if word:  # in case of multiple spaces
                    yield word
                    word = ""
                continue
            word += ch
        if word:
            yield word  # handles the last word before EOF

Then you can do:

    seen = set()
    with open('filename', 'r') as f:
        for num in map(int, readwords(f)):
            if num in seen:  # repeated number: no need to read any further
                break
            seen.add(num)
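Putting the pieces together, here is a minimal sketch of the whole check for a single stream. The names `iter_words` and `all_distinct` are made up for this illustration, and `io.StringIO` only stands in for a real file or `sys.stdin`. It assumes Python 3 and whitespace-separated integers.

```python
import io

def iter_words(stream):
    """Yield whitespace-separated words, reading one character at a time."""
    word = ""
    # iter(callable, sentinel) keeps calling read(1) until it returns "" (EOF)
    for ch in iter(lambda: stream.read(1), ""):
        if ch.isspace():
            if word:  # skip runs of whitespace
                yield word
                word = ""
        else:
            word += ch
    if word:
        yield word  # last word before EOF

def all_distinct(stream):
    """Return 1 if no number repeats, 0 as soon as a repeat is seen."""
    seen = set()
    for num in map(int, iter_words(stream)):
        if num in seen:
            return 0  # stop early; the rest of the stream is never read
        seen.add(num)
    return 1

print(all_distinct(io.StringIO("3 1 4 1 5 9\n")))  # 0
print(all_distinct(io.StringIO("2 7 1 8\n")))      # 1
```

For live input, passing `sys.stdin` instead of the `StringIO` object should work the same way, since the function only relies on `read(1)`.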

This method should also work for streams, because it reads only one character at a time and yields one whitespace-delimited word at a time from the input stream.
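Reading one character per call is simple, but it can be slow for around 10 million numbers. One possible alternative, which is not part of the answer above, is to read fixed-size chunks and carry any partial word over to the next chunk. This is only a sketch: `iter_words_chunked` and `chunk_size` are hypothetical names, and the default chunk size is an arbitrary assumption.

```python
import io

def iter_words_chunked(stream, chunk_size=65536):
    """Yield whitespace-separated words while holding about one chunk in memory."""
    tail = ""  # possibly incomplete word left over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # EOF
            break
        chunk = tail + chunk
        words = chunk.split()
        if words and not chunk[-1].isspace():
            # the last word may continue in the next chunk, so hold it back
            tail = words.pop()
        else:
            tail = ""
        for w in words:
            yield w
    if tail:
        yield tail  # last word before EOF

# A tiny chunk size forces words to straddle chunk boundaries.
demo = io.StringIO("12 345  6\n789")
print(list(iter_words_chunked(demo, chunk_size=4)))  # ['12', '345', '6', '789']
```

With interactive input, a large chunk size may delay results until enough input has arrived, so the character-at-a-time reader may still be the better fit there.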


After giving this answer, I've updated this method quite a bit. Have a look:

https://gist.github.com/smac89/bddb27d975c59a5f053256c893630cdc

