How can I read a large text file in Python, line by line, without loading it into memory?


Question

I need to read a large file line by line. Let's say the file is more than 5 GB and I need to read each line, but obviously I do not want to use readlines() because it would create a very large list in memory.

Will the code below work for this case? Does xreadlines itself read the file into memory one line at a time? Is the generator expression needed?

f = (line for line in open("log.txt").xreadlines())  # how much is loaded in memory?

f.next()  

Also, what can I do to read the file in reverse order, like the Linux tail command?

I found:

http://code.google.com/p/pytailer/

and

"python head, tail and backward read by lines of a text file"

Both worked very well!
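For reference, here is a minimal sketch of the general technique such tools use: read the file backwards in fixed-size blocks from the end and split each block on newlines. The function name, block size, and encoding below are just illustrative; this is not the code of either project.

import os

def reverse_lines(path, block_size=8192, encoding="utf-8"):
    # Yield the lines of a file from last to first without loading the
    # whole file: read fixed-size blocks backwards from the end and
    # split them on newlines (assumes "\n" line endings).
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        # Ignore a trailing newline so we don't yield a spurious empty line.
        if position > 0:
            f.seek(position - 1)
            if f.read(1) == b"\n":
                position -= 1
        tail = b""  # partial line carried over between blocks
        while position > 0:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            block = f.read(read_size) + tail
            lines = block.split(b"\n")
            tail = lines.pop(0)  # may be an incomplete line; keep it for the next pass
            for line in reversed(lines):
                yield line.decode(encoding)
        if tail:
            yield tail.decode(encoding)

for line in reverse_lines("log.txt"):
    print(line)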


Accepted Answer

I provided this answer because Keith's, while succinct, doesn't close the file explicitly:

with open("log.txt") as infile:
    for line in infile:
        do_something_with(line)
Popular Answer

All you need to do is use the file object as an iterator.

for line in open("log.txt"):
    do_something_with(line)

Even better is to use a context manager, available in recent Python versions:

with open("log.txt") as fileobject:
    for line in fileobject:
        do_something_with(line)

This will automatically close the file as well.
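As a concrete usage sketch (the file name and search string are placeholders), this counts matching lines while holding only one line in memory at a time:

error_count = 0
with open("log.txt") as fileobject:
    for line in fileobject:
        if "ERROR" in line:  # placeholder filter; adapt to your own processing
            error_count += 1
print(error_count)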


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow