CSV in Python adding an extra carriage return, on Windows


Question

In Python 2.7 running on Windows XP pro:

import csv
outfile = file('test.csv', 'w')
writer = csv.writer(outfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
writer.writerow(['hi','dude'])
writer.writerow(['hi2','dude2'])
outfile.close()

It generates a file, test.csv, with an extra \r at the end of each row, like so:

test.csv

hi,dude\r\r\nhi2,dude2\r\r\n

instead of the expected:

hi,dude\r\nhi2,dude2\r\n

Why is this happening, or is this actually the desired behavior?


Accepted Answer

On Windows, always open your files in binary mode ("rb" or "wb") before passing them to csv.reader or csv.writer.

Although the file is a text file, CSV is regarded as a binary format by the libraries involved, with "\r\n" separating records. If that separator is written in text mode, the Python runtime replaces the "\n" with "\r\n", hence the "\r\r\n" that you observed in your file.
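The doubling can be reproduced on any platform with a small Python 3 sketch. It assumes that passing newline="\r\n" to open() is a fair stand-in for Windows text mode, since it makes the file object translate every "\n" to "\r\n" on write. The helper name write_rows is hypothetical, used only for illustration.

```python
import csv
import os
import tempfile


def write_rows(newline):
    """Write two rows with csv.writer and return the raw bytes on disk."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        # The csv module emits "\r\n" after each record by default.
        with open(path, "w", newline=newline) as f:
            writer = csv.writer(f)
            writer.writerow(["hi", "dude"])
            writer.writerow(["hi2", "dude2"])
        # Read back in binary mode so no newline translation hides the result.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# newline="\r\n" forces "\n" -> "\r\n" translation, mimicking Windows text mode.
doubled = write_rows("\r\n")
print(repr(doubled))
```

The writer's own "\r" passes through untouched while its "\n" is expanded, which is where the second "\r" comes from.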

See this previous answer.


This answer was posted in 2010 and does not address the problem in Python 3.

One possible fix in Python 3, as described in @YiboYang's answer, is to open the file with the newline parameter set to an empty string:

f = open(path_to_file, 'w', newline='')
writer = csv.writer(f)
...
...
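For reference, a fuller version of that fix might look like the following sketch for Python 3. The temporary directory and the variable names path and raw are assumptions made for illustration. The with blocks close the file, and reading it back in binary mode confirms the line endings.

```python
import csv
import os
import tempfile

# Hypothetical output location, chosen only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

# newline="" disables newline translation, so the "\r\n" written by the
# csv module reaches the file unchanged on every platform.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["hi", "dude"])
    writer.writerow(["hi2", "dude2"])

# Inspect the raw bytes to check that each row ends in a single "\r\n".
with open(path, "rb") as f:
    raw = f.read()
print(repr(raw))
```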

While @john-machin gives a good answer, it's not always the best approach. For example, it doesn't work on Python 3 unless you encode all of your inputs to the CSV writer. Also, it doesn't address the issue if the script wants to use sys.stdout as the stream.

I suggest instead setting the 'lineterminator' parameter when creating the writer:

import csv
import sys

doc = csv.writer(sys.stdout, lineterminator='\n')
doc.writerow('abc')
doc.writerow(range(3))

That example will work on Python 2 and Python 3 and won't produce the unwanted extra \r characters. Note, however, that it may produce line endings you don't want: on Unix operating systems each row ends in a bare \n, omitting the CR character that CSV conventionally uses.

In most cases, however, I believe that behavior is preferable and more natural than treating all CSV as a binary format. I provide this answer as an alternative for your consideration.
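To see exactly what the writer emits with this setting, here is a minimal Python 3 sketch that swaps sys.stdout for an in-memory io.StringIO buffer (an assumption made so the output can be inspected; the names buf and text are illustrative).

```python
import csv
import io

# An in-memory text stream stands in for sys.stdout; it performs no
# newline translation of its own.
buf = io.StringIO()

doc = csv.writer(buf, lineterminator="\n")
doc.writerow("abc")      # a string is iterated, giving one field per character
doc.writerow(range(3))   # non-string values are converted with str()

text = buf.getvalue()
print(repr(text))
```

Each row ends in a single "\n", which a text-mode stream on Windows then turns into "\r\n", while on Unix it stays as "\n".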


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow