Unicode Error "unicodeescape" codec can't decode bytes... Cannot open text files in Python 3


Question

I am using Python 3.1 on a Windows 7 machine. Russian is the default system language, and UTF-8 is the default encoding.

Looking at the answer to a previous question, I have attempted to use the "codecs" module, hoping for a little luck. Here are a few examples:

>>> g = codecs.open("C:\Users\Eric\Desktop\beeline.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape (<pyshell#39>, line 1)
>>> g = codecs.open("C:\Users\Eric\Desktop\Site.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape (<pyshell#40>, line 1)
>>> g = codecs.open("C:\Python31\Notes.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 11-12: malformed \N character escape (<pyshell#41>, line 1)
>>> g = codecs.open("C:\Users\Eric\Desktop\Site.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape (<pyshell#44>, line 1)

My last idea was that the cause might be that Windows "translates" a few folders, such as the "users" folder, into Russian (though typing "users" is still the correct path), so I tried a file in the Python31 folder. Still no luck. Any ideas?

5/23/2017 12:18:22 PM

Accepted Answer

The problem is with the string

"C:\Users\Eric\Desktop\beeline.txt"

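Here, \U in "C:\Users... starts an eight-character Unicode escape, such as \U00014321. In your code, the escape is followed by the character 's', which is not a hexadecimal digit and is therefore invalid. Likewise, \N in "C:\Python31\Notes.txt" starts a named-character escape of the form \N{name}, which is why that path fails with a different message.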
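To see that the failure comes from the string literal itself and not from opening the file, here is a minimal sketch. It assumes a current Python 3 interpreter; the variable names are made up for illustration, and the failing literal is kept as source text inside a raw string so that the example itself parses.

```python
# The failing literal's source text, kept in a raw string so this example parses.
bad_source = r'"C:\Users\Eric\Desktop\beeline.txt"'

failed = False
message = ""
try:
    # Compiling the source text runs the same escape check the interpreter
    # performs when it reads the literal; no file is ever opened.
    compile(bad_source, "<example>", "eval")
except SyntaxError as exc:
    failed = True
    message = str(exc)

print(failed)
print(message)

# A complete escape: \U followed by exactly eight hexadecimal digits.
complete = "\U00014321"
print(len(complete))
```

The error is raised while the source is compiled, which is why the interpreter reports a SyntaxError before codecs.open is ever called.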

You either need to duplicate all backslashes:

"C:\\Users\\Eric\\Desktop\\beeline.txt"

Or prefix the string with r (to produce a raw string):

r"C:\Users\Eric\Desktop\beeline.txt"
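As a quick check that both fixes describe the same path, here is a small sketch. The helper read_utf8 is a hypothetical name introduced for illustration, and it uses the built-in open with an encoding argument rather than codecs.open from the question; it is defined but not called, since the file from the question may not exist on your machine.

```python
# Fix 1: double every backslash so each pair yields one literal backslash.
escaped = "C:\\Users\\Eric\\Desktop\\beeline.txt"

# Fix 2: a raw string, in which backslashes are not treated as escapes.
raw = r"C:\Users\Eric\Desktop\beeline.txt"

# Both spellings produce the same string.
print(escaped == raw)
print(raw)


def read_utf8(path):
    """Hypothetical helper: return the contents of a UTF-8 text file."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()
```

One caveat with raw strings: a raw string literal cannot end in a single backslash, so a path with a trailing separator still needs the doubled form.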
7/25/2019 6:20:07 AM

This is a typical error on Windows, because the default user directory is C:\Users\<your_user>. When you pass this path as a string argument to a Python function, you get a Unicode error, because \U (like lowercase \u) starts a Unicode escape. Any character after it that is not a hexadecimal digit produces an error.

To solve it, just double the backslashes: C:\\Users\\<your_user>...


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow