I am trying to read a .txt file with missing values using pandas.read_csv. My data is of the format:
with thousands of samples, each with the same point name, GPS position, and other readings. I use this code:
myData = read_csv('~/data.txt', sep=',', na_values='')
The code is wrong, as na_values does not give NaN or any other indicator. The columns should all have the same size, but I end up with different lengths.
I don't know what exactly should be passed to na_values (I tried all sorts of things). Thanks
na_values must be "list like" (see this answer).
A string is "list like" so:
na_values='abc' # would transform the letters 'a', 'b' and 'c' each into `nan`
# is equivalent to
na_values=['a','b','c']
na_values='' # is equivalent to
na_values=[] # and this is not what you want!
This means that you need to use a list that contains the empty string, i.e. na_values=[''].
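To make the "list like" point concrete, here is a small sketch in plain Python (no pandas involved) of what happens when a string is iterated as if it were a list. The variable names are only for illustration. Whether a given pandas version actually iterates a bare string passed to na_values this way is an assumption worth checking against its documentation; wrapping the value in a list avoids the ambiguity either way.

```python
# Iterating a string yields its characters one by one.
as_list = list('abc')
print(as_list)

# The empty string has no characters, so it yields an empty list,
# i.e. no missing-value markers at all.
empty_as_list = list('')
print(empty_as_list)

# A list holding the empty string has exactly one marker: ''.
wrapped = ['']
print(len(wrapped))
```

So if the string is treated as a sequence of characters, passing '' adds nothing, while passing [''] adds one marker, the empty field.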
What version of pandas are you on? Interpreting the empty string as NaN is the default behavior for pandas, and it seems to parse the empty strings in your data snippet fine, both in v0.7.3 and on current master, without using the na_values parameter at all.
In : data = """\
10/08/2012,12:10:10,name1,0.81,4.02,50;18.5701400N,4;07.7693770E,7.92,10.50,0.0106,4.30,0.0301
10/08/2012,12:10:11,name2,,,,,10.87,1.40,0.0099,9.70,0.0686
"""

In : read_csv(StringIO(data), header=None).T
Out:
                   0           1
X.1       10/08/2012  10/08/2012
X.2         12:10:10    12:10:11
X.3            name1       name2
X.4             0.81         NaN
X.5             4.02         NaN
X.6   50;18.5701400N         NaN
X.7    4;07.7693770E         NaN
X.8             7.92       10.87
X.9             10.5         1.4
X.10          0.0106      0.0099
X.11             4.3         9.7
X.12          0.0301      0.0686
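If you want to check this on your own install, something along these lines should work as a self-contained script. It is only a sketch: the pd alias, io.StringIO and the variable names are my choices rather than anything from the question, and a recent pandas labels the columns 0 to 11 instead of X.1 to X.12.

```python
from io import StringIO

import pandas as pd

# The two sample rows from the question; the second has four empty fields.
data = (
    "10/08/2012,12:10:10,name1,0.81,4.02,50;18.5701400N,4;07.7693770E,"
    "7.92,10.50,0.0106,4.30,0.0301\n"
    "10/08/2012,12:10:11,name2,,,,,10.87,1.40,0.0099,9.70,0.0686\n"
)

# Default parsing: empty fields are read as NaN without na_values.
df = pd.read_csv(StringIO(data), header=None)
missing_per_row = df.isna().sum(axis=1).tolist()
print(df.shape)
print(missing_per_row)

# Passing the empty string inside a list marks the same cells as missing.
df_explicit = pd.read_csv(StringIO(data), header=None, na_values=[''])
same_mask = df_explicit.isna().equals(df.isna())
print(same_mask)
```

Both rows end up with the same number of columns, so if your real file gives columns of different lengths, the cause is probably somewhere other than the empty fields.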