I wonder if there is a direct way to import the contents of a CSV file into a record array, much in the way that R's
read.csv() family imports data to R's data frame?
Or is the best way to use csv.reader() and then apply something like
You can use NumPy's genfromtxt() function to do so, by setting the delimiter kwarg to a comma:
    from numpy import genfromtxt
    my_data = genfromtxt('my_file.csv', delimiter=',')
More information on the function can be found in its documentation.
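As a minimal sketch of the call above, the snippet below reads from an in-memory string instead of a file on disk, so it runs without any external data. The sample values, including the deliberately empty field, are made up for illustration.

```python
import io

import numpy as np

# Hypothetical sample data; the second row has an empty middle field.
csv_text = "1.0,2,3\n4,,6\n"

# genfromtxt accepts a file-like object as well as a filename.
# With only a delimiter given, every field is parsed as a float,
# and the empty field is filled with nan.
my_data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

print(my_data.shape)
print(my_data.dtype)
print(my_data)
```

Note that the result here is a plain 2-D float array, not a record array.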
I would recommend the read_csv function from the pandas library:
    import pandas as pd
    df = pd.read_csv('myfile.csv', sep=',', header=None)
    df.values
    array([[ 1. ,  2. ,  3. ],
           [ 4. ,  5.5,  6. ]])
This gives a pandas DataFrame, which allows many useful data manipulation functions that are not directly available with NumPy record arrays.
DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table...
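If a record array is still what you need in the end, one possible route from the DataFrame is its to_records() method. The sketch below assumes made-up column names (a, b, c) and reads from an in-memory string rather than a real file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data matching the example file used in this thread.
csv_text = "1.0,2,3\n4,5.5,6\n"

# The column names are illustrative; the file itself has no header row.
df = pd.read_csv(io.StringIO(csv_text), sep=",", header=None,
                 names=["a", "b", "c"])

# index=False leaves the DataFrame index out of the resulting records.
records = df.to_records(index=False)

print(type(records))
print(records.dtype.names)
print(records.b)
```

Each column keeps its own dtype in the result, so the integer column stays integer while the other two are floats.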
I would also recommend genfromtxt. However, since the question asks for a record array, as opposed to a normal array, the dtype=None parameter needs to be added to the genfromtxt call:
Given an input file, myfile.csv:

    1.0, 2, 3
    4, 5.5, 6

    import numpy as np
    np.genfromtxt('myfile.csv', delimiter=',')
gives an array:
    array([[ 1. ,  2. ,  3. ],
           [ 4. ,  5.5,  6. ]])
Adding dtype=None,

    np.genfromtxt('myfile.csv', delimiter=',', dtype=None)

gives a record array:
array([(1.0, 2.0, 3), (4.0, 5.5, 6)], dtype=[('f0', '<f8'), ('f1', '<f8'), ('f2', '<i4')])
This has the advantage that files with multiple data types (including strings) can be easily imported.
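To illustrate the mixed-type case, here is a small sketch with a string column and a header row. The column names and values are invented for the example, and names=True and encoding="utf-8" are extra options not used in the answer above: the first takes field names from the header, the second makes string fields come back as str rather than bytes.

```python
import io

import numpy as np

# Hypothetical sample data with a header row and a string column.
csv_text = "label,height,qty\nalice,1.5,3\nbob,2.0,6\n"

# dtype=None lets genfromtxt infer a type for each column separately.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                     dtype=None, names=True, encoding="utf-8")

print(data.dtype.names)
print(data["label"])

# Viewing the structured array as np.recarray adds attribute-style access.
rec = data.view(np.recarray)
print(rec.height)
```

Strictly speaking, genfromtxt returns a structured array; the view(np.recarray) step is what turns it into a numpy.recarray with fields available as attributes.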