I have practically no knowledge of Matlab, and need to translate some parsing routines into Python. They are for large files, that are themselves divided into 'blocks', and I'm having difficulty right from the off with the checksum at the top of the file.
What exactly is going on here in Matlab?
    status = fseek(fid, 0, 'cof');
    fposition = ftell(fid);
    disp(' ');
    disp(['** Block ',num2str(iBlock),' File Position = ',int2str(fposition)]);

    % ----------------- Block Start ------------------ %
    [A, count] = fread(fid, 3, 'uint32');
    if(count == 3)
        magic_l = A(1);
        magic_h = A(2);
        block_length = A(3);
    else
        if(fposition == file_length)
            disp(['** End of file OK']);
        else
            disp(['** Cannot read block start magic ! Note File Length = ',num2str(file_length)]);
        end
        ok = 0;
        break;
    end
fid is the file currently being looked at, and iBlock is a counter for which 'block' you're in within the file.
magic_l and magic_h are used for the checksum later. Here is the code for that (it follows straight on from the code above):
    disp(sprintf(' Magic_L = %08X, Magic_H = %08X, Length = %i', magic_l, magic_h, block_length));
    correct_magic_l = hex2dec('4D445254');
    correct_magic_h = hex2dec('43494741');
    if(magic_l ~= correct_magic_l | magic_h ~= correct_magic_h)
        disp(['** Bad block start magic !']);
        ok = 0;
        return;
    end
    remaining_length = block_length - 3*4 - 3*4; % We read Block Header, and we expect a footer
    disp(sprintf(' Remaining Block bytes = %i', remaining_length));
Really though, I want to know how to replicate
[A, count] = fread(fid, 3, 'uint32'); in Python, as
io.readline() is just pulling the first 3 characters of the file. Apologies if I'm missing the point somewhere here. It's just that using
io.readline(3) on the file seems to return something it shouldn't, and I don't understand how the
block_length can fit in a single byte when it could potentially be very long.
Thanks for reading this ramble. I hope you can understand kind of what I want to know! (Any insight at all is appreciated.)
From the documentation of
fread, it is a function to read binary data. The second argument specifies the size of the output vector, the third one the size/type of the items read.
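In order to recreate this in Python, you can use the array module:

    import array

    f = open(..., "rb")    # open the file in binary mode
    a = array.array("I")   # "I" is unsigned int, 4 bytes on common platforms;
                           # "L" (unsigned long) is 8 bytes on many 64-bit systems
    assert a.itemsize == 4 # make sure each item really is a uint32
    a.fromfile(f, 3)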
This will read three uint32 values (in the machine's native byte order) from the file
f, which are available in
a afterwards. From the documentation of array.fromfile:
Read n items (as machine values) from the file object f and append them to the end of the array. If less than n items are available, EOFError is raised, but the items that were available are still inserted into the array. f must be a real built-in file object; something else with a read() method won’t do.
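The EOFError behaviour described in that quote can stand in for the count that Matlab's fread returns. Below is a minimal sketch of that idea; read_uint32s is a name I made up for illustration, and it assumes the file only ever ends on a 4-byte boundary (a trailing partial item would raise ValueError instead).

```python
import array


def read_uint32s(f, n):
    """Read up to n native-endian uint32 values from a binary file object.

    Returns (values, count), loosely mirroring Matlab's
    [A, count] = fread(fid, n, 'uint32').
    """
    a = array.array("I")  # unsigned int; checked below to be 4 bytes wide
    if a.itemsize != 4:
        raise RuntimeError("unsigned int is not 4 bytes on this platform")
    try:
        a.fromfile(f, n)
    except EOFError:
        # Fewer than n items were available; the ones that were read
        # have still been appended to the array.
        pass
    return a.tolist(), len(a)
```

A count smaller than n then plays the same role as the count ~= 3 branch in the Matlab code.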
Arrays implement the sequence protocol and therefore support the same operations as lists, but you can also use the
.tolist() method to create a normal list from the array.
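For a small fixed-size header like the one in the question, the standard struct module is another option (a different technique from the array approach above). The sketch below assumes the file is little-endian, which is a guess on my part; read_block_header is a hypothetical helper name, and the magic values are copied from the Matlab code in the question. Each uint32 occupies four bytes, so the three values together take 12 bytes, and block_length is a 4-byte field rather than a single byte.

```python
import struct

MAGIC_L = 0x4D445254  # from hex2dec('4D445254') in the question
MAGIC_H = 0x43494741  # from hex2dec('43494741') in the question
HEADER = struct.Struct("<3I")  # assumption: three little-endian uint32 values


def read_block_header(f):
    """Return block_length, or None if the file ends cleanly before a header."""
    raw = f.read(HEADER.size)  # HEADER.size is 12 bytes
    if len(raw) == 0:
        return None  # end of file, nothing left to read
    if len(raw) < HEADER.size:
        raise ValueError("truncated block header")
    magic_l, magic_h, block_length = HEADER.unpack(raw)
    if (magic_l, magic_h) != (MAGIC_L, MAGIC_H):
        raise ValueError("bad block start magic")
    return block_length
```

If the magic numbers never match, trying ">3I" (big-endian) instead would be the first thing I would check.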
    import numpy as np

    with open(inputfilename, 'rb') as fid:
        data_array = np.fromfile(fid, np.int16)
Some advantages of using
numpy.fromfile versus other Python solutions include:
- You can limit the number of items read with the count= argument, but it defaults to -1, which indicates reading the entire file.
- Being able to specify either an open file object (as I did above with fid) or a filename. I prefer using an open file object, but if you wanted to use a filename, you could replace the two lines above with:
    data_array = np.fromfile(inputfilename, np.int16)
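To tie the count= argument back to the original fread(fid, 3, 'uint32') call, here is a minimal sketch. The dtype string "<u4" (little-endian unsigned 4-byte integer) is an assumption about the file's byte order, and read_three_uint32 is a hypothetical helper name.

```python
import numpy as np


def read_three_uint32(fid):
    """Read three uint32 values from an open binary file.

    Returns (values, count); count is how many values were actually read.
    """
    # Assumption: the data is little-endian. Use ">u4" for big-endian files.
    values = np.fromfile(fid, dtype="<u4", count=3)
    return values, values.size
```

Because an open file object is passed in, a later read on fid carries on from just after those 12 bytes, which suits a file made up of consecutive blocks.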
fread has the ability to read the data into a matrix of form
[m, n] instead of just reading it into a column vector. For instance, to read data into a matrix with 2 rows use:
    fid = fopen(inputfilename, 'r');
    data_array = fread(fid, [2, inf], 'int16');
    fclose(fid);
You can handle this scenario in Python using Numpy's reshape and transpose:

    import numpy as np

    with open(inputfilename, 'rb') as fid:
        data_array = np.fromfile(fid, np.int16).reshape((-1, 2)).T
The -1 tells numpy.reshape to infer the length of the array for that dimension based on the other dimension, the equivalent of Matlab's inf.
The .T transposes the array so that it is a 2-dimensional array with the first dimension (axis 0) having a length of 2.
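To see why the transpose is needed, here is a small sketch using made-up values in place of file data. Matlab fills matrices column by column, so the first two values read become the first column. Reshaping with order="F" is an alternative way of getting the same layout in one step.

```python
import numpy as np

# Stand-in for six int16 values read sequentially from a file.
flat = np.arange(1, 7, dtype=np.int16)  # 1, 2, 3, 4, 5, 6

# Approach from the answer: pair up consecutive values, then transpose.
via_transpose = flat.reshape((-1, 2)).T

# Alternative: fill a 2-row array in column-major (Fortran/Matlab) order.
via_fortran = flat.reshape((2, -1), order="F")

print(via_transpose)
# [[1 3 5]
#  [2 4 6]]
print(np.array_equal(via_transpose, via_fortran))  # True
```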