Reading "a flat, binary array of 16-bit signed, little-endian (LSB) integers" from a file in Python

I'm trying to read an old file of snow data from here, but I'm having a lot of trouble just opening a single file and getting data out. The user guide says: "Each monthly binary data file with the file extension ".NSIDC8" contains a flat, binary array of 16-bit signed, little-endian (LSB) integers, 721 columns by 721 rows (row-major order, i.e. the top row of the array comprises the first 721 values in the file, etc.)." The data is 20 to 50 years old, so there isn't much coding documentation.

If I just open the file and run readlines, with this code:

import os

with open(os.path.join(folder, file), 'rb') as f:
    # contents = f.read()
    lines = f.readlines()

I get something looking like this:
\x00P\x00@\x00\x19\x00\x13\x00C\x00F\x00\x11\x00\r\x00:\x00.\x00\x02

If I use np.load(), the results are numbers like -6.85682214e+304.

I imagine I need to use the struct package and its unpack function, but I have no idea what format to use, and my attempts are not giving reasonable answers. For instance, I've tried reading just the first four bytes with '<i' as the format, as shown in the code below:

import os
import struct

with open(os.path.join(folder, file), 'rb') as f:
    print(struct.unpack('<i', f.read(4)))

The print statement showed (-13041864,), which doesn't make sense. Any insights would be greatly appreciated.

Solution:

You can unpack the data 16 bits at a time by specifying this in your unpack format string. You're using <i, which reads one 4-byte signed integer. The file holds 16-bit numbers, which take 2 bytes each, so <i merges two neighbouring values into a single number. Instead, use <h.
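To see where the odd number in the question may have come from, here is a small sketch that builds four bytes by hand instead of reading the real file. It assumes the first two cells of the file both hold -200 (the corner fill value), which is an assumption, though it is consistent with the (-13041864,) the question reports.

```python
import struct

# Assumption: the first two cells of the file are both -200.
raw = struct.pack("<hh", -200, -200)  # 4 bytes: 38 ff 38 ff

as_int32 = struct.unpack("<i", raw)   # read as one 4-byte signed integer
as_int16 = struct.unpack("<hh", raw)  # read as two 2-byte signed integers

print(as_int32)  # (-13041864,)
print(as_int16)  # (-200, -200)
```

The same four bytes give one meaningless 32-bit value or two sensible 16-bit values, depending only on the format string.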

For example,

import struct

# I chose a random file from their setup
with open("NL198303.v01.NSIDC8", "rb") as dfile:
    print(struct.unpack("<h", dfile.read(2)))
# prints (-200,); -200 is a "fixed value for corners" according to their docs

Here, h means "signed short".

I looked at several random locations in the file and only saw -200 and -250, corresponding to some sort of fixed boundary and ocean spots. Presumably there are other values somewhere, but I didn’t look.
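To look at the whole grid rather than single values, the same format code can be given a repeat count so that every cell is unpacked in one call. The following is only a sketch: the helper name read_nsidc8_grid is made up for illustration, and the file it reads is a synthetic stand-in written to a temporary directory so the example runs without the real data. The 721 by 721 size and row-major order come from the user guide quoted in the question.

```python
import os
import struct
import tempfile

ROWS, COLS = 721, 721  # grid size quoted from the user guide


def read_nsidc8_grid(path):
    """Hypothetical helper: read little-endian int16 cells into a list of rows."""
    with open(path, "rb") as f:
        raw = f.read()
    expected = ROWS * COLS * 2  # 2 bytes per 16-bit value
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    # "<519841h" style format: little-endian, ROWS * COLS signed shorts.
    values = struct.unpack(f"<{ROWS * COLS}h", raw)
    # Row-major order: the first COLS values form the top row.
    return [values[r * COLS:(r + 1) * COLS] for r in range(ROWS)]


# Synthetic stand-in for a real .NSIDC8 file (made-up contents).
cells = [-250] * (ROWS * COLS)
cells[0] = -200
path = os.path.join(tempfile.mkdtemp(), "fake.NSIDC8")
with open(path, "wb") as f:
    f.write(struct.pack(f"<{ROWS * COLS}h", *cells))

grid = read_nsidc8_grid(path)
print(len(grid), len(grid[0]), grid[0][0], grid[0][1])  # 721 721 -200 -250
```

If NumPy is available, np.fromfile(path, dtype="<i2").reshape(721, 721) should give the same grid as an array, which is usually more convenient for masking the fill values and plotting.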
