Converting .data file into numpy arrays

My file.data looks like this:

   "3.0,1.5,0\n
     4.6,0.7,1\n
     5.8,2.7,2"

And I want to load this data into two numpy arrays so that it looks like this in the end:

X = [ [3.0, 1.5],
      [4.6, 0.7],
      [5.8, 2.7] ]

y = [0, 1, 2]

If I do the following…


fname = open("file.data", "r")
for line in fname.readlines():
    print(line)

…I can read line by line as strings, but what would be the best way to separate these values and put them into the two numpy arrays as shown above?

Is there a nice module or function in numpy that does this really efficiently?

> Solution:

  1. If your data file is a simple text file with a delimiter, as you've shown, you can use numpy.loadtxt to load the entire file at once:

import numpy as np

data = np.loadtxt("file.data", delimiter=',')
X = data[:, 0:2]
Y = data[:, 2]
  2. In case you want to read line by line, you can use numpy.fromstring, which parses each string into an array (note the np. prefix, missing in the original, and the with block that closes the file):

import numpy as np

data = []
with open("file.data", "r") as fname:
    for line in fname:
        data.append(np.fromstring(line, sep=','))
data_array = np.array(data)
X = data_array[:, 0:2]
Y = data_array[:, 2]
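As a quick sanity check (a self-contained sketch, not part of the original answer), the snippet below writes the sample data from the question to a temporary file standing in for "file.data", loads it with numpy.loadtxt, and splits it into X and y. One detail worth knowing: loadtxt returns float64 by default, so the label column needs an explicit cast if integer labels are wanted.

```python
import os
import tempfile

import numpy as np

# Sample contents from the question, written to a temporary file
# that stands in for "file.data".
sample = "3.0,1.5,0\n4.6,0.7,1\n5.8,2.7,2\n"
with tempfile.NamedTemporaryFile("w", suffix=".data", delete=False) as f:
    f.write(sample)
    path = f.name

data = np.loadtxt(path, delimiter=',')
X = data[:, 0:2]            # first two columns -> features
y = data[:, 2].astype(int)  # last column -> integer labels
os.remove(path)

print(X.tolist())  # [[3.0, 1.5], [4.6, 0.7], [5.8, 2.7]]
print(y.tolist())  # [0, 1, 2]
```

The .astype(int) cast is optional; drop it if float labels such as [0.0, 1.0, 2.0] are acceptable.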