I just started practicing ML and tried to predict the outcome for a single value using a model fitted on a data frame. I need a solution to this problem, please. Thanks.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
df = pd.read_csv(r'C:\Users\USER\Downloads\homeprices.csv')
%matplotlib inline
plt.xlabel('area (sqr ft)')
plt.ylabel('price (us$)')
plt.title('HOME PRICES')
plt.scatter(df.area, df.price, color = 'red', marker = '+')
reg = LinearRegression()
reg.fit(df[['area']], df.price)
reg.predict(3300)
#returns
ValueError: Expected 2D array, got 1D array instead:
array=[3300].
>Solution :
reg.predict() expects X to have the same number of dimensions (and the same number of columns) as the X that was passed to reg.fit().
If you check df[['area']].shape you will see something like (n, 1), meaning n rows and 1 column. To make a prediction, the input must have the same number of columns (1 in your case), while the number of rows is flexible (also 1 in your case, since you are predicting a single value).
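To see the difference in dimensions, here is a small sketch using only NumPy. The variable names (scalar, one_d, two_d) are just for illustration and are not part of your code:

```python
import numpy as np

# Three ways of writing "3300", each with a different number of dimensions
scalar = np.asarray(3300)      # 0 dimensions
one_d = np.asarray([3300])     # 1 dimension
two_d = np.asarray([[3300]])   # 2 dimensions: 1 row, 1 column

shapes = [scalar.shape, one_d.shape, two_d.shape]
print(shapes)  # [(), (1,), (1, 1)]
```

Only the last form has the (rows, columns) layout that matches df[['area']].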
So the solution is:
reg = LinearRegression()
reg.fit(df[['area']], df.price)
reg.predict([[3300]])
or
reg = LinearRegression()
reg.fit(df[['area']], df.price)
reg.predict(np.array([[3300]]))
or
reg = LinearRegression()
reg.fit(df[['area']], df.price)
reg.predict(pd.DataFrame([[3300]], columns=['area']))
All of these structures are 2-dimensional, with 1 row and 1 column.
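The same idea extends to predicting several areas at once. Below is a minimal sketch, assuming made-up, perfectly linear data in place of your homeprices.csv (the numbers and the names areas and X_new are invented for illustration). It uses reshape(-1, 1) to turn a 1D array into n rows and 1 column:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up data standing in for homeprices.csv (assumption, not the real file)
df = pd.DataFrame({"area": [1000, 2000, 3000, 4000]})
df["price"] = 100 * df["area"] + 50_000

reg = LinearRegression()
reg.fit(df[["area"]], df["price"])

# reshape(-1, 1) converts a 1D array of n values into shape (n, 1)
areas = np.array([3300, 5000]).reshape(-1, 1)

# Wrapping in a DataFrame keeps the column name used during fit
X_new = pd.DataFrame(areas, columns=["area"])

predictions = reg.predict(X_new)
print(X_new.shape, predictions.shape)
```

Each row of X_new produces one prediction, so the output has one value per area.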