Simple Linear Regression Example in Python
We want to come up with a formula for regression: a straight line that predicts the target from the feature. To judge the line we look at the error, which is the distance between the regression line and each of the points. We assume the feature and the target have a linear relationship.

Start with the imports we will need:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression

Generate some random data:

    X = 30 * np.random.random((20, 1))
    y = 0.5 * X + 1.0 + np.random.normal(size=X.shape)
    for i in range(0, len(X)):
        print(X[i], ", ", y[i])
    print(X.shape)
    print(y.shape)

Create a LinearRegression model. Fit and score it.

    model = LinearRegression()
    model.fit(X, y)
    print(model.score(X, y))

The score is the R-squared score, which compares the differences between the points and the regression line with the overall spread of the target values. The closer it is to 1.0, the closer the points sit to the line.

Draw our plot. np.linspace creates 100 evenly spaced values between 0 and 30. model.predict() then takes these values and computes the predicted target for each one, using the regression formula. Plotting the predictions gives us the regression line.

    X_new = np.linspace(0, 30, 100)
    y_new = model.predict(X_new[:, np.newaxis])
    fig, axs = plt.subplots(1, 1, figsize=(9, 9))
    fig.suptitle('Linear Regression of random data.\nTraining data in green.\nRegression line in blue.')
    axs.scatter(X, y, color="g", s=99)
    axs.plot(X_new, y_new)
    axs.set_xlabel('x')
    axs.set_ylabel('y')
    axs.axis('tight')
    plt.show()

List the model coefficient.

    import pandas as pd
    print(pd.DataFrame({'Predictor': 'X', 'coefficient': list(model.coef_[0])}))

We don't want the line to fit the training points perfectly, as that would be overfitting: we would be fitting to the error rather than to the underlying relationship.

Next, we can add an outlier to the data via np.concatenate and see how this impacts our model via the score. This shows why it's important to look for outlying data, because it may negatively impact your model.
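Before adding the outlier, it may help to see what the R-squared score actually measures. The sketch below recomputes it by hand from the residuals and compares it with model.score(). The fixed seed and the names rng, ss_res, ss_tot and r2_manual are assumptions made for this illustration; they are not part of the original walkthrough.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Fixed seed so the run is repeatable (an assumption; the original uses unseeded np.random)
rng = np.random.default_rng(0)
X = 30 * rng.random((20, 1))
y = 0.5 * X + 1.0 + rng.normal(size=X.shape)

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

# Residual sum of squares: squared distances between the points and the line
ss_res = np.sum((y - y_pred) ** 2)
# Total sum of squares: squared distances between the points and their mean
ss_tot = np.sum((y - y.mean()) ** 2)

r2_manual = 1.0 - ss_res / ss_tot
print(r2_manual, model.score(X, y))
```

The two printed values should agree up to floating-point rounding, since model.score() returns this same ratio for a single-target regression.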
Generate some random data, then append a single outlier point at (2, 10):

    X = 30 * np.random.random((20, 1))
    y = 0.5 * X + 1.0 + np.random.normal(size=X.shape)
    X = np.concatenate((X, [[2]]), axis=0)
    y = np.concatenate((y, [[10]]), axis=0)
    for i in range(0, len(X)):
        print(X[i], ", ", y[i])
    print(X.shape)
    print(y.shape)
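To see the impact on the score, one option is to fit the same data twice, once without and once with the outlier, and print both results side by side. This is only a sketch: the fixed seed and the names X_out, y_out, clean and with_outlier are assumptions introduced here, not part of the original code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Fixed seed so both fits see the same base data (an assumption for this sketch)
rng = np.random.default_rng(0)
X = 30 * rng.random((20, 1))
y = 0.5 * X + 1.0 + rng.normal(size=X.shape)

# Append the outlier (x=2, y=10), as in the text above
X_out = np.concatenate((X, [[2]]), axis=0)
y_out = np.concatenate((y, [[10]]), axis=0)

clean = LinearRegression().fit(X, y)
with_outlier = LinearRegression().fit(X_out, y_out)

score_clean = clean.score(X, y)
score_out = with_outlier.score(X_out, y_out)

print("without outlier:", score_clean, clean.coef_[0][0], clean.intercept_[0])
print("with outlier:   ", score_out, with_outlier.coef_[0][0], with_outlier.intercept_[0])
```

With a point this far from the underlying line y = 0.5x + 1, the score with the outlier would normally come out lower, and the slope and intercept shift as the line is pulled toward the extra point. The exact numbers depend on the random data.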