## Statistics: Regression

Regression is an important concept in statistics. It measures the relationship between two or more variables, where one is the dependent variable and the others are independent variables. A fitted regression model can be used to predict the dependent variable when the independent variable is known.

The following worked example illustrates regression.

Q: The regression line, also known as the least squares line, is a plot of the expected value of the dependent variable for all values of the independent variable. In the regression equation, y is always the dependent variable and x is always the independent variable.

The sales of a company (in million dollars) for each year are shown in the table below.

| x (year) | 2005 | 2006 | 2007 | 2008 | 2009 |
| --- | --- | --- | --- | --- | --- |
| y (sales) | 12 | 19 | 29 | 37 | 45 |

a) Find the least squares regression line y = ax + b.

b) Use the least squares regression line as a model to estimate the sales of the company in 2012.

Sol: a) We first change the variable x into t such that t = x − 2005, so that t represents the number of years after 2005. Using t instead of x makes the numbers smaller and therefore more manageable. The table of values becomes:

| t (years after 2005) | 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| y (sales) | 12 | 19 | 29 | 37 | 45 |

We now use the table to compute the sums needed to calculate a and b in the least squares regression formulas.

| t | y | ty | t² |
| --- | --- | --- | --- |
| 0 | 12 | 0 | 0 |
| 1 | 19 | 19 | 1 |
| 2 | 29 | 58 | 4 |
| 3 | 37 | 111 | 9 |
| 4 | 45 | 180 | 16 |
| Σt = 10 | Σy = 142 | Σty = 368 | Σt² = 30 |
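The column sums in the table can be checked with a few lines of Python. This is only a minimal sketch; the variable names (`sum_t`, `sum_ty`, and so on) are illustrative choices, not part of the original example.

```python
# Data from the worked example: t = years after 2005, y = sales in million dollars.
t = [0, 1, 2, 3, 4]
y = [12, 19, 29, 37, 45]

n = len(t)                                      # number of data points
sum_t = sum(t)                                  # sum of t
sum_y = sum(y)                                  # sum of y
sum_ty = sum(ti * yi for ti, yi in zip(t, y))   # sum of the products t*y
sum_t2 = sum(ti ** 2 for ti in t)               # sum of t squared

print(n, sum_t, sum_y, sum_ty, sum_t2)  # 5 10 142 368 30
```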

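We now calculate a and b using the least squares regression formulas, with n = 5 data points.

$$a = \frac{n\sum ty - \sum t \sum y}{n\sum t^2 - \left(\sum t\right)^2} = \frac{5 \times 368 - 10 \times 142}{5 \times 30 - 10^2} = \frac{420}{50} = 8.4$$

$$b = \frac{1}{n}\left(\sum y - a\sum t\right) = \frac{1}{5}\left(142 - 8.4 \times 10\right) = 11.6$$

The least squares regression line is therefore y = 8.4t + 11.6, where t = x − 2005.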
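As a cross-check on the hand calculation, the slope and intercept formulas can be sketched in Python. The function name `least_squares_line` is a hypothetical helper introduced here for illustration, and the sketch assumes a single independent variable with at least two distinct values.

```python
def least_squares_line(t, y):
    """Return (a, b) for the line y = a*t + b fitted by least squares."""
    n = len(t)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_t2 = sum(ti ** 2 for ti in t)

    # Slope: a = (n*sum(ty) - sum(t)*sum(y)) / (n*sum(t^2) - sum(t)^2)
    a = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
    # Intercept: b = (sum(y) - a*sum(t)) / n
    b = (sum_y - a * sum_t) / n
    return a, b


# Data from the worked example (t = years after 2005).
t = [0, 1, 2, 3, 4]
y = [12, 19, 29, 37, 45]

a, b = least_squares_line(t, y)
print(f"a = {a:.1f}, b = {b:.1f}")  # a = 8.4, b = 11.6
```

The denominator evaluates to 5 × 30 − 10² = 50, so the slope is 420 / 50 = 8.4.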

b) In 2012, t = 2012 − 2005 = 7.

The estimated sales in 2012 are: y = 8.4 × 7 + 11.6 = 70.4 million dollars.
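The estimate can also be reproduced with a short Python sketch. It assumes the coefficients found above (a = 8.4, b = 11.6) and the shift t = x − 2005; the names `estimate_sales`, `SLOPE`, `INTERCEPT`, and `BASE_YEAR` are hypothetical and used only for illustration.

```python
# Coefficients of the fitted line y = 8.4*t + 11.6 from the worked example.
SLOPE = 8.4
INTERCEPT = 11.6
BASE_YEAR = 2005  # t counts years after 2005


def estimate_sales(year):
    """Estimate sales (in million dollars) for a calendar year."""
    t = year - BASE_YEAR
    return SLOPE * t + INTERCEPT


print(f"{estimate_sales(2012):.1f}")  # 70.4
```

Note that 2012 lies outside the observed range 2005 to 2009, so the estimate relies on the linear trend continuing.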

