
These WINKS statistics tutorials explain the use and interpretation of standard statistical analysis techniques for medical, pharmaceutical, clinical trials, marketing, or scientific research. The examples include how-to instructions for WINKS SDA Version 6.0 software.
Simple Linear Regression
This
is one in a series of tutorials using examples from WINKS SDA.
Definition: Simple linear regression is used to
develop an equation (a linear regression line) for predicting a value of the
dependent variable given a value of the independent variable. The
regression line is the line described by the equation, and the
regression equation is the formula for the line. The regression equation
is given by:
Y = a + bX
where X is the independent
variable, Y is the dependent variable, a is the intercept and b
is the slope of the line.
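The intercept a and slope b are normally estimated by least squares. The following is a minimal sketch of that calculation, assuming the usual least-squares estimates; the function name fit_line and the sample data are made up for illustration and are not part of WINKS.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Made-up illustration data that lie exactly on Y = 1 + 2X
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"Y = {a:.3f} + {b:.3f} * X")
```

Because the sample points lie exactly on a line, the fitted intercept and slope recover 1 and 2.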
Assumptions: For each fixed
value of X (the independent variable), the population of Y values (the dependent
variable) is normally distributed, and the variance of Y is the same at every value of X.
Related statistics: The
correlation coefficient, r, measures the strength of the association
between X and Y.
Test: A test that the
slope of the regression line is 0 is used to determine whether the regression
line shows a statistically significant linear relationship between X and Y.
The hypotheses for this test are:
H0: slope = 0
Ha: slope ≠ 0
A low p-value for this test (less than 0.05) means that there is evidence
that the slope of the line is not 0, that is, that there is a
statistically significant linear relationship between the two variables.
Note: This test is
equivalent to the test of rho = 0 in the correlation procedure.
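The tutorial does not show the test statistic itself. For reference, the standard t statistic for the slope in simple linear regression can be written as follows, where SE(b) is the standard error of the slope, r is the correlation coefficient, and n is the number of cases (these symbols are introduced here, not taken from the WINKS output). The second form is why the slope test and the correlation test give the same result.

```latex
t = \frac{b}{SE(b)} = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad df = n - 2
```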
Location in WINKS:
Simple linear regression is located in the Regression and Correlation
procedures menu.
Graphs: Graphs produced
with the simple linear regression procedure are:
1. Scatterplot with fitted
regression line.
2. Residuals by the
independent variable.
3. Residuals by run order.
Examination of the graphs is
useful to visually verify that the relationship is linear and that there is
no pattern to the residuals. If there is a pattern to the residuals,
remedial methods may need to be taken for the analysis. Reference: Neter,
Wasserman and Kutner.
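A residual is the observed Y minus the value predicted by the regression equation. The sketch below shows how the values behind the two residual plots could be computed; the function name residuals, the data, and the coefficients are made up for illustration, and run order is assumed to be the order in which the cases appear in the data.

```python
def residuals(x, y, a, b):
    """Observed Y minus the value predicted by Y = a + bX, in data order."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Made-up data and coefficients for illustration only
x = [1, 2, 3, 4]
y = [3, 5, 8, 9]
res = residuals(x, y, a=1.0, b=2.0)

# Columns: run order, independent variable, residual
for order, (xi, ei) in enumerate(zip(x, res), start=1):
    print(order, xi, ei)
```

Plotting the residual against the second column gives the residuals-by-independent-variable graph; plotting it against the first column gives the residuals-by-run-order graph.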
Example: Use the Simple
Linear Regression procedure to calculate a prediction equation for HP given
the WEIGHT of a car using the CAR database. The partial results are:
Dependent variable is HP, 1 independent
variables, 38 cases.
A low p-value suggests that the dependent
variable HP may be linearly related to independent variable(s).
Pearson's r (Correlation Coefficient)= 0.9172
RSquare=.8413
The linear regression equation is:
HP = 3.498343 + 34.3144 * WEIGHT
Test of hypothesis to determine significance of
relationship:
H(null): Slope = 0 or H(null): r = 0 (two-tailed test)
t = 13.81 with 36 degrees of freedom p = 0.000
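The reported figures can be cross-checked against one another. The sketch below assumes the standard relationship between the correlation coefficient and the t statistic for the slope, t = r * sqrt(n - 2) / sqrt(1 - r^2), and uses only the r and case count printed in the output; the variable names are illustrative.

```python
import math

r = 0.9172   # Pearson's r from the output above
n = 38       # number of cases from the output above

r_squared = r ** 2
df = n - 2
# Standard identity linking r to the t statistic for the slope
t = r * math.sqrt(df) / math.sqrt(1 - r_squared)

print(f"RSquare = {r_squared:.4f}")
print(f"t = {t:.2f} with {df} degrees of freedom")
```

Rounded to the precision of the printout, these values agree with the RSquare of .8413 and the t of 13.81 with 36 degrees of freedom shown in the results.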
The following
plots show the linear relationship and the randomness of the residuals.
Warning:
Using the regression equation to predict values of the dependent variable
outside the range of the independent variable is not recommended since you
have no evidence that the same linear relationship exists outside the
observed range.
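One way to respect this warning in practice is to refuse predictions outside the observed range of the independent variable. The following is a minimal sketch; the function name predict_within_range is hypothetical, and the range limits 2.0 and 4.0 are placeholders because the tutorial does not report the observed range of WEIGHT.

```python
def predict_within_range(x, a, b, x_min, x_max):
    """Predict Y = a + bX, but only for x inside the observed range of X."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the observed range [{x_min}, {x_max}]"
        )
    return a + b * x


# Coefficients from the HP example; the range limits are placeholders,
# NOT values taken from the CAR database.
A, B = 3.498343, 34.3144
X_MIN, X_MAX = 2.0, 4.0

print(predict_within_range(3.0, A, B, X_MIN, X_MAX))

try:
    predict_within_range(6.0, A, B, X_MIN, X_MAX)
except ValueError as err:
    print("Refused:", err)
```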
Related Topics:
Correlation, multiple linear regression, polynomial regression.
Simple Linear Regression Exercise
Data: A random sample of 14
elementary school students is selected, and each student is measured on a
creativity score (x) using a well-defined testing instrument and on a task
score (y) using a new instrument. The task score is the mean time taken to
perform several hand-eye coordination tasks. The data are:
STUDENT   CREATIVITY (X)   TASKS (Y)
AE        28               4.5
FR        35               3.9
HT        37               3.9
IO        50               6.1
DP        69               4.3
YR        84               8.8
QD        40               2.1
SW        65               5.5
DF        29               5.7
ER        42               3.0
RR        51               7.1
TG        45               7.3
EF        31               3.3
TJ        40               5.2
Answer these questions:
1. Calculate the regression
equation for this data, and enter it here:
Y =
2. Is the relationship between
tasks score and creativity score statistically significant?
3. What statistic do you base
your answer on?
4. Look at a scattergram of
Creativity by tasks. Does the relationship look linear?
5. Look at a plot of residuals.
Do the residuals look random?
6. Do you think the task score
does a good job of estimating the student's creativity score? Why?