Pearson's Correlation Coefficient
Definition: Measures the strength of the linear relationship between two variables.
Assumptions: Both variables (often called X and Y) are interval/ratio and approximately normally distributed, and their joint distribution is bivariate normal.
Characteristics: Pearson's Correlation Coefficient for a sample is usually denoted r (the corresponding population parameter is rho), and can take on values from -1.0 to 1.0, where -1.0 is a perfect negative (inverse) correlation, 0.0 is no linear correlation, and 1.0 is a perfect positive correlation.
Related statistics: R2 (called the coefficient of determination or r squared) can be interpreted as the proportion of the variance in Y that is accounted for by its linear relationship with X.
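As a minimal sketch of the definitions above, r can be computed from the sums of squares and cross-products of the deviations from the means, and R2 is simply its square. The function name `pearson_r` and the small data set are illustrative assumptions; they are not part of WINKS.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient for paired lists x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (not taken from any WINKS database)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r ** 2
print(f"r = {r:.4f}, R-squared = {r_squared:.4f}")  # r = 0.7746, R-squared = 0.6000
```

In this made-up example, about 60% of the variance in y is accounted for by the linear relationship with x.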
Tests: The statistical significance of r is tested using a t-test. The hypotheses for this test are:
H0: rho = 0
Ha: rho <> 0
A low p-value for this test (less than 0.05, for example) means that there is evidence to reject the null hypothesis in favor of the alternative hypothesis; that is, there is a statistically significant linear relationship between the two variables.
Note: This test is equivalent to the test of no slope in the simple linear regression procedure.
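The t statistic for this test has the standard form t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The sketch below applies it to the rounded values reported in the HP and WEIGHT example in this tutorial (r = 0.9172, n = 38). The function name `t_statistic_for_r` is an illustrative assumption, and the code does not reproduce how WINKS computes the statistic internally.

```python
import math

def t_statistic_for_r(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    if n < 3:
        raise ValueError("at least 3 pairs of observations are needed")
    if not -1.0 < r < 1.0:
        raise ValueError("t is undefined when |r| = 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)
    return t, df

# Rounded values from the HP / WEIGHT example in this tutorial
t, df = t_statistic_for_r(0.9172, 38)
print(f"t = {t:.2f} with {df} d.f.")  # t = 13.81 with 36 d.f.
```

The result agrees with the t of 13.81425 reported by WINKS to two decimal places; the small remaining difference is presumably due to r being rounded to four decimals here. The p-value is then obtained from the t distribution with n - 2 degrees of freedom.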
Location in WINKS: Pearson's correlation coefficient is found in the following locations:
1. Regression and Correlation - The Correlation procedure produces both Pearson and Spearman Correlation coefficients. The t-test for statistical significance of r is calculated. R2 is also reported.
2. Regression and Correlation - The Simple linear regression reports the Pearson correlation coefficient and the t-test. R2 is also reported.
3. Regression and Correlation - The Correlation Matrix procedure produces a matrix of correlations for a number of pairs of variables at a time, and includes the p-value for the test of significance of r.
Graphs: An important part of interpreting r is to observe a scatterplot of the data. Scatterplots are available from the Graphs option, as a part of Simple Linear Regression and in the Graphical Correlation Matrix option in Regression and Correlation.
Example: Use the Correlation procedure to calculate r for the two variables HP (horsepower) and WEIGHT in the WINKS "CAR" database. The results from WINKS (in part) are:
Variables used : HP and WEIGHT
Number of cases used: 38
Pearson's r (Correlations Coefficient) = 0.9172
R-Square = 0.8413
Test of hypothesis to determine significance of relationship:
H(null): Slope = 0 or H(null): r = 0 (Pearson's)
t = 13.81425 with 36 d.f. p < 0.001
(A low p-value implies that the slope does not = 0.)
Spearman's Rank Correlation Coefficient = 0.9071
(Spearman's) t = 12.93361 with 36 d.f. p < 0.001

A scatterplot of this data shows the positive correlation -- cars with higher horsepower tend to weigh more:

An example of writing up these results:
Narrative: "An evaluation was made of the linear relationship between horsepower and vehicle weight using Pearson's correlation."
Results: "An analysis using Pearson's correlation coefficient indicates a statistically significant linear relationship between horsepower and vehicle weight, r(36) = 0.92, p < 0.001. For these data, the mean (SD) for horsepower is 101.7 (26.4) and for weight 2.86 (0.71)."
Warning: There is a temptation to infer cause and effect when observing a correlation. However, the ability to assign causality depends on the creation of an experiment specifically designed to provide this kind of inference.
Related topics: Spearman's Correlation Coefficient is the non-parametric counterpart to r. See also simple linear regression, multiple regression, and polynomial regression.
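To illustrate the relationship between the two coefficients: Spearman's coefficient can be obtained by replacing each variable with its ranks (tied values share the average of their ranks) and then computing Pearson's r on those ranks. The following is a minimal sketch under that definition; the function names and the data are illustrative assumptions, not WINKS routines.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at sorted position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Sample Pearson correlation coefficient for paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data: y always increases with x, but not along a straight line
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]
print(f"Pearson r  = {pearson_r(x, y):.4f}")
print(f"Spearman's = {spearman_rho(x, y):.4f}")
```

Because y increases whenever x increases, the rank-based coefficient is exactly 1, while Pearson's r is below 1 because the relationship is not linear.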
Exercise - Correlation
At the beginning of an introductory engineering course, 10 students were given a pre-test to determine their initial mathematical ability. The following table lists each student's pre-test score and final grade in the class:
Student Number   Pre-Test   Course Grade
1                45         92
2                23         86
3                50         97
4                46         95
5                33         87
6                21         76
7                13         72
8                30         84
9                34         85
10               50         98
1. Calculate Pearson's Correlation Coefficient (r) on this data.
r =
2. What statistical test is used to determine if this value of r is statistically significant?
3. Is the correlation seen in this data statistically significant? Why?
4. Display a scatterplot of the data. Does the data appear linearly correlated? Do there seem to be any outlier values?
5. Suppose an 11th student were added to the data, with a pre-test score of 40 and a Course Grade of 70. How would this affect r?