Pearson's Correlation Coefficient
This is one in a series of tutorials using examples from WINKS SDA.


 

Definition: Measures the strength of the linear relationship between two variables.

Assumptions: Both variables (often called X and Y) are interval/ratio and approximately normally distributed, and their joint distribution is bivariate normal.

Characteristics: Pearson's Correlation Coefficient is usually denoted r when calculated from a sample (the corresponding population value is denoted by the Greek letter rho). It can take on values from -1.0 to 1.0, where -1.0 is a perfect negative (inverse) linear correlation, 0.0 is no linear correlation, and 1.0 is a perfect positive linear correlation.

Related statistics: R2 (called the coefficient of determination or r squared) can be interpreted as the proportion of variance in Y that is explained by its linear relationship with X.
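
Illustration: As a minimal sketch (not part of WINKS), r and R2 can be computed directly from their definitions in Python. The function name pearson_r and the small data set are made up for illustration only.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (not from any WINKS database).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
print(f"r = {r:.4f}, R-square = {r * r:.4f}")
```

Because r is built from deviations about the means, it does not change if either variable is shifted by a constant or rescaled by a positive constant.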

Tests: The statistical significance of r is tested using a t-test with n - 2 degrees of freedom, where n is the number of cases. The hypotheses for this test are:

H0: rho = 0
Ha: rho <> 0

A low p-value for this test (less than 0.05, for example) means that there is evidence to reject the null hypothesis in favor of the alternative hypothesis; that is, there is a statistically significant linear relationship between the two variables.

Note: This test is equivalent to the test of no slope in the simple linear regression procedure.
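
Illustration: The equivalence noted above can be checked numerically. The following is a rough sketch under stated assumptions: the function names and data are hypothetical, and only the t statistics are computed (the p-value would come from a t distribution with n - 2 degrees of freedom, which is not evaluated here).

```python
import math

def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def regression_slope_t(x, y):
    """Return (t for H0: slope = 0, r) from a least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    sse = syy - slope * sxy                     # residual sum of squares
    se_slope = math.sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    r = sxy / math.sqrt(sxx * syy)
    return slope / se_slope, r

# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.3, 2.9, 4.8, 5.1, 7.4, 7.0]

t_slope, r = regression_slope_t(x, y)
t_corr = t_from_r(r, len(x))
print(f"t from slope = {t_slope:.6f}, t from r = {t_corr:.6f}")
```

The two values agree up to floating-point rounding, which is why either test can be reported.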

Location in WINKS: Pearson's correlation coefficient is found in the following locations:

1. Regression and Correlation - The Correlation procedure produces both Pearson and Spearman Correlation coefficients. The t-test for statistical significance of r is calculated. R2 is also reported.

2. Regression and Correlation - The Simple Linear Regression procedure reports the Pearson correlation coefficient and the t-test. R2 is also reported.

3. Regression and Correlation - The Correlation Matrix procedure produces a matrix of correlations for a number of pairs of variables at a time, and includes the p-value for the test of significance of r.

Graphs: An important part of interpreting r is to observe a scatterplot of the data. Scatterplots are available from the Graphs option, as a part of Simple Linear Regression and in the Graphical Correlation Matrix option in Regression and Correlation.


Example: Use the Correlation procedure to calculate r for the two variables  HP (horsepower) and WEIGHT in the WINKS "CAR" database. The results from WINKS (in part) are:

Variables used : HP and WEIGHT
Number of cases used: 38
Pearson's r (Correlations Coefficient) = 0.9172
R-Square = 0.8413

Test of hypothesis to determine significance of relationship:

H(null): Slope = 0 or H(null): r = 0 (Pearson's)

t = 13.81425 with 36 d.f. p < 0.001

(A low p-value implies that the slope does not = 0.)

Spearman's Rank Correlation Coefficient = 0.9071
(Spearman's) t = 12.93361 with 36 d.f. p < 0.001

A scatterplot of these data shows the positive correlation: cars with higher horsepower tend to weigh more.
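
Illustration: The arithmetic in the output above can be checked from the reported values. This is a small sketch, assuming the usual formula t = r * sqrt(n - 2) / sqrt(1 - r^2) applies to both coefficients; the variable and function names are hypothetical. Because r is entered here rounded to four decimals, the results should match the reported t values only to about two decimal places.

```python
import math

# Values copied from the WINKS output above.
n = 38
r_pearson = 0.9172
r_spearman = 0.9071

def t_statistic(r, n):
    """t statistic for testing a correlation of zero (n - 2 d.f.)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

df = n - 2
r_square = r_pearson ** 2
t_pearson = t_statistic(r_pearson, n)
t_spearman = t_statistic(r_spearman, n)

print(f"d.f. = {df}")
print(f"R-Square = {r_square:.4f}")
print(f"Pearson t = {t_pearson:.2f}")
print(f"Spearman t = {t_spearman:.2f}")
```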

 

An example of writing up these results:

Narrative: "An evaluation was made of the linear relationship between horsepower and vehicle weight using Pearson's correlation."

Results: "An analysis using Pearson's correlation coefficient indicates a statistically significant linear relationship between horsepower and vehicle weight, r(36) = 0.92, p < 0.001. For these data, the mean (SD) for horsepower is 101.7 (26.4) and for weight 2.86 (0.71)."

Warning: There is a temptation to infer cause and effect when observing a correlation. However, the ability to assign causality depends on the creation of an experiment specifically designed to provide this kind of inference.

Related topics: Spearman's Correlation Coefficient is the non-parametric counterpart to r. See also simple linear regression, multiple regression, and polynomial regression.
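
Illustration: One common way to obtain Spearman's coefficient is to replace each variable by its ranks (tied values share the average rank) and then calculate Pearson's r on the ranks. The sketch below assumes that approach; it is not taken from WINKS, and the function names and data are made up.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_r(x, y):
    """Spearman's coefficient as Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up data: Y increases with X, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson r = {pearson_r(x, y):.4f}")
print(f"Spearman = {spearman_r(x, y):.4f}")
```

In this made-up case the ranks agree perfectly, so Spearman's coefficient is 1 while Pearson's r is below 1, which shows how the two can differ when a relationship is monotonic but not linear.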


Exercise - Correlation

At the beginning of an introductory engineering course, 10 students were given a pre-test to determine their initial mathematical ability. The following table lists each student's pre-test score and final grade in the class:

Student Number   Pre-Test   Course Grade
1                45         92
2                23         86
3                50         97
4                46         95
5                33         87
6                21         76
7                13         72
8                30         84
9                34         85
10               50         98

 

1. Calculate Pearson's Correlation Coefficient (r) on this data.

r =

2. What statistical test is used to determine if this value of r is statistically significant?

3. Is the correlation seen in this data statistically significant? Why?

4. Display a scatterplot of the data. Does the data appear linearly correlated? Do there seem to be any outlier values?

5. Suppose an 11th student were added to the data, with a pre-test score of 40 and a Course Grade of 70. How would this affect r?

 

"[WINKS is] brilliant to teach students statistics with the null and alternative hypotheses." - Dr. Jorgen Fabricius, Denmark

 


WE RECOMMEND
Get the WINKS SDA laminated 8.5 x 11" 3-hole punched "BeSmartNotes" notes -- very helpful!

WINKS BESMARTNOTES
Purchase BeSmartNotes Reference sheets at the TexaSoft on-line store.
BeSmartNotes are also available for SAS and SAS ODS.


Check this out...
Against All Odds VIDEOS
Now in VHS or DVD formats

Teaching Videos from Annenberg/PBS
$10 off!

Click here for info


 
 | Tutorial Index | TexaSoft Homepage | Send comments |
© Copyright TexaSoft, 2007