Pearson's Correlation Coefficient
This
is one in a series of tutorials using examples from WINKS SDA.
Definition: Measures
the strength of the linear relationship between two variables.
Assumptions: Both
variables (often called X and Y) are interval/ratio and approximately
normally distributed, and their joint distribution is bivariate normal.
Characteristics:
Pearson's Correlation Coefficient is usually denoted r for a sample (the
corresponding population parameter is rho), and can take values from -1.0
to 1.0, where -1.0 is a perfect negative (inverse) correlation, 0.0 is no
linear correlation, and 1.0 is a perfect positive correlation.
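As an illustration of the range described above, the following Python sketch computes r from its usual definition: the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the small data lists are made up for illustration and are not part of WINKS.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative data, not taken from any WINKS database.
x = [1, 2, 3, 4, 5]
print(pearson_r(x, [3, 5, 7, 9, 11]))  # y rises exactly linearly with x
print(pearson_r(x, [10, 8, 6, 4, 2]))  # y falls exactly linearly with x
```

The two calls correspond to the extremes of the scale: a perfect positive and a perfect negative linear relationship. The sketch does not guard against a variable with zero variance, for which r is undefined.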
Related statistics:
R2 (called the coefficient of determination or r squared) can be
interpreted as the proportion of variance in Y that is explained by its
linear relationship with X.
Tests: The
statistical significance of r is tested using a t-test with n - 2 degrees
of freedom, where n is the number of cases. The hypotheses for this test are:
H0: rho = 0
Ha: rho ≠ 0
A low p-value for this test
(less than 0.05, for example) means that there is evidence to reject the null
hypothesis in favor of the alternative hypothesis; that is, there is a
statistically significant linear relationship between the two variables.
Note: This test is
equivalent to the test of no slope in the simple linear regression
procedure.
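The equivalence in the note above can be checked numerically. The sketch below, using the standard textbook formulas rather than anything taken from WINKS itself, computes the t statistic two ways: from r, as r * sqrt(n - 2) / sqrt(1 - r^2), and from the least-squares slope divided by its standard error. The function name and the data are hypothetical.

```python
import math

def correlation_and_slope_t(x, y):
    """Return (t computed from r, t computed from the regression slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    # t-test of H0: rho = 0, with n - 2 degrees of freedom.
    r = sxy / math.sqrt(sxx * syy)
    t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

    # t-test of H0: slope = 0 in the simple linear regression of y on x.
    slope = sxy / sxx
    residual_ss = syy - slope * sxy
    slope_se = math.sqrt(residual_ss / (n - 2) / sxx)
    t_from_slope = slope / slope_se

    return t_from_r, t_from_slope

# Hypothetical illustrative data.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
t_r, t_slope = correlation_and_slope_t(x, y)
print(t_r, t_slope)  # the two values agree up to floating-point rounding
```

Because both statistics have n - 2 degrees of freedom and the same value, the two tests give the same p-value.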
Location in
WINKS:
Pearson's correlation coefficient is found in the following locations:
1. Regression and
Correlation - The Correlation procedure produces both Pearson and Spearman
Correlation coefficients. The t-test for statistical significance of r is
calculated. R2 is also reported.
2. Regression and
Correlation - The Simple Linear Regression procedure reports the Pearson
correlation coefficient and the t-test. R2 is also reported.
3. Regression and
Correlation - The Correlation Matrix procedure produces a matrix of
correlations for a number of pairs of variables at a time, and includes the
p-value for the test of significance of r.
Graphs: An important
part of interpreting r is to observe a scatterplot of the data. Scatterplots
are available from the Graphs option, as a part of Simple Linear Regression
and in the Graphical Correlation Matrix option in Regression and
Correlation.
Example: Use the
Correlation procedure to calculate r for the two variables HP
(horsepower) and WEIGHT in the WINKS "CAR" database. The results from WINKS (in part) are:
Variables used: HP and WEIGHT
Number of cases used: 38
Pearson's r (Correlations Coefficient) = 0.9172
R-Square = 0.8413
Test of hypothesis to determine significance of relationship:
H(null): Slope = 0 or H(null): r = 0
(Pearson's) t = 13.81425 with 36 d.f. p < 0.001
(A low p-value implies that the slope does not = 0.)
Spearman's Rank Correlation Coefficient = 0.9071
(Spearman's) t = 12.93361 with 36 d.f. p < 0.001
A scatterplot of these data shows the positive correlation: cars with
higher horsepower tend to weigh more.
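The figures in the WINKS output can be cross-checked by hand. The short Python sketch below assumes WINKS uses the standard formulas (R-Square as the square of r, and t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom); it starts from the rounded r of 0.9172, so the t value can differ slightly from the reported 13.81425 in the later decimals.

```python
import math

r = 0.9172  # Pearson's r as reported above (rounded to four decimals)
n = 38      # number of cases used

r_squared = r ** 2
df = n - 2
# Assumed standard t statistic for testing H0: rho = 0.
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

print(f"R-Square = {r_squared:.4f}")
print(f"t = {t:.2f} with {df} d.f.")
```

To the precision shown, the results agree with the reported R-Square of 0.8413 and the t statistic with 36 degrees of freedom.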
An example of writing up these results:
Narrative: "An evaluation was made of the linear relationship
between horsepower and vehicle weight using Pearson's correlation."
Results:
"An analysis using Pearson's correlation coefficient indicates a
statistically significant linear relationship between horsepower and vehicle
weight, r(36) = 0.92, p < 0.001. For these data, the mean (SD) for horsepower
is 101.7 (26.4) and for weight 2.86 (0.71)."
Warning: There is a
temptation to infer cause and effect when observing a correlation. However,
the ability to assign causality depends on the creation of an experiment
specifically designed to provide this kind of inference.
Related topics:
Spearman's Correlation Coefficient is the non-parametric counterpart to r.
See also simple linear regression, multiple regression, and polynomial
regression.
Exercise - Correlation
At the beginning of an
introductory engineering course, 10 students were given a pre-test to
determine their initial mathematical ability. The following table lists each
student's pre-test score and final grade in the class:
Student Number | Pre-Test | Course Grade
1              | 45       | 92
2              | 23       | 86
3              | 50       | 97
4              | 46       | 95
5              | 33       | 87
6              | 21       | 76
7              | 13       | 72
8              | 30       | 84
9              | 34       | 85
10             | 50       | 98
1. Calculate Pearson's
Correlation Coefficient (r) on this data.
r =
2. What statistical test is
used to determine if this value of r is statistically significant?
3. Is the correlation seen
in this data statistically significant? Why?
4. Display a scatterplot of
the data. Does the data appear linearly correlated? Do there seem to be any
outlier values?
5. Suppose an 11th student
were added to the data, with a pre-test score of 40 and a Course Grade of
70. How would this affect r?