Pearson's Correlation Coefficient
This is one in a series of tutorials using examples from WINKS SDA.
Definition: Pearson's correlation coefficient measures the strength of the linear relationship between two variables.
Assumptions: The variables (often called X and Y) are interval/ratio and approximately
normally distributed, and their joint distribution is bivariate normal.
Pearson's Correlation Coefficient is usually signified by r (the corresponding
population parameter is rho), and can take on values from -1.0 to 1.0, where -1.0 is a perfect
negative (inverse) correlation, 0.0 is no linear correlation, and 1.0 is a perfect
positive correlation.
R2 (called the coefficient of determination or r squared) can be
interpreted as the proportion of variance in Y that is accounted for by its linear relationship with X.
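The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the WINKS implementation: the function name pearson_r and the five data points are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def pearson_r(x, y):
    """Pearson's r: co-variation of x and y divided by the product of their spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only (not from the WINKS CAR database).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.4f}, R2 = {r * r:.4f}")  # r = 0.7746, R2 = 0.6000
```

For these made-up values, R2 = 0.60, so 60% of the variance in y is accounted for by its linear relationship with x.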
The statistical significance of r is tested using a t-test. The
hypotheses for this test are:
H0: rho = 0
Ha: rho <> 0
A low p-value for this test
(less than 0.05 for example) means that there is evidence to reject the null
hypothesis in favor of the alternative hypothesis, or that there is a
statistically significant relationship between the two variables.
Note: This test is
equivalent to the test of no slope (slope = 0) in simple linear regression.
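As a rough sketch of how the test statistic is obtained, the standard formula is t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The function name t_statistic_for_r is hypothetical; the inputs r = 0.9172 and n = 38 are the values reported in the HP and WEIGHT example in this tutorial.

```python
import math

def t_statistic_for_r(r, n):
    """t statistic for H0: rho = 0, referred to a t distribution with n - 2 d.f."""
    if n < 3:
        raise ValueError("need at least 3 cases")
    if abs(r) >= 1.0:
        raise ValueError("t is undefined when |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# r and n as reported for HP and WEIGHT in the WINKS CAR example.
r, n = 0.9172, 38
t = t_statistic_for_r(r, n)
print(f"t = {t:.2f} with {n - 2} d.f.")  # t = 13.81 with 36 d.f.
```

The result agrees with the WINKS output (t = 13.81425) to two decimal places; the small difference in later digits comes from using r rounded to four decimals. The p-value is then read from a t distribution with 36 degrees of freedom.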
In WINKS, Pearson's correlation coefficient is found in the following locations:
1. Regression and
Correlation - The Correlation procedure produces both Pearson and Spearman
Correlation coefficients. The t-test for statistical significance of r is
calculated. R2 is also reported.
2. Regression and
Correlation - The Simple linear regression reports the Pearson correlation
coefficient and the t-test. R2 is also reported.
3. Regression and
Correlation - The Correlation Matrix procedure produces a matrix of
correlations for a number of pairs of variables at a time, and includes the
p-value for the test of significance of r.
Graphs: An important
part of interpreting r is to observe a scatterplot of the data. Scatterplots
are available from the Graphs option, as a part of Simple Linear Regression,
and in the Graphical Correlation Matrix option in Regression and Correlation.
Example: Use the
Correlation procedure to calculate r for the two variables HP
(horsepower) and WEIGHT in the WINKS "CAR" database. The results from WINKS (in part) are:

Variables: HP and WEIGHT
Number of cases used: 38
Pearson's r (Correlation Coefficient) = 0.9172
R-Square = 0.8413
Test of hypothesis to determine significance of relationship:
H(null): Slope = 0 or H(null): r = 0 (Pearson's)
t = 13.81425 with 36 d.f.   p < 0.001
(A low p-value implies that the slope does not = 0.)
Spearman's Rank Correlation Coefficient = 0.9071
(Spearman's) t = 12.93361 with 36 d.f.   p < 0.001

A scatterplot of these data shows the positive correlation: cars with
higher horsepower tend to weigh more.
An example of writing up these results:
Narrative: "An evaluation was made of the linear relationship
between horsepower and vehicle weight using Pearson's correlation."
"An analysis using Pearson's correlation coefficient indicates a
statistically significant linear relationship between horsepower and vehicle
weight, r(36) = 0.92, p < 0.001. For these data, the mean (SD) for horsepower is
101.7 (26.4) and for weight 2.86 (0.71)."
Warning: There is a
temptation to infer cause and effect when observing a correlation. However,
the ability to assign causality depends on the creation of an experiment
specifically designed to provide this kind of inference.
Spearman's Correlation Coefficient is the non-parametric counterpart to r.
See also simple linear regression, multiple regression, and polynomial regression.
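To illustrate how Spearman's coefficient relates to r: it can be computed as Pearson's r applied to the ranks of the data, with tied values given the average of their ranks. The sketch below assumes that approach; the helper names (average_ranks, pearson_r, spearman_rho) and the data are hypothetical and are not taken from WINKS.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print(f"Pearson r  = {pearson_r(x, y):.4f}")
print(f"Spearman's = {spearman_rho(x, y):.4f}")
```

Because the ranks of x and y agree exactly here, Spearman's coefficient is 1.0 while Pearson's r is below 1.0, which shows that the rank-based measure responds to any monotonic relationship rather than only to a linear one.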
Exercise - Correlation
At the beginning of an
introductory engineering course, 10 students were given a pre-test to
determine their initial mathematical ability. The following table lists the
students' pre-test scores and final grades in the class:
1. Calculate Pearson's
Correlation Coefficient (r) on this data.
2. What statistical test is
used to determine if this value of r is statistically significant?
3. Is the correlation seen
in these data statistically significant? Why?
4. Display a scatterplot of
the data. Do the data appear linearly correlated? Do there seem to be any
outliers?
5. Suppose an 11th student
were added to the data, with a pre-test score of 40 and a Course Grade of
70. How would this affect r?