
These WINKS SDA statistics tutorials explain the use and interpretation of standard statistical analysis techniques for Medical, Pharmaceutical, Clinical Trials, Marketing or Scientific Research. The examples include how-to instructions for WINKS SDA Version 7.0 Software.

Crosstabulation Analysis (Chi-square)

Crosstabulations can be used to perform a chi-square test for independence or a chi-square test for homogeneity. A two-way table is constructed that displays the number of counts for each category. It must be possible to assume that the data observations are independent and that each data value can be counted in one and only one category. It is also assumed that the number of observations is fixed. SDA allows you to enter data for a two-way table from the keyboard or from a data set.

You can enter data for this analysis in one of three ways:

  1. Enter from data set (data are raw counts, one record per observation)
  2. Enter summarized data from keyboard
  3. Enter from a "count" data set (data are summarized counts)
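The difference between raw records (option 1) and summarized counts (options 2 and 3) can be sketched outside WINKS in a few lines of Python. This is only an illustration: the records below are made up, and the variable names are hypothetical.

```python
from collections import Counter

# Hypothetical raw data: one (rank, sex) record per observation.
raw_records = [(1, 2), (1, 1), (2, 2), (1, 2), (2, 1), (2, 2)]

# Summarized "count" form: one entry per (rank, sex) combination.
counts = Counter(raw_records)

# Arrange the counts as a two-way table (rows = rank, columns = sex).
row_levels = sorted({rank for rank, _ in raw_records})
col_levels = sorted({sex for _, sex in raw_records})
table = [[counts[(r, c)] for c in col_levels] for r in row_levels]

for level, row in zip(row_levels, table):
    print(level, row)
```

Either form leads to the same two-way table; the count form simply stores the tallies that would otherwise be computed from the raw records.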

Examples of each are provided here:

Example 1: Entering Data from a Data Set
(Analyze/Crosstabs, Frequencies, Chi-Square/ Crosstabulations/ Chi-Square)

If you choose to enter the information from a data set, you will be prompted to indicate what tables are to be calculated. Select one or more fields for the “Data field” (top right hand list box) and select one or more fields for the “By Var” field (bottom right hand side list box).

For example, open the data file SALARY.SDA (salaries of professors at a college) and produce a table of RANK by SEX.

Step 1: Select Analyze/Crosstabulations, Frequencies, Chi-Square/Crosstabulations, Chi-Square.

[Screenshot: Chi-Square Analysis in WINKS]

Step 2: For the variables to use, select Rank and Sex as shown here:

For all tables, you are prompted to specify what output options you want included in the output tables:

  • Frequencies
  • Total Percent
  • Row Percent
  • Column Percent
  • Expected Values
  • Chi-contribution
  • Residual
  • Standardized Residual
  • Adjusted Residual
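To make the options above concrete, here is a minimal Python sketch that computes each quantity for one cell of the RANK by SEX table used in this example. The residual formulas are the common textbook definitions (Pearson standardized residual and margin-adjusted residual); that WINKS uses exactly these definitions is an assumption, and the function name cell_stats is hypothetical.

```python
import math

# Observed counts from the RANK (rows) by SEX (columns) table in this tutorial.
observed = [[7, 20], [15, 33], [27, 42], [18, 13]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

def cell_stats(i, j):
    """Return the per-cell quantities for row i, column j (0-based)."""
    o = observed[i][j]
    e = row_totals[i] * col_totals[j] / n   # expected value under independence
    resid = o - e                           # raw residual
    return {
        "frequency": o,
        "total_pct": 100 * o / n,
        "row_pct": 100 * o / row_totals[i],
        "col_pct": 100 * o / col_totals[j],
        "expected": e,
        "chi_contribution": resid ** 2 / e,
        "residual": resid,
        # Pearson form; assumed, WINKS may define it differently.
        "std_residual": resid / math.sqrt(e),
        # Adjusted for the row and column margins; also an assumption.
        "adj_residual": resid / math.sqrt(
            e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)),
    }

# Rank 4, first SEX column (the cell with the largest departure).
for name, value in cell_stats(3, 0).items():
    print(f"{name:17s}{value:8.2f}")
```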

For this example, select the “Expected Values” option. Click OK and the following output is produced:

    RANKS(rows) by SEX (columns)

    FREQUENCY|
    EXPECTED |    1|    2|   TOTAL
   ------------------------
            1|    7|   20|     27
             | 10.3| 16.7|
   ------------------------
            2|   15|   33|     48
             | 18.4| 29.6|
   ------------------------
            3|   27|   42|     69
             | 26.4| 42.6|
   ------------------------
            4|   18|   13|     31
             | 11.9| 19.1|
   ------------------------
    TOTAL        67   108     175
               38.3  61.7   100.0

   Statistic                       DF      Value         p-value 
   -----------------------------------------------------------------
   Chi-Square                       3      7.905          0.049
   Phi Coefficient                          .213
   Cramer's V                               .213
   Contingency Coefficient                  .208

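The calculated chi-square value is 7.905 with 3 degrees of freedom. The p-value of 0.049 indicates marginal significance at the 0.05 level. In the highest rank (4), the SEX=1 column has more observations than expected (18 observed versus 11.9 expected) and the SEX=2 column has fewer (13 observed versus 19.1 expected). Which sex is over-represented depends on how SEX is coded in SALARY.SDA, so check whether 1 means Female or Male before interpreting the table. Either way, a departure of this size in the top rank might indicate a gender bias in how professors are promoted in rank.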
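The statistics in the output above can be reproduced by hand. The following Python sketch (standard library only, not WINKS code) computes the chi-square statistic, its degrees of freedom, a p-value, and the three association measures from the observed counts. The helper chi2_sf is a hypothetical name; it uses the closed-form upper-tail probability for integer degrees of freedom, so the p-value may differ from the WINKS output in the third decimal place.

```python
import math

observed = [[7, 20], [15, 33], [27, 42], [18, 13]]  # RANK by SEX counts

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        terms = [half ** k / math.factorial(k) for k in range(df // 2)]
        return math.exp(-half) * sum(terms)
    terms = [half ** (k - 0.5) / math.gamma(k + 0.5)
             for k in range(1, (df - 1) // 2 + 1)]
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells.
chi2 = 0.0
for row, r in zip(observed, row_totals):
    for o, c in zip(row, col_totals):
        e = r * c / n
        chi2 += (o - e) ** 2 / e

n_rows, n_cols = len(observed), len(observed[0])
df = (n_rows - 1) * (n_cols - 1)
p_value = chi2_sf(chi2, df)

phi = math.sqrt(chi2 / n)
cramers_v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
contingency = math.sqrt(chi2 / (chi2 + n))

print(f"Chi-Square = {chi2:.3f}, DF = {df}, p = {p_value:.3f}")
print(f"Phi = {phi:.3f}, Cramer's V = {cramers_v:.3f}, "
      f"Contingency Coefficient = {contingency:.3f}")
```

Because one dimension of this table has only two categories, Phi and Cramer's V coincide, which is why the output lists the same value for both.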

Question: What do the differences between the expected and observed counts in rank=1 indicate?

This is a test of independence. For this analysis, the contingency table looks at two categorical variables from a single sample of one population and tests whether the two variables are related in some way (e.g., are sex and rank related?). The hypotheses being tested are:

Ho: The variables are independent of each other. (There is no association between them).

Ha: The variables are not independent of each other.

If no association is found between them (p is greater than 0.05), there is no evidence of bias. A low p-value indicates rejection of the null hypothesis and, in this case, suggests bias.

WINKS SDA reports both the chi-square statistic and the p-value. If the expected value in one or more cells is less than 5, the chi-square test may not be valid. A warning to this effect appears on the screen if appropriate. In the case of a 2 by 2 table, Fisher's Exact Test and the chi-square with Yates' correction are also performed and results displayed. Note: Tables as large as 15 columns by 100 rows may be created by reading data from a data set. If there are more categories than this, SDA combines remaining categories in a group called REST. To prevent this, you might combine some groups.
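The expected-value check described above is easy to sketch in Python. The function name small_expected_cells and the default threshold argument are hypothetical; the rule of thumb (expected count below 5) is the one stated in this tutorial.

```python
def small_expected_cells(observed, threshold=5.0):
    """List (row, col, expected) for cells whose expected count is below threshold."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    flagged = []
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            if expected < threshold:
                flagged.append((i, j, round(expected, 2)))
    return flagged

# RANK by SEX table from Example 1: no cell is flagged.
print(small_expected_cells([[7, 20], [15, 33], [27, 42], [18, 13]]))

# The small 2 by 2 table used in Example 2 below: one cell is flagged.
print(small_expected_cells([[12, 7], [2, 8]]))
```

An empty list means the usual chi-square approximation is reasonable by this rule; a non-empty list corresponds to the situation in which WINKS displays its warning.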

Example 2: Entering Data from the keyboard
(Analyze/Crosstabs, Frequencies, Chi-Square/ Crosstabulations/ Chi-Square – From Keyboard)

Data for this example are observations of the number of beetles and bugs on the upper and lower sides of leaves (Zar, 1974, page 292).

2 by 2 Contingency Table Data

                  Beetles    Bugs
    Upper Leaf         12       7
    Lower Leaf          2       8

To perform this analysis, follow these steps:

Step 1: Select Analyze/Crosstabulations, Frequencies, Chi-Square/ Crosstabulations, Chi-Square - From Keyboard.

Step 2: You are first prompted to select output options. For this example, just select Frequencies. You are then prompted to indicate the size of the table. When asked for the number of rows and columns, type 2, 2 and press Enter. An empty table appears. Enter the count for each category into the appropriate cell, and choose Calculate. Preliminary results appear on the status bar at the bottom of the screen. You can perform calculations on several tables, and all results will appear in the viewer when you select Exit.

2-Way Contingency Table

    FREQUENCY|     |     |   TOTAL
   ------------------------
             |   12|    7|     19
   ------------------------
             |    2|    8|     10
   ------------------------
    TOTAL        14    15      29
               48.3  51.7   100.0

   WARNING - Some Expected values less than 5. Chi-Square may not be valid.

   Statistic                     DF    Value         p-value
   -------------------------------------------------------------
   Chi-Square                     1    4.887          0.028
   Yates' Chi-Square              1    3.312          0.069
   Fisher's Exact Test (one-tail)                     0.033
                       (two-tail)                     0.050
   Phi Coefficient                      .411
   Cramer's V                           .411
   Contingency Coefficient              .380
   Relative Risk                       3.158
   Odds Ratio                          6.857  95% C.I.=(1.124,41.829)
   Sensitivity                          .857
   Specificity                          .533

   Sensitivity, Specificity and RR calculations are based on a
   table where the cells are in the following pattern:

   TP   FP
   FN   TN

   T=True, F=False, P=Positive, N=Negative


Step 3: The calculated chi-square statistic is reported as 4.89 with a p-value of 0.028. The chi-square with Yates' correction is 3.31 with a p-value of 0.069, and Fisher's Exact Test (two-tailed) has a p-value of 0.050. Because one of the cells produces an expected value less than 5, SDA gives a warning that the chi-square analysis for these data may not be valid. Given this warning, it is best to rely on Fisher's Exact Test for making a decision.
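For readers who want to see where these three results come from, the following Python sketch (standard library only, Python 3.8 or later for math.comb) computes the chi-square statistic, the Yates-corrected statistic, and Fisher's Exact Test for the beetle and bug counts. The two-tailed rule used here, summing every table that is no more probable than the observed one, is a common convention; that WINKS uses the same rule is an assumption, and p-values may differ from the WINKS output in the third decimal place.

```python
import math

a, b, c, d = 12, 7, 2, 8            # upper/lower leaf by beetles/bugs
n = a + b + c + d
r1, r2, c1, c2 = a + b, c + d, a + c, b + d
margins = r1 * r2 * c1 * c2

def p_df1(x):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

chi2 = n * (a * d - b * c) ** 2 / margins
yates = n * max(abs(a * d - b * c) - n / 2, 0) ** 2 / margins

def hyper(k):
    """Probability of k in the top-left cell with all margins fixed."""
    return math.comb(c1, k) * math.comb(c2, r1 - k) / math.comb(n, r1)

lo, hi = max(0, r1 - c2), min(r1, c1)      # feasible top-left counts
probs = {k: hyper(k) for k in range(lo, hi + 1)}
p_obs = probs[a]

# One tail: tables at least as extreme in the observed (upper) direction.
one_tail = sum(p for k, p in probs.items() if k >= a)
# Two tails: every table no more probable than the observed one.
two_tail = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))

print(f"Chi-Square        {chi2:.3f}  p = {p_df1(chi2):.3f}")
print(f"Yates' Chi-Square {yates:.3f}  p = {p_df1(yates):.3f}")
print(f"Fisher one-tail p = {one_tail:.3f}, two-tail p = {two_tail:.3f}")
```

The one-tail sum above runs upward from the observed count because the observed top-left cell (12) is larger than its expected value; for a table that departs in the other direction, the lower tail would be summed instead.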

A low p-value indicates rejection of the null hypothesis. At the 0.05 significance level, the Fisher's Exact Test two-tailed p-value of 0.050 sits right at the threshold, so the evidence for rejecting the null hypothesis of independence, and concluding that leaf side and type of insect are related, is borderline. In this case it appears that beetles prefer the upper sides of leaves, while bugs are about evenly split in their preference. The Yates-corrected chi-square (p = 0.069) would not reject at this level, which underlines how marginal the decision is.
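The remaining measures in the Example 2 output (relative risk, odds ratio with its confidence interval, sensitivity, and specificity) follow directly from the TP/FP/FN/TN cell pattern. The Python sketch below uses the standard formulas; the confidence interval uses the log odds ratio with z = 1.96, which reproduces the interval printed above, although the tutorial does not state which method WINKS uses, so treat that as an assumption.

```python
import math

tp, fp = 12, 7   # first row of the 2 by 2 table
fn, tn = 2, 8    # second row of the 2 by 2 table

# Risk in row 1 divided by risk in row 2.
relative_risk = (tp / (tp + fp)) / (fn / (fn + tn))

# Cross-product ratio.
odds_ratio = (tp * tn) / (fp * fn)

# Approximate 95% interval on the log scale (assumed method).
se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
z = 1.96
ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print(f"Relative Risk {relative_risk:.3f}")
print(f"Odds Ratio    {odds_ratio:.3f}  95% C.I.=({ci_low:.3f},{ci_high:.3f})")
print(f"Sensitivity   {sensitivity:.3f}")
print(f"Specificity   {specificity:.3f}")
```

Sensitivity and specificity are only meaningful when the table really is a diagnostic-style layout (test result by true status); for the leaf data they are reported by the program but have no natural interpretation.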

Example 3: Entering Data from Count Data Set
(Analyze/Crosstabs, Frequencies, Chi-Square/ Crosstabulations/ Chi-Square – from count data)

The following data are from a classic study from 1909 reported by Karl Pearson that observed the association between drinking and criminal behavior.

Step 1: Open CROSSTAB_COUNTS.SAV and select Analyze, Crosstabs, Frequencies, Chi-Square, Crosstab/Chi-Square (From count data).

Step 2: Select CRIME as the row variable, DRINKER as the column and COUNT as count. Click Ok.


Step 3: From the Options menu, select Frequency and Standardized Residual. Click Ok. The following (partial) output is displayed (similar to Example 1):

    CRIME(C)(rows) by DRINKER(N) (columns)

    FREQUENCY|  YES|   NO|   TOTAL
   ------------------------------
    ARSON    |   50|   43|     93
   ------------------------------
    RAPE     |   88|   62|    150
   ------------------------------
    VIOLENCE |  155|  110|    265
   ------------------------------
    STEALING |  379|  300|    679
   ------------------------------
    COINING  |   18|   14|     32
   ------------------------------
    FRAUD    |   63|  144|    207
   ------------------------------
    TOTAL       753   673    1426
               52.8  47.2   100.0

   Typical hypotheses tested include:
   Test of independence: Ho: There is no association between the two variables.
   or Test of homogeneity:  Ho: Distribution of each category is same across population.

   Statistic                       DF      Value         p-value
   -----------------------------------------------------------------
   Chi-Square                       5     49.731         <0.001
   Likelihood Ratio Chi-Square      5     50.517         <0.001
   Phi Coefficient                          .187
   Cramer's V                               .187
   Contingency Coefficient                  .184

   Since p<=0.05, the null hypothesis (of independence or homogeneity)
   is rejected and multiple comparisons are performed.

continues...
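As a cross-check on the output above, this Python sketch (standard library only) recomputes the Pearson chi-square and the likelihood ratio chi-square from the count data, and lists a standardized residual for each cell. The residual uses the Pearson form (observed minus expected, divided by the square root of expected); that this matches the WINKS "Standardized Residual" option exactly is an assumption.

```python
import math

crimes = ["ARSON", "RAPE", "VIOLENCE", "STEALING", "COINING", "FRAUD"]
drinker = ["YES", "NO"]
counts = [[50, 43], [88, 62], [155, 110], [379, 300], [18, 14], [63, 144]]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
n = sum(row_totals)

chi2 = 0.0        # Pearson chi-square
g = 0.0           # likelihood ratio chi-square
std_resid = {}    # (crime, drinker) -> standardized residual

for crime, row, r in zip(crimes, counts, row_totals):
    for label, o, c in zip(drinker, row, col_totals):
        e = r * c / n
        chi2 += (o - e) ** 2 / e
        g += 2 * o * math.log(o / e)   # every count here is > 0
        std_resid[(crime, label)] = (o - e) / math.sqrt(e)

df = (len(counts) - 1) * (len(counts[0]) - 1)
print(f"Chi-Square                  {df}  {chi2:.3f}")
print(f"Likelihood Ratio Chi-Square {df}  {g:.3f}")

# Show the cells that depart most from independence first.
for key, value in sorted(std_resid.items(), key=lambda kv: -abs(kv[1])):
    print(f"{key[0]:9s}{key[1]:4s}{value:8.2f}")
```

Sorting by absolute residual makes it easy to see which crime categories drive the significant result; in these data the FRAUD row stands well apart from the others.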
         
Click the Graph option at the top of the screen to display the graph grouped by Drinker within Crime.
[Figure: bar chart grouped by Drinker within Crime]

End of tutorial

 

For more information, including an explanation of the options, go to the next tutorial.

 

© Copyright TexaSoft, 2007