January 19, 2019
###### Exploratory Data Analysis

## Chapter6/Chapter Guides.pdf

IBM SPSS for Introductory Statistics: Use and Interpretation, 5th Ed. (Morgan, Leech, Gloeckner & Barrett) Instructor’s Manual by Gene W. Gloeckner and Don Quick

Chapter 6 – Selecting and Interpreting Inferential Statistics Study Guide

OBJECTIVES: The student will be able to:

1. Identify the general design classification for difference research questions.
2. Explain the distinctions of within subjects design versus between groups design classifications.
3. Utilize a decision tree (Figure 6.1) to guide the selection of appropriate inferential statistics (Tables 6.1-6.4).
   a. Identify the research problem.
   b. Identify the variables and their level of measurement.
   c. Select appropriate inferential statistic.
4. Describe the relationship between difference and associational inferential statistics as a function of the general linear model.
5. Interpret the results of a statistical test.
   a. Determine whether to reject the null hypothesis.
   b. Determine the direction of the effect.
   c. Evaluate the size of the effect.
6. Discuss the relationship between statistical significance and practical significance.

TERMINOLOGY:

• variables
• levels of measurement
• descriptive statistics
• inferential statistics
   o difference inferential statistics
   o associational inferential statistics
• difference question designs
• between group designs
• within subjects design (repeated measures design)
• single factor designs
• between groups factorial designs
• mixed factorial designs
• basic (bivariate) statistics
   o phi or Cramer’s V
   o eta
   o Pearson product moment correlation
   o Kendall’s tau or Spearman rho
• complex statistics
   o factorial ANOVA
   o multiple regression
   o discriminant analysis
   o logistic regression
   o MANOVA
   o ANCOVA
• loglinear
• general linear model
• statistical significance
   o critical value
   o calculated value
   o statistically significant
   o Sig.
• practical significance
• effect size
   o r family of effect size measures
   o d family of effect size measures
• confidence intervals

ASSIGNMENTS: See additional activities and extra SPSS problems for assignment examples.

## Chapter6/Chapter Outlines.pdf


Chapter 6 – Selecting and Interpreting Statistics Chapter Outline

I. General Design Classifications for Difference Questions
   A. Labeling difference question designs.
      1. State overall type of design (e.g. between groups, within subjects).
      2. State the number of independent variables.
      3. State the number of levels within each independent variable.
   B. Between groups designs: each participant in the research is in only one condition or group.
   C. Within subjects or repeated measures designs
      1. Within subjects designs.
         a. Each participant in the research receives or experiences all of the conditions or levels of the independent variable.
         b. Also includes designs where participants are matched (e.g. parent & child; husband & wife).
      2. Repeated measures designs: each participant is assessed more than once (e.g. pretest & posttest).
   D. Single factor (one-way) design
      1. Has only one independent variable.
      2. Factor and way are other terms for group difference independent variables.
   E. Between groups factorial design
      1. When there is more than one group difference independent variable.
      2. Each level of one factor (independent variable) is possible in combination with each level of the other factor(s).
         a. The number of levels of each factor is used in the description of the design.
         b. For example: a design that includes gender (2 levels) and ethnicity (4 levels) would be labeled as a 2 x 4 between groups factorial design.
   F. Mixed factorial design: has both a between groups independent variable and a within subjects independent variable.
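The labeling convention described in A through F can be sketched in a few lines of Python. This is only an illustration: the function name `label_design`, the tuple layout, and the "within subjects factorial" wording for designs with several within subjects factors are assumptions made for the example, not terms defined in the outline.

```python
def label_design(factors):
    """Build a design label from a list of (name, n_levels, kind) tuples.

    kind is 'between' or 'within'. The names and the tuple layout are
    hypothetical; only the labeling convention comes from the outline.
    """
    if not factors:
        raise ValueError("at least one independent variable is required")
    kinds = {kind for _, _, kind in factors}
    if not kinds <= {"between", "within"}:
        raise ValueError("kind must be 'between' or 'within'")

    # One number per independent variable: its number of levels.
    levels = " x ".join(str(n_levels) for _, n_levels, _ in factors)

    if len(factors) == 1:
        kind = "between groups" if kinds == {"between"} else "within subjects"
        return f"single factor {kind} design with {levels} levels"
    if kinds == {"between"}:
        return f"{levels} between groups factorial design"
    if kinds == {"within"}:
        # Assumed wording; the outline does not name this case explicitly.
        return f"{levels} within subjects factorial design"
    return f"{levels} mixed factorial design"


print(label_design([("gender", 2, "between"), ("ethnicity", 4, "between")]))
print(label_design([("treatment", 2, "between"), ("time", 3, "within")]))
```

Each number in the label is the level count of one independent variable, so the number of factors equals the count of numbers in the label.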

   G. Describing designs
      1. Each independent variable is described using one number that represents the number of levels for that variable.
      2. Example: a 3 x 4 between groups factorial design would have 2 independent variables, one with 3 levels and one with 4 levels.
II. Selection of Inferential Statistics

   A. Types of research questions.
      1. Difference questions: compare groups and utilize difference inferential statistics (Tables 6.1 & 6.3).
         a. Basic (bivariate) statistics: one independent and one dependent variable.
         b. Complex statistics: three or more variables.
      2. Associational questions: examine the association or relationship between two or more variables and utilize associational inferential statistics (Tables 6.2 & 6.4).

   B. Using Tables 6.1 to 6.4 to Select Inferential Statistics
      1. Decide the number of variables.
         a. 2 variables = Tables 6.1 or 6.2
         b. 3 or more variables = Tables 6.3 or 6.4
      (Basic 2 variable Questions and Statistics)
      2. If there are 2 variables and the independent variable is nominal or has 2-4 levels, use Table 6.1.
         a. Identify number of levels of IV.
         b. Identify type of research design (between or within).
         c. Determine the type of measurement for the DV.
      3. If there are 2 variables and both are nominal, use the bottom rows of Table 6.1 (difference question) or Table 6.2 (associational question).
      4. If there are 2 variables and both variables have 5 or more ordered levels, use Table 6.2 (associational question).
      (Complex Questions and Statistics: 3 or more variables)
      5. If there is one normal/scale DV and the IVs (2 or more) are nominal or have a few ordered levels, use Table 6.3.
      6. If there is one normal/scale DV and the IVs/predictors (2 or more) are normal/scale or dichotomous, use the top row of Table 6.4 (complex associational question).
      7. If there is one DV that is nominal or dichotomous and there are 2 or more IVs, use the bottom row of Table 6.4 (or 6.3).
      8. If there are 2 or more normal (scale) DVs, use the general linear model to do MANOVA.
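The selection steps in B can be read as a small decision procedure. The sketch below is a hedged illustration only: the function name `select_table` and the category strings (`"levels_2_to_4"`, `"ordered_5plus"`, and so on) are hypothetical labels invented here, and the function returns only the table to consult, not a specific statistic.

```python
def select_table(n_variables, iv_kind, dv_kind,
                 question="difference", n_normal_dvs=1):
    """Return which table (or model) the outline's steps point to.

    iv_kind / dv_kind are hypothetical category strings:
      'nominal', 'dichotomous', 'levels_2_to_4', 'few_ordered_levels',
      'ordered_5plus', 'normal'.
    """
    # Step 8: two or more normal (scale) dependent variables.
    if n_normal_dvs >= 2:
        return "General linear model (MANOVA)"

    if n_variables == 2:
        # Step 3: both variables nominal.
        if iv_kind == "nominal" and dv_kind == "nominal":
            if question == "difference":
                return "Table 6.1, bottom rows"
            return "Table 6.2"
        # Step 4: both variables have 5 or more ordered levels.
        if iv_kind == "ordered_5plus" and dv_kind == "ordered_5plus":
            return "Table 6.2"
        # Step 2: IV is nominal or has 2-4 levels.
        if iv_kind in ("nominal", "levels_2_to_4"):
            return "Table 6.1"
        raise ValueError("combination not covered by the outline")

    if n_variables >= 3:
        # Step 7: nominal or dichotomous DV.
        if dv_kind in ("nominal", "dichotomous"):
            return "Table 6.4, bottom row (or Table 6.3)"
        # Step 5: normal DV with nominal or few-level IVs.
        if dv_kind == "normal" and iv_kind in ("nominal", "few_ordered_levels"):
            return "Table 6.3"
        # Step 6: normal DV with normal/scale or dichotomous predictors.
        if dv_kind == "normal" and iv_kind in ("normal", "dichotomous"):
            return "Table 6.4, top row"
        raise ValueError("combination not covered by the outline")

    raise ValueError("at least two variables are required")


print(select_table(2, "levels_2_to_4", "normal"))
print(select_table(3, "normal", "normal", question="associational"))
```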

III. The General Linear Model (GLM)
   A. Difference between associational and difference questions.
      1. Mathematically, the distinction between associational and difference questions is artificial.
      2. Both associational and difference inferential statistics serve the purpose of exploring and describing relationships (Fig. 6.2).
         a. The GLM subsumes both associational and difference inferential statistics.
         b. The relationship between the IV and DV can be expressed by an equation with weights for each of the independent/predictor variables plus an error term.
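One way to see why the distinction is artificial: a two-group comparison can be written as the equation Y = b0 + b1·X + error, where X is a 0/1 code for group membership. The sketch below uses made-up scores (not from the HSB data) and ordinary least squares written out by hand; the fitted weight b1 equals the difference between the two group means.

```python
from statistics import mean

# Hypothetical scores for two groups (illustrative data only).
group_a = [4.0, 5.0, 6.0, 7.0]
group_b = [6.0, 7.0, 8.0, 9.0]

# Dependent variable and a 0/1 dummy code for group membership.
y = group_a + group_b
x = [0.0] * len(group_a) + [1.0] * len(group_b)

# Least squares weights for Y = b0 + b1 * X + error.
mean_x, mean_y = mean(x), mean(y)
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x

mean_difference = mean(group_b) - mean(group_a)
print(f"b0 = {b0:.2f} (mean of group A), b1 = {b1:.2f}")
print(f"difference between group means = {mean_difference:.2f}")
```

The intercept b0 reproduces the mean of the group coded 0, so the "associational" equation carries exactly the information a "difference" comparison would.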

IV. Interpreting the Results of a Statistical Test
   A. Statistical Significance
      1. The SPSS calculated value is compared to a critical value found in a statistics table.
      2. Statistically significant: probability (p) is less than the preset alpha (usually .05).
         a. Sig.: SPSS label for the p value.
         b. Usually, if the calculated value (t, F, etc.) is large, the probability (p) is small.
         c. This Sig. is also the probability of committing a Type I error (rejecting the null hypothesis when it is actually true).
      3. The p and the null hypothesis
         a. p > .05: don’t reject the null hypothesis; results are not statistically significant and could be due to chance.
         b. p < .05: reject the null hypothesis; results are statistically significant and are not likely due to chance.
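The decision rule in item 3 amounts to a single comparison between the Sig. value and the preset alpha. A minimal sketch follows; the function name `decide` is hypothetical, and because the outline does not say what to do when p equals alpha exactly, treating that case as not significant is an assumption of this example.

```python
def decide(p, alpha=0.05):
    """Compare a p value (the SPSS 'Sig.' column) with a preset alpha."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p < alpha:
        return "reject the null hypothesis (statistically significant)"
    # Assumption: p equal to alpha is treated as not significant.
    return "do not reject the null hypothesis (not statistically significant)"


print(decide(0.048))
print(decide(0.20))
```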

   B. Practical Significance versus Statistical Significance
      1. Statistical significance does not necessarily ensure that the results have practical significance or are important.
      2. Effect size and/or confidence intervals must be examined to determine the strength of association.
         a. It is possible, with a large sample, to have a statistically significant result that is weak (small effect size).
         b. Small effect size may indicate that the difference or association is of little practical importance.
   C. Confidence Intervals
      1. An alternative to null hypothesis significance testing (NHST).
      2. May provide more practical information than NHST.
      3. Confidence intervals allow us to determine the interval that contains the population mean difference 95% of the time.
   D. Effect Size
      1. The strength of the relationship between the independent variable and the dependent variable.
      2. r family of effect size measures
         a. Pearson correlation coefficient (r): values range from –1.0 to +1.0 (0 = no effect and +1/–1 = maximum effect).
         b. Also includes other associational statistics such as rho, phi, eta and the multiple correlation (R).
         c. Can be reported as a squared or unsquared value.
            i. Squared values (r²) indicate the percent of variance of the DV that can be predicted from the IV, but they are small numbers that can give an underestimated impression of the strength or importance of the effect.
            ii. Unsquared values (r) give a larger value and are recommended for r family indices.
      3. d family of effect size measures
         a. Focuses on the magnitude of the difference rather than the strength of the association.
         b. Computed by subtracting the mean of the second group from the mean of the first group and dividing by the pooled standard deviation of both groups.
         c. All d family effect sizes express effect sizes in standard deviation units.
         d. Values usually vary from 0 to +/- 1.0, but can be > 1.0.
      4. Issues about effect size measures.
         a. d is not available on SPSS outputs but can be calculated from information provided on SPSS outputs.
         b. r and R are available on SPSS outputs.
         c. Most journals now expect authors to discuss the effect size as well as statistical significance.
   E. Interpreting Effect Sizes
      1. Table 6.5 provides guidelines for the interpretation of effect sizes based upon the effect sizes usually found in the behavioral sciences and education.
      2. The meanings of large, medium, and small are relative to findings in these disciplines. Suggested alternative terms:
         a. Minimal in place of small.
         b. Typical in place of medium.
         c. Substantial in place of large.
      3. Cohen’s (1988) examples of effect size:
         a. Small = “difficult to detect”.
         b. Medium = “visible to the naked eye”.
         c. Large = “grossly perceptible”.
      4. Effect size is not the same as practical significance.
         a. Effect size indicates the strength of the relationship and is more relevant to practical significance than statistical significance.
         b. However, effect size measures are not direct indexes of the importance of a finding.
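Because d does not appear on SPSS outputs, it has to be computed from the group means and standard deviations. The sketch below follows the description in D.3.b (first mean minus second mean, divided by the pooled standard deviation) and also shows the squared r value from D.2.c. The function names and the sample scores are hypothetical, and the pooled standard deviation uses the usual degrees-of-freedom weighting, which is an assumption since the outline does not spell out the formula.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """d = (mean1 - mean2) / pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


def variance_explained(r):
    """Squared correlation: share of DV variance predictable from the IV."""
    return r ** 2


# Hypothetical scores, for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(f"d = {d:.2f} standard deviation units")
print(f"r = .30 corresponds to r squared = {variance_explained(0.30):.2f}")
```

The second line of output illustrates point D.2.c.i: a correlation of .30 accounts for only 9% of the variance, which can look smaller than the unsquared value suggests.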

V. An Example of How to Select and Interpret Inferential Statistics
   A. Steps in the process:
      1. Identify the research problem.
      2. Identify the variables and their level of measurement.
      3. State the research question(s).
      4. Identify the type of each research question.
      5. Select an appropriate statistic.
      6. Interpret the results of the statistic.
         a. Determine if the results were statistically significant.
         b. If the results are statistically significant:
            i. Determine the direction of the effect.
            ii. Calculate and interpret the effect size.
            iii. If necessary, calculate and interpret confidence intervals to evaluate practical significance.
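For step 6.b.iii, a confidence interval for the difference between two group means can be approximated as follows. This is a rough sketch under stated assumptions: it uses a large-sample normal approximation from the Python standard library, whereas SPSS t test output bases its interval on the t distribution, which gives a wider interval for small samples. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(group1, group2, confidence=0.95):
    """Approximate CI for mean1 - mean2 (normal approximation, an assumption)."""
    diff = mean(group1) - mean(group2)
    # Standard error of the difference between two independent means.
    se = sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se


# Hypothetical scores, for illustration only.
low, high = mean_difference_ci([2, 4, 6, 8], [1, 3, 5, 7])
print(f"95% CI for the mean difference: [{low:.2f}, {high:.2f}]")
```

An interval that includes zero corresponds to a difference that would not be statistically significant at the matching alpha level, and the width of the interval shows how precisely the difference is estimated.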


   A. Methods Chapter
      1. Update methods to include descriptive statistics about the demographics of the participants.
      2. Add literature-based evidence about the reliability and validity of measures/instruments.
      3. Discuss whether statistical assumptions were violated.
   B. Results Chapter
      1. Include a description of the findings.
      2. Include figures and tables to illustrate the findings.
      3. Do not include a discussion of the findings in this section.
      4. Results of statistics should include:
         a. The value of the statistic (e.g. t = 2.05)
         b. The degrees of freedom (and N for chi-square)
         c. The p or Sig. value (e.g. p = .048)
   C. Discussion Chapter
      1. Put the findings in context with the research literature, theory, and the purposes of the study.
      2. Explain why the results turned out the way they did.
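The three pieces a results statement should include (statistic value, degrees of freedom, p value) can be assembled with a small formatter. The sketch below is an illustration only: the function names are hypothetical, the degrees of freedom in the usage line are made up, and dropping the leading zero of p simply mirrors the "p = .048" style used in the outline.

```python
def format_p(p):
    """Format a p value without the leading zero, e.g. 'p = .048'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_result(statistic, value, df, p):
    """Combine statistic, degrees of freedom, and p into one string."""
    return f"{statistic}({df}) = {value:.2f}, {format_p(p)}"


# df = 38 is a made-up value for illustration.
print(format_result("t", 2.05, 38, 0.048))
```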

## Chapter1/Chapter Guides.pdf


Chapter 1 – Variables, Research Problems and Questions Study Guide

OBJECTIVES: The student will be able to:

1. Explain the difference between research problems, research hypotheses, and research questions.
2. Provide definitions for different types of variables.
3. Identify the research question, research hypothesis, and types of variables used in a study.
4. Determine if a research question is a difference research question, an associational research question, or a descriptive research question.
5. Explain the relationship between the type of independent variable used in a study and the type of research question that can be answered (difference, associational, descriptive).
6. Discuss how the type of research question drives the selection of the type of statistic.
7. Utilize the SPSS data editor and variable view features to examine the variables of an existing dataset.

TERMINOLOGY:

• research problem
• variable
   o independent variable (active vs. attribute)
   o dependent variable
   o extraneous variable
• operational definition
• randomized experimental study
• quasi-experimental study
• non-experimental study
• factor
• grouping variable
• values (categories, levels, groups, samples)
• variable label
• value label
• research hypotheses
• research question
   o difference research question
   o associational research question
   o descriptive research question
   o complex research question (multivariate)

ASSIGNMENTS: See additional activities for assignment examples.

## Chapter1/Chapter Outlines.pdf


Chapter 1 – Variables, Research Problems and Questions Chapter Outline

I. Research Problems: Statement about the relationships between two or more variables.
II. Variables
   A. Definition: Characteristic of the participants or situation for a study
      1. Must be able to vary or have different values.
      2. Concepts that do not vary are called constants.
      3. Operational definition: defines a variable in terms of the operations or techniques used to measure it or make it happen.
   B. Independent Variables
      1. Active (manipulated) independent variable: can be given to participants within a specified period of time during the study.
         a. Are not necessarily manipulated by the experimenter.
         b. Treatment is always given after the study is planned.
         c. Randomized experimental & quasi-experimental studies must have active independent variables.
      2. Attribute (measured) independent variable: preexisting attributes of the persons or their ongoing environment.
         a. Cannot be manipulated by the experimenter.
         b. Non-experimental studies have attribute independent variables.
      3. Other terms for independent variables:
         a. factor
         b. grouping variable
      4. Inferences about cause and effect:
         a. Designs with active independent variables (experimental, quasi-experimental) can provide data to infer that the independent variable caused the change or difference in the dependent variable.
         b. Designs with attribute independent variables (non-experimental) should not be used to conclude a cause and effect relationship between the independent variable and the dependent variable.
      5. Values of the independent variable:
         a. Several options or values of a variable.
         b. Also called: categories, levels, groups, samples
   C. Dependent Variables
      1. Presumed outcome or criterion that is supposed to measure or assess the effect of the independent variable.
      2. Must have at least two values, but usually have many values that vary from high to low.


   D. Extraneous Variables
      1. Not of interest in a particular study but could influence the dependent variable.
      2. May also be called nuisance variables or covariates.
III. Research Hypothesis and Questions
   A. Research hypotheses: predictive statements about the relationship between variables.
   B. Research questions: similar to hypotheses, but do not make specific predictions.
      1. Difference research questions: compare two or more different groups on the dependent variable.
         a. Utilize difference inferential statistics (e.g. ANOVA or t-test).
      2. Associational research questions: find the strength of association between variables or make predictions about a variable from one or more variables.
         a. Utilize associational inferential statistics (e.g. correlation, multiple regression).
      3. Descriptive research questions: summarize or describe data without trying to generalize to a larger population of individuals.
      4. Complex research questions: involve more than two variables at a time.
         a. Utilize complex inferential statistics.
         b. May be called multivariate in some books.
IV. Sample Research Problem: The Modified High School and Beyond (HSB) Study
   A. Research Problem: What factors influence mathematics achievement?
      1. Identify primary dependent variable
      2. Identify independent and extraneous variables
      3. Identify types of independent variables (active vs. attribute)
      4. Identify the research approach (experimental, quasi-experimental, non-experimental)
   B. SPSS Variable View
      1. Columns give information on database variables
         a. Name shows the variable name
         b. Label gives a longer description of the variable
         c. Values shows assigned value labels
         d. Missing identifies whether certain values are designated by the user as missing values
   C. SPSS Data Editor
      1. Shows raw data
         a. Variables are across the top (identified by short variable names)
         b. Participants are listed down the left side.
   D. Research Questions for the Modified HSB Study
      1. Descriptive questions (Chapter 4)
      2. Examine continuous variables for normality (Chapter 4)
      3. Determine relationships between two categorical variables with crosstabulations (Chapter 8)
      4. Associational questions (Chapter 9)
      5. Complex associational questions (Chapter 9)
      6. Basic difference questions (Chapter 10)
      7. Complex difference questions (Chapter 11)


## Chapter2/Chapter Guides.pdf


Chapter 2 – Data Coding, Entry, and Checking Study Guide

OBJECTIVES: The student will be able to:

1. Describe the steps necessary to plan, pilot test and collect data.
2. Prepare data for entry into SPSS or a spreadsheet.
3. Define and label variables.
4. Display your SPSS codebook (dictionary).
5. Enter data into SPSS or a spreadsheet.
6. Check accuracy of data entry using SPSS Descriptive Statistics.

TERMINOLOGY:

• pilot study
• content validity
• coding
• dummy coding
• codebook
• define variables
• label variables
• missing values
• data entry form
• descriptive statistics

ASSIGNMENTS: See additional activities and extra SPSS problems for assignment examples.

## Chapter2/Chapter Outlines.pdf


Chapter 2 – Data Coding, Entry, and Checking Chapter Outline

I. Plan the Study, Pilot Test, and Collect Data

   A. Plan the study
      1. Identify the research problem, question and hypothesis.
      2. Plan the research design.
   B. Select or develop the instrument(s)
      1. Select from available instruments
      2. Modify available instruments
      3. Develop your own instruments
   C. Pilot test and refine the instruments
      1. Try out instrument on friends or colleagues
      2. Conduct pilot study with a similar sample population
      3. Utilize experts to check content validity of instrument items
   D. Collect the data
      1. Use methods appropriate for selected instruments
      2. Check raw data before entering
      3. Set “rules” for dealing with problematic responses.
II. Code Data for Data Entry
   A. Rules for data coding (assigning numbers to values or levels of a variable)
      1. All data should be numeric.
      2. Each variable for each case or participant must occupy the same column in the SPSS Data Editor.
      3. All values (codes) for a variable must be mutually exclusive.
      4. Each variable should be coded to obtain maximum information.
      5. For each participant, there must be a code or value for each variable.
      6. Apply any coding rules consistently for all participants.
      7. Use high numbers (value or code) for the “agree”, “good”, or “positive” end of a variable that is ordered.
   B. Make a coding form: to streamline data entry processes
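Several of the coding rules in Section II can be checked mechanically once the codebook is written down. The sketch below is a hedged illustration: the variables, the numeric codes, and the choice of 9 as a missing-value code are all hypothetical and do not come from the book's questionnaire.

```python
# Hypothetical codebook: every answer gets a numeric code (rule 1) and the
# "agree" end of the ordered item gets the high numbers (rule 7).
CODEBOOK = {
    "gender": {"male": 0, "female": 1},
    "likes_statistics": {
        "strongly disagree": 1, "disagree": 2, "neutral": 3,
        "agree": 4, "strongly agree": 5,
    },
}
MISSING_CODE = 9  # assumed code for blank or unusable answers (rule 5)


def check_codebook(codebook):
    """Rule 3: codes within a variable must be mutually exclusive."""
    for variable, codes in codebook.items():
        values = list(codes.values())
        if len(set(values)) != len(values):
            raise ValueError(f"duplicate codes for {variable}")
        if MISSING_CODE in values:
            raise ValueError(f"{variable} reuses the missing-value code")


def code_response(variable, answer):
    """Rule 6: apply the same mapping to every participant."""
    return CODEBOOK[variable].get(answer, MISSING_CODE)


check_codebook(CODEBOOK)
print(code_response("likes_statistics", "strongly agree"))
print(code_response("gender", None))
```

Keeping the mapping in one place is one way to make sure the same rule is applied to every questionnaire, including the problematic responses.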

III. Problem 2.1: Check the Completed Questionnaires (follow instructions in book)

IV. Problem 2.2: Define and Label the Variables (follow instructions in book)

V. Problem 2.3: Display Your Dictionary or Codebook (follow instructions in book)

VI. Problem 2.4: Enter Data (follow instructions in book)

VII. Problem 2.5: Run Descriptives and Check the Data (follow instructions in book)


## Chapter2/Extra SPSS Problems.pdf


Chapter 2 – Data Coding, Entry, and Checking

Using the college student data.sav file, from http://www.psypress.com/ibm-spss-intro-stats/ (“Data Sets (ZIPS)” button) or the Moodle Web site for this book, do the following problems. Print your outputs and circle the key parts for discussion.

1. Compute the N, minimum, maximum, and mean for all the variables in the college student data file. How many students have complete data? Identify any statistics on the output that are not meaningful. Explain.

There are 47 students who have complete data. This value is found by looking at the value given for the Valid N (listwise). The mean is not meaningful for nominal (unordered) variables. In this example, nominal variables include: gender of student, marital status, and age group. The mean for dichotomous variables coded as 0 and 1 can be meaningful because the mean is the proportion of students who answered with a “1” on their survey (multiply by 100 to read it as a percent). In this example, the following variables are dichotomous: does subject have children, television shows-sitcoms, television shows-movies, television shows-sports, television shows-news.
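Both points in this answer (listwise valid N, and the mean of a 0/1 variable as a proportion) can be checked on a tiny made-up data set. The rows and variable names below are hypothetical and are not the college student data; `None` stands in for a missing value.

```python
from statistics import mean

# Hypothetical rows, not the college student data file.
rows = [
    {"height": 67, "has_children": 1, "gender": 0},
    {"height": None, "has_children": 0, "gender": 1},
    {"height": 70, "has_children": 1, "gender": 1},
    {"height": 64, "has_children": 0, "gender": 0},
]

# Valid N (listwise): cases with no missing value on any variable.
valid_n_listwise = sum(
    all(value is not None for value in row.values()) for row in rows
)

# Mean of a 0/1 variable = proportion of cases coded 1.
children = [row["has_children"] for row in rows
            if row["has_children"] is not None]
proportion_with_children = mean(children)

print(f"Valid N (listwise) = {valid_n_listwise}")
print(f"Proportion with children = {proportion_with_children:.2f} "
      f"({proportion_with_children * 100:.1f}%)")
```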

2. What is the mean height of the students? What about the average height of the same sex parent? What percentage of students are males? What percentage have children?

• Mean height of the students = 67.30 inches
• Average height of same sex parent = 66.78 inches
• Percentage of students that are male = 52.0%
• Percentage of students with children = 52.0%