Front  Back  
What are the requirements for an experiment to be internally valid?

All variables/conditions are held constant except for the manipulation
of the independent variable; individual differences in participants are
balanced across conditions (usually through random assignment); the
three requirements for making causal inferences are met (the IV and DV
covary; there is a time-order relationship; and possible alternative
explanations have been ruled out).


What are the implications when an experiment is internally valid?

The researcher is likely to be able to claim that the independent variable caused the observed changes in the dependent variable.


When does an experiment involve a confounding?

When the independent variable of interest and a potential independent variable are allowed to covary.


What does it mean to hold conditions constant in an experiment? What is the purpose of holding conditions constant?

That all materials, instructions, and experiences within the experiment are identical across the different conditions/levels of the IV, with the exception of the manipulation of the IV. Holding conditions constant is a control technique: it rules out alternative explanations for the change in the dependent variable other than the manipulation of the independent variable.


How and why do researchers balance participants’ individual differences across conditions of an experiment?

Through random assignment, which works by generating groups of participants that are equivalent on average. Individual differences need to be balanced across experimental groups to rule out the possibility that some characteristic of the participants is actually responsible for the differences in the dependent variable rather than the manipulation of the independent variable.
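
As a rough illustration of random assignment, the sketch below shuffles a list of participant IDs and deals them out to conditions in turn. The function name `randomly_assign`, the participant IDs, and the condition labels are hypothetical, chosen only for this example.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)          # seed is only for a reproducible demo
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        # Dealing in turn keeps group sizes equal (or within one of each other).
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# Hypothetical example: 20 participant IDs, two conditions.
groups = randomly_assign(range(1, 21), ["treatment", "control"], seed=1)
print(groups)
```

Because chance alone decides who lands in which group, participant characteristics are expected to be equivalent on average across groups, not identical in any single sample.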


What is the logic behind establishing the independent variable as the cause of a change in the dependent variable in a random groups design experiment?

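We begin with comparable groups, treat them differently only in the level of the IV, and end with noncomparable groups (groups that differ on the DV). The difference in treatment is then the only plausible cause of the difference in outcome.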


What are intact groups, and why do they pose the potential problem of confounding?

Groups that exist prior to the experiment, without the researcher randomly assigning participants to groups.
Potentially confounding because the previously established groups may differ on several characteristics that could influence the DV, so any differences observed between the two groups could be the result of those established differences rather than the IV. 

What is subject loss and why does it threaten the internal validity of the experiment?

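Subject loss occurs when participants do not complete the experiment. The loss can be mechanical or selective.
Mechanical loss is random and does not threaten the experiment’s internal validity. Selective loss can, because the groups that remain may no longer be comparable.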

Which type of subject loss poses the most serious threat to internal validity?

Selective subject loss: the loss happens because of some characteristic of the participant that is related to the outcome of the study.


What are two procedures typically used to control for demand characteristics and experimenter effects in an experiment?

Placebo control groups and double-blind procedures.


What is effect size?

A measure of the strength of the relationship between the independent and dependent variables that is independent of sample size.
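
One commonly used effect size for comparing two group means is Cohen's d, the difference between the means divided by the pooled standard deviation. The card above does not name a specific measure, so treat the choice of Cohen's d as an assumption; the scores below are made up for illustration.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)   # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical scores on the dependent variable.
treatment = [12, 14, 15, 16, 18]
control = [10, 11, 13, 14, 12]
d = cohens_d(treatment, control)
print(round(d, 3))
```

The formula contains no term that grows with sample size, which is why the measure can be compared across studies with different numbers of participants.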


What is a meta-analysis?

The statistical tool that is used to analyze the results of several independent experiments.


What is the null hypothesis? How does null hypothesis testing work and what is the purpose?

That there is no relationship between the IV and DV.
Null hypothesis testing is used to determine whether the IV has a reliable effect on the DV. It asks: if the null hypothesis were true, what is the probability (the p-value) of obtaining the observed effect through chance or error variation alone?

What does it mean to say that the outcome of an experiment is statistically significant?

That the outcome has a small likelihood of occurring if the null hypothesis is true (i.e., if there were truly no relationship between the IV and DV in the population, there would be less than a 5% chance of observing a relationship this strong in the experiment by chance). The outcome is therefore unlikely to be simply the result of chance or error variation, and the relationship between the IV and DV is most likely not a false positive (the probability of a false positive is held at 5% when the null hypothesis is true).


What are Type I and Type II error?

Type I error = false positive; α or p. A Type I error occurs when the null hypothesis is really true and we claim the independent variable did have an effect on behavior.
Type II error = false negative; β. A Type II error occurs when the null hypothesis is not true and we fail to reject it, that is, we claim the independent variable did not have an effect on behavior.

What is a matched groups design? What is the preferred pretest (matching) task for a matched groups design?

When only a small sample is available, the researcher can match the participants on some variable (preferably the DV or one correlated with the DV) and randomly assign sets of matched participants to different conditions. The preferred matching task is the same task that will be used as the dependent variable.


What is a natural groups design?

An experiment in which the independent variable is an individual differences (subject) variable rather than a manipulated variable. The name distinguishes these experiments from those involving manipulated independent variables.


What is the most critical problem in drawing causal inferences based on a natural groups design?

Eliminating plausible alternative causes for the obtained relationship.


What are the reasons researchers choose to use repeated measures designs?

Require fewer subjects
More convenient and efficient
Needed when the experimental procedures require participants to compare two or more stimuli
Generally more sensitive than independent groups designs


What is counterbalancing and what is its purpose?

Averaging practice effects across conditions of a repeated measures design.


The need to balance practice effects in the repeated measures design is analogous to what in the independent groups design?

The need to balance individual differences.


What does the complete repeated measures design entail?

When each participant completes each condition of the experiment multiple times, so practice effects are balanced within each participant.


What is the additional step needed when analyzing the results in a complete repeated measures design?

To average each participant’s scores for each condition.


What does the incomplete repeated measures design entail?

Each participant completes each condition of the experiment only once, but practice effects are balanced across participants.


What is block randomization in the context of balancing practice effects across experimental conditions? When is block randomization of conditions most effective?

Generate a new random order of all the conditions each time the participant completes the conditions of the experiment (e.g., DACB, CDBA, ADCB).
Most effective when conditions are presented many times. 
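
A minimal sketch of block randomization, assuming four conditions labeled A through D: each block is a fresh random order that contains every condition exactly once. The function name `block_randomize` and the seed are hypothetical.

```python
import random

def block_randomize(conditions, n_blocks, seed=None):
    """Return n_blocks random orders, each containing every condition once."""
    rng = random.Random(seed)          # seed is only for a reproducible demo
    schedule = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)             # new random order for this block
        schedule.append("".join(block))
    return schedule

schedule = block_randomize("ABCD", n_blocks=3, seed=42)
print(schedule)
```

With only a few blocks, a condition can by chance land late in the order more often than the others; over many blocks the positions even out, which is why the technique works best when conditions are presented many times.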

What is the ABBA counterbalancing technique? When should it be used versus not be used?

Present one random sequence of conditions (e.g., DABC), then present the opposite of the sequence (CBAD).
Used when conditions are presented only a few times to each participant. Should not be used when there are nonlinear practice effects (participants change dramatically following the administration of a condition) or when anticipation effects can occur. 
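
The sketch below builds an ABBA sequence from a starting order and checks why it balances linear practice effects. It assumes, purely for illustration, that practice adds one unit of improvement per trial, so the trial position stands in for the amount of practice.

```python
def abba_sequence(order):
    """Present the order once, then again in reverse."""
    forward = list(order)
    return forward + forward[::-1]

def practice_totals(sequence):
    """Sum the trial positions at which each condition appears."""
    totals = {}
    for position, condition in enumerate(sequence):
        totals[condition] = totals.get(condition, 0) + position
    return totals

sequence = abba_sequence("DABC")
print(sequence)
print(practice_totals(sequence))
```

Every condition ends up with the same total because an early position is always paired with an equally late one. If practice effects are nonlinear, those pairs no longer cancel, which is the case the card warns about.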

What are anticipation effects?

When participants develop expectations about which condition will appear next in a sequence.


What is the general rule that applies to each of the three techniques that are used to balance practice effects in the incomplete repeated measures design?

Each condition (e.g., A, B, C) must appear in each ordinal position (1st, 2nd, 3rd) equally often.


What is the Latin Square technique for selecting orders in the incomplete repeated measures design? What are the advantages of the Latin Square technique versus the random starting order with rotation technique?

With the Latin Square technique, each condition appears in each ordinal position exactly once, and each condition precedes and follows each other condition exactly once. With the random starting order with rotation technique, each condition precedes and follows the same other conditions in every order.
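
To make the contrast concrete, the sketch below builds both sets of orders for four conditions and counts the distinct "immediately precedes" pairs in each. The Latin Square construction used here is one standard recipe that assumes an even number of conditions; the function names are hypothetical.

```python
def balanced_latin_square(conditions):
    """One standard construction; assumes an even number of conditions."""
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this construction assumes an even number of conditions")
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first_row = [0]
    for j in range(1, n):
        first_row.append((j + 1) // 2 if j % 2 == 1 else n - j // 2)
    # Each later row shifts every entry of the first row by one more step.
    return [
        [conditions[(index + shift) % n] for index in first_row]
        for shift in range(n)
    ]

def rotated_orders(start_order):
    """Random starting order with rotation: move the first item to the end."""
    order = list(start_order)
    return [order[shift:] + order[:shift] for shift in range(len(order))]

def immediate_pairs(orders):
    """List every (condition, next condition) pair across all orders."""
    return [(row[i], row[i + 1]) for row in orders for i in range(len(row) - 1)]

square = balanced_latin_square("ABCD")
rotation = rotated_orders("ABCD")
print(square, len(set(immediate_pairs(square))))
print(rotation, len(set(immediate_pairs(rotation))))
```

With four conditions there are 12 possible ordered pairs. The Latin Square orders use each of them once, while the rotated orders repeat the same 4 pairs, so any carryover from one specific condition to the next is spread out only in the Latin Square.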


Why are repeated measures designs more sensitive at detecting the effects of independent variables than random groups designs?

Because the systematic variation caused by individual differences is eliminated from the statistical analyses.


What is differential transfer?

When the effects of the manipulation for a condition persist or carry over into the subsequent conditions in a repeated measures design.


What information does the estimated standard error of the mean provide?

Information about how well the sample mean estimates the population mean.


How is a confidence interval similar to a margin of error?

They are essentially the same thing.


How is a 95% confidence interval for a population mean calculated?

Sample mean ± (t critical) (estimated standard error)
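
A worked sketch of the formula, using made-up scores from 10 participants. The critical value 2.262 is the two-tailed t for df = 9 at the .05 level, taken from a standard t table and passed in as a parameter; the function name `confidence_interval` is hypothetical.

```python
import math
import statistics

def confidence_interval(scores, t_critical):
    """Sample mean plus or minus t critical times the estimated standard error."""
    mean = statistics.mean(scores)
    # Estimated standard error of the mean: sample SD divided by sqrt(n).
    standard_error = statistics.stdev(scores) / math.sqrt(len(scores))
    margin = t_critical * standard_error
    return mean - margin, mean + margin

# Hypothetical scores, n = 10, so df = 9 and t critical = 2.262.
scores = [4, 6, 5, 7, 8, 5, 6, 7, 6, 6]
lower, upper = confidence_interval(scores, t_critical=2.262)
print(round(lower, 3), round(upper, 3))
```

A different sample size needs a different critical value, because t critical depends on the degrees of freedom (n - 1).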


Having calculated a 95% confidence interval for a single population mean, what can we claim?

That the odds are 95/100 that the obtained interval contains the population mean.
(Not that the population mean falls within the interval.) 

What do we do differently to construct a confidence interval for a comparison between two independent group means versus for a single sample mean?

We substitute the difference between two sample means for a single sample mean.


When interpreting confidence intervals, what does it mean when the intervals overlap?

We cannot be certain about the true population mean difference.


What about when the intervals do not overlap?

We have evidence that the population means differ.


What is the null hypothesis? (Hint: What assumption does the null hypothesis make?)

That the independent variable did not have an effect.


Null hypothesis significance testing uses the laws of probability to estimate the likelihood of what?

The likelihood of an outcome assuming that only chance factors caused the outcome.


What does it mean to say that a result or effect is statistically significant? What does it mean to say that a result or effect is not statistically significant?

Statistically significant: The IV likely had an effect on the DV; there is less than a 5% probability of obtaining the effect if only chance or error variation were operating.
Not statistically significant: We have to be cautious about concluding that the IV had more than a trivial effect on the DV. 

How are Type I error, p-value, and level of significance related?

Know how to apply these concepts to an example. The stated p-value is the level of significance, and indicates the probability of a Type I error. Size of p-value/level of significance = probability of a Type I error.


What factors are related to the power of a statistical test comparing two means?

Sample size
Level of significance
Effect size
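
The sketch below shows how the three factors move power, using a normal approximation to a two-tailed test comparing two independent group means. The approximation (in place of an exact t-based calculation), the function name `approximate_power`, and the example values are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a two-tailed, two-group comparison of means."""
    z = NormalDist()                               # standard normal
    z_critical = z.inv_cdf(1 - alpha / 2)
    # Expected test statistic when the true standardized difference is effect_size.
    expected_z = effect_size * math.sqrt(n_per_group / 2)
    return z.cdf(expected_z - z_critical) + z.cdf(-expected_z - z_critical)

print(round(approximate_power(0.5, 20), 2))              # smaller sample
print(round(approximate_power(0.5, 64), 2))              # larger sample
print(round(approximate_power(0.5, 64, alpha=0.01), 2))  # stricter alpha
print(round(approximate_power(0.8, 20), 2))              # larger effect
```

Power rises with a larger sample, a more lenient level of significance, or a larger effect. Of the three, sample size is usually the one a researcher can most readily change.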

What is the primary factor that researchers use to control the power of a statistical test?

Sample size


When is the t-test for independent groups the appropriate inferential test?

When comparing two independent means.


When is the repeated measures t-test the appropriate inferential test?

When each subject participates in both conditions of an experiment.


Why might “statistically significant” results not be of interest to the scientific community?

The study's methodology was poor.
The results have little external validity.
The treatment effect is too small to be of practical value.

The probability we use to define a statistically significant outcome is called the/is the same thing as the _______? (Hint: there are four correct answers to this fill-in-the-blank.)

p-value, alpha, Type I error, level of significance


What is the most common error associated with null hypothesis testing in psychological research?

Type II error (NOT Type I error, which we keep below .05/5%. We are willing to accept a much higher chance of Type II error, or false negative, than we are Type I error, or false positive. Thus, psychological research inherently involves more Type II than Type I error.)


How should and shouldn’t you report a "statistically significant" finding?

Should: As supporting the researcher's hypothesis.
Shouldn’t: As proving the researcher's hypothesis or that the null hypothesis is false. 

What are confidence intervals? How are they used to determine whether the population means likely differ for two conditions of an experiment?

A range in which we are (usually 95%) confident the true population mean lies.
We can be confident that the population means differ for two conditions of an experiment when the confidence intervals for the two sample means do not overlap. 

What does it mean to say that an experiment’s findings are reliable?

When the results of an experiment are likely to be replicated if the procedures are repeated, we are more confident that the findings are not a result of chance factors.


What is involved in a partial replication of an experiment? What is the purpose of a partial replication?

Replicating the study with minor changes to the sample, materials, or procedure.
Purpose is to establish the reliability and generalizability of the findings. 