Front / Back
Homoscedastic

Two sets of data have the same variance


Heteroscedastic

Two sets of data have different variances


F test tests for...

Equality of variance


Hypothesis of F test

H0: $\sigma_1^2 = \sigma_2^2$
HA: $\sigma_1^2 \neq \sigma_2^2$


Can F stat be skewed?

Yes. The F distribution is skewed to the right, unlike the symmetric normal and t distributions


The population with the larger sample variance is labeled #1

Because every value in the F table is greater than one, the table assumes the larger sample variance goes in the numerator and the smaller one in the denominator


If F falls in the white area between the two tails of the F distribution, and is smaller than the critical F (alpha)...

We fail to reject the null hypothesis, because the rejection zone is in the tail. This means the variances are close enough to equal to use the equal-variances version of the difference of means test.
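
A minimal sketch of the F test decision described on the cards above. The function name `f_statistic` and the two samples are made up for illustration, and the critical value is an assumption that should be checked against your own F table.

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    # Label the sample with the larger variance as #1 so that F >= 1.
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Illustrative data only (not from the course material).
x = [2, 4, 6, 8, 10]   # sample variance 10
y = [4, 5, 6, 7, 8]    # sample variance 2.5
f_stat, df1, df2 = f_statistic(y, x)   # argument order does not matter

# Assumed critical value: upper 5% point of F with (4, 4) degrees of freedom.
f_critical = 6.39
reject_h0 = f_stat > f_critical
print(f_stat, df1, df2, reject_h0)
```

Here F is below the critical value, so the sketch fails to reject the hypothesis of equal variances.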


Nonparametric tests

Less strict in their requirements (the data do not need to follow a normal distribution)


Why would we ever use parametric tests then?

Parametric tests have greater power relative to sample size (power is the ability to reject H0 when HA is true)


Advantages of nonparametric tests

Can be used with skewed, bimodal, nominal, and ordinal populations, and with smaller sample sizes. Easier to compute and more resistant to data errors (for example, the mean is influenced by a large error; the median is not)


Sign test for the median

In a non-normal population, the median is a robust measure of centrality; the mean is not. You do not need normality!


$\eta$ (an n with a long descending stem) is "eta", the population median. $\eta_0$ is the hypothesized median.

If $\eta_0$ is the true median, then half of the population values are larger than $\eta_0$


B

Number of observations with values greater than the hypothesized median. Under H0, B is binomial with pi = 0.5 (pi is the probability of a "success").


In Table A1, for the sign test for the median, always use pi = 0.5

OK
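
A rough sketch of the sign test for the median, computing the binomial probability directly instead of reading it from Table A1. The function name `sign_test_upper_p`, the sample, and the one-sided alternative (median greater than the hypothesized value) are assumptions for illustration. Dropping observations equal to the hypothesized median is a common convention, not something stated on the cards.

```python
from math import comb

def sign_test_upper_p(values, eta0):
    """Return (B, n, P(B >= b)) where B ~ Binomial(n, 0.5) under H0."""
    # Observations equal to eta0 carry no sign, so they are dropped (assumed convention).
    kept = [v for v in values if v != eta0]
    n = len(kept)
    # B counts the observations above the hypothesized median.
    b = sum(1 for v in kept if v > eta0)
    # Upper-tail binomial probability with pi = 0.5.
    p_value = sum(comb(n, k) for k in range(b, n + 1)) / 2 ** n
    return b, n, p_value

# Illustrative data only.
sample = [12, 15, 9, 20, 18, 22, 17, 25, 14, 19]
b, n, p = sign_test_upper_p(sample, eta0=13)
print(b, n, p)
```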


Mann-Whitney Test

Tests the hypothesis that two populations have the same median. Requires at least ordinal data; the test is based on ranks


Steps of the Mann-Whitney test

1. Pool the sample data for populations x and y and rank them; 1 is the lowest.
2. Convert the sample statistic S (the sum of ranks) to a z-score.
3. The p-value comes from the Standard Normal Probabilities table (z table).
4. Compare the p-value to alpha.


If there is a tie in ranks for the Mann-Whitney test

Assign each tied observation the average of the ranks they would occupy. For example, if two observations tie for ranks 15 and 16, give each of them 15.5
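
The ranking, tie handling, and z-score steps above can be sketched as follows. The function names and data are hypothetical. The mean and standard deviation of S used here are the standard large-sample formulas, which are not given on the cards, and no tie correction is applied to the variance.

```python
from math import sqrt
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2   # average of ranks i+1 through j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(x, y):
    """Return (S, z, two-sided p) using the normal approximation (assumed formulas)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    s = sum(ranks[:n1])                          # sum of ranks for sample x
    mean_s = n1 * (n1 + n2 + 1) / 2
    sd_s = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)    # no tie correction
    z = (s - mean_s) / sd_s
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, z, p

# Illustrative data only; the two 5s tie for ranks 2 and 3 and each get 2.5.
x = [3, 5, 8, 10]
y = [5, 12, 14, 15, 16]
s, z, p = rank_sum_z(x, y)
print(s, round(z, 3), round(p, 3))
```

With samples this small the normal approximation is rough; the sketch is only meant to show the mechanics.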


Two-Sample Number of Runs Test

Asks whether the distributions of two populations are the same or different. The sample statistic is R, the number of runs


Procedure of the two-sample number of runs test

Combine the two samples and rank them. A run is a continuous string of ranks from the same population. Count the number of runs. A large R (number of runs) means the distributions are about the same


Convert the normally distributed R to a z-score

We reject the null only when R is smaller than $\mu_R$, so the rejection zone is in the lower tail
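
A minimal sketch of the runs procedure and the z conversion. The function name and data are hypothetical, the sketch assumes no ties between the two samples, and the formulas for the mean and variance of R are the standard ones for the runs test rather than anything given on the cards.

```python
from math import sqrt
from statistics import NormalDist

def runs_test(x, y):
    """Return (R, mu_R, z, lower-tail p) for the two-sample runs test."""
    n1, n2 = len(x), len(y)
    # Tag each value with its sample, then sort the pooled data (assumes no ties).
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    labels = [label for _, label in pooled]
    # A new run starts every time the sample label changes.
    runs = 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    mu_r = 2 * n1 * n2 / (n1 + n2) + 1
    var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
             / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - mu_r) / sqrt(var_r)
    p = NormalDist().cdf(z)   # lower tail: only a small R counts against H0
    return runs, mu_r, z, p

# Illustrative data only: the samples do not overlap, so there are just 2 runs.
x = [1, 2, 3, 4, 5]
y = [6, 7, 8, 9, 10]
runs, mu_r, z, p = runs_test(x, y)
print(runs, mu_r, round(z, 3), round(p, 4))
```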


Goodness of Fit test

Tests whether a random variable follows a specific probability distribution, for example checking whether it is normally distributed


Chi-Square Test

Compare observed frequencies for each category to frequencies expected under hypothesized distribution


Hypothesis

f(Y) is the theoretical probability distribution
f(A) is the random variable's distribution
H0: f(A) = f(Y)
HA: f(A) $\neq$ f(Y)
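
A small sketch of the chi-square comparison of observed and expected frequencies. The die-roll counts are invented for illustration, and the critical value is an assumption to check against your own chi-square table.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative data only: 60 rolls of a die, hypothesized to be fair.
observed = [8, 12, 9, 11, 14, 6]
expected = [10] * 6              # 60 rolls / 6 faces under H0
chi2 = chi_square_statistic(observed, expected)

# Assumed critical value: chi-square with 6 - 1 = 5 df at alpha = 0.05.
chi2_critical = 11.07
reject_h0 = chi2 > chi2_critical
print(chi2, reject_h0)
```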


Kolmogorov-Smirnov Test

Compares a sample distribution to a theoretical distribution, like the chi-square test, but here the random variable is continuous, not nominal. It compares cumulative frequencies, and it assumes no sample information has been used to estimate the theoretical distribution


Kolmogorov-Smirnov Test hypotheses

H0: The sample is from the population F(x)
HA: The sample is not from the population F(x)


Which table for Kolmogorov-Smirnov?

Table A9
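
A hedged sketch of the Kolmogorov-Smirnov statistic, the largest gap between the sample's cumulative frequencies and the hypothesized F(x). The function names, the uniform F(x), and the data are assumptions for illustration; the resulting D would still be compared with the critical value from the table.

```python
def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF and the hypothesized cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Check the gap just after the step (i/n) and just before it ((i-1)/n).
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def uniform_cdf(x):
    """Hypothesized F(x): uniform on [0, 1], fully specified in advance."""
    return min(1.0, max(0.0, x))

# Illustrative data only.
sample = [0.1, 0.4, 0.7]
d = ks_statistic(sample, uniform_cdf)
print(d)
```

Any fully specified continuous CDF can be passed in place of `uniform_cdf`, for example `statistics.NormalDist(mu, sigma).cdf` with hypothesized (not estimated) parameters.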


Contingency Tables Hypothesis

H0: The variables are statistically independent
HA: The variables are statistically dependent


Steps

Sum the rows and columns, then compare the observed frequencies to the frequencies that would be expected if no relationship existed between the variables


Expected frequency Eij for each cell

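$E_{ij} = (R_i C_j)/n$, where $R_i$ is the total of row i, $C_j$ is the total of column j, and n is the overall total. Use Table A8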
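
A minimal sketch of the expected-frequency calculation for a contingency table, followed by the chi-square comparison of observed and expected counts. The function names and the 2x2 table are made up for illustration, and the critical value is an assumption to check against your own table.

```python
def expected_frequencies(table):
    """E_ij = (row i total * column j total) / n for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Illustrative data only.
table = [[20, 30],
         [30, 20]]
expected = expected_frequencies(table)
chi2 = chi_square(table)

# Assumed critical value: chi-square with (2-1)*(2-1) = 1 df at alpha = 0.05.
reject_h0 = chi2 > 3.841
print(expected, chi2, reject_h0)
```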


Goal of regression analysis and defining X and Y

Examine the influence of X on Y. X is the independent (explanatory) variable; Y is the dependent (response) variable


To fit the regression line

We find the line that minimizes the sum of the squared errors (least squares)
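
A short sketch of a least-squares fit for a line Y = a + bX. The function name, the symbols a and b, and the data are assumptions for illustration; the slope and intercept formulas are the usual closed-form least-squares solution.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)

# Sum of squared errors for the fitted line; no other line gives a smaller value.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
print(a, b, sse)
```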


Three ways to evaluate the goodness of fit on the line

1. Pearson's product-moment correlation coefficient, r
2. Coefficient of determination, $r^2$
3. Standard error of the estimate, $S_{yx}$


Standard error of the estimate

Measures the accuracy associated with predicting Y. Also called the RMSE; its symbol is $S_{yx}$


Residual

Difference between the actual and predicted value of Y: $e_i = Y_i - \hat{Y}_i$. From examining residuals, we find out the amount of error and the direction of error
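
A sketch that computes the residuals and the three goodness-of-fit measures for a fitted line. The function name and data are hypothetical. Dividing the sum of squared residuals by n - 2 for the standard error of the estimate is a common textbook convention and is an assumption here, since the cards do not give the formula.

```python
from math import sqrt

def regression_summary(xs, ys):
    """Return (residuals, r, r_squared, s_yx) for the least-squares line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual e_i = actual Y minus predicted Y; the sign gives the direction of error.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    r = sxy / sqrt(sxx * syy)                  # Pearson correlation coefficient
    r_squared = r ** 2                         # coefficient of determination
    sse = sum(e ** 2 for e in residuals)
    s_yx = sqrt(sse / (n - 2))                 # assumed divisor n - 2
    return residuals, r, r_squared, s_yx

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
residuals, r, r_squared, s_yx = regression_summary(xs, ys)
print([round(e, 2) for e in residuals], round(r, 4), round(r_squared, 4), round(s_yx, 4))
```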
