Lecture 5: Parametric tests (t tests)

Probability and Statistics, Part 2: Statistics


Tests of Significance

Reasoning of significance tests

Stating hypotheses

Test statistics

P-values

Statistical Significance

Test for a population mean

Two-sided significance tests and confidence intervals

P-values vs. fixed α


Tests of Significance

The two most common forms of statistical inference:

Confidence interval: appropriate when the goal is to estimate a population parameter.

Significance test: appropriate when the goal is to assess the evidence provided by the data in favor of some claim about the population.


Reasoning of significance
tests

Significance test

Formal procedure for comparing observed
data with a hypothesis whose truth we want
to assess.

Hypothesis: statement about the parameters in a
population or model

Results of the test are expressed in
terms of a probability that measures how
well the data and the hypothesis agree.


Stating hypotheses

Null hypothesis

The statement being tested in a test of
significance

The test of significance is designed to
assess the strength of the evidence
against it.

Statement of “no effect” or “no difference”

Abbreviated as H0


Stating hypotheses, cont.

Alternative hypothesis

The statement we hope or suspect is true instead of H0

Abbreviated as Ha

Hypotheses always refer to some population or model, not to a particular outcome.

The hypotheses will be stated in terms of population parameters.

The alternative hypothesis can be one-sided or two-sided.


Test Statistics

The test is based on a statistic that estimates the parameter that appears in the hypotheses. Usually this is the same estimate we would use in a confidence interval for the parameter. When H0 is true, we expect the estimate to take a value near the parameter value specified by H0.

Values of the estimate far from the parameter value specified by H0 give evidence against H0. The alternative hypothesis determines which directions count against H0.


Test Statistics, cont.

Test statistic

Measures compatibility between the
null hypothesis and the data.

Used for the probability calculation
that we need for our test of
significance.

Random variable with a distribution
that we know.


P-values

A test of significance finds the probability of getting an outcome as extreme as or more extreme than the actually observed outcome.

Extreme: far from what we would expect if the null hypothesis were true.


P-values, cont.

The probability, computed assuming that H0 is true, that the test statistic would take a value as extreme as or more extreme than that actually observed is called the P-value of the test.

The smaller the P-value, the stronger the evidence against H0 provided by the data.

Calculate it using the sampling distribution of the test statistic (for example, the standard normal distribution for the test statistic z).


Statistical Significance

If the P-value is as small as or smaller than α, we say that the data are statistically significant at level α.

Example: an alpha level of 0.01 means that we are insisting on evidence so strong that it would appear only 1% of the time if the null hypothesis were in fact true.

What does a p-value of 0.03 tell you?


Simulation study

Draw random samples from N(0,1): two groups of 16 individuals.

Collect the p-values obtained from the statistical test.

Number of experiments: N = 25000.
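A sketch of this simulation (assuming the test in question is a two-sample t test, and using numpy/scipy) shows why p-values under a true null are uniform:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N, n = 25000, 16          # number of experiments, individuals per group

# Two groups of 16, both drawn from N(0,1): the null hypothesis is true
group1 = rng.standard_normal((N, n))
group2 = rng.standard_normal((N, n))

# One two-sample t test per experiment (row-wise)
result = stats.ttest_ind(group1, group2, axis=1)
frac_significant = np.mean(result.pvalue < 0.05)

# p-values are uniform under the null, so about 5% of the 25000
# experiments (~1250) come out "significant" purely by chance.
print(frac_significant)
```

Because both groups come from the same population, every "significant" result here is a Type I error; the observed fraction hovers around the chosen α.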


[Figure: histogram of the 25000 collected p-values; N = 1262 of them fall below 0.05, close to the expected 5%]

Test of Significance Steps

You can assess the significance of the evidence provided by data against a null hypothesis by performing these steps:

1. State the null hypothesis H0 and the alternative hypothesis Ha. The test is designed to assess the strength of the evidence against H0. Ha is the statement that we will accept if the evidence enables us to reject H0.

2. Calculate the value of the test statistic on which the test will be based. This statistic usually measures how far the data are from H0.


Test of Significance Steps, cont.

3. Find the P-value for the observed data. This is the probability (calculated assuming that the null is true) that the statistic will weigh against the null at least as strongly as it does for these data.

4. State a conclusion. Choose a significance level α (how much evidence against the null you regard as decisive). If the P-value is less than or equal to α, you conclude that the alternative hypothesis is true; if it is greater than α, you conclude that the data do not provide sufficient evidence to reject the null hypothesis. Your conclusion is a sentence that summarizes what you have found by using a test of significance.


Tests for a population mean

We have an SRS of size n drawn from a normal population with unknown mean μ but known variance σ². We want to test the hypothesis that μ has a specified value, call it μ0.

Null hypothesis: H0: μ = μ0.

Test statistic:

t = z = (x̄ − μ0) / (σ/√n)

which has the standard normal distribution when the null hypothesis is true.


Tests for a population mean, cont.

Alternative hypothesis: if it is one-sided on the high side, Ha: μ > μ0.

P-value: the probability that a standard normal random variable Z takes a value at least as large as the observed z, i.e. P(Z ≥ z).

We can reason similarly for the other alternative hypotheses.
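As an illustrative sketch (the numbers here are hypothetical, not from the lecture), the z test for a population mean with known σ can be computed directly:

```python
import math
from scipy.stats import norm

# Hypothetical example: H0: mu = 120 vs Ha: mu > 120,
# known sigma = 6, sample of n = 36 with observed mean 121.5
mu0, sigma, n, xbar = 120.0, 6.0, 36, 121.5

z = (xbar - mu0) / (sigma / math.sqrt(n))   # test statistic, here z = 1.5
p = 1 - norm.cdf(z)                         # one-sided P-value, P(Z >= z)
print(z, p)                                 # p is about 0.067
```

At α = 0.05 this sample would not be significant; at α = 0.10 it would, which is exactly why reporting the P-value itself is more informative than a reject-or-not verdict.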


Two-sided significance tests and confidence intervals

A level α two-sided significance test rejects the hypothesis H0: μ = μ0 exactly when the value μ0 falls outside a level 1−α confidence interval for μ.


P-values vs. fixed α

The P-value is the smallest level α at which the data are significant.

Knowing the P-value allows us to assess significance at any level. It is more informative than a reject-or-not finding at a fixed significance level.

A value z* with a specified area to its right under the standard normal curve is called a critical value of the standard normal distribution.
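A critical value z* is obtained by inverting the normal cdf; a minimal sketch with scipy:

```python
from scipy.stats import norm

# z* with area alpha = 0.05 to its right under the standard normal curve
alpha = 0.05
z_star = norm.ppf(1 - alpha)   # inverse cdf ("percent point function")
print(round(z_star, 4))        # 1.6449

# A one-sided test statistic z is significant at level alpha
# exactly when z >= z*; the P-value is the smallest such alpha.
```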


Use and Abuse of Tests

Choosing a level of significance

Choose a level α in advance if you must make a decision. This does not make sense if you wish only to describe the strength of your evidence.

If you do use a fixed α significance test to make a decision, choose α by asking how much evidence is required to reject H0. This depends on how plausible the null hypothesis is.


Choosing a level of significance

If H0 represents an assumption that everyone in your field has believed for years, strong evidence (small α) will be needed to reject it.

The level of evidence required to reject H0 depends on the consequences of such a decision. If rejection is expensive, require strong evidence.


Choosing a level of significance

Better to report the P-value, which
allows each of us to decide
individually if the evidence is
sufficiently strong.

There is no sharp border between
“significant” and “insignificant,”
only increasingly strong evidence
as the P-value decreases.


What statistical significance doesn’t mean

Statistical significance is not the same as practical significance.

Example: the hypothesis of no correlation is rejected. This does not mean a strong association, only that there is strong evidence of some association.

A few outliers can produce highly significant results if you blindly apply common tests of significance. Outliers can also destroy the significance of otherwise convincing data.


Don’t ignore lack of
significance

If a researcher has good reason to
suspect that an effect is present and then
fails to find significant evidence of it, that
may be interesting news, perhaps more
interesting than if evidence in favor of the
effect at the 5% level had been found.

Keeping silent about negative results may
condemn other researchers to repeat the
attempt to find an effect that isn’t there.


Statistical inference is not
valid for all sets of data

Formal statistical inference cannot
correct basic flaws in the design.

Randomization in sampling or
experimentation ensures that the
laws of probability apply to our
tests of significance and
confidence intervals.


Beware of searching for
significance

The reasoning behind statistical
significance works well if you
decide what effect you are
seeking, design an experiment or
sample to search for it, and use a
test of significance to weigh the
evidence you get.


Beware of searching for
significance

Because a successful search for a
new scientific phenomenon often
ends with statistical significance, it
is all too tempting to make
significance itself the object of the
search.


Beware of searching for
significance

Once you have a hypothesis,
design a study to search
specifically for the effect you now
think is there. If the result of this
study is statistically significant,
you have real evidence.


Power and Inference as a
Decision

Examining the usefulness of a
confidence interval

Level of confidence: tells us how
reliable the method is in repeated use

Margin of error: tells us how sensitive
the method is, or how closely the
interval pins down the parameter
being estimated


Power and Inference as a
Decision

Examining the usefulness of fixed level α significance tests

Significance level: tells us how reliable the method is in repeated use.

Power: tells us the ability of the test to detect that the null hypothesis is false. It is measured by the probability that the test will reject the null hypothesis when an alternative is true. The higher this probability is, the more sensitive the test is.


Power

The probability that a fixed level α significance test will reject H0 when a particular alternative value of the parameter is true is called the power of the test to detect that alternative.


Calculating Power

State H0, Ha, the particular alternative we want to detect, and the significance level α.

Find the values of x̄ that will lead us to reject H0.

Calculate the probability of observing these values of x̄ when the alternative is true.
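For a one-sided z test with known σ these steps can be carried out in closed form. A sketch under assumed numbers (μ0 = 0, alternative μa = 0.5, σ = 1, n = 16, α = 0.05):

```python
import math
from scipy.stats import norm

mu0, mu_a, sigma, n, alpha = 0.0, 0.5, 1.0, 16, 0.05
se = sigma / math.sqrt(n)

# Step 2: we reject H0 when the sample mean exceeds this cutoff
cutoff = mu0 + norm.ppf(1 - alpha) * se

# Step 3: probability of exceeding the cutoff when mu = mu_a
power = 1 - norm.cdf((cutoff - mu_a) / se)
print(round(power, 3))   # about 0.639
```

Rerunning this with larger n, larger α, or a more distant μa raises the power, exactly as the following slides describe.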


Increasing the Power

Increase α. A 5% test of significance will have a greater chance of rejecting the null hypothesis than a 1% test because the strength of evidence required for rejection is less.

Consider a particular alternative that is farther away from μ0. Values of μ that are in Ha but lie close to the hypothesized value μ0 are harder to detect (lower power) than values of μ that are far from μ0.


Increasing the Power,
cont.

Increase the sample size. More data will provide more information about x̄, so we have a better chance of distinguishing values of μ.

Decrease σ. This has the same effect as increasing the sample size: more information about μ. Improving the measurement process and restricting attention to a subpopulation are two common ways to decrease σ.


Two types of error

In significance testing, we must accept one hypothesis and reject the other. We hope that our decision will be correct, but sometimes it will be wrong.

Two types of incorrect decisions:

If we reject H0 (accept Ha) when in fact H0 is true, this is a Type I error (related to the p-value).

If we accept H0 (reject Ha) when in fact Ha is true, this is a Type II error (related to the power).


The common practice of
testing hypotheses

State H0 and Ha just as in a test of significance.

Think of the problem as a decision problem, so that the probabilities of Type I and Type II errors are relevant.

Type I errors are considered more serious. So choose an α (significance level) and consider only tests with probability of Type I error no greater than α.


The common practice of
testing hypotheses

Among these tests, select one that
makes the probability of a Type II error
as small as possible (that is, power as
large as possible). If this probability is
too large, you will have to take a
larger sample to reduce the chance of
an error.


The one-sample one-tail t
test

Suppose that an SRS of size n is drawn from a population having unknown mean μ. To test the hypothesis H0: μ = μ0 based on this SRS, compute the one-sample t statistic

t = (x̄ − μ0) / (s/√n)

In terms of a random variable T having the t(n−1) distribution, the P-value for a test of H0 against Ha: μ > μ0 is P(T ≥ t), and against Ha: μ < μ0 it is P(T ≤ t).


The one-sample one-tail t
test – example 1

Suppose that an SRS of size n is drawn from a population having unknown mean μ:

[114, 123.3, 116.7, 129.0, 118, 124.6, 123.1, 117.4, 111, 121.7, 124.5, 130.5]

sample mean = 121.15

sample standard deviation = 5.89

To test the hypothesis that H0: μ = μ0 = 120, compute the one-sample t statistic:

t = (x̄ − μ0) / (s/√n) = (121.15 − 120) / (5.89/√12) = 0.6764


The one-sample one-tail t
test – example 1, cont.

In terms of a random variable T having the t(n−1) distribution, the P-value for a test of H0 against Ha: μ > μ0 is P(T ≥ t).


[Table: critical values of the t distribution by degrees of freedom, at level 0.95]


The one-sample one-tail t
test – example 1, cont.

From the table: if t1 = 0.6 then cdf(t1) = 0.71967, so p1 = 1 − 0.71967 = 0.28033; if t2 = 0.7 then cdf(t2) = 0.75077, so p2 = 1 − 0.75077 = 0.24923.

So p2 < p < p1.


The one-sample one-tail t
test – example 1, cont.

From the exact t distribution we get tcdf(0.6764, 11) = 0.743621, so p = 1 − 0.743621 = 0.256379.


The one-sample one-tail t
test – example 1, cont.

[Figure: t(11) density, area 0.7436 below t = 0.6764 and tail area p = 0.2564 above it]


The one-sample one-tail t
test – example 1, cont.

Final conclusion: we cannot reject H0 (that the population mean μ = 120) in favor of Ha; the P-value is p = 0.2564.
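This example can be checked with scipy's one-sample t test. Note the slides round s to 5.89, which gives t = 0.6764; the unrounded data give t ≈ 0.679 and an almost identical P-value:

```python
from scipy import stats

data = [114, 123.3, 116.7, 129.0, 118, 124.6, 123.1,
        117.4, 111, 121.7, 124.5, 130.5]

# H0: mu = 120 vs Ha: mu > 120 (one-tail, high side)
res = stats.ttest_1samp(data, popmean=120, alternative='greater')
print(res.statistic, res.pvalue)   # t ~ 0.68, p ~ 0.256
```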


Another example of the one-
sample one-tail t test

To test the hypothesis that H0: μ = 135 based on an SRS of size n, compute the one-sample t statistic

t = (x̄ − 135) / (s/√n)

In terms of a random variable T having the t(n−1) distribution, the P-value for a test of H0 against Ha: μ < μ0 is P(T ≤ t).


The one-sample one-tail t
test – example 2

[114, 123.3, 116.7, 129.0, 118, 124.6, 123.1, 117.4, 111, 121.7, 124.5, 130.5]

sample mean = 121.15

sample standard deviation = 5.89

To test the hypothesis that H0: μ = μ0 = 135, compute the one-sample t statistic:

t = (x̄ − μ0) / (s/√n) = (121.15 − 135) / (5.89/√12) = −8.1456


The one-sample one-tail t
test – example 2, cont.

From the exact t distribution we get P(T ≤ t) = tcdf(−8.1456, 11) = 2.7e-6, so p = 0.0000027 < 0.000005.


The one-sample one-tail t
test – example 2, cont.

[Figure: t(11) density with tail area p = 2.7e-6 below t = −8.1456]


The one-sample one-tail t
test – example 2, cont.

Final conclusion: we reject H0 that the population mean μ = 135 and accept Ha that μ < μ0, at p < 0.000005.


The one-sample two-tails t
test

To test the hypothesis that H0: μ = 115 based on an SRS of size n, compute the one-sample t statistic

t = (x̄ − μ0) / (s/√n)

In terms of a random variable T having the t(n−1) distribution, the P-value for a test of H0 against Ha: μ ≠ μ0 is P(|T| ≥ |t|).


The one-sample two-tails
t test – example 3

[114, 123.3, 116.7, 129.0, 118, 124.6, 123.1, 117.4, 111, 121.7, 124.5, 130.5]

sample mean = 121.15

sample standard deviation = 5.89

To test the hypothesis that H0: μ = μ0 = 115, compute the one-sample t statistic:

t = (x̄ − μ0) / (s/√n) = (121.15 − 115) / (5.89/√12) = 3.6170


The one-sample two-tails
t test – example 3, cont.

• Since the critical value is t0.05 = 2.2010 and our observed t is bigger than it, we reject the null hypothesis at α = 0.05.

• From the t distribution we get P(|T| ≥ |t|) = 2·(1 − tcdf(3.6170, 11)) = 0.0040, so the exact p = 0.0040.


The one-sample two-tails
t test – example 3, cont.

[Figure: t(11) density with two tail areas summing to p = 0.0040 beyond −t = −3.6170 and t = 3.6170]


The one-sample two-tails
t test – example 3, cont.

Final conclusion: we reject H0 that the population mean μ = 115 and accept Ha that μ ≠ μ0, at p = 0.0040.
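The two-sided version is scipy's default; again the slides' rounded s = 5.89 gives t = 3.6170, while the raw data give t ≈ 3.63:

```python
from scipy import stats

data = [114, 123.3, 116.7, 129.0, 118, 124.6, 123.1,
        117.4, 111, 121.7, 124.5, 130.5]

# H0: mu = 115 vs Ha: mu != 115 (two-tails)
res = stats.ttest_1samp(data, popmean=115)   # two-sided by default
print(res.statistic, res.pvalue)             # t ~ 3.63, p ~ 0.004
```

Consistent with the duality stated earlier, a 95% confidence interval for μ from these data excludes 115, and the two-sided test rejects H0 at α = 0.05.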


Matched pairs t
procedures

In a matched pairs study, subjects
are matched in pairs and the
outcomes are compared within
each matched pair.

Example: before and after studies


Key points to remember
concerning matched pairs

A matched pair analysis is
needed when there are two
measurements or observations on
each individual and we want to
examine the change from the first
to the second. Typically the
observations are “before” and
“after” measures in some sense.


Key points to remember
concerning matched pairs

For each individual, subtract the
“before” measure from the “after”
measure.

Analyze the difference using the
one-sample confidence interval
and significance-testing
procedures.
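With hypothetical before/after data, the matched pairs procedure is exactly a one-sample t test on the differences; scipy's paired test does the subtraction for you:

```python
from scipy import stats

# Hypothetical before/after measurements on 8 subjects
before = [135, 142, 128, 150, 139, 144, 131, 147]
after  = [130, 140, 129, 143, 135, 141, 128, 144]

diffs = [a - b for a, b in zip(after, before)]

# Equivalent analyses: paired t test, or one-sample t test on differences
paired = stats.ttest_rel(after, before)
one_sample = stats.ttest_1samp(diffs, popmean=0)
print(paired.pvalue, one_sample.pvalue)   # identical
```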


Robustness of the t
procedures

A statistical inference procedure is
called robust if the probability
calculations required are insensitive to
violations of the assumptions made.

The t procedures are quite robust
against nonnormality of the population
except in the case of outliers or strong
skewness.


Robustness of the t
procedures

Larger samples improve the
accuracy of the P-values and critical
values from the t distributions when
the population is not normal.

A normal quantile plot or boxplot to
check for skewness and outliers is
an important preliminary to the use
of t procedures for small samples.


Practical Guidelines for
inference on a single
mean

Sample size less than 15: use t procedures if the data are close to normal. If the data are clearly nonnormal or if outliers are present, do not use t.

Sample size at least 15: the t procedures can be used except in the presence of outliers or strong skewness.

Large samples: the t procedures can be used even for clearly skewed distributions when the sample size is large, roughly n > 40.


Comparing Two Means

Two-sample problems

The goal of inference is to compare
the responses in two groups.

Each group is considered to be a
sample from a distinct population.

The responses in each group are
independent of those in the other
group.


Notation

For each population i = 1, 2: population mean μi and standard deviation σi; sample size ni, sample mean x̄i, and sample standard deviation si.


The two-sample z statistic

The natural estimator of the difference μ1 − μ2 is the difference between the sample means, x̄1 − x̄2.

To do inference on this statistic, we must know its sampling distribution.

If the two population distributions are both normal, then the distribution of x̄1 − x̄2 is also normal:

x̄1 − x̄2 ~ N( μ1 − μ2, √(σ1²/n1 + σ2²/n2) )

The two-sample z statistic,
cont.

Suppose that x̄1 is the mean of an SRS of size n1 drawn from a N(μ1, σ1) population and that x̄2 is the mean of an SRS of size n2 drawn from a N(μ2, σ2) population. Then the two-sample z statistic

z = ( (x̄1 − x̄2) − (μ1 − μ2) ) / √( σ1²/n1 + σ2²/n2 )

has the standard N(0,1) sampling distribution.


The two-sample t
procedure

This test assumes that the variances of the populations from which the two samples were taken are identical.


The two-sample t
procedures

If the population standard deviations are not known, we estimate them by the sample standard deviations.

Use the proper formula depending on the sample sizes n1 and n2.


The two-sample t
procedures, cont.

When the sample sizes are unequal and n1 or n2 or both are small (<30), use the pooled statistic:

t = (x̄1 − x̄2) / √[ ((n1−1)·s1² + (n2−1)·s2²) / (n1 + n2 − 2) · (1/n1 + 1/n2) ]

This statistic has a t distribution with k = n1 + n2 − 2 degrees of freedom.


The two-sample t
procedures, cont.

When n1 and n2 are equal (regardless of size), use:

t = (x̄1 − x̄2) / √( (s1² + s2²) / n )

This statistic has a t distribution with k = 2(n − 1) degrees of freedom.


The two-sample t
procedures, cont.

When n1 and n2 are unequal but large (≥30), use:

t = (x̄1 − x̄2) / √( s1²/n1 + s2²/n2 )

This statistic has a t distribution with k = n1 + n2 − 2 degrees of freedom.


Two-sample t significance
test

Suppose that an SRS of size n1 is drawn from a normal population with unknown mean μ1 and that an independent SRS of size n2 is drawn from another normal population with unknown mean μ2.

We assume the standard deviations of both populations are identical.


Two-sample t significance
test, cont.

To test the hypothesis H0: μ1 = μ2 versus Ha: μ1 ≠ μ2, compute the two-sample t statistic and use P-values or critical values for the t(k) distribution, with the proper degrees of freedom.


Two-sample t
significance test –
example

Suppose that an SRS of size n1 is drawn from a population having unknown mean μ1:

[114, 123.3, 116.7, 129.0, 118, 124.6, 123.1, 117.4, 111, 121.7, 124.5, 130.5]

sample mean = 121.15

sample standard deviation = 5.89

Another SRS of size n2 is drawn from a population having unknown mean μ2:

[120, 125.5, 126, 125.5, 128.5, 125, 128, 116, 122, 121, 117, 125]

sample mean = 123.29

sample standard deviation = 4.08


Two-sample t
significance test –
example, cont.

Since n1 = n2 = 12, we use the equal-sample-size formula; μ1 − μ2 = 0 by assumption under H0, so

|t| = |x̄1 − x̄2| / √( (s1² + s2²) / n ) = 1.0383

p = 0.3104

Conclusion: we cannot reject the hypothesis that the mean values of both populations are the same at α = 0.05.
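scipy's two-sample test with equal variances assumed (matching this slide's pooled procedure) reproduces these numbers; the sign of t depends only on the order of the groups:

```python
from scipy import stats

sample1 = [114, 123.3, 116.7, 129.0, 118, 124.6, 123.1,
           117.4, 111, 121.7, 124.5, 130.5]
sample2 = [120, 125.5, 126, 125.5, 128.5, 125, 128, 116,
           122, 121, 117, 125]

# H0: mu1 = mu2 vs Ha: mu1 != mu2, assuming equal population variances
res = stats.ttest_ind(sample1, sample2, equal_var=True)
print(abs(res.statistic), res.pvalue)   # |t| ~ 1.038, p ~ 0.31
```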


Two-sample Confidence
Interval

Suppose that an SRS of size n1 is drawn from a normal population with unknown mean μ1 and that an independent SRS of size n2 is drawn from another normal population with unknown mean μ2. For large n1 and n2 the confidence interval for μ1 − μ2 is given by

( (x̄1 − x̄2) − t*·√(s1²/n1 + s2²/n2), (x̄1 − x̄2) + t*·√(s1²/n1 + s2²/n2) )

where the t statistic has n1 + n2 − 2 degrees of freedom.
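A sketch of this confidence interval, applied to the two samples from the earlier example (t* taken from the t distribution with n1 + n2 − 2 = 22 degrees of freedom):

```python
import math
import statistics
from scipy.stats import t

sample1 = [114, 123.3, 116.7, 129.0, 118, 124.6, 123.1,
           117.4, 111, 121.7, 124.5, 130.5]
sample2 = [120, 125.5, 126, 125.5, 128.5, 125, 128, 116,
           122, 121, 117, 125]

n1, n2 = len(sample1), len(sample2)
diff = statistics.mean(sample1) - statistics.mean(sample2)
se = math.sqrt(statistics.variance(sample1) / n1
               + statistics.variance(sample2) / n2)

t_star = t.ppf(0.975, df=n1 + n2 - 2)   # 95% two-sided critical value
lo, hi = diff - t_star * se, diff + t_star * se
print(lo, hi)   # the interval contains 0
```

The interval contains 0, matching the failure to reject H0: μ1 = μ2 at α = 0.05 in the example above.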


Robustness of the two-
sample procedures

The two-sample t procedures are more robust than the one-sample t methods.

When the sizes of the two samples are equal and the distributions of the two populations being compared have similar shapes, we get good accuracy.

When the two population distributions have different shapes, larger samples are needed.

In planning a two-sample study, you should usually choose equal sample sizes.


Bartlett’s test for
homogeneity of variances

A test of the null hypothesis that two normal populations, represented by two samples, have the same variance.

The alternative hypothesis is that the two variances are unequal.
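scipy implements this test directly; here it is applied to the two samples from the earlier two-sample example:

```python
from scipy import stats

sample1 = [114, 123.3, 116.7, 129.0, 118, 124.6, 123.1,
           117.4, 111, 121.7, 124.5, 130.5]
sample2 = [120, 125.5, 126, 125.5, 128.5, 125, 128, 116,
           122, 121, 117, 125]

# H0: the two populations have the same variance
stat, p = stats.bartlett(sample1, sample2)
print(stat, p)

# A large p-value gives no evidence against equal variances,
# so the pooled (equal-variance) t procedures are defensible here.
```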


Welch’s approximate t-test of
equality of the means of two
samples whose variances are
assumed to be unequal


Welch’s approximate t-test

This test calculates an approximate t-value, t′, for which the critical value is calculated as a weighted average of the critical values of t based on the corresponding degrees of freedom of the two samples.


Welch’s approximate t-test, cont.

The two-sample t′ statistic is

t′ = (x̄1 − x̄2) / √( s1²/n1 + s2²/n2 )


Welch’s approximate t-test, cont.

The critical value t′α for the Type I error is computed as the weighted average

t′α = ( t1·(s1²/n1) + t2·(s2²/n2) ) / ( s1²/n1 + s2²/n2 )

where t1 and t2 are the critical values of the t distributions with the degrees of freedom of the two samples.


Welch’s approximate t-test, cont.

The statistic does NOT have a t distribution, but we can approximate the distribution of the two-sample t′ statistic by using the t(k) distribution with an approximation for the degrees of freedom k.

To calculate the p-value, take k equal to the smaller of n1 − 1 and n2 − 1.
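In practice the Welch test is scipy's unequal-variance option. Note that scipy uses the Welch–Satterthwaite approximation for the degrees of freedom rather than the conservative "smaller of n1 − 1 and n2 − 1" rule stated above, but with these samples the conclusion is the same:

```python
from scipy import stats

sample1 = [114, 123.3, 116.7, 129.0, 118, 124.6, 123.1,
           117.4, 111, 121.7, 124.5, 130.5]
sample2 = [120, 125.5, 126, 125.5, 128.5, 125, 128, 116,
           122, 121, 117, 125]

# Welch's t-test: population variances NOT assumed equal
res = stats.ttest_ind(sample1, sample2, equal_var=False)
print(abs(res.statistic), res.pvalue)   # |t| ~ 1.038, p ~ 0.31
```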


Homework

