# OCR hypothesis testing conundrum

Discussion in 'Mathematics' started by hert0677, Apr 6, 2019.

1. ### hert0677 (New commenter)

I am teaching the OCR spec for the new A-Level and was looking through their back catalogue of S2 past papers to find some suitable hypothesis testing exam standard questions for the mean of a Normal distribution. I found one question which colleagues tell me has appeared in different guises in the past (with the characteristic phrase "not more than"):

[Original OCR exam question not reproduced]

I understand what OCR have done in their solution, but I disagree with it (and I am challenging you to argue against what I say below). I argue that the critical region is in the left-hand tail.

Solution #1
OCR's solution involves these ideas:
H0: mu = 30 [the hypothesis that the computer specification is being met]
H1: mu > 30

Solution #2
The "wrong" solution, which OCR highlight in the examiner's report, is this:
H0: mu = 30
H1: mu < 30 [the hypothesis that the computer specification is being met]

This leads to the "wrong" answer of a critical region of tbar < 27.4.

My argument for solution #2 to the original question is as follows:

(a) If the original question had instead stated that the computer specification is that the population mean time should be "less than 30 seconds" then the appropriate solution would be solution #2.
(b) When dealing with continuous data (as we are here) "less than 30 seconds" means precisely the same thing as "not more than 30 seconds" (i.e. the wording used in the question).
(c) Therefore, solution #2 is the appropriate solution to the original question.

If the computer specification had stated "there should be no evidence that the population mean time is more than 30 seconds" then I would agree with OCR's solution #1. But that is not what is meant by the words they used ("the population mean time should not be more than 30 seconds").

I would be grateful if you would prove me wrong, or else agree with me. If you think I am wrong, which part of the argument (a)-(c) do you disagree with?

You turn your computer on and start the clock. At 32 seconds it is ready to use. You're like "That's more than 30 seconds! It's not meant to take more than 30 seconds.". The next time you do it, it takes 33 seconds. You're like "That's even worse!"
So the question is, which of these two outcomes is significant evidence that it is taking more than 30 seconds?
Hence the upper tail. Ta-daa!

3. ### crs123 (New commenter)

The null hypothesis has to be that the specification is being met. The question is set up, as Adam implies, so that you think "if it's more than 30 I have a problem!"

The null hypothesis always states that things are as expected, or that there is no problem. The equality part of "no more than" is a useful thing I tell my students to look out for when deciding on H0. Even if it weren't there, though (and the question just said "less than"), taking H0 to be "there is no problem", I would use your argument that "less than" means the same as "less than or equal to" to argue for H0 being mu = 30. So I disagree with part (a) of your argument.

4. ### hert0677 (New commenter)

Thanks so much for your replies. Adam, do you agree with crs123 that if the computer specification had explicitly said "the population mean time should be less than 30 seconds" the critical region would still be in the upper tail, not the lower one?

Given the "less than 30 seconds" specification, a story analogous to Adam's as to why someone might think the critical region would be in the lower tail might be this:

You turn your computer on and time it [on 10 occasions and find the mean]. The time you get is 28 seconds when ready to use. You're like "That's only a bit less than 30 seconds. Could be a fluke - I'm not yet sure I trust the manufacturer on this." The next time you do all this, it takes 26 seconds. You're like "That's more convincing!" So the question is, which of these two outcomes is significant evidence that it is taking on average less than 30 seconds? Hence the critical region is in the lower tail.

5. ### pi r squared (New commenter)

Well no, because both of those (28 seconds and 26 seconds) support the manufacturer's claim, so you'd be happy with either as 'confirmation' (instinctively) that it is correct. You can't be in a position where the results of your experiment support the manufacturer's claim (i.e. an average of less than 30 seconds) but you still reject the null hypothesis; you can only reject if the evidence contradicts the claim by such a substantial amount that it's unlikely to have happened by fluke.

6. ### hert0677 (New commenter)

Thanks pi r squared. That is most helpful. I think I have now got my head around it. Feel free to jump in (anyone) and disagree:

1) A hypothesis test for the population mean of a normal distribution cannot test if there is evidence FOR a claim that the population mean is less than (or more than) some amount. As has been pointed out, a critical region of <27.4 for the "less than 30 seconds" example I discussed is not reasonable, because the population mean could be 29.999 and it would be absurd to demand most values be below this. Nor would a critical region of >32.6 be reasonable, because you would not want to conclude the mean is less than 30 if a value of less than 30 had never been experimentally obtained! The idea of trying to create a cutoff for concluding that there is evidence for the hypothesis that the mean is below (or above) some value is a non-starter.

2) A hypothesis test for the population mean of a normal distribution can only test if there is evidence AGAINST the claim that the population mean is less than (or more than, or equal to) some amount. So if a claim is made that the population mean is less than (or more than, or equal to) some amount, that claim must be the null hypothesis. If the null hypothesis is a claim that the population mean is more than/less than some amount, it is the awkward case where the sign on the alternative hypothesis is the "wrong" way around.

3) Any straightforward question can be reworded to make it into one of the awkward cases. Examples...

Straightforward:
Experience shows that the heights of certain plants have a mean of 70cm and a variance of 36.
A random sample of 49 plants is measured. Construct a critical region, at the 5% level, to test the claim that these plants are not as tall as expected.

H0: mu = 70
H1: mu < 70

[Note that the claim is not merely that the measured plants come from a population with a mean of less than 70cm (e.g. 69.999). The claim is that they are so short that it is unlikely they would have been obtained if they came from a population with a mean of 70.]
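
For anyone who wants to check the numbers, here is a minimal Python sketch of the lower-tail critical region for this straightforward version. It assumes the sample mean is normally distributed with standard error sqrt(36/49); the variable names (`mu0`, `se`, `lower_cv`) are just illustrative labels, not anything from the question.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the plant question.
mu0 = 70.0        # mean under H0
variance = 36.0   # population variance
n = 49            # sample size
alpha = 0.05      # significance level

# Standard error of the sample mean: sqrt(36/49) = 6/7.
se = sqrt(variance / n)

# For H1: mu < 70 the whole 5% sits in the lower tail,
# so we need the z-value with 5% of the distribution below it.
z = NormalDist().inv_cdf(alpha)

lower_cv = mu0 + z * se
print(f"Reject H0 if xbar < {lower_cv:.2f}")  # Reject H0 if xbar < 68.59
```

So a sample mean of, say, 69.5 is below 70 but would not be in the critical region.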

Awkward:
Experience shows that the heights of certain plants have a mean of 70cm and a variance of 36.
A random sample of 49 plants is measured. Construct a critical region, at the 5% level, to test whether the measured plants come from a different normally distributed population, one with the same variance, but with a mean height of less than 70cm.

H0: mu = 70
H1: mu > 70
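
As a rough check on the awkward version, the sketch below computes the upper-tail critical region that goes with H1: mu > 70. It makes the same assumption (sample mean normally distributed with standard error sqrt(36/49)), and the names `mu0`, `se` and `upper_cv` are only illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the plant question.
mu0 = 70.0        # boundary value used in H0
variance = 36.0   # population variance
n = 49            # sample size
alpha = 0.05      # significance level

# Standard error of the sample mean: sqrt(36/49) = 6/7.
se = sqrt(variance / n)

# For H1: mu > 70 the whole 5% sits in the upper tail,
# so we need the z-value with 95% of the distribution below it.
z = NormalDist().inv_cdf(1 - alpha)

upper_cv = mu0 + z * se
print(f"Reject H0 if xbar > {upper_cv:.2f}")  # Reject H0 if xbar > 71.41
```

Under this setup, the claim that the mean is less than 70 is only rejected when the sample mean is well above 70; any sample mean below 70 is nowhere near the critical region.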

Last edited: Apr 10, 2019