6.5 Appendix: A note on calculating P-values

6.5.1 The Problem

Suppose you are performing a right-tailed hypothesis test (using a t distribution) and you arrive at a test statistic under the null of \(1.57\). The rejection region is in the right tail, so the p-value is the area under the curve to the right of \(1.57\).

If you have a sample of \(n = 50\), then you would use a t-distribution with \(n - 1 = 49\) degrees of freedom.

An illustration is below:
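A picture of this kind can be drawn in base R. The following is only a minimal sketch: the variable names (t_stat, deg_free, x_tail), the plotting range of -4 to 4, and the colors are illustrative choices, not part of the original example.

```r
# Sketch: t density with 49 degrees of freedom, right tail beyond 1.57 shaded.
t_stat   <- 1.57   # test statistic from the example
deg_free <- 49     # n - 1 with n = 50

# Draw the density curve over a range that covers almost all of the area.
curve(dt(x, df = deg_free), from = -4, to = 4,
      xlab = "t", ylab = "Density",
      main = "Right-tailed p-value")

# Shade the region to the right of the test statistic (the p-value).
x_tail <- seq(t_stat, 4, length.out = 200)
polygon(c(t_stat, x_tail, 4),
        c(0, dt(x_tail, df = deg_free), 0),
        col = "grey70", border = NA)

# Mark the test statistic with a dashed vertical line.
abline(v = t_stat, lty = 2)
```

The shaded region is the area that the rest of this appendix computes with pt.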

6.5.2 How to calculate p-values

In case you haven’t noticed by now, R has a default way of calculating probability areas…

IT ALWAYS CALCULATES AREAS FROM THE LEFT!

In other words, the default is to give you the area to the left of a number…

(LeftArea = pt(1.57,49))  # area to the LEFT of 1.57, not yet the p-value
## [1] 0.9385746

Don’t be annoyed about this, because reporting the area to the left is the usual default in statistical software (Excel’s T.DIST function does the same).

We can use this default to calculate the p-value (i.e. the area to the right of 1.57) in THREE different ways by relying on two properties of our probability distributions.

Property 1: The distribution is centered at zero and symmetric.

This means that the area to the right of 1.57 is the same as the area to the left of -1.57. So we can use the pt function with the default setting to this effect:

(Pval = pt(-1.57,49))
## [1] 0.06142544

Property 2: The total area under the distribution is one.

This means that you have a 100% chance of pulling a number between negative and positive infinity. So if you use 1.57 with the default setting, which gives you the area to the left, you can subtract that number from 1 to get the area to the right:

(Pval = 1-pt(1.57,49))
## [1] 0.06142544

Final Option: Undo the default setting…

The full command for calculating a p-value from a t-distribution (for our purposes) is as follows:

pt(q, df, lower.tail = TRUE)

Note that \(q\) is the quantile (here, the test statistic), and \(df\) is the degrees of freedom. Any argument you do not specify takes its default value. This is where lower.tail comes in. It is set to TRUE by default, meaning that whatever number you input for q, you will get the area to the left. If you change this entry to FALSE, then the default is switched off and you will calculate the area to the right.

(Pval = pt(1.57,49,lower.tail = FALSE))
## [1] 0.06142544

Notice that all three ways of calculating a p-value give you the exact same result. Therefore, you do not need to master all three - just pick whichever method works best for you.
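If you want to convince yourself of this, the three calculations can be compared side by side. This is a small sketch; the variable names (t_stat, deg_free, p_symmetry, p_complement, p_upper) are made up for illustration.

```r
# Sketch: compute the right-tail p-value three ways and compare.
n        <- 50
deg_free <- n - 1   # degrees of freedom for the t-distribution
t_stat   <- 1.57    # test statistic from the example

p_symmetry   <- pt(-t_stat, deg_free)                     # Property 1: symmetry
p_complement <- 1 - pt(t_stat, deg_free)                  # Property 2: total area is one
p_upper      <- pt(t_stat, deg_free, lower.tail = FALSE)  # switch off the default

# all.equal() compares numbers up to a small numerical tolerance.
all.equal(p_symmetry, p_complement)
all.equal(p_symmetry, p_upper)
```

Both comparisons should come back TRUE. all.equal is used instead of == because results computed along different routes can differ in the last few digits of floating-point arithmetic.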