The big little t

The t-test’s name is a little inconspicuous for such a powerful test of significance. The older brother of the z-score began life as Student’s t-test, conceived in a Guinness brewery by the chemist Gosset in conjunction with the statistician Fisher (Box, 1987). However, the name Student wasn’t given to the test because of its familial link with alcohol, as we might think. It was given because Guinness had begun a scientific revolution in brewing and wanted to keep any results of this a secret, so Gosset published under the pseudonym “Student” (Box, 1987). From humble beginnings the t-statistic has grown up to become a very powerful statistical test, with a number of variations used for testing significance in a variety of different situations. But is it being used in research correctly?

Today there are in fact three t-tests, each exploring significance in a different situation depending on the type of design or data used. They all compare a t-score against the t-distribution, which is pretty similar to the normal distribution but has heavier tails when the sample is small. If we find a t-score that falls in the .05 tail (or tails, for a two-tailed test) of this distribution, we can reject the null hypothesis (Gravetter & Wallnau, 2009).
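To see just how similar the two distributions are, here is a rough sketch comparing the two-tailed .05 cut-off for the normal distribution with the matching cut-offs for the t-distribution. The t values are copied from a standard t table rather than computed, and the names T_CRIT_05 and rejects_null are made up for this illustration.

```python
from statistics import NormalDist

# Two-tailed .05 cut-off for the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.975)

# Two-tailed .05 critical t values copied from a standard t table,
# keyed by degrees of freedom (df). Illustrative subset only.
T_CRIT_05 = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}


def rejects_null(t_score, df):
    """True if t_score lies in the two-tailed .05 rejection region."""
    return abs(t_score) > T_CRIT_05[df]


print(f"normal cut-off: {z_crit:.3f}")
for df, t_crit in T_CRIT_05.items():
    print(f"df = {df:>3}: t cut-off = {t_crit:.3f}")
```

The t cut-offs shrink towards the normal cut-off as the degrees of freedom grow, which is why the same t-score can be significant in a large sample but not in a small one.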

The simplest type of t-test is the one sample t-test. It’s used to infer whether a sample mean differs from a hypothesised population mean when the population standard deviation is unknown (Rochon & Kieser, 2011). Surrey et al. (2003) make use of a one sample t-test when examining manual dexterity assessment tests. This is great, but what if we want to look at more than one sample?
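As a minimal sketch of what the one sample t-test actually calculates, the snippet below divides the gap between the sample mean and a comparison value by the estimated standard error. The scores and the comparison value of 50 are invented for illustration and are not taken from Surrey et al. (2003).

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t, df) for a one sample t-test against the value mu0."""
    n = len(sample)
    # stdev() divides by n - 1, giving the sample standard deviation.
    standard_error = stdev(sample) / math.sqrt(n)
    return (mean(sample) - mu0) / standard_error, n - 1


# Invented scores and an invented comparison value of 50.
scores = [52, 55, 49, 58, 54, 51, 57, 53, 56, 55]
t, df = one_sample_t(scores, mu0=50)
print(f"t({df}) = {t:.2f}")
```

With 9 degrees of freedom the two-tailed .05 critical value from a t table is 2.262, so a t of this size would lead us to reject the null hypothesis.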

Next up is the independent samples t-test, which examines whether there is a significant difference between two separate groups. LeWitt (2012) used an independent samples t-test to detect a difference in the number of markers in the cerebrospinal fluid (CSF) of people with Parkinson’s and people without. But what if we want to look at effects within a sample?
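Here is a rough sketch of the independent samples calculation, assuming the classic pooled-variance version of the test (equal variances in both groups). The two groups of marker counts are invented and have nothing to do with LeWitt’s (2012) data.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Invented marker counts for two separate groups of people.
patients = [12, 15, 11, 14, 13]
controls = [9, 10, 8, 11, 12]
t, df = independent_t(patients, controls)
print(f"t({df}) = {t:.2f}")
```

The important design point is that the two lists come from different people, so the test pools their variances rather than pairing up the scores.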

Lastly we have the dependent samples t-test, which looks to see if there is a significant effect of treatment conditions on the same participants. Ollo et al. (1995) used a dependent samples t-test to examine the significance of the difference between actual and predicted IQ scores of crack cocaine users. Although they examine different types of things, all t-tests rely on the same assumptions.
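The dependent samples version can be sketched as a one sample t-test on the difference scores, since each participant contributes a pair of values. The IQ scores below are invented for illustration and are not from Ollo et al. (1995).

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a dependent samples t-test on paired scores."""
    if len(first) != len(second):
        raise ValueError("each participant needs a score in both lists")
    # One difference score per participant.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Invented predicted and actual IQ scores for the same six people.
predicted = [100, 105, 98, 110, 102, 95]
actual = [96, 101, 97, 104, 99, 93]
t, df = paired_t(predicted, actual)
print(f"t({df}) = {t:.2f}")
```

Because only the differences within each person matter, the spread between participants drops out of the calculation.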

The t-test is a parametric test, so whichever version we use it assumes that the observations made by the investigator are independent, meaning the outcome of the second observation cannot be predicted from the first (Boneau, 1960). In addition, the population that was sampled must be normally distributed (Boneau, 1960).

However, as well as reporting significance when writing up our results, it is also recommended that we report the effect size. The effect size is the size of the effect that what we have manipulated has had on our participants (Gravetter & Wallnau, 2009). This is particularly important for t-tests, as a result can be significant even when the effect size is very small. Fritz, Morris, and Richler (2012) found that over 66% of articles reporting t-tests published in the Journal of Experimental Psychology: General did not report the effect size. But what is the consequence of this? Well, suppose a particular therapy was found to be significant but with a small effect size (meaning it makes little practical difference), and only the significance was reported. If the therapy was then used on a large population on the strength of that significant value, this would have serious financial and other implications.
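To make the point concrete, here is a small sketch of Cohen’s d for a one sample design, along with the identity t = d × √n that links it to the t-score. The data and the function names are invented for illustration.

```python
import math
from statistics import mean, stdev


def cohens_d_one_sample(sample, mu0):
    """Cohen's d: the mean difference in standard deviation units."""
    return (mean(sample) - mu0) / stdev(sample)


def t_from_d(d, n):
    """For a one sample (or paired) design, t equals d times sqrt(n)."""
    return d * math.sqrt(n)


# Invented scores compared against an invented value of 50.
scores = [52, 55, 49, 58, 54, 51, 57, 53, 56, 55]
d = cohens_d_one_sample(scores, mu0=50)
print(f"d = {d:.2f}")

# The same tiny effect (d = 0.1) at increasing sample sizes.
for n in (25, 100, 400, 1600):
    print(f"n = {n:>4}: t = {t_from_d(0.1, n):.1f}")
```

The effect size stays at 0.1 throughout, yet the t-score climbs past the rough cut-off of 2 once the sample is large enough, which is exactly why a significant result alone says little about how big an effect is.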

So, t-tests are pretty useful and pretty powerful with many variations based on the same idea, but we have to be careful how we report them.


Box, J. F. (1987). Guinness, Gosset, Fisher, and small samples. Statistical Science, 2(1), 45-52.

Gravetter, F. J., & Wallnau, L. B. (2009). Statistics for the behavioural sciences. California, United States: Wadsworth.

Ollo, C., Lindquist, T., Alim, T. N., & Deutsch, S. I. (1995). Predicting premorbid function in crack cocaine users. Drug and Alcohol Dependence, 40, 173-175.

LeWitt, P. (2012). Recent advances in CSF biomarkers for Parkinson’s disease. Parkinsonism & Related Disorders, 18(1), 49-51.

Rochon, J., & Kieser, M. (2011). A closer look at the effect of preliminary goodness-of-fit testing for normality for the one-sample t-test. British Journal of Mathematical and Statistical Psychology, 64, 410-426.

Surrey, L. R., Nelson, K., Delelio, C., Mathie-Majors, D., Omel-Edwards, N., Shumaker, J., & Thurnber, G. (2003). A comparison of performance outcomes between the Minnesota Rate of Manipulation Test and the Minnesota Manual Dexterity Test. Work, 20, 97-102.

Boneau, A. C. (1960). The effects of violations of assumptions underlying the t test. Psychological Bulletin, 57(1), 49-64.

Fritz, C. O., Morris, P. E., & Richler, J. J. (2012). Effect size estimates: Current use, calculations, and interpretation. Journal of Experimental Psychology: General, 141(1), 2-18.

  1. I thought this blog was very good. It was very informative and I learnt a lot that I didn’t previously know.

    I agree t-tests are rather important, though I wasn’t sure of how important. T-tests are very useful and a lot of data can be reported through this methodology.

    However, I’ve always struggled to get my head around statistics and t-tests. It is a pretty lame thought that all the time, effort and money that is put into research can easily be wasted because of one small mistake in the data or the research itself, though I also find this to be honest. A lot of the public, and people who work in medical science, education etc., rely on science to be as true and factual as possible to be able to gain knowledge and learn from the research. I therefore believe that scientific research that has been reported wrongly should be removed from scientific databases. So, I’ll be more careful with my SPSS next time!

  2. Very good descriptive blog describing t-tests. All of the major points of the statistical test are covered.

    Areas to improve: more detail on the relationships to other tests such as an ANOVA, and perhaps some reasons as to why some of the things are done, such as the assumptions.
    One of the strengths of the blog is the great description of the concepts without the use of numbers.

  3. I thought your blog was really informative and covered many aspects of t-tests. I think you touched on a very important issue regarding effect sizes. Often these are not reported alongside traditional statistical analyses such as t-tests and ANOVAs. However, they can often be just as informative as the actual statistical test, if not more so. Treatments can produce significant results, but without calculating the effect size researchers can be unsure as to what the impact is clinically. There is little justification in applying interventions based on significant results if the effect size is so small that it will make a negligible difference in reality. Effect sizes are very simple to calculate and can be used for both t-tests and ANOVAs. They can greatly benefit research and can be applied in a variety of situations, such as school counselling (Sink et al., 2006). The article below describes how to calculate a Cohen’s d effect size using both t-tests and ANOVAs.


