… and non-normally distributed data can be normal
One of the underlying assumptions of many statistical methods is that the data (or the model residuals) are normally distributed. I teach students to evaluate this assumption with plots and normality tests. When they find their data do not seem to be normally distributed, they often report:
“… the data is abnormal”
This is incorrect, and not just because of the grammar. It arises when the name of the distribution is confused with the everyday use of the word “normal”.
It’s true that the Normal distribution has acquired its name because seeing it is quite normal; many variables are normally distributed. Similarly, the Common gull (Larus canus) is so-called because it is commonly seen. However, not all commonly seen gulls are Common gulls. Great black-backed gulls (Larus marinus) are not uncommon gulls.
There are several distributions which are common, usual and normal to see! For example, it is normal for counts to follow a Poisson distribution. Poisson data are definitely not Normal but they are not abnormal.
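To make this concrete, here is a minimal sketch in Python, using only the standard library. It simulates counts from a Poisson distribution and applies a Jarque-Bera normality test. I chose that test only because it is easy to compute by hand; a Shapiro-Wilk test or a Q-Q plot would serve the same purpose. The function names, the Poisson mean of 2, the sample size of 500 and the seed are all assumptions made for illustration, and the chi-squared p-value is a large-sample approximation.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def jarque_bera(values):
    """Return (statistic, p_value) for the Jarque-Bera normality test."""
    n = len(values)
    mean = sum(values) / n
    # Central moments, dividing by n as the test definition does
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Under normality the statistic is approximately chi-squared with 2 df,
    # whose upper-tail probability is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


rng = random.Random(42)  # fixed seed (an arbitrary choice) so runs are repeatable
counts = [poisson_sample(2.0, rng) for _ in range(500)]        # hypothetical count data
measurements = [rng.gauss(10.0, 2.0) for _ in range(500)]      # hypothetical Normal data

for label, sample in [("Poisson counts", counts),
                      ("Normal measurements", measurements)]:
    statistic, p_value = jarque_bera(sample)
    print(f"{label}: JB = {statistic:.2f}, p = {p_value:.4f}")
```

A very small p-value for the counts says only that they are not normally distributed. It says nothing abnormal about them, since a Poisson distribution is exactly what counts like these are expected to follow. The Normal sample will usually, though not always, give a larger p-value.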
If your data are not normally distributed you might report:
“… the data are not normally distributed”
to be statistically and grammatically correct.
If it is not a Common gull it could be the common great black-backed gull or a herring gull.
Note: Yes, I do see ‘the data is’ much more frequently than the grammatically correct ‘the data are’.
I’ll post how to do that plot in R soon…