Statistical tests are mathematical tools for analyzing quantitative data generated in a research study. The sheer number of available tests makes it difficult for a researcher to remember which test to use in which situation. Several points need to be considered when choosing a statistical test. These include the type of study design (which we discussed in the last issue), the number of groups being compared and the type of data (i.e., continuous, dichotomous or categorical).
In the present article, we are going to discuss these points, but before that, let us go through some important concepts in statistics.
There are four main levels of measurement/types of data used in statistics. They have different degrees of usefulness in statistical research. Ratio measurements have both a meaningful zero value and meaningful distances between measurements; they provide the greatest flexibility in the statistical methods that can be used for analyzing the data. Interval measurements have meaningful distances between measurements, but the zero value is arbitrary (as is the case with longitude and with temperature measured in Celsius or Fahrenheit). Ordinal measurements have a meaningful order among their values, but the differences between consecutive values are imprecise. Nominal measurements have no meaningful rank order among values. Because variables conforming only to nominal or ordinal measurements cannot reasonably be measured numerically, they are sometimes grouped together as categorical variables, whereas ratio and interval measurements are grouped together as quantitative or continuous variables due to their numerical nature.
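The practical consequence of these levels can be illustrated with a minimal Python sketch. The mapping from level to permissible summary statistic used here (mode for nominal, median for ordinal, mean for interval, coefficient of variation for ratio) is a common textbook convention and an assumption of this example, not something prescribed above; the function name `summarize` is hypothetical.

```python
from statistics import mean, median, median_low, mode, stdev

LEVELS = ("nominal", "ordinal", "interval", "ratio")


def summarize(values, level):
    """Return only the summary statistics that are meaningful for `level`.

    Assumption: the level-to-statistic mapping follows a common textbook
    convention; ordinal values are expected to be coded as sortable ranks.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level of measurement: {level!r}")

    # The mode only needs values to be distinguishable, so it works everywhere.
    summary = {"mode": mode(values)}

    if level == "ordinal":
        # median_low always returns an observed value, so no averaging of
        # ranks (which would assume equal spacing) takes place.
        summary["median"] = median_low(values)

    if level in ("interval", "ratio"):
        # Distances are meaningful, so averaging is allowed.
        summary["median"] = median(values)
        summary["mean"] = mean(values)

    if level == "ratio" and len(values) > 1 and mean(values) != 0:
        # A ratio of two quantities needs a true zero to be interpretable.
        summary["cv"] = stdev(values) / mean(values)

    return summary


# Hypothetical data, for illustration only.
blood_groups = ["A", "B", "A", "O"]   # nominal
pain_scores = [1, 2, 2, 3]            # ordinal ranks
temps_celsius = [10, 20, 30]          # interval
weights_kg = [50.0, 60.0, 70.0]       # ratio

nominal_summary = summarize(blood_groups, "nominal")
ordinal_summary = summarize(pain_scores, "ordinal")
interval_summary = summarize(temps_celsius, "interval")
ratio_summary = summarize(weights_kg, "ratio")
```

Each higher level returns everything the lower levels allow plus something more, which is one way to read the statement that ratio measurements offer the greatest flexibility.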
In the study of statistics, we focus on mathematical distributions for the sake of simplicity and relevance to the real world. Understanding these distributions enables us to visualize data more easily and build models more quickly. However, they cannot and do not replace the work of manual data collection and of generating the actual data distribution. A distribution shows what proportion of the data lies within a certain range. So, given a distribution and a range of values, we can determine the probability that an observation will lie within that range. The same data may lead to different conclusions if they are fitted to different distributions. It is therefore vital in any statistical analysis to model the data with an appropriate distribution.
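As a rough illustration of both points, the sketch below computes the probability that a value falls within a range under two different distributions. The choice of a normal and an exponential distribution, the parameter values and the function names are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def prob_in_range_normal(low, high, mean, sd):
    """P(low <= X <= high) for a normal distribution with the given mean and SD."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)


def prob_in_range_exponential(low, high, mean):
    """P(low <= X <= high) for an exponential distribution with the given mean."""

    def cdf(x):
        # Exponential CDF: 1 - exp(-x / mean) for x > 0, and 0 otherwise.
        return 1.0 - math.exp(-x / mean) if x > 0 else 0.0

    return cdf(high) - cdf(low)


# About 68% of a standard normal distribution lies within one SD of the mean.
within_one_sd = prob_in_range_normal(-1.0, 1.0, mean=0.0, sd=1.0)

# Same range (0 to 10) and same mean (10), but different assumed shapes.
p_normal = prob_in_range_normal(0.0, 10.0, mean=10.0, sd=10.0)
p_exponential = prob_in_range_exponential(0.0, 10.0, mean=10.0)
```

With these illustrative parameters the normal model places roughly a third of the probability between 0 and 10, while the exponential model places close to two thirds there, so the assumed distribution alone changes the conclusion drawn about the same range.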