In descriptive statistics, the mean (also called the arithmetic mean or the average) is one of the most fundamental measures of the central tendency of a dataset. It summarizes a set of values with a single number that indicates where the data is centered.
Definition #
The mean is calculated by summing all the values in a dataset and then dividing by the number of values. It is commonly used to understand the overall level of a dataset, such as the average score of students in a class, the average income in a region, or the average temperature of a month.
Formula #
Mean Calculation for Sample and Population #
Sample Mean #
For a sample of size \(n\) with values \(x_1, x_2, \dots, x_n\), the sample mean (commonly denoted \(\bar{x}\)) is calculated as:
\[ \text{Mean} = \frac{x_1 + x_2 + \dots + x_n}{n} \]
Where:
- \(x_1, x_2, \dots, x_n\) are the individual data points.
- \(n\) is the total number of data points.
Population Mean #
For a population, the formula has the same form, but \(N\) denotes the total number of individuals in the population. The result is the population mean (commonly denoted \(\mu\)):
\[ \text{Population Mean} = \frac{X_1 + X_2 + \dots + X_N}{N} \]
Where:
- \(X_1, X_2, \dots, X_N\) are the individual data points in the population.
- \(N\) is the total number of data points in the population.
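The sample and population formulas reduce to the same computation: add the values, then divide by how many there are. A minimal Python sketch of that step follows. The function name `arithmetic_mean` and the empty-input check are illustrative assumptions, not part of the definitions above.

```python
from typing import Sequence


def arithmetic_mean(values: Sequence[float]) -> float:
    """Return the sum of the values divided by their count.

    Hypothetical helper for illustration: the same code serves for a
    sample (n values) or a whole population (N values).
    """
    if len(values) == 0:
        # The formula divides by the count, so an empty dataset has no mean.
        raise ValueError("mean of an empty dataset is undefined")
    return sum(values) / len(values)


print(arithmetic_mean([2, 4, 6]))  # 4.0
```

Whether the result is read as a sample mean or a population mean depends only on what the input represents, not on the arithmetic.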
Example: #
If we have the following data representing the test scores of 5 students:
\[ 85, 90, 95, 80, 88 \]
The mean test score is calculated as:
\[ \text{Mean} = \frac{85 + 90 + 95 + 80 + 88}{5} = \frac{438}{5} = 87.6 \]
Thus, the average score is 87.6.
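The same calculation can be checked in a few lines of Python. This sketch computes the mean by hand and, as one possible cross-check, with `statistics.mean` from the standard library. The variable names are illustrative.

```python
import statistics

# Test scores of the 5 students from the example.
scores = [85, 90, 95, 80, 88]

total = sum(scores)                      # 438
manual_mean = total / len(scores)        # 438 / 5
library_mean = statistics.mean(scores)   # standard-library equivalent

print(total, manual_mean, library_mean)  # 438 87.6 87.6
```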
Key Points #
- The mean is sensitive to extreme values (outliers). A very high or very low value can significantly alter the mean.
- It is best used when the data is symmetrically distributed without significant outliers.
- In cases with outliers or skewed data, the median or mode might be more appropriate measures of central tendency.
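To illustrate the sensitivity to outliers, the sketch below adds one extreme value to the five test scores from the example and compares how far the mean and the median move. The extra score of 5 is a hypothetical value chosen only for illustration.

```python
import statistics

scores = [85, 90, 95, 80, 88]
# Hypothetical outlier: one very low score added for illustration.
scores_with_outlier = scores + [5]

mean_before = statistics.mean(scores)                   # 87.6
mean_after = statistics.mean(scores_with_outlier)       # 443 / 6, about 73.83
median_before = statistics.median(scores)               # 88
median_after = statistics.median(scores_with_outlier)   # (85 + 88) / 2 = 86.5

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before:.2f} -> {median_after:.2f}")
```

In this illustration a single low score pulls the mean down by almost 14 points, while the median shifts by only 1.5.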
Last modified on 2023-12-30