(1) Basic Statistical Analysis
Principles of Descriptive Statistics
To analyze the overall characteristics of a research subject, it is not necessary to study every unit in the population. In communication studies and sociology, researchers more often use sampling: following the principle of randomness, they select only a portion of the units (usually called a sample) from the entire population for calculation and study. For each sample, descriptive statistics are calculated first and then used to estimate and infer the population's characteristics with a certain degree of reliability.
SPSS basic statistical analysis serves as the foundation for other statistical analyses. By learning basic statistical methods, one can gain a relatively accurate understanding of the overall characteristics of the data to be analyzed, which can help in selecting more in-depth statistical analysis methods.
(2) Basic Statistical Analysis - Mean and Standard Error of the Mean (S.E. mean)
Definition: The mean (or average) represents the central tendency or average level of a variable’s values. For example, the average score of students in a subject, the average income of company employees, or the average height of students in a class.
The calculation formulas are as follows:
Sample Mean: For a set of data ( x_1, x_2, …, x_n ) representing a finite sample of size ( n ), the sample mean is:
[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i ]
Population Mean: The formula has the same form, but it is computed over the entire population instead of a sample. Using ( N ) for the number of elements in the population, the population mean is:
[ \mu = \frac{1}{N} \sum_{i=1}^{N} x_i ]
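The mean is easy to calculate and understand. It is most representative when the values are homogeneous, and it becomes a more stable summary as the number of values increases. However, it is easily influenced by extreme values. The standard error of the mean (S.E. mean) is the sample standard deviation divided by the square root of the sample size, ( s / \sqrt{n} ); it estimates how much the sample mean would vary from one sample to another.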
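As a rough illustration of the mean and its standard error, the following Python sketch computes both for a small list of ratings. The data and the function names `sample_mean` and `standard_error` are hypothetical and chosen only for this example; SPSS itself is not involved.

```python
import math
import statistics


def sample_mean(values):
    """Sample mean: the sum of the values divided by the sample size n."""
    return sum(values) / len(values)


def standard_error(values):
    """Standard error of the mean: s / sqrt(n), with s the sample standard deviation."""
    s = statistics.stdev(values)  # uses the n - 1 denominator
    return s / math.sqrt(len(values))


# Hypothetical ratings on a 1-5 scale (illustrative data, not from the article)
scores = [4, 5, 3, 4, 5, 2, 4, 3]

print(f"mean = {sample_mean(scores):.3f}")
print(f"S.E. mean = {standard_error(scores):.3f}")
```

A smaller standard error suggests that the sample mean is a more precise estimate of the population mean; it shrinks as the sample size grows.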
(3) Basic Statistical Analysis - Median
Definition: The median is the value in the middle of a data set when the values are arranged in increasing or decreasing order. It is a measure of position and is not influenced by extreme values, making it more robust.
Calculation Formula: For a series of size ( N ), the median is calculated as follows:
- If ( N ) is odd, the median is the value at position ( (N + 1)/2 ).
- If ( N ) is even, the median is the average of the values at positions ( N/2 ) and ( (N/2) + 1 ).
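The two positional rules above can be sketched in Python as follows. The function name `median` is an illustrative choice; positions in the rules are 1-based, while Python lists are 0-based, which is why the indices are shifted by one.

```python
def median(values):
    """Median by position: middle value for odd N, mean of the two middle values for even N."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: 1-based position (N + 1) / 2 is 0-based index N // 2
        return ordered[mid]
    # Even N: 1-based positions N/2 and N/2 + 1 are 0-based indices mid - 1 and mid
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))      # 2
print(median([4, 1, 3, 2]))   # 2.5
```

Python's standard library offers the same calculation as `statistics.median`.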
(4) Basic Statistical Analysis - Mode
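Definition: The mode is the value that appears most frequently in a data set. It is useful for describing the central tendency of data, especially discrete or categorical data such as rating scales, and a data set can have more than one mode.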
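A minimal sketch of finding the mode with Python's standard library is shown below. The ratings are hypothetical; `statistics.multimode` (Python 3.8+) is used because it returns every value tied for the highest frequency rather than only one.

```python
from statistics import multimode

# Hypothetical ratings on a 1-5 scale (illustrative data only)
ratings = [4, 5, 3, 4, 5, 2, 4, 3]
print(multimode(ratings))   # [4]

# When two values are tied for the highest count, both are returned
tied = [1, 2, 2, 3, 3]
print(multimode(tied))      # [2, 3]
```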
(5) Basic Statistical Analysis - Range
Definition: The range (also called the extreme difference) is the absolute difference between the maximum and minimum values of the data. For two datasets with the same sample size, the dataset with a larger range is more dispersed.
Calculation Formula:
Range = Maximum Value - Minimum Value
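The range formula can be sketched in a few lines of Python. The two samples and the name `value_range` are hypothetical; they only illustrate how two data sets of the same size can be compared.

```python
def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


# Two hypothetical samples of the same size (illustrative data only)
class_a = [70, 72, 75, 78, 80]
class_b = [50, 65, 75, 85, 100]

print(value_range(class_a))  # 10
print(value_range(class_b))  # 50
```

Here the second sample has the larger range and is therefore the more dispersed of the two by this measure.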
(6) Basic Statistical Analysis - Variance and Standard Deviation
Variance Definition: Variance is the average of the squared deviations of each value from the mean (for a sample, the sum of squared deviations is divided by ( n - 1 ) rather than ( n )). Its main purpose is to assess the dispersion of data: when comparing two samples, the sample with the larger variance is more dispersed. The standard deviation is the square root of the variance and expresses the typical distance of data values from the mean in the data's original units. Larger variance and standard deviation indicate greater differences between variable values and greater dispersion around the mean.
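To make the definition concrete, here is a small Python sketch that computes the sample variance and standard deviation by hand and cross-checks them against the standard library. The data set and the function name `sample_variance` are hypothetical, and the ( n - 1 ) divisor is assumed because the data are treated as a sample.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical data (illustrative only)
data = [2, 4, 4, 4, 5, 5, 7, 9]

s2 = sample_variance(data)
s = math.sqrt(s2)  # standard deviation is the square root of the variance
print(f"variance = {s2:.4f}, standard deviation = {s:.4f}")

# Cross-check against the standard library's sample variance
assert math.isclose(s2, statistics.variance(data))
```

If the data covered an entire population, `statistics.pvariance` and `statistics.pstdev` would apply instead, dividing by the population size.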
Calculation Formula:
[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad s = \sqrt{s^2} ]
Where:
- (x_1, x_2, \dots, x_n) are the data points.
- (\bar{x}) is the sample mean.
- (s^2) is the sample variance and (s) is the sample standard deviation.
- (n) is the number of data points.
For an entire population of size (N) with mean (\mu), the divisor is (N) instead of (n-1):
[ \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 ]
(7) Basic Statistical Analysis - Kurtosis
Definition: Kurtosis is a statistical measure that describes the steepness or flatness of the distribution of a variable’s values.
Kurtosis is compared to a normal distribution. A kurtosis of 0 means the distribution is as steep as a normal distribution. Kurtosis greater than 0 means the distribution is more peaked, with heavier tails, than a normal distribution (leptokurtic). Kurtosis less than 0 means the distribution is flatter than a normal distribution (platykurtic).
Calculation Formula:
[ \text{Kurtosis} = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)} ]
Here (n), (\bar{x}), and (s) are the sample size, sample mean, and sample standard deviation. This is the sample excess kurtosis in the same bias-corrected form as the skewness formula below; the subtracted term places the normal distribution at 0, and the formula requires at least four data points.
(8) Basic Statistical Analysis - Skewness
Definition: Skewness describes the symmetry of a data distribution. It is a statistical measure that tells you whether the values of a variable are skewed towards the left or right.
The reference point is a symmetric distribution such as the normal distribution, whose skewness is 0. Positive skewness indicates a longer right tail, with extreme values lying far above the mean (right-skewed). Negative skewness indicates a longer left tail, with extreme values lying far below the mean (left-skewed).
Calculation Formula:
[ \text{Skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3 ]
Where:
- (x_1, x_2, \dots, x_n) are the data points.
- (\bar{x}) is the sample mean.
- (s) is the sample standard deviation.
- (n) is the number of data points.
This formula is the standardized third moment of the data (each deviation from the mean is divided by the standard deviation and then cubed), multiplied by a small-sample correction factor.
Interpretation:
- Skewness = 0: The distribution is symmetric.
- Skewness > 0: Right-skewed (longer right tail).
- Skewness < 0: Left-skewed (longer left tail).
- Absolute Skewness: The larger the absolute value of the skewness, the more asymmetric the distribution.
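The two shape measures can be sketched in Python as follows. The skewness function follows the adjusted formula used in this article, n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations. The kurtosis function assumes the matching bias-corrected excess kurtosis estimator, which is an assumption about the intended formula; the function names and data are hypothetical.

```python
import math


def _standardized(values):
    """Return (n, z-scores) using the sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("all values are identical; shape measures are undefined")
    return n, [(x - mean) / s for x in values]


def sample_skewness(values):
    """Adjusted sample skewness; needs at least 3 values."""
    if len(values) < 3:
        raise ValueError("skewness needs at least three values")
    n, z = _standardized(values)
    return n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)


def sample_excess_kurtosis(values):
    """Bias-corrected excess kurtosis (0 for a normal distribution); needs at least 4 values."""
    if len(values) < 4:
        raise ValueError("kurtosis needs at least four values")
    n, z = _standardized(values)
    fourth = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
    return fourth - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


symmetric = [1, 2, 3, 4, 5]      # evenly spaced, so skewness is 0
right_tail = [1, 1, 1, 2, 10]    # one large value stretches the right tail

print(sample_skewness(symmetric), sample_excess_kurtosis(symmetric))
print(sample_skewness(right_tail), sample_excess_kurtosis(right_tail))
```

The evenly spaced sample gives a skewness of 0 and a negative kurtosis (flatter than normal), while the sample with one large value gives a positive skewness.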
Example Case: Familiarizing with Basic Data Analysis Techniques
We will now provide a practical case to demonstrate how to analyze the following descriptive statistics: mean, standard error of the mean, median, mode, variance, standard deviation, kurtosis, and skewness.
Dataset Example: A set of questionnaire responses related to student satisfaction in a class (ratings on a scale of 1-5 for aspects such as teaching quality, course content, etc.).
Steps to analyze:
- Mean: Calculate the average score for each variable.
- Standard Error of the Mean: Estimate how far the sample mean is likely to fall from the population mean.
- Median: Find the middle value when the data is arranged in ascending order.
- Mode: Identify the most frequently occurring rating value.
- Variance: Measure the dispersion of the scores around the mean.
- Standard Deviation: Understand how spread out the data points are.
- Kurtosis: Analyze the shape of the distribution (whether the data is more or less peaked than a normal distribution).
- Skewness: Check the symmetry of the data and whether the data is skewed to the right or left.
This case study can be conducted using SPSS to compute and visualize these basic statistical measures and gain insights into the data’s central tendency, dispersion, and distribution shape.
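For readers without SPSS at hand, the same steps can be approximated in Python. This is a sketch under stated assumptions, not the SPSS procedure itself: the ratings are hypothetical, `describe` is an illustrative helper, and the skewness and kurtosis lines assume the bias-corrected sample formulas.

```python
import math
import statistics


def describe(values):
    """Return the descriptive statistics listed in the steps above for one variable."""
    n = len(values)                       # needs n >= 4 and non-identical values
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # sample standard deviation (n - 1 divisor)
    z = [(x - mean) / s for x in values]  # standardized deviations
    skew = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "mean": mean,
        "se_mean": s / math.sqrt(n),
        "median": statistics.median(values),
        "mode": statistics.multimode(values),
        "variance": statistics.variance(values),
        "std_dev": s,
        "kurtosis": kurt,
        "skewness": skew,
    }


# Hypothetical 1-5 ratings for one questionnaire item (illustrative data only)
teaching_quality = [4, 5, 3, 4, 5, 2, 4, 3, 5, 4]

for name, value in describe(teaching_quality).items():
    print(f"{name}: {value}")
```

Each questionnaire item (teaching quality, course content, and so on) would be passed to the helper separately, mirroring how descriptive output is reported per variable.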
#SPSS
Last modified on 2023-12-28