Variance is a measure of spread. It is calculated by finding the average of the squared differences between each observation and the mean. The resulting value is expressed in the square of the original data's units.
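As a rough illustration of the definition above, the following sketch computes a variance step by step and compares the result with NumPy's np.var(). The variable names (mean, squared_diffs, manual_variance) are chosen here for illustration only.

```python
import numpy as np

# Example data, used purely for illustration
values = np.array([1, 3, 4, 2, 6, 3, 4, 5])

# Step 1: find the mean of the observations
mean = values.sum() / len(values)

# Step 2: square each observation's difference from the mean
squared_diffs = (values - mean) ** 2

# Step 3: average the squared differences
manual_variance = squared_diffs.sum() / len(values)

print(manual_variance)                               # 2.25
print(np.isclose(manual_variance, np.var(values)))   # True
```

With its default settings, np.var() divides by the number of observations, which matches the "average of the squared differences" described above.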
Interpretation of Variance
A larger variance means the data is more spread out and values tend to be far away from the mean. A variance of 0 means all values in the dataset are the same.
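To make this interpretation concrete, here is a small sketch comparing three hypothetical datasets that share the same mean (10) but differ in spread. The datasets and variable names are assumptions made up for this example.

```python
import numpy as np

# Hypothetical datasets, all with a mean of 10
constant = np.array([10, 10, 10, 10])  # every value is identical
tight = np.array([9, 10, 10, 11])      # values sit close to the mean
wide = np.array([1, 5, 15, 19])        # values sit far from the mean

var_constant = np.var(constant)
var_tight = np.var(tight)
var_wide = np.var(wide)

print(var_constant)  # 0.0
print(var_tight)     # 0.5
print(var_wide)      # 53.0
```

The dataset with identical values has a variance of 0, and the variance grows as the values move farther from the mean.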
Calculating Variance in Python
In Python, we can calculate the variance of an array using the NumPy function np.var():
import numpy as np

values = np.array([1, 3, 4, 2, 6, 3, 4, 5])

# calculate variance of values
variance = np.var(values)
The standard deviation is a measure of a dataset’s spread. It is calculated by taking the square root of the variance of a data set. The resulting value has the same units as the original data.
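The relationship between the two measures can be checked with a short sketch: taking the square root of np.var() should give the same number as np.std(). The name std_from_variance is introduced here only for illustration.

```python
import numpy as np

values = np.array([1, 3, 4, 2, 6, 3, 4, 5])

# Variance is in squared units; its square root returns to the original units
variance = np.var(values)
std_from_variance = np.sqrt(variance)

print(std_from_variance)                                # 1.5
print(np.isclose(std_from_variance, np.std(values)))    # True
```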
Standard Deviation Units
Because standard deviation is in the same units as the original data set, it is often used to provide context for the mean of the dataset. For example, if the data set is [3, 5, 10, 14], the standard deviation is 4.301 units, and the mean is 8.0 units. By using the standard deviation, we can fairly easily see that the data point 14 is more than one standard deviation away from the mean.
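The figures in this example can be reproduced with a few lines of NumPy. This is a minimal sketch; the names distances and beyond_one_std are hypothetical helpers introduced for illustration.

```python
import numpy as np

data = np.array([3, 5, 10, 14])

mean = np.mean(data)
std_dev = np.std(data)

print(mean)               # 8.0
print(round(std_dev, 3))  # 4.301

# Distance of each point from the mean, measured in standard deviations
distances = np.abs(data - mean) / std_dev

# Keep only the points lying more than one standard deviation from the mean
beyond_one_std = data[distances > 1]
print(beyond_one_std)     # [ 3 14]
```

The point 14 lies 6 units above the mean, which exceeds 4.301. The same check shows that 3, at 5 units below the mean, is also more than one standard deviation away.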
Calculating Standard Deviation in Python
We can calculate standard deviation in Python using the NumPy function np.std():
import numpy as np

values = np.array([1, 3, 4, 2, 6, 3, 4, 5])

# calculate standard deviation of values
std_dev = np.std(values)