Quick explanation of the statistical terms used in the case study

The numbers below refer to this case study:

https://smalldeskbigdata.com/2018/06/03/quick-explanation-of-the-statistical-terms-used-in-the-case-study/

 

Statistical Mean

The average value of a set of numbers.
How to calculate: Add all the values, and divide the sum by the number of values
Example: In our case, (2+3+6+0+1+5+4)/7 = 21/7 = 3. So this person ate 3 bananas per day on average
Use: The Mean gives us a single value that summarizes the whole data set
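
The calculation above can be sketched in a few lines of Python. This is only an illustration using the seven daily values from the example; the variable names are chosen here and are not part of the case study.

```python
# Daily banana counts from the example above (one value per day)
bananas = [2, 3, 6, 0, 1, 5, 4]

# Mean: the sum of the values divided by how many values there are
mean = sum(bananas) / len(bananas)

print(mean)  # 3.0
```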

Variance

An indication of how spread out a data set is: the bigger the number, the more spread out the values are.
How to calculate: Subtract the Mean from each value (the result is called the deviation from the mean), square each deviation, sum the squares, and divide by the number of values
Example: [(2-3)^2+(3-3)^2+(6-3)^2+(0-3)^2+(1-3)^2+(5-3)^2+(4-3)^2]/7 = 28/7 = 4
Use: It is easy to see that if the person ate exactly the same number of bananas every day, all the deviations would be zero, so the Variance would be zero. A lower Variance means that the daily counts are closer to each other, which makes predictions of future consumption more reliable
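
As a rough sketch, the same steps can be written in Python. It divides by the number of values, as in the example above (often called the population variance); the variable names are illustrative only.

```python
from statistics import pvariance

# Daily banana counts from the example above
bananas = [2, 3, 6, 0, 1, 5, 4]

mean = sum(bananas) / len(bananas)

# Deviation from the mean for each day, squared
squared_deviations = [(x - mean) ** 2 for x in bananas]

# Variance: the sum of the squared deviations divided by the number of values
variance = sum(squared_deviations) / len(bananas)

print(variance)  # 4.0

# The standard library's pvariance uses the same "divide by n" definition
assert pvariance(bananas) == variance
```

Note that Python's statistics.variance divides by n-1 instead of n (the sample variance), so it would give a slightly larger number for the same data.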

Standard Deviation

How to calculate: Take the square root of the Variance (in our case, the square root of 4, which is 2 bananas per day)
Use: While the Variance shows the dispersion of the data set in squared units, the Standard Deviation expresses it in the same units as the data (bananas per day), so it is used to judge how much confidence we can place in the statistical conclusions we draw. Imagine that our sample had fewer values, and that some of them were very different from the rest. In that case, a very large Standard Deviation would mean that it is nearly impossible to predict how many bananas the person will eat
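
A minimal Python sketch of this last step follows. The first list is the data from the example; the second list, steady_eater, is a made-up data set added here purely for contrast and does not come from the case study.

```python
import math

def std_dev(values):
    """Square root of the variance, dividing by the number of values."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return math.sqrt(variance)

# Daily banana counts from the example above
bananas = [2, 3, 6, 0, 1, 5, 4]

# Hypothetical person who eats exactly 3 bananas every day (illustration only)
steady_eater = [3, 3, 3, 3, 3, 3, 3]

print(std_dev(bananas))       # 2.0
print(std_dev(steady_eater))  # 0.0
```

Both people have the same Mean of 3, but only the second one is fully predictable, and the Standard Deviation is what shows the difference.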