Since we are learning about Normal Distributions, we are going to compare our data sets from Week 4's Discussion with the related Normal Distribution. You will choose a data set, share and explain your information, and decide whether you think it looks like a Normal Distribution, based on the characteristics of your data set and the characteristics of a Normal Distribution. No formal testing is required.
Classmates may add the Normal Distribution with your mean and standard deviation to your graph in order to compare it more easily.
Step 1: Choose one of the following data sets. Click the randomization button a few times to get your sample.
1) Aneurysm COD variety by state for 2018: https://www.desmos.com/calculator/njjq7bbxih
2) Adult Asthma Prevalence by State for 2018: https://www.desmos.com/calculator/p63yd7mbql
3) Colon Cancer New Cases 2017: https://www.desmos.com/calculator/4tqa66zjq7
4) Drug Poisoning Mortality by State for 2018: https://www.desmos.com/calculator/ar5jhlfglv
5) Leukemias Cancer for 2017: https://www.desmos.com/calculator/n9ob9lemqc
Step 2: Provide the title and link to your personalized set of randomized values. Your classmates will need to be able to use your data.
Step 3: Provide the mean and standard deviation, rounded to 2 decimal places.
Step 4: Embed your histogram. If needed, use the magnifying glass to zoom your graph.
Step 5: Discuss the shape of your histogram: humps, skewness, etc.
Step 6: Do you think your data is normal? Why or why not?
Week 4 info
Aneurysm COD Variety by state for 2018: https://www.desmos.com/calculator/tbbnzl72rc
https://www.desmos.com/calculator/th8w4qlv4d
V is taken to be a random sample from the larger data set:
Mean (V) = 2519/25 = 100.76
Median (V) = 75
Standard deviation (V) = 104.873
Standard deviation in relation to Mean:
The standard deviation is greater than the mean, which suggests that:
- The data could be skewed
- There is a wide range of variation among the sample data
- The median or mode may be a better measure of center than the mean in this case
- There could be one or more outliers in the data set
- Presumably nothing here points to poor precision in the data set itself.
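One way to see why a standard deviation larger than the mean sits uneasily with a Normal model is to ask how much of the matching Normal curve would fall below zero. The sketch below is only an illustration: it takes the mean and standard deviation reported above and uses Python's standard-library `NormalDist`; the variable names are chosen here and are not part of the original post.

```python
from statistics import NormalDist

# Sample statistics reported in the post
mean = 100.76
sd = 104.873

# Normal distribution with the same mean and standard deviation
model = NormalDist(mu=mean, sigma=sd)

# Share of this Normal curve that lies below zero
p_below_zero = model.cdf(0)
print(f"P(X < 0) under the Normal model: {p_below_zero:.3f}")
```

Under these assumptions roughly 17% of the Normal curve lies below zero, which the observed values cannot do, so the Normal curve and the histogram would be expected to disagree on the left side.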
Mean and Median relationship:
- The mean is greater than the median, which indicates that the sample distribution is positively skewed
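The gap between mean and median can be turned into a rough number with Pearson's median skewness coefficient, 3 × (mean − median) / standard deviation. This coefficient is not used in the original post; it is offered here only as a sketch of one common way to quantify the comparison, using the values reported above. The function name is hypothetical.

```python
def pearson_median_skewness(mean, median, sd):
    """Pearson's second (median) skewness coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd

# Values reported in the post
skew = pearson_median_skewness(mean=100.76, median=75, sd=104.873)
print(round(skew, 2))  # positive value -> skewed to the right
```

A value near zero would be consistent with a symmetric distribution such as the Normal; a clearly positive value like this one points to a longer right tail.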
Mode class = 10 (link to finding the mode: https://www.desmos.com/calculator/xazzp1g4rf)
- This indicates that 10 is the most frequent number in the data set
IQR:
Q(1) = 19
Q(3) = 126
IQR = Q(3) - Q(1) = 107
- The IQR can be used to spot possible outliers in the sample data set; in this sample, numerous outliers can be observed.
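The post does not say which rule was used to flag outliers. A common convention is the 1.5 × IQR rule, which marks any value below Q(1) − 1.5 × IQR or above Q(3) + 1.5 × IQR as a possible outlier. The sketch below applies that rule, as an assumption, to the quartiles reported above; the function name and the multiplier `k` are illustrative.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences using the k * IQR rule (k = 1.5 assumed)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles reported in the post
lower, upper = tukey_fences(q1=19, q3=126)
print(lower, upper)  # -141.5 286.5
```

With these fences, no positive value can be flagged on the low side, so under this rule any outliers in the sample would have to be values above 286.5.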
Shape of the histogram:
The histogram is skewed to the right, which indicates that the sample data are positively skewed.
All of the collected data values are greater than zero, so the distribution is bounded on the left while its tail stretches to the right.
The humps also indicate that a large number of the values in this sample fall on the left side, with very few on the right side.