I need Assignment #3 completed. The rest is for reference.
Example of how to report on Descriptive Statistics

Statistics

Variable   SE-MaritalStatus   N  N*    Mean  SE Mean  StDev  Minimum     Q1  Median      Q3  Maximum
SE-Income  Married           15   0  102696     1505   5828    95835  98309  101829  106894   114505
           Unmarried         15   0   96626      258   1000    94929  95744   96664   96928    98717

Table 1: Married/Unmarried SE Income

Table 1 displays the data analysis for Married versus Unmarried individuals by SE Income. The results suggest that married people have a higher mean income than unmarried people, $102,696 vs. $96,626, a mean difference of $6,070. Moreover, the SD for married individuals ($5,828) is larger than for unmarried individuals ($1,000). The difference of $4,828 is indicative of more income variability within the married group. The SE Mean is an indication of the reliability of the mean. For Married individuals the SE Mean is $1,505 versus $258 for Unmarried individuals, suggesting that the Unmarried group provides a more complete and reliable picture of the mean value relative to the overall population. The married group is less reliable in terms of approximating a true mean.

Statistics

Variable                SE-MaritalStatus   N  N*   Mean  SE Mean  StDev  Variance  CoefVar
USD-AnnualExpenditures  Married           15   0  77236      976   3779  14283357     4.89
                        Unmarried         15   0  55960      137    532    283096     0.95

Variable                SE-MaritalStatus  Minimum     Q1  Median     Q3  Maximum   IQR  Mode  N for Mode  Skewness
USD-AnnualExpenditures  Married             72228  73550   77744  80934    82676  7384     *           0      0.10
                        Unmarried           55120  55558   55963  56393    57082   835     *           0      0.27

Table 2: Married/Unmarried Annual Expenditures

Table 2 displays the data analysis for Married versus Unmarried individuals by USD Annual Expenditures. The results suggest that married people have a higher mean expenditure than unmarried people, $77,236 vs. $55,960, a mean difference of $21,276. Moreover, the SD for married individuals ($3,779) is larger than for unmarried individuals ($532).
The difference of $3,247 is indicative of more variability in Annual Expenditures within the married group. The SE Mean is an indication of the reliability of the mean. For Married individuals the SE Mean is $976 versus $137 for Unmarried individuals, suggesting that the Unmarried group provides a more complete and reliable picture of the mean value relative to the overall population. The married group is less reliable in terms of approximating a true mean. The skewness for Married individuals is 0.10 as compared to 0.27 for Unmarried individuals. Both groups show a slight positive (right) skew.

Table 2 displays quartiles for the Married and Unmarried individuals. Quartile 1 (Q1) is the median of the lower half of the data set; as such, roughly 25% of the numbers in the data set lie below Q1 and about 75% lie above Q1. The third quartile (Q3) is the median of the upper half of the data set; roughly 75% of the numbers in the data set lie below Q3 and about 25% lie above Q3. The interquartile range (IQR) is $7,384 for Married individuals and $835 for Unmarried individuals. The IQR indicates the spread of the "middle" values. Given that the median for the Married group is $77,744, the IQR of $7,384 indicates a wide spread around the middle of the data. The median for the Unmarried group is $55,963 and the IQR is $835. The values for Annual Expenditures appear to cluster closer to the midpoint of the data set for the Unmarried individuals.

University of Maryland Global Campus
STAT200 - Assignment #1: Descriptive Statistics Data Analysis Plan

Identifying Information
Student: Diana Edwards
Class: Stat 200
Instructor: Alyson Muff
Date: 3/29/21

Scenario: I am a 36-year-old (head of household) working as a private investigator making $99,610 a year. I am married with 2 kids, spending a total of $73,550 on annual expenses for my family size of 4. Each year I spend $9,513 on food, $158 on entertainment and $37 on education.
I would like to compare my income and monthly expenses to other married couples with a family size of 4 to determine if our expenditures are reasonable compared to others in similar situations.

Table 1. Variables Selected for the Analysis

Variable Name in the Data Set   Description (See the data dictionary for describing the variables.)   Type of Variable (Qualitative or Quantitative)
Variable 1: “Income”            Annual household income in USD.                                       Quantitative
Variable 2: “Marital Status”    Marital Status of Head of Household.                                  Qualitative
Variable 3: “Family size”       Total Number of People in Family (Both Adults and Children)           Quantitative
Variable 4: “Expenses”          Total amount of annual expenditures in USD.                           Quantitative
Variable 5: “Food”              Total amount of annual expenditure on food in USD.                    Quantitative

Reason(s) for Selecting the Variables and Expected Outcome(s): [Instructor: good]

1. Variable 1: “Income” - I chose income as a variable because it will be useful in comparing the annual income of other households and how income affects annual expenses. I am interested in this variable because I want to see whether, if my annual income increased, I would spend more money or attempt to save more. The expected outcome is that households with higher annual incomes will have higher annual expenses.

2. Variable 2: “Marital status” - I chose marital status as a variable because I want to see if being married affects the amount of annual expenses per household compared to those who are single. I am interested in this variable because I am also married and would be curious to find out if married households have more expenditures. I would expect the outcome to show that married households spend more money annually on expenses than single households.

3. Variable 3: “Family size” - I chose family size as a variable to see if the number of people in a household affects annual expenditures.
I am interested in this variable because I also have a household size of 4 and I want to compare similarities. The expected outcome is that households with a larger family size will have increased expenditures.

4. Variable 4: “Expenses” - I chose expenses as a variable because I want to compare the composition of different households and how it affects their annual expenditures. I am interested in this variable because I am curious to see how other households spend annually compared to my budget. The expected outcome would show higher expenditures as family sizes increase.

5. Variable 5: “Food” - I chose food as a variable because I want to compare the amount of money spent on food annually with variations in family size and eating habits such as dining out or home cooking. I am interested in this variable because the majority of the time my family eats at home, but I tend to order out often when I go to work. The expected outcome would show that larger households/family sizes spend more money on food annually.

Proposed Data Analysis: Measures of Central Tendency and Dispersion

Table 2. Numerical Summaries of the Selected Variables

Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate

Variable 1: “Income”
· Median
· Sample Standard Deviation
I am using median for two reasons:
· If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency.
· The variable is quantitative.
I am using sample standard deviation for three reasons:
· The data is a sample from a larger data set.
· It is the most commonly used measure of dispersion.
· The variable is quantitative.
[Instructor: good but be careful and know the risks. See my Cheat Sheet on Median under Class Announcements]

Variable 2: Marital Status
· Mode
I am using mode for two reasons: [Instructor: no, this won’t tell you anything.]
· The mode is the most commonly occurring value and is the only measure that can be used with all levels of data (nominal, ordinal, interval, and ratio).
· The variable is qualitative.

Variable 3: Family size
· Mean
· Standard deviation
I am using mean for two reasons: [Instructor: why the mean here and the median above? Don’t mix ‘apples and oranges’; you can’t compare mean and median, for example.]
· The distribution is more symmetric and consistent. [Instructor: not necessarily.]
· The variable is quantitative.
I am using standard deviation for two reasons:
· The data is from a larger data set, and the standard deviation will help determine how skewed the data is from the average.
· The variable is quantitative. [Instructor: Which is why the mean is a potential problem.]

Variable 4: Expenses
· Median
· Standard deviation
I am using median for two reasons: [Instructor: see above about mixed comparisons. Also, median expenditures would tell you what?]
· The distribution is more skewed, and the median is resistant to outliers.
· The variable is quantitative.
I am using standard deviation for two reasons:
· The data is from a larger data set, and the standard deviation will help determine how skewed the data is from the average.
· The variable is quantitative.

Variable 5: Food
· Median
· Standard deviation
I am using median for two reasons: [Instructor: see above]
· The distribution is more skewed, and the median is resistant to outliers.
· The variable is quantitative.
I am using standard deviation for two reasons: [Instructor: ? standard deviation within the food categories? What will this tell you?]
· The data is from a larger data set, and the standard deviation will help determine how skewed the data is from the average.
· The variable is quantitative.

Graphs and/or Tables

Table 3. Type of Graphs and/or Tables for Selected Variables

Variable Name | Graph and/or Table | Rationale for why Appropriate?

Variable 1: “Income”
Graph: Histogram
A histogram is one of the best plots available to visually depict the distribution of quantitative level data. [Instructor: Not necessarily. See my Cheat Sheet under Class Announcements 3/29/21.]
[Instructor: It depends on what you depict on the x and y axes.]

Variable 2: “Marital status”
Pie chart
A pie chart is beneficial for showing areas proportional to the number of data points in each category. It also displays a small number of categories such as married/not married. [Instructor: There are only two categories. Of what value is this comparison in isolation?]

Variable 3: “Family size”
Frequency table
A frequency table is a summary of the data with counts of how often a data value occurs. It will help show how often households have a particular family size. [Instructor: good]

Variable 4: “Expenses”
Bar graph
A bar graph shows observations of multiple categories and summarizes a large data set so that it can be interpreted visually.

Variable 5: “Food”
Bar graph
A bar graph shows observations of multiple categories and summarizes a large data set so that it can be interpreted visually.

Suzannah: You will be measuring your data against the data provided. Keep that in mind. I would suggest a few comparisons. Example below.

Variable           Recoded SE-MaritalStatus   N  N*   Mean  SE Mean  StDev  Variance  CoefVar
USD-Entertainment  1                         15   0  125.5     12.8   49.6    2461.3    39.52
                   2                         15   0  95.80     2.43   9.43    88
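
For reference, the columns in the example tables (N, Mean, SE Mean, StDev, Variance, CoefVar, quartiles, IQR) can be computed per group with a short script. The following is only a minimal Python sketch under stated assumptions, not the course's method: the function name `describe`, the group labels, and the spending values are hypothetical and are not taken from the data set. It assumes SE Mean = StDev / sqrt(N) and CoefVar = 100 * StDev / Mean, which is consistent with the figures reported above. Quartiles use Python's default "exclusive" rule; other software may use a slightly different rule and give slightly different Q1 and Q3 values.

```python
import math
import statistics


def describe(values):
    """Return common descriptive statistics for one group of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Three cut points splitting the data into quarters (default "exclusive" method).
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "N": n,
        "Mean": mean,
        "SE Mean": stdev / math.sqrt(n),   # standard error of the mean
        "StDev": stdev,
        "Variance": statistics.variance(values),
        "CoefVar": 100 * stdev / mean,     # coefficient of variation, in percent
        "Minimum": min(values),
        "Q1": q1,
        "Median": median,
        "Q3": q3,
        "Maximum": max(values),
        "IQR": q3 - q1,                    # spread of the middle 50% of values
    }


# Hypothetical entertainment spending in USD, keyed by a recoded group label.
groups = {
    1: [100, 150, 125, 175, 75],
    2: [90, 95, 100, 105, 110],
}

for label, values in groups.items():
    stats = describe(values)
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

Computing the same measure for each group (for example, mean with mean, or median with median) keeps the comparison like-for-like, which is the point of the "apples and oranges" comment above.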