Instructions for the Data Analysis and Statistical Modelling Assignment

Please submit your assignment as a Word document on or before Wednesday 24th May, 5.00pm. (You may submit a draft well before this date to check its similarity score. You must keep the similarity below 30% to gain full marks. You may submit your assignment as many times as you like; however, only the last submission will be graded. If the similarity is over 70%, you can expect 0 marks for the assignment. The best way to minimise similarity is to write the assignment in your own words, because there is a chance that the similarity score may go up after your deadline. You need to give a reference whenever you paraphrase from another source.)

Remark: Your report should be self-contained. It should look like a journal article.

Step 1: Find or Collect a Dataset
For this project, you must find some published, existing data. Possible sources include almanacs, magazines, journal articles, textbooks, web resources, athletic teams, newspapers, reference materials, campus organizations, professors with experimental data, electronic data repositories, and the sports pages. Alternatively, you may collect your own data from fellow students, neighbours, or friends. The dataset you select must have at least 25 cases. It must also have at least two categorical variables and at least two quantitative variables. Choose or collect a dataset that interests you!

Step 2: Analyse Your Data!
See the description below of what analysis should be included. Use technology to automate calculations and graphs.

Step 3: Write Your Report
Cut and paste all relevant computer output with your analysis. Be sure to include both the computer output and your discussion of that output in every case. As you discuss each analysis, be sure to interpret what you are finding in the context of your particular data situation. Include all of the following.
• Introduction: How did you find or collect your data? (If you found the data, give a clear reference.
If you collected the data, describe clearly the data collection process you used.) What are the cases? What are the variables? What population do you believe the sample might generalize to? Is the sample data from an experiment or an observational study? Include a copy of the dataset.
BUS708 TUTORIAL T1 2017 KOI ASSIGNMENT
• Analysis of One Quantitative Variable: For at least one of the quantitative variables, include summary statistics (mean, standard deviation, five-number summary) and at least one graphical display. Are there any outliers? Is the distribution symmetric, skewed, or some other shape?
• Analysis of One Categorical Variable: For at least one of the categorical variables, include a frequency table and a relative frequency table.
• Analysis of One Relationship between Two Categorical Variables: Use your own data to conduct a chi-square test for association between the two categorical variables. State the hypotheses of the test. Conduct the test, showing all details such as the expected counts, the contribution of each cell to the chi-square statistic, the degrees of freedom used, and the p-value. State a clear conclusion in context. If the results are significant, which cells contribute the most to the chi-square statistic? For these cells, are the observed counts greater than or less than expected? Whether or not the results are significant, describe the relationship as if you were writing an article for your campus paper. If the results are significant, can we infer a causal relationship between the variables?
• Analysis of One Relationship between a Categorical Variable and a Quantitative Variable: Include a side-by-side histogram and describe it. Does there appear to be an association between the two variables? If so, describe it. Also, use some summary statistics to compare the groups.
• Analysis of One Relationship between Two Quantitative Variables: For at least one pair of quantitative variables, include a scatterplot and discuss it.
• Conclusion: Briefly summarize the most interesting features of your data.

Topic or Resource Suggestions
Use one of these, come up with your own idea, or find your own source. There are many sites reporting frequency counts from survey results.
• Frequency of smoking (never, occasionally, frequently), gender for students, age of the student, number of years smoking, etc.
• Academic division (business, accounting, TESOL,...), whether the student has a Mac, PC, or neither, for students, age, and number of trimesters completed.
• Whether a person plans to vote in the next election, political party affiliation (yes or no), age, and number of years affiliated with the party.
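
As an illustration of the chi-square analysis requested in Step 3, the following minimal Python sketch computes the expected counts, the contribution of each cell, the degrees of freedom, and the p-value for a two-way table. The counts are hypothetical (an invented gender by smoking-frequency table, not real survey data), and the names `chi_square_test` and `chi2_sf` are illustrative only. The p-value is obtained from a series for the incomplete gamma function so that only the standard library is needed; statistical software would normally report it directly.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_test(observed):
    """Chi-square test for association on a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count = (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Contribution of each cell = (observed - expected)^2 / expected.
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    chi2 = sum(sum(row) for row in contributions)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return {
        "expected": expected,
        "contributions": contributions,
        "chi2": chi2,
        "df": df,
        "p_value": chi2_sf(chi2, df),
    }


if __name__ == "__main__":
    # HYPOTHETICAL counts. Rows: Female, Male.
    # Columns: Never, Occasionally, Frequently (smoking frequency).
    observed = [[14, 6, 5], [8, 7, 10]]
    result = chi_square_test(observed)
    for label, exp_row, con_row in zip(
        ["Female", "Male"], result["expected"], result["contributions"]
    ):
        print(label, "expected:", [round(e, 2) for e in exp_row])
        print(label, "contributions:", [round(c, 3) for c in con_row])
    print("chi-square =", round(result["chi2"], 3), "df =", result["df"])
    print("p-value =", round(result["p_value"], 4))
```

To use the sketch with your own data, replace the hypothetical counts with the observed counts from your two categorical variables. A common rule of thumb is that every expected count should be at least 5 for the chi-square approximation to be reliable.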