Please refer to the attached file for more details about the assignment.
Stats Assignment Y.D. MATH2114- Assignment 1 (Stats.) Page 1 of 4 RMIT Classification: Trusted RMIT University, School of Science, Mathematical Sciences Math2114 –Numerical Methods & Statistics for Engineers Assignment 1 (Statistics Module) Due: Thursday 17:00 pm, 8th Oct. (Week 10), Semester 2, 2020 This assignment contributes 8% towards the assessment for the subject. The Assignment reports should be in the PDF format, and must be individual and submitted online via course Canvas site under the tab: Assignments. Late assignments will not be marked, unless a special consideration is granted prior to the due day for a later submission. Instructions: - The following list of questions is based on the material covered in the textbook Chapters 1- 7. Please work through these problems and present your calculations and/or Excel output where required. - You will need to create a Microsoft Word document, copy your Excel output in forms of tables into the word document, or take a screen shot and then insert the image into the word document, as well as present your written answers to the questions in the report. - At the completion of the report, save it to a .pdf file and submit it online to the course Canvas site. Question 1 (total 2.8 marks) The Pavement Research group is investigating the durability of brick pavers subject to high volume pedestrian traffic. Under experimental conditions, holes are gouged in fixed pavers subject to the vibrating head of a tip-hardened nail punch. The volume of the hole (as assessed by the liquid displacement) is a measure of the durability of the pavers. In a test of the durability of brick pavers, 20 pavers made from two types of brick (Control type and Experimental type) gave the following liquid displacement (ml): n Control Experimental 1 380 361 2 321 447 3 366 401 4 356 375 5 283 434 6 349 403 7 402 393 8 462 426 9 356 406 10 410 318 11 329 467 12 399 407 13 350 427 14 384 420 15 316 477 16 272 392 17 345 430 18 455 339 19 360 410 20 431 326 Y.D. MATH2114- Assignment 1 (Stats.) Page 2 of 4 RMIT Classification: Trusted 1A) Use Excel to order the Control and Experimental datasets from the smallest value to the largest value, respectively. Present the ordered data in the report and then calculate Q1, Q2, and Q3 and present your calculations for the 5-number summary in the report for both data groups. (0.8 marks) Hint: You may refer to Chapter 3 Lecture Slides #36 - #41 for the calculations 1B) Using Excel to produce the box & whisker plots for the Control and Experimental data groups, respectively. (0.2 marks) 1C) Discuss the presence of outliers in these data sets and provide the reason to justify your answer. (0.5 marks) 1D) Using Excel to determine the sample mean, median, mode, standard deviation, variance, Kurtosis and skewness for each of the Control and Experimental data: Data > Data Analysis > Descriptive Statistics > Summary statistics Hint: You may refer to Chapter 3 Lecture slides #32 & slide #33 for Excel instructions. (0.3 marks) 1E) Comment on the distributions of the two datasets, in terms of central tendency, variation and shapes of the distribution. (1.0 mark) Question 2 (total 1.2 marks) A highly contagious disease has broken out in Melbourne that is characterized by symptoms of ataxia, accelerated autophagy, cognitive decline and an insatiable desire for dopamine (through ingestion). The probability that a household contains an infected person has been determined to be 0.3. You have an urgent need to organise medical supplies and decide to try the 15 houses in your street for the following required information. Note: you are required to present both calculation and Excel output for Questions 2A – 2D. 2A) What are the expected value (μ) and standard deviation (σ)? (0.3 mark) 2B) Find the exact probability that 3 houses would contain infected people (both hand calculation and Excel output are required). (0.3 mark) 2C) Find the probability that, at most, 4 houses would contain infected people (both hand calculation and Excel output are required). (0.3 mark) 2D) Find the probability that more than 5 houses would contain infected people (both hand calculation and Excel output are required) (0.3 mark) Hints: • You may refer to the examples on Slides #23 – #27 of Chapter 5 Lecture Notes for both hand calculations and Excel instructions (on Slides25) for Question 2A. • You may refer to the examples on Slides #18 – #20 and #26 – #27 of Chapter 5 Lecture Notes for hand calculations for the Questions 2B – 2D. • You may refer to the following table for the appropriate Excel command for the Questions 2B – 2D. Find BINOMIAL Probabilities: Binomial Probability The command to type into Excel (do NOT forget the “=” symbol) Command Meaning P(X = x|n,π) =BINOM.DIST(x,n,π,false) The exact probability at X=x P(X < x|n,π)="BINOM.DIST(x-1,n,π,true)" the="" probability="" that="" is="" less="" than="" x="x" p(x="" ≤="" x|n,π)="BINOM.DIST(x,n,π,true)" the="" probability="" that="" is="" at="" most="" when="" x="x" y.d.="" math2114-="" assignment="" 1="" (stats.)="" page="" 3="" of="" 4="" rmit="" classification:="" trusted="" p(x=""> x|n,π) =1-BINO.MDIST(x,n,π,true) The probability that is more than X=x P(X ≥ x|n,π) =1-BINO.MDIST(x-1,n,π,true) The probability that is at least when X=x Question 3 (total 1 mark) Suppose that, based on a previous study, the systolic blood pressure (X) of a female satisfies ? ~ ?(110, 142). Note: ? ~ ?(?, ?2) states that variable X is approximately distributed normally with the population mean µ, and variance ?2. 3A) Determine the probability that a randomly selected female has a systolic blood pressure reading of less than 150. (0.5 marks) 3B) 90% of females would have blood pressure less than what value? (0.5 marks) Instruction: (1) You can utilise appropriate tables in COMPUTE worksheet of Excel workbook: Normal_template for your calculation. Copy the tables with your Excel output into your report. Note: you only need to change the data in the cells highlighted in the purple colour, and then the results will be automatically generated in the corresponding cells highlighted in the yellow colour. All the Excel formulas for each yellow coloured cell are provided in the Excel file: COMPUTE_FORMULAS for your information. (2) You are also required to present the hand-calculations using the Standard Normal Probability table (Z table). Question 4 (total 3 marks) The length of time that a 50-cent coin remains in circulation is modelled with a uniform function, with a mean of 24 months. Using Excel, generate the random data with the uniform distribution to represent sampling the length of time in circulation for 100 samples of n = 200. 4A) Create a histogram for a sample from your randomly generated data (you may use the first column data); then describe the general distribution of the sample data. Label the graph appropriately. (0.5 marks) 4B) Calculate the sample mean for each of the 100 samples (a total of 100 means); and then plot the sample means distribution with a histogram. Label appropriately. (2 marks) 4C) Describe the distribution of the sample means (i.e. sampling distribution) and provide the reason why you observe this distribution. (0.5 marks) Instructions: (1) To generate the required random data: (a) Open an Excel Worksheet (b) In cell A1: type: n; In cells A2 to A201 generate a series: 1, 2, 3, ……., 200 (see column A in Figure 1 below) (c) In cell B1 type: Sampling distribution (d) Select: Data > Data Analysis > Random Number Generation > OK. (e) In the pop-up dialogue window, input the required information as shown in the Figure; then OK. Y.D. MATH2114- Assignment 1 (Stats.) Page 4 of 4 RMIT Classification: Trusted Figure 1 Settings in the Random Number Generation dialogue window By now you will have generated 100 datasets (100 columns) with 200 random numbers in each dataset in the Excel worksheet Sheet1. (2) In Question 4(a), you are required to create a histogram for one sample distribution. Do the following steps: (a) Copy a column data, for example, data in Column B. (b) Open a new worksheet (Sheet 2), then Paste Special > Values to Column B of the new worksheet (Sheet 2). (Refer to Exercise 3 of Lab 1) (c) You now can create a histogram for the selected sample data distribution. If you forgot how to do this please refer to the Exercise 1 of Lab 2. (3) In Question 4(b), you are required to find all the sample means of the 100 datasets. Do the following steps: (a) In your worksheet: Sheet1, click on cell A203, enter Sample