
Hi, I have attached an assignment file.



BUS708 Statistics and Data Analysis
Statistical Modelling Assignment
Trimester 2, 2018

1 OVERVIEW OF THE ASSIGNMENT

This assignment will test your skills of collecting and analysing data to answer a specific business problem. It also gives you the opportunity to apply the theories you have learned in this course such as finding numerical summaries, displaying with appropriate graphs and using statistical inferences to solve business problems, including constructing hypotheses, test them and interpret the findings. You may have to use two Data sets. One Data set will be sent to you via KOI student email individually and you need to find or collect another dataset.

Suppose you are working for an agency who analyse NSW transport system data to make a recommendation to improve public transport system. You will be given series of research questions. Use your knowledge that you gain from this course to answer these questions by displaying appropriate outputs of Excel, StatKey or Wolfram alpha. Use these answers to write an executive summary which might be a valuable recommendation to Transport NSW.

2 TASK DESCRIPTION: WRITTEN REPORT

There are two datasets involved in this assignment: Dataset 1 and Dataset 2, detailed below.

Dataset 1: You will receive an email that contains a dataset that is specifically allocated to you. This dataset is a subset of a data Opal Tap on and Tap Off Location - 8th to 14th August 2016 individual sample file (https://opendata.transport.nsw.gov.au/dataset/opal-tap-and-tap/resource/c8d1d429-c283-4350-95f8-d8d21b845ac0), provided by the Transport for NSW Open Data and has been edited to only include a subset of the cases and variables. The original dataset can be obtained from https://opendata.transport.nsw.gov.au/dataset/opal-tap-on-and-tap-off and it is under the license of Creative Commons Attribution 3.0 Australia (https://creativecommons.org/licenses/by/3.0/au/). Data dictionary of the edited dataset is given in the following table.

Variable   Description                                                               Values
mode       Type of the public transport                                              Bus, Train, Ferry and Light Rail
date       Date of the tap on/off held                                               Date/month/year
tap        It is a tap on or off                                                     On and Off
loc        Locations of stops. For bus postcodes and others name of the stations     Postcodes and names of the stations
count      Total number tap on or off on the certain location and the certain date   Number

Dataset 2: Collect data (e.g. via a survey) that will answer research question given in section 3. There is no requirement about the number of variables, sampling methods and sample size, but you need to justify your approaches in Section 1 (see below).

Both datasets should be saved in an Excel file (one file, separate worksheets). All data processing should be performed in Excel or Statkey (http://www.lock5stat.com/StatKey).

Prepare a report in a document file (.doc or .docx) which includes all relevant tables and figures, using the following structure:

1. Section 1: Introduction
a. Give a brief introduction about the assignment and search related article and write a paragraph of summary which supports your assignment. You need to give the full citation of the article.
b. Dataset 1: Give a short description about this dataset. Is this primary or secondary data? What are types of variables involved? Explain briefly what are the possible cases used in this study.
c. Dataset 2: Explain how you collect the data and discuss its limitation (e.g. whether your sample is biased). Is this primary or secondary data? What is/are the type(s) of variable(s) involved? Give a description of cases you consider for this data set.

2. Section 2: Analysis of single variable in Dataset 1
a. To answer research question “Which type of public transport was most used by the NSW people during 8th to 14th of August 2016?”, provide a suitable numerical summary and graphical display for the variables mode of Dataset 1. Give a detailed comment to answer the research question.
b. Now to answer research question “Are there more than 50% of public transport users in NSW use the particular mode of transport found in Part a?” setup an appropriate hypotheses, perform hypotheses test and answer the research question by writing the conclusion of the test.

3. Section 3: Analysis of two variables in Dataset 1
NSW Government need to decide on whether they have to build an underground Railway line from either Parramatta, Bankstown or Gosford to central. To prepare a recommendation for this;
a. Give a numerical summary and an appropriate graphical display for the variables location, by only considering those three stations; and the variable count by considering the data with trains only.
b. Perform a suitable hypothesis test at a 5% level of significance to test whether there is difference between mean counts of taps on and off.
c. Use the conclusion of the test in part b and the outputs in part a to write a recommendation to NSW government.

4. Section 4: Collect and analysis Dataset2
You are interested in finding whether there is a difference in preference between different gender in terms of their transport mode (Bus, Train, Ferry and Light Rail). by considering appropriate number of cases and variable, give a proper graphical display and use it to write a comments.

5. Section 5: Discussion & Conclusion
Write an executive summary by combining all your findings in the previous sections which must be a valuable recommendation for NSW Transport. Give a suggestion for further research

3 TASK DESCRIPTION: PRESENTATION/INTERVIEW

A presentation/interview for the assignment is scheduled on Week 11, in your allocated tutorial. You do NOT need to prepare a presentation material (e.g. power-point slides), instead, you will be asked to demonstrate and/or explain how you summarised the data and how you performed the analysis. You may be asked to reproduce what you have made in your written report (e.g. generate a chart or numerical summary using Excel or Statkey).

4 SUBMISSION REQUIREMENT

Deadline to submit written report: Week 10 Friday (21 Sep 2018), 11:59 pm

You need to submit 2 files to Turnitin:
1. Main report, in a Microsoft Word document file (this is the file that will be marked, it should contain all necessary tables and figures)
2. Dataset, in a Microsoft Excel file (this is just a supporting file)

Main report (word document):
1. Size: A4
2. Use Assignment Cover Page (download from Moodle) with your details and signature
3. Single space
4. Font: Calibri, 11pt

Dataset (excel document):
1. Dataset 1 in Sheet 1
2. Dataset 2 in Sheet 2
3. Data processing for each section in other sheets (rename the sheet appropriately)

5 DEDUCTION, LATE SUBMISSION AND EXTENSION

Late submission penalty: - 5% of the total available marks per calendar day unless an extension is approved. For extension application procedure, please refer to Section 3.3 of the Subject Outline.

6 PLAGIARISM

Please read Section 3.4 Plagiarism and Referencing, from the Subject Outline. Below is part of the statement:

“Students plagiarising run the risk of severe penalties ranging from a reduction through to 0 marks for a first offence for a single assessment task, to exclusion from KOI in the most serious repeat cases. Exclusion has serious visa implications.”

“Authorship is also an issue under Plagiarism – KOI expects students to submit their own original work in both assessment and exams, or the original work of their group in the case of a group project. All students agree to a statement of authorship when submitting assessments online via Moodle, stating that the work submitted is their own original work. The following are examples of academic misconduct and can attract severe penalties:
- Handing in work created by someone else (without acknowledgement), whether copied from another student, written by someone else, or from any published or electronic source, is fraud, and falls under the general Plagiarism guidelines.
- Students who willingly allow another student to copy their work in any assessment may be considered to assisting in copying/cheating, and similar penalties may be applied.”
Answered Same Day, Aug 31, 2020, BUS708, University of the Sunshine Coast


Pooja answered on Sep 06 2020
Table of Contents
Section 1: Introduction
Introduction
Dataset 1
Dataset 2
Section 2: Analysis of single variable in Dataset 1
Type of public transport
Hypothesis testing
Section 3: Analysis of two variables in Dataset 1
Location and Count
Hypothesis testing
Recommendation
Section 4: Collect and analysis Dataset2
Section 5: Discussion & Conclusion
References
Section 1: Introduction
Introduction
This assignment analyses Opal tap-on and tap-off locations for the week of 8th to 14th August 2016. The original dataset is published by Transport for NSW Open Data under the Creative Commons Attribution 3.0 Australia licence. The sample analysed here is a subset of that original dataset.
Dataset 1
The analysis is based on secondary data, that is, data that was collected by another party (here, Transport for NSW) rather than by the analyst, and that was obtained from a publicly available source.
There are five variables in this dataset: the type of public transport used (mode), the date of the tap on/off (date), whether the tap is on or off (tap), the location of the stop (loc), and the total number of taps on or off (count).
The type of public transport is denoted by mode. It is a categorical variable measured on a nominal scale. Its categories are Bus, Train, Ferry and Light Rail.
The date of the tap on/off is denoted by date. A date is a measure of time and is measured on an interval scale.
The location of the stop is denoted by loc. This variable takes two kinds of values: postcodes (for buses) and station names (for the other modes). Both are measured on a nominal scale. Although postcodes are written as numbers, they are labels for areas, not quantities.
The total number of taps on or off at a given location on a given date is denoted by count. It is a discrete quantitative variable measured on a ratio scale, so ratios between values are meaningful. For example, I can say that the total number of taps at one location is twice the total at another location.
Dataset 2
I use stratified sampling to obtain my second dataset. This is primary data, because I collect it myself through a survey. My population is all the citizens of Australia. The strata are five Australian states: New South Wales, Queensland, Victoria, South Australia, and Tasmania. Simple random sampling is used to select 10 individuals from each stratum, giving a sample size of 50. Each individual is given a survey and asked about their gender and the mode of transport they use.
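The sampling design described above can be sketched in a few lines of Python. This is only an illustration: the sampling frame, its size of 200 candidates per state, the respondent IDs, the seed, and the helper name `stratified_sample` are all assumptions made for the example and are not part of the survey itself.

```python
import random

# The five strata named in the text.
STATES = ["New South Wales", "Queensland", "Victoria", "South Australia", "Tasmania"]


def stratified_sample(frame, per_stratum, seed=None):
    """Draw a simple random sample of `per_stratum` units from every stratum."""
    rng = random.Random(seed)
    return {stratum: rng.sample(units, per_stratum) for stratum, units in frame.items()}


# Hypothetical sampling frame: 200 placeholder respondent IDs per state (not real data).
frame = {state: [f"{state}-{i:03d}" for i in range(200)] for state in STATES}

sample = stratified_sample(frame, per_stratum=10, seed=708)
total = sum(len(units) for units in sample.values())
print(total)  # 5 strata x 10 respondents each
```

Because `random.sample` draws without replacement, no respondent can be selected twice within a state. Equal allocation (10 per state) keeps the design simple, but it does not weight states by population size.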
Gender and mode of transport are both categorical variables measured on a nominal scale. Gender is recorded as either male or female. Mode of transport is categorized as Bus, Train, Ferry or Light Rail.
Stratifying by state and selecting respondents at random within each state is intended to make the sample a good representation of the population, although no sampling method can guarantee this. A representative sample supports reliable results.
Section 2: Analysis of single variable in Dataset 1
Type of public transport
The frequency distribution depicting the frequency of each category of public transport used is given below.
Mode           Count
bus            472
ferry          24
lightrail      19
train          485
Grand Total    1000
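As a quick check on the frequency table above, the share of each mode can be computed directly from the counts. This is a minimal sketch; the variable names are chosen for the example.

```python
# Counts copied from the frequency table above.
mode_counts = {"bus": 472, "ferry": 24, "lightrail": 19, "train": 485}

total = sum(mode_counts.values())
proportions = {mode: n / total for mode, n in mode_counts.items()}
most_used = max(mode_counts, key=mode_counts.get)

# Print the modes from the largest share to the smallest.
for mode, share in sorted(proportions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{mode:<10}{share:6.1%}")
print("Most frequent mode:", most_used)
```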
It is evident that train is the most common type of public transport in the sample: 485 of the 1000 sampled records are for train, compared with 472 for bus. Very few records are for light rail (19) or ferry (24). The number of...
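The brief's follow-up question, whether more than 50% of cases use the most common mode, can be sketched as a one-sided, one-sample z-test for a proportion. This is an illustrative sketch, not the calculation from the original answer. It assumes the 1000 sampled records are treated as the cases, uses the normal approximation, and tests H0: p = 0.5 against Ha: p > 0.5. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_proportion_z_test(successes, n, p0=0.5):
    """Test H0: p = p0 against Ha: p > p0 using the normal approximation."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return p_hat, z, p_value


# 485 train records out of 1000, taken from the frequency table.
p_hat, z, p_value = one_sided_proportion_z_test(successes=485, n=1000)
print(f"p_hat={p_hat:.3f}, z={z:.3f}, p_value={p_value:.3f}")
```

Because the sample proportion (0.485) is below 0.5, the z statistic is negative and the upper-tail p-value exceeds 0.5, so under these assumptions the sample gives no evidence that the train share is above 50%.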