Attached are the requirements for the major assignment.
There are two files; one contains the task to be performed and the other is the data file to be used.
Please read the instructions carefully.
FIN10002 S2 2020 Major Assignment

FIN10002: Financial Statistics
Semester 2, 2020
Assessment 2: Report
Weighting: 25%
Due date: 11.59pm Friday, 16th October 2020 (AEST)
Word limit: 1500 (+/- 10%; not including Appendix or Executive Summary)

Assessment overview

For this assessment, you need to produce a report by generating responses to the six tasks presented below. For each of these sections, you will need to use Excel to generate statistical output (statistical analyses and graphs). The report should be presented in the form of a business report to a senior manager who cannot be assumed to have any knowledge of statistical methods. Microsoft Word and Excel should be used to complete this assessment. Your statistical calculations should be carried out using Excel only. For this part of the assignment, you will submit the Excel file and a Word version of the report as soft copies via the submission folder in Canvas.

Assessment details

You will need to download the Excel dataset 'Major Assignment data Taxation Sem 2 2020.xlsx' from Canvas. The data set contains taxation data for a location in Australia and is supplied by the ATO. The data is based upon the tax returns of the individuals in a geographical area. There are 4600 lodgments for a particular year in this data set and eight variables as follows:

1. Id No: the identifier number set by the ATO, in no particular order
2. Gender: 0 = male, 1 = female
3. Age-range: age of the person lodging the return, in intervals of 5 years. See table 1 in the file.
4. Occ_Code: Occupation code (see table 2 in the file)
5. Lodgment_Method: Return lodged either by an agent or by self
6. Tot_inc_amt: Total income or loss, from salary/wages or another source ($)
7. Tot_ded_amt: Total allowable deductions ($)
8. Taxable_Income: Taxable income ($)

You will use this data set to generate responses to the following six tasks.

1. Select a random sample

Select a random sample of size 90 from the given 4600 lodgments in the Taxation lodgment data set. You will use this sample data to complete tasks 2 to 6.

2. Descriptive statistics

Use data summary methods to describe the returns in your sample using seven variables (items 2 to 8 above). (Note: Do NOT do Id No.) Use an appropriate graphical and summary statistical technique, chosen according to the type of variable (note that less appropriate/inappropriate techniques will receive fewer/no marks). Choose your techniques from:

Tabular Techniques: frequency tables and grouped frequency tables
Summary Statistics: mode, median, mean, standard deviation, range, coefficient of variation and interquartile range
Graphical Techniques: pie chart, bar graph, histogram, frequency polygon (see topics 1 and 2)

Do not draw an ogive curve, stem plot, or a box plot in this assignment and do not draw 3-D graphs.

NOTE: for each variable there should be one graph (chart) and at least one table. This should be presented in the body of the report.

   1. For a nominal or an ordinal variable, draw a graph and present a frequency table in percentages.
   2. For a ratio or an interval variable, draw a graph and a summary statistics table, including only the summary statistics appropriate to the type of distribution.
   3. Try to vary the graphs you draw, e.g. pie chart/bar chart or histogram/polygon.
   4. Do not draw two different graphs for the same variable. You can draw the same type of graph for two variables.
   5. Do not include any information that you will not include in your discussion, such as kurtosis.
   6. Display and describe one variable at a time.

3. Confidence intervals

Estimate the following quantities, using 95% confidence intervals. Explain the meaning of your confidence intervals.

   1. The average taxable income for Technicians and Trades Workers only
   2. The average Total allowable deductions for all sample lodgments.
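As a cross-check outside Excel (the brief requires Excel for the submitted calculations), the sampling step and a 95% interval for a mean can be sketched in Python. This is only an illustration under stated assumptions: the population below is synthetic stand-in data rather than the ATO file, the function name `mean_confidence_interval` is hypothetical, and the critical value of about 1.987 is assumed to be the two-sided 95% t value for 89 degrees of freedom (the value Excel's `T.INV.2T(0.05, 89)` should return).

```python
import math
import random
import statistics


def mean_confidence_interval(values, t_crit):
    """Return (mean, lower, upper) for a t-based interval on the mean."""
    n = len(values)
    mean = statistics.mean(values)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    std_err = statistics.stdev(values) / math.sqrt(n)
    margin = t_crit * std_err
    return mean, mean - margin, mean + margin


# Synthetic stand-in for 4600 deduction amounts (NOT the real data set).
rng = random.Random(2020)
population = [rng.uniform(0, 5000) for _ in range(4600)]

# Simple random sample of 90 lodgments, drawn without replacement.
sample = rng.sample(population, 90)

# Assumed critical value: two-sided 95%, 89 degrees of freedom.
mean, lower, upper = mean_confidence_interval(sample, t_crit=1.987)

print(f"Sample mean: {mean:.2f}")
print(f"We are 95% confident that the population mean lies "
      f"between {lower:.2f} and {upper:.2f}")
print(f"Population mean for comparison: {statistics.mean(population):.2f}")
```

For the Technicians and Trades Workers interval, the same function would be applied to the subset of the sample with that occupation code, with the critical value recomputed for the smaller number of lodgments.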
Compare both intervals with their respective true means by calculating the actual population mean for the full 4600 lodgments, and comparing the true population mean to the sample mean and confidence interval (note: it is not usual to do this, so you are asked to do this for the purpose of this assignment). Your confidence interval statement should start with 'We are 95% confident that…'. This section should take half a page or less.

NB: Please make sure you provide enough information in the appendix for your confidence interval calculations to be replicated, so they can be checked.

4. Hypothesis testing

   1. It is often felt that returns lodged by a tax agent claim more total allowable deductions on average than self-prepared returns. Investigate this contention by carrying out an appropriate hypothesis test.
   2. It is often felt that the average taxable income differs for males and females. Use gender for this test.

Only report a non-technical explanation of your methodology and your findings in the main section of the report. The computations and output should be placed in an appendix, including the test statistic, p-value and degrees of freedom. This should take half a page or less.

5. Correlation and regression

In this section, you will investigate the relationship between the total income amount and the age range. Using these two variables (total income amount (Tot_inc_amt) and age range), develop a regression model to predict average total income amount from the age of the lodger. Make sure that you undertake a full regression analysis, with appropriate discussion, and include:

   1. a scattergram and a brief discussion
   2. an estimate of the linear regression model
   3. the coefficients of correlation and determination
   4. a test of the hypothesis that there is no linear relationship between age range and total income amount for each lodgment.

Ensure your scattergram includes a line of best fit.
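The four regression items listed above can be cross-checked outside Excel with a short Python sketch. It is a minimal illustration, not the required method: the function name `simple_linear_regression` is hypothetical, the age-range codes and incomes are made-up values rather than rows from the data set, and the test statistic uses the standard identity t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y on x, plus r, r-squared and the slope t statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)        # coefficient of correlation
    r_squared = r ** 2                    # coefficient of determination
    # t statistic for H0: no linear relationship (slope = 0).
    # Undefined when the fit is perfect (r = +/-1), so guard that case.
    t_stat = math.inf if r_squared == 1 else r * math.sqrt((n - 2) / (1 - r_squared))
    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r_squared": r_squared,
        "t_stat": t_stat,
        "df": n - 2,
    }


# Hypothetical illustration data: age-range codes and total income ($).
age_code = [1, 2, 3, 4, 5, 6, 7, 8]
income = [28000, 35000, 41000, 52000, 49000, 60000, 57000, 66000]

fit = simple_linear_regression(age_code, income)
print(f"Predicted income = {fit['intercept']:.2f} + {fit['slope']:.2f} * age code")
print(f"r = {fit['r']:.3f}, r-squared = {fit['r_squared']:.3f}")
print(f"t = {fit['t_stat']:.3f} on {fit['df']} degrees of freedom")
```

At the 0.05 significance level, the t statistic would be compared with the two-sided critical value for n - 2 degrees of freedom, or equivalently the p-value reported in Excel's regression output would be compared with 0.05.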
Also, make sure you describe the relationship between the variables using R and R-square, and interpret the slope, the coefficients and the results of the hypothesis test. Use a significance level of 0.05.

6. Conclusion

Provide a brief, concise summary of all your findings and briefly mention any limitations in your findings. Make sure you do not give tables or graphs here.

Presentation

The report should be presented in the form of a business report to a senior manager who cannot be assumed to have any knowledge of statistical methods. Make your report informative but concise and use a non-technical style. Do not just quote statistics or analysis results but explain what they mean. In general, do not include in the report formulae, calculations, definitions of statistical terms or discussions on how graphs are constructed. Where appropriate these may be included in the appendices.

It is important that the values which have been calculated are correctly analysed, discussed and interpreted, and that a written description of the main features of the tables and graphs that have been constructed is included. The emphasis in this assignment is on interpretation and analysis, not just the computation of statistics and construction of graphs. It will be assumed that all computations have been correctly performed and that graphs have been properly constructed. Nevertheless, marks will be deducted if these are inaccurate or incorrect.

The presentation is an important feature of a business report. The guide to the presentation that follows gives a general outline to report writing.

Executive Summary

   1. Report only the highlights of the findings.
   2. Entice an Executive to read on.
   3. Essentially a lively summary of the main conclusions.
   4. No longer than one page; this is not counted in the word count and must be on a separate page from the rest of the report.

Introduction

   1. State the purpose of the report, i.e. what you will discuss in the report.
   2. Outline the contents of the report.
   3. Provide a brief description of the methodology.
   4. Describe the source of the data and state its location in the report.
   5. This should contain information about what we expect to read in the project.

This should take about half a page.

Analysis

   1. Contains a thorough yet non-technical description of all the findings (graphs and tables will be included only where they help this discussion).
   2. Details the results that were highlighted in the Executive Summary.
   3. Do not include any calculations here, but include appropriate graphs and tables which are needed to support your discussions.

Conclusion

   1. Report the findings and results of your work.
   2. Essentially an expansion of the Executive Summary, written from the point of view that the Executive Summary has not been read.
   3. End with a discussion of the limitations of your analysis (e.g. reference to sample size if small, or comment on the data if it is old).

Appendices

   1. Must be referred to in the main body of the report.
   2. Must contain your selection of random numbers and related random data.
   3. Include the sample raw data, charts and tables that are not essential, but support the Analysis section.
   4. Include your Excel output for descriptive statistics, confidence intervals, hypothesis testing and regression.
   5. Include any other relevant calculations.

Please make sure the information in the appendix is enough for all calculations to be replicated, so they can be checked; e.g. if you include the output for your confidence intervals, please show how this output was used to calculate the confidence intervals. Graphs must be in the main body along with relevant tables and discussion. Graphs kept only in the appendices will not score any marks for graphs. Keep the appendices to a moderate size. Marks are