This assignment consists of four questions. Only R and the packages (that is, R libraries) described in the lectures and tutorials can be used for generating answers for this assignment! Answers to this assignment are to be provided in a single R script file. All material in your script file should be logically organised, so that related material can be easily and quickly located. Relevant files have been uploaded.
Western Sydney University
Programming for Data Science (COMP7024)
Assignment 2023 Q1
Due 12th of March 2023 [1]

1 Introduction

This assignment consists of four questions, each of equal value, giving a total contribution of 40% to this subject. The beginning of each question provides a breakdown of marks for each part in that question. For example, a breakdown of (1 + 3 + 6 = 10) implies a question consisting of three parts, where the first, second and third parts are worth 1, 3 and 6 marks respectively.

Important: Only R and the packages (that is, R libraries) described in the lectures and tutorials for this subject can be used for generating answers for this assignment! In addition, R Markdown must not be used [2]. Penalties may apply for non-conformance.

2 Answer structure

In doing this assignment, you should not seek to use the maximum word limit declared in the Learning Guide. Marks are awarded not for using many words, but for using an economy of words and stating only what is relevant to the question being answered. Consider the old adage: “less can be more”. Therefore, show you know what is relevant and have mastered the ability to get to the point using few, simple words and clear sentences. Seek to also apply this philosophy to the code you write.

Your answers to this assignment are to be provided in a single R script file. All material in your script file should be logically organised, so that related material can be easily and quickly located. Clearly identify yourself in this file with, as a minimum, your full name and student ID as comments at the beginning of the file.

2.1 R script file

Textual answers should be included as comments in your script file; refer to Listing 1 for examples of including comments. The comments in your script file should be:
• Brief and to the point
• Stating a high level perspective
• Stating what is not immediately obvious, but worth mentioning

[1] You are welcome to submit early. Refer to the “4.1 Early submission” section for further details.
[2] In short, only use R and R packages described in the lectures and tutorials for this subject. You are also not allowed to use R Markdown for this assignment. All these requirements will also apply for the exam.

    # In an R script file, comments are prefixed with the hash symbol
    # A line with just a comment on it

    # Generate a distribution of mean values from a sequence of digits
    d <- replicate(1000, {
      s <- sample(0:9, replace=TRUE)  # generate a sequence of 10 numeric digits
      mean(s)                         # a comment to end a line with R code on it
    })
    hist(d, main='Distribution of means')  # show distribution of means

    # Of course you would use smarter comments than those used here
    # Only state what is not immediately obvious

Listing 1: Some R code with comments (shown in green)

Be brief and to-the-point with respect to comments. The approach described in this section should be the same as used for the exam. Make judicious use of comments a priority in doing this assignment. After all, comments are meant to communicate important and useful details. Make sure you also communicate well through wise choices in variable and function names. Also make wise decisions regarding the layout of everything inside your R script file. Note that if things go wrong, good organisation and comments can help you, since they can show whether appropriate logic was intended.

3 Plagiarism

This is an individual effort assignment, therefore the answers you provide must be your own. You may learn from others, but the understanding claimed by your assignment must be yours. If you include any material in this assignment that is not your own, you must acknowledge that fact and declare the source of that material. Be warned, your answers will be checked for plagiarism and, if you are caught, significant penalties may apply.

4 Submission

Once you have completed the assignment, you must upload your R script file via Turnitin; if you wish, you can also e-mail your R script file directly to me [3]. This may be wise if you are having trouble with Turnitin or vUWS and are at risk of submitting late. Once you have e-mailed, seek to successfully submit via Turnitin.

Be aware that you may need to rename your R script file by adding the extension “.txt”, otherwise you may not be successful in submitting via Turnitin. On a Windows machine you can easily add a “.txt” extension via File Explorer. Select the “View” tab and tick “File name extensions”, refer to Figure 1. Then select the file to be renamed, press F2 to enter edit mode and add “.txt” to the very end of the file name; do not remove the “.R” portion of the file name. Hopefully a similar process is available on other platforms. Determine the method you will use and test it prior to submission.
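On platforms without File Explorer, the same step could be carried out from within R. The following is only a sketch of one possible method, not a prescribed one: the helper name add_txt_extension and the file name assignment.R are hypothetical, and it makes a copy rather than renaming, so the original script is left untouched.

```r
# Hypothetical helper: make a copy of an R script with ".txt" appended,
# keeping the ".R" portion of the file name
add_txt_extension <- function(script_file) {
  upload_copy <- paste0(script_file, ".txt")
  copied <- file.copy(script_file, upload_copy, overwrite = TRUE)
  if (!copied) stop("Could not copy ", script_file)
  upload_copy
}

# Demonstration on a throwaway file in a temporary directory
demo_script <- file.path(tempdir(), "assignment.R")
writeLines("# Full name, student ID", demo_script)

upload_copy <- add_txt_extension(demo_script)
print(basename(upload_copy))
```

The demonstration works in a temporary directory so that nothing in your working folder is overwritten; point the helper at your own script file when testing your submission method.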
You must submit your assignment no later than the due date declared on the first page of this assignment, otherwise late submission penalties will apply, as described in the section titled “Late submission penalties”. Prior to the due date, you may replace a previously submitted version, but only the last submitted version will be marked!
[3] [email protected]

Figure 1: How to add “.txt” to the file extension on Windows

4.1 Early submission

Two “early submission” options are available, but you must choose which option
1. One or more submissions
2. Only one submission and priority marking
and follow the relevant instructions.

4.1.1 One or more submissions

You are free to make as many submissions as you wish, but only the last submission will be marked. NB, if your last submission is after the due date, then late submission penalties will apply.

4.1.2 Only one submission and priority marking

You can only provide one submission and you must declare your request for this option by sending me an email after completing your submission to Turnitin. To use this option, send an email to [email protected] and use the subject “One submission and priority marking”. I will endeavor to start marking submissions as they arrive and in the order of their arrival. However, the release of marks is subject to a caveat declared in the following section.

4.1.3 Caveat

The marking order will be determined by order of submission. However, marks cannot be released until everyone has submitted, or a minimum of one week has passed since the declared due date.

4.2 Late submission penalties

Late submission penalties exist. The contribution value of the assignment will reduce by 10% per day, for each day after the submission date; therefore four marks per day. For example, if your assignment is four days late, the maximum possible mark you can score for the assignment is 24 out of 40.

Question 1 (3 + 3 + 3 + 1 = 10)

You are to develop a simple gambling game and test what the average outcome is if you always bet $50.

(i) Write the code necessary to perform a single turn of the game. The algorithm for the game is as follows
• Randomly choose a bet that is one of the following values 10, 15, 20, 25 . . . , 90, 95, 100
• Simulate the roll of a pair of fair dice
• Determine the outcome of the roll as follows
  – Any of the following results in losing your bet 11, 33, 55
  – You receive twice your bet for any of the following 22, 44
  – You receive five times your bet for rolling a 66
  – Any other roll outcome results in losing half your bet
• Tell the user what the bet and the return values are
• If the return value is twice the bet, also print the following message on a new line
    You won money!
• If the return value is five times the bet, then also print the following message on a new line
    Jackpot win!!!

Make sure your code is well organised and has sensible documentation in the form of comments; you will be expanding the capability of your code in the rest of this question. Also seek to make wise choices regarding variable names and code layout.

(ii) Wrap the dice simulation and return calculator you developed in (i) within a function that looks like

    betresult <- function(bet=50) {
      # bet = bet to be made
      #
      # Simulate the roll of a pair of fair dice
      . . .
      # Determine the return from the bet
      . . .
      return(betreturn)
    }

Insert the code you wrote in (i) to randomly determine a bet within another function, making use of the following function template

    betgenerator <- function() {
      # Randomly choose a bet within the following sequence
      # {10, 15, 20, 25 ... 90, 95, 100}
      . . .
      return(betamount)
    }

(iii) Making use of the functions created in (ii), create the following function

    playgame <- function(turns=10) {
      # turns = number of bets to be made
      . . .
      return(betreturn)
    }

in order to complete the entire functionality of your game as devised in (i), except using a specified number of turns. However, in this case, the function playgame() provides the following user output
• A single line of output for each iteration of the game, which looks as follows [4] [5]
    bet=25, dice outcome=14, winnings=50
• A single line stating the final position for the player, for example, “Lost $200”

(iv) What is the overall position for the player after one hundred turns of the game, where every turn consists of a $50 bet? The position should consist of
• Total outlay
• Total winnings
• Overall profit

[4] Winnings is the amount won
[5] Use a negative value to indicate a loss

Question 2 (2 + 2 + 2 + 4 = 10)

For this exercise, you are to make use of the built-in dataset called iris. The first six lines of the dataset can be viewed as follows

    > head(iris)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
    6          5.4         3.9          1.7         0.4  setosa

More information on this dataset can be obtained within R by executing ?iris in the console, or in Wikipedia - Iris flower data set. In answering this question, only use functions provided within the R base package, hence do not install any other package.
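Before starting, it can help to confirm the shape of the dataset from the console. The lines below are a small sketch using only base R; the variable names species_counts and numeric_columns are illustrative and not part of the assignment.

```r
# iris ships with R (package 'datasets'), so nothing needs installing
str(iris)  # compact overview of the rows, columns and column types

# Number of flowers recorded for each species
species_counts <- table(iris$Species)
print(species_counts)

# Which columns are numeric (Species is a factor, so it is not)
numeric_columns <- sapply(iris, is.numeric)
print(numeric_columns)
```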
(i) Using just functional programming, determine the mean Sepal.Length for each species of iris flower. Hint, you only need a single and simple line of code. Using only one or two sentences, explain how your code works.

(ii) Using two different methods, repeat the exercise in (i), but without using functional programming. Using only one or two sentences, explain how your code works.

(iii) Only using functional programming, determine the mean for each numeric column of the iris dataset, but according to each species. Therefore produce the following

               Sepal.Length Sepal.Width Petal.Length Petal.Width
    setosa            5.006       3.428        1.462       0.246
    versicolor        5.936       2.770        4.260       1.326
    virginica         6.588       2.974        5.552       2.026

Explain your code using no more than three simple sentences.

(iv) Using the output of (iii), write code to build a tree structure [6], which contains the above output data. The tree structure is described as follows
• There are three branches off the root and each represents a particular species
• Each species branch breaks into the following two branches: Sepal and Petal
• The Sepal branch breaks into two branches consisting of: Length and Width
• Similarly, the Petal branch breaks into two branches consisting of: Length and Width
• The root of the tree consists of just the node, while the other end consists of 12 branches

You do not need to visualise the tree structure, just write code to create it.

[6] Use the most appropriate built-in R data structure to build this tree structure

Wikipedia - Iris flower data set: https://en.wikipedia.org/wiki/Iris_flower_data_set

Question 3 (2 + 5 + 3 = 10)

Here we will perform some simple analysis of data regarding the quality of different red wines. The data is located on vUWS in the file called “wineQuality-red.csv”. Further details for this dataset can be found at UCI - Wine Quality Data Set. The goal is not to become a wine expert, rather to do some simple intuitive investigation. Load the dataset and do some basic exploration and familiarization of it.
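Loading the file and taking a first look might be sketched as follows. This is an illustration under assumptions rather than a required approach: the helper name load_wine is hypothetical, the file is assumed to sit in the working directory, the separator is assumed to be a semicolon as in the UCI copy of the data (pass sep = "," if the vUWS file is comma-separated), and the quality column is assumed to be named quality.

```r
# Hypothetical helper: read the wine data into a data frame
# Assumption: semicolon-separated with a header row, as in the UCI copy
load_wine <- function(path, sep = ";") {
  read.csv(path, sep = sep)
}

wine_path <- "wineQuality-red.csv"  # assumed to be in the working directory

if (file.exists(wine_path)) {
  wine <- load_wine(wine_path)
  str(wine)                     # column names and types
  print(summary(wine))          # range and centre of each variable
  print(table(wine$quality))    # number of wines at each quality score
} else {
  message("Download ", wine_path, " from vUWS first")
}
```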
(i) Write code to produce a single box plot that shows alcohol versus each wine quality. Give the plot a reasonable appearance, hence having a title, axis labels and using colours. Repeat for residual sugar versus quality and density versus quality. Using two simple sentences, state which plot shows the strongest connection with quality and which shows the weakest.

(ii) Using the coding method described in lecture 6, write code to reproduce the visualisation shown in Figure 2.

Figure 2: Various mean wine variables versus quality

Note that your visualisation does not have to match exactly; in essence, just show the same information.

UCI - Wine Quality Data Set: http://archive.ics.uci.edu/ml/datasets/Wine+Quality

(iii) There is a built-in function in R called cor(), which determines the correlation between two variables. More information->