Describing your selected data
*500 words max.; 10 marks
Choose a data set from above that you find most interesting. Subset the data to several (3-5) variables and investigate the data using statistical and graphical summaries. Write a short report about your findings and why they are important.
*PLEASE NOTE: Figures, tables, headings, captions, and references/citations are not included in the word count.
Some things you may wish to include:
· Succinctly tell us which data set and variables you chose and why they are interesting to you.
· Describe the types of data present in your subset of variables, e.g. Numerical (Continuous, Discrete) or Categorical (Binary, Nominal, Ordinal).
· What are typical values of your data (for a categorical variable, this might be the most frequently occurring class), and how spread out are the data? Measures of centre and spread, plots, tables, and summaries may be helpful here.
· What do the distributions of the different variables look like, e.g. normal or skewed? Histograms, boxplots, bar charts, measures of skewness, and the coefficient of variation may be useful here. Note that categorical variables can become discrete numeric variables. For example, if we have a categorical variable that indicates the type of pet owned by a household (dog/cat/bird/fish), we could assign each category a numerical value (e.g. 1 for dog, 2 for cat, etc.), or we could count the number of households that have each type of pet (e.g. 50 households have dogs, 30 have cats, etc.). In either case, the categorical variable becomes a discrete numerical variable that can be tabulated or plotted as a bar chart.
· Can you find any interesting relationships in the data? For example, compare the medians/IQR/Min/Max of two continuous variables, or of a continuous variable across two or more groups (summary, boxplot); look for a relationship between two continuous variables (xy-scatterplot); or look for a relationship between two categorical variables (cross tabulation).
· Provide a brief discussion that describes any interesting findings from your analysis. You might also discuss limitations in your analysis and any future analysis you might do in subsequent research. You may wish to research the variables you use; for example, why is dissolved oxygen a good indicator of water quality?
· You must cite all sources, including ChatGPT, Stack Exchange and other forums, journal articles, and grey literature. Please use Harvard or APA reference style for in-text citations and references. A nice example of citing ChatGPT is linked here. This section is not included in the word count.
· Provide a brief summary (Journal) of how you used the sources and how you validated/trusted the sources you used. This section is not included in the word count.
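As an illustration of the summaries suggested above, the following is a minimal R sketch, not a required method. All values are invented for illustration, the names pets, do_mg_l, and skew() are hypothetical, and the skewness shown is the simple moment-based version; add-on packages may report a slightly different adjusted estimate.

```r
# Hypothetical data: type of pet owned by ten households (invented values)
pets <- factor(c("dog", "cat", "dog", "fish", "bird",
                 "dog", "cat", "dog", "cat", "fish"))

# Option 1: count the households in each category (a discrete numeric summary)
pet_counts <- table(pets)
print(pet_counts)

# Option 2: assign each category an integer code
# (codes follow the alphabetical order of the factor levels)
pet_codes <- as.integer(pets)

# Hypothetical continuous variable, e.g. dissolved oxygen in mg/L (invented values)
do_mg_l <- c(6.1, 7.4, 8.0, 8.2, 8.5, 8.7, 9.0, 9.1, 9.3, 12.6)

# Centre and spread: min, quartiles, median, mean, max, then SD and IQR
print(summary(do_mg_l))
print(sd(do_mg_l))
print(IQR(do_mg_l))

# Coefficient of variation: standard deviation relative to the mean
cv <- sd(do_mg_l) / mean(do_mg_l)
print(cv)

# Moment-based skewness (hypothetical helper, not a base R function)
skew <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
print(skew(do_mg_l))

# Graphical summaries: bar chart for counts, histogram for a continuous variable
barplot(pet_counts, xlab = "Pet type", ylab = "Number of households")
hist(do_mg_l, xlab = "Dissolved oxygen (mg/L)", main = "")
```

A positive skewness value suggests a longer right tail and a negative value a longer left tail; comparing it against the histogram is a useful sanity check.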
Associated Learning Outcomes
LO1. Demonstrate proficiency in utilizing R and Excel to effectively explore and describe data sets in the life sciences.
LO2. Evaluate and interpret different types of data in the natural sciences by visualising probability distributions and calculating probabilities using RStudio and Excel.
LO5. Articulate statistical and modelling results clearly and convincingly in both written reports and oral presentations, working effectively as an individual and collaboratively in a team, showcasing the ability to convey complex information to varied audiences.
Submission
Your project must be neatly produced in a single Quarto document, output in HTML format.
If you cannot render to HTML, copy your code, output, figures, and tables into a Microsoft Word document, create a neatly presented report, and submit that instead.
You will submit your final HTML/Word/PDF file to the Turnitin tool in Canvas via the Project 1 assignment link.
We have also provided a project template folder to help you set up your first project. This can be accessed here. The template is a project folder with relevant Quarto documents as well as some examples of how you can make nicely presented documents (https://quarto.org/). You will have to unzip the folder to access the project folder. Double-click the Rproj file in the project folder to open the project. If you are having issues with rendering your document, please let us know ASAP so we can help you make it work or provide alternative ways of creating your final document!
Things to think about when using AI
You must take 100% responsibility for any submissions you make, and you must acknowledge how you have used generative AI, if you have used it.
· Your assessment submission must not be taken directly from the output of these tools. It needs to be put into your own words.
· You will create a journal with (i) the name(s) of the AI tool(s)/forums you used, (ii) a brief description of what purpose(s) you used them for, and (iii) a brief description of how you used the tool(s) (e.g. what prompts you provided and what questions you asked).