ASSESSMENT GUIDE

Unit Code: ITEC203 Introduction to Data Science and Machine Learning
Study Period: S1 2022
Assessment number: 2
Assessment Artefact: Python Codes and Comments
Weighting: 30%

Why this assessment?
• Opportunity to apply theory in practice
• Exposure to a real-life scenario
• Enhance students’ hands-on experience with the Python packages numpy, matplotlib, scikit-learn
• Establish the students’ understanding of data preparation and manipulation using Python
• Enhance students’ online engagement skills
• The feedback from this assessment will help students correct any conceptual misunderstanding and apply it in the next assessment

What are the types of employability skills that I will acquire upon completion of this assessment?
Skill Type
Developed critical and analytical thinking ☑
Developed ability to solve complex problems ☐
Developed ability to work effectively with others ☐
Developed confidence to learn independently ☑
Developed written communication skills ☑
Developed spoken communication skills ☑
Developed knowledge in the field of study ☑
Developed work-related knowledge and skills ☑

Assessment Overview
Purpose: as written in the EUO
Due date: 20/05/2022, 5pm, on Friday of Week 11
Weighting: 30%
Length and/or format: data exploring tasks
Learning outcomes assessed: LO2
Graduate attributes assessed: GA4, GA5
How to submit: task submission via LEO
Return of assignment: via LEO
Assessment criteria: Rubric: see end of document

Context
The assessment consists of online forum participation and image data exploration tasks, which require fundamental knowledge of data science and machine learning. The purpose is to assess students’ online engagement and their understanding of, and practical skills in, the process of data preparation for machine learning models. There are six weeks of sub-questions/tasks related to this assessment, and students are required to participate in the weekly sub-task forum discussions.
Students need to incorporate a summary of their weekly forum participation when preparing their submission. [18 marks in total]

Suppose the students take a Data Engineer role in a company. One of their daily duties would be processing huge amounts of data for different projects. This assignment guides students through handling such situations with an example.

Instructions
The MNIST digit dataset is a set of 70,000 small images of digits handwritten by high school students and employees of the US Census Bureau. Each image is labeled with the digit it represents. This set has been studied so much that it is often called the “hello world” of Machine Learning: whenever people come up with a new classification algorithm they are curious to see how it will perform on MNIST, and anyone who learns Machine Learning tackles this dataset sooner or later.

The instructions to explore this dataset are:
1. Use Google Colab for interactive practice of Python and related Machine Learning packages.
2. Always refer to the textbook ‘Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow’ for coding help.
3. Specific tasks include:
   a. Download the dataset (1%)
   b. Explore the dataset and output information including (5%):
      i. how many images there are
      ii. how many features there are and the range of feature values (e.g., a histogram of the data values)
      iii. how many categories/labels there are (discrete or continuous type)
      iv. visualize randomly selected samples (at least 5 images) within each category (get a feel for the variance of the data, and summarize your observations)
      v. visualize more data samples to see whether there are bad data samples that need to be removed, and summarize your observations
   c. Do more data manipulation (6%):
      i. Explore PCA to reduce the feature dimensions down to two dimensions and plot the result using Matplotlib. You can use a scatterplot with 10 different colours to represent each image’s target class.
      ii. Use t-SNE to reduce the MNIST dataset down to two dimensions and plot the result as a scatterplot using Matplotlib.
      iii. Summarize and conclude your comparisons, discoveries, and insights.

Structure
Prepare a Jupyter notebook for this assignment. The notebook should alternate text and Python code and cover the topics listed in the specific tasks above.

How do I submit?
Submit the Jupyter notebook (.ipynb) to Assessment 2 via the LEO assessment tile.
Note that the code will be compared with other students’ submissions in Turnitin to make sure the submission satisfies academic integrity.

Submission checklist
I have formatted my report as per the specifications ☐
I have checked my Turnitin report and taken appropriate actions to ensure that the submission satisfies academic integrity ☐
I have actioned feedback advice provided to me from labs feedback (if applicable) ☐
I have submitted my work before the due date/time ☐
I have submitted the feed forward template along with my assignment submission ☐

Feed Forward Template (example)
A template for students to use to act on feedback and provide recommendations for improvement. You can also submit anonymously via https://acu.qualtrics.com/jfe/form/SV_6sa9tdmOa5Y7s1g

Note
This is a task for any instance of a follow-on assignment (assessment 2 and 3). It must be submitted as the first page of the follow-on assignment (assessment 2 and 3) to ensure you acted on the feedback provided to you in the previous assignment (this is not counted as part of the assessment word count).

How did you act on the feedback?
Feedback is an important component of learning. Please consider the feedback you received in your last assignment and provide a response on how you acted on, or intend to act upon, that feedback, and how it has informed the current assignment task. Submit this sheet along with your assignment.

Questions (answer column: Your learning from the previous assignment feedback)
• How have you acted on the feedback from the previous assignment to improve your work in this assignment? (e.g. based on my previous feedback, I made sure that I supported my discussion, position, ideas, and concepts with peer-reviewed journal references in this assignment)
• What is your expectation around the type of feedback that enhances your learning? (e.g. I want to know where I made a mistake and how I can correct it and not make the same mistake again, i.e. I want specific feedback that will help me to improve my learning and performance in the next assignment)
• Did you have any difficulty understanding or acting on previous feedback? Please be as specific as possible so that you can gain further feedback or clarify anything you do not understand in the feedback (e.g. the feedback provided in my previous assignment was very generic and I did not know how to improve my work, so I would like the teacher to explain more on xxxx aspects of the feedback, or I would like an opportunity to have a dialogue to understand the feedback)

Some Helpful Websites and Resources
Anaconda environment: https://docs.anaconda.com/anaconda/
Python official website: https://www.python.org/
Useful Python packages:
• https://numpy.org/
• https://scikit-learn.org/stable/
• https://pandas.pydata.org/
• https://matplotlib.org/

Who can help me?
• Studiosity
• Academic Skills Unit (ASU)
• Places –NLiC
• Maoying Qiao (
[email protected]) TBC

I’m having problems
Special Consideration: This form is used by students to apply for Special Consideration for assessable work in studies at Australian Catholic University. Approval of such applications will only be granted to students who are legitimately disadvantaged in their assessment due to exceptional and unforeseen circumstances beyond their control.
https://units.acu.edu.au/__data/assets/word_doc/0006/620655/SC_Application_for_Special_Consideration_20180214.docx

Referencing
All referencing should be in ACU Harvard style; however, if you are coming from another faculty, you may choose to use your usual referencing style. If this is the case, you must indicate at the top of your reference list which referencing style you are using (e.g. APA, MLA, Chicago, etc.). Please ensure your assignment makes use of in-text citations and a reference list. Missing citations or references are equivalent to plagiarism.
https://libguides.acu.edu.au/referencing/harvard

Criteria
The full criteria are compiled in a rubric, which can be found on the following page/s.

Rubric for [ITEC203 AT2, 30%]
Columns: Relevant LO/GAs; Criterion (related to a single GA from the related LO, one GA per criterion); performance levels NN, PA, CR, DI, HD, grouped under Does not meet expectations, Meets expectations, and Exceeds expectations.

Relevant LO/GAs: GA5, LO2; Weight = 6 marks; TL = 3; Learning stage = I and D
Criterion: Demonstrate correct understanding of the data preparation and manipulation concepts
NN (0 – 49): Fail to adequately demonstrate correct understanding of the data preparation and manipulation concepts
PA (50 – 64): Adequately demonstrate correct understanding of the data preparation and manipulation concepts
CR (65 – 74): Credibly demonstrate correct understanding of the data preparation and manipulation concepts
DI (75 – 84): Distinctively demonstrate correct understanding of the data preparation and manipulation concepts
HD (85 – 100): Highly distinctively demonstrate correct understanding of the data preparation and manipulation concepts

Relevant LO/GAs: GA4, LO2; Weight = 6 marks; TL = 3; Learning stage = I and D
Criterion: Demonstrate Python programming skills by implementing data preparation and manipulation code with the packages scikit-learn, numpy, matplotlib
NN (0 – 49): Fail to adequately demonstrate Python programming skills by implementing data preparation and manipulation code with the packages scikit-learn, numpy, matplotlib
PA (50 – 64): Adequately demonstrate Python programming skills by implementing data preparation and manipulation code with the packages scikit-learn, numpy, matplotlib
CR (65 – 74): Credibly demonstrate Python programming skills by implementing data preparation and manipulation code with the packages scikit-learn, numpy, matplotlib
DI (75 – 84): Distinctively demonstrate Python programming skills by implementing data preparation and manipulation code with the packages scikit-learn, numpy, matplotlib
HD (85 – 100): Highly distinctively demonstrate Python programming skills by implementing data preparation and manipulation code with the packages scikit-learn, numpy, matplotlib

Relevant LO/GAs: GA10, LO2; Weight = 18 marks; TL = 3; Learning stage = I and D
Criterion: Demonstrate the ability to utilise information and communication and other relevant technologies effectively
NN: Fail to adequately demonstrate the ability to utilise information and communication and