The assignment is to analyze any set of data.
Initial_Project_Directions

Course Project: Your Only Assignment
Due Last Day of Class

Why a Project
• A project helps build a portfolio.
• It forces you to do analysis (usually) and communicate your results.
• Far more interesting to grade than an exam.

“It doesn’t matter how great your analysis is unless you can explain it to others: you need to communicate your results.”
R for Data Science, Hadley Wickham & Garrett Grolemund

General Project Guidelines
A) Find a Data Source and Analyze It
• You can search Kaggle (or other dataset repositories)
• Work*
• Make your own dataset.
• Use an API (like the Twitter API) to gather data (not easy)
• Web scraping (hard)
• Improve on previous analysis projects
B) Other Python-related options (pick any)
• Code an ML algorithm from scratch
• Improve a Python project from another class (doesn’t have to be analysis related)
• Translate your old code to Python

General Project Guidelines
• Choose option A or B.

Grading
• Because data analytics is a bit subjective and people can choose between different project options, anyone who turns in a project will get an A (95%) in this class.
• Students may get up to an A+ (100%) if the project is impressive.
• I will still provide feedback and areas for improvement on your project.

General Project Advice: Option A
• The following couple of slides are just advice.

Option A: Task 1
• Make a problem statement
• After picking your dataset, it is important to figure out what problem you are trying to solve.

Option A: Task 2
• Identify who may use your result
• In other words, figure out the potential usefulness of your analysis

Option A: Task 3
• Make some preliminary goals for your project (what may come out of your work)

Option A: Task 4
• Think of some success metrics for your analysis
• For machine learning tasks, accuracy can help
• Hypothesis testing
• How do we know it works/improves

Option A: Task 5
• Mention any uncertainty/risks that may make it a challenge to complete your project.
• Example: For machine learning tasks, it helps to have more than 250 rows.
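As a rough illustration of Tasks 4 and 5, the sketch below computes accuracy as a success metric and checks a dataset against the 250-row guideline. The function names (accuracy, has_enough_rows) and the sample labels are hypothetical, made up for this example; in a real project you would more likely rely on a library such as Scikit-Learn.

```python
MIN_ROWS = 250  # guideline from the Task 5 advice, not a hard rule


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one label")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def has_enough_rows(rows, minimum=MIN_ROWS):
    """Return True if the dataset has more than `minimum` rows."""
    return len(rows) > minimum


# Made-up labels purely for illustration.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_true, y_pred))     # 4 of 5 predictions match -> 0.8
print(has_enough_rows(range(100)))  # False: under the guideline
```

Accuracy alone can be misleading on imbalanced data, so it is worth treating it as one possible success metric rather than the only one.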
Option B
B) Python-related options (pick any)
• Code an ML algorithm from scratch
• Improve a Python project from another class
• Translate your code to Python

Option B
Since this is incredibly open ended, you will have to figure out what you want to show.

Data Analytics using Python Syllabus
Course Number: CSE-41204

Instructor Information
Name: Michael Galarnyk
Email:
[email protected]
LinkedIn: https://www.linkedin.com/in/michaelgalarnyk/

Communication Policy
You may contact me by email. It usually helps to contact me a couple of days before your one project is due, as most students tend to ask around that time.

Course Information

Course Description (Goals and Objectives)
In this course, you will learn the rich set of tools, libraries, and packages that comprise the highly popular and practical Python data analysis ecosystem. This course is primarily taught via screen-sharing programming videos. Topics range from basic Python syntax all the way to more advanced topics like supervised and unsupervised machine learning techniques.

Key Topics
• Installing Python/Jupyter/IPython on Windows and Mac
• Python Basics (variables, strings, simple math, conditional logic, for loops, lists, tuples, dictionaries, etc.)
• Using the Pandas library to manipulate data (filtering and sorting data, combining files, GroupBy, etc.)
• Plotting data in Python using Matplotlib and Seaborn
• Logistic Regression using Scikit-Learn
• Classification and Regression Metrics
• Decision Trees using Scikit-Learn
• Random Forests (Scikit-Learn)
• Clustering Algorithms (K-Means, Hierarchical Clustering)
• Dimensionality Reduction (Principal Component Analysis)

Course Materials and Textbooks
Suggested Texts: None

Student Learning Outcomes
By the end of this course, students will be able to:
a) Interpret trends in data
b) Produce a project that they can use as part of their data analytics/science portfolio.

Course Schedule
While many of the students in this class know the basics, reviewing the basics is important even for experienced Python programmers. That is why the first two weeks are dedicated to the basics, which will be reviewed continuously throughout the remainder of the course during more advanced topics.
Session   Topic                                                        Assignments w/due dates
1         Intro + Setup + Basics (strings, lists, tuples, etc.)
2         Tuples, dictionaries, sets, functions
3         Pandas Part 1
4         Pandas Part 2
5         Matplotlib + Logistic Regression
6         Decision Trees
7         Decision Trees + Random Forests
8         Unsupervised Learning (KMeans + dimensionality reduction)
9         Topics of Interest and How to Learn Them*                    Final Project Due

* Lecture about the topics we didn’t cover in this class and how to learn them.

Grading and Assignment Information
Letter grades are based on the UC San Diego Extension Grading Scale. Your final course grade is based on the percentage of points you have earned.

Passing Grades
A+    100%
A     90-99%
A-    88-89%
B+    86-87%
B     83-85%
B-    80-82%
C+    76-80%
C     71-75%
C-    20-70%

Weighted Grading Criteria
UC San Diego Extension does NOT have a requirement about how instructors weight their grading criteria. I have decided to make nearly 100% of your grade be a project. Details and rationale for this are explained in the assignment section of Blackboard.

Assignments (Class Project)    100%
TOTAL                          100%

Grading Policies
This course can be taken as part of the Python Programming certificate. In order for the class to count towards your certificate, it must be taken for a letter grade or as pass/no pass. Classes that are taken as NFC cannot count towards a certificate. You can change your grading option any time BEFORE the last day of class through My Extension.

Late Policy: The Final Project is due on the date specified on the course schedule. An assignment is considered late if it is posted or sent after the due date/time. Late assignments will be accepted at the discretion of the instructor and cannot be accepted more than 1 week late. A couple of hours or a day late is typically okay. I don’t take off points for late assignments.

Assignments
Due to the nature of this course (a final project), any type of submission you see fit is typically acceptable.
I normally see some variation of:
a) .ipynb file
b) .ipynb file + PowerPoint file
c) .ipynb file + report (I highly discourage writing a report, as a blog post is better for an online presence.)
d) .ipynb file + blog post
e) .py file
f) .py file + PowerPoint file

The reason for allowing different types of projects is that I want this class to be a way for students to improve themselves as they see fit. Everyone coming into the class has different goals, and I allow people to show me, however they want, that they learned something or improved on previous knowledge in this class.

Discussion Board
Feel free to ask questions on the board or on the unlisted YouTube videos for the course.

Quizzes & Tests
No quizzes or tests.

UC San Diego Extension Policies and Resources

Academic Policies and Procedures
Please refer to UC San Diego Extension’s website (Student Resources tab) for specific details about academic policies and procedures: Student Resources.

MyExtension
Your MyExtension account is your student records portal. Log into MyExtension (https://myextension.ucsd.edu/) to enroll in a course, drop a course, request verification of enrollment, request official transcripts, and more.

Campus Emergencies
In the event of an emergency, information will be posted at UC San Diego Extension (http://extension.ucsd.edu/). Extension students must access the website to find out the status of the emergency situation. Email and/or phone lines may not be accessible. Information will be updated online as the situation progresses, and an ALL CLEAR will be posted once the situation is resolved.

Code of Conduct
All participants in a course at UC San Diego Extension are bound by the University of California Code of Conduct, found at Student Conduct Code.

Academic Integrity Policy
The University is an institution of learning, research, and scholarship predicated on the existence of an environment of honesty and integrity.
As members of the academic community, faculty, students, and administrative officials share responsibility for maintaining this environment. It is essential that all members of the academic community subscribe to the ideal of academic honesty and integrity and accept individual responsibility for their work. Academic dishonesty is unacceptable and will not be tolerated at the University of California. Cheating, forgery, dishonest conduct, plagiarism, and collusion in dishonest activities erode the University's educational, research, and social roles. Students who knowingly or intentionally engage in, or help another student engage in, dishonest conduct, acts of cheating, or plagiarism will be subject to disciplinary action at the discretion of UC San Diego Extension. Please refer to the UC San Diego Extension website to view this policy: Student Conduct Policy.

Access and Accommodations
At UC San Diego Extension, we strive to make learning experiences as accessible as possible. If you anticipate or experience physical or academic barriers based on disability, we encourage you to contact the Extension Disability Coordinator to apply for reasonable accommodations. Visit our website: Services for Students with Disabilities. Please note that it is your responsibility to initiate contact with the Disability Coordinator.
Phone: 858-822-1366
Email:
[email protected]