My topic is COVID-19, so any problem related to it will work.
Project Introduction

A significant component of the course is applying the competencies you have acquired to a data set of your choice. Ideally, the data set should be relevant to the organization you are associated with, but if you cannot find one, any publicly available data set is acceptable. There are good data sets available at https://www.data.gov, http://wdi.worldbank.org/table, and https://research.stlouisfed.org/fred2/, but in a previous edition of the course students found other publicly available data sets, for example statistics on various sports, in particular the NHL. You are of course encouraged to discuss any research ideas you have with the other students in the course or with me.

Also, because this is a formative assessment, you have the opportunity to submit a draft version of your report, and there are therefore two assignment boxes in this module. Because of my schedule, I have set the deadline for the draft report about 10 days before the deadline for the final assignment. This should give me enough time to provide you with useful feedback on your report.

Since this is a somewhat open-ended project, it is hard to give a precise rubric for the project. Here is the rubric that I will be using.

The final project will focus on using the technology skills you have acquired in this course to perform an analysis related to two research questions. It requires you to create a number of visualizations and a dashboard, and to write a report. If you decide to use Tableau as your tool, please make sure you save your Tableau workbook as a Tableau Packaged Workbook so that I can access the data as well. The report and the data source, including the visualizations, should be in a zip file, which you submit below.
The visualizations, which can attract a maximum of 100 points, should consist of a data source and a number of visualizations, including a dashboard and a more advanced visualization (i.e., either a map or an animated visualization). The following table gives details about what I will be looking for:

Category | Elements | Minimum Required | Points for Each | Maximum for this Category
Data Source | Data Sheets/Tables (tables derived from a single source table will count as one) | 2 | 5 | 25
Data Source | Links between Data Sheets/Tables (links between tables derived from a single source table will not count) | 1 | 5 | 25
Visualizations | Simple Visualizations | 2 | 5 | 20
Visualizations | Advanced Visualization, i.e. Drill down, Filter, or Parameter | 1 | 5 | 20
Visualizations | Dashboard | 1 | 10 | 20
Visualizations | Animated Visualization or Map | 1 | 10 | 20

The table may require some explanation. The "Minimum Required" column is the minimum number of that particular element that needs to be in the Tableau Packaged Workbook or the Power BI file. Thus, I will look for at least 2 data sheets (or tables) in the data source. The "Points for Each" column gives the points you will get for each successful element in that category. Thus, a successful relationship between 2 data sheets or tables will earn you 5 points. The "Maximum for this Category" column gives the total number of points available in this category. Thus, you will be able to earn at most 25 points for the tables (i.e., as most of you will have worked out, 5 of them).

The rationale is that the above gives you some flexibility in your project. Some of you may want to create 2 animated visualizations for your project, whereas for others only one animated visualization is appropriate. In other words, you choose how you "collect" the 100 points available for this project, within the parameters set above. So, make sure that you have at least the minimum required in each category, and realize that there is a maximum in each category.
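The arithmetic described above can be sketched in a few lines of Python, which may help when planning how to collect points. This is only an illustration: the dictionary layout, the function name `workbook_points`, and the example counts are assumptions, and the sketch reads each "maximum" as a cap on that row of the table and the 100 points as an overall cap. It does not model any deductions or the judgment of whether an element is "successful".

```python
# Rubric rows as (minimum required, points for each, maximum for the row).
# The numbers are copied from the table; the key names are illustrative.
RUBRIC = {
    "tables": (2, 5, 25),
    "links": (1, 5, 25),
    "simple_visualizations": (2, 5, 20),
    "advanced_visualizations": (1, 5, 20),
    "dashboards": (1, 10, 20),
    "animated_or_map": (1, 10, 20),
}

TOTAL_CAP = 100  # assumed overall cap for the visualizations part


def workbook_points(counts):
    """Return (total points, list of elements below their minimum)."""
    total = 0
    missing = []
    for element, (minimum, each, maximum) in RUBRIC.items():
        n = counts.get(element, 0)
        if n < minimum:
            missing.append(element)
        # Each element earns a fixed number of points, up to the row maximum.
        total += min(n * each, maximum)
    return min(total, TOTAL_CAP), missing


# Hypothetical plan, for illustration only.
plan = {
    "tables": 4,
    "links": 3,
    "simple_visualizations": 3,
    "advanced_visualizations": 2,
    "dashboards": 2,
    "animated_or_map": 2,
}
points, missing = workbook_points(plan)
print(points, missing)  # 100 []
```

Changing the counts in `plan` shows how a larger data source can be traded against fewer visualizations, and the second return value lists any element that falls below its minimum.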
So, if you have a large, complicated data source with a large number of tables, you do not need to create as many visualizations. On the other hand, if you use a minimal data source with only two tables and one link, you will need to create more visualizations to get to 100 points.

There is one more clarification: You will only receive points for the data source if the tables are truly separate. In past semesters, students have taken a single source table, containing for example a number of different economic indicators, separated it into multiple files, for example a single file for each economic indicator, and then linked those. Since this was technically in compliance with the rubric, I accepted these data sources in the past. I will not do so going forward.

At least 2 of the minimum of 4 visualizations included in the workbook must combine data from multiple tables in the data source. If none do, I will deduct 15 points for the workbook.

The report should contain 4 sections:

1. Introduction
2. Methodology
3. Analysis
4. Conclusion

The introduction should give the background to the research questions you are addressing and a clear formulation of the two research questions. The methodology should describe the data sources that you used for this project and where you sourced them from. The analysis should describe how you used the data reports and visualizations to address the research questions. The conclusion, finally, should describe the conclusions that you reached, discuss possible explanations for the trends that you found in the data, and hint at possible future research to address the issues that have arisen from the above.

The report will be graded as follows:

Section | Elements | Points
Introduction | Clear and succinct description of the background of the project, including, where appropriate, references to external sources. | 10
Introduction | Ambitiousness of the project | 20
Introduction | Clear and succinct description of the research questions, ideally formulated as actual questions. | 10
Methodology | Clear and succinct description of the data sources used in the project, including, when appropriate, where you sourced them. | 10
Analysis | Clear and succinct description of how you used the data that you sourced and visualizations to address the research questions. Ideally, the report will contain images of the visualizations. | 10
Analysis | Appropriateness of visualizations to address the research questions (e.g., is all the information that seems relevant to the question included in the visualization, is there no superfluous information, is the chart type selected for the visualization appropriate?). | 10
Conclusion | Clear and succinct conclusions for both research questions, including justifications. | 10
Conclusion | Identification of additional research questions, ideally formulated as actual questions. I only expect you to identify them; I do not expect you to actually answer them as well. | 5
Language and formatting | Appropriate and professional formatting. | 5
Language and formatting | Correct spelling. | 5
Language and formatting | Correct grammar. | 5

Perhaps the most nebulous of the above is the second item, namely the ambitiousness of the project. It is hard to define this concept more precisely, other than to give some examples. Here are some examples of projects that lack ambition:

· A project that simply redoes one of the exercises in the course but with slightly different data (e.g., the car sales example but using sales data from another domain).
· A project that uses data from a single web source, such as Kaggle or the census.
· A project that uses artificial data.
· A project that compares some phenomenon (e.g., number of car crashes, number of police-involved killings) by some geographical area (state or county) but does not normalize for population size. You would expect California to have more of almost everything than Vermont because its population is so much larger, but the picture may look very different if you normalize for population size.
On the other hand, a project that combines data from completely different sources (e.g., COVID data from the Florida Department of Health with demographic data from the Census Bureau) is ambitious. The notion is that you stretch yourself. We will return to this issue on the next page, which discusses selecting data sources for your project.
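To make the points about combining sources and normalizing for population size concrete, here is a minimal Python sketch. Everything in it is hypothetical: the region names, the case and population numbers, and the function name `cases_per_100k` are made up for illustration and do not come from any real data set. In Tableau or Power BI the same idea would be expressed as a relationship between two tables plus a calculated field.

```python
# Hypothetical case counts keyed by region, as they might come from a
# health department file. All numbers are made up for illustration.
cases_by_region = {"Region A": 50_000, "Region B": 2_000}

# Hypothetical population counts keyed by region, as they might come from
# a separate demographic source.
population_by_region = {"Region A": 10_000_000, "Region B": 200_000}


def cases_per_100k(cases, population):
    """Join the two tables on region and return cases per 100,000 residents."""
    rates = {}
    for region, count in cases.items():
        if region not in population:
            continue  # no matching row in the second source, so skip it
        rates[region] = count * 100_000 / population[region]
    return rates


rates = cases_per_100k(cases_by_region, population_by_region)

# List regions from the highest to the lowest normalized rate.
for region in sorted(rates, key=rates.get, reverse=True):
    print(f"{region}: {cases_by_region[region]} cases, "
          f"{rates[region]:.0f} per 100,000 residents")
```

With these made-up numbers, Region A has 25 times as many cases as Region B, yet Region B has the higher rate per 100,000 residents (1,000 versus 500), which is the kind of reversal that normalization can reveal.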