Python Data science assignment


CSCI - 5901 - The Process of Data Science - Summer 2019
Assignment 3
Due date: August 2nd, 2019 11:59:59 pm. The submission must be done through Brightspace.
Teams of 2 students

● Cite any and all resources used.
  ○ books, websites (other than documentation) like stackoverflow.
● I will use plagiarism tools to detect any type of cheating and copying (your code and PDF).
● Write all of your comments and explanations in the code as a text cell.
● Two sides of Data Science (and your mark):
  ○ Technical // does it work?
    ■ Quality of code (documentation, naming)
    ■ Does it work
  ○ Conceptual // what did you find from the data?
    ■ Quality of the data insights
    ■ Quality of the discussion text
  ○ Questions will be marked individually. Their weights are shown in parentheses after the question number.
● Your submission is a single Jupyter notebook and a PDF (with the compiled results generated by your Jupyter notebook). Filename should be A3--.ipynb and A3--.pdf. Please upload to Brightspace. Please include your B# in your Jupyter notebook and PDF.
● Forgetting to submit both files results in a mark of 0 for both students.

In this question, you are going to solve a spatiotemporal problem and work with some streaming techniques. Follow the steps and provide the answer to each question. The total marks in this assignment are 130. If you score more than 100, I will add the extra points to the previous assignments. In case you score 100 in all assignments, I will proportionally add the bonus points to your midterm.

Download Nima Ports datasets from here: Nima_Ports.Zip. This is a shapefile that you are going to load in your notebook. Download AISdata.zip, which is a CSV file similar to the ferry AIS data in the lab practice.

1. Find all the vessels that visited ports in Canada. For this part, you are first going to create a buffer with a 5 km radius around the center of each Canadian port. Second, you are going to find all the AIS data that intersect with these ports. (20 points)
2. Show the density of each port (i.e., the number of vessels in that port) on a map by using a colour-coded map. (20 points)
3. Now divide the AIS data into data frames with a one-hour interval. Repeat steps 1 and 2 for all of the sub-dataframes. Here each data frame has only the information of one hour. (20 points)
4. Select any port you like. Create a temporal chart for the density of that port. Your x-axis is the time, and each snapshot of the time has the density of the port at a specific hour. (20 points)
5. Use concept drift methods on step 4 and find out whether any drift can be detected in the data. Try to play with the input parameters and justify the ones you chose. Try to explain why the drift was detected: what characteristics changed? (25 points)
6. Cluster the ports based on their density using DBSCAN and categorize the Canadian ports based on traffic (density). Repeat this procedure for both dataframes created in steps 2 and 3. (25 points)
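Step 5 does not name a particular concept drift method, so the following is only a minimal sketch of one common choice, the Page-Hinkley test, written by hand without any library. The function name `page_hinkley`, its parameters (`delta`, `threshold`, `min_samples`), and the hourly density values are all assumptions made for illustration; none of them come from the assignment or from AISdata.

```python
def page_hinkley(values, delta=0.005, threshold=5.0, min_samples=5):
    """Return the indices at which an upward shift in the mean is flagged.

    delta       -- tolerated magnitude of change (larger = less sensitive)
    threshold   -- how far the cumulative deviation must rise above its minimum
    min_samples -- observations required before a detection is allowed
    """
    drift_points = []
    n = 0
    mean = 0.0
    cumulative = 0.0  # running sum of (x - mean - delta)
    minimum = 0.0     # smallest cumulative sum seen so far
    for i, x in enumerate(values):
        n += 1
        mean += (x - mean) / n            # incremental running mean
        cumulative += x - mean - delta
        minimum = min(minimum, cumulative)
        if n >= min_samples and cumulative - minimum > threshold:
            drift_points.append(i)
            # Restart the statistics so a later drift can be detected too.
            n, mean, cumulative, minimum = 0, 0.0, 0.0, 0.0
    return drift_points


# Illustrative hourly densities for one port (made-up numbers):
# a quiet period followed by a sustained jump in traffic.
hourly_density = [3, 4, 3, 5, 4, 3, 4, 4, 12, 13, 12, 14, 13, 12]
drift_at = page_hinkley(hourly_density, delta=0.5, threshold=10.0)
print(drift_at)
```

With these made-up values the jump from about 4 to about 12 vessels per hour is flagged shortly after it begins. Raising `threshold` or `delta` delays or suppresses the detection, which is the kind of parameter trade-off the question asks you to justify.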
Answered Same Day, Jul 20, 2021


Anirban answered on Aug 01 2021
Answer to question 1
For this section, the data is grouped by event time, as available in the data file; the results are listed below. The port locations from the ports data are plotted on a map using GeoPandas, with Matplotlib drawing the port shapes. Matplotlib helps in plotting the shape or the...
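The full solution is not shown here, so the following is only a rough, dependency-free sketch of the idea described above: test whether each AIS point lies within 5 km of a port, then group the matches by the hour of the event time. It uses the haversine distance as a stand-in for the GeoPandas buffer-and-intersect approach the assignment asks for. The row layout `(mmsi, event_time, lat, lon)`, the function names, and the sample coordinates are assumptions for illustration, not the actual columns of AISdata.

```python
import math
from collections import defaultdict
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def hourly_port_density(ports, ais_rows, radius_km=5.0):
    """Map (port name, hour) to the number of distinct vessels within radius_km."""
    vessels = defaultdict(set)
    for mmsi, event_time, lat, lon in ais_rows:
        # Truncate the event time to the start of its hour.
        hour = datetime.fromisoformat(event_time).replace(
            minute=0, second=0, microsecond=0)
        for name, (port_lat, port_lon) in ports.items():
            if haversine_km(lat, lon, port_lat, port_lon) <= radius_km:
                vessels[(name, hour)].add(mmsi)
    return {key: len(ids) for key, ids in vessels.items()}


# Hypothetical sample data: approximate port position and made-up AIS rows.
ports = {"Halifax": (44.65, -63.57)}
ais_rows = [
    (316001, "2019-07-01 10:05:00", 44.66, -63.56),
    (316001, "2019-07-01 10:40:00", 44.65, -63.58),  # same vessel, same hour
    (316002, "2019-07-01 10:50:00", 44.64, -63.57),
    (316002, "2019-07-01 11:10:00", 44.64, -63.57),
    (316003, "2019-07-01 11:20:00", 45.50, -61.00),  # well outside 5 km
]
density = hourly_port_density(ports, ais_rows)
for (port, hour), count in sorted(density.items()):
    print(port, hour, count)
```

Counting distinct vessel identifiers per hour, rather than raw AIS messages, keeps a vessel that reports many times from inflating the density. In a notebook, the same logic would more typically be expressed with a projected GeoPandas buffer and a spatial join.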