
Should do the 6730 extra report questions


COMP1730/6730 S1 2021 — Project Assignment
Jeffrey Fisher
03-05-2021

Important
• The assignment is due 9:00 am Monday May 24 (in week 12).
• The code for the assignment can be developed in groups of up to four people.
• The report is individual - you must write it entirely on your own.
• COMP6730 students have to write a more extensive report.
• Include your university ID in every file you submit.
• Include the university ID of every member of your group in any code files you submit.
• This assignment is worth 25% of your grade for COMP1730/COMP6730.

Groups
• The code for the assignment can be completed in groups of up to 4 people. If you wish to work in a group you should sign up for one in the sign-up form in Wattle (all members should sign up to the same group).
• The report is individual, i.e. you must do it on your own. We will check for plagiarism and other suspicious activities in the reports within groups as normal.
• If you do not wish to work with other people, please sign up to the "I will do the assignment on my own" group instead.
• There is no difference in size/scope/marking criteria based on your group size. Unless you have strong feelings otherwise, we recommend you work in a group of 3 or 4 people.
• COMP6730 and COMP1730 students can be part of the same group.
• You are not limited to members of your tutorial discussion groups - you can form a group with anyone else enrolled in the course.
• The sign-up link in Wattle is https://wattlecourses.anu.edu.au/mod/groupselect/view.php?id=2137144
• Group sign-ups will close at 9:00am Monday 10 May. Anyone not signed up to a group at that point will be added to the "I will do the assignment on my own" group.

Overview
In this assignment you will be doing a series of data analysis and modelling tasks using some real-world geographical data. It is different to the homework assignments you have done up until this point in that for almost all of the questions, there is no single “right” answer. You are also not given any tests against which to check your answers (although you are encouraged to write your own to help you test the correctness of your functions).

Because there is no single “right” answer, it will be important to justify the decisions and choices that you make while completing the assignment. This is important since it allows anyone relying on your results and conclusions to understand how they were obtained and whether they are suitable for a particular purpose.

The Cotter River provides the ACT with the majority of its water supply. The river stretches over 70 kilometres from the south-west edge of the ACT, northward until it joins with the Murrumbidgee River, just below the Cotter Dam. In addition to the Cotter Dam, there are two other reservoirs along the Cotter River, Bendora Dam and Corin Dam. These three dams store the majority of water used in the ACT.

Figure: The Cotter Dam (left) and Corin Dam (right) are both overflowing after the wet Summer and Autumn we’ve had in Canberra this year.

For this assignment, we have obtained elevation data for the majority of the Cotter River catchment area, at a 5 metre resolution. You will be analysing this data and answering some questions about the region, including some questions that are relevant to Canberra’s drinking water supply.

The Data
We have provided you with two csv files - elevation_data_small.csv and elevation_data_large.csv. These two csv files contain height information on a 5 metre grid. The elevation_data_small.csv file contains just the Cotter Dam region. The elevation_data_large.csv contains the entire Cotter River catchment area including the Cotter Dam, Bendora Dam and Corin Dam, as well as the surrounding mountain ranges and the Namadgi National Park. You can see a heatmap of the two data sets in the images below. Brighter colours are higher elevation.
Figure: Elevation data for the Cotter Dam (left) and Cotter River catchment area (right).

You can also see the dams on Google Maps:
• Cotter Dam: https://www.google.com/maps/place/Cotter+Dam,+Australian+Capital+Territory/@-35.3237551,148.9152989,14.67z/data=!4m5!3m4!1s0x6b17b76c44194b5f:0xc7482c48c1462c8!8m2!3d-35.3202853!4d148.939818
• Bendora Dam: https://www.google.com/maps/place/Bendora+Dam/@-35.4638766,148.8335277,12.63z/data=!4m5!3m4!1s0x6b17c389e536a9cd:0xe832a1de46823c11!8m2!3d-35.4464606!4d148.828355
• Corin Dam: https://www.google.com/maps/place/Corin+Dam,+Cotter+River+ACT+2611/@-35.5449483,148.8241156,14.67z/data=!4m5!3m4!1s0x6b17db7b770167cf:0x34d44ef9262ceac!8m2!3d-35.534741!4d148.83527

The elevation_data_small file looks like this (but with a lot more rows and columns):

    693.366,692.038,690.964,690.964,...
    693.406,692.079,691.025,691.025,...
    693.383,692.039,691.018,691.018,...
    693.457,692.085,691.058,691.058,...
    693.457,692.107,691.091,691.091,...
    ...,...,...,...,...

All elevation values are in metres. This means that the elevation of the north-westernmost point in the region is 693.366 metres, the point 5 metres to the east of it has elevation 692.038 metres, and so forth.

If we need to refer to a specific cell in the data, we can do so using its x and y coordinates. We’ll use matrix style indexing so the origin (x=0, y=0) is the top left grid cell, rather than the bottom left grid cell you might see on a traditional graph. The coordinate at x = 2 and y = 4 means the 3rd column from the left and the 5th row from the top (highlighted in yellow in Figure 1).

Figure 1: Indexing Example

Be aware that the same location will have different coordinates in the two data sets.

If you are not sure how to read and process CSV files, have a look at Labs 6 and 8 in order to remind yourself. Please also keep in mind that even the elevation_data_small file is not actually that small: it contains roughly one million points.
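The indexing convention above can be illustrated with a minimal sketch. It uses only the five sample rows and four columns shown in the brief, held in a plain list of lists; the helper name `elevation_at` is hypothetical and not part of the assignment template.

```python
# The five sample rows (first four columns) quoted in the brief, in metres.
grid = [
    [693.366, 692.038, 690.964, 690.964],
    [693.406, 692.079, 691.025, 691.025],
    [693.383, 692.039, 691.018, 691.018],
    [693.457, 692.085, 691.058, 691.058],
    [693.457, 692.107, 691.091, 691.091],
]


def elevation_at(grid, x, y):
    """Return the elevation at column x, row y (origin at the top left).

    Hypothetical helper: the row index comes first when indexing the
    nested list, so the (x, y) pair is looked up as grid[y][x].
    """
    return grid[y][x]


north_west = elevation_at(grid, 0, 0)   # top-left cell
east_of_it = elevation_at(grid, 1, 0)   # one cell (5 metres) to the east
example = elevation_at(grid, 2, 4)      # 3rd column from the left, 5th row from the top
```

The point of the sketch is the order of the indices: x selects the column and y selects the row, so any row-major container is addressed as `[y][x]`.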
You may need to consider efficiency when completing the assignment.

The Task
You are provided with assignment_template.py, which contains the basic functions of the assignment. The functions are incomplete. In this assignment, you will fill in the blanks and complete the missing functions. However, we also encourage you to use functional decomposition where appropriate, i.e. you may (and should) add additional functions as necessary. You will also write a short report about your functions and decisions.

For Questions 1 through 5 you should just make use of the elevation_data_small file. Question 6 requires you to use the elevation_data_large file as well. Please be aware that if you try and test Questions 1 through 5 using the large data set, you will need to do the cleaning/preparation (described at the start of Question 6) or you will get nonsensical results.

Question 1: Reading the Data - 10 marks
Write a function that takes the file path of the CSV file as input, reads the file, and returns the data in a suitable format. The assignment template contains a function for you to fill in:

    def read_dataset(filepath):
        pass

pass means “do nothing”, and you should remove it when you fill in this function. To load the data, you can then run

    data = read_dataset('elevation_data_small.csv')

as long as the CSV file is in the same directory as your assignment file. If it is elsewhere, you’ll need to provide the file path instead of just the file name.

You should read the data from filepath, and return it in an easy-to-use format. This can be any data type or data structure that you like, as long as it makes sense for the tasks you will be doing later in this assignment. You will be using this returned value in all other questions of the assignment, so make sure your choices here support your later solutions!

Hint - have a look at the remaining questions before deciding on what format to load your data in!
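As one possible choice of format, the sketch below loads the grid into a 2-D NumPy array, which keeps later whole-grid operations fast. This is an assumption about a suitable format, not the required one; the function name `read_dataset_sketch` is hypothetical, and an in-memory string stands in for the real CSV file.

```python
import io

import numpy as np


def read_dataset_sketch(source):
    """Read comma-separated elevations into a 2-D float array.

    `source` may be a file path or an open file-like object.
    The file has no header row, so every line is treated as data.
    """
    return np.loadtxt(source, delimiter=",", ndmin=2)


# Stand-in for the real file: two rows and three columns from the brief's sample.
sample = io.StringIO("693.366,692.038,690.964\n693.406,692.079,691.025\n")
grid = read_dataset_sketch(sample)

n_rows, n_cols = grid.shape
```

With the real data the call would be `read_dataset_sketch('elevation_data_small.csv')`, assuming the file sits next to the script. Rows of the array correspond to y and columns to x.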
Question 2: Summary Statistics - 10 marks
Now that we have a function to read in the data set, it’s time to do some analysis. We’ll start by calculating some basic statistics about the data. There are three functions to fill in for Question 2:

    def minimum_elevation(data_set):
        pass

    def maximum_elevation(data_set):
        pass

    def average_elevation(data_set):
        pass

The input to each of these functions should be the data set returned by Question 1. The output should be the minimum elevation, the maximum elevation and the average (mean) elevation respectively of the region covered by the data set. All return values should be in metres. The minimum and maximum are each worth 3 marks. The average is worth 4 marks.

Question 3: Gradient - 10 marks
The Cotter River valley is pretty rugged country. There are steep mountain ranges on either side of the river for most of its length. It’s useful to know how steeply sloped an area is. For example, the slope is used when planning walking or fire trails, assessing the risk of landslides or bushfires, and so on.

For a given cell we calculate the slope by subtracting the elevation in the cell on its left from the elevation in the cell on its right, then dividing by 10 (the horizontal distance in metres between those two cells). This is the x gradient. Then subtract the elevation in the cell below from the elevation in the cell above, and again divide the result by 10. This is the y gradient. Square both gradients, add them together, then take the square root. This is the slope, or total gradient.

Mathematically, if $e_{x,y}$ is the elevation for cell (x, y), the slope at cell (x, y) is calculated as:

$$\text{slope}_{x,y} = \sqrt{\left(\frac{e_{x+1,y} - e_{x-1,y}}{10}\right)^2 + \left(\frac{e_{x,y+1} - e_{x,y-1}}{10}\right)^2}$$

Fill in the function:

    def slope(data_set, x_coordinate, y_coordinate):
        pass

It should take as inputs the data set (returned by Question 1), an x coordinate and a y coordinate and return the total gradient at the corresponding cell.

Hint: You may need to consider the edges of the map separately.
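The slope calculation can be sketched as follows. The brief leaves the edge treatment open, so this sketch makes one assumption: where a neighbour is missing, it falls back to a one-sided difference over a single 5 metre cell. The name `slope_sketch` and the toy grid are hypothetical, and the grid is assumed to have at least two rows and two columns.

```python
import math

import numpy as np

CELL_SIZE = 5.0  # grid spacing in metres, as stated in the brief


def slope_sketch(grid, x, y):
    """Total gradient at column x, row y of a 2-D elevation array.

    Interior cells use the central difference over 10 m. At an edge the
    missing neighbour is replaced by the cell itself, which gives a
    one-sided difference over 5 m (an assumed edge treatment).
    """
    rows, cols = grid.shape
    left, right = max(x - 1, 0), min(x + 1, cols - 1)
    up, down = max(y - 1, 0), min(y + 1, rows - 1)
    x_gradient = (grid[y, right] - grid[y, left]) / ((right - left) * CELL_SIZE)
    y_gradient = (grid[down, x] - grid[up, x]) / ((down - up) * CELL_SIZE)
    return math.sqrt(x_gradient ** 2 + y_gradient ** 2)


# Toy surface: a tilted plane rising 2.5 m per cell eastward and 5 m per cell southward,
# i.e. gradients of 0.5 and 1.0 metres per metre.
plane = np.array([[2.5 * x + 5.0 * y for x in range(4)] for y in range(3)])

interior = slope_sketch(plane, 1, 1)
corner = slope_sketch(plane, 0, 0)
```

On a plane both the central and the one-sided differences recover the same gradients, so the interior and corner values agree at the square root of 0.5² + 1.0². On real terrain the edge values will differ depending on the treatment chosen, which is worth justifying in the report.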
Question 4: Surface Area of the Dams - 10 marks
The areas covered by the two elevation data sets are particularly important for Canberra’s water supply. There was a period of time in the not too distant past where the level in all the dams was dangerously low, resulting in severe restrictions being placed on water usage in the ACT.

One way of measuring the water in the dam is by calculating its extent - or surface area. Our elevation data just contains the elevation at the surface - regardless of whether it is water or land. However, if we assume that the dam is all approximately the same level, then as long as we know the elevation of a single point on the dam, we can figure out the surface area.

Complete the following function in the assignment template:

    def surface_area(data_set, x_coordinate, y_coordinate):
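One way to turn "all approximately the same level" into code is a flood fill from the known point: collect every cell that is connected to it and whose elevation is close to the seed's. The sketch below is a pure-Python breadth-first version under several assumptions that the brief does not state: the `tolerance` value, the use of 4-connectivity, the result in square metres (cell count times 5 m × 5 m), and the hypothetical name `surface_area_sketch`.

```python
from collections import deque

CELL_AREA = 5 * 5  # each grid cell covers 5 m x 5 m


def surface_area_sketch(grid, x, y, tolerance=0.5):
    """Area in square metres of the flat region connected to cell (x, y).

    A neighbour joins the region when it is 4-connected to a cell already
    in the region and its elevation is within `tolerance` metres of the
    seed cell's elevation. Both choices are assumptions of this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    level = grid[y][x]
    seen = {(x, y)}
    queue = deque([(x, y)])
    while queue:
        cx, cy = queue.popleft()
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            inside = 0 <= nx < cols and 0 <= ny < rows
            if inside and (nx, ny) not in seen and abs(grid[ny][nx] - level) <= tolerance:
                seen.add((nx, ny))
                queue.append((nx, ny))
    return len(seen) * CELL_AREA


# Toy grid: two flat patches near 600 m separated by a 650 m ridge.
toy = [
    [600.0, 600.0, 650.0, 600.0],
    [600.0, 600.1, 650.0, 600.0],
    [650.0, 650.0, 650.0, 600.0],
]

west_patch = surface_area_sketch(toy, 0, 0)
east_patch = surface_area_sketch(toy, 3, 0)
```

The connectivity requirement is what keeps the two patches apart even though they sit at the same elevation; a plain threshold over the whole grid would merge them.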
Answered 4 days after May 18, 2021


Shreyan answered on May 21 2021
162 Votes
Modified_Solution/assignment.py
"""
This is the assignment template for the COMP1730/COMP6730 major assignment
for Semester 1, 2021.
The assignment is due at 9:00am on Monday 24 May.
Please include the student IDs of all members of your group here
Student Ids:
"""
import pandas as pd
import numpy as np
import math
import matplotlib.cm as cm
from matplotlib import pyplot as plt
from skimage import data, filters, color, morphology
from skimage.segmentation import flood, flood_fill
import datetime
from sklearn import preprocessing
# Question 1:
def read_dataset(filepath):
    # The CSV has no header row, so header=None keeps the first row of elevations as data
    df = pd.read_csv(filepath, sep=',', header=None)
    return df
# Question 2:
def minimum_elevation(data_set):
    minValuesCol = data_set.min(axis = 0)
    minValue = minValuesCol.min()
    return minValue

def maximum_elevation(data_set):
    maxValuesCol = data_set.max(axis = 0)
    maxValue = maxValuesCol.max()
    return maxValue

def average_elevation(data_set):
    sumValuesCol = data_set.sum(axis = 0)
    sumValue = sumValuesCol.sum()
    noOfVals = data_set.size
    return (sumValue/noOfVals)
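# Question 3
def slope(data_set, x_coordinate, y_coordinate):
    # x is the column index and y is the row index (origin at the top left),
    # so a cell is addressed as data_set.iloc[y, x]
    n_rows, n_cols = data_set.shape
    cell = 5  # grid spacing in metres

    # Edge treatment chosen here: a missing neighbour is replaced by the cell
    # itself, giving a one-sided difference over 5 m instead of 10 m.
    # Assumes the grid has at least two rows and two columns.
    left = max(x_coordinate - 1, 0)
    right = min(x_coordinate + 1, n_cols - 1)
    up = max(y_coordinate - 1, 0)
    down = min(y_coordinate + 1, n_rows - 1)

    x_gradient = (data_set.iloc[y_coordinate, right] - data_set.iloc[y_coordinate, left]) / ((right - left) * cell)
    y_gradient = (data_set.iloc[down, x_coordinate] - data_set.iloc[up, x_coordinate]) / ((down - up) * cell)

    e_gradient = math.sqrt(x_gradient**2 + y_gradient**2)
    return e_gradient
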

# Question 4
def surface_area(data_set, x_coordinate, y_coordinate):
    # x is the column and y is the row, so the seed cell is (row, column) = (y, x)
    elevations = data_set.to_numpy(dtype = 'float')
    seed = (y_coordinate, x_coordinate)
    e = elevations[seed]
    tol = 10  # elevation tolerance in metres

    # creating a bi-modal image: 255 where the elevation is within tol of the seed, else 0
    df = np.where(np.abs(elevations - e) <= tol, 255, 0)

    # flood fill marks the cells connected to the seed with 127
    df = flood_fill(df, seed, 127)

    # Calculating the area: filled cells times the 5 m x 5 m cell area (square metres)
    area = np.count_nonzero(df == 127) * 25
    return area
# Question 5:
def expanded_surface_area(data_set, water_level, x_coordinate, y_coordinate):
    # x is the column and y is the row, so the seed cell is (row, column) = (y, x)
    elevations = data_set.to_numpy(dtype = 'float')
    seed = (y_coordinate, x_coordinate)

    # creating a bi-modal image based on threshold = water_level
    df = np.where(elevations <= water_level, 255.0, 0.0)

    # flood fill marks the cells connected to the seed with 127
    df = flood_fill(df, seed, 127)

    # Calculating the area: filled cells times the 5 m x 5 m cell area (square metres)
    area = np.count_nonzero(df == 127) * 25

    plt.imshow(df, cmap = 'gray') # plots the diagram of the catchment area
    return area
# Question 6:
def impute_missing_values(data_set):
    data_set = data_set.applymap(clean_data)
    return data_set

def clean_data(item):
    if item < 0:
        return 1600
    else:
        return item

# You'll need to decide what other functions you want for Question 6
# It should be clear from your code, what we need to do in order to produce the plot(s).
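def produce_plot(data_set, x, y):
    # flood_fill works on a NumPy array and takes a (row, column) seed, i.e. (y, x)
    elevations = data_set.to_numpy(dtype = 'float')
    max_elevation = maximum_elevation(data_set)  # avoids shadowing the built-in max
    df1 = flood_fill(elevations, (y, x), 1.5*max_elevation, tolerance=10)
    plt.imshow(df1, cmap = 'gray')
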
# Code in the following if statement will only be executed when this file is run - not when it is imported.
# If you want to use any of your functions (such as to answer questions) please write the code to
# do so inside this if statement. We'll cover it in more detail in an upcoming lecture.
if __name__ == "__main__":
    filename = 'elevation_data_small.csv'
    data_set = read_dataset(filename)

    # Finding the minimum, maximum and...