Please find attached, in CSV format, the dataset with filename 2019 EVCP use Q3 and Q4.csv.
This is an openly available dataset from the Web link:
https://datamillnorth.org/dataset/electric-vehicle-chargepoints-in-council-car-parks-
The dataset describes the usage of electric vehicle chargepoints in Council car parks in Leeds, UK, including how often they are used, duration of charge, electricity consumed and type of vehicle.
Use Excel, Power BI, RapidMiner, Weka, Python, R, MATLAB, or any other software (or software packages) you consider suitable, or a combination of these, to analyse the data and answer the 3 questions below.
Question 1.
Before performing any analysis, but by (visually) inspecting the data, describe as completely as possible, the data features or attributes, how the data may be acquired, their values, sizes and availability, any general patterns/trends, and the general nature of the data (e.g. the “5 C’s” for data quality). [30 marks]
Question 2.
Use software or software packages to answer the following:
(a) Data pre-processing:
(i) Are there any missing data? [2 marks]
(ii) Can the missing data be easily imputed? State why or why not. [2 marks]
(iii) Delete all rows with missing values in the column(s), and save the result using a different filename. How many rows are left after deletion? [5 marks]
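As an illustration only, the pre-processing steps in (a) could be sketched in Python with pandas as below. The column names (Site, CP ID, Total kWh), the toy values and the output filename evcp_cleaned.csv are all assumptions made for the example and are not taken from the dataset; with the real file, the table would be loaded with pd.read_csv on the attached CSV instead.

```python
import io

import pandas as pd

# Toy stand-in for the real file. Column names and values are invented for
# illustration. With the real data, something like:
#   df = pd.read_csv("2019 EVCP use Q3 and Q4.csv")
toy_csv = io.StringIO(
    "Site,CP ID,Total kWh\n"
    "Site A,101,5.2\n"
    "Site A,101,\n"
    "Site B,202,7.9\n"
    ",202,3.1\n"
)
df = pd.read_csv(toy_csv)

# (i) Count of missing values in each column.
missing_per_column = df.isna().sum()

# (iii) Drop every row that has at least one missing value,
# then save under a different (hypothetical) filename.
cleaned = df.dropna()
cleaned.to_csv("evcp_cleaned.csv", index=False)
rows_left = len(cleaned)

print(missing_per_column)
print("Rows left after deletion:", rows_left)
```

Note that dropna() with no arguments removes a row if any column is missing; passing subset=[...] would restrict the check to chosen columns, which may or may not be what the question intends.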
(b). Analytics:
(i) What are the average and standard deviation of the power used (in kWh) for the 2 quarters (Q3 and Q4) combined? [5 marks]
(ii) Re-calculate the average and standard deviation of the power used (in kWh) only for quarter Q3. Then repeat for quarter Q4. What are their values? [5 marks]
Is the average power used for Q3 higher or lower than that for Q4? [2 marks]
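One possible way to approach (i) and (ii) in Python with pandas is sketched below. It assumes, hypothetically, that each row has a start date from which the quarter can be derived; the column names Start Date and Total kWh and the four toy rows are invented for the example. Real dates in UK day/month/year format may need dayfirst=True in pd.to_datetime.

```python
import pandas as pd

# Toy data with invented column names and values (not from the dataset).
df = pd.DataFrame({
    "Start Date": ["2019-07-15", "2019-08-02", "2019-10-20", "2019-12-01"],
    "Total kWh": [4.0, 6.0, 10.0, 14.0],
})

# Derive the calendar quarter (3 for Jul-Sep, 4 for Oct-Dec) from the date.
df["Quarter"] = pd.to_datetime(df["Start Date"]).dt.quarter

# (i) Mean and standard deviation for Q3 and Q4 combined.
combined = df["Total kWh"].agg(["mean", "std"])

# (ii) The same statistics computed separately for each quarter.
by_quarter = df.groupby("Quarter")["Total kWh"].agg(["mean", "std"])

print(combined)
print(by_quarter)
print("Q3 mean higher than Q4 mean:",
      by_quarter.loc[3, "mean"] > by_quarter.loc[4, "mean"])
```

pandas computes the sample standard deviation (ddof=1) by default, which corresponds to STDEV.S rather than STDEV.P in Excel; whichever convention is used should be stated alongside the answer.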
(iii) Which site has the highest total power usage? [2 marks]
Within this site, which connector has the highest total power usage? [2 marks]
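The two-stage lookup in (iii) can be sketched as a pair of group-and-sum steps. Again this is only a sketch under assumptions: the column names Site, Connector and Total kWh and the toy rows are hypothetical.

```python
import pandas as pd

# Toy data with invented column names and values (not from the dataset).
df = pd.DataFrame({
    "Site": ["A", "A", "A", "B", "B"],
    "Connector": [1, 2, 2, 1, 1],
    "Total kWh": [5.0, 3.0, 4.0, 6.0, 2.0],
})

# Stage 1: total kWh per site, then the site with the largest total.
site_totals = df.groupby("Site")["Total kWh"].sum()
top_site = site_totals.idxmax()

# Stage 2: restrict to that site, total per connector, then the largest.
within_site = df[df["Site"] == top_site]
connector_totals = within_site.groupby("Connector")["Total kWh"].sum()
top_connector = connector_totals.idxmax()

print("Site with highest total:", top_site, site_totals[top_site])
print("Connector with highest total at that site:",
      top_connector, connector_totals[top_connector])
```

idxmax() returns only the first label if several are tied for the maximum, so ties would need to be checked separately.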
(iv) For Q3 and Q4 combined, plot a bar chart of car park ID (CP ID) in the horizontal axis and the total power used (in kWh) on the vertical axis. [7 marks]
Based on the plotted graph, which car park ID (CP ID) has the highest usage of power for the two quarters (Q3 and Q4) combined? What is its total power usage value? [2 marks]
Based on the plotted graph, which car park ID (CP ID) has the lowest usage of power for the two quarters (Q3 and Q4) combined? What is its total power usage value? [2 marks]
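A minimal sketch of the bar chart in (iv), using pandas and matplotlib, is given below. The column names CP ID and Total kWh, the toy rows and the output filename cp_id_totals.png are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Toy data with invented column names and values (not from the dataset).
df = pd.DataFrame({
    "CP ID": [101, 101, 202, 303],
    "Total kWh": [5.0, 3.0, 12.0, 1.5],
})

# Total kWh per car park ID across both quarters.
totals = df.groupby("CP ID")["Total kWh"].sum()

# Bar chart: CP ID on the horizontal axis, total kWh on the vertical axis.
fig, ax = plt.subplots()
ax.bar(totals.index.astype(str), totals.values)
ax.set_xlabel("CP ID")
ax.set_ylabel("Total power used (kWh)")
ax.set_title("Total power used per CP ID, Q3 and Q4 combined")
fig.savefig("cp_id_totals.png")  # hypothetical output filename

# Highest and lowest totals, to cross-check what is read off the chart.
highest = (totals.idxmax(), totals.max())
lowest = (totals.idxmin(), totals.min())
print("Highest:", highest)
print("Lowest:", lowest)
```

Converting the IDs to strings makes matplotlib treat them as categories rather than as positions on a numeric axis. The same pattern would apply to a chart grouped by a different column, such as User ID, by changing the grouping key and axis label.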
(v) For Q3 and Q4 combined, plot a bar chart of User ID in the horizontal axis and the total power used (in kWh) on the vertical axis. [7 marks]
Which User ID has the highest usage of power for the two quarters (Q3 and Q4) combined? What is its total power usage value? [2 marks]
Sub-total: 45 marks
Question 3.
Describe in detail:
(a). the (i) usefulness of this dataset; [5 marks]
and (ii) limitations of this dataset, [5 marks]
and what can be done to improve the data in terms of data collection/curation; [5 marks]
(b). any (i) opportunities for policymaking; [5 marks]
and (ii) potential business opportunities.