This assignment still references the London Weather dataset mentioned in my last assignment (Order # 115126).
Complete the business case analytics and present a comprehensive business report.
This is the final phase of the data analytics project. With analysis results on hand, now you are applying them towards the business objectives. Keep in mind that the final outcome is an analysis report to demonstrate the conclusions from your analytical work, including how the results can be used for business objectives. Your final report must be business centric with data analytics as the supporting foundation.
Some of the key tasks include:
Extend the analysis results with at least one data visualization.
Combine all documents into a comprehensive business report.
London Weather Data (Historical London weather data from 1979 to 2021)

Introduction

Climate change is undoubtedly one of the biggest challenges of our times. The rise in average temperature over the last centuries has already caused many harmful effects, such as intense droughts, storms, heat waves and rising sea levels. The primary objective of this exploratory data analysis is to determine whether the global trends are also visible in London's weather.

Data

About Dataset

Context

The dataset featured below was created by reconciling measurements from requests of individual weather attributes provided by the European Climate Assessment (ECA). The measurements of this particular dataset were recorded by a weather station near Heathrow airport in London, UK.

-> This weather dataset is a great addition to the London Energy Dataset. You can join both datasets on the 'date' attribute, after some preprocessing, and perform some interesting data analytics regarding how energy consumption was impacted by the weather in London.

Content

The size of the file featured within this Kaggle dataset is shown below, along with a list of attributes and their description summaries:

· london_weather.csv - 15341 observations x 10 attributes

1. date - recorded date of measurement - (int)
2. cloud_cover - cloud cover measurement in oktas - (float)
3. sunshine - sunshine measurement in hours (hrs) - (float)
4. global_radiation - irradiance measurement in Watt per square meter (W/m2) - (float)
5. max_temp - maximum temperature recorded in degrees Celsius (°C) - (float)
6. mean_temp - mean temperature in degrees Celsius (°C) - (float)
7. min_temp - minimum temperature recorded in degrees Celsius (°C) - (float)
8. precipitation - precipitation measurement in millimeters (mm) - (float)
9. pressure - pressure measurement in Pascals (Pa) - (float)
10. snow_depth - snow depth measurement in centimeters (cm) - (float)

Source

Weather Data - https://www.ecad.eu/dailydata/index.php

Methods

This first step is essentially about understanding the business, the need for the specific project and the resources that we have.

· Make data more accessible and understandable for everyone.
· Provide fast analytics for the different factors affecting weather.
· Use different prediction and classification algorithms, calculating and comparing evaluation measures.

Exploratory data analysis

Figure: Mean temperatures in the last 40 years
Figure: Max temperatures in the last 40 years
Figure: Precipitation in the last 40 years

Number of days a year with downpours (top 5 years)

year    precipitation
2002    25
2009    22
2001    21
2014    20
2000    19

Number of days a year without rain (top 10 years)

year    precipitation
1990    229
1997    216
1995    216
1996    215
1991    214
2003    212
1989    212
2018    210
1993    201
1983    199

Top 5 hottest years

year    max_temp
2020    16.787778
2018    16.718630
2014    16.626575
2019    16.450137
2011    16.379726

Heat waves

date    month    year    max_temp
24      7        2019    37.9
9       8        2003    37.9
31      7        2020    37.9
30      6        2015    36.7
7       8        2020    36.5

Top 10 lowest temperatures in the last 40 years

date    month    year    max_temp
11      1        1987    -6.2
15      1        1985    -4.0
6       2        1991    -3.4
12      1        1987    -3.3
10      1        1987    -3.0
8       2        1991    -2.5
7       2        1991    -2.1
9       2        1985    -1.8
7       1        1982    -1.8
31      12       1996    -1.7

Figure: Snow depth - change between 1979 and 2020

Summary of exploratory data analysis

The main aim of this short exploratory analysis was to check the main trends in London's weather over the last 40 years. Here are the conclusions: there is a significant increasing trend in the average mean, minimum and maximum temperature.

1. The 5 hottest years were all after 2010.
2. 4 of the 5 hottest days in the last 40 years were in or after 2015 (!).
3. The 10 lowest temperatures were all before 2000.
4. Precipitation is more extreme.
5. The 5 years with the highest number of days with strong precipitation were all in or after 2000.
6. The maximum and minimum temperatures in the last 40 years are 37.9 °C and -6.2 °C respectively.

Classification

We now use a classification approach to classify the weather condition. We first apply a classifier to the imbalanced data and check whether the model performs well for the minority class. If the model does not classify the minority class correctly, we use the SMOTE technique to balance the dataset, apply the classification model again, and compare the minority-class recall for the balanced and imbalanced datasets.

First we apply the classification model to the imbalanced dataset.

Using a decision tree classifier for classification of the weather condition

The accuracy of the model comes out at 73%, but the recall for the minority class is zero. This shows that the model is biased towards the majority class, so it is not the best model.

Now we apply the imbalanced-data technique SMOTE and look at the accuracy of the model and the recall for the minority class.

After oversampling, the model accuracy decreases by approximately 10%, but the recall for the minority class increases sharply, so the model is no longer biased towards the majority class.
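
The accuracy and recall pattern described in the classification step can be illustrated with a small sketch. It is not the code used for the report: the class labels "dry" and "wet", the 73/27 split, the two-feature points and the helper names accuracy, recall_per_class and smote_like are all hypothetical, chosen only to show one way a model can reach 73% accuracy with zero minority-class recall, and how SMOTE-style interpolation creates extra minority samples. The oversampler is a simplified stand-in for SMOTE written with the standard library only, and it assumes at least two minority points.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred):
    """Recall per label: correctly predicted cases / actual cases of that label."""
    actual = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / count for label, count in actual.items()}


def smote_like(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling (assumption: not the library version).

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical labels: 73 majority ("dry") days and 27 minority ("wet") days.
y_true = ["dry"] * 73 + ["wet"] * 27
# A model that always predicts the majority class.
y_pred = ["dry"] * 100

acc = accuracy(y_true, y_pred)
rec = recall_per_class(y_true, y_pred)
print(acc)  # 0.73
print(rec)  # {'dry': 1.0, 'wet': 0.0}

# Hypothetical two-feature minority samples, oversampled to match 10 majority rows.
minority_points = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0), (2.5, 2.5)]
n_majority = 10
synthetic = smote_like(minority_points, n_new=n_majority - len(minority_points))
balanced_minority = minority_points + synthetic
print(len(balanced_minority))  # 10
```

The first part shows why accuracy alone is misleading on imbalanced data: predicting only the majority class already scores 73% while missing every minority case. In a real project the oversampling would normally be applied to the training split only, so that the recall reported on the test split reflects genuine observations.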