SIT720 Machine Learning
Assessment Task 1: Problem solving task

Background
Tracking oil temperature is important for protecting electrical transformers from damage caused by electrical overloading. In this assessment, we will carry out some simple analysis on a dataset that contains the electricity load and the corresponding oil temperature recorded from an electrical transformer at different dates and times.

Dataset
Dataset file name: ETTh1.csv
Dataset description: The dataset contains several load features along with the oil temperature. It has 8 features in total, each either a floating-point value or a datetime. Feature names, data types, and descriptions are given below. The data may contain 'null' or 'nan' values.

Features:
I. date (datetime): The recorded date
II. HUFL (float): High useful load
III. HULL (float): High useless load
IV. MUFL (float): Middle useful load
V. MULL (float): Middle useless load
VI. LUFL (float): Low useful load
VII. LULL (float): Low useless load
VIII. OT (float): Oil temperature
_____________________________________________________________________________________
Questions
_____________________________________________________________________________________
1. Load the data from the supplied data file. Remove the observations/samples where the oil temperature is less than 4. Print the data dimensions before and after removing the observations/samples.
2. Load the data from the supplied data file. Display the number of rows that have a missing value in one or more cells, together with their indices. Now replace the missing values with the lowest value of the corresponding feature, but only if the data type is numeric. Print the mean values of all features before and after replacing the missing data, and explain the reason if they are not the same.
3. Continue from question 2. Find the dates where the oil temperature is more than 20 and display those observations.
4. Continue from question 3. Which feature(s) have the strongest influence on oil temperature?
5. Load the data from the supplied data file. Print the average value of all numeric features where the oil temperature is less than 7 and greater than 4.
6. Display the number of observations that satisfy the following conditions:
   a. OT value is within the range between 10 and 25
   b. HUFL is positive
   c. 4 < MUFL < 11
7. Print the average of HULL for observations that have an OT value between -5 and 10. Is this higher than the HULL average calculated using the complete dataset?
8. Create and print a data frame in which OT values are grouped into ranges from 1 to 40 with an interval of 5 (e.g. 1-5, 6-10, …, 36-40).
9. Continue from question 8. For each group, display the minimum and maximum of all the features except date and OT.
10. Create data frames containing the observations recorded either between January and June or between July and December for each individual year (e.g. 2016, 2017, etc.). This means there will be two data frames for each year, as long as there is at least one observation that falls in the January-June or July-December date range. Store each data frame in a separate CSV file named ‘year_monthstart_monthend.csv’ in the same directory as your code file, where the value of ‘monthstart’ is either January or July and the value of ‘monthend’ is either June or December, based on the date range of the corresponding data frame.
11. Visualise variations of OT values with respect to the other features (one at a time) except ‘date’. Explain your findings briefly.
12. Draw a pie chart summarising the number of observations per year.
13. Create a histogram of OT values and explain the graph.
14. Load the data from the supplied data file. Add three more columns titled [‘hufl_hull’, ‘mufl_mull’, ‘lufl_lull’], where hufl_hull = (HUFL + HULL) / 2, mufl_mull = (MUFL + MULL) / 2, and lufl_lull = (LUFL + LULL) / 2. Save the updated data frame to a CSV file named ‘average_data.csv’ in the same directory as your code file.
15. Continue from question 14. Drop the rows where the value of any of the newly added features is null. Finally, display the filtered observations and save them to a CSV file named ‘average_clean_data.csv’ in the same directory as your code file.

© Deakin University, SIT720
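As a starting point for questions 1 and 2, the following is a minimal Python sketch using pandas. It is not the posted solution. The sample rows are made up and only mimic the column layout of ETTh1.csv, and the helper names (load_data, drop_low_ot, rows_with_missing, fill_numeric_with_min) are hypothetical. It also assumes that rows with a missing OT are kept in question 1, since a missing value is not "less than 4"; the task could be read the other way.

```python
import io

import pandas as pd

# Hypothetical stand-in for ETTh1.csv: made-up rows with the same columns.
SAMPLE_CSV = """date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.8,2.0,1.6,0.4,4.2,1.3,30.5
2016-07-01 01:00:00,5.7,,1.5,0.5,4.1,1.2,3.5
2016-07-01 02:00:00,,1.9,1.4,0.3,3.9,1.1,21.0
2017-01-15 03:00:00,6.1,2.2,5.0,0.6,4.0,1.4,
2017-08-20 04:00:00,-1.0,1.0,6.5,0.2,3.8,1.0,12.0
"""


def load_data(source):
    """Read the CSV and parse the 'date' column as datetime."""
    return pd.read_csv(source, parse_dates=["date"])


def drop_low_ot(df, threshold=4):
    """Question 1: remove rows whose OT is below the threshold.

    Rows with a missing OT are kept (assumption: NaN is not 'less than 4').
    """
    return df[~(df["OT"] < threshold)]


def rows_with_missing(df):
    """Question 2: index labels of rows with at least one missing cell."""
    return df.index[df.isna().any(axis=1)]


def fill_numeric_with_min(df):
    """Question 2: fill NaNs in numeric columns with that column's minimum."""
    filled = df.copy()
    numeric_cols = filled.select_dtypes(include="number").columns
    filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].min())
    return filled


# With the real file this would be load_data("ETTh1.csv").
df = load_data(io.StringIO(SAMPLE_CSV))
print("Shape before filtering:", df.shape)
print("Shape after filtering: ", drop_low_ot(df).shape)

missing_idx = rows_with_missing(df)
print("Rows with missing values:", len(missing_idx), list(missing_idx))

filled = fill_numeric_with_min(df)
print("Means before filling:\n", df.mean(numeric_only=True))
print("Means after filling:\n", filled.mean(numeric_only=True))
```

Because every filled cell takes the column minimum, the mean of a column that had missing values can only go down or stay the same after filling, which is the kind of explanation question 2 asks for.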
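Questions 6, 8, 9, and 10 can be approached along the following lines. This is again an illustrative pandas sketch on made-up rows, not the posted solution. It assumes that "between 10 and 25" is inclusive, that the OT groups are the half-open intervals [1, 6), [6, 11), …, [36, 41) labelled 1-5, 6-10, …, 36-40 (so that non-integer temperatures do not fall into gaps), and that the date column has no missing values. The function name split_by_half_year and the helper columns year and half are hypothetical.

```python
import pandas as pd

# Hypothetical rows standing in for ETTh1.csv (same column names, made-up values).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2016-07-01", "2016-11-05", "2017-02-10", "2017-03-12", "2017-09-30"]
    ),
    "HUFL": [5.8, 7.2, -1.0, 6.4, 8.1],
    "HULL": [2.0, 2.5, 1.0, 2.2, 3.0],
    "MUFL": [1.6, 5.0, 6.5, 4.5, 10.2],
    "MULL": [0.4, 0.6, 0.2, 0.5, 0.9],
    "LUFL": [4.2, 4.0, 3.8, 4.1, 3.5],
    "LULL": [1.3, 1.4, 1.0, 1.2, 1.1],
    "OT": [30.5, 12.0, 3.5, 8.0, 22.0],
})

# Question 6: count rows meeting all three conditions (range taken as inclusive).
mask = (
    df["OT"].between(10, 25)
    & (df["HUFL"] > 0)
    & (df["MUFL"] > 4)
    & (df["MUFL"] < 11)
)
print("Observations matching all conditions:", int(mask.sum()))

# Question 8: label each row with its OT group; values outside [1, 41) become NaN.
edges = list(range(1, 42, 5))                       # 1, 6, 11, ..., 41
labels = [f"{lo}-{lo + 4}" for lo in edges[:-1]]    # "1-5", "6-10", ..., "36-40"
binned = df.assign(OT_range=pd.cut(df["OT"], bins=edges, right=False, labels=labels))
print(binned)

# Question 9: per-group minimum and maximum of every feature except date and OT.
feature_cols = ["HUFL", "HULL", "MUFL", "MULL", "LUFL", "LULL"]
summary = binned.groupby("OT_range", observed=True)[feature_cols].agg(["min", "max"])
print(summary)


def split_by_half_year(frame):
    """Question 10: map 'year_monthstart_monthend.csv' names to data frames."""
    keyed = frame.assign(
        year=frame["date"].dt.year,
        half=frame["date"].dt.month.le(6).map(
            {True: "January_June", False: "July_December"}
        ),
    )
    frames = {}
    for (year, half), part in keyed.groupby(["year", "half"]):
        frames[f"{int(year)}_{half}.csv"] = part.drop(columns=["year", "half"])
    return frames


frames = split_by_half_year(df)
for name, part in frames.items():
    print(name, part.shape)
    # To store each frame next to the code file: part.to_csv(name, index=False)
```

Only half-years that actually contain observations produce a frame, which matches the "as long as there is one observation" wording in question 10.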
Answered Same Day, Aug 04, 2022

Aditi answered on Aug 05 2022