The project assignment screenshot and the data file are attached.
Please code in PYTHON and include comments explaining what the approach was. For example, if you chose a boxplot as the method to display the distribution, explain why you chose it and how it showed the data imbalance. Do the same for the other questions. Perhaps a separate doc would be ideal. Keep in mind to develop TWO different models to classify CTG features into the three fetal health states (have comments on top indicating that the models are being developed).
I asked this question before on Unifolks, and for reference I will attach the previous code, but please provide new code with explanations, reasons and approaches.
Thank you so much for your time and help.
About this dataset

Reduction of child mortality is reflected in several of the United Nations' Sustainable Development Goals and is a key indicator of human progress. The UN expects that by 2030, countries end preventable deaths of newborns and children under 5 years of age, with all countries aiming to reduce under-5 mortality to at least as low as 25 per 1,000 live births.

Parallel to the notion of child mortality is of course maternal mortality, which accounts for 295,000 deaths during and following pregnancy and childbirth (as of 2017). The vast majority of these deaths (94%) occurred in low-resource settings, and most could have been prevented.

In light of what was mentioned above, Cardiotocograms (CTGs) are a simple and cost-accessible option to assess fetal health, allowing healthcare professionals to take action in order to prevent child and maternal mortality. The equipment itself works by sending ultrasound pulses and reading the response, thus shedding light on fetal heart rate (FHR), fetal movements, uterine contractions and more.

Data

The dataset contains 2126 records of features extracted from Cardiotocogram exams, which were then classified by three expert obstetricians into 3 classes:
- Normal
- Suspect
- Pathological

Create a model to classify the outcome of the Cardiotocogram (CTG) exam (which represents the wellbeing of the fetus).

Notes

Note that this is a multiclass problem that can also be treated as regression (since the labels are progressive).

Tasks:
- Present a visual distribution of the 3 classes. Is the data balanced? How do you plan to circumvent the data imbalance problem, if there is one? (Hint: stratification needs to be included) (1)
- Present 10 features that are most reflective of fetal health conditions (there is more than one way of selecting features and any of these is acceptable). Present if the correlation is statistically significant (using 95% and 90% critical values). (2)
- Develop two different models to classify CTG features into the three fetal health states (I intentionally did not name which two models. Note that this is a multiclass problem that can also be treated as regression since the labels are numeric). (2+2)
- Visually present the confusion matrix. (1)
- With a testing set of size 30% of all available data, calculate (1.5) the Area under the ROC Curve and (1.5) the Area under the Precision-Recall Curve for both models in (3).
- Without considering the class label attribute, use k-means clustering to cluster the records into different clusters and visualize them (use k to be 5, 10, 15). (2.5)

What to submit?
- A code
- A document that adequately describes your approach, visualizations and other results
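To make the first task (class distribution and stratification) concrete, here is a minimal sketch of what I understand is being asked. It assumes scikit-learn and uses synthetic data from make_classification as a stand-in for the attached data file, so the class weights, the feature count and the helper name class_shares are my own assumptions and the printed numbers mean nothing for the real data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic stand-in for the attached CTG file (2126 records, 3 classes).
# The weights and the feature count are arbitrary; replace X, y with the real data.
X, y = make_classification(
    n_samples=2126, n_features=21, n_informative=10, n_classes=3,
    weights=[0.78, 0.14, 0.08], random_state=0,
)

def class_shares(labels):
    """Return the fraction of records in each class, ordered by class label."""
    _, counts = np.unique(labels, return_counts=True)
    return counts / counts.sum()

# stratify=y keeps the class proportions (almost) identical in both splits,
# so the rare classes are not lost from the 30% test set by chance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

print("all  :", class_shares(y).round(3))
print("train:", class_shares(y_train).round(3))
print("test :", class_shares(y_test).round(3))
```

The shares from class_shares(y) are what a bar chart of the three classes would display. If one class clearly dominates, the data is imbalanced, and the stratified split is one way to keep that imbalance from distorting the train and test sets.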
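For the second task (10 most reflective features and their significance), one possible approach is to rank features by the absolute Pearson correlation with the label and compare each value against the two-sided critical value at the 95% and 90% levels. This is only a sketch under my own assumptions: it needs SciPy and scikit-learn, it runs on synthetic stand-in data, and critical_r is a helper name I made up. Other selection methods are equally acceptable per the assignment.

```python
import numpy as np
from scipy import stats
from sklearn.datasets import make_classification

# ASSUMPTION: synthetic stand-in for the attached CTG file; replace X, y with the real data.
X, y = make_classification(
    n_samples=2126, n_features=21, n_informative=10, n_classes=3,
    weights=[0.78, 0.14, 0.08], random_state=0,
)
n = len(y)

def critical_r(alpha, n):
    """Two-sided critical |r| for a Pearson correlation computed from n samples."""
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t_crit / np.sqrt(t_crit**2 + n - 2)

# Pearson correlation of every feature with the (numeric, progressive) label.
r = np.array([stats.pearsonr(X[:, j], y)[0] for j in range(X.shape[1])])

# Indices of the 10 features with the largest absolute correlation.
top10 = np.argsort(-np.abs(r))[:10]

r95 = critical_r(0.05, n)  # 95% confidence level
r90 = critical_r(0.10, n)  # 90% confidence level
for j in top10:
    print(f"feature {j:2d}: r = {r[j]:+.3f}  "
          f"significant at 95%: {abs(r[j]) > r95}  at 90%: {abs(r[j]) > r90}")
```

A correlation counts as statistically significant at a given level when its absolute value exceeds the critical value for that level, which is equivalent to its p-value being below 0.05 or 0.10.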
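For the modelling tasks (two models, confusion matrix, ROC and Precision-Recall areas on a 30% test set), the following is a rough sketch of the kind of code I am hoping for. The choice of logistic regression and random forest is my own assumption, since the assignment does not name the models, and the data is again a synthetic stand-in. Average precision is used here as the usual summary of the area under the Precision-Recall curve.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, label_binarize

# ASSUMPTION: synthetic stand-in for the attached CTG file; replace X, y with the real data.
X, y = make_classification(
    n_samples=2126, n_features=21, n_informative=10, n_classes=3,
    weights=[0.78, 0.14, 0.08], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# MODEL 1: logistic regression (scaled inputs). MODEL 2: random forest.
# class_weight="balanced" is one way to compensate for class imbalance.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(max_iter=1000, class_weight="balanced"),
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
}

classes = np.unique(y)
# One-vs-rest indicator matrix of the true labels, needed for the PR summary.
y_test_bin = label_binarize(y_test, classes=classes)

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)  # shape (n_test, 3), rows sum to 1
    results[name] = {
        "confusion_matrix": confusion_matrix(y_test, model.predict(X_test), labels=classes),
        # Macro-averaged one-vs-rest area under the ROC curve.
        "roc_auc": roc_auc_score(y_test, proba, multi_class="ovr", average="macro"),
        # Macro-averaged average precision (area under the PR curve summary).
        "pr_auc": average_precision_score(y_test_bin, proba, average="macro"),
    }
    print(name)
    print(results[name]["confusion_matrix"])
    print(f"ROC AUC = {results[name]['roc_auc']:.3f}, PR AUC = {results[name]['pr_auc']:.3f}")
```

Each stored confusion matrix has true classes on the rows and predicted classes on the columns. It can be drawn as a heatmap, for example with scikit-learn's ConfusionMatrixDisplay, to satisfy the "visually present" requirement.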
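For the last task (k-means with k = 5, 10, 15, ignoring the class label), a possible sketch is shown here. It assumes scikit-learn, uses synthetic stand-in data, and projects the records to two dimensions with PCA purely so they can be plotted. The PCA step and the variable names coords and cluster_labels are my own assumptions, not something the assignment prescribes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the attached CTG file. The label is discarded
# on purpose, because the clustering must not see the class attribute.
X, _ = make_classification(
    n_samples=2126, n_features=21, n_informative=10, n_classes=3, random_state=0
)

# k-means relies on Euclidean distance, so features are standardized first;
# otherwise features with large numeric ranges would dominate the clusters.
X_scaled = StandardScaler().fit_transform(X)

# 2-D projection used only for visualization, not for the clustering itself.
coords = PCA(n_components=2, random_state=0).fit_transform(X_scaled)

cluster_labels = {}
for k in (5, 10, 15):
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    cluster_labels[k] = km.fit_predict(X_scaled)
    sizes = np.bincount(cluster_labels[k], minlength=k)
    print(f"k={k:2d}  inertia={km.inertia_:.0f}  cluster sizes={sizes.tolist()}")
```

To visualize the result, coords[:, 0] can be plotted against coords[:, 1] in one scatter panel per k, with the points coloured by cluster_labels[k].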