Please check the doc below.
Assignment

Step 0. Save the following RDS file to your R working directory:
https://dl.dropboxusercontent.com/s/8j886jtexoeiv1v/imdb.rds

Step 1. Preprocess the IMDB data using the following code:

    library(keras)
    library(tidyverse)
    set.seed(123)
    n_sample <- 3000; maxlen <- 200; max_features <- 3000
    imdb = read_rds("imdb.rds")
    c(c(x_train, y_train), c(x_test, y_test)) %<-% imdb  # loads the data
    x_train <- pad_sequences(x_train, maxlen = maxlen)
    x_test <- pad_sequences(x_test, maxlen = maxlen)
    sample_indicators = sample(1:nrow(x_train), n_sample)
    x_train <- x_train[sample_indicators,]  # use a subset of reviews for training
    y_train <- y_train[sample_indicators]   # use a subset of reviews for training
    x_test <- x_test[sample_indicators,]    # use a subset of reviews for testing
    y_test <- y_test[sample_indicators]     # use a subset of reviews for testing

Step 2. Use x_train and y_train to fit the following deep learning models:
1. Simple RNN
2. LSTM
3. GRU
4. Bidirectional LSTM
5. Bidirectional GRU
6. 1D convnet

You can decide the parameters for the network structure (e.g., units, number of layers, etc.) and for model training (e.g., epochs, batch_size, and validation_split). However, you need to find parameters such that the accuracy of each trained model on the testing data is at least 0.6.

Step 3. Save the following files:
- Save each of these fitted models to an h5 file
- Save the history of each model to an RDS file (see write_rds)
- Save x_test and y_test to RDS files

Step 4. Save the R code you used for Steps 1 to 3 to an R file.

Step 5. Compress all the output files from Step 3 to a zip file.

Step 6. Use R Markdown to achieve the following:
1. Specify author, date, and title in the YAML metadata of your document
2. Read all the output files from Step 3 (x_test, y_test, 6 fitted model files, and 6 training history files)
3. Use x_test and y_test to show the following statistics:
   - Number of reviews in the test set
   - Number of positive reviews in the test set
   - Number of negative reviews in the test set
4. For each model:
   - Show the model summary
   - Plot the training history (do not train your model in R Markdown!)
   - Evaluate the performance of the model using the test set
5. Summarize the performance of the different models using a table. Columns include:
   - model_name
   - acc: overall accuracy of the predictions in the test set
   - n_tp: number of true-positive predictions in the test set
   - n_tn: number of true-negative predictions in the test set
   - n_fp: number of false-positive predictions in the test set
   - n_fn: number of false-negative predictions in the test set
6. Discuss what you found from the table

Step 7. Knit the R Markdown file (.Rmd) to an HTML file.

Step 8. The R, Rmd, HTML, and zip files must follow the naming rule below:
Assignment3-YourLastName.FileExtension
For example:
- Assignment3-Lin.R
- Assignment3-Lin.Rmd
- Assignment3-Lin.html
- Assignment3-Lin.zip

Step 9. Submit the R, Rmd, HTML, and zip files (individually) to iCollege. Due by the beginning of next class.

Extra credit:
- The student who has the best report (determined by the instructor) will be given 5 extra points towards the final grade
- Submissions that are too similar will not be considered for the extra credit
- Accuracy of the models plays a significant role for this extra credit

Grading is based on the submitted files on iCollege. Do not wait until the last minute before the deadline. You will lose 10 points for a late submission. You will receive 0 points if you submit your assignment via email.

Grading is based on the following:
- Whether all required files were submitted to iCollege on time, following the naming rule
- Whether the Rmd file is syntactically correct and can render the HTML file
- Whether the report has a professional format and style (succinct and yet provides adequate and clear discussions)
- Whether the report meets the requirements specified in Step 6
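
The table columns requested in Step 6 (acc, n_tp, n_tn, n_fp, n_fn) can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper name confusion_counts, the 0.5 cut-off, and the toy vectors are hypothetical and not part of the assignment, and the label 1 is assumed to mean a positive review.

```r
# Hypothetical helper: builds one row of the Step 6 performance table.
# Assumes y_true is a 0/1 vector (1 = positive review, an assumption) and
# prob holds the model's predicted probability of the positive class.
confusion_counts <- function(model_name, y_true, prob, threshold = 0.5) {
  y_pred <- as.integer(prob > threshold)  # assumed cut-off, not given in the assignment
  data.frame(
    model_name = model_name,
    acc  = mean(y_pred == y_true),           # overall accuracy
    n_tp = sum(y_pred == 1 & y_true == 1),   # true positives
    n_tn = sum(y_pred == 0 & y_true == 0),   # true negatives
    n_fp = sum(y_pred == 1 & y_true == 0),   # false positives
    n_fn = sum(y_pred == 0 & y_true == 1)    # false negatives
  )
}

# Toy data for illustration only (not real model output).
y_toy    <- c(1, 0, 1, 1, 0, 0)
prob_toy <- c(0.9, 0.2, 0.4, 0.7, 0.6, 0.1)
row_toy  <- confusion_counts("toy_model", y_toy, prob_toy)
print(row_toy)
```

With the real files, prob would presumably come from calling predict() on a model restored with load_model_hdf5(), and the six per-model rows could be stacked with rbind() to form the table.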