Please open the PDF file named "CS380 Assignment 4.pdf" for the assignment instructions and details. This assignment has to be coded in Python. A zip folder with the base code in Python is provided as well; you will need to build on this code according to the assignment instructions.


Any code written should be original and will be checked through MOSS and other plagiarism detectors. Please make sure that all work is original. Also make sure that any libraries imported for use are allowed based on the assignment instructions.


CS380: Artificial Intelligence
Assignment 4: Reinforcement Learning
Frogger (20 pts)

Frogger [https://en.wikipedia.org/wiki/Frogger] is an arcade game from the 1980s that has been ported to many different home gaming systems and is still available to play online (e.g., https://froggerclassic.appspot.com). In the game, you control a frog that is trying to cross a busy road and a busy river to arrive at a home row at the top of the screen. On the road, the frog must avoid the moving cars; at the river, the frog must jump between the floating logs and turtles to avoid falling into the water. There is a 30-second timer during which the frog must complete its journey, otherwise it dies and the frog is regenerated at the bottom of the screen. Each time the frog reaches the home row, it is awarded a number of points proportional to the time left on the timer.

In this assignment, you will develop a reinforcement-learning (RL) agent that plays Frogger. The agent will play within a simplified version of the game that runs in Python and includes many aspects of the original game. (A sample screenshot appears in the assignment PDF.)

This assignment, like others for this class, is an individual assignment that will be tested and graded in the usual way. As a bonus, we also plan to conduct a mini-tournament between all the submitted Frogger agents; the details will be announced later, but please keep in mind that your code will be embedded in this tournament, and thus all students must carefully follow the guidelines below to ensure that their agent can compete successfully in the tournament.

Setup

To start the assignment, please download the given code, which includes two main components: the frogger module that implements the game itself, and the agent module which houses the agent code. At the top level, there is a main.py file that accepts command-line arguments and runs the simulation.

The game module uses the Python Arcade library [https://arcade.academy], which provides a base engine on which to build the game itself. For this reason, you will first need to install this library on your machine (or, alternatively, a virtual machine if you prefer). This should be straightforward by entering:

> pip3 install arcade==2.3.15

You can test out the game by entering:

> python3 main.py --player=human

You should be able to control the frog with the arrow keys, and eventually quit the game by pressing 'q' or the escape key.

While we are hopeful that everyone will be able to run the full system with graphics, Python Arcade can be dependent on your specific setup. If you have issues, you might check their website [https://arcade.academy] for further installation instructions specific to your machine. Nevertheless, if you do not get the graphics working in the end, there is a backup plan: our game provides a way to print the state to the terminal with no separate window or graphics. To use text-only output, simply include the --output=text flag in the command line when running an agent, for example:

> python3 main.py --player=agent --steps=10 --output=text

(Unfortunately, the text-only output version of the game runs only with agent control, and does not allow for human control via the keyboard.) The full suite of command-line options will be detailed later in this document. Note that, even if you can run the graphics version, you may choose to use text-only output for faster training, since the text output is noticeably faster than the graphics when running at full speed.

For this assignment, all your own code and files must be enclosed within your agent module; you should not modify main.py or the frogger module (except if you need text output as noted above). As before, except for the Python Arcade library (and its dependencies), you may only use built-in standard libraries (e.g., math, random, etc.); you may NOT directly use any external libraries or packages, including those installed with Arcade (e.g., numpy).

Frogger Module

The frogger module includes all the code and other files (e.g., sprite images) needed to run the game. In general, you will not need to understand the details of this module; in fact, in many applications of AI/ML, the details of the underlying environment are unknown, opaque, or hidden behind a black-box gaming engine. However, it is important to understand the details of the API: the state information sent from the game to your agent, and the action information your agent needs to send back to the game.

When communicating with the agent, the API sends the game state to the agent with all the necessary information encoded in a string. The screen itself is encoded as a grid of characters where each character denotes a different texture. For example, the normal screen provided in the game can be represented with this string:

'EEEEEEEEEEEEEEEE|~~~KLLLM~~~KLLLM~~|TT~~TTTT~~~TT~~~~~|LLM~~~~KLLLLM~~~KL|SSSSSSSSSSSSSSSSSS|-----DD--------DD-|--AAA-------AAA---|------C---------C-|SSSSSSSSSSSSSSSSSS'

Here, the various characters represent different objects:
• E: grass in the home row (at the top of the screen)
• ~: water
• K, L, M: start, middle, end of a log
• T: turtle
• S: purple middle divider
• A, C, D: cars of different types
• -: road

In addition to the screen, the API string also includes a few other pieces of information: whether the frog has reached the home row; if it has, the score given for reaching the goal; and whether the timer has run out. You will not need to code the details of parsing this API string, however; this is provided for you in the agent base code, as described below.
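Although the parsing is provided for you, the row structure of the encoding is easy to see in the example above: rows of the screen are separated by '|' characters. As a rough illustration only (not the provided implementation), a screen string could be split and inspected like this:

# Illustration only: the provided State class already parses the API string.
# This sketch assumes nothing beyond the example above: rows separated by '|'.
def screen_rows(screen_str):
    """Split the screen portion of the state string into rows of cell characters."""
    return screen_str.split('|')

screen = ('EEEEEEEEEEEEEEEE|~~~KLLLM~~~KLLLM~~|TT~~TTTT~~~TT~~~~~|'
          'LLM~~~~KLLLLM~~~KL|SSSSSSSSSSSSSSSSSS|-----DD--------DD-|'
          '--AAA-------AAA---|------C---------C-|SSSSSSSSSSSSSSSSSS')
rows = screen_rows(screen)
print(rows[0])       # the home row of grass: 'EEEEEEEEEEEEEEEE'
print(rows[5][5])    # a car cell on a road row: 'D'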
Agent Module

The primary task in this assignment is to build out your agent module to implement an RL agent that plays Frogger. As a starting point, the given code includes two files, state.py and agent.py. The given code includes a sample agent that simply makes random moves. You can test out this agent by entering:

> python3 main.py --player=agent --steps=10

This command will run 10 steps of the simulation with the random agent and then terminate.

The State class in state.py parses the state API string and provides a number of useful variables: frog_x and frog_y for the position of the frog, at_goal which indicates whether the frog has reached the home row, score which gives the score achieved if at the goal, and is_done which indicates whether the frog is done for this episode (i.e., either reached the goal or failed to reach the home row before time expired). There are additional methods to check for legal positions on the screen and to get the cell character at a position on the screen. You should not need to modify state.py; most importantly, since your agent will eventually be tied into the tournament code, you must ensure that the API string parsing remains the same so that your agent will work within the tournament.

The file agent.py is where you will need to build your RL agent, specifically an agent that implements Q-learning. The details are provided in the sections below.

State Representation

The Q_State class provided extends the basic State class to add additional functionality needed for Q-learning. First, the reward() method shows how we might calculate a reward for a given state, giving state.score as a reward if the frog has reached the goal, and a negative reward if the state is "done" and the timer has run out.

The _compute_key() method in the Q_State class is critical to learning: it reduces the complex game state to something manageable from which the agent can learn and generalize. In essence, if the Q-table is implemented as a Python dictionary, this key is the string that will serve to index the state into this dictionary. There are many possible options for this key. On the one hand, if the key represents the entire screen, Q-learning would need to learn actions for every possible screen, which is clearly not feasible. On the other hand, if the key represents only (for example) the cell in front of the frog, it would be missing a great deal of context. The sample method provided uses the three cells in front of the frog; a rough sketch of such a key appears below.
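As a concrete but hypothetical illustration of this idea, a key built from the three cells in front of the frog might look roughly like the sketch below. The helper names is_legal() and cell() are placeholders standing in for the "legal position" and "cell character" methods mentioned above; the actual method names and coordinate conventions in the provided State class may differ.

# Sketch only -- not the provided implementation.
def compute_key(state):
    """Encode the three cells directly in front of the frog (toward the home row)."""
    y = state.frog_y - 1                     # assumes y decreases toward the home row;
                                             # adjust to the convention in your base code
    cells = []
    for x in (state.frog_x - 1, state.frog_x, state.frog_x + 1):
        if state.is_legal(x, y):             # hypothetical helper name
            cells.append(state.cell(x, y))   # hypothetical helper name
        else:
            cells.append('#')                # marker for off-screen positions
    return ''.join(cells)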
You should experiment with better options after you have implemented the basic Q-learning algorithm below.

Q-Learning

The Agent class in agent.py is where you will implement the actual learning algorithm. First, you should consider how to implement the Q-table itself, likely as a dictionary over state keys and secondarily over possible actions. Then, you will need to implement Q-learning within the choose_action() method. This method (which must keep this name for consistency with other agents) is the central method that receives the current state string and returns the action to be taken. It should construct an internal Q_State using the given state string, and then either train the Q-table (if training is on) or simply choose an action (if training is off). The possible actions, as included in state.py, are simply one of ['u', 'd', 'l', 'r', '_'] (for up, down, left, right, and none, respectively).

The critical piece of Q-learning to be implemented is the equation seen in lecture:

Q(s, a) ← (1 − α) Q(s, a) + α [r + γ max_a' Q(s', a')]

where α is the learning rate, γ is the discount factor, r is the reward, and s' and a' are the next state and the actions available in that state.

The choose_action() method will be called for each update of the state, and thus you will need to keep track of the previous state and action in order to properly implement this equation. Also, please think about the "exploration vs. exploitation" aspect of this algorithm discussed in class: while the agent may want to take the best action most of the time, it may be a good idea to allow for some probability of performing some other action (e.g., a random action) to allow the agent to explore unseen states.

The Agent class also provides methods for saving and loading the Q-table. For simplicity, we recommend saving the Q-table on every step of the simulation, ensuring that all training is saved incrementally. You can also back up your Q-tables in these files, and choose the best trained agent as your final submitted agent.
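Below is a minimal sketch of how these pieces could fit together inside choose_action(), assuming a Q-table stored as a dictionary of dictionaries and an epsilon-greedy policy. The constructor and attribute names shown here (alpha, gamma, epsilon, prev_state, prev_action), the .key attribute on Q_State, and the save() call are assumptions for illustration; keep the structure of the given agent.py and adapt the idea to it.

import random

ACTIONS = ['u', 'd', 'l', 'r', '_']

class Agent:
    def __init__(self, train=None, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.train = train           # training file name, or None when not training
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q_table = {}            # {state_key: {action: q_value}}
        self.prev_state = None
        self.prev_action = None

    def _q_values(self, key):
        # Create a row of zeros the first time a state key is seen
        return self.q_table.setdefault(key, {a: 0.0 for a in ACTIONS})

    def choose_action(self, state_string):
        state = Q_State(state_string)         # provided class; parses the API string
        q_now = self._q_values(state.key)     # 'key' attribute name is an assumption

        if self.train and self.prev_state is not None:
            # Q(s,a) <- (1 - alpha) Q(s,a) + alpha [r + gamma max_a' Q(s',a')]
            prev_q = self._q_values(self.prev_state.key)
            old = prev_q[self.prev_action]
            target = state.reward() + self.gamma * max(q_now.values())
            prev_q[self.prev_action] = (1 - self.alpha) * old + self.alpha * target
            self.save()                       # provided save method (name may differ)

        # Epsilon-greedy: mostly exploit the best known action, sometimes explore
        if self.train and random.random() < self.epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q_now[a])

        self.prev_state, self.prev_action = state, action
        return action

A complete agent would also reset prev_state and prev_action when an episode ends (i.e., when state.is_done is true), so that an update never bridges two episodes.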
Command-Line Interface

The code provided includes a command-line interface with many options. The general command is:

> python3 main.py [options]

The possible options are as follows:
• --player=<player>: if player is 'human', the game runs with human input; otherwise, player should be 'agent' (default) and the game will run using the agent
• --screen=<screen>: screen should be one of 'medium' (default), 'hard', or 'easy' to denote environments of different difficulty
• --steps=<steps>: steps indicates the number of simulation steps to take; if not provided, the simulation will run until manually terminated (default)
• --train=<file>: train provides the filename of the training file that contains the Q-table, to be saved in the train/ subdirectory; if not provided, the simulation runs without training (default)
• --speed=<speed>: speed can be either 'slow' for real-time simulation, or 'fast' for training purposes
• --restart=<row>: restart is a number between 1 and 8 indicating the row at which the simulation should restart the frog for each episode; this is useful for training to keep the frog near the home row for initial training
• --output=<output>: output is either 'graphics' for a full graphics window using Python Arcade (default), or 'text' for text-only output to the terminal

Like other aspects of this code, you should not modify the command-line interface, since we will rely on a similar structure for testing and in the tournament.

Training and Testing

Once your state representation and learning algorithm are implemented, you can begin trying to train your Q-table. There are many ways to accomplish this, and you should feel free to experiment with various state representations and training regimens to see what works best.

You might start by training it to simply make the final move from the top-most river row to the home row. For example, this command

> python3 main.py --player=agent --train=q --screen=medium \
    --steps=500 --restart=1

uses the medium-difficulty screen to train the agent with 500 simulation steps. At each episode, the frog restarts at row 1 (the top-most river row below the home row). If you have used the save() method, the resulting Q-table will be saved in train/q.json as a JSON file; you should be able to manually inspect this JSON file to see the various entries of the Q-table. It may help to watch the training in real time, using --speed=slow, and once it's clear that things are working properly, speed it up using --speed=fast.

You might then continue by training your agent incrementally so that, eventually, it is able to navigate to the home row from the starting position. If your agent is having trouble finding the home row, you might train it first on higher rows (like above) to get it working for those rows before training lower rows. Also, you might consider creating a training script that runs through a training regimen automatically (e.g., from higher to lower rows, and using all three boards for generalizability).
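For example, a training script along these lines, using only the command-line options documented above, could run a regimen automatically; the step counts and ordering here are arbitrary illustrative choices, not requirements:

# train_regimen.py -- example training driver (not part of the given code).
# Runs main.py repeatedly, starting the frog close to the home row first and
# gradually moving the restart row back, across all three screens.
import subprocess

for restart in range(1, 9):                  # rows 1 (near the home row) through 8
    for screen in ('easy', 'medium', 'hard'):
        subprocess.run([
            'python3', 'main.py',
            '--player=agent',
            '--train=q',                     # Q-table saved under train/
            '--screen=' + screen,
            '--steps=2000',                  # arbitrary budget per stage
            '--restart=' + str(restart),
            '--speed=fast',
            '--output=text',                 # text mode is faster for training
        ], check=True)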
When your agent has been sufficiently trained, this command would test your agent on the easy screen, viewable in real time:

> python3 main.py --player=agent --screen=easy --steps=100

If this is working, you could try running your agent on the 'medium' and 'hard' screens, perhaps using --speed=fast to rapidly test it and see whether it reaches the home row. In general, you should be able to train an agent that does the 'easy' screen without too much difficulty. The 'hard' screen is indeed hard, but your agent should occasionally succeed with this screen as well (though perhaps not very often).

Tournament

We are planning to run a tournament at the end of the course where all the students' agents will compete against one another. The details of the tournament are still to be determined, but you can assume that we will import your agent module and play this agent against other agents, so you should make sure that the agent loads, by default, your best Q-table and any other settings needed to run. Also, especially if you have successfully completed the main part of this assignment, we encourage you to experiment with different state representations, training regimens, etc. to make your agent play as well as possible; the only hard requirement is that your agent use Q-learning as its central component (with no imported libraries, including those installed with Arcade).

Submission

For this assignment, running on tux.cs.drexel.edu might present a challenge, since the code normally uses a graphics window to display the game (except for the text-only output mode). Thus, you will probably just test this code on your local machine and then submit what you have.

For this assignment, you should submit all your files in the entire directory. As mentioned, your Python code will be located in the agent module, but please also include the trained Q-table (JSON) files and any training script(s) that you may have written. Please use a compression utility to compress your files into a single ZIP file (not RAR or any other compression format).

The final ZIP file must be submitted electronically using Blackboard; please do not email your assignment to a TA or instructor. If you are having difficulty with your Blackboard account, you are responsible for resolving these problems with a TA or someone from IRT before the assignment is due. If you have any doubts, complete your work early so that someone can help you.

Academic Honesty

Please remember that all material submitted for this assignment (and all assignments for this course) must be created by you, on your own, without help from anyone except the course TA(s) or instructor(s). Any material taken from outside sources must be appropriately cited.