Using SAS to do analytics
HMBA/MABS902 – Assignment 1: Hollywood Movie

Description and goal

In this assignment you will provide data analysis within the context of a business application. The specifications below indicate what you need to produce, but not how to produce it.

MoCo is an organisation investing in movies. Using a multitude of factors, the managers want to:

1. Have a better understanding and visualisation of the data through an interactive dashboard
2. Evaluate the potential success or revenues of future movie projects.

Use your business analytics skills and SAS Visual Analytics to answer those questions.

Data - The HOLLYWOODMOVIEDATASET_IM data set

The data set, available in SAS Viya, was obtained from several movie databases using both automated and manual means. It is more than likely that some of the values were captured or entered incorrectly. Hence, the accuracy of the data set cannot be guaranteed. This data set can be used for descriptive and predictive modelling. The data dictionary is provided below.

| Variable | Definition | Possible values |
|---|---|---|
| MovieID | A unique identifier for the movie. | An integer number |
| StarValue_Director; StarValue_Producer; StartValue_Cast | Signifies the star value of the director, the producer and the cast (as per recent past box-office success). | An ordinal category from 1 (lowest) to 5 (highest) |
| OriginalScreePlay | The movie is based on an original screenplay. | Yes/No |
| Genre_CAT | Specifies the content category CAT the movie belongs to. A movie can be classified in more than one content category (e.g. action and comedy). Therefore, each content category is represented with a separate binary variable: Action, Adventure, Animation, Biography, Comedy, Crime, Drama, Family, Fantasy, History, Mystery, etc. | Binary variable (1 = Yes, 0 = No) |
| Competition | Indicates the level at which each movie competes for the same pool of entertainment dollars against movies released at the same time. | High, Medium, Low |
| MPAA Rating | The rating assigned by the Motion Picture Association of America. | G, PG, PG13, R, NR |
| MaxScreenCount | Indicates the number of screens the movie is expected to be shown on at its debut. | An integer number |
| BoxOfficeClass | Box-office success category. | An integer from 1 (flop) to 9 (blockbuster) – see table below |
| GrossBoxOffice | Box-office gross revenue in theatres. | An integer number |
| ShortStoryLine | A short textual description of the script/story. | A few sentences |
| EstimatedBudget | Estimated movie budget (this field has values for only a subset of the movies). | An integer number |
| MovieLength | The number of minutes the movie runs (this field has values for only a subset of the movies). | An integer number |
| YEAR | Year the movie was released, coded as a measure (numerical) value. | An integer number |
| YearEnd | Year the movie was released, coded as a categorical value. | A timestamp |
| SpecialEffect | Specifies the amount of special effects in the movie. | An ordinal category ranging from 1 (lowest) to 5 (highest) |

The following table shows the breakpoints/bins used to convert the gross box-office revenues to one of nine success categories.

| Class no | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Range (millions $) | <1 | >1 <10 | >10 <20 | >20 <40 | >40 <65 | >65 <100 | >100 <150 | >150 <200 | >200 |

Additional information

• Regressions can be useful in this context (but you can use other models)
• Report (PDF format) to be submitted to the Moodle site before the due date.
• An export of your interactive (SAS) dashboard should also be attached to your report
• Due date: 4th of October, before 11:30pm
• Maximum 2,500 words (excluding illustrations and appendices). Keep it simple (no need for references or an executive summary).
• Do not forget to put your name on the first page
• Individual assessment
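
The breakpoint table that converts gross box-office revenue into the nine BoxOfficeClass categories can be sketched in a few lines of Python. This is only an illustrative cross-check outside SAS Visual Analytics, not part of the required deliverable. The function name `box_office_class` and the constant `BREAKPOINTS_MILLIONS` are hypothetical. The sketch assumes revenue has already been converted to millions of dollars, and that a value exactly on a breakpoint belongs to the higher class (the brief does not say which side such values fall on).

```python
from bisect import bisect_right

# Breakpoints (millions of $) separating classes 1..9, taken from the brief's bin table.
BREAKPOINTS_MILLIONS = [1, 10, 20, 40, 65, 100, 150, 200]


def box_office_class(gross_millions: float) -> int:
    """Return the success class (1 = flop ... 9 = blockbuster) for a gross in millions of $.

    Assumption: a gross exactly equal to a breakpoint is placed in the higher
    class, because the brief leaves boundary values unspecified.
    """
    if gross_millions < 0:
        raise ValueError("gross revenue cannot be negative")
    # bisect_right counts how many breakpoints are <= the value;
    # adding 1 turns that count into a class number from 1 to 9.
    return bisect_right(BREAKPOINTS_MILLIONS, gross_millions) + 1


if __name__ == "__main__":
    for gross in (0.4, 15, 64.9, 250):
        print(gross, "->", box_office_class(gross))
```

Because the brief warns that some values may have been entered incorrectly, a mapping like this could be used to compare the class implied by GrossBoxOffice with the recorded BoxOfficeClass and flag rows where the two disagree.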