Advanced Database: Please solve it using Hadoop


Assume that you have the following relations, each of which represents a dataset of
text files stored on HDFS.
1. ratings ( UserID, MovieID, Rating ) // where Rating is the rating (from 1 to 5)
given by the user to the corresponding MovieID
2. users ( UserID, Gender, Age )
3. movies ( MovieID, Title, Genres ) // where Genres is the classification of the
movie, such as comedy, children, action, ….
Suppose you have been given the task of finding the average rating for each movie in
the form (movieID, Title, avg_rating). Computing the average rating must take
the following into account:
4. only children and comedy movies
5. only rating values that are above 2
6. only ratings from users whose age is above 25
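
To make the three conditions concrete, here is a small plain-Python reference computation (not MapReduce) on made-up toy records. It is only a sketch under stated assumptions: Genres is taken to be a "|"-separated string, a movie is taken to qualify if it lists children or comedy (either one), and "above" is read as strictly greater than. The function name reference_average and all data values are hypothetical.

```python
# Toy records; every value here is made up for illustration only.
ratings = [(1, 10, 5), (1, 11, 4), (2, 10, 3), (3, 10, 2), (4, 10, 4), (2, 12, 5)]
users = [(1, "F", 30), (2, "M", 40), (3, "M", 50), (4, "F", 20)]
movies = [
    (10, "Toy Story", "Children|Comedy"),
    (11, "Heat", "Action"),
    (12, "Clue", "Comedy"),
]

# Assumption: a movie qualifies if it has at least one of these genres.
WANTED = {"children", "comedy"}


def reference_average(ratings, users, movies):
    """Return sorted (MovieID, Title, avg_rating) tuples for qualifying ratings."""
    # Condition 6: keep only users whose age is strictly above 25.
    old_enough = {uid for uid, _gender, age in users if age > 25}
    # Condition 4: keep only movies tagged children or comedy.
    titles = {
        mid: title
        for mid, title, genres in movies
        if WANTED & {g.lower() for g in genres.split("|")}
    }
    totals = {}  # MovieID -> (sum of ratings, count of ratings)
    for uid, mid, rating in ratings:
        # Condition 5: keep only ratings strictly above 2.
        if rating > 2 and uid in old_enough and mid in titles:
            s, n = totals.get(mid, (0, 0))
            totals[mid] = (s + rating, n + 1)
    return sorted((mid, titles[mid], s / n) for mid, (s, n) in totals.items())
```

On this toy data, only the ratings (1, 10, 5), (2, 10, 3) and (2, 12, 5) survive all three filters, so movie 10 averages 4.0 and movie 12 averages 5.0.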
What to submit:
• First, briefly describe how to implement the above task as MapReduce jobs in an
efficient way
• Specify how many jobs you need
• State the purpose of each MapReduce job (what it does)
• Describe what the Map and Reduce functions do in each MapReduce job. Write that in
pseudocode similar to what we did in the lectures on relational algebra in
MapReduce
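
One possible design, sketched here as a local Python simulation rather than real Hadoop code, uses two jobs. Job 1 is a reduce-side join of ratings with users on UserID, with the rating and age filters applied in the mapper so less data is shuffled. Job 2 is a reduce-side join of the surviving ratings with movies on MovieID, with the genre filter in the mapper and the average computed in the reducer. The helper run_job, the record tags ("U", "R", "M", "kept") and the toy data are all hypothetical. The sketch assumes records are already parsed into tuples, Genres is "|"-separated, and a movie qualifies if it lists children or comedy.

```python
from collections import defaultdict

WANTED = {"children", "comedy"}  # assumption: either genre qualifies


def run_job(records, mapper, reducer):
    """Local stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: join ratings with users on UserID, filtering early in the mapper.
def map_join_users(record):
    source, fields = record
    if source == "users":
        user_id, _gender, age = fields
        if age > 25:
            yield user_id, ("U", None)
    elif source == "ratings":
        user_id, movie_id, rating = fields
        if rating > 2:
            yield user_id, ("R", (movie_id, rating))


def reduce_join_users(user_id, values):
    # Emit this user's ratings only if the user passed the age filter.
    if any(tag == "U" for tag, _ in values):
        for tag, payload in values:
            if tag == "R":
                yield ("kept", payload)  # payload is (movie_id, rating)


# Job 2: join surviving ratings with movies on MovieID, then average.
def map_join_movies(record):
    source, fields = record
    if source == "movies":
        movie_id, title, genres = fields
        if WANTED & {g.lower() for g in genres.split("|")}:
            yield movie_id, ("M", title)
    elif source == "kept":
        movie_id, rating = fields
        yield movie_id, ("R", rating)


def reduce_average(movie_id, values):
    titles = [p for tag, p in values if tag == "M"]
    scores = [p for tag, p in values if tag == "R"]
    if titles and scores:  # movie passed the genre filter and has ratings
        yield (movie_id, titles[0], sum(scores) / len(scores))


def average_ratings(ratings, users, movies):
    job1_in = [("users", u) for u in users] + [("ratings", r) for r in ratings]
    kept = run_job(job1_in, map_join_users, reduce_join_users)
    job2_in = [("movies", m) for m in movies] + kept
    return run_job(job2_in, map_join_movies, reduce_average)


# Toy data, made up for illustration only.
result = average_ratings(
    ratings=[(1, 10, 5), (1, 11, 4), (2, 10, 3), (3, 10, 2), (4, 10, 4), (2, 12, 5)],
    users=[(1, "F", 30), (2, "M", 40), (3, "M", 50), (4, "F", 20)],
    movies=[(10, "Toy Story", "Children|Comedy"), (11, "Heat", "Action"), (12, "Clue", "Comedy")],
)
```

If the users and movies relations are small enough to fit in memory on each mapper, an alternative worth considering is a single job that loads both as lookup tables in the mapper (a map-side join) and averages in the reducer. Whether that applies depends on dataset sizes the assignment does not state.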



Jun 08, 2022