Assignment Goals The goal of this assignment is to write a class that 'learns' from a file of movie reviews (with a 0-4 score and some text), and uses that information to predict what score a new...

1 answer below »

Assignment Goals


The goal of this assignment is to write a class that 'learns' from a file of movie reviews (with a 0-4 score and some text), and uses that information to predict what score a new review might get.


From a technical point of view, this assignment practices structuring and developing a class, and builds on examples of using HashMaps and ArrayLists.


Getting Started


To start, download the files in thisfolder(Links to an external site.). It contains a Java file and two text files.


In Eclipse, make an a7 package. Add the three files to this package.


How the Predictor Works


The predictor reads in a files of reviews. Each review has a 0-4 score as the first item on each line.


When the predictor sees a word, it associates that word with the score of the review it is in. For example, the word "good" might be in a 3 star review, so we want to associate good with this 3 value.


However, the word good may appear in multiple reviews, or even multiple times in a single review. "Good" may appear in a review like "1 good idea, but bad movie". Rather than trying to understand the context of its use, the predictor takes the simple approach of averaging all the scores it is associated with. If "good" appears in a 2-star review, a 1-star review, and four 3-star reviews, the predictor adds up all those scores (2 + 1 + 3 + 3 + 3 + 3 = 15) and the number of times it appears (6) to calculate an average value of 15/6, or 2.5.


This process is repeated for all words. One adjustment to make is that average words (those with a value around 2) are ignored. For example, "a" and "the" appear in a lot of reviews, but don't really contribute to the meaning. They also tend to overwhelm the values of the interesting words. So the predictor ignores those. The code has some comments on how to approach this.


Finally, given a new review that needs a predicted score, each word in the review is checked for its value. The total word value is added up and then an average for the review is found. As an example, the predictor may have learned the following associations from the set of reviews: { "good" = 3, "bad" = 1, "wonderful" = 4, "film" = 3}. Then, given a review "a good film", the predictor would ignore "a" and add up the values of "good" and "film" to get a total score of 6. Since 2 words were used to get that total, the average is 3, which is the predicted score. If the review were a real review with a score, we can compare the predicted and actual score to see how the approach is working.


The Structure of the Starter Code


The beginning of a class structure is provided to you. You should follow this structure. There are Javadocs describing what each method does. In addition, there are comments giving the outline of a solution approach.


Here is the basic design of the class. It is intended that an instance of this class will have learned word values from a file and will store those values in a wordValue instance variable that maps between a word key and a score value.


The constructor's job is to make a ready-to-go predictor. It takes a filename in the constructor, and calls methods to read the file and build the HashMap.


The main method drives the overall program. It makes an instance of the predictor class, then reads in a file of reviews to test and print out the predicted and actual review scores.


Other methods need to be implemented as described in their Javadocs and comments.


This assignment closely follows the example of the vote counting program in the Week 9 recitation. You are urged to look carefully at that code and to adapt it as needed for this assignment.


Some variables are instance variable. The wordValue HashMap gets repeatedly used as each review asks for a prediction. Other data in the program is transitory - it gets used to build up the final word value, but it does not need to persist with each instance. In this case, the intent is for you to make local variables as needed in the methods.


Finally, note that the linesFromFile method is static. It has more of a helper status in the class, and doesn't need access to the instance variables.


And definitely finally, some of the methods are public that are not called in main. This suggests that they could have been made private, as they are only called from other methods of the class. I have kept them public as an aid to testing during submission.


Developing your program


My suggestion for developing the program is to:



  1. Review the vote counting recitation. There is a connection between counting the votes associated with a candidate and totaling the scores associated with a word.

  2. Work through an example by hand. Use the smallReviews.txt to make a totalScores HashMap, a wordCount HashMap, and then from those make a wordValue HashMap.

  3. Implement the code and check each method with the results you expect from doing it by hand.

  4. Try it on the full reviews file.

  5. Study the predicted results and actual scores.


At the very bottom of the java file, add a block comment and write a short paragraph describing how well you think the predictor works, any trends you see, and any explanation you have as to why the prediction and the actual score might not match.


You do not need to modify comments, but they should be in the correct place in the code.


Some example output from learning the smallReviews.txt and testing with that is


Word totals: {a=4, pretty=3, movie=7, bad=0, excellent=4, decent=3}


Word counts: {a=2, pretty=1, movie=3, bad=1, excellent=1, decent=1}


{pretty=3.0, movie=2.3333333333333335, bad=0.0, excellent=4.0, decent=3.0}


Predicted: 3.2 actual: 4 a excellent movie


Predicted: 1.2 actual: 0 a bad movie


Predicted: 2.8 actual: 3 pretty decent movie


Submitting


Submit just the java file with the working predictor. Make sure to remove the debugging prints I have in the computeScoreAndCounts method. Make sure you have some explanatory text at the bottom of the file. Add your name with an additional @author tag to the top.

The following content is
Answered Same DayNov 01, 2021

Answer To: Assignment Goals The goal of this assignment is to write a class that 'learns' from a file of movie...

Arun Shankar answered on Nov 04 2021
155 Votes
package a7; //
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Scanner;
/**
* This class reads in a file of movie reviews. From that it "learns" wha
t words
* are associated with good reviews and what are associated with poor reviews.
* It can then predict the score for a new review that contains those words.
*
* @author Prof. David E. Johnson
*
*/
public class MovieReviewPredictor {
    // The wordValue maps a word to its average score from the reviews.
    HashMap wordValue;
    /**
     * Construct a new MovieReviewPredictor by reading in a file of reviews with
     * scores and using that to create a mapping between a word and its score. At
     * the end of the constructor, the wordValue HashMap should be filled in and
     * ready to be used.
     *
     * A helpful message should be printed and then System exit called if the file
     * is not found.
     * @throws FileNotFoundException
     */
    public MovieReviewPredictor(String filename)
    {
        // Read in the lines from the filename.
        ArrayList lines;
        try
        {
            lines = linesFromFile(filename);
            
            // Get the word values from the lines and store them in wordValue
            // using the appropriate class method.
            
            wordValue = new HashMap();
            computeWordValue(lines);
        }
        catch (FileNotFoundException e)
        {
            e.printStackTrace();
        }
    }
    /**
     * Read the lines from the file and stores each in an ArrayList. Each line
     * should be processed as follows: - the line is set to all lower case using the
     * String toLowerCase method - punctuation is removed by removing all characters
     * that are not a through z or 0 through 9. This can be done using the String
     * replaceAll method, like
     * String newString = string.replaceAll("[^a-z0-9 ]", "");
     *
     * @param filename the name of the file to read
     * @return an ArrayList of the lines with punctuation removed and made all
     * lowercase.
     * @throws FileNotFoundException
     */
    public static ArrayList linesFromFile(String filename) throws FileNotFoundException {
        ArrayList lines = new ArrayList<>();
        File file = new File(filename);
        Scanner s = new Scanner(file);
        while (s.hasNextLine()) {
            String line = s.nextLine().toLowerCase();
            line = line.replaceAll("[^a-z0-9...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here