Australian Wine Importers (AWI) asked you to develop a data mining method of classifying imported...

Question

Australian Wine Importers (AWI) asked you to develop a data mining method of classifying imported wines based on: • Price category (as 5 equal size bins)
AWI provided you with a sample of 130,000 wine tasting results, which include:  Taster name and twitter handle;  Wine “title” (name + vintage);  Country, Province and Region;  Variety and Winery;  Description and Designation (text data);  Price (US$) and Points (taster’s rating).

As there are great many tasting results, AWI would like to get the preliminary insight into the wine’s origin and its marketability. The following questions are of interests to AWI: A) What is the best source of wine in the optimum price-rating ratio? and, B) What is the expected price range category of newly imported wines? C) Can all wine tasters in the data set be trusted with their tasting results?
AWI wants you to cleanup and explore wine tasting data, develop and evaluate a classifier to determine the price range for new wines, and to minimize classification errors.
In technical terms: Your project objectives form a learning portfolio. The first objective (LP1) is to acquire and explore the available data, visualise and report any significant characteristics of non-text data, as well as, prepare the data for further processing. The second objective (LP2) is to create a classification system able to answer management questions using only non-text data. Text processing will be featured in assignment A2. Reports in PDF format and models developed in LP1 and LP2 in ZIP archives are to be submitted via CloudDeakin by their respective deadlines.

pag-a1-assignment-rm-info-v3-2-utuvyyur.pdf

Anirban · Accepted Answer

Load data.
      
       
         
           
             
             
               
               
               
               
               
            
             
               
                 
                 
                 
                 Define the target column for the predictive model.
              
               
               
               
               
               
               
            
             Should define a target column?
          
           
             
             
               
               
               
               
               
            
             
               
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 Discretize by binning (same range per bin).
              
               
               
               
               
               
               
            
             
               
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 Discretize by frequency (same count per bin).
              
               
               
               
               
               
               
            
             Should discretize numerical target column?
          
           
             
             
               
               
               
               
               
            
             
               
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 
                 Map some nominal target values to new values.
              
               
               
               
               
               
               
            
             Should map nominal values?

Sun	Mon	Tue	Wed	Thu	Fri	Sat
30	31	1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	1	2	3

Australian Wine Importers (AWI) asked you to develop a data mining method of classifying imported wines based on: • Price category (as 5 equal size bins) AWI provided you with a sample of 130,000 wine...

Answer To: Australian Wine Importers (AWI) asked you to develop a data mining method of classifying imported...

Answer To This Question Is Available To Download

Related Questions & Answers

Submit New Assignment