

Data mining stock prices


Data mining is the process of sorting through large amounts of data and picking out relevant information. It is usually used by business intelligence organizations and financial analysts, but it is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods.


In this project, we want to do some preliminary data mining on the prices of a company’s stock. So that we can speak in specifics, let’s look at Google. Your program will calculate the monthly average prices of Google stock from 2004 to 2008, and tell us the 6 best and 6 worst months for Google. We provide the data-reading function; you write the other two functions and a main program that calls all three.


(a) First you need a history of stock prices. Go to finance.yahoo.com, enter Google in the search field, select “Historical Prices” (currently on the left of the page), and find the “download to spreadsheet” option. Choose to save the file in the folder where your Python program will be saved. The default name is “table.csv” so we will use that name. The file format is indicated by the first few lines:


Date,Open,High,Low,Close,Volume,Adj Close


2008-09-19,461.00,462.07,443.28,449.15,10006000,449.15


2008-09-18,422.64,439.18,410.50,439.08,8589400,439.08


(b) get_data_list(file_name)


The “csv” file is a “comma-separated values” file, so we can split the data on commas. The following function reads a file, splits each line on commas, and puts the data into a list that is returned. The result is a list of lists, where each line of the file is one inner list. Note that every item is a string. To read our file, call the function with our file name: get_data_list('table.csv'). Experiment with this function in the shell to get a sense of what is returned.


def get_data_list(file_name):
    data_file = open(file_name, "r")
    data_list = []  # start with an empty list
    for line_str in data_file:
        # strip end-of-line, split on commas, and append items to list
        data_list.append(line_str.strip().split(','))
    return data_list
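If you want to see the shape of the return value before downloading anything, the following minimal sketch may help. It writes the three sample lines shown earlier to a temporary file and reads them back. The temporary file, the variable names, and the explicit close() call added to this copy of the function are assumptions made for illustration only.

```python
import os
import tempfile


def get_data_list(file_name):
    # Copy of the provided function, with an explicit close() added here
    # (an assumption, so the temporary file can be deleted afterwards).
    data_file = open(file_name, "r")
    data_list = []
    for line_str in data_file:
        data_list.append(line_str.strip().split(','))
    data_file.close()
    return data_list


# The header and the two data rows quoted in the assignment text.
sample_text = (
    "Date,Open,High,Low,Close,Volume,Adj Close\n"
    "2008-09-19,461.00,462.07,443.28,449.15,10006000,449.15\n"
    "2008-09-18,422.64,439.18,410.50,439.08,8589400,439.08\n"
)

# Write the sample to a throwaway file standing in for table.csv.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write(sample_text)
    sample_path = tmp.name

data_list = get_data_list(sample_path)
os.remove(sample_path)

print(data_list[0])     # the header row, as a list of strings
print(data_list[1][0])  # the date of the first data row, still a string

# Every item is a string, so convert before doing arithmetic.
adj_close = float(data_list[1][6])
print(adj_close)
```

Notice that the header row is the first inner list, so any calculation needs to skip it.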


(c) get_monthly_averages(data_list)


In this function, you will use the data_list generated by the get_data_list function as the parameter. Use the Date, Volume, and Adj Close fields to calculate the average monthly prices. Here is a formula for the average price for a month of n trading days, where Vi is the volume on day i and Ci is that day’s adjusted close price (Adj Close).


average_price = (V1*C1 + V2*C2 + ... + Vn*Cn) / (V1 + V2 + ... + Vn)
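As a quick check of the formula, here is a small worked sketch using only the two sample rows quoted earlier. Two days are not a full month, so treat this purely as an illustration of the arithmetic; the variable names are made up for this example.

```python
# Volume and Adj Close from the two sample rows (2008-09-19 and 2008-09-18).
volumes = [10006000, 8589400]
adj_closes = [449.15, 439.08]

# Numerator: sum of Vi * Ci over the days.
weighted_sum = 0.0
for volume, adj_close in zip(volumes, adj_closes):
    weighted_sum += volume * adj_close

# Denominator: sum of Vi over the same days.
volume_sum = sum(volumes)

average_price = weighted_sum / volume_sum
print("{:.2f}".format(average_price))
```

The result lies between the two closing prices and sits slightly nearer the price of the day with the larger volume, which is what a volume-weighted average should do.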


For each month create a tuple with two items: the average for that month, and the date (you need only the month and year). Append the tuple for each month to a list (e.g., monthly_averages_list), and after calculating all the monthly averages, return this list. We use tuples here because once these values are calculated we don’t want to accidentally change them!
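One possible way to organize this function is sketched below; it is not the only approach. It assumes the first row of data_list is the header, that dates look like YYYY-MM-DD, and that the string 'YYYY-MM' is an acceptable form for the month and year. The two August rows are made-up toy numbers, and the running-totals dictionary is just one design choice.

```python
def get_monthly_averages(data_list):
    # Running totals per month: 'YYYY-MM' -> [sum of V*C, sum of V].
    totals = {}
    for row in data_list[1:]:          # assumption: row 0 is the header
        month_key = row[0][:7]         # 'YYYY-MM' from 'YYYY-MM-DD'
        volume = float(row[5])
        adj_close = float(row[6])
        if month_key not in totals:
            totals[month_key] = [0.0, 0.0]
        totals[month_key][0] += volume * adj_close
        totals[month_key][1] += volume

    # Build the list of (average, month) tuples.
    monthly_averages_list = []
    for month_key, (weighted_sum, volume_sum) in totals.items():
        monthly_averages_list.append((weighted_sum / volume_sum, month_key))
    return monthly_averages_list


sample_data = [
    ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'],
    # The two rows quoted in the assignment text:
    ['2008-09-19', '461.00', '462.07', '443.28', '449.15', '10006000', '449.15'],
    ['2008-09-18', '422.64', '439.18', '410.50', '439.08', '8589400', '439.08'],
    # Made-up toy rows, chosen so the average is easy to check by hand:
    ['2008-08-29', '0', '0', '0', '0', '100', '10.00'],
    ['2008-08-28', '0', '0', '0', '0', '300', '20.00'],
]

averages = get_monthly_averages(sample_data)
for average, month in averages:
    print(month, "{:.2f}".format(average))
```

For the toy August rows the average is (100*10 + 300*20) / (100 + 300) = 17.5, which you can confirm by hand.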



(d) print_info(monthly_averages_list)


In this function, you need to use the list of monthly averages calculated in the get_monthly_averages function. Here you will need to find and print the 6 best (highest average price) and 6 worst (lowest average price) months for Google’s stock. Print them in order from highest to lowest and print to 2 decimal places. Format the output so that it looks good and include informative headers. This function does not return anything.
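A rough sketch of what such a function could look like follows. The header wording and column widths are arbitrary choices, and the eight monthly averages are made-up toy values used only to exercise the code. With fewer than 12 months, as here, the best and worst lists overlap.

```python
def print_info(monthly_averages_list):
    # sorted() returns a new list, so the caller's list is left unchanged.
    # Tuples compare on their first item, the average price.
    ranked = sorted(monthly_averages_list, reverse=True)

    print("6 best months (highest average price)")
    print("{:<10}{:>12}".format("Month", "Avg price"))
    for average, month in ranked[:6]:
        print("{:<10}{:>12.2f}".format(month, average))

    print()
    print("6 worst months (lowest average price)")
    print("{:<10}{:>12}".format("Month", "Avg price"))
    # The last six of a descending list are still in highest-to-lowest order.
    for average, month in ranked[-6:]:
        print("{:<10}{:>12.2f}".format(month, average))


# Made-up toy averages, not real Google prices.
toy_averages = [
    (310.25, '2008-01'), (295.50, '2008-02'), (280.00, '2008-03'),
    (301.75, '2008-04'), (320.10, '2008-05'), (333.00, '2008-06'),
    (345.50, '2008-07'), (350.00, '2008-08'),
]

print_info(toy_averages)
```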


(e) If you don’t call these functions, they are useless. Thus, you should write code to call them.


Hints:


(a)    The list sort() and reverse() methods will be useful. Experiment with how they sort a list of tuples, and notice that they sort on the first item.


(b)   To create a tuple, put items in a comma-separated list with parentheses: (x,y).


(c)    When working with a list-of-lists (or a list-of-tuples), the first item in the first list is some_list[0][0] and the second item in that same first list is some_list[0][1].
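The three hints can be tried together in a short experiment like the one below. The tuples are made-up toy values, and the variable names are only for illustration.

```python
# Hint (b): tuples are written as comma-separated items in parentheses.
some_list = [(17.5, '2008-08'), (444.5, '2008-09'), (3.25, '2008-07')]

# Hint (a): sort() orders tuples by their first item (ties fall back to
# the second item), and reverse() flips the order in place.
some_list.sort()
lowest_first = list(some_list)   # keep a copy of the ascending order
some_list.reverse()              # now highest average first

# Hint (c): index the outer list first, then the tuple inside it.
first_item = some_list[0][0]     # average of the top-ranked month
second_item = some_list[0][1]    # month of the top-ranked month

print(lowest_first)
print(some_list)
print(first_item, second_item)
```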

Nov 22, 2021