The file Stocks.zip is a zip file containing nearly 100 sets of pricerecords for various stocks. A sample of the type of files containedin Stocks.zip is ABT.csv, which we have seen previously and is postedin recent course materials. Each file includes daily data for a specificstock, with stock ticker symbol given in the file name.
## ## File: assignment11.py (STAT 3250) ## Topic: Assignment 11 ## ## The file Stocks.zip is a zip file containing nearly 100 sets of price ## records for various stocks. A sample of the type of files contained ## in Stocks.zip is ABT.csv, which we have seen previously and is posted ## in recent course materials. Each file includes daily data for a specific ## stock, with stock ticker symbol given in the file name. Each line of ## a file includes the following: ## ## Date = date for recorded information ## Open = opening stock price ## High = high stock price ## Low = low stock price ## Close = closing stock price ## Volume = number of shares traded ## Adj Close = closing price adjusted for stock splits (ignored for this assignment) ## The time interval covered varies from stock to stock. For many files ## there are dates when the market was open but the data is not provided, so ## those records are missing. Note that some dates are not present because the ## market is closed on weekends and holidays. Those are not missing records. ## The Gradescope autograder will be evaluating your code on a subset ## of the set of files in the folder Stocks. Your code needs to automatically ## handle all assignments to the variables q1, q2, ... to accommodate the ## reduced set, so do not copy/paste things from the console window, and ## take care with hard-coding values. ## The autograder will contain a folder Stocks containing the stock data sets. ## This folder will be in the working directory so your code should be written ## assuming that is the case. import pandas as pd # load pandas import numpy as np # load numpy pd.set_option('display.max_columns', 10) # Display 10 columns in console ## 1. Find the mean for the Open, High, Low, and Close entries for all ## records for all stocks. Give your results as a Series with index ## Open, High, Low, Close (in that order) and the corresponding means ## as values. q1 = None # Series of means of Open, High, Low, and Close ## 2. Find all stocks with an average Close price less than 30. Give your ## results as a Series with ticker symbol as index and average Close price. ## price as value. Sort the Series from lowest to highest average Close ## price. (Note: 'MSFT' is the ticker symbol for Microsoft. 'MSFT.csv', ## 'Stocks/MSFT.csv' and 'MSFT ' are not ticker symbols.) q2 = None # Series of stocks with average close less than 30 ## 3. Find the top-10 stocks in terms of the day-to-day volatility of the ## price, which we define to be the mean of the daily differences ## High - Low for each stock. Give your results as a Series with the ## ticker symbol as index and average day-to-day volatility as value. ## Sort the Series from highest to lowest average volatility. q3 = None # Series of top-10 mean volatility ## 4. Repeat the previous problem, this time using the relative volatility, ## which we define to be the mean of ## ## (High − Low)/(0.5(Open + Close)) ## ## for each day. Provide your results as a Series with the same specifications ## as in the previous problem. q4 = None # Series of top-10 mean relative volatility ## 5. For each day the market was open in October 2008, find the average ## daily Open, High, Low, Close, and Volume for all stocks that have ## records for October 2008. (Note: The market is open on a given ## date if there is a record for that date in any of the files.) ## Give your results as a DataFrame with dates as index and columns of ## means Open, High, Low, Close, Volume (in that order). The dates should ## be sorted from oldest to most recent, with dates formatted (for example) ## 2008-10-01, the same form as in the files. q5 = None # DataFrame of means for each open day of Oct '08. ## 6. For 2011, find the date with the maximum average relative volatility ## for all stocks and the date with the minimum average relative ## volatility for all stocks. Give your results as a Series with ## the dates as index and corresponding average relative volatility ## as values, with the maximum first and the minimum second. q6 = None # Series of average relative volatilities ## 7. For 2010-2012, find the average relative volatility for all stocks on ## Monday, Tuesday, ..., Friday. Give your results as a Series with index ## 'Mon','Tue','Wed','Thu','Fri' (in that order) and corresponding ## average relative volatility as values. q7 = None # Series of average relative volatility by day of week ## 8. For each month of 2009, determine which stock had the maximum average ## relative volatility. Give your results as a Series with MultiIndex ## that includes the month (month number is fine) and corresponding stock ## ticker symbol (in that order), and the average relative volatility ## as values. Sort the Series by month number 1, 2, ..., 12. q8 = None # Series of maximum relative volatilities by month ## 9. The “Python Index” is designed to capture the collective movement of ## all of our stocks. For each date, this is defined as the average price ## for all stocks for which we have data on that day, weighted by the ## volume of shares traded for each stock. That is, for stock values ## S_1, S_2, ... with corresponding volumes V_1, V_2, ..., the average ## weighted volume is ## ## (S_1*V_1 + S_2*V_2 + ...)/(V_1 + V_2 + ...) ## ## Find the Open, High, Low, and Close for the Python Index for each date ## the market was open in January 2013. ## Give your results as a DataFrame with dates as index and columns of ## means Open, High, Low, Close (in that order). The dates should ## be sorted from oldest to most recent, with dates formatted (for example) ## 2013-01-31, the same form as in the files. q9 = None # DataFrame of Python Index values for each open day of Jan 2013. ## 10. For the years 2007-2012 determine the top-8 month-year pairs in terms ## of average relative volatility of the Python Index. Give your results ## as a Series with MultiIndex that includes the month (month number is ## fine) and year (in that order), and the average relative volatility ## as values. Sort the Series by average relative volatility from ## largest to smallest. q10 = None # Series of month-year pairs and average rel. volatilities ## 11. Each stock in the data set contains records starting at some date and ## ending at another date. In between the start and end dates there may be ## dates when the market was open but there is no record -- these are the ## missing records for the stock. For each stock, determine the percentage ## of records that are missing out of the total records that would be ## present if no records were missing. Give a Series of those stocks ## with less than 1.3% of records missing, with the stock ticker as index ## and the corresponding percentage as values, sorted from lowest to ## highest percentage. q11 = None # Series of stocks and percent missing