


Recording URL: https://ucidce.zoom.us/rec/share/w956FbqpzHFOUK_T2XGAdYd9Pr3Eaaa80SRL_vNZmBrXXfjyYNxXRwdDYYfyA5OG

Day 7: Assignment, Time Series
Instructions:
· Make a copy of the colab notebook and save it to your own Google Drive (LINK TO NOTEBOOK).
· This notebook is for your reference about time series operations in Python.
· You do NOT have to submit anything for this assignment unless you answer the bonus question. In that case:
· Set sharing to "Anyone with a link can view".
· Submit the link to the notebook.

Day 7: Content Overview
A large amount of the data we collect comes in the form of time series. Time as an attribute contains a wealth of information such as seasonality patterns, time dependence on past values, and cross-correlation with other attributes' time series. In this module, we cover the basics of time series analysis and the difficulties in working with time attributes for modeling.

Readings and Media
· Class slides: Time Series Modeling
· Resources: online time series book: https://otexts.com/fpp3/
· Code example: pairs trading notebook: Pairs Trading.ipynb

Modeling Methods, Deploying, and Refining Predictive Models
UCI Spring 2020, Class 7: Time Series Modeling

Schedule
· Introduction and Overview
· Data and Modeling + Simulation Modeling
· Error-based Modeling
· Probability-based Modeling
· Similarity-based Modeling
· Information-based Modeling
· Time-series Modeling
· Deployment
At the end of this module you will learn how to model time series for forecasting.

Today's Objectives
· Time-series forecasting
· Time series data
· ML models
· Classic TS models
· Cointegration

Time series data difficulties
Some of the common difficulties in working with time series data:
· Order of the data matters; there is now a time component
· Temporal relationships
· Seasonality
· Frequency mismatch
· Historical data revisions
· Noisy or missing data
· Windowing
· Structural gaps like weekends and holidays
· Exogenous one-time events impact the data
· Underlying trends change
· Data was generated by a random process
· Complex interrelationships between historical and coincident data
· Difficult to split test and training data
· Etc.
Time series forecasting is hard to do.

Supervised Methods
Which methods do we use for time series? Every possible modeling method can be used for time series forecasting in some form or another:
· Error-based
· Similarity/instance-based
· Information-based
· Probability-based
· Neural networks and deep learning-based methods
· Ensembles

The ABT for multivariate time series
Time Period   Descriptive Feature 1   …   Descriptive Feature m   Target Feature
Period 1      Obs 1                   …   Obs 1                   Target value 1
Period 2      Obs 2                   …   Obs 2                   Target value 2
…             …                       …   …                       …
Period n      Obs n                   …   Obs n                   Target value n
The Time Period column is a discrete, ordered series that might itself contain information (e.g. month, season, year) and can have numerical or categorical meaning. The descriptive features and the target feature are numeric and time ordered.
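For concreteness, here is a minimal pandas sketch of an ABT like the one above; the index frequency, column names, and values are made-up illustrations, not data from the course notebook.

```python
import numpy as np
import pandas as pd

# Illustrative only: a tiny multivariate time series ABT with a time-ordered
# index, two descriptive features, and a target feature.
periods = pd.date_range("2020-01-31", periods=8, freq="M")   # discrete, ordered time periods
rng = np.random.default_rng(42)
abt = pd.DataFrame(
    {
        "feature_1": rng.normal(size=8),   # numeric, time ordered
        "feature_2": rng.normal(size=8),   # numeric, time ordered
        "target": rng.normal(size=8),      # target value for each period
    },
    index=periods,
)
abt["month"] = abt.index.month             # the time period itself can carry information
print(abt)
```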
Regression-based time series model
In its multivariate form, for every point in time t the target is modeled as a regression on that period's descriptive features, e.g.
    y_t = w_0 + w_1·d_(1,t) + … + w_m·d_(m,t) + ε_t

The ABT for univariate time series
Let us consider the simplest possible data set, a univariate numeric time series. In its simplest form, the regression-based model uses only the time-ordered target itself:
Time Period   Target Feature
Period 1      Target value 1
Period 2      Target value 2
…             …
Period n      Target value n
The Time Period column is a discrete, ordered series that might itself contain information (e.g. month, season); the target is numeric and time ordered.

Regression-based time series model
The reality is more complicated, as there might be linear dependencies on past values:
    ŷ_t = c + φ_1·y_(t−1) + … + φ_p·y_(t−p) + θ_1·ε_(t−1) + … + θ_q·ε_(t−q)
That is, the predicted Y at time t is a constant, plus a linear combination of lags of Y up to p lags, plus a linear combination of lagged forecast errors up to q lags.

From time series to supervised learning
In order to reframe a time series problem as a supervised learning problem, you must either:
· Assume time independence of all observations, or
· For every point in time, consider the sequence up to that point in time.

Sequences of data
Assume we have a sequence of data: Y = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …]
What is the next number in the sequence? What is the 100th number in the sequence? What if we fit a linear regression?
The sequence has serial dependence. Because we did not include serial dependence, the linear regression fails badly.
Let us now "shift" the prior data points and set them side by side: Y, Shift(Y,1), Shift(Y,2). What relationship can you see? It is pretty clear that:
    Y = Shift(Y,1) + Shift(Y,2)
In other words, this is just a Fibonacci sequence. Our feature matrix with sequentially ordered and shifted historical values allowed us to solve the model.

Shifting to augment data
When modeling time series, you can convert the series to a supervised learning problem simply by "shifting" your data set so that every point in time considers the whole sequence, or a subset of the data, up to that point in time. This is what we did in all the assignments where we predicted stock returns: we used the "shift" operator from the pandas module in Python, as in the sketch below.
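A minimal sketch of this shifting idea on the Fibonacci sequence from the slides, assuming only pandas and NumPy; the column names are illustrative. A plain least-squares fit on the shifted columns recovers Y = Shift(Y,1) + Shift(Y,2).

```python
import numpy as np
import pandas as pd

# The sequence from the slides.
y = pd.Series([0, 1, 1, 2, 3, 5, 8, 13, 21, 34], name="Y")

# "Shift" the series so prior values sit side by side with the current value,
# turning the sequence into a supervised-learning feature matrix.
frame = pd.DataFrame({
    "Y": y,
    "shift_1": y.shift(1),   # Shift(Y, 1)
    "shift_2": y.shift(2),   # Shift(Y, 2)
}).dropna()                   # the first two rows have no full history

# A least-squares fit of Y on its two shifted copies recovers the
# Fibonacci relation: both coefficients come out as (approximately) 1.
X = frame[["shift_1", "shift_2"]].to_numpy()
coeffs, *_ = np.linalg.lstsq(X, frame["Y"].to_numpy(), rcond=None)
print(coeffs)   # -> approximately [1. 1.]
```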
Using the full history
So for predicting a target Y at time t+1, we can use:
· Every sequence of values up to time t for each predictor
· Every previous value of the target Y up to time t
https://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/

Dangers with ML in time series
However, even incorporating serial dependence and the full history up to a point in time does not always help with making time series predictions.

Model evaluation
Consider a forecast problem of predicting the evolution of a stock index. We use the first 250 trading days as the training set and predict the remaining days of the dataset.

Sophisticated ML models and time series
The prediction uses a state-of-the-art long short-term memory (LSTM) deep learning model. The model accuracy is calculated via a regression prediction-error metric (recall the error calculations from the regression modules) and shows a strong score of 0.89.

Don't be fooled
The actual and predicted values look amazingly close. However, if you zoom in, you can see that all the model did was use the most recent value as the prediction for the forecast value. This is called a persistence model and has no real predictive power, as the best guess for tomorrow is just the value today.

Today's value for tomorrow
This is confirmed by the cross-correlation plot between the actual and predicted values: there is a clear peak in correlation at a one-day lag between the actual and predicted values.

Random walk
In fact, this time series was generated by a random walk process, which cannot be forecast. We can simulate a random walk as a sequence of discrete fixed-length steps in random directions (recall the simulation modeling class), as in the sketch below.
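A minimal sketch of these ideas on a synthetic series (not the actual stock-index data): simulate a random walk as cumulative fixed-length steps in random directions, build the persistence "forecast" by shifting the series one step, and check that the forecast is just the actual series lagged by one day.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Simulate a random walk: the cumulative sum of discrete, fixed-length steps
# taken in random directions (recall the simulation modeling class).
steps = rng.choice([-1.0, 1.0], size=500)
actual = pd.Series(np.cumsum(steps), name="index_level")

# Persistence "model": the forecast for tomorrow is simply today's value.
persistence_forecast = actual.shift(1)

# The forecast tracks the actual series closely only because of the
# one-step lag: it is perfectly correlated with yesterday's actual value.
print("corr(actual, forecast):               ", actual.corr(persistence_forecast))
print("corr(actual lagged one day, forecast):", actual.shift(1).corr(persistence_forecast))
```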
Comparison to classic time series methods
In fact, most machine learning models have less robust forecasting ability than classical time series models on univariate series.*
*https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0194889
Models compared were:
· Classical: Naïve 2, simple exponential smoothing, Holt, ARIMA, ETS, etc.
· Machine learning: Multi-Layer Perceptron (MLP), Bayesian Neural Network (BNN), Radial Basis Functions (RBF), kernel regression, kNN, regression trees (CART), LSTM, Recurrent Neural Network (RNN)

Results of the comparison
Forecast error was calculated by symmetric mean absolute percentage error (sMAPE).*
*https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0194889

Components of a time series
The value of the time series at time t can be decomposed into several components:
· Trend: the tendency of a series to rise or fall and exhibit upward or downward trends. These can be long-term or short-term trends.
· Seasonality: the regular fluctuation of a time series within a certain period. These fluctuations form a predictable pattern that tends to repeat from one seasonal period to the next.
· Cycles: long departures from trend that occur along larger time intervals than seasonality. The lengths of time between successive peaks or troughs of a cycle are not necessarily the same.
· Noise: the movement in the series after the trend, seasonality, and cyclical movements are removed from the series. It is often random noise in a time series.
These components are visually easy to recognize.

Time series modeling is a process
Depending on the objective of the model, it may be enough to isolate a single component of the series for prediction, such as subtracting out the seasonality and cyclical components to estimate only the trend (plus random noise).

Time series modeling is a flexible process
Perhaps the goal is to measure the expected impact of seasonality on sales. A similar model could estimate the impact of a larger business cycle on sales as well.

Assumption for classic TS modeling
In order to forecast a time series, it must be stationary or transformed into a stationary series. Roughly speaking, a series is stationary if its mean, variance, and covariance between points in the series are constant over time.

Stationarity transformations
· Differencing: given a series y_1, …, y_n, we can create a new series y'_t = y_t − y_(t−1). This helps to remove changes in the level, therefore reducing the trend and seasonality.
· Remove the trend and seasonality components directly by estimating a line or curve through the series and subtracting it out.
· For non-constant variance or multiplicative series: taking the logarithm of the series may stabilize the variance and also convert the effects of trend and seasonality from multiplicative to additive.
A sketch of the differencing and log transformations follows.
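A minimal pandas sketch of the differencing and log transformations, applied to a made-up multiplicative series (the series construction is purely illustrative).

```python
import numpy as np
import pandas as pd

# Made-up multiplicative series: exponential trend x seasonal pattern x noise.
rng = np.random.default_rng(7)
t = np.arange(1, 121)
seasonal = 1 + 0.3 * np.sin(2 * np.pi * t / 12)
y = pd.Series(np.exp(0.02 * t) * seasonal * np.exp(0.05 * rng.normal(size=t.size)))

# Log transform: stabilizes the variance and turns the multiplicative
# trend/seasonality into additive components.
log_y = np.log(y)

# Differencing: y'_t = y_t - y_(t-1) removes changes in the level,
# therefore reducing the trend and seasonality.
diff_log_y = log_y.diff().dropna()

print(diff_log_y.describe())
```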
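The slides posted here do not walk through fitting a classical model, but since ARIMA appears in the list of classical methods above and matches the "constant + lags of Y + lagged forecast errors" form described earlier, here is a rough sketch using statsmodels on a synthetic series; the order (1, 1, 1) is an arbitrary illustration, not a recommendation.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA   # requires statsmodels

# Synthetic series for illustration only: a drifting random-walk-like process.
rng = np.random.default_rng(1)
y = pd.Series(np.cumsum(0.1 + rng.normal(size=200)))

# order=(p, d, q): p lags of y, d differences, q lagged forecast errors,
# i.e. the "constant + lags of Y + lagged errors" form described earlier.
result = ARIMA(y, order=(1, 1, 1)).fit()

print(result.summary())
print(result.forecast(steps=6))   # point forecasts for the next six periods
```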

Answer To: Day 7: Assignment, Time Series

Ishvina answered on Jun 14 2021
Link to Solutions of Module 7 Homework - Time Series: https://colab.research.google.com/drive/1RP6iM3SXWlQ-ZlSoj3ApBBxUX_VQnaSh?usp=sharing
The solutions follow the content provided above.
