Multivariate Approach to Time Series Model Identification

ABSTRACT

This work proposes an exact and systematic approach to time series model identification that is new and addresses most of the challenges of existing methods. We developed quadratic discriminant functions for various orders of autoregressive moving average (ARMA) models, together with an algorithm to be used alongside the functions. Three hundred sets of simulated time series data were used to develop the functions. Another twenty-five sets of simulated time series data were used to test the classifiers, which correctly classified twenty-three of the twenty-five sets. The two misclassifications merely imply that the algorithm requires a second iteration to identify the model in question correctly. The algorithm was also applied to real-life time series data, which it classified correctly in two iterations.

TABLE OF CONTENTS

Title page
Certification
Dedication
Acknowledgement
Abstract
Table of Contents
CHAPTER ONE: INTRODUCTION
1.1 Introduction
1.2 Statement of Problem
1.3 Significance of the Study
1.4 Objective of the Study
1.5 Scope and Limitation
CHAPTER TWO: LITERATURE REVIEW
2.1 Introduction
2.2 Review of Literature
CHAPTER THREE: METHODOLOGY
3.1 The Bayesian and Fisher’s Classification Rule
3.2 Distributional Assumptions
3.3 Development of the Proposed Classifier
3.4 The Proposed Algorithm
CHAPTER FOUR: RESULTS
4.1 The Proposed Classifiers
4.2 Application of the Proposed Classifiers to Simulated Time Series
4.3 Application of Our Method to Real Life Series
4.4 Brief Comparison with Other Methods
CHAPTER FIVE: SUMMARY AND CONCLUSION
5.1 Summary
5.2 Discussion of Results
5.3 Contributions
References

CHAPTER ONE

INTRODUCTION

1.1 INTRODUCTION

Model identification is a crucial part of time series model development. The first task in time series modeling is to examine the series at hand so as to establish the theoretical model that generates it. This task is arguably the most challenging and most ambiguous part of time series modeling, and it has been approached from different perspectives over time. One of the most popular approaches is that of Box and Jenkins, presented in Box and Jenkins (1976).

Their method involves going through several iterative steps before a final model is selected. The initial step is to calculate the sample autocorrelation function (ACF) and partial autocorrelation function (PACF) of the series at various lags and to compare their behaviour with the known behaviour of theoretical models. The model that best approximates the sample behaviour is tentatively selected. There are two serious problems with this method. The first is that one needs to fit several models, or make several adjustments, to arrive at the final model, which makes the method computationally expensive. The second is the inability of the method to differentiate accurately between some classes of models. For example, it is not easy to determine the values of p and q in fitting ARMA(p, q) when p, q ≠ 0, since both the ACF and the PACF tail off. Tsay and Tiao (1984, 1985) addressed this problem by proposing the use of the extended autocorrelation function (EACF) and the smallest canonical correlation (SCAN), respectively.

The Tsay and Tiao methods are mere extensions of the Box and Jenkins approach, as they involve comparing the behaviour of the sample EACF and smallest canonical correlations with the theoretical behaviour. Matching these behaviours is even more difficult in these approaches because, for example, the clear cut-off in the theoretical EACF table is hardly ever observed in the sample EACF (Cryer and Chan, 2008).

Departing from the Box and Jenkins approach, Akaike (1969) and many other researchers have worked on various forms of information criteria. Their approach is based on values calculated from the residuals of already fitted models. The statistic calculated from these residuals is interpreted as the information lost as a result of fitting the model, and the model order that minimizes this loss is adopted.

The model identification stage of time series modeling has long suffered from serious deficiencies. All the available methods fall short in accurately selecting the model to fit. There is no well-defined procedure that gives the exact model, or at least a model with a known error margin, before the model is fitted. The approach presented here consists of well spelt-out rules that guide model development. Some model identification approaches, especially the Box and Jenkins approach, are more art than science because they rely heavily on judgment.

The information criteria approach is not an exception to this deficiency. Fitting models of virtually every order before selecting one is almost the same as fitting all possible models and selecting the one that passes a goodness-of-fit test. Model selection is a stage that should be settled before estimation is considered; if one must return to this stage, the initial approach should already have specified the next model or order to consider. Another important shortfall of the information criteria approach is that its results are comparative: only the selected tentative models have their information criteria calculated and compared, and the model with the minimum value is then chosen as the best model.
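To make the cost of the information criteria approach concrete, the following is a minimal sketch, not the procedure of any cited author. It assumes Yule-Walker autoregressive fits computed with the Levinson-Durbin recursion and an AIC-type value n ln(σ²_p) + 2p. The function names, the simulated AR(2) series and the maximum order of 5 are illustrative assumptions.

```python
import math
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances c_0..c_max_lag (divisor n)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def ar_aic_by_order(x, max_order):
    """AIC-type value n*ln(sigma2_p) + 2p for Yule-Walker AR(p), p = 0..max_order."""
    n = len(x)
    c = autocovariances(x, max_order)
    sigma2 = c[0]   # innovation variance of the order-0 (white noise) fit
    phi = []        # AR coefficients of the most recently fitted order
    aic = [n * math.log(sigma2)]
    for p in range(1, max_order + 1):
        # Levinson-Durbin step: every candidate order must be fitted in turn.
        kappa = (c[p] - sum(phi[j] * c[p - 1 - j] for j in range(p - 1))) / sigma2
        phi = [phi[j] - kappa * phi[p - 2 - j] for j in range(p - 1)] + [kappa]
        sigma2 *= 1.0 - kappa * kappa
        aic.append(n * math.log(sigma2) + 2 * p)
    return aic


def simulate_ar2(phi1, phi2, n, seed, burn_in=200):
    """Simulate a stationary AR(2) series with standard normal innovations."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n + burn_in):
        x.append(phi1 * x[-1] + phi2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[-n:]


series = simulate_ar2(0.5, 0.3, n=500, seed=1)   # hypothetical test series
aic = ar_aic_by_order(series, max_order=5)
best_order = min(range(len(aic)), key=aic.__getitem__)
print("AIC by order:", [round(v, 2) for v in aic])
print("order with minimum AIC:", best_order)
```

The point of the sketch is that a value is only available for orders that have actually been fitted, so the choice is relative to the candidate set rather than made before estimation.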

In this work, we propose an exact model identification method that addresses most of the issues with the available model identification methods. Our method does not require fitting any model to the series at hand before selecting the right model. It is capable of selecting the exact model with a very high level of certainty, and in the event that the selected model is not the appropriate one (since there may be a small error margin), the method also predetermines the next model, from all models under consideration, to be considered. With the further transformation used in this work, the behaviour of the sample ACF and PACF is in fact enough information for model identification. The information used in the proposed method is the ACF at lags 1 to 4 and the PACF at lags 2 to 5. This choice of lags is informed by the fact that the ACF and PACF of stationary ARMA models usually cut off or become negligible at higher lags. The classifiers exploit the variation inherent in the selected ACF and PACF values to classify models as ARMA(p, q) with p + q ≤ 2, with ARMA(2,2) added to demonstrate the usefulness of the method in cases of mixed models.
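As an illustration of the inputs just described, the sketch below computes the sample ACF at lags 1 to 4 and the sample PACF at lags 2 to 5 for a series. It assumes the usual biased autocorrelation estimator and the Durbin-Levinson recursion for the PACF. The function names and the simulated AR(1) series are hypothetical, and the further transformation applied in this work is not reproduced here.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t + k] for t in range(n - k)) / c0
            for k in range(1, max_lag + 1)]


def sample_pacf(acf, max_lag):
    """Partial autocorrelations phi_11..phi_kk via the Durbin-Levinson recursion."""
    r = [1.0] + list(acf)   # r[k] is the autocorrelation at lag k
    pacf = []
    prev = []               # coefficients phi_{k-1,1}..phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


def identification_features(x):
    """Eight values: ACF at lags 1-4 followed by PACF at lags 2-5."""
    acf = sample_acf(x, 5)        # lag 5 is needed to obtain the lag-5 PACF
    pacf = sample_pacf(acf, 5)
    return acf[:4] + pacf[1:5]    # PACF at lag 1 equals ACF at lag 1, so it is skipped


# Hypothetical example: a simulated AR(1) series with coefficient 0.6.
rng = random.Random(0)
x = [0.0]
for _ in range(499):
    x.append(0.6 * x[-1] + rng.gauss(0.0, 1.0))
features = identification_features(x)
print([round(v, 3) for v in features])
```

The PACF at lag 1 is omitted from the vector because it is identical to the ACF at lag 1 and would add no information.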

1.2 STATEMENT OF PROBLEMS

In time series analysis, the theoretical model to be fitted to the series at hand must be established before proceeding with subsequent steps. This work seeks to develop a new method of achieving this.

In this work, we build discriminant functions for each of the classes of theoretical ARMA models considered (i.e. AR(1), AR(2), MA(1), MA(2), ARMA(1,1) and ARMA(2,2)) and develop an algorithm consisting of well-organized iterative steps to be followed in applying the functions. We also test the new method on both simulated and real-life time series and comment on its performance.
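For readers unfamiliar with discriminant functions, the following is a generic sketch of a quadratic discriminant rule under a multivariate normal assumption. It is not the classifiers developed in this work. It uses the standard score d_k(x) = -0.5 ln|S_k| - 0.5 (x - m_k)' S_k^{-1} (x - m_k) + ln(p_k), where m_k, S_k and p_k are the class mean, covariance matrix and prior. The function names, the class labels and the two-dimensional toy training data are hypothetical. Ranking the classes by score is one plausible way of fixing in advance which model to try next.

```python
import math


def mean_and_cov(rows):
    """Sample mean vector and unbiased covariance matrix of feature vectors."""
    n, p = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return mu, cov


def solve_and_logdet(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting.

    Also returns ln|det a|, the sum of the logs of the absolute pivots.
    """
    p = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    logdet = 0.0
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        logdet += math.log(abs(m[col][col]))
        for r in range(col + 1, p):
            f = m[r][col] / m[col][col]
            for c in range(col, p + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * p
    for i in range(p - 1, -1, -1):                     # back substitution
        z[i] = (m[i][p] - sum(m[i][c] * z[c] for c in range(i + 1, p))) / m[i][i]
    return z, logdet


def quadratic_score(x, mu, cov, prior):
    """Quadratic discriminant score of x for one class."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z, logdet = solve_and_logdet(cov, diff)
    mahalanobis = sum(d * zi for d, zi in zip(diff, z))
    return -0.5 * logdet - 0.5 * mahalanobis + math.log(prior)


def rank_classes(x, training):
    """Class labels ordered from highest to lowest quadratic score."""
    total = sum(len(rows) for rows in training.values())
    scores = {}
    for label, rows in training.items():
        mu, cov = mean_and_cov(rows)
        scores[label] = quadratic_score(x, mu, cov, len(rows) / total)
    return sorted(scores, key=scores.get, reverse=True)


# Toy two-dimensional training data; in practice each row would be a feature
# vector computed from one simulated series of a known model class.
training = {
    "class_a": [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "class_b": [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0]],
}
ranking = rank_classes([0.4, 0.6], training)
print(ranking)
```

The sketch avoids third-party dependencies for portability; a numerical library would normally be used for the linear algebra.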

1.3 SIGNIFICANCE OF THE STUDY

A considerable amount of work has been done on time series model identification, but existing methods are characterized by a lack of precise problem formulation, heavy reliance on individual judgment, and high computational cost due to the several adjustments needed to arrive at the final model. Existing model identification methods give only a tentative and inexact indication of the appropriate theoretical model before fitting; hence several models are usually fitted at the initial stage. This work is aimed at developing an exact model identification approach that addresses these challenges.

It is hoped that our method will provide a systematic and well-organized problem formulation capable of reducing the computational cost and rigour associated with time series modeling.

1.4 OBJECTIVES OF THE STUDY

This work is aimed at developing an entirely new model identification method that addresses most of the challenges of existing methods. To achieve this, our specific objectives are as follows.

· To develop quadratic discriminant functions for each of the ARMA models considered and define the iterative steps (algorithm) to be followed in applying the functions.

· To apply the method to both simulated and real-life time series data.

· To briefly compare the proposed method with existing methods.

1.5 SCOPE AND LIMITATION OF THE STUDY

This work is limited to six classes of ARMA models: AR(1), AR(2), MA(1), MA(2), ARMA(1,1) and ARMA(2,2). The method can be applied only to these classes, meaning that other classes of models, such as autoregressive integrated moving average (ARIMA) models, seasonal ARIMA models, autoregressive conditional heteroscedasticity (ARCH) models and generalized autoregressive conditional heteroscedasticity (GARCH) models, are not covered by our method.