Gretl User's Guide

1. Figure 23.2: Log-log regression; 2 observations dropped from full Engel data set.
This is OK for moderately large datasets (up to, say, a few thousand observations) but on very large problems the simplex algorithm may become seriously bogged down. For example, Koenker and Hallock (2001) present an analysis of the determinants of birth weights using 198377 observations and 15 regressors. Generating confidence intervals via Barrodale-Roberts for a single value of τ took about half an hour on a Lenovo Thinkpad T60p with a 1.83GHz Intel Core 2 processor.
If you want confidence intervals in such cases, you are advised not to use the --intervals option, but to compute them using the method of "plus or minus so many standard errors". (One Frisch-Newton run took about 8 seconds on the same machine, showing the superiority of the interior point method.) The script below illustrates:
    quantreg .10 y 0 xlist
    scalar crit = qnorm(.95)
    matrix ci = $coeff - crit * $stderr
    ci = ci ~ ($coeff + crit * $stderr)
    print ci
The matrix ci will contain the lower and upper bounds of the (symmetrical) 90 percent confidence intervals.
To avoid a situation where gretl becomes unresponsive for a very long time, we have set the maximum number of iterations for the Barrodale-Roberts algorithm to the (somewhat arbitrary) value of 1000. We will experiment further with this, but for the meantime, if you really want to use this method on a large dataset, a
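The "plus or minus so many standard errors" rule used in the script can be sketched outside gretl as follows. This is a minimal Python illustration, not gretl's implementation; the function name symmetric_ci and the numbers are hypothetical.

```python
from statistics import NormalDist

def symmetric_ci(coeff, stderr, level=0.90):
    """Return (lower, upper) pairs: estimate minus/plus crit * standard error."""
    # For a 90 percent interval the critical value is the 0.95 normal quantile,
    # the same quantity the gretl script obtains with qnorm(.95).
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return [(b - crit * s, b + crit * s) for b, s in zip(coeff, stderr)]

# Hypothetical estimates and standard errors (not taken from the Engel data)
coeff = [0.50, 1.20]
stderr = [0.10, 0.25]
ci = symmetric_ci(coeff, stderr)
for lower, upper in ci:
    print(f"{lower:.4f}  {upper:.4f}")
```

Each interval is centred on its estimate, which is why the text calls these intervals symmetrical.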
2. Chapter 18. GMM estimation
$\sum_{t=1}^{T} f_t(\theta)\, f_t(\theta)'$  (18.9)
This estimator is robust with respect to heteroskedasticity, but not with respect to autocorrelation. A heteroskedasticity- and autocorrelation-consistent (HAC) variant can be obtained using the Bartlett kernel or similar. A univariate version of this is used in the context of the lrvar() function: see equation (5.1). The multivariate version is set out in equation (18.10).
Footnote 1: The data file used in this example is available in the Stock and Watson package for gretl. See http://gretl.sourceforge.net/gretl_data.html.
Example 18.2: TSLS via GMM
    open cig_ch10.gdt
    # real avg price including sales tax
    genr ravgprs = avgprs / cpi
    # real avg cig-specific tax
    genr rtax = tax / cpi
    # real average total tax
    genr rtaxs = taxs / cpi
    # real average sales tax
    genr rtaxso = rtaxs - rtax
    # logs of consumption, price, income
    genr lpackpc = log(packpc)
    genr lravgprs = log(ravgprs)
    genr perinc = income / (pop*cpi)
    genr lperinc = log(perinc)
    # restrict sample to 1995 observations
    smpl --restrict year=1995
    # Equation (10.16) by tsls
    list xlist = const lravgprs lperinc
    list zlist = const rtaxso rtax lperinc
    tsls lpackpc xlist ; zlist --robust
    # setup for gmm
    matrix Z = { zlist }
    matrix W = inv(Z'Z)
    series e = 0
    scalar b0 = 1
    scalar b1 = 1
    scalar b2 = 1
    gmm e = lpackpc - b0 - b1*lravgprs - b2*lperinc
      orthog e ; Z
      weights W
      params b0 b1 b2
    end gmm
3. Chapter 24. Gretl and TeX
24.2 TeX-related menu items
The model window
The fullest TeX support in gretl is found in the GUI model window. This has a menu item titled "LaTeX" with sub-items "View", "Copy", "Save" and "Equation options" (see Figure 24.1).
Footnote 1: Experts will be aware of something called "plain TeX", which is processed using the program tex. The great majority of TeX users, however, use the LaTeX macros, initially developed by Leslie Lamport. Gretl does not support plain TeX.
[Figure 24.1: LaTeX menu in model window]
The first three sub-items have branches titled "Tabular" and "Equation". By "Tabular" we mean that the model is represented in the form of a table; this is the fullest and most explicit presentation of the results. See Table 24.1 for an example; this was pasted into the manual after using the "Copy, Tabular" item in gretl (a few lines were edited out for brevity).
Table 24.1: Example of LaTeX tabular output
    Model 1: OLS estimates using the 51 observations 1-51
    Dependent variable: ENROLL
    Variable   Coefficient   Std. Error   t-statistic   p-value
    const      0.241105      0.0660225    3.6519        0
4. 90 If the classical conditions for the validity of OLS are satisfied (that is, if the error term is independently and identically distributed, conditional on X) then quantile regression is redundant: all the conditional quantiles of the dependent variable will march in lockstep with the conditional mean. Conversely, if quantile regression reveals that the conditional quantiles behave in a manner quite distinct from the conditional mean, this suggests that OLS estimation is problematic.
As of version 1.7.5, gretl offers quantile regression functionality (in addition to basic LAD regression, which has been available since early in gretl's history via the lad command).[1]
23.2 Basic syntax
The basic invocation of quantile regression is
    quantreg tau reglist
where
• reglist is a standard gretl regression list (dependent variable followed by regressors, including the constant if an intercept is wanted); and
• tau is the desired conditional quantile, in the range 0.01 to 0.99, given either as a numerical value or the name of a pre-defined scalar variable (but see below for a further option).
Estimation is via the Frisch-Newton interior point solver (Portnoy and Koenker, 1997), which is substantially faster than the traditional Barrodale-Roberts (1974) simplex approach for large problems.
Footnote 1: We gratefully acknowledge our borrowing from the quantreg package for GNU R (version 4.17). The core of the quantreg package is compos
5. Appendix B. Data import via ODBC
In order to actually retrieve the data, the data command is used. Its syntax is:
    data series [obs-format=format-string] query-string --odbc
where:
• series is the name of the gretl series to contain the incoming data, which need not exist prior to the query. Note that the data command imports one series at a time.
• format-string is an optional parameter, used to handle cases when a "rectangular" organisation of the database cannot be assumed (more on this later).
• query-string is a string containing the SQL statement used to extract the data.[2]
Footnote 2: Since designing a graphical interface for this is conceptually simple but rather time-consuming, what we're aiming at is a robust and reasonably powerful implementation of the data transfer. Once all the issues are sorted out, we'll start implementing a GUI interface.
The query-string can, in principle, contain any valid SQL statement which results in a table: a ";" character at the end will be added automatically. This string may be specified directly within the command, as in
    data x "SELECT foo FROM bar" --odbc
which will store into the gretl variable x the content of the column foo from the table bar. However, since in a real-life situation the string containing the SQL statement will be rather long, it may be best to create it just before the call to data. For example:
    string SqlQry = "SELECT foo FROM bar"
    data x SqlQry --odbc
If the optional param
6. Chapter 8. Discrete variables
Note that marking a variable as discrete does not affect its content. It is the user's responsibility to make sure that marking a variable as discrete is a sensible thing to do. Note that if you want to recode a continuous variable into classes, you can use the genr command and its arithmetic functions, as in the following example:
    nulldata 100
    # generate a variable with mean 2 and variance 1
    genr x = normal() + 2
    # split into 4 classes
    genr z = (x>0) + (x>2) + (x>4)
    # now declare z as discrete
    discrete z
Once a variable is marked as discrete, this setting is remembered when you save the file.
8.2 Commands for discrete variables
The dummify command
The dummify command takes as argument a series x and creates dummy variables for each distinct value present in x, which must have already been declared as discrete. Example:
    open greene22_2
    discrete Z5  # mark Z5 as discrete
    dummify Z5
The effect of the above command is to generate 5 new dummy variables, labeled DZ5_1 through DZ5_5, which correspond to the different values in Z5. Hence, the variable DZ5_4 is 1 if Z5 equals 4 and 0 otherwise. This functionality is also available through the graphical interface by selecting the menu item "Add, Dummies for selected discrete variables".
The dummify command can also be used with the following syntax:
    list dlist = dummify(x)
This not only creates the dummy variables, but also a named
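The recoding trick and the effect of dummify can be mimicked in a few lines of Python. This is only a sketch of the idea under simple assumptions: the helper names recode and dummify_values and the sample values are hypothetical, and gretl's own naming of the generated dummies is not reproduced.

```python
def recode(x, cutoffs=(0, 2, 4)):
    """Class index = number of cutoffs exceeded, as in (x>0) + (x>2) + (x>4)."""
    return [sum(v > c for c in cutoffs) for v in x]

def dummify_values(z):
    """One 0/1 indicator list per distinct value of z, keyed by that value."""
    return {val: [int(v == val) for v in z] for val in sorted(set(z))}

x = [-0.3, 0.8, 2.5, 4.1, 1.9]   # hypothetical values of a continuous variable
z = recode(x)                    # four possible classes: 0, 1, 2, 3
dummies = dummify_values(z)
print(z)
print(dummies[1])
```

Three cutoffs yield four classes because each comparison contributes 0 or 1 to the sum.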
7. The weights statement is used to specify the initial weighting matrix and its syntax is straightforward. Note, however, that when more than one step is required that matrix will contain the final weight matrix, which most likely will be different from its initial value.
The params statement specifies the parameters with respect to which the GMM criterion should be minimized; it follows the same logic and rules as in the mle and nls commands.
The minimum is found through numerical minimization via BFGS (see section 5.9 and chapter 17). The progress of the optimization procedure can be observed by appending the --verbose switch to the end gmm line. (In this example GMM estimation is clearly a rather silly thing to do, since a closed-form solution is easily given by OLS.)
18.3 TSLS as GMM
Moving closer to the proper domain of GMM, we now consider two-stage least squares (TSLS) as a case of GMM.
TSLS is employed in the case where one wishes to estimate a linear model of the form $y = X\beta + u$, but where one or more of the variables in the matrix X are potentially endogenous (correlated with the error term, u). We proceed by identifying a set of instruments, Z, which are explanatory for the endogenous variables in X but which are plausibly uncorrelated with u. The classic two-stage procedure is (1) regress the endogenous elements of X on Z; then (2) estimate the equation of interest, with the endogenous elements of X replaced by their fitted val
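The two-stage procedure just described can be sketched for the simplest case: one endogenous regressor, one instrument and an intercept. This is a minimal Python illustration with made-up numbers, not gretl's tsls command; the helper names ols and tsls are hypothetical.

```python
def ols(x, y):
    """Intercept and slope from a simple regression of y on x (with constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def tsls(y, x, z):
    """Stage 1: regress x on z. Stage 2: regress y on the fitted values of x."""
    a, b = ols(z, x)
    xhat = [a + b * v for v in z]
    return ols(xhat, y)

# Hypothetical data: z is the instrument, x the endogenous regressor
z = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.2, 1.9, 3.4, 3.8, 5.1]
y = [2.1, 4.2, 5.8, 8.1, 9.9]
intercept, slope = tsls(y, x, z)
print(round(slope, 4))
```

In this just-identified case the second-stage slope coincides with the ratio cov(z, y) / cov(z, x), which is a convenient check on the two-step computation.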
8. Chapter 21
$\alpha$ and/or $\beta$. It can be shown that the minimum number of restrictions that is necessary to guarantee identification is $r^2$. Normalizing one coefficient per column to 1 (or -1, according to taste) is a trivial first step, which also helps in that the remaining coefficients can be interpreted as the parameters in the equilibrium relations, but this only suffices when r = 1.
The method that gretl uses by default is known as the "Phillips normalization", or "triangular representation".[1] The starting point is writing $\beta$ in partitioned form as in
$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}$,
where $\beta_1$ is an $r \times r$ matrix and $\beta_2$ is $(n - r) \times r$. Assuming that $\beta_1$ has full rank, $\beta$ can be post-multiplied by $\beta_1^{-1}$, giving
$\hat{\beta} = \begin{bmatrix} I \\ \beta_2 \beta_1^{-1} \end{bmatrix} = \begin{bmatrix} I \\ -B \end{bmatrix}$.
The coefficients that gretl produces are $\hat{\beta}$, with B known as the matrix of unrestricted coefficients. In terms of the underlying equilibrium relationship, the Phillips normalization expresses the system
Footnote 1: For comparison with other studies, you may wish to normalize $\beta$ differently. Using the set command you can do set vecm_norm diag to select a normalization that simply scales the columns of the original $\beta$ such that $\beta_{ij} = 1$ for $i = j$ and $i \le r$, as used in the empirical section of Boswijk and Doornik (2004). Another alternative is set vecm_norm first, which scales $\beta$ such that the elements on the first row equal 1. To suppress normalization altogether, use set vecm_norm none. (To return to the default: set vecm_norm phillips.)
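The post-multiplication by the inverse of the top block can be illustrated numerically. The following Python sketch assumes n = 3 and r = 2 with made-up coefficients; the helper names matmul, inv2 and phillips are hypothetical and the 2 x 2 inverse is written out by hand to keep the example self-contained.

```python
def matmul(A, B):
    """Ordinary matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inv2(M):
    """Inverse of a 2 x 2 matrix; fails if the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("top block must have full rank")
    return [[d / det, -b / det], [-c / det, a / det]]

def phillips(beta):
    """Post-multiply an n x 2 beta by the inverse of its top 2 x 2 block."""
    return matmul(beta, inv2(beta[:2]))

# Hypothetical unnormalized cointegration vectors (n = 3, r = 2)
beta = [[2.0, 1.0],
        [1.0, 3.0],
        [4.0, -2.0]]
beta_hat = phillips(beta)
for row in beta_hat:
    print([round(v, 4) for v in row])
```

The first two rows of the result form an identity matrix and the last row holds the remaining, unrestricted coefficients.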
9. Chapter 12. Matrix manipulation
and $z_2 = [1, i]'$.
    ? z1 = {1,2;3,4}
    z1 = {1,2;3,4}
    Generated matrix z1
    ? z2 = I(2)
    z2 = I(2)
    Generated matrix z2
    ? conj_z1 = z1 .* {1,-1}
    conj_z1 = z1 .* {1,-1}
    Generated matrix conj_z1
    ? eval cmult(z1,z2)
    eval cmult(z1,z2)
       1    2
      -4    3
    ? eval cmult(z1,conj_z1)
    eval cmult(z1,conj_z1)
       5
      25
Multiple returns and the null keyword
Some functions take one or more matrices as arguments and compute one or more matrices; these are:
    eigensym    Eigen-analysis of symmetric matrix
    eigengen    Eigen-analysis of general matrix
    mols        Matrix OLS
    qrdecomp    QR decomposition
    svd         Singular value decomposition (SVD)
The general rule is: the "main" result of the function is always returned as the result proper. Auxiliary returns, if needed, are retrieved using pre-existing matrices, which are passed to the function as pointers (see 10.4). If such values are not needed, the pointer may be substituted with the keyword null.
The syntax for qrdecomp, eigensym and eigengen is of the form
    matrix B = func(A, &C)
The first argument, A, represents the input data, that is, the matrix whose decomposition or analysis is required. The second argument must be either the name of an existing matrix preceded by & (to indicate the "address" of the matrix in question), in which case an auxiliary result is written to that matrix, or the keyword null, in which case the auxiliary result is not produced, or is discarded. I
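The convention of storing n complex numbers as an n x 2 matrix (real part, imaginary part) makes the complex product a short row-wise formula. The Python sketch below reproduces the arithmetic of the example; the Python function named cmult here is an illustrative stand-in, not gretl's code.

```python
def cmult(A, B):
    """Row-wise complex product of two n x 2 (real, imaginary) matrices."""
    out = []
    for (a, b), (c, d) in zip(A, B):
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        out.append([a * c - b * d, a * d + b * c])
    return out

z1 = [[1, 2], [3, 4]]                     # 1 + 2i and 3 + 4i
z2 = [[1, 0], [0, 1]]                     # 1 and i
conj_z1 = [[re, -im] for re, im in z1]    # complex conjugates of z1
prod = cmult(z1, z2)
norms = cmult(z1, conj_z1)
print(prod)
print(norms)
```

Multiplying a number by its conjugate gives a purely real result, so the second column of the last product is zero.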
10. "dot" operations a binary operation is applied element by element; the result of this operation is obvious if the matrices are of the same size. However, there are several other cases where such operators may be applied. For example, if we write
    matrix C = A .* B
then the result C depends on the dimensions of A and B. Let A be an m x n matrix and let B be p x q; the result is as follows:
    Case                                                       Result
    Dimensions match (m = p and n = q)                         c_ij = a_ij b_ij
    A is a column vector; rows match (m = p; n = 1)            c_ij = a_i b_ij
    B is a column vector; rows match (m = p; q = 1)            c_ij = a_ij b_i
    A is a row vector; columns match (m = 1; n = q)            c_ij = a_j b_ij
    B is a row vector; columns match (p = 1; n = q)            c_ij = a_ij b_j
    A is a column vector; B is a row vector (n = 1; p = 1)     c_ij = a_i b_j
    A is a row vector; B is a column vector (m = 1; q = 1)     c_ij = a_j b_i
    A is a scalar (m = 1 and n = 1)                            c_ij = a b_ij
    B is a scalar (p = 1 and q = 1)                            c_ij = a_ij b
If none of the above conditions are satisfied the result is undefined, and an error is flagged.
Note that this convention makes it unnecessary, in most cases, to use diagonal matrices to perform transformations by means of ordinary matrix multiplication: if Y = XV, where V is diagonal, it is computationally much more convenient to obtain Y via the instruction
    matrix Y = X .* v
where v is a row vector containing the diagonal of V.
In column-wise concatenation of an m
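The remark about diagonal matrices can be checked directly: scaling each column of X by the matching element of a row vector gives the same result as multiplying by the diagonal matrix. A small Python sketch, with hypothetical helper names and data:

```python
def dot_times(X, v):
    """Element-wise product of an m x n matrix with a 1 x n row vector."""
    return [[x * s for x, s in zip(row, v)] for row in X]

def matmul(A, B):
    """Ordinary matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

X = [[1, 2], [3, 4], [5, 6]]
v = [10, 0.5]                  # row vector holding the diagonal of V
V = [[10, 0], [0, 0.5]]        # the corresponding diagonal matrix
Y_dot = dot_times(X, v)
Y_mat = matmul(X, V)
print(Y_dot)
print(Y_mat)
```

The element-wise version touches each entry of X once, whereas the full product also multiplies by all the zeros off the diagonal of V.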
11. Chapter 22. Discrete and censored dependent variables
After estimating ordered models, the $uhat accessor yields generalized residuals as in binary models; additionally, the $yhat accessor function returns $\hat{z}_i$, so it is possible to compute an unbiased estimator of the latent variable $y^*$ simply by adding the two together.
Example 22.3: Ordered probit model
    /*
      Replicate the results in Wooldridge, Econometric Analysis of Cross
      Section and Panel Data, section 15.10, using pension-plan data from
      Papke (AER, 1998).
      The dependent variable, pctstck (percent stocks), codes the asset
      allocation responses of "mostly bonds", "mixed" and "mostly stocks"
      as {0, 50, 100}.
      The independent variable of interest is "choice", a dummy indicating
      whether individuals are able to choose their own asset allocations.
    */
    open pension.gdt
    # demographic characteristics of participant
    list DEMOG = age educ female black married
    # dummies coding for income level
    list INCOME = finc25 finc35 finc50 finc75 finc100 finc101
    # Papke's OLS approach
    ols pctstck const choice DEMOG INCOME wealth89 prftshr
    # save the OLS choice coefficient
    choice_ols = $coeff(choice)
    # estimate ordered probit
    probit pctstck choice DEMOG INCOME wealth89 prftshr
    k
22.3 Multinomial logit
When the dependent variable is not binary and does not have a natural ordering, multinomial models are used. Gretl does not provide a native implementation of these yet, but simple models
12. Chapter 3. Modes of working
or start up the command-line program, gretlcli, and consult its help, or consult the Gretl Command Reference.
If you run the script when part of it is highlighted, gretl will only run that portion. Moreover, if you want to run just the current line, you can do so by pressing Ctrl-Enter.
Clicking the right mouse button in the script editor window produces a pop-up menu. This gives you the option of executing either the line on which the cursor is located, or the selected region of the script if there's a selection in place. If the script is editable, this menu also gives the option of adding or removing comment markers from the start of the line or lines.
The gretl package includes over 70 "practice" scripts. Most of these relate to Ramanathan (2002), but they may also be used as a free-standing introduction to scripting in gretl and to various points of econometric theory. You can explore the practice files under "File, Script files, Practice file". There you will find a listing of the files along with a brief description of the points they illustrate and the data they employ. Open any file and run it to see the output. Note that long commands in a script can be broken over two or more lines, using backslash as a continuation character.
You can, if you wish, use the GUI controls and the scripting approach in tandem, exploiting each method where it offers greater convenience. Here are two suggestions.
• Open a
13. Composite Tests for the Gamma Distribution, Journal of Quality Technology 33, pp. 47-59.
Silverman, B. W. (1986) Density Estimation for Statistics and Data Analysis, London: Chapman and Hall.
Stock, James H. and Watson, Mark W. (2003) Introduction to Econometrics, Boston, MA: Addison-Wesley.
Swamy, P. A. V. B. and Arora, S. S. (1972) "The Exact Finite Sample Properties of the Estimators of Coefficients in the Error Components Regression Models", Econometrica 40, pp. 261-75.
Verbeek, Marno (2004) A Guide to Modern Econometrics, 2nd edition, New York: Wiley.
White, H. (1980) "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity", Econometrica 48, pp. 817-38.
Windmeijer, F. (2005) "A Finite Sample Correction for the Variance of Linear Efficient Two-step GMM Estimators", Journal of Econometrics 126, pp. 25-51.
Wooldridge, Jeffrey M. (2002a) Econometric Analysis of Cross Section and Panel Data, Cambridge, Mass.: MIT Press.
Wooldridge, Jeffrey M. (2002b) Introductory Econometrics, A Modern Approach, 2nd edition, Mason, Ohio: South-Western.
Yalta, A. Talha and Yalta, A. Yasemin (2007) "GRETL 1.6.0 and its numerical accuracy", Journal of Applied Econometrics 22, pp. 849-54.
14. Chapter 22. Discrete and censored dependent variables
    Likelihood ratio test: Chi-square(3) = 15.4042 (p-value 0.001502)
    Akaike information criterion (AIC) = 33.7793
    Schwarz Bayesian criterion (BIC) = 39.6422
    Hannan-Quinn criterion (HQC) = 35.7227

               Predicted
                 0    1
      Actual 0  18    3
             1   3    8

    Model 2: Probit estimates using the 32 observations 1-32
    Dependent variable: GRADE

      VARIABLE   COEFFICIENT   STDERROR    T STAT   SLOPE (at mean)
      const      -7.45232      2.54247     -2.931
      GPA         1.62581      0.693883     2.343    0.533347
      TUCE        0.0517288    0.0838903    0.617    0.0169697
      PSI         1.42633      0.595038     2.397    0.467908

    Mean of GRADE = 0.344
    Number of cases 'correctly predicted' = 26 (81.2%)
    f(beta'x) at mean of independent vars = 0.328
    McFadden's pseudo-R-squared = 0.377478
    Log-likelihood = -12.8188
    Likelihood ratio test: Chi-square(3) = 15.5459 (p-value 0.001405)
    Akaike information criterion (AIC) = 33.6376
    Schwarz Bayesian criterion (BIC) = 39.5006
    Hannan-Quinn criterion (HQC) = 35.581

               Predicted
                 0    1
      Actual 0  18    3
             1   3    8
In this context, the $uhat accessor function takes a special meaning: it returns generalized residuals as defined in Gourieroux et al. (1987), which can be interpreted as unbiased estimators of the latent disturbances. These are defined as
$u_i = y_i - \hat{P}_i$ for the logit model,
$u_i = y_i \cdot \dfrac{\phi(\hat{z}_i)}{\Phi(\hat{z}_i)} - (1 - y_i) \cdot \dfrac{\phi(\hat{z}_i)}{1 - \Phi(\hat{z}_i)}$ for the probit model.  (22.7)
Among other uses, generalized residuals are often used for diagnostic purposes. For example, it is very e
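The two definitions of the generalized residual can be written out as small functions. The Python sketch below follows the standard Gourieroux et al. form for logit and probit; the function names are hypothetical, and the fitted index and fitted probability are supplied by hand rather than taken from an estimated model.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def logit_generalized_resid(y, p_hat):
    """Logit case: outcome minus fitted probability."""
    return y - p_hat

def probit_generalized_resid(y, z_hat):
    """Probit case: y*phi/Phi - (1 - y)*phi/(1 - Phi), evaluated at the fitted index."""
    pdf = _STD_NORMAL.pdf(z_hat)
    cdf = _STD_NORMAL.cdf(z_hat)
    return y * pdf / cdf - (1 - y) * pdf / (1 - cdf)

# Hypothetical fitted values: index 0 (probability one half) for two observations
u_one = probit_generalized_resid(1, 0.0)
u_zero = probit_generalized_resid(0, 0.0)
print(round(u_one, 6), round(u_zero, 6))
print(logit_generalized_resid(1, 0.25))
```

At a fitted index of zero the two probit residuals are equal in size and opposite in sign, as the symmetry of the normal density suggests.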
15. Chapter 2. Getting started
[Figure 2.1: Practice data files window, listing the files data2-1 through data3-9 with short descriptions, e.g. data3-6, "Disposable income and consumption"]
If you select a row in this window and click on "Info" this opens a window showing information on the data set in question (for example, on the sources and definitions of the variables). If you find a file that is of interest, you may open it by clicking on "Open", or just double-clicking on the file name. For the moment let's open data3-6.
Note: In gretl windows containing lists, double-clicking on a line launches a default action for the associated list entry: e.g. displaying the values of a data series, opening a file.
Footnote 1: For convenience I will refer to the graphical client program simply as gretl in this manual. Note, however, that the specific name of the program differs according to the computer platform. On Linux it is called gretl_x11 while on MS Windows it is gretlw32.exe. On Linux systems a wrapper script named gretl is also installed; see also the Gretl Command Reference.
This f
16. Chapter 6. Sub-sampling a dataset
setting it. In the case of setting, the program merely records the starting and ending observations and uses these as parameters to the various commands calling for the estimation of models, the computation of statistics, and so on. In the case of restriction, the program makes a reduced copy of the dataset and by default treats this reduced copy as a simple, undated cross-section.[1]
If you wish to re-impose a time-series or panel interpretation of the reduced dataset you can do so using the setobs command, or the GUI menu item "Data, Dataset structure".
Footnote 1: With one exception: if you start with a balanced panel dataset and the restriction is such that it preserves a balanced panel (for example, it results in the deletion of all the observations for one cross-sectional unit) then the reduced dataset is still, by default, treated as a panel.
The fact that restricting the sample results in the creation of a reduced copy of the original dataset may raise an issue when the dataset is very large (say, several thousands of observations). With such a dataset in memory, the creation of a copy may lead to a situation where the computer runs low on memory for calculating regression results. You can work around this as follows:
1. Open the full data set, and impose the sample restriction.
2. Save a copy of the reduced data set to disk.
3. Close the full dataset and open the reduced one.
4. Proceed
17.     $vcv     covariance matrix of parameter estimates
        $yhat    matrix of fitted values
Table 12.4: Matrix accessors for model data
Many of the accessors in Table 12.4 behave somewhat differently depending on the sort of model that is referenced, as follows:
• Single-equation models: $sigma gets a scalar (the standard error of the residuals); $uhat and $yhat get series.
• All system estimators: $sigma gets the cross-equation residual covariance matrix; $uhat gets a matrix of residuals, one column per equation.
• VARs and VECMs: $stderr and $yhat are not available; $coeff gets a matrix of coefficients, one column per equation.
If the accessors are given without any prefix, they retrieve results from the last model estimated, if any. Alternatively, they may be prefixed with the name of a saved model plus a period (.), in which case they retrieve results from the specified model. Here are some examples:
    matrix u = $uhat
    matrix b = m1.$coeff
    matrix v2 = m1.$vcv[1:2,1:2]
The first command grabs the residuals from the last model; the second grabs the coefficient vector from model m1; and the third (which uses the mechanism of sub-matrix selection described above) grabs a portion of the covariance matrix from model m1.
If the model in question is a VAR or VECM (only), $compan returns the companion matrix.
After a vector error correction model is estimated via Johansen's procedure, the matrices $jalpha and $jbeta are also available. These have a number of
18. Chapter 25. Gretl and R
                 -21.587     27.0293    -0.7987    0.4430
      baths      -12.192     43.2500    -0.2819    0.7838
Footnote 1: R's homepage is at http://www.r-project.org/.
Footnote 2: OK, who are we kidding? But it's friendly competition!
Footnote 3: The main reference for R documentation is http://cran.r-project.org/manuals.html. In addition, R tutorials abound on the Net; as always, Google is your friend.
We will now replicate the above results using R. Select the menu item "Tools, Start GNU R". A window similar to the one shown in figure 25.1 should appear.
    R version 2.7.0 (2008-04-22)
    Copyright (C) 2008 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > # load data from gretl
    > gretldata <- read.table("/home/jack/.gretl/Rdata.tmp", header=TRUE)
    > attach(gretldata)
    >
Figure 25.1: R window
The actual look of the R window may be somewhat different from what you see in Figure 25.1 (especially f
19. Chapter 19. Model selection criteria
$\mathrm{AIC} = -2\,\ell(\hat{\theta}) + 2k$  (19.1)
where $\ell(\hat{\theta})$ represents the maximum loglikelihood as a function of the vector of parameter estimates, $\hat{\theta}$, and k (as above) denotes the number of "independently adjusted parameters within the model." In this formulation, with AIC negatively related to the likelihood and positively related to the number of parameters, the researcher seeks the minimum AIC.
The AIC can be confusing, in that several variants of the calculation are "in circulation." For example, Davidson and MacKinnon (2004) present a simplified version,
$\mathrm{AIC} = \ell(\hat{\theta}) - k$
which is just $-1/2$ times the original: in this case, obviously, one wants to maximize AIC.
In the case of models estimated by least squares, the loglikelihood can be written as
$\ell(\hat{\theta}) = -\frac{n}{2}\left(1 + \log 2\pi - \log n\right) - \frac{n}{2}\log \mathrm{SSR}$  (19.2)
Substituting (19.2) into (19.1) we get
$\mathrm{AIC} = n\left(1 + \log 2\pi - \log n\right) + n \log \mathrm{SSR} + 2k$
which can also be written as
$\mathrm{AIC} = n \log\left(\frac{\mathrm{SSR}}{n}\right) + 2k + n\left(1 + \log 2\pi\right)$  (19.3)
Some authors simplify the formula for the case of models estimated via least squares. For instance, William Greene writes
$\mathrm{AIC}_G = \log\left(\frac{\mathrm{SSR}}{n}\right) + \frac{2k}{n}$  (19.4)
This variant can be derived from (19.3) by dividing through by n and subtracting the constant $1 + \log 2\pi$. That is, writing $\mathrm{AIC}_G$ for the version given by Greene, we have
$\mathrm{AIC}_G = \frac{1}{n}\mathrm{AIC} - \left(1 + \log 2\pi\right)$
Finally, Ramanathan gives a further variant:
$\mathrm{AIC}_R = \left(\frac{\mathrm{SSR}}{n}\right) e^{2k/n}$
which is the exponentia
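The relations among these variants are easy to confirm numerically. The Python sketch below codes equations (19.1), (19.2) and (19.4) together with Ramanathan's version and checks the two stated links; the function names and the sample values of SSR, n and k are hypothetical.

```python
import math

def loglik_ls(ssr, n):
    """Least-squares loglikelihood, equation (19.2)."""
    return -0.5 * n * (1 + math.log(2 * math.pi) - math.log(n)) - 0.5 * n * math.log(ssr)

def aic(ssr, n, k):
    """Original form, equation (19.1): -2 * loglikelihood + 2k."""
    return -2 * loglik_ls(ssr, n) + 2 * k

def aic_greene(ssr, n, k):
    """Greene's variant, equation (19.4)."""
    return math.log(ssr / n) + 2 * k / n

def aic_ramanathan(ssr, n, k):
    """Ramanathan's variant: (SSR/n) * exp(2k/n)."""
    return (ssr / n) * math.exp(2 * k / n)

ssr, n, k = 125.0, 50, 3      # hypothetical regression summary
lhs = aic_greene(ssr, n, k)
rhs = aic(ssr, n, k) / n - (1 + math.log(2 * math.pi))
print(abs(lhs - rhs) < 1e-12)
print(abs(math.log(aic_ramanathan(ssr, n, k)) - lhs) < 1e-12)
```

Because the variants differ only by monotonic transformations for fixed n, they rank models estimated on the same sample identically.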
20.                                      30.087   0.0286
       1    0.17758    19.057   0.7833   10.362   0.8017
       2    0.11279    8.6950   0.7645   6.3427   0.7483
       3    0.043411   2.3522   0.7088   2.3522   0.7076
Both the trace and $\lambda$-max tests accept the null hypothesis that the smallest eigenvalue is 0 (see the last row of the table), so we may conclude that the series are in fact non-stationary. However, some linear combination may be I(0), since the $\lambda$-max test rejects the hypothesis that the rank of $\Pi$ is 0 (though the trace test gives less clear-cut evidence for this, with a p-value of 0.1284).
21.5 Identification of the cointegration vectors
The core problem in the estimation of equation (21.2) is to find an estimate of $\Pi$ that has by construction rank r, so it can be written as $\Pi = \alpha\beta'$, where $\beta$ is the matrix containing the cointegration vectors and $\alpha$ contains the "adjustment" or "loading" coefficients whereby the endogenous variables respond to deviation from equilibrium in the previous period.
Without further specification, the problem has multiple solutions (in fact, infinitely many). The parameters $\alpha$ and $\beta$ are under-identified: if all columns of $\beta$ are cointegration vectors, then any arbitrary linear combination of those columns is a cointegration vector too. To put it differently, if $\Pi = \alpha_0\beta_0'$ for specific matrices $\alpha_0$ and $\beta_0$, then $\Pi$ also equals $(\alpha_0 Q)(Q^{-1}\beta_0')$ for any conformable non-singular matrix Q. In order to find a unique solution, it is therefore necessary to impose some restrictions on
21. Appendix B. Data import via ODBC
Footnote 3: See http://www.almeopedia.com/index.php/Verduria.
    Table Consump                  Table DATA
    Field     Type                 Field     Type
    time      decimal(7,2)         year      decimal(4,0)
    income    decimal(16,6)        qtr       decimal(1,0)
    consump   decimal(16,6)        varname   varchar(16)
                                   xval      decimal(20,10)
Table B.3: Example AWM database, structure
    Table Consump
    time      income          consump
    1970.00   424278.975500   344746.944000
    1970.25   433218.709400   350176.890400
    1970.50   440954.219100   355249.672300
    1970.75   446278.664700   361794.719900
    1971.00   447752.681800   362489.970500
    1971.25   453553.860100   368313.558500
    1971.50   460115.133100   372605.015300

    Table DATA
    year   qtr   varname   xval
    1970   1     CAN       517.9085000000
    1970   2     CAN       662.5996000000
    1970   3     CAN       1130.4155000000
    1970   4     CAN       467.2508000000
    1970   1     COMPR     18.4000000000
    1970   2     COMPR     18.6341000000
    1970   3     COMPR     18.3000000000
    1970   4     COMPR     18.2663000000
    1970   1     D1        1.0000000000
    1970   2     D1        0.0000000000
Table B.4: Example AWM database, data
Example B.1 shows two elementary queries: first we set up an empty quarterly dataset. Then we connect to the database by the open statement. Once the connection is established we retrieve, one column at a time, the data from the Consump table. In this case, no observation string is necessary because the data are already arranged in a matrix-like structure, so we only need to bring over the relevant columns.
In example B.2, on the contrary, we make use
22. Chapter 21. Cointegration and Vector Error Correction Models
Example 21.2: Further testing of money demand system
Input:
    restrict
      b[1,5] = -1
    end restrict
    genr ll_uie = $rlnl

    restrict
      b[2,3] = -1
    end restrict
    genr ll_hfh = $rlnl

    # replicate table 5, page 824
    printf "Testing zero restrictions in cointegration space:\n"
    printf "  LR-test, rank = 3: chi^2(3) = %6.4f [%6.4f]\n", 2*(ll0-ll1), \
      pvalue(X, 3, 2*(ll0-ll1))

    printf "Unit income elasticity: LR-test, rank = 3:\n"
    printf "  chi^2(4) = %g [%6.4f]\n", 2*(ll0-ll_uie), \
      pvalue(X, 4, 2*(ll0-ll_uie))

    printf "Homogeneity in the Fisher hypothesis:\n"
    printf "  LR-test, rank = 3: chi^2(4) = %6.3f [%6.4f]\n", 2*(ll0-ll_hfh), \
      pvalue(X, 4, 2*(ll0-ll_hfh))
Output:
    Testing zero restrictions in cointegration space:
      LR-test, rank = 3: chi^2(3) = 1.4763 [0.6877]
    Unit income elasticity: LR-test, rank = 3:
      chi^2(4) = 17.2071 [0.0018]
    Homogeneity in the Fisher hypothesis:
      LR-test, rank = 3: chi^2(4) = 15.547 [0.0037]
Another type of test that is commonly performed is the "weak exogeneity" test. In this context, a variable is said to be weakly exogenous if all coefficients on the corresponding row in the $\alpha$ matrix are zero. If this is the case, that variable does not adjust to deviations from any of the long-run equilibria and can be considered an autonomous driving force of the whole system.
The code in Example 21.3 performs this test for each variable in turn, thus replicati
23. Chapter 5. Special functions in genr
Fractional differencing
The concept of differencing a time series d times is pretty obvious when d is an integer; it may seem odd when d is fractional. However, this idea has a well-defined mathematical content: consider the function
$f(z) = (1 - z)^{-d}$,
where z and d are real numbers. By taking a Taylor series expansion around z = 0, we see that
$f(z) = 1 + dz + \frac{d(d+1)}{2} z^2 + \cdots$
or, more compactly,
$f(z) = 1 + \sum_{i=1}^{\infty} \psi_i z^i$
with
$\psi_k = \frac{\prod_{i=1}^{k} (d + i - 1)}{k!} = \psi_{k-1}\,\frac{d + k - 1}{k}$
The same expansion can be used with the lag operator, so that if we defined
$Y_t = (1 - L)^{0.5} X_t$
this could be considered shorthand for
$Y_t = X_t - 0.5 X_{t-1} - 0.125 X_{t-2} - 0.0625 X_{t-3} - \cdots$
In gretl this transformation can be accomplished by the syntax
    genr Y = fracdiff(X, 0.5)
The Hodrick-Prescott filter
This filter is accessed using the hpfilt() function, which takes one argument, the name of the variable to be processed.
A time series $y_t$ may be decomposed into a trend or growth component $g_t$ and a cyclical component $c_t$:
$y_t = g_t + c_t, \quad t = 1, 2, \ldots, T$
The Hodrick-Prescott filter effects such a decomposition by minimizing the following:
$\sum_{t=1}^{T} (y_t - g_t)^2 + \lambda \sum_{t=2}^{T-1} \left( (g_{t+1} - g_t) - (g_t - g_{t-1}) \right)^2$
The first term above is the sum of squared cyclical components $c_t = y_t - g_t$. The second term is a multiple $\lambda$ of the sum of squares of the trend component's second differences. This second term penalizes variations in the growth rate of the trend component
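The recursion for the expansion weights can be checked against the coefficients quoted for the half-difference. The Python sketch below computes the weights of $(1 - z)^{-d}$ and applies them as a one-sided filter; the function names are hypothetical, and truncating the sum at the start of the sample is an assumption of this sketch rather than a statement about how gretl's fracdiff treats pre-sample values.

```python
def psi_weights(d, kmax):
    """Coefficients of z^k, k = 0..kmax, in the expansion of (1 - z)**(-d)."""
    psi = [1.0]
    for k in range(1, kmax + 1):
        # psi_k = psi_{k-1} * (d + k - 1) / k
        psi.append(psi[-1] * (d + k - 1) / k)
    return psi

def frac_diff(x, power):
    """Apply (1 - L)**power to x, using only observations inside the sample."""
    w = psi_weights(-power, len(x) - 1)   # (1 - L)**power corresponds to d = -power
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]

weights = psi_weights(-0.5, 3)
print(weights)
x = [1.0, 2.0, 4.0, 7.0]                  # hypothetical series
print(frac_diff(x, 1.0))
```

With power equal to 1 the weights collapse to 1, -1, 0, 0, ..., so the filter reduces to the ordinary first difference (apart from the first observation, which has no predecessor in the sample).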
24.     **    Kronecker product
        =     test for equality
In addition, the following operators ("dot" operators) apply on an element-by-element basis:
    .*   ./   .^   .=   .>   .<
Here are explanations of the less obvious cases.
For matrix addition and subtraction, in general the two matrices have to be of the same dimensions but an exception to this rule is granted if one of the operands is a 1 x 1 matrix or scalar. The scalar is implicitly promoted to the status of a matrix of the correct dimensions, all of whose elements are equal to the given scalar value. For example, if A is an m x n matrix and k a scalar, then the commands
    matrix C = A + k
    matrix D = A - k
both produce m x n matrices, with elements $c_{ij} = a_{ij} + k$ and $d_{ij} = a_{ij} - k$ respectively.
By "pre-multiplication by transpose" we mean, for example, that
    matrix C = X'Y
produces the product of X-transpose and Y. In effect, the expression X'Y is shorthand for X'*Y (which is also valid).
In matrix "division", the statement
    matrix C = A / B
is interpreted as a request to find the matrix C that solves BC = A. If B is a square matrix, this is treated as equivalent to $B^{-1}A$, which fails if B is singular; the numerical method employed here is the LU decomposition. If B is a T x k matrix with T > k, then C is the least-squares solution, $C = (B'B)^{-1}B'A$, which fails if $B'B$ is singular; the numerical method employed here is the QR decomposition. Otherwise the operation necessarily fails.
In
25. The attraction of the Vector Error Correction Model (VECM) is that it allows the researcher to embed a representation of economic equilibrium relationships within a relatively rich time-series specification. This approach overcomes the old dichotomy between (a) structural models that faithfully represented macroeconomic theory but failed to fit the data, and (b) time-series models that were accurately tailored to the data but difficult if not impossible to interpret in economic terms.

The basic idea of cointegration relates closely to the concept of unit roots (see section 20.3). Suppose we have a set of macroeconomic variables of interest, and we find we cannot reject the hypothesis that some of these variables, considered individually, are non-stationary. Specifically, suppose we judge that a subset of the variables are individually integrated of order 1, or I(1). That is, while they are non-stationary in their levels, their first differences are stationary. Given the statistical problems associated with the analysis of non-stationary data (for example, the threat of spurious regression), the traditional approach in this case was to take first differences of all the variables before proceeding with the analysis.

But this can result in the loss of important information. It may be that while the variables in question are I(1) when taken individually, there exists a linear combination of the variables that is stationary without differencing, or I
26. all cases with income over 50000, or just women with income over 50000? By default, in a gretl script, the answer is the latter: women with income over 50000. The second restriction augments the first, or in other words the final restriction is the logical product of the new restriction and any restriction that is already in place. If you want a new restriction to replace any existing restrictions you can first recreate the full dataset using

    smpl --full

Alternatively, you can add the replace option to the smpl command:

    smpl income>50000 --restrict --replace

This option has the effect of automatically re-establishing the full dataset before applying the new restriction.

Unlike a simple "setting" of the sample, "restricting" the sample may result in selection of non-contiguous observations from the full data set. It may also change the structure of the data set.

This can be seen in the case of panel data. Say we have a panel of five firms (indexed by the variable firm) observed in each of several years (identified by the variable year). Then the restriction

    smpl year=1995 --restrict

produces a dataset that is not a panel, but a cross-section for the year 1995. Similarly

    smpl firm=3 --restrict

produces a time-series dataset for firm number 3.

For these reasons (possible non-contiguity in the observations, possible change in the structure of the data), gretl acts differently when you restrict the sample as opposed to simply
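The difference between stacking restrictions and replacing them can be mimicked on a toy dataset. This is a hedged Python sketch of the logic only: the records, the field names and the restrict helper are all hypothetical and have nothing to do with gretl's internals.

```python
# Toy "full dataset": four hypothetical records.
rows = [
    {"gender": "F", "income": 62000},
    {"gender": "M", "income": 71000},
    {"gender": "F", "income": 38000},
    {"gender": "M", "income": 45000},
]


def restrict(sample, condition, full=None, replace=False):
    """Filter the current sample; with replace=True start again from `full`."""
    base = full if replace else sample
    return [r for r in base if condition(r)]


# First restriction: women only.
women = restrict(rows, lambda r: r["gender"] == "F")

# Default behaviour: the new restriction is combined (logical AND)
# with the restriction already in place.
rich_women = restrict(women, lambda r: r["income"] > 50000)

# Replace behaviour: the full dataset is restored before restricting.
rich_all = restrict(women, lambda r: r["income"] > 50000,
                    full=rows, replace=True)

print(len(rich_women), len(rich_all))
```

The stacked restriction keeps one record (a woman with income over 50000), while the replacing restriction keeps both records with income over 50000, whatever the gender.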
27. are estimated. The two commands that gretl offers for estimating these systems are coint2 and vecm, respectively.

The syntax for coint2 is

    coint2 p ylist [ ; xlist [ ; zlist ] ]

Chapter 21. Cointegration and Vector Error Correction Models

where p is the number of lags in (21.1); ylist is a list containing the $y_t$ variables; xlist is an optional list of exogenous variables; and zlist is another optional list of exogenous variables whose effects are assumed to be confined to the cointegrating relationships.

The syntax for vecm is

    vecm p r ylist [ ; xlist [ ; zlist ] ]

where p is the number of lags in (21.1); r is the cointegration rank; and the lists ylist, xlist and zlist have the same interpretation as in coint2.

Both commands can be given specific options to handle the treatment of the deterministic component $\mu_t$. These are discussed in the following section.

21.3 Interpretation of the deterministic components

Statistical inference in the context of a cointegrated system depends on the hypotheses one is willing to make on the deterministic terms, which leads to the famous "five cases."

In equation (21.2), the term $\mu_t$ is usually understood to take the form

\[ \mu_t = \mu_0 + \mu_1 \cdot t. \]

In order to have the model mimic as closely as possible the features of the observed data, there is a preliminary question to settle. Do the data appear to follow a deterministic trend? If so, is it linear or quadratic?

Once this is established, one
28. as in

    list X = M

where M is a matrix. The matrix must be suitable for conversion; that is, it must be a row or column vector containing non-negative whole-number values, none of which exceeds the highest ID number of a variable (series or scalar) in the current dataset.

Chapter 12. Matrix manipulation

Example 12.3 illustrates the use of this sort of conversion to "normalize" a list, moving the constant (variable 0) to first position.

Example 12.3: Manipulating a list

    function void normalize_list (matrix *x)
        # If the matrix (representing a list) contains var 0,
        # but not in first position, move it to first position
        if (x[1] != 0)
            scalar k = cols(x)
            loop for (i=2; i<=k; i++) --quiet
                if (x[i] == 0)
                    x[i] = x[1]
                    x[1] = 0
                    break
                endif
            endloop
        endif
    end function

    open data9-7
    list Xl = 2 3 0 4
    matrix x = Xl
    normalize_list(&x)
    list Xl = x

12.11 Deleting a matrix

To delete a matrix, just write

    delete M

where M is the name of the matrix to be deleted.

12.12 Printing a matrix

To print a matrix, the easiest way is to give the name of the matrix in question on a line by itself, which is equivalent to using the print command:

    matrix M = mnormal(100,2)
    M
    print M

You can get finer control on the formatting of output by using the printf command: for example, the following code

    matrix Id = I(2)
    print Id
    printf "%10.3f", Id

produces

    ? print Id
    Id (2 x 2)

      1   0
      0   1

    ? print
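For readers more familiar with general-purpose languages, the list-normalizing logic of Example 12.3 can be sketched in Python as follows. This is an illustrative translation under the assumption that a gretl list is represented as a plain Python list of integer ID numbers; it returns a new list rather than modifying its argument in place.

```python
def normalize_list(ids):
    """Return a copy of ids with ID 0 (the constant) swapped into first place."""
    out = list(ids)
    if out and out[0] != 0 and 0 in out:
        i = out.index(0)
        # Same swap as in the gretl function: x[i] = x[1]; x[1] = 0
        out[i], out[0] = out[0], 0
    return out


print(normalize_list([2, 3, 0, 4]))
```

Applied to the list 2 3 0 4 from the example, the constant trades places with the first member, giving 0 3 2 4. A list that already starts with 0, or that does not contain 0 at all, is returned unchanged.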
29. benchmarks page, http://www.stanford.edu/~clint/bench/.

Chapter 15. Panel data

assumption that the covariance matrix of the $\eta_{i,t}$ terms is proportional to

\[ \begin{bmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \cdots & 0 \\ 0 & -1 & 2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 2 \end{bmatrix} \]

as should be the case if the disturbances in the original model $u_{i,t}$ were homoskedastic and uncorrelated. This yields a consistent, but not necessarily efficient, estimator.

Step 2 uses the parameters estimated in step 1 to compute an estimate of the covariance of the $\eta_{i,t}$, and re-estimates the parameters based on that. This procedure has the double effect of handling heteroskedasticity and/or serial correlation, plus producing estimators that are asymptotically efficient.

One-step estimators have sometimes been preferred on the grounds that they are more robust. Moreover, computing the covariance matrix of the 2-step estimator via the standard GMM formulae has been shown to produce grossly biased results in finite samples. Gretl, however, implements the finite-sample correction devised by Windmeijer (2005), so standard errors for the 2-step estimator can be considered relatively accurate.

By default, gretl's arbond command estimates the parameters in

\[ A(L) y_{i,t} = X_{i,t} \beta + v_i + u_{i,t} \]

via the 1-step procedure. The dependent variable is automatically differenced (but note that the right-hand side variables are not automatically differenced), and all available instruments are used. However, these choices (plus some others) can be overridden
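The tridiagonal pattern (2 on the diagonal, -1 next to it) follows directly from first-differencing uncorrelated, equal-variance disturbances: if D is the differencing matrix, the covariance of Du is proportional to DD'. The short Python sketch below verifies this for a small case. The function names are hypothetical and the code is a check of the algebra, not of gretl's implementation.

```python
def differencing_matrix(n):
    """n x (n+1) matrix D such that D*u holds the n first differences of u."""
    return [[-1 if c == r else 1 if c == r + 1 else 0 for c in range(n + 1)]
            for r in range(n)]


def times_transpose(a):
    """Return a * a' for a matrix stored as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in a]
            for row_i in a]


def first_difference_cov(n):
    """Covariance pattern of n successive first differences (unit variance)."""
    return times_transpose(differencing_matrix(n))


for row in first_difference_cov(4):
    print(row)
```

Each differenced term shares one underlying disturbance with its neighbour, with opposite sign, which is where the -1 entries come from. Differences two or more periods apart share nothing, hence the zeros.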
30. command you can specify a custom row format using the --format flag. The format string must be enclosed in double quotes and must be tied to the flag with an equals sign. The pattern for the format string is as follows. There are four fields, representing the coefficient, standard error, t-ratio and p-value respectively. These fields should be separated by vertical bars; they may contain a printf-type specification for the formatting of the numeric value in question, or may be left blank to suppress the printing of that column (subject to the constraint that you can't leave all the columns blank). Here are a few examples:

    --format="%.4f|%.4f|%.4f|%.4f"
    --format="%.4f|%.4f|%.3f|"
    --format="%.5f|%.4f||%.4f"
    --format="%.8g|%.8g||%.4f"

The first of these specifications prints the values in all columns using 4 decimal places. The second suppresses the p-value and prints the t-ratio to 3 places. The third omits the t-ratio. The last one again omits the t, and prints both coefficient and standard error to 8 significant figures.

Once you set a custom format in this way, it is remembered and used for the duration of the gretl session. To revert to the default formatting you can use the special variant --format=default.

Further editing

Once you have pasted gretl's TeX output into your own document, or saved it to file and opened it in an editor, you can of course modify the material in any way you wish. In some cases, machine-generated TeX is hard
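To make the four-field pattern concrete, here is a small Python sketch of how such a format string could be interpreted. It is an illustration of the rule described in the text (four bar-separated fields, blank means "suppress", not all blank); the format_row function is hypothetical and is not how gretl itself parses the flag.

```python
def format_row(spec, coeff, se, tratio, pvalue):
    """Format one coefficient row according to a 'c|se|t|p' format string."""
    fields = spec.split("|")
    if len(fields) != 4:
        raise ValueError("format must have exactly four fields")
    if not any(fields):
        raise ValueError("at least one column must be printed")
    values = (coeff, se, tratio, pvalue)
    # A blank field suppresses the corresponding column.
    return [f % v for f, v in zip(fields, values) if f]


row = (1.23456, 0.5, 2.46912, 0.0139)   # made-up numbers
print(format_row("%.4f|%.4f|%.3f|", *row))
print(format_row("%.5f|%.4f||%.4f", *row))
```

With the second of the example formats the p-value column disappears and the t-ratio is shown to three places; with the third, the t-ratio column is dropped instead.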
31. o world

Chapter 12. Matrix manipulation

Together with the other two basic types of data (series and scalars), gretl offers a quite comprehensive array of matrix methods. This chapter illustrates the peculiarities of matrix syntax and discusses briefly some of the more complex matrix functions. For a full listing of matrix functions and a comprehensive account of their syntax, please refer to the Gretl Command Reference.

12.1 Creating matrices

Matrices can be created using any of these methods:

1. By direct specification of the scalar values that compose the matrix: in numerical form, by reference to pre-existing scalar variables, or using computed values.
2. By providing a list of data series.
3. By providing a named list of series.
4. Using a formula of the same general type that is used with the genr command, whereby a new matrix is defined in terms of existing matrices and/or scalars, or via some special functions.

To specify a matrix directly in terms of scalars, the syntax is, for example:

    matrix A = { 1, 2, 3 ; 4, 5, 6 }

The matrix is defined by rows; the elements on each row are separated by commas and the rows are separated by semi-colons. The whole expression must be wrapped in braces. Spaces within the braces are not significant. The above expression defines a 2 x 3 matrix. Each element should be a numerical value, the name of a scalar variable, or an expression that evaluates to a scalar. Directly after the closing brace you can
32. robust variant is computed. The documentation for the set command explains the specific options available in this regard.

Since NLS results are asymptotic, there is room for debate over whether or not a correction for degrees of freedom should be applied when calculating the standard error of the regression (and the standard errors of the parameter estimates). For comparability with OLS, and in light of the reasoning given in Davidson and MacKinnon (1993), the estimates shown in gretl do use a degrees of freedom correction.

16.7 Numerical accuracy

Table 16.1 shows the results of running the gretl NLS procedure on the 27 Statistical Reference Datasets made available by the U.S. National Institute of Standards and Technology (NIST) for testing nonlinear regression software. For each dataset, two sets of starting values for the parameters are given in the test files, so the full test comprises 54 runs. Two full tests were performed, one using all analytical derivatives and one using all numerical approximations. In each case the default tolerance was used.

Chapter 16. Nonlinear least squares

Out of the 54 runs, gretl failed to produce a solution in 4 cases when using analytical derivatives, and in 5 cases when using numeric approximation. Of the four failures

Footnotes:
1 On a 32-bit Intel Pentium machine a likely value for this parameter is $1.82 \times 10^{-12}$.
2 For a discussion of gretl's accuracy in the estimation of linear models, see Appendix D.
33. say T? The trouble with this is that the resulting $\hat{\Sigma}$ may not be a positive definite matrix. In practical terms, we may end up with negative estimated variances. One solution to this problem is offered by the Newey-West estimator (Newey and West, 1987), which assigns declining weights to the sample autocovariances as the temporal separation increases.

To understand this point it is helpful to look more closely at the covariance matrix given in (14.5), namely,

\[ (X'X)^{-1} (X' \hat{\Omega} X) (X'X)^{-1} \]

This is known as a "sandwich" estimator. The bread, which appears on both sides, is $(X'X)^{-1}$. This is a $k \times k$ matrix, and is also the key ingredient in the computation of the classical covariance matrix. The filling in the sandwich is

\[ \underset{(k \times k)}{\hat{\Sigma}} = \underset{(k \times T)}{X'} \; \underset{(T \times T)}{\hat{\Omega}} \; \underset{(T \times k)}{X} \]

Since $\Omega = E(uu')$, the matrix being estimated here can also be written as

\[ \Sigma = E(X'u\,u'X) \]

which expresses $\Sigma$ as the long-run covariance of the random k-vector $X'u$.

From a computational point of view, it is not necessary or desirable to store the (potentially very large) $T \times T$ matrix $\hat{\Omega}$ as such. Rather, one computes the sandwich filling by summation as

\[ \hat{\Sigma} = \hat{\Gamma}(0) + \sum_{j=1}^{p} w_j \left( \hat{\Gamma}(j) + \hat{\Gamma}'(j) \right) \]

Chapter 14. Robust covariance matrix estimation

where the $k \times k$ sample autocovariance matrix $\hat{\Gamma}(j)$, for $j \geq 0$, is given by

\[ \hat{\Gamma}(j) = \frac{1}{T} \sum_{t=j+1}^{T} \hat{u}_t \hat{u}_{t-j} X_t' X_{t-j} \]

and $w_j$ is the weight given to the autocovariance at lag $j > 0$.

This leaves two questions. How exactly do we determine the maximum la
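The summation form of the sandwich filling is easy to sketch for the special case of a single regressor (k = 1), where each autocovariance is a scalar and the sum of an autocovariance and its transpose is just twice its value. The Python code below is illustrative only: the function names are hypothetical, and the Bartlett weights w_j = 1 - j/(p+1) are an assumption (they are the weights usually associated with Newey and West, but the excerpt does not spell out the weighting scheme).

```python
def gamma_hat(u, x, j):
    """Sample autocovariance of x_t * u_t at lag j (scalar case, k = 1)."""
    T = len(u)
    return sum(u[t] * u[t - j] * x[t] * x[t - j] for t in range(j, T)) / T


def hac_filling(u, x, p):
    """Sandwich 'filling' with maximum lag p and Bartlett weights (assumed)."""
    total = gamma_hat(u, x, 0)
    for j in range(1, p + 1):
        w = 1.0 - j / (p + 1.0)                  # declining weight
        total += w * 2.0 * gamma_hat(u, x, j)    # Gamma(j) + Gamma(j)' when k = 1
    return total


# Made-up residuals with strong negative first-order autocorrelation
u = [1.0, -1.0, 1.0, -1.0]
x = [1.0, 1.0, 1.0, 1.0]
print(hac_filling(u, x, 0), hac_filling(u, x, 1))
```

With p = 0 only the lag-zero term survives, which corresponds to a heteroskedasticity-only filling. Adding the first lag with a weight of one half pulls the estimate down here because the residuals alternate in sign, and the result stays non-negative.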
34. select the "New script/R script" menu entry.

Figure 25.3: Editing window for R scripts. The script shown in the editor reads:

    lg <- gretldata[, "lg"]
    arima(lg, c(0,1,1), seasonal=c(0,1,1))

In either case, you are presented with a window very similar to the editor window used for ordinary gretl scripts, as in Figure 25.3.

There are two main differences. First, you get syntax highlighting for R's syntax instead of gretl's. Second, clicking on the Execute button (the gears icon) launches an instance of R in which your commands are executed. Before R is actually run, you are asked if you want to run R interactively or not (see Figure 25.4).

An interactive run opens an R instance similar to the one seen in the previous section: your data will be pre-loaded (if the "pre-load data" box is checked) and your commands will be executed. Once this is done, you will find yourself at the R prompt, where you can enter more commands.

A non-interactive run, on the other hand, will execute your script, collect the output from R and present it to you in an output window; R will be run in the background. If, for example, the script in Figure 25.3 is run non-interactively, a window similar to Figure 25.5 will appear.

25.4 Taking stuff back and forth

As regards the passing of data between the two programs, so far we have only considered passing series from gretl to R. In order to achieve a satisfactory degree of interoperability, more is needed. In the foll
35. so you have an infinite loop unless you arrange for some other way out, such as a break statement.

If the initialization expression in a for loop takes the common form of setting a scalar variable to a given value, the string representation of that scalar's value will be available within the loop via the accessor $varname.

9.3 Progressive mode

If the --progressive option is given for a command loop, special behavior is invoked for certain commands, namely print, store and simple estimation commands. By "simple" here we mean commands which (a) estimate a single equation (as opposed to a system of equations) and (b) do so by means of a single command statement (as opposed to a block of statements, as with nls and mle). The paradigm is ols; other possibilities include tsls, wls, logit and so on.

The special behavior is as follows.

Estimators: The results from each individual iteration of the estimator are not printed. Instead, after the loop is completed you get a printout of (a) the mean value of each estimated coefficient across all the repetitions, (b) the standard deviation of those coefficient estimates, (c) the mean value of the estimated standard error for each coefficient, and (d) the standard deviation of the estimated standard errors. This makes sense only if there is some random input at each step.

print: When this command is used to print the value of a variable, you do not get a print each time round the loop. Instead, when the
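The four summary statistics reported for estimators in progressive mode can be reproduced by hand in a small Monte Carlo experiment. The following Python sketch is an illustration of the idea, not of gretl's loop machinery: the data-generating process (y = 1 + 2x + e), the sample sizes, the seed and all function names are assumptions made for the example.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope and its standard error from a simple regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = (ssr / (n - 2) / sxx) ** 0.5
    return b, se


def progressive_summary(reps=200, nobs=50, seed=12345):
    """Repeat the regression on fresh random data and summarize the results."""
    rng = random.Random(seed)
    slopes, ses = [], []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(nobs)]
        y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]   # random input each step
        b, se = ols_slope(x, y)
        slopes.append(b)
        ses.append(se)
    return {
        "mean_coeff": statistics.mean(slopes),   # (a)
        "sd_coeff": statistics.stdev(slopes),    # (b)
        "mean_se": statistics.mean(ses),         # (c)
        "sd_se": statistics.stdev(ses),          # (d)
    }


print(progressive_summary())
```

If the estimator and its standard-error formula are behaving properly, (a) should be close to the true slope of 2 and (c) should be roughly equal to (b). Comparing those two is often the point of running such a loop.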
36. t Italy

Note that this method does not require scripting at all. In fact, you might as well use the GUI Menu "Add/Define new variable" for the same purpose, with the same syntax.

Generating an ARMA(1,1)

Problem: Generate $y_t = 0.9 y_{t-1} + \varepsilon_t - 0.5 \varepsilon_{t-1}$, with $\varepsilon_t \sim NIID(0, 1)$.

Solution:

    alpha = 0.9
    theta = -0.5
    series e = normal()
    series y = 0
    series y = alpha * y(-1) + e + theta * e(-1)

Chapter 13. Cheat sheet

Comment: The statement series y = 0 is necessary because the next statement evaluates y recursively, so y[1] must be set. Note that you must use the keyword series here instead of writing genr y = 0 or simply y = 0, to ensure that y is a series and not a scalar.

Conditional assignment

Problem: Generate $y_t$ via the following rule:

\[ y_t = \begin{cases} x_t & \text{for } d_t > a \\ z_t & \text{for } d_t \leq a \end{cases} \]

Solution:

    series y = (d > a) ? x : z

Comment: There are several alternatives to the one presented above. One is a brute force solution using loops. Another one, more efficient but still suboptimal, would be

    series y = (d>a)*x + (d<=a)*z

However, the ternary conditional assignment operator is not only the most numerically efficient way to accomplish what we want, it is also remarkably transparent to read when one gets used to it. Some readers may find it helpful to note that the conditional assignment operator works exactly the same way as the =IF() function in spreadsheets.

Generating a time index for panel datasets

Problem: Gretl ha
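Both recipes translate naturally into explicit loops, which may help when checking what the one-line gretl solutions compute. The Python sketch below is illustrative: the function names and the seed are hypothetical, and starting the recursion from y = 0 at the first observation is an assumption that mirrors the "series y = 0" initialization.

```python
import random


def simulate_arma11(n, alpha=0.9, theta=-0.5, seed=1):
    """Generate y_t = alpha*y_{t-1} + e_t + theta*e_{t-1} with N(0,1) shocks."""
    rng = random.Random(seed)
    e = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.0] * n                      # plays the role of 'series y = 0'
    for t in range(1, n):
        y[t] = alpha * y[t - 1] + e[t] + theta * e[t - 1]
    return y, e


def conditional_assign(d, a, x, z):
    """Element-wise rule: take x where d > a, otherwise z."""
    return [xi if di > a else zi for di, xi, zi in zip(d, x, z)]


y, e = simulate_arma11(5)
print(y)
print(conditional_assign([1, 5, 3], 3, [10, 20, 30], [-1, -2, -3]))
```

The second call returns the z value for the first and third observations (where d does not exceed a) and the x value for the second, exactly as the ternary expression would.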
37. whether models, graphs or pieces of text output, can be destroyed using the command .free appended to the name of the object, as in ADF1.free.

3.3 The gretl console

A further option is available for your computing convenience. Under gretl's "Tools" menu you will find the item "Gretl console" (there is also an "open gretl console" button on the toolbar in the main window). This opens up a window in which you can type commands and execute them one by one (by pressing the Enter key) interactively. This is essentially the same as gretlcli's mode of operation, except that the GUI is updated based on commands executed from the console, enabling you to work back and forth as you wish.

In the console, you have "command history"; that is, you can use the up and down arrow keys to navigate the list of commands you have entered to date. You can retrieve, edit and then re-enter a previous command.

Chapter 3. Modes of working

In console mode, you can create, display and free objects (models, graphs or text) as described above for script mode.

3.4 The Session concept

gretl offers the idea of a "session" as a way of keeping track of your work and revisiting it later. The basic idea is to provide an iconic space containing various objects pertaining to your current working session (see Figure 3.2). You can add objects (represented by icons) to this space as you go along. If you save the session, these added objects should be avai
38. while right-clicking brings up a menu which lets you display or delete the object. This popup menu also gives you the option of editing graphs.

The model table

In econometric research it is common to estimate several models with a common dependent variable, the models differing in respect of which independent variables are included, or perhaps in respect of the estimator used. In this situation it is convenient to present the regression results in the form of a table, where each column contains the results (coefficient estimates and standard errors) for a given model, and each row contains the estimates for a given variable across the models.

In the Icon view window gretl provides a means of constructing such a table (and copying it in plain text, LaTeX or Rich Text Format). Here is how to do it:

1. Estimate a model which you wish to include in the table, and in the model display window, under the File menu, select "Save to session as icon" or "Save as icon and close".
2. Repeat step 1 for the other models to be included in the table (up to a total of six models).
3. When you are done estimating the models, open the icon view of your gretl session, by selecting "Icon view" under the View menu in the main gretl window, or by clicking the "session icon view" icon on the gretl toolbar.
4. In the Icon view, there is an icon labeled "Model table". Decide which model you wish to appear in the
39. 0 d_X Y(-2)

    # Anderson-Hsiao, using d_Y(-2) as instrument
    tsls d_Y d_Y(-1) d_X ; 0 d_X d_Y(-2)

Although the Anderson-Hsiao estimator is consistent, it is not most efficient: it does not make the fullest use of the available instruments for $\Delta y_{i,t-1}$, nor does it take into account the differenced structure of the error $\eta_{i,t}$. It is improved upon by the methods of Arellano and Bond (1991) and Blundell and Bond (1998).

Gretl implements natively the Arellano-Bond estimator. The rationale behind it is, strictly speaking, that of a GMM estimator, but it can be illustrated briefly as follows (see Arellano (2003) for a comprehensive exposition). Consider again equation (15.8): if for each individual we have observations dated from 1 to T, we may write the following system:

\[ \Delta y_{i,3} = \Delta X_{i,3} \beta + \rho \Delta y_{i,2} + \eta_{i,3} \qquad (15.9) \]
\[ \Delta y_{i,4} = \Delta X_{i,4} \beta + \rho \Delta y_{i,3} + \eta_{i,4} \qquad (15.10) \]
\[ \vdots \]
\[ \Delta y_{i,T} = \Delta X_{i,T} \beta + \rho \Delta y_{i,T-1} + \eta_{i,T} \qquad (15.11) \]

Following the same logic as for the Anderson-Hsiao estimator, we see that the only possible instrument for $\Delta y_{i,2}$ in equation (15.9) is $y_{i,1}$, but for equation (15.10) we can use both $y_{i,1}$ and $y_{i,2}$ as instruments for $\Delta y_{i,3}$, thereby gaining efficiency. Likewise, for the final period T we can use as instruments all values of $y_{i,t}$ up to $t = T - 2$. The Arellano-Bond technique estimates the above system, with an increasing number of instruments for each equation.

Estimation is typically carried out in two steps: in step 1 the parameters are estimated on the

Footnote:
3 Also see Clint Cummins'
40. 0007

    CATHOL     0.223530      0.0459701     4.8625    0.0000
    PUPIL     -0.00338200    0.00271962   -1.2436    0.2198
    WHITE     -0.152643      0.0407064    -3.7499    0.0005

    Mean of dependent variable       0.0955686
    S.D. of dependent variable       0.0522150
    Sum of squared residuals         0.0709594
    Standard error of residuals      0.0388558
    Unadjusted R-squared             0.479466
    Adjusted R-squared               0.446241
    F(3, 47)                         14.4306

The "Equation" option is fairly self-explanatory: the results are written across the page in equation format, as below:

\[ \widehat{\text{ENROLL}} = \underset{(0.066022)}{0.241105} + \underset{(0.04597)}{0.223530}\,\text{CATHOL} - \underset{(0.0027196)}{0.00338200}\,\text{PUPIL} - \underset{(0.040706)}{0.152643}\,\text{WHITE} \]

\[ T = 51 \quad \bar{R}^2 = 0.4462 \quad F(3, 47) = 14.431 \quad \hat{\sigma} = 0.038856 \]

(standard errors in parentheses)

The distinction between the "Copy" and "Save" options (for both tabular and equation) is twofold. First, "Copy" puts the TeX source on the clipboard while with "Save" you are prompted for the name of a file into which the source should be saved. Second, with "Copy" the material is copied as a "fragment" while with "Save" it is written as a complete file.

Chapter 24. Gretl and TeX

The point is that a well-formed TeX source file must have a header that defines the documentclass (article, report, book or whatever) and tags that say \begin{document} and \end{document}. This material is included when you do "Save" but not when you do "Copy", since in the latter case the expectation is that you will paste the data into an existing TeX
41. 29, pp. 1-16.

Imhof, J. P. (1961) "Computing the Distribution of Quadratic Forms in Normal Variables", Biometrika 48, pp. 419-26.

Johansen, Søren (1995) Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, Oxford: Oxford University Press.

Keane, Michael P. and Wolpin, Kenneth I. (1997) "The Career Decisions of Young Men", Journal of Political Economy 105, pp. 473-522.

Kiviet, J. F. (1986) "On the Rigour of Some Misspecification Tests for Modelling Dynamic Relationships", Review of Economic Studies 53, pp. 241-61.

Koenker, R. (1981) "A Note on Studentizing a Test for Heteroscedasticity", Journal of Econometrics 17, pp. 107-12.

Koenker, R. (1994) "Confidence Intervals for regression quantiles", in P. Mandl and M. Huskova (eds.), Asymptotic Statistics, pp. 349-359, New York: Springer-Verlag.

Koenker, R. and Bassett, G. (1978) "Regression quantiles", Econometrica 46, pp. 33-50.

Koenker, R. and Hallock, K. (2001) "Quantile Regression", Journal of Economic Perspectives 15/4, pp. 143-56.

Koenker, R. and Machado, J. (1999) "Goodness of fit and related inference processes for quantile regression", Journal of the American Statistical Association 94, pp. 1296-1310.

Koenker, R. and Zhao, Q. (1994) "L-estimation for linear heteroscedastic models", Journal of Nonparametric Statistics 3, pp. 223-235.

Kwiatkowski, D., Phillips, P. C. B., Schmidt, P. and Shin, Y. (1992) "Testing the Null
42. 4f 8 3f 8 3f n S C se ZS pv k
    end loop
    end function

The function zip_estimate is not meant to be executed directly; it just contains the number-crunching part of the job, whose results are then picked up by the end function zip(). In turn, zip_estimate calls other user-written functions to perform other tasks. The whole set of "internal" functions is shown in the panel 17.3.

All the functions shown in 17.2 and 17.3 can be stored in a separate inp file and executed once, at the beginning of our job, by means of the include command. Assuming the name of this script file is zip_est.inp, the following is an example script which

• includes the script file;
• generates a simulated dataset;
• performs the estimation of a ZIP model on the artificial data.

    set echo off
    set messages off

    # include the user-written functions
    include zip_est.inp

    # generate the artificial data
    nulldata 1000
    set seed 732237
    scalar truep = 0.2
    scalar b0 = 0.2
    scalar b1 = 0.5
    series x = normal()
    series y = (uniform()<truep) ? 0 : genpois(exp(b0 + b1*x))
    list X = const x

    # estimate the zero-inflated Poisson model
    zip(y, X)

Chapter 17. Maximum likelihood estimation

The results are as follows:

    Zero-inflated Poisson model:

    alpha     0.2031    0.0238     8.531    0.000
    const     0.2570    0.0417     6.162    0.000
    x         0.4667    0.0321    14.527    0.000

A further step may then be creating a function package for accessing your new zip() function via gretl
43. Chapter 18. GMM estimation

then, setting k = 1, equation (18.11) implies the following for any asset j:

\[ E\left[ \delta \, \frac{r_{j,t+1}}{p_{j,t}} \left( \frac{C_{t+1}}{C_t} \right)^{\alpha - 1} \Big| \; \mathcal{F}_t \right] = 1, \qquad (18.12) \]

where $C_t$ is aggregate consumption and $\alpha$ and $\delta$ are the risk aversion and discount rate of the representative individual. In this case, it is easy to see that the "deep" parameters $\alpha$ and $\delta$ can be estimated via GMM by using

\[ e_t = \delta \, \frac{r_{j,t+1}}{p_{j,t}} \left( \frac{C_{t+1}}{C_t} \right)^{\alpha - 1} - 1 \]

as the moment condition, while any variable known at time t may serve as an instrument.

Example 18.4: Estimation of the Consumption Based Asset Pricing Model

    open hall.gdt
    set force_hc on

    scalar alpha = 0.5
    scalar delta = 0.5
    series e = 0

    list inst = const consrat(-1) consrat(-2) ewr(-1) ewr(-2)

    matrix V0 = 100000*I(nelem(inst))
    matrix Z = { inst }
    matrix V1 = $nobs*inv(Z'Z)

    gmm e = delta*ewr*consrat^(alpha-1) - 1
      orthog e ; inst
      weights V0
      params alpha delta
    end gmm

    gmm e = delta*ewr*consrat^(alpha-1) - 1
      orthog e ; inst
      weights V1
      params alpha delta
    end gmm

    gmm e = delta*ewr*consrat^(alpha-1) - 1
      orthog e ; inst
      weights V0
      params alpha delta
    end gmm --iterate

    gmm e = delta*ewr*consrat^(alpha-1) - 1
      orthog e ; inst
      weights V1
      params alpha delta
    end gmm --iterate

In the example code given in 18.4, we replicate selected portions of table 3.7 in Hall (2005). The variable consrat is defined as the ratio of monthly consecutive real per capita consumption (services and nondurables) for
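The moment condition used in the gmm blocks can be written out explicitly. The following Python sketch computes the residual e_t and the sample averages of e_t times each instrument, which are the quantities that GMM tries to drive towards zero. The function names and the toy numbers are hypothetical; the sketch assumes the return and consumption-ratio series are already aligned by date, and it does not perform any optimization.

```python
def euler_residual(delta, alpha, gross_return, cons_ratio):
    """e_t = delta * return_t * cons_ratio_t**(alpha - 1) - 1, element-wise."""
    return [delta * r * c ** (alpha - 1.0) - 1.0
            for r, c in zip(gross_return, cons_ratio)]


def sample_moments(e, instruments):
    """Average of e_t * z_t for each instrument series z."""
    T = len(e)
    return [sum(et * zt for et, zt in zip(e, z)) / T for z in instruments]


# Toy data (made up): two periods, two instruments
returns = [2.0, 4.0]
cratio = [1.0, 1.0]
inst = [[1.0, 1.0],      # constant
        [2.0, 3.0]]      # some variable known at time t

e = euler_residual(0.5, 0.5, returns, cratio)
print(e, sample_moments(e, inst))
```

In an actual estimation the parameter pair would be chosen to minimize a quadratic form in these sample moments, with the weights matrix (V0 or V1 in Example 18.4) determining how the individual moments are traded off against each other.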
44. of r equilibrium relations as

\[ y_{1,t} = b_{1,r+1} y_{r+1,t} + \ldots + b_{1,n} y_{n,t} \]
\[ y_{2,t} = b_{2,r+1} y_{r+1,t} + \ldots + b_{2,n} y_{n,t} \]
\[ \vdots \]
\[ y_{r,t} = b_{r,r+1} y_{r+1,t} + \ldots + b_{r,n} y_{n,t} \]

where the first r variables are expressed as functions of the remaining n - r.

Although the triangular representation ensures that the statistical problem of estimating $\beta$ is solved, the resulting equilibrium relationships may be difficult to interpret. In this case, the user may want to achieve identification by specifying manually the system of $r^2$ constraints that gretl will use to produce an estimate of $\beta$.

As an example, consider the money demand system presented in section 9.6 of Verbeek (2004). The variables used are m (the log of real money stock M1), infl (inflation), cpr (the commercial paper rate), y (log of real GDP) and tbr (the Treasury bill rate).

Estimation of $\beta$ can be performed via the commands

    open money.gdt
    smpl 1954:1 1994:4
    vecm 6 2 m infl cpr y tbr --rc

and the relevant portion of the output reads

    Maximum likelihood estimates, observations 1954:1-1994:4 (T = 164)
    Cointegration rank = 2
    Case 2: Restricted constant

    beta (cointegrating vectors, standard errors in parentheses)

    m           1.0000       0.0000
               (0.0000)     (0.0000)
    infl        0.0000       1.0000
               (0.0000)     (0.0000)
    cpr        0.56108      -24.367
              (0.10638)     (4.2113)
    y         -0.40446     -0.91166
              (0.10277)     (4.0683)
    tbr       -0.54293       24.786
              (0.10962)     (4.3394)
    const      -3.7483       16.751
              (0.78082)     (30.909)

Interpretati
45. Convergence and initialization

The numerical methods used to maximize the likelihood for ARMA models are not guaranteed to converge. Whether or not convergence is achieved, and whether or not the true maximum of the likelihood function is attained, may depend on the starting values for the parameters. Gretl employs one of the following two initialization mechanisms, depending on the specification of the model and the estimation method chosen.

1. Estimate a pure AR model by Least Squares (nonlinear least squares if the model requires it, otherwise OLS). Set the AR parameter values based on this regression and set the MA parameters to a small positive value (0.0001).
2. The Hannan-Rissanen method: First estimate an autoregressive model by OLS and save the residuals. Then in a second OLS pass add appropriate lags of the first-round residuals to the model, to obtain estimates of the MA parameters.

To see the details of the ARMA estimation procedure, add the --verbose option to the command. This prints a notice of the initialization method used, as well as the parameter values and log-likelihood at each iteration.

Besides the built-in initialization mechanisms, the user has the option of specifying a set of starting values manually. This is done via the set command: the first argument should be the keyword initvals and the second should be the name of a pre-specified matrix containing starting values. For example

    matrix start = { 0, 0.85, 0.34 }
    s
46. $\varepsilon_i = \lambda \hat{\eta}_i + v_i$; substituting the above expression in (22.10) you obtain the model that is actually estimated:

\[ y_i = \sum_{j=1}^{k} x_{ij} \beta_j + \lambda \hat{\eta}_i + v_i, \]

so the hypothesis that censoring does not matter is equivalent to the hypothesis $H_0: \lambda = 0$, which can be easily tested.

The parameters can be estimated via maximum likelihood under the assumption of joint normality of $\varepsilon_i$ and $\eta_i$; however, a widely used alternative method yields the so-called Heckit estimator, named after Heckman (1979). The procedure can be briefly outlined as follows: first, a probit model is fit on equation (22.11); next, the generalized residuals are inserted in equation (22.10) to correct for the effect of sample selection.

Gretl provides the heckit command to carry out estimation; its syntax is

    heckit y X ; d Z

where y is the dependent variable, X is a list of regressors, d is a dummy variable holding 1 for uncensored observations and Z is a list of explanatory variables for the censoring equation.

Since in most cases maximum likelihood is the method of choice, by default gretl computes ML estimates. The 2-step Heckit estimates can be obtained by using the --two-step option. After estimation, the $uhat accessor contains the generalized residuals. As in the ordinary Tobit model, the residuals equal the difference between actual and fitted $y_i$ only for uncensored observations (those for which $d_i = 1$).

Example 22.6 shows two estimates from the dataset used in Mroz (1987): the first o
47. If we define the information set at time t as

\[ F_t = \{ y_t, y_{t-1}, \ldots \}, \]

then the density of $y_t$ conditional on $F_{t-1}$ is normal:

\[ y_t | F_{t-1} \sim N[\mu, h_t]. \]

Example 17.1: Estimation of stochastic frontier cost function

    open banks91

    # Cobb-Douglas cost function
    ols cost const y p1 p2 p3

    # Cobb-Douglas cost function with homogeneity restrictions
    genr rcost = cost - p3
    genr rp1 = p1 - p3
    genr rp2 = p2 - p3
    ols rcost const y rp1 rp2

    # Cobb-Douglas cost function with homogeneity restrictions
    # and inefficiency
    scalar b0 = $coeff(const)
    scalar b1 = $coeff(y)
    scalar b2 = $coeff(rp1)
    scalar b3 = $coeff(rp2)
    scalar su = 0.1
    scalar sv = 0.1

    mle logl = ln(cnorm(e*lambda/ss)) - (ln(ss) + 0.5*(e/ss)^2)
      scalar ss = sqrt(su^2 + sv^2)
      scalar lambda = su/sv
      series e = rcost - b0*const - b1*y - b2*rp1 - b3*rp2
      params b0 b1 b2 b3 su sv
    end mle

By means of the properties of conditional distributions, the joint density can be factorized as follows:

\[ f(y_t, y_{t-1}, \ldots) = \left[ \prod_{t=1}^{T} f(y_t | F_{t-1}) \right] \cdot f(y_0) \]

If we treat $y_0$ as fixed, then the term $f(y_0)$ does not depend on the unknown parameters, and therefore the conditional log-likelihood can then be written as the sum of the individual contributions as

\[ \ell(\mu, \omega, \alpha, \beta) = \sum_{t=1}^{T} \ell_t \qquad (17.6) \]

where, up to an additive constant that does not depend on the parameters,

\[ \ell_t = \log \left[ \frac{1}{\sqrt{h_t}} \phi \left( \frac{y_t - \mu}{\sqrt{h_t}} \right) \right] = -\frac{1}{2} \left[ \log(h_t) + \frac{(y_t - \mu)^2}{h_t} \right] \]

The following script shows a simple application of this technique, which uses the data file djclose
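The per-observation contribution to the conditional log-likelihood is simple enough to evaluate directly. The Python sketch below computes it in two ways: from the kernel expression in log(h_t) and the squared standardized deviation, and from the normal density itself. The function names and the sample numbers are hypothetical, and the conditional variances are taken as given rather than generated by any particular recursion.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loglik_kernel(y_t, mu, h_t):
    """-0.5 * [log(h_t) + (y_t - mu)^2 / h_t], constant omitted."""
    return -0.5 * (math.log(h_t) + (y_t - mu) ** 2 / h_t)


def loglik_from_density(y_t, mu, h_t):
    """log of (1/sqrt(h_t)) * phi((y_t - mu)/sqrt(h_t)), constant included."""
    s = math.sqrt(h_t)
    return math.log(normal_pdf((y_t - mu) / s) / s)


def total_loglik(y, mu, h):
    """Sum of the kernel contributions over the sample."""
    return sum(loglik_kernel(yt, mu, ht) for yt, ht in zip(y, h))


y = [0.3, -1.2, 0.8]          # made-up observations
h = [1.0, 1.5, 0.7]           # made-up conditional variances
print(total_loglik(y, 0.1, h))
```

The two per-observation versions differ only by the constant -0.5*log(2*pi), which has no effect on the location of the maximum and is therefore commonly dropped when writing the objective for a maximizer.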
48. Panel data

15.1 Estimation of panel models

Pooled Ordinary Least Squares

The simplest estimator for panel data is pooled OLS. In most cases this is unlikely to be adequate, but it provides a baseline for comparison with more complex estimators.

If you estimate a model on panel data using OLS an additional test item becomes available. In the GUI model window this is the item "panel diagnostics" under the Tests menu; the script counterpart is the hausman command.

To take advantage of this test, you should specify a model without any dummy variables representing cross-sectional units. The test compares pooled OLS against the principal alternatives, the fixed effects and random effects models. These alternatives are explained in the following section.

The fixed and random effects models

In gretl version 1.6.0 and higher, the fixed and random effects models for panel data can be estimated in their own right. In the graphical interface these options are found under the menu item "Model/Panel/Fixed and random effects". In the command-line interface one uses the panel command, with or without the --random-effects option.

This section explains the nature of these models and comments on their estimation via gretl.

The pooled OLS specification may be written as

\[ y_{it} = X_{it} \beta + u_{it} \qquad (15.1) \]

where $y_{it}$ is the observation on the dependent variable for cross-sectional unit i in period t, $X_{it}$ is a 1 x k vector of independent variables observed for
49. T variety (that is, many units are observed in relatively few periods). The Arellano estimator is

\[ \hat{\Sigma}_A = (X'X)^{-1} \left( \sum_{i=1}^{n} X_i' \hat{u}_i \hat{u}_i' X_i \right) (X'X)^{-1} \]

where X is the matrix of regressors (with the group means subtracted, in the case of fixed effects), $\hat{u}_i$ denotes the vector of residuals for unit i, and n is the number of cross-sectional units. Cameron and Trivedi (2005) make a strong case for using this estimator; they note that the ordinary White HCCME can produce misleadingly small standard errors in the panel context because it fails to take autocorrelation into account.

In cases where autocorrelation is not an issue, however, the estimator proposed by Beck and Katz (1995) and discussed by Greene (2003, chapter 13) may be appropriate. This estimator, which takes into account contemporaneous correlation across the units and heteroskedasticity by unit, is

\[ \hat{\Sigma}_{BK} = (X'X)^{-1} \left( \sum_{i=1}^{n} \sum_{j=1}^{n} \hat{\sigma}_{ij} X_i' X_j \right) (X'X)^{-1} \]

The covariances $\hat{\sigma}_{ij}$ are estimated via

Chapter 14. Robust covariance matrix estimation 109

\[ \hat{\sigma}_{ij} = \frac{\hat{u}_i' \hat{u}_j}{T} \]

where T is the length of the time series for each unit. Beck and Katz call the associated standard errors "Panel-Corrected Standard Errors" (PCSE). This estimator can be invoked in gretl via the command

    set pcse on

The Arellano default can be re-established via

    set pcse off

Note that regardless of the pcse setting, the robust estimator is not used unless the --robust flag is given, or the "Robust" box is checked in the GUI program.

Chapter 15
50. active variable is set by highlighting it (clicking on its row) in the main data window. Most options will be self-explanatory. Note that you can rename a variable and can edit its descriptive label under "Edit attributes". You can also "Define a new variable" via a formula (e.g. involving some function of one or more existing variables). For the syntax of such formulae, look at the online help for "Generate variable syntax" or see the genr command in the Gretl Command Reference. One simple example:

    foo = x1 * x2

will create a new variable foo as the product of the existing variables x1 and x2. In these formulae, variables must be referenced by name, not number.

Chapter 2. Getting started 11

• Model menu: For details on the various estimators offered under this menu please consult the Gretl Command Reference. Also see Chapter 16 regarding the estimation of nonlinear models.

• Help menu: Please use this as needed. It gives details on the syntax required in various dialog entries.

2.4 Keyboard shortcuts

When working in the main gretl window, some common operations may be performed using the keyboard, as shown in the table below.

    Return   Opens a window displaying the values of the currently selected variables: it is the same as selecting "Data/Display Values".
    Delete   Pressing this key has the effect of deleting the selected variables. A confirmation is required, to prevent accidental deletions.
    e        Has the same effect
51. anything other than the expected result, please send a bug report to cottrell@wfu.edu.

All regression statistics are printed to 6 significant figures in the current version of gretl (except when the multiple precision plugin is used, in which case results are given to 12 figures). If you want to examine a particular value more closely, first save it (for example, using the genr command) then print it using print --long (see the Gretl Command Reference). This will show the value to 10 digits (or more, if you set the internal variable longdigits to a higher value via the set command).

Appendix E. Related free software

Gretl's capabilities are substantial, and are expanding. Nonetheless you may find there are some things you can't do in gretl, or you may wish to compare results with other programs. If you are looking for complementary functionality in the realm of free, open-source software we recommend the following programs. The self-description of each program is taken from its website.

• GNU R (r-project.org): R is "a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files... It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS." Comment: There are numerous add-on packages for R covering most areas of statistical work.

• GNU Octave (www.octave.org): GNU Oc
52. case is a matrix with r columns, and the case with restricted constant entails the restriction that $\mu_0$ should be some linear combination of the columns of $\alpha$.

If a linear trend is included in the model, the deterministic part of the VAR becomes $\mu_0 + \mu_1 t$. The reasoning is practically the same as above except that the focus now centers on $\mu_1$ rather than $\mu_0$. The counterpart to the "restricted constant" case discussed above is a "restricted trend" case, such that the cointegration relationships include a trend but the first differences of the variables in question do not. In the case of an unrestricted trend, the trend appears in both the cointegration relationships and the first differences, which corresponds to the presence of a quadratic trend in the variables themselves (in levels).

In order to accommodate the five cases, gretl provides the following options to the coint2 and vecm commands:

    mu_t                                        option flag   description
    0                                           --nc          no constant
    $\mu_0$, $\alpha_\perp'\mu_0 = 0$           --rc          restricted constant
    $\mu_0$                                     default       unrestricted constant
    $\mu_0 + \mu_1 t$, $\alpha_\perp'\mu_1 = 0$ --crt         constant + restricted trend
    $\mu_0 + \mu_1 t$                           --ct          constant + unrestricted trend

Note that for this command the above options are mutually exclusive. In addition, you have the option of using the --seasonal option, for augmenting $\mu_t$ with centered seasonal dummies. In each case, p-values are computed via the approximations by Doornik (1998).

21.4 The Johansen cointegration tests

The two Johansen tests for cointegr
53. context the IID assumption means that $E(u_{it}^2)$, in relation to equation 15.1, equals a constant, $\sigma_u^2$, for all i and t, while the covariance $E(u_{is}u_{it})$ equals zero for all $s \neq t$ and the covariance $E(u_{jt}u_{it})$ equals zero for all $j \neq i$.

If these assumptions are not met (and they are unlikely to be met in the context of panel data), OLS is not the most efficient estimator. Greater efficiency may be gained using generalized least squares (GLS), taking into account the covariance structure of the error term.

Consider observations on a given unit i at two different times s and t. From the hypotheses above it can be worked out that $\mathrm{Var}(u_{is}) = \mathrm{Var}(u_{it}) = \sigma_v^2 + \sigma_\varepsilon^2$, while the covariance between $u_{is}$ and $u_{it}$ is given by $E(u_{is}u_{it}) = \sigma_v^2$.

In matrix notation, we may group all the $T_i$ observations for unit i into the vector $y_i$ and write it as

\[ y_i = X_i\beta + u_i \qquad (15.4) \]

The vector $u_i$, which includes all the disturbances for individual i, has a variance-covariance matrix given by

\[ \mathrm{Var}(u_i) = \Sigma_i = \sigma_\varepsilon^2 I + \sigma_v^2 J \qquad (15.5) \]

where J is a square matrix with all elements equal to 1. It can be shown that the matrix

\[ K_i = I - \frac{\theta}{T_i} J, \qquad \text{where } \theta = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T_i \sigma_v^2}}, \]

has the property

\[ K_i \Sigma_i K_i' = \sigma_\varepsilon^2 I \]

Chapter 15. Panel data 112

It follows that the transformed system

\[ K_i y_i = K_i X_i \beta + K_i u_i \qquad (15.6) \]

satisfies the Gauss-Markov conditions, and OLS estimation of (15.6) provides efficient inference. But since

\[ K_i y_i = y_i - \theta \bar{y}_i \]

GLS estimation is equivalent to OLS using "quasi-demeaned" varia
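The quasi-demeaning step described in this excerpt can be made concrete with a small numerical sketch. It is written in Python purely for illustration (it is not gretl code), the function names `theta` and `quasi_demean` are hypothetical, and the variance components and data are made up.

```python
import math

def theta(sigma2_eps, sigma2_v, T):
    """Quasi-demeaning weight: 1 - sqrt(s2_eps / (s2_eps + T * s2_v))."""
    return 1.0 - math.sqrt(sigma2_eps / (sigma2_eps + T * sigma2_v))

def quasi_demean(values, th):
    """Subtract th times the unit mean from each observation of one unit."""
    mean = sum(values) / len(values)
    return [v - th * mean for v in values]

# Illustrative (made-up) variance components and data for a single unit
th = theta(sigma2_eps=1.0, sigma2_v=1.0, T=3)
y_i = [2.0, 4.0, 6.0]
y_star = quasi_demean(y_i, th)
print(th, y_star)

# With no unit-specific variance the weight is zero and nothing is subtracted
th_pooled = theta(sigma2_eps=1.0, sigma2_v=0.0, T=3)
print(th_pooled)
```

The two limiting cases are a useful check: a weight of 0 leaves the data untouched (pooled OLS), while a weight approaching 1 subtracts the full unit mean (the fixed-effects transformation).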
54. • The program then searches for valid file collections (not necessarily known in advance) in these places: the "system" data directory, the system script directory, the user directory, and all first-level subdirectories of these. For reference, typical values for these directories are shown in Table 4.1. (Note that PERSONAL is a placeholder that is expanded by Windows, corresponding to "My Documents" on English-language systems.)

                          Linux                       MS Windows
    system data dir       /usr/share/gretl/data       c:\Program Files\gretl\data
    system script dir     /usr/share/gretl/scripts    c:\Program Files\gretl\scripts
    user dir              $HOME/gretl                 PERSONAL\gretl

    Table 4.1: Typical locations for file collections

Any valid collections will be added to the selection windows. So what constitutes a valid file collection? This comprises either a set of data files in gretl XML format (with the .gdt suffix) or a set of script files containing gretl commands (with .inp suffix), in each case accompanied by a "master file" or catalog. The gretl distribution contains several example catalog files, for instance the file descriptions in the misc sub-directory of the gretl data directory and ps_descriptions in the misc sub-directory of the scripts directory.

If you are adding your own collection, data catalogs should be named descriptions and script catalogs should be named ps_descriptions. In each case the catalog should be placed (along with the associated data or script files) in its o
55. estimation problem. However, this does seem to be the case more often than not. Gretl therefore performs scale removal where feasible, unless you

• explicitly forbid this, by giving the --no-scaling option flag to the restrict command; or
• provide a specific vector of initial values; or
• select the LBFGS algorithm for maximization.

Scale removal is deemed infeasible if there are any cross-column restrictions on $\beta$, or any non-homogeneous restrictions involving more than one element of $\beta$.

In addition, experimentation has suggested to us that scale removal is inadvisable if the system is just identified with the normalization(s) included, so we do not do it in that case. By "just identified" we mean that the system would not be identified if any of the restrictions were removed. On that criterion the above example is not just identified, since the removal of the second restriction would not affect identification; and gretl would in fact perform scale removal in this case unless the user specified otherwise.

Footnote 9: As a numerical matter, that is. In principle this should make no difference.

Chapter 22. Discrete and censored dependent variables

22.1 Logit and probit models

It often happens that one wants to specify and estimate a model in which the dependent variable is not continuous, but discrete. A typical example is a model in which the dependent variable is the occupational status of an individual (1 = employed, 0 = unemployed). A con
56. gretl, as in most statistical programs, floating point numbers are represented as "doubles": double-precision values that typically have a storage size of eight bytes or 64 bits. Since there are only so many bits available, only so many floating-point numbers can be represented: doubles do not model the real line. Typically doubles can represent numbers over the range (roughly) $\pm 1.7977 \times 10^{308}$, but only to about 15 digits of precision.

Suppose you're interested in the left tail of the $\chi^2$ distribution with 50 degrees of freedom: you'd like to know the CDF value for x = 0.9. Take a look at the following interactive session:

    ? genr p1 = cdf(X, 50, 0.9)
    Generated scalar p1 (ID 2) = 8.94977e-35
    ? genr p2 = pvalue(X, 50, 0.9)
    Generated scalar p2 (ID 3) = 1
    ? genr test = 1 - p2
    Generated scalar test (ID 4) = 0

The cdf function has produced an accurate value, but the pvalue function gives an answer of 1, from which it is not possible to retrieve the answer to the CDF question. This may seem surprising at first, but consider: if the value of p1 above is correct, then the correct value for p2 is $1 - 8.94977 \times 10^{-35}$. But there's no way that value can be represented as a double: that would require over 30 digits of precision.

Of course this is an extreme example. If the x in question is not too far off into one or other tail of the distribution, the cdf and pvalue functions will in fact produce complementary answers, as shown below:

    ? genr p1 = cd
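The same limitation can be reproduced in any environment that uses IEEE doubles. The following sketch uses Python floats, which are doubles, rather than gretl; the value of `p1` is simply copied from the session quoted in this excerpt.

```python
import sys

p1 = 8.94977e-35           # CDF value reported in the quoted session
p2 = 1.0 - p1              # the complementary value, computed in doubles

print(p2 == 1.0)           # the subtraction is absorbed by rounding
print(1.0 - p2)            # so p1 cannot be recovered from p2
print(sys.float_info.max)  # largest representable double
print(sys.float_info.dig)  # decimal digits that are reliably preserved
```

This is why a function that computes the small tail directly is preferable to one that computes the large tail and subtracts it from 1.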
57. intimidating at first, we encourage users to take advantage of the power of gretl's scripting language as soon as they feel comfortable with the program.

13.1 Dataset handling

"Weird" periodicities

Problem: You have data sampled each 3 minutes from 9am onwards; you'll probably want to specify the hour as 20 periods.

Solution:

    setobs 20 9:1 --special

Comment: Now functions like sdiff() ("seasonal" difference) or estimation methods like seasonal ARIMA will work as expected.

Help, my data are backwards!

Problem: Gretl expects time series data to be in chronological order (most recent observation last), but you have imported third-party data that are in reverse order (most recent first).

Solution:

    setobs 1 1 --cross-section
    genr sortkey = -obs
    dataset sortby sortkey
    setobs 1 1950 --time-series

Comment: The first line is required only if the data currently have a time series interpretation: it removes that interpretation, because (for fairly obvious reasons) the dataset sortby operation is not allowed for time series data. The following two lines reverse the data, using the negative of the built-in index variable obs. The last line is just illustrative: it establishes the data as annual time series, starting in 1950.

If you have a dataset that is mostly the right way round, but a particular variable is wrong, you can reverse that variable as follows:

    genr x = sortby(-obs, x)

Dropping missing observations selectively

P
58. is a complex script that detects which tools you have on your system and sets things up. The configure command accepts many options; you may want to run

    ./configure --help

first to see what options are available. One option you may wish to tweak is --prefix. By default the installation goes under /usr/local but you can change this. For example

    ./configure --prefix=/usr

will put everything under the /usr tree. Another useful option refers to the fact that, by default, gretl offers support for the gnome desktop. If you want to suppress the gnome-specific features you can pass the option --without-gnome to configure.

In order to have the documentation built, we need to pass the relevant option to configure, as in

    ./configure --enable-build-doc

You will see a number of checks being run, and if everything goes according to plan, you should see a summary similar to that displayed in Example C.1.

Footnote: If you're using CVS, it's a good idea to re-run the configure script after doing an update. This is not always necessary, but sometimes it is, and it never does any harm. For this purpose, you may want to write a little shell script that calls configure with any options you want to use.

Appendix C. Building gretl 220

Example C.1: Output from ./configure --enable-build-doc

    Configuration:

      Installation path:                  /usr/local
      Use readline library:               yes
      Use gnuplot for graphs:             yes
      Use PNG for gnuplot graphs:         yes
      Use LaTeX for typesetting output:   yes
59. lags; this is equivalent to giving a list with just one member.

The dummify function creates a set of dummy variables coding for all but one of the distinct values taken on by the original variable, which should be discrete. (The smallest value is taken as the omitted category.) Like lags, this function returns a list even if the input is a single series.

Generating series from lists

Once a list is defined, gretl offers several functions that apply to the list and return a series. In most cases, these functions also apply to single series and behave as natural extensions when applied to a list, but this is not always the case.

For recognizing and handling missing values, gretl offers several functions (see the Gretl Command Reference for details). In this context, it is worth remarking that the ok() function can be used with a list argument. For example,

    list xlist = x1 x2 x3
    series xok = ok(xlist)

Chapter 11. Named lists and strings 78

            YpcFR   YpcGE   YpcIT   NFR         NGE         NIT
    1997    114.9   124.6   119.3   59830.635   82034.771   56890.372
    1998    115.3   122.7   120.0   60046.709   82047.195   56906.744
    1999    115.0   122.4   117.8   60348.255   82100.243   56916.317
    2000    115.6   118.8   117.2   60750.876   82211.508   56942.108
    2001    116.0   116.9   118.1   61181.560   82349.925   56977.217
    2002    116.3   115.5   112.2   61615.562   82488.495   57157.406
    2003    112.1   116.9   111.0   62041.798   82534.176   57604.658
    2004    110.3   116.6   106.9   62444.707   82516.260   58175.310
    2005    112.4   115.1   105.1   62818.185   82469.422   58607.043
    2006    1
60. likelihood estimation 129

it is one of the example datasets supplied with gretl and contains daily data from the Dow Jones stock index.

    open djclose

    series y = 100*ldiff(djclose)

    scalar mu = 0.0
    scalar omega = 1
    scalar alpha = 0.4
    scalar beta = 0.0

    mle ll = -0.5*(log(h) + (e^2)/h)
        series e = y - mu
        series h = var(y)
        series h = omega + alpha*(e(-1))^2 + beta*h(-1)
        params mu omega alpha beta
    end mle

17.5 Analytical derivatives

Computation of the score vector is essential for the working of the BFGS method. In all the previous examples, no explicit formula for the computation of the score was given, so the algorithm was fed numerically evaluated gradients. Numerical computation of the score for the i-th parameter is performed via a finite approximation of the derivative, namely

\[ \frac{\partial \ell(\theta_1, \ldots, \theta_n)}{\partial \theta_i} \simeq \frac{\ell(\theta_1, \ldots, \theta_i + h, \ldots, \theta_n) - \ell(\theta_1, \ldots, \theta_i - h, \ldots, \theta_n)}{2h} \]

where h is a small number.

In many situations, this is rather efficient and accurate. However, one might want to avoid the approximation and specify an exact function for the derivatives. As an example, consider the following script:

    nulldata 1000

    genr x1 = normal()
    genr x2 = normal()
    genr x3 = normal()

    genr ystar = x1 + x2 + x3 + normal()
    genr y = (ystar > 0)

    scalar b0 = 0
    scalar b1 = 0
    scalar b2 = 0
    scalar b3 = 0

    mle logl = y*ln(P) + (1-y)*ln(1-P)
        series ndx = b0 + b1*x1 + b2*x2 + b3*x3
        series P = cnorm(ndx)
        params b0 b1 b2 b3
    end mle --verbose

Here, 1000 data
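To see how the central-difference approximation to the score compares with an exact derivative, here is a minimal sketch in Python (not gretl). It assumes a deliberately simple log-likelihood, a Gaussian with unknown mean and unit variance, because its analytical score is easy to write down; the function names and the data are hypothetical.

```python
import math

def loglik(theta, data):
    """Gaussian log-likelihood with unknown mean theta[0] and unit variance."""
    mu = theta[0]
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - mu) ** 2 for y in data)

def numerical_score(f, theta, data, h=1e-6):
    """Central-difference approximation to the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += h
        dn = list(theta)
        dn[i] -= h
        grad.append((f(up, data) - f(dn, data)) / (2 * h))
    return grad

data = [0.3, -1.2, 0.8, 2.1]
theta0 = [0.0]
approx = numerical_score(loglik, theta0, data)
exact = [sum(y - theta0[0] for y in data)]   # analytical score for the mean
print(approx, exact)
```

Each numerical gradient costs two extra evaluations of the log-likelihood per parameter, which is one practical reason for supplying analytical derivatives when they are available.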
61. list (see section 11.1) that can be used afterwards.

The following example computes summary statistics for the variable Y for each value of Z5:

    open greene22_2
    discrete Z5 # mark Z5 as discrete
    list foo = dummify(Z5)
    loop foreach i foo
        smpl $i --restrict --replace
        summary Y
    end loop
    smpl --full

Since dummify generates a list, it can be used directly in commands that call for a list as input, such as ols. For example:

    open greene22_2
    discrete Z5 # mark Z5 as discrete
    ols Y 0 dummify(Z5)

The freq command

The freq command displays absolute and relative frequencies for a given variable. The way frequencies are counted depends on whether the variable is continuous or discrete. This command is also available via the graphical interface by selecting the "Variable, Frequency distribution" menu entry.

Chapter 8. Discrete variables

For discrete variables, frequencies are counted for each distinct value that the variable takes. For continuous variables, values are grouped into "bins" and then the frequencies are counted for each bin. The number of bins, by default, is computed as a function of the number of valid observations in the currently selected sample via the rule shown in Table 8.1. However, when the command is invoked through the menu item "Variable, Frequency Plot", this default can be overridden by the user.

For example, the following code

    open greene19_1
    freq TUCE

Table 8.1: Number of bins for various s
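As an illustration of the coding that dummify is described as producing (one 0/1 series per distinct value, with the smallest value omitted), here is a small Python sketch. It is not gretl's implementation; the helper name and the sample values are made up.

```python
def dummify(values):
    """Return {value: 0/1 list} for each distinct value except the smallest."""
    levels = sorted(set(values))
    return {lev: [1 if v == lev else 0 for v in values] for lev in levels[1:]}

z = [3, 1, 2, 3, 1]
dummies = dummify(z)
for level, column in dummies.items():
    print(level, column)
```

Omitting one category is what allows the resulting set to be used alongside a constant in a regression without perfect collinearity.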
62. loop is terminated you get a printout of the mean and standard deviation of the variable, across the repetitions of the loop. This mode is intended for use with variables that have a scalar value at each iteration, for example the error sum of squares from a regression. Data series cannot be printed in this way.

store: This command writes out the values of the specified scalars, from each time round the loop, to a specified file. Thus it keeps a complete record of their values across the iterations. For example, coefficient estimates could be saved in this way so as to permit subsequent examination of their frequency distribution. Only one such store can be used in a given loop.

Chapter 9. Loop constructs 58

9.4 Loop examples

Monte Carlo example

A simple example of a Monte Carlo loop in "progressive" mode is shown in Example 9.1.

Example 9.1: Simple Monte Carlo loop

    nulldata 50
    seed 547
    genr x = 100 * uniform()
    # open a "progressive" loop, to be repeated 100 times
    loop 100 --progressive
        genr u = 10 * normal()
        # construct the dependent variable
        genr y = 10*x + u
        # run OLS regression
        ols y const x
        # grab the coefficient estimates and R-squared
        genr a = $coeff(const)
        genr b = $coeff(x)
        genr r2 = $rsq
        # arrange for printing of stats on these
        print a b r2
        # and save the coefficients to file
        store coeffs.gdt a b
    endloop

This loop will print out summary statistics for the a and b estimates and $R^2$ across the 100 repetiti
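For readers who want to see what the progressive loop is accumulating, the following Python sketch mirrors the design of Example 9.1 using only the standard library. It is an illustration, not a translation: Python's random number stream differs from gretl's, so the figures will not match gretl's output, and the helper `ols_simple` is a hypothetical name.

```python
import random
import statistics

def ols_simple(x, y):
    """Return (intercept, slope) from a bivariate least-squares fit."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

rng = random.Random(547)                      # fixed seed for reproducibility
x = [100 * rng.random() for _ in range(50)]   # regressor drawn once, outside the loop

a_draws, b_draws = [], []
for _ in range(100):                          # 100 replications
    u = [10 * rng.gauss(0, 1) for _ in x]     # fresh disturbances each time
    y = [10 * xi + ui for xi, ui in zip(x, u)]
    a, b = ols_simple(x, y)
    a_draws.append(a)
    b_draws.append(b)

print("a: mean", statistics.mean(a_draws), "sd", statistics.stdev(a_draws))
print("b: mean", statistics.mean(b_draws), "sd", statistics.stdev(b_draws))
```

The lists `a_draws` and `b_draws` play the role of the file written by store, and the final two lines correspond to the summary that a progressive loop prints for the variables named in print.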
63. marked with a prefix of * in the function definition, and the corresponding argument is marked with the complementary prefix & in the caller. For example,

    function get_uhat_and_ess(series y, list xvars, scalar *ess)
        ols y 0 xvars --quiet
        ess = $ess
        series uh = $uhat
        return series uh
    end function

    # main script
    open data4-1
    list xlist = 2 3 4
    # function call
    scalar SSR
    series resid = get_uhat_and_ess(price, xlist, &SSR)

In the above, we may say that the function is given the address of the scalar variable SSR, and it assigns a value to that variable (under the local name ess). (For anyone used to programming in C: note that it is not necessary, or even possible, to "dereference" the variable in question within the function using the * operator. Unembellished use of the name of the variable is sufficient to access the variable in outer scope.)

An "address" parameter of this sort can be used as a means of offering optional information to the caller. (That is, the corresponding argument is not strictly needed, but will be used if present.) In that case the parameter should be given a default value of null and the function should test to see if the caller supplied a corresponding argument or not, using the built-in function isnull(). For example, here is the simple function shown above, modified to make the filling out of the ess value optional.

    function get_uhat_and_ess(series y, list xvars, scalar *ess[null])
        ols y 0 xvars --q
64. no copy
3.66 seconds   0.01 seconds

If a pointer argument is used for this sort of purpose, and the object to which the pointer points is not modified by the function, it is a good idea to signal this to the user by adding the const qualifier, as shown for function b in Example 10.1. When a pointer argument is qualified in this way, any attempt to modify the object within the function will generate an error.

List arguments

The use of a named list as an argument to a function gives a means of supplying a function with a set of variables whose number is unknown when the function is written, for example sets of regressors or instruments. Within the function, the list can be passed on to commands such as ols.

A list argument can also be "unpacked" using a foreach loop construct, but this requires some care. For example, suppose you have a list X and want to calculate the standard deviation of each variable in the list. You can do:

    loop foreach i X
        scalar sd_$i = sd(X.$i)
    end loop

Please note: a special piece of syntax is needed in this context. If we wanted to perform the above task on a list in a regular script (not inside a function), we could do

    loop foreach i X
        scalar sd_$i = sd($i)
    end loop

Chapter 10. User-defined functions 67

Example 10.1: Performance comparison: values versus pointer

    function a(matrix X)
        r = rows(X)
        return scalar r
    end function

    function b(const matrix *X)
        r = rows(X)
        return scalar r
65. of Stationarity Against the Alternative of a Unit Root: How Sure Are We That Economic Time Series Have a Unit Root?", Journal of Econometrics 54, pp. 159-78.

Locke, C. (1976) "A Test for the Composite Hypothesis that a Population has a Gamma Distribution", Communications in Statistics: Theory and Methods A5(4), pp. 351-64.

Lucchetti, R., Papi, L., and Zazzaro, A. (2001) "Banks' Inefficiency and Economic Growth: A Micro Macro Approach", Scottish Journal of Political Economy 48, pp. 400-424.

McCullough, B. D. and Renfro, Charles G. (1998) "Benchmarks and software standards: A case study of GARCH procedures", Journal of Economic and Social Measurement 25, pp. 59-71.

MacKinnon, J. G. (1996) "Numerical Distribution Functions for Unit Root and Cointegration Tests", Journal of Applied Econometrics 11, pp. 601-18.

MacKinnon, J. G. and White, H. (1985) "Some Heteroskedasticity-Consistent Covariance Matrix Estimators with Improved Finite Sample Properties", Journal of Econometrics 29, pp. 305-25.

Maddala, G. S. (1992) Introduction to Econometrics, 2nd edition, Englewood Cliffs, NJ: Prentice-Hall.

Matsumoto, M. and Nishimura, T. (1998) "Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator", ACM Transactions on Modeling and Computer Simulation 8, pp. 3-30.

Mroz, T. (1987) "The Sensitivity of an Empirical Model of Married Women's Hours of Work to Economic and Statistic
66. of the European Union), there is a presumption in favor of fixed effects. If it comprises observations on a large number of randomly selected individuals (as in many epidemiological and other longitudinal studies), there is a presumption in favor of random effects.

Besides this general heuristic, however, various statistical issues must be taken into account.

1. Some panel data sets contain variables whose values are specific to the cross-sectional unit but which do not vary over time. If you want to include such variables in the model, the fixed effects option is simply not available. When the fixed effects approach is implemented using dummy variables, the problem is that the time-invariant variables are perfectly collinear with the per-unit dummies. When using the approach of subtracting the group means, the issue is that after de-meaning these variables are nothing but zeros.

2. A somewhat analogous prohibition applies to the random effects estimator. This estimator is in effect a matrix-weighted average of pooled OLS and the "between" estimator. Suppose we have observations on n units or individuals and there are k independent variables of interest. If k > n, the "between" estimator is undefined, since we have only n effective observations, and hence so is the random effects estimator.

If one does not fall foul of one or other of the prohibitions mentioned above, the choice between fixed effects and random effects may be express
67. of the log of the sample size. The authors argue that their procedure provides a "strongly consistent estimation procedure for the order of an autoregression", and that "compared to other strongly consistent procedures this procedure will underestimate the order to a lesser degree."

Gretl reports the AIC, BIC and HQC (calculated as explained above) for most sorts of models. The key point in interpreting these values is to know whether they are calculated such that smaller values are better, or such that larger values are better. In gretl, smaller values are better: one wants to minimize the chosen criterion.

Chapter 20. Time series models

20.1 Introduction

Time series models are discussed in this chapter and the next. In this chapter we concentrate on ARIMA models, unit root tests, and GARCH. The following chapter deals with cointegration and error correction.

20.2 ARIMA models

Representation and syntax

The arma command performs estimation of AutoRegressive, Integrated, Moving Average (ARIMA) models. These are models that can be written in the form

\[ \phi(L) y_t = \theta(L) \varepsilon_t \qquad (20.1) \]

where $\phi(L)$ and $\theta(L)$ are polynomials in the lag operator, L, defined such that $L^n x_t = x_{t-n}$, and $\varepsilon_t$ is a white noise process. The exact content of $y_t$, of the AR polynomial $\phi()$, and of the MA polynomial $\theta()$, will be explained in the following.

Mean terms

The process $y_t$ as written in equation (20.1) has, without further qualifications, mean zero. If the
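The "smaller is better" convention is easiest to see with numbers. The sketch below assumes the standard definitions of the three criteria in terms of the maximized log-likelihood $\ell$, the number of parameters k and the sample size n (AIC = −2ℓ + 2k, BIC = −2ℓ + k log n, HQC = −2ℓ + 2k log log n). It is written in Python for illustration, and the log-likelihood values are made up.

```python
import math

def info_criteria(loglik, k, n):
    """Return (AIC, BIC, HQC) on the scale where smaller values are better."""
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    hqc = -2.0 * loglik + 2.0 * k * math.log(math.log(n))
    return aic, bic, hqc

# Made-up values: two candidate models fitted to the same 100 observations
small = info_criteria(loglik=-150.0, k=3, n=100)
large = info_criteria(loglik=-149.5, k=5, n=100)
print(small)
print(large)
```

In this made-up case the larger model raises the log-likelihood by only 0.5 while adding two parameters, so all three criteria come out smaller for the more parsimonious model.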
68. of the observation string, since we are drawing data from the DATA table, which is not rectangular. The SQL statement stored in the string S produces a table with three columns. The ORDER BY clause ensures that the rows will be in chronological order, although this is not strictly necessary in this case.

Appendix B. Data import via ODBC 215

Example B.1: Simple query from a rectangular table

    nulldata 160
    setobs 4 1970:1 --time
    open dsn=AWM user=Otto password=Bingo --odbc

    string Qry1 = "SELECT consump FROM Consump"
    data cons @Qry1 --odbc

    string Qry2 = "SELECT income FROM Consump"
    data inc @Qry2 --odbc

Example B.2: Simple query from a non-rectangular table

    string S = "select year, qtr, xval from DATA where varname='WLN' ORDER BY year, qtr"
    data wln obs-format="%d:%d" @S --odbc

Appendix B. Data import via ODBC 216

Example B.3: Handling of missing values for a non-rectangular table

    string foo = "select year, qtr, xval from DATA where varname='STN' AND qtr>1"
    data bar obs-format="%d:%d" @foo --odbc
    print bar --byobs

Example B.3 shows what happens if the rows in the outcome from the SELECT statement do not match the observations in the currently open gretl dataset. The query includes a condition which filters out all the data from the first quarter. The query result (invisible to the user) would be something like

    year   qtr   xval
    1970
69. operations on a list of variables. Here is an example of the syntax:

    loop foreach i peach pear plum
        print "$i"
    endloop

This loop will execute three times, printing out "peach", "pear" and "plum" on the respective iterations. The numerical value of the index starts at 1 and is incremented by 1 at each iteration.

If you wish to loop across a list of variables that are contiguous in the dataset, you can give the names of the first and last variables in the list, separated by "..", rather than having to type all the names. For example, say we have 50 variables AK, AL, ..., WY, containing income levels for the states of the US. To run a regression of income on time for each of the states we could do:

    genr time
    loop foreach i AL..WY
        ols $i const time
    endloop

This loop variant can also be used for looping across the elements in a named list (see chapter 11). For example:

    list ylist = y1 y2 y3
    loop foreach i ylist
        ols $i const x1 x2
    endloop

Note that if you use this idiom inside a function (see chapter 10), looping across a list that has been supplied to the function as an argument, it is necessary to use the syntax listname.$i to reference the list-member variables. In the context of the example above, this would mean replacing the third line with

    ols ylist.$i const x1 x2

For loop

The final form of loop control emulates the for statement in the C programming language. The syntax is loop for, followed by three compon
70. periods are evenly spaced, you may want to use lagged values of variables in a panel regression (but see section 15.2 below); you may also wish to construct first differences of variables of interest.

Once a dataset is identified as a panel, gretl will handle the generation of such variables correctly. For example the command

    genr x1_1 = x1(-1)

will create a variable that contains the first lag of x1 where available, and the missing value code where the lag is not available (e.g. at the start of the time series for each group). When you run a regression using such variables, the program will automatically skip the missing observations.

When a panel data set has a fairly substantial time dimension, you may wish to include a trend in the analysis. The command genr time creates a variable named time which runs from 1 to T for each unit, where T is the length of the time-series dimension of the panel. If you want to create an index that runs consecutively from 1 to m × T, where m is the number of units in the panel, use genr index.

Basic statistics by unit

Gretl contains functions which can be used to generate basic descriptive statistics for a given variable, on a per-unit basis; these are pnobs() (number of valid cases), pmin() and pmax() (minimum and maximum) and pmean() and psd() (mean and standard deviation).

As a brief illustration, suppose we have a panel data set comprising 8 time-series observations on each of N units or groups. Then the co
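The behaviour described here (lags that never reach back across a unit boundary, and statistics computed unit by unit) can be sketched in a few lines of Python. This is an illustration only, not gretl's implementation. It assumes a balanced panel stored as stacked time series, and it assumes that the per-unit mean is repeated for every observation of the unit; the helper names `panel_lag` and `panel_mean` are hypothetical.

```python
def panel_lag(values, T):
    """First lag within each unit of a stacked panel; None marks a missing value."""
    out = []
    for start in range(0, len(values), T):
        block = values[start:start + T]
        out.extend([None] + block[:-1])
    return out

def panel_mean(values, T):
    """Per-unit mean, repeated for every observation of the unit."""
    out = []
    for start in range(0, len(values), T):
        block = values[start:start + T]
        out.extend([sum(block) / len(block)] * len(block))
    return out

# Two units observed over three periods, stacked unit by unit
x1 = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
lagged = panel_lag(x1, T=3)
means = panel_mean(x1, T=3)
print(lagged)
print(means)
```

Note that the first observation of the second unit gets a missing lag rather than the last value of the first unit, which is the point of declaring the panel structure before generating lags.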
71. present context moreover the expectation theory of interest rates implies that the third equilibrium relationship should include a constant for the liquidity premium However since in this example the system is estimated with the constant term unrestricted the liquidity premium gets merged in the system intercept and disappears from zt Modulo what appear to be a few typos in the article Chapter 21 Cointegration and Vector Error Correction Models 167 Example 21 1 Estimation of a money demand system with constraints on Input open brand_cassola gdt perform a few transformations m_p m_p 100 y y 100 infl inf1 4 rs rs 4 rl rl 4 replicate table 4 page 824 vecm 2 3 m_p infl rl rs y q genr 110 1n1 restrict full b 1 1 b 1 2 b 1 4 b 2 1 b 2 2 b 2 4 bi2 5 b 3 1 bE32 b 3 3 b 3 4 bis 5 0 end restrict genr 111 rinl ROO0O0OO0RrPRO0oOooRr I Partial output 116 60268 Unrestricted loglikelihood lu 115 86451 Restricted loglikelihood lr 2 Clu Ir 1 47635 P Chi Square 3 gt 1 47635 0 68774 beta cointegrating vectors standard errors in parentheses m_p 1 0000 0 0000 0 0000 0 0000 0 0000 0 0000 infl 0 0000 1 0000 0 0000 0 0000 0 0000 0 0000 rl 1 6108 0 67100 1 0000 0 62752 0 049482 0 0000 rs 0 0000 0 0000 1 0000 0 0000 0 0000 0 0000 y 1 3304 0 0000 0 0000 0 030533 0 0000 0 0000
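The likelihood ratio test reported in this output can be checked by hand from the two log-likelihoods. The sketch below is in Python and uses only the standard library; the closed-form upper-tail probability applies specifically to 3 degrees of freedom, and the helper name `chi2_sf_3df` is hypothetical.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

lu = 116.60268   # unrestricted log-likelihood, copied from the output
lr = 115.86451   # restricted log-likelihood, copied from the output
lr_stat = 2.0 * (lu - lr)
p_value = chi2_sf_3df(lr_stat)
print(lr_stat, p_value)
```

The statistic agrees with the printed 1.47635 up to rounding of the displayed log-likelihoods, and the p-value of about 0.69 indicates that the restrictions are not rejected.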
72. probit and logit model are estimated in gretl via maximum likelihood, where the log-likelihood can be written as

\[ L(\beta) = \sum_{y_i = 0} \ln[1 - F(z_i)] + \sum_{y_i = 1} \ln F(z_i), \qquad (22.6) \]

Chapter 22. Discrete and censored dependent variables 174

which is always negative, since $0 < F(\cdot) < 1$. Since the score equations do not have a closed form solution, numerical optimization is used. However, in most cases this is totally transparent to the user, since usually only a few iterations are needed to ensure convergence. The --verbose switch can be used to track the maximization algorithm.

Example 22.1: Estimation of simple logit and probit models

    open greene19_1
    logit GRADE const GPA TUCE PSI
    probit GRADE const GPA TUCE PSI

As an example, we reproduce the results given in Greene (2000), chapter 21, where the effectiveness of a program for teaching economics is evaluated by the improvements of students' grades. Running the code in example 22.1 gives the following output:

    Model 1: Logit estimates using the 32 observations 1-32
    Dependent variable: GRADE

      VARIABLE   COEFFICIENT   STDERROR   T STAT   SLOPE (at mean)
      const      -13.0213      4.93132    -2.641
      GPA          2.82611     1.26294     2.238   0.533859
      TUCE         0.0951577   0.141554    0.672   0.0179755
      PSI          2.37869     1.06456     2.234   0.449339

      Mean of GRADE = 0.344
      Number of cases 'correctly predicted' = 26 (81.2%)
      f(beta'x) at mean of independent vars = 0.189
      McFadden's pseudo-R-squared = 0.374038
      Log-likelihood = -12.8896
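The structure of the log-likelihood in (22.6) can be sketched directly. The Python fragment below is an illustration rather than gretl's estimator: it evaluates the function for a given set of index values $z_i$ using the logistic CDF, and the outcomes and index values are made up.

```python
import math

def logistic(z):
    """Logistic CDF, the F() used by the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_loglik(y, z, F=logistic):
    """Sum of ln F(z_i) over y_i = 1 and ln(1 - F(z_i)) over y_i = 0."""
    return sum(math.log(F(zi)) if yi == 1 else math.log(1.0 - F(zi))
               for yi, zi in zip(y, z))

# Made-up outcomes and index values z_i = x_i'beta
y = [1, 0, 1, 1, 0]
z = [0.8, -1.1, 0.2, 1.5, 0.4]
ll = binary_loglik(y, z)
print(ll)
```

Every term is the logarithm of a probability strictly between 0 and 1, which is why the total is always negative; a maximizer such as BFGS searches over the coefficients that determine the $z_i$ for the largest (least negative) value.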
73. section" we mean observations on a set of "units" (which may be firms, countries, individuals, or whatever) at a common point in time. This is the default interpretation for a data file: if gretl does not have sufficient information to interpret data as time-series or panel data, they are automatically interpreted as a cross section. In the unlikely event that cross-sectional data are wrongly interpreted as time series, you can correct this by selecting the "Data / Dataset structure" menu item. Click the "cross-sectional" radio button in the dialog box that appears, then click "Forward". Click "OK" to confirm your selection.
Time series data
When you import data from a spreadsheet or plain text file, gretl will make fairly strenuous efforts to glean time-series information from the first column of the data, if it looks at all plausible that such information may be present. If time-series structure is present but not recognized, again you can use the "Data / Dataset structure" menu item. Select "Time series" and click "Forward"; select the appropriate data frequency and click "Forward" again; then select or enter the starting observation and click "Forward" once more. Finally, click "OK" to confirm the time-series interpretation if it is correct (or click "Back" to make adjustments if need be). (Footnote 2: see www.estima.com.) [Chapter 4: Data files]
Besides the basic business of getting a data set interpret
74. selection is to be made. The cols specification works in the same way, mutatis mutandis. Here are some examples.
    matrix B = A[1,]
    matrix B = A[2:3,3:5]
    matrix B = A[2,2]
    matrix idx = {1, 2, 6}
    matrix B = A[idx,]
The first example selects row 1 from matrix A; the second selects a 2 x 3 submatrix; the third selects a scalar; and the fourth selects rows 1, 2, and 6 from matrix A. In addition there is a pre-defined index specification, diag, which selects the principal diagonal of a square matrix, as in B[diag], where B is square. You can use selections of this sort on either the right-hand side of a matrix-generating formula or the left. Here is an example of use of a selection on the right, to extract a 2 x 2 submatrix B from a 3 x 3 matrix A:
    matrix A = {1, 2, 3; 4, 5, 6; 7, 8, 9}
    matrix B = A[1:2,2:3]
And here are examples of selection on the left. The second line below writes a 2 x 2 identity matrix into the bottom right corner of the 3 x 3 matrix A. The fourth line replaces the diagonal of A with 1s.
    matrix A = {1, 2, 3; 4, 5, 6; 7, 8, 9}
    matrix A[2:3,2:3] = I(2)
    matrix d = {1, 1, 1}
    matrix A[diag] = d
[Chapter 12: Matrix manipulation]
12.4 Matrix operators
The following binary operators are available for matrices:
    +   addition
    -   subtraction
    *   ordinary matrix multiplication
    '   pre-multiplication by transpose
    /   matrix "division" (see below)
    ~   column-wise concatenation
    |   row-wise concatenation
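The effect of the selection examples above can be traced with plain nested lists. This is a minimal Python sketch, not gretl syntax; `submatrix` and `assign_block` are hypothetical helpers that mimic gretl's 1-based, inclusive row and column ranges.

```python
def submatrix(A, rows, cols):
    """Copy the block of A given by 1-based inclusive (first, last) ranges."""
    r0, r1 = rows
    c0, c1 = cols
    return [row[c0 - 1:c1] for row in A[r0 - 1:r1]]

def assign_block(A, rows, cols, block):
    """Overwrite the block of A given by 1-based inclusive ranges, in place."""
    r0, r1 = rows
    c0, c1 = cols
    for i in range(r0 - 1, r1):
        for j in range(c0 - 1, c1):
            A[i][j] = block[i - (r0 - 1)][j - (c0 - 1)]

# Selection on the right: like B = A[1:2,2:3]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = submatrix(A, (1, 2), (2, 3))

# Selection on the left: like A[2:3,2:3] = I(2), then A[diag] = {1, 1, 1}
A2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
assign_block(A2, (2, 3), (2, 3), [[1, 0], [0, 1]])
for k in range(3):
    A2[k][k] = 1

print(B)
print(A2)
```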
75. source file that already has the relevant apparatus in place. The items under "Equation options" should be self-explanatory: when printing the model in equation form, do you want standard errors or t-ratios displayed in parentheses under the parameter estimates? The default is to show standard errors; if you want t-ratios, select that item.
Other windows
Several other sorts of output windows also have TeX preview, copy and save enabled. In the case of windows having a graphical toolbar, look for the TeX button. Figure 24.2 shows this icon (second from the right on the toolbar) along with the dialog that appears when you press the button.
[Figure 24.2: TeX icon and dialog]
One aspect of gretl's TeX support that is likely to be particularly useful for publication purposes is the ability to produce a typeset version of the "model table" (see section 3.4). An example of this is shown in Table 24.2.
24.3 Fine-tuning typeset output
There are three aspects to this: adjusting the appearance of the output produced by gretl in LaTeX preview mode; adjusting the formatting of gretl's tabular output for models when using the tabprint command; and incorporating gretl's output into your own TeX files.
Previewing in the GUI
As regards preview mode, you can control the appearance of gretl's output using a file named gretlpre.tex, which should be placed in your gretl user directory (see the Gretl
76. system. If you saw an error message, what precisely did it say?
26.2 Auxiliary programs
As mentioned above, gretl calls some other programs to accomplish certain tasks (gnuplot for graphing, LaTeX for high-quality typesetting of regression output, GNU R). If something goes wrong with such external links, it is not always easy for gretl to produce an informative error message. If such a link fails when accessed from the gretl graphical interface, you may be able to get more information by starting gretl from the command prompt rather than via a desktop menu entry or icon. On the X window system, start gretl from the shell prompt in an xterm; on MS Windows, start the program gretlw32.exe from a console window or "DOS box" using the -g or --debug option flag. Additional error messages may be displayed on the terminal window.
Also please note that for most external calls, gretl assumes that the programs in question are available in your "path", that is, that they can be invoked simply via the name of the program, without supplying the program's full location. Thus if a given program fails, try the experiment of typing the program name at the command prompt, as shown below.
                       Graphing        Typesetting    GNU R
    X window system    gnuplot         latex, xdvi    R
    MS Windows         wgnuplot.exe    pdflatex       RGui.exe
If the program fails to start from the prompt, it's not a gretl issue but rather that the program's home directory is not in your path, or the program is not i
77. target matrix, r and c respectively. Elements are read from A and written to the target in column-major order. If A contains fewer elements than n = r x c, they are repeated cyclically; if A has more elements, only the first n are used. For example:
    matrix a = mnormal(2,3)
    a
    matrix b = mshape(a,3,1)
    b
    matrix b = mshape(a,5,2)
    b
produces
    ? a
    a
         1.2323    0.99714    0.39078
        0.54363    0.43928    0.48467
    ? matrix b = mshape(a,3,1)
    Generated matrix b
    ? b
    b
         1.2323
        0.54363
        0.99714
    ? matrix b = mshape(a,5,2)
    Replaced matrix b
    ? b
    b
         1.2323    0.48467
        0.54363     1.2323
        0.99714    0.54363
        0.43928    0.99714
        0.39078    0.43928
Complex multiplication and division
Gretl has no native provision for complex numbers. However, basic operations can be performed on vectors of complex numbers by using the convention that a vector of n complex numbers is represented as an n x 2 matrix, where the first column contains the real part and the second the imaginary part. Addition and subtraction are trivial; the functions cmult and cdiv compute the complex product and division, respectively, of two input matrices, A and B, representing complex numbers. These matrices must have the same number of rows, n, and either one or two columns. The first column contains the real part and the second (if present) the imaginary part. The return value is an n x 2 matrix, or, if the result has no imaginary part, an n-vector. For example, suppose you have z1 = [1 + 2i, 3 + 4
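The reshaping rule and the n x 2 convention for complex numbers described above can be imitated in a few lines. The following Python sketch is an illustration only: the functions reuse the names `mshape` and `cmult` for readability but are stand-ins written for this example, not gretl's implementations. In particular this `cmult` always returns two columns, whereas the text says gretl returns an n-vector when the result has no imaginary part. The second operand in the usage lines is a made-up example.

```python
def mshape(A, r, c):
    """Reshape A (list of rows) to r x c, reading and writing column-major,
    recycling elements if A has fewer than r*c, truncating if it has more."""
    flat = [A[i][j] for j in range(len(A[0])) for i in range(len(A))]
    vals = [flat[k % len(flat)] for k in range(r * c)]
    return [[vals[j * r + i] for j in range(c)] for i in range(r)]

def cmult(A, B):
    """Row-wise complex product; each row is [real] or [real, imag]."""
    out = []
    for a, b in zip(A, B):
        ar, ai = a[0], (a[1] if len(a) > 1 else 0.0)
        br, bi = b[0], (b[1] if len(b) > 1 else 0.0)
        out.append([ar * br - ai * bi, ar * bi + ai * br])
    return out

# Values as printed in the mshape example
a = [[1.2323, 0.99714, 0.39078],
     [0.54363, 0.43928, 0.48467]]
b31 = mshape(a, 3, 1)
b52 = mshape(a, 5, 2)

# (1 + 2i) * 1 and (3 + 4i) * i, with the second operand chosen for illustration
prod = cmult([[1, 2], [3, 4]], [[1, 0], [0, 1]])
print(b31, b52, prod)
```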
78. that the iterative calculation of the estimates fails to converge. For the GARCH model to make sense, there are strong restrictions on the admissible parameter values, and it is not always the case that there exists a set of values inside the admissible parameter space for which the likelihood is maximized. The restrictions in question can be explained by reference to the simplest (and much the most common) instance of the GARCH model, where p = q = 1. In the GARCH(1, 1) model the conditional variance is
$$\sigma^2_t = \alpha_0 + \alpha_1 u^2_{t-1} + \delta_1 \sigma^2_{t-1} \qquad (20.13)$$
Taking the unconditional expectation of (20.13) we get
$$\sigma^2 = \alpha_0 + \alpha_1 \sigma^2 + \delta_1 \sigma^2$$
so that
$$\sigma^2 = \frac{\alpha_0}{1 - \alpha_1 - \delta_1}$$
For this unconditional variance to exist, we require that $\alpha_1 + \delta_1 < 1$, and for it to be positive we require that $\alpha_0 > 0$. A common reason for non-convergence of GARCH estimates (that is, a common reason for the non-existence of $\alpha_i$ and $\delta_i$ values that satisfy the above requirements and at the same time maximize the likelihood of the data) is misspecification of the model. It is important to realize that GARCH, in itself, allows only for time-varying volatility in the data. If the mean of the series in question is not constant, or if the error process is not only heteroskedastic but also autoregressive, it is necessary to take this into account when formulating an appropriate model. For example, it may be necessary to take the first difference of the variable in question and/or to add suitable regresso
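The admissibility conditions for GARCH(1, 1) translate directly into a small check. This is a minimal Python sketch under the notation used above (alpha0, alpha1, delta1); the function name is hypothetical and the parameter values are made up for illustration.

```python
def garch11_unconditional_variance(alpha0, alpha1, delta1):
    """Return alpha0 / (1 - alpha1 - delta1), enforcing the two
    conditions stated in the text for existence and positivity."""
    if alpha1 + delta1 >= 1.0:
        raise ValueError("need alpha1 + delta1 < 1 for the variance to exist")
    if alpha0 <= 0.0:
        raise ValueError("need alpha0 > 0 for the variance to be positive")
    return alpha0 / (1.0 - alpha1 - delta1)

# Illustrative (made-up) parameter values
sigma2 = garch11_unconditional_variance(0.1, 0.1, 0.8)
print(sigma2)
```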
79. that XML special characters in the function code have to be escaped, e.g. & must be represented as &amp;. Also, some elements of the function syntax differ from the standard script representation: the parameters and return values (if any) are represented in XML. Basically, the function is pre-parsed, and ready for fast loading using libxml.
Load a package
Why package functions in this way? To see what's on offer so far, try the next phase of the walk-through. Close gretl, then re-open it. Now go to "File, Function files, On local machine". If the previous stage above has gone OK, you should see the file you packaged and saved, with its short description. If you click on "Info" you get a window with all the information gretl has gleaned from the function package. If you click on the "View code" icon in the toolbar of this new window, you get a script view window showing the actual function code. Now, back to the "Function packages" window, if you click on the package's name, the relevant functions are loaded into gretl's workspace, ready to be called by clicking on the "Call" button.
After loading the function(s) from the package, open the GUI console. Try typing help foo, replacing foo with the name of the public interface from the loaded function package: if any help text was provided for the function, it should be presented. In a similar way you can browse and load the function packages available on the gret
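The escaping requirement mentioned at the start of this passage is the standard XML one. As a rough illustration (in Python, using the standard library's `xml.sax.saxutils.escape`; this is not how gretl itself writes package files), the following sketch escapes a made-up line of function code:

```python
from xml.sax.saxutils import escape

# A made-up line of function code containing XML special characters
code = "if (a < b && c > 0)"
escaped = escape(code)
print(escaped)
```

Each `&` becomes `&amp;`, and `<` and `>` become `&lt;` and `&gt;`, so the text can sit safely inside an XML element.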
80. the US and ewr is the return-price ratio of a fictitious asset constructed by averaging all the stocks in the NYSE. The instrument set contains the constant and two lags of each variable. The command set force_hc on on the second line of the script has the sole purpose of replicating the given example: as mentioned above, it forces gretl to compute the long-run variance of the orthogonality conditions according to equation (18.9) rather than (18.10). We run gmm four times: one-step estimation for each of two initial weights matrices, then iterative estimation starting from each set of initial weights. Since the number of orthogonality conditions (5) is greater than the number of estimated parameters (2), the choice of initial weights should make a difference, and indeed we see fairly substantial differences between the one-step estimates (Models 1 and 2). On the other hand, iteration reduces these differences almost to the vanishing point (Models 3 and 4). Part of the output is given in 18.5. It should be noted that the J test leads to a rejection of the hypothesis of correct specification. This is perhaps not surprising given the heroic assumptions required to move from the microeconomic principle in equation (18.11) to the aggregate system that is actually estimated.
18.6 Caveats
A few words of warning are in order: despite its ingenuity, GMM is possibly the most fragile estimation method in econometrics. Th
81. the larger the value of $\lambda$, the higher is the penalty and hence the smoother the trend series. Note that the hpfilt function in gretl produces the cyclical component, $c_t$, of the original series. If you want the smoothed trend you can subtract the cycle from the original:
    genr ct = hpfilt(yt)
    genr gt = yt - ct
Hodrick and Prescott (1997) suggest that a value of $\lambda = 1600$ is reasonable for quarterly data. The default value in gretl is 100 times the square of the data frequency (which, of course, yields 1600 for quarterly data). The value can be adjusted using the set command, with a parameter of hp_lambda. For example, set hp_lambda 1200.
The Baxter and King filter
This filter is accessed using the bkfilt function, which again takes the name of the variable to be processed as its single argument. Consider the spectral representation of a time series $y_t$:
$$y_t = \int_{-\pi}^{\pi} e^{i\omega t} \, dZ(\omega)$$
[Chapter 5: Special functions in genr]
To extract the component of $y_t$ that lies between the frequencies $\underline{\omega}$ and $\overline{\omega}$ one could apply a bandpass filter:
$$c^*_t = \int_{-\pi}^{\pi} F^*(\omega) e^{i\omega t} \, dZ(\omega)$$
where $F^*(\omega) = 1$ for $\underline{\omega} < |\omega| < \overline{\omega}$ and 0 elsewhere. This would imply, in the time domain, applying to the series a filter with an infinite number of coefficients, which is undesirable. The Baxter and King bandpass filter applies to $y_t$ a finite polynomial in the lag operator $A(L)$:
$$c_t = A(L) y_t$$
where $A(L)$ is defined as
$$A(L) = \sum_{i=-k}^{k} a_i L^i$$
The coefficients $a_i$ are chosen
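The default smoothing parameter described above (100 times the square of the data frequency) is easy to tabulate. This is a tiny Python sketch of that rule only, not of the filter itself; the function name is hypothetical.

```python
def default_hp_lambda(frequency):
    """Default lambda as described in the text: 100 * frequency**2."""
    return 100 * frequency ** 2

# Observations per year: annual = 1, quarterly = 4, monthly = 12
defaults = {freq: default_hp_lambda(freq) for freq in (1, 4, 12)}
print(defaults)
```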
82. the number of restrictions than vec($\beta$). The savvy user will then see what needs to be done. The other point to take into account is that if $\alpha$ is unrestricted, the effective length of $\psi$ is 0, since it is then optimal to compute $\alpha$ using Johansen's formula, conditional on $\beta$ (equation 21.11 above). The example above could be rewritten as
    open denmark.gdt
    vecm 2 1 LRM LRY IBO IDE --rc --seasonals
    matrix phi = {-8, -6}
    set initvals phi
    restrict --lbfgs
      b[1] = 1
      b[1] + b[2] = 0
      b[3] + b[4] = 0
    end restrict
In this more economical formulation the initializer specifies only the two free parameters in $\phi$ (5 elements in $\beta$ minus 3 restrictions). There is no call to give values for $\psi$ since $\alpha$ is unrestricted.
Scale removal
Consider a simpler version of the restriction discussed in the previous section, namely,
    restrict
      b[1] = 1
      b[1] + b[2] = 0
    end restrict
This restriction comprises a substantive, testable requirement (that $\beta_1$ and $\beta_2$ sum to zero) and a normalization or scaling, $\beta_1 = 1$. The question arises, might it be easier and more reliable to maximize the likelihood without imposing $\beta_1 = 1$? (Footnote 9.) If so, we could record this normalization, remove it for the purpose of maximizing the likelihood, then reimpose it by scaling the result. Unfortunately it is not possible to say in advance whether "scale removal" of this sort will give better results, for any particular
83. to allow the intercept of the regression to differ across the units, the latter to allow the intercept to differ across periods. Two special functions are available to create such dummies. These are found under the "Add" menu in the GUI, or under the genr command in script mode or gretlcli.
1. "unit dummies" (script command genr unitdum). This command creates a set of dummy variables identifying the cross-sectional units. The variable du_1 will have value 1 in each row corresponding to a unit 1 observation, 0 otherwise; du_2 will have value 1 in each row corresponding to a unit 2 observation, 0 otherwise; and so on.
2. "time dummies" (script command genr timedum). This command creates a set of dummy variables identifying the periods. The variable dt_1 will have value 1 in each row corresponding to a period 1 observation, 0 otherwise; dt_2 will have value 1 in each row corresponding to a period 2 observation, 0 otherwise; and so on.
If a panel data set has the YEAR of the observation entered as one of the variables you can create a periodic dummy to pick out a particular year, e.g. genr dum = (YEAR=1960). You can also create periodic dummy variables using the modulus operator, %. For instance, to create a dummy with value 1 for the first observation and every thirtieth observation thereafter, 0 otherwise, do
    genr index
    genr dum = ((index-1)%30) = 0
Lags, differences, trends
If the time
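The modulus trick for periodic dummies described above is worth tracing once by hand. The following Python sketch mirrors the logic of the two genr lines on a made-up sample of 91 observations; `index` here is simply the 1-based observation number.

```python
# 1-based observation index for a hypothetical sample of 91 observations
n_obs = 91
index = list(range(1, n_obs + 1))

# 1 for the first observation and every thirtieth observation thereafter
dum = [1 if (i - 1) % 30 == 0 else 0 for i in index]

# Observation numbers at which the dummy equals 1
hits = [i for i, d in zip(index, dum) if d == 1]
print(hits)
```

Subtracting 1 before taking the remainder is what makes the pattern start at observation 1 rather than observation 30.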
84. to understand, but gretl's output is intended to be human-readable and editable. In addition, it does not use any non-standard style packages. Besides the standard LaTeX document classes, the only files needed are, as noted above, the amsmath, dcolumn and longtable packages. These should be included in any reasonably full TeX implementation.
24.4 Character encodings
People using gretl in English-speaking locales are unlikely to have a problem with this, but if you're generating TeX output in a locale where accented characters (not in the ASCII character set) are employed, you may want to pay attention here. Gretl generates TeX output using whatever character encoding is standard on the local system. If the system encoding is in the ISO-8859 family, this will probably be OK without any special effort on the part of the user. Newer GNU/Linux systems, however, typically use Unicode (UTF-8). This is also OK so long as your TeX system can handle UTF-8 input, which requires use of the latex-ucs package. So: if you are using gretl to generate TeX in a non-English locale, where the system encoding is UTF-8, you will need to ensure that the latex-ucs package is installed. This package may or may not be installed by default when you install TeX. For reference, if gretl detects a UTF-8 environment, the following lines are used in the TeX preamble:
    \usepackage{ucs}
    \usepackage[utf8x]{inputenc}
[Chapter 24: Gretl and TeX]
24.5 Installing and learn
85. unit i in period t, $\beta$ is a k x 1 vector of parameters, and $u_{it}$ is an error or disturbance term specific to unit i in period t. The fixed and random effects models have in common that they decompose the unitary pooled error term, $u_{it}$. For the fixed effects model we write $u_{it} = \alpha_i + \varepsilon_{it}$, yielding
$$y_{it} = X_{it}\beta + \alpha_i + \varepsilon_{it} \qquad (15.2)$$
That is, we decompose $u_{it}$ into a unit-specific and time-invariant component, $\alpha_i$, and an observation-specific error, $\varepsilon_{it}$. (Footnote 1: it is possible to break a third component out of $u_{it}$, namely $w_t$, a shock that is time-specific but common to all the units in a given period. In the interest of simplicity we do not pursue that option here.) The $\alpha_i$s are then treated as fixed parameters (in effect, unit-specific y-intercepts), which are to be estimated. This can be done by including a dummy variable for each cross-sectional unit (and suppressing the global constant). This is sometimes called the Least Squares Dummy Variables (LSDV) method. Alternatively, one can subtract the group mean from each of the variables and estimate a model without a constant. In the latter case the dependent variable may be written as
$$\tilde{y}_{it} = y_{it} - \bar{y}_i$$
The "group mean", $\bar{y}_i$, is defined as
$$\bar{y}_i = \frac{1}{T_i} \sum_{t=1}^{T_i} y_{it}$$
[Chapter 15: Panel data]
where $T_i$ is the number of observations for unit i. An exactly analogous formulation applies to the independent variables. Given parameter estimates, $\hat{\beta}$, obtained using such de-meaned data we can recover estimates o
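The de-meaning step described above (subtracting each unit's own mean) can be sketched in a few lines of Python. This illustrates the transformation only, not gretl's panel estimator; the function name and the toy panel are assumptions made for the example, and the panel is deliberately unbalanced so that the number of observations differs across units.

```python
from collections import defaultdict

def demean_by_unit(units, values):
    """Subtract from each value the mean of all values sharing its unit id."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for u, v in zip(units, values):
        sums[u] += v
        counts[u] += 1
    return [v - sums[u] / counts[u] for u, v in zip(units, values)]

# Toy unbalanced panel: unit 1 has three observations, unit 2 has two
units = [1, 1, 1, 2, 2]
y = [1.0, 2.0, 3.0, 10.0, 20.0]
y_tilde = demean_by_unit(units, y)
print(y_tilde)
```

Because each unit's de-meaned values sum to zero, any unit-specific intercept drops out, which is why the model can then be estimated without a constant.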
86. up as an aid to the user if the function is packaged (see section 10.5 below) and called via gretl's graphical interface. The string should be enclosed in double quotes and inserted before the comma that precedes the following parameter (or the closing right parenthesis of the function definition, in the case of the last parameter), as illustrated in the following example:
    function myfun (series y "dependent variable",
                    series x "independent variable")
Void functions
You may define a function that has no parameters (these are called "routines" in some programming languages). In this case, use the keyword void in place of the listing of parameters:
    function myfunc2 (void)
The function body
The function body is composed of gretl commands, or calls to user-defined functions (that is, function calls may be nested). A function may call itself (that is, functions may be recursive). While the function body may contain function calls, it may not contain function definitions. That is, you cannot define a function inside another function. For further details, see section 10.4.
10.2 Calling a function
A user function is called by typing its name followed by zero or more arguments enclosed in parentheses. If there are two or more arguments these should be separated by commas. There are automatic checks in place to ensure that the number of arguments given in a function call matches the number of parameters, and that the types of the given argum
87. using as weights the reciprocals of the estimated variances. While these methods are still in use, an alternative approach has found increasing favor: that is, use OLS but compute standard errors (or more generally, covariance matrices) that are robust with respect to deviations from the iid assumption. This is typically combined with an emphasis on using large datasets, large enough that the researcher can place some reliance on the (asymptotic) consistency property of OLS. This approach has been enabled by the availability of cheap computing power. The computation of robust standard errors and the handling of very large datasets were daunting tasks at one time, but now they are unproblematic. [Chapter 14: Robust covariance matrix estimation] The other point favoring the newer methodology is that while FGLS offers an efficiency advantage in principle, it often involves making additional statistical assumptions which may or may not be justified, which may not be easy to test rigorously, and which may threaten the consistency of the estimator (for example, the "common factor restriction" that is implied by traditional FGLS "corrections" for autocorrelated errors). James Stock and Mark Watson's Introduction to Econometrics illustrates this approach at the level of undergraduate instruction: many of the datasets they use comprise thousands or tens of thousands of observations; FGLS is downplayed; and robust standard errors are repor
88. variable (as found in the header file), followed by one or more spaces, followed by the descriptive label. Here is an example:
    price New car price index, 1982 base year
If you want to save data in traditional format, use the -t flag with the store command, either in the command-line program or in the console window of the GUI program.
A.3 Binary database details
A gretl database consists of two parts: an ASCII index file (with filename suffix .idx) containing information on the series, and a binary file (suffix .bin) containing the actual data. Two examples of the format for an entry in the idx file are shown below:
    GOM910 Composite index of 11 leading indicators (1987=100)
    M 1948.01 - 1995.11 n = 575
    currbal Balance of Payments: Balance on Current Account; SA
    Q 1960.1 - 1999.4 n = 160
The first field is the series name. The second is a description of the series (maximum 128 characters). On the second line the first field is a frequency code: M for monthly, Q for quarterly, A for annual, B for business-daily (daily with five days per week) and D for daily (seven days per week). No other frequencies are accepted at present. Then comes the starting date (N.B. with two digits following the point for monthly data, one for quarterly data, none for annual), a space, a hyphen, another space, the ending date, the string "n = " and the integer number of observations. [Appendix A: Data file details] In the case of daily data the starting and
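The second line of an idx entry carries enough information to cross-check the stated number of observations. The Python sketch below parses that line and recomputes the count for the annual, quarterly and monthly codes only. It assumes the whitespace-separated layout described in the text (frequency, start, hyphen, end, "n = ", count); the helper names are hypothetical and this is not gretl's own reader.

```python
PERIODS_PER_YEAR = {"A": 1, "Q": 4, "M": 12}

def split_date(text):
    """Split '1948.01' into (1948, 1); an annual date '1960' gives (1960, 1)."""
    if "." in text:
        year, period = text.split(".")
        return int(year), int(period)
    return int(text), 1

def parse_span(line):
    """Return (frequency, start, end, stated_n) from a line such as
    'M 1948.01 - 1995.11 n = 575' (assumed whitespace-separated)."""
    parts = line.split()
    return parts[0], parts[1], parts[3], int(parts[-1])

def implied_obs(freq, start, end):
    """Number of observations implied by the start and end dates."""
    y0, p0 = split_date(start)
    y1, p1 = split_date(end)
    return (y1 - y0) * PERIODS_PER_YEAR[freq] + (p1 - p0) + 1

results = []
for line in ("M 1948.01 - 1995.11 n = 575", "Q 1960.1 - 1999.4 n = 160"):
    freq, start, end, stated = parse_span(line)
    results.append((stated, implied_obs(freq, start, end)))
print(results)
```

Both sample entries are internally consistent: the dates imply exactly the stated 575 and 160 observations.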
89. whenever possible. For this reason, gretl provides a way to interact with R and thus enable users to pool the capabilities of the two packages. In this chapter, we will explain how to exploit R's power from within gretl. We assume that the reader has a working installation of R available and a basic grasp of R's syntax. Despite several valiant attempts, no graphical shell has gained wide acceptance in the R community: by and large, the standard method of working with R is by writing scripts, or by typing commands at the R prompt, much in the same way as one would write gretl scripts or work with the gretl console. In this chapter, the focus will be on the methods available to execute R commands without leaving gretl.
25.2 Starting an interactive R session
The easiest way to use R from gretl is in interactive mode. Once you have your data loaded in gretl, you can select the menu item "Tools, Start GNU R" and an interactive R session will be started, with your dataset automatically pre-loaded.
A simple example: OLS on cross-section data
For this example we use Ramanathan's dataset data4-1, one of the sample files supplied with gretl. We first run, in gretl, an OLS regression of price on sqft, bedrms and baths. The basic results are shown in Table 25.1.
Table 25.1: OLS house price regression via gretl
    Variable   Coefficient   Std. Error   t-statistic   p-value
    const      129.062       88.3033      1.4616        0.1746
    sqft       0.154800      0.0319404    4.8465        0.0007
    bedrms
90. window, either by ID number or alphabetically by name.
  - NIST test suite: Check the numerical accuracy of gretl against the reference results for linear regression made available by the US National Institute of Standards and Technology.
  - Preferences: Set the paths to various files gretl needs to access. Choose the font in which gretl displays text output. Activate or suppress gretl's messaging about the availability of program updates, and so on. See the Gretl Command Reference for further details.
• Data menu
  - Select all: Several menu items act upon those variables that are currently selected in the main window. This item lets you select all the variables.
  - Display values: Pops up a window with a simple (not editable) printout of the values of the selected variable or variables.
  - Edit values: Opens a spreadsheet window where you can edit the values of the selected variables.
  - Add observations: Gives a dialog box in which you can choose a number of observations to add at the end of the current dataset; for use with forecasting.
  - Remove extra observations: Active only if extra observations have been added automatically in the process of forecasting; deletes these extra observations.
  - Read info, Edit info: "Read info" just displays the summary information for the current data file; "Edit info" allows you to make changes to it (if you have permission to do so).
  - Print description: Opens a window containing a full a
91. write
$$z_t = y_{1t} + \beta_2 y_{2t} + \cdots + \beta_n y_{nt}$$
which is equivalent to saying that
$$y_{1t} = -\beta_2 y_{2t} - \cdots - \beta_n y_{nt} + z_t$$
is a long-run equilibrium relationship: the deviations $z_t$ may not be 0 but they are stationary. In this case, (21.2) can be written as
$$\Delta y_t = \mu_t + \alpha\beta' y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \varepsilon_t \qquad (21.3)$$
If $\beta$ were known, then $z_t$ would be observable and all the remaining parameters could be estimated via OLS. In practice, the procedure estimates $\beta$ first and then the rest. The rank of $\Pi$ is investigated by computing the eigenvalues of a closely related matrix whose rank is the same as $\Pi$: however, this matrix is by construction symmetric and positive semidefinite. As a consequence, all its eigenvalues are real and non-negative, and tests on the rank of $\Pi$ can therefore be carried out by testing how many eigenvalues are 0. If all the eigenvalues are significantly different from 0, then all the processes are stationary. If, on the contrary, there is at least one zero eigenvalue, then the $y_t$ process is integrated, although some linear combination $\beta' y_t$ might be stationary. At the other extreme, if no eigenvalues are significantly different from 0, then not only is the process $y_t$ non-stationary, but the same holds for any linear combination $\beta' y_t$; in other words, no cointegration occurs. Estimation typically proceeds in two stages: first, a sequence of tests is run to determine r, the cointegration rank. Then, for a given rank the parameters in equation (21.3)
92. x. The function misszero() does the opposite of zeromiss, that is, it converts all missing values to zero. It may be worth commenting on the propagation of missing values within genr formulae. The general rule is that in arithmetical operations involving two variables, if either of the variables has a missing value at observation t then the resulting series will also have a missing value at t. The one exception to this rule is multiplication by zero: zero times a missing value produces zero (since this is mathematically valid regardless of the unknown value).
5.8 Retrieving internal variables
The genr command provides a means of retrieving various values calculated by the program in the course of estimating models or testing hypotheses. The variables that can be retrieved in this way are listed in the Gretl Command Reference; here we say a bit more about the special variables $test and $pvalue. These variables hold, respectively, the value of the last test statistic calculated using an explicit testing command and the p-value for that test statistic. If no such test has been performed at the time when these variables are referenced, they will produce the missing value code. The "explicit testing commands" that work in this way are as follows: add (joint test for the significance of variables added to a model); adf (Augmented Dickey-Fuller test, see below); arch (test for ARCH); chow (Chow test for a structural break); coeffsum (test for th
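The propagation rule for missing values, including the multiplication-by-zero exception, can be mimicked as follows. This Python sketch uses `None` as a stand-in for gretl's missing value code; the helper names are hypothetical and the data are made up.

```python
NA = None  # stand-in for a missing value

def add(a, b):
    """Missing if either operand is missing."""
    return NA if a is NA or b is NA else a + b

def mul(a, b):
    """Missing if either operand is missing, except that zero times
    a missing value gives zero (the exception noted in the text)."""
    if a == 0 or b == 0:
        return 0
    return NA if a is NA or b is NA else a * b

x = [1, NA, 3, 0]
y = [2, 5, NA, NA]
sums = [add(a, b) for a, b in zip(x, y)]
products = [mul(a, b) for a, b in zip(x, y)]
print(sums, products)
```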
93. $y_i \le M_i$, where the interval may be left- or right-unbounded (but not both). If $m_i = M_i$, we effectively observe $y_i$ and no information loss occurs. In practice, each observation belongs to one of four categories:
1. left-unbounded, when $m_i = -\infty$,
2. right-unbounded, when $M_i = \infty$,
3. bounded, when $-\infty < m_i < M_i < \infty$, and
4. point observations, when $m_i = M_i$.
It is interesting to note that this model bears similarities to other models in several special cases:
• When all observations are point observations the model trivially reduces to the ordinary linear regression model.
• When $m_i = M_i$ when $y_i > 0$, while $m_i = -\infty$ and $M_i = 0$ otherwise, we have the Tobit model (see 22.4).
• The interval model could be thought of as an ordered probit model (see 22.2) in which the cut points (the coefficients in eq. 22.8) are observed and don't need to be estimated.
(Footnote 1: we assume here that censoring occurs from below at 0. Censoring from above, or at a point different from zero, can be rather easily handled by re-defining the dependent variable appropriately. For the more general case of two-sided censoring the intreg command may be used, see below.)
The gretl command intreg estimates interval models by maximum likelihood, assuming normality of the disturbance term. Its syntax is
    intreg minvar maxvar X
where minvar contains the $m_i$ series, with NAs for left-unbounded observati
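The four observation categories, and the Tobit special case, can be expressed as a small classifier. This is a Python sketch for illustration only; it uses infinities for unbounded limits rather than the NA coding that intreg expects, and the function names are hypothetical.

```python
import math

def classify(m, M):
    """Classify an interval observation with lower bound m and upper bound M."""
    if m == -math.inf and M == math.inf:
        raise ValueError("interval may not be unbounded on both sides")
    if m == -math.inf:
        return "left-unbounded"
    if M == math.inf:
        return "right-unbounded"
    if m == M:
        return "point"
    if m < M:
        return "bounded"
    raise ValueError("lower bound exceeds upper bound")

def tobit_bounds(y):
    """Bounds for the Tobit special case (censoring from below at 0)."""
    return (y, y) if y > 0 else (-math.inf, 0.0)

# Made-up observations covering all four categories
labels = [classify(-math.inf, 2.0), classify(1.0, math.inf),
          classify(1.0, 2.0), classify(1.5, 1.5)]
tobit_labels = [classify(*tobit_bounds(v)) for v in (3.2, 0.0)]
print(labels, tobit_labels)
```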
94. year=i) --restrict --replace
      summary 1 2 3 4
    endloop
Example 9.5: String substitution
    open bea.dat
    loop i=1987..2001
      genr V = COMP$i
      genr TC = GOC$i - PBT$i
      genr C = TC - V
      ols PBT$i const TC V
    endloop
Chapter 10 User-defined functions
10.1 Defining a function
Since version 1.3.3, gretl has contained a mechanism for defining functions, which may be called via the command line, in the context of a script, or (if packaged appropriately, see section 10.5) via the program's graphical interface. The syntax for defining a function looks like this:
    function function-name (parameters)
      function body
    end function
function-name is the unique identifier for the function. Names must start with a letter. They have a maximum length of 31 characters; if you type a longer name it will be truncated. Function names cannot contain spaces. You will get an error if you try to define a function having the same name as an existing gretl command. The parameters for a function are given in the form of a comma-separated list. Parameters can be of any of the types shown below.
    Type      Description
    bool      scalar variable acting as a Boolean switch
    int       scalar variable acting as an integer
    scalar    scalar variable
    series    data series
    list      named list of series
    matrix    named matrix or vector
    string    named string or string literal
Each element in the listing of parameters must include two terms: a type specifier, and the name by which the parameter s
95. 0. There could be more than one such linear combination. That is, while the ensemble of variables may be "free to wander" over time, nonetheless the variables are "tied together" in certain ways. And it may be possible to interpret these ties, or cointegrating vectors, as representing equilibrium conditions. For example, suppose we find some or all of the following variables are I(1): money stock, M, the price level, P, the nominal interest rate, R, and output, Y. According to standard theories of the demand for money, we would nonetheless expect there to be an equilibrium relationship between real balances, interest rate and output; for example
$$m - p = \gamma_0 + \gamma_1 y + \gamma_2 r, \qquad \gamma_1 > 0, \ \gamma_2 < 0$$
where lower-case variable names denote logs. In equilibrium, then,
$$m - p - \gamma_1 y - \gamma_2 r = \gamma_0$$
Realistically, we should not expect this condition to be satisfied each period. We need to allow for the possibility of short-run disequilibrium. But if the system moves back towards equilibrium following a disturbance, it follows that the vector $x = (m, p, y, r)'$ is bound by a cointegrating vector $\beta' = (\beta_1, \beta_2, \beta_3, \beta_4)$, such that $\beta' x$ is stationary (with a mean of $\gamma_0$). Furthermore, if equilibrium is correctly characterized by the simple model above, we have $\beta_2 = -\beta_1$, $\beta_3 < 0$ and $\beta_4 > 0$. These things are testable within the context of cointegration analysis. There are typically three steps in this sort of analysis:
1. Test to determine the number of cointegr
96. 0.494329

    0.50   0.560181   0.487022   0.601989
    0.75   0.644014   0.580155   0.690413
    0.95   0.709069   0.673900   0.734441

Footnote 2: These correspond to the iid and nid options in R's quantreg package, respectively.

Chapter 23 Quantile regression 186

[Figure 23.1: Regression of food expenditure on income; Engel's data. The plot shows the coefficient on income against tau: quantile estimates with 90% band, and the OLS estimate with 90% band.]

The gretl GUI has an entry for Quantile Regression (under Model, Robust estimation), and you can select multiple quantiles there too. In that context, just give space-separated numerical values (as per the predefined options, shown in a drop-down list).

When you estimate a model in this way most of the standard menu items in the model window are disabled, but one extra item is available: graphs showing the $\tau$ sequence for a given coefficient in comparison with the OLS coefficient. An example is shown in Figure 23.1. This sort of graph provides a simple means of judging whether quantile regression is redundant (OLS is fine) or informative.

In the example shown, based on data on household income and food expenditure gathered by Ernst Engel (1821-1896), it seems clear that simple OLS regression is potentially misleading. The crossing of the OLS estimate by the quantile estimates is very marked.

However, it is not always clear what implications should be drawn from this sort of conflict. With t
97. 119 114.2 103.3 63195.457 82376.451 58941.499

Table 11.1: GDP per capita and population in 3 European countries (Source: Eurostat)

After these commands, the series xok will have value 1 for observations where none of x1, x2 or x3 has a missing value, and value 0 for any observations where this condition is not met.

The functions max, min, mean, sd, sum and var behave horizontally rather than vertically when their argument is a list. For instance, the following commands

    list Xlist = x1 x2 x3
    series m = mean(Xlist)

produce a series m whose i-th element is the average of $x_{1,i}$, $x_{2,i}$ and $x_{3,i}$; missing values, if any, are implicitly discarded.

In addition, gretl provides three functions for weighted operations: wmean, wsd and wvar. Consider as an illustration Table 11.1: the first three columns are GDP per capita for France, Germany and Italy; columns 4 to 6 contain the population for each country. If we want to compute an aggregate indicator of per capita GDP, all we have to do is

    list Ypc = YpcFR YpcGE YpcIT
    list N = NFR NGE NIT
    y = wmean(Ypc, N)

so for example

$$y_{1996} = \frac{114.9 \times 59830.635 + 124.6 \times 82034.771 + 119.3 \times 56890.372}{59830.635 + 82034.771 + 56890.372} = 120.163$$

See the Gretl Command Reference for more details.

11.2 Named strings

For some purposes it may be useful to save a string (that is, a sequence of characters) as a named variable that can be reused. Versions of gretl higher than 1.6.0 offer this facility, but som
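The weighted-mean arithmetic in the 1996 example can be checked independently of gretl. The following is a minimal Python sketch, not gretl code; the function name `weighted_mean` is a hypothetical helper, and the figures are the ones quoted in the worked example above.

```python
def weighted_mean(values, weights):
    """Weighted average of one observation across several series."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# 1996 figures quoted in the text: GDP per capita and population
# for France, Germany and Italy (in that order).
ypc_1996 = [114.9, 124.6, 119.3]
pop_1996 = [59830.635, 82034.771, 56890.372]

y_1996 = weighted_mean(ypc_1996, pop_1996)
print(round(y_1996, 3))
```

The result agrees with the value of 120.163 given in the text. The sketch handles a single observation; gretl's wmean applies the same calculation observation by observation across the two lists.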
98. Contents

23 Quantile regression
    23.1 Introduction
    23.2 Basic syntax
    23.3 Confidence intervals
    23.4 Multiple quantiles
    23.5 Large datasets

III Technical details

24 Gretl and TeX
    24.1 Introduction
    24.2 TeX-related menu items
    24.3 Fine-tuning typeset output
    24.4 Character encodings
    24.5 Installing and learning TeX

25 Gretl and R
    25.1 Introduction
    25.2 Starting an interactive R session
    25.3 Running an R script
    25.4 Taking stuff back and forth
    25.5 Interacting with R from the command line

26 Troubleshooting gretl
    26.1 Bug reports
    26.2 Auxiliary programs

27 The command line interface

IV Appendices

A Data file details
    A.1 Basic native format
    A.2 Traditional ESL format
    A.3 Binary database details
99.

    2 7.8705000000
    1970.3 7.5600000000
    1970.4 7.1892000000
    1971.2 5.8679000000
    1971.3 6.2442000000
    1971.4 5.9811000000
    1972.2 4.6883000000
    1972.3 4.6302000000

Internally, gretl fills the variable bar with the corresponding value if it finds a match; otherwise, NA is used. Printing out the variable bar thus produces

    Obs          bar
    1970:1
    1970:2    7.8705
    1970:3    7.5600
    1970:4    7.1892
    1971:1
    1971:2    5.8679
    1971:3    6.2442
    1971:4    5.9811
    1972:1
    1972:2    4.6883
    1972:3    4.6302

Appendix C Building gretl

C.1 Requirements

Gretl is written in the C programming language, abiding as far as possible by the ISO/ANSI C Standard (C90), although the graphical user interface and some other components necessarily make use of platform-specific extensions.

The program was developed under Linux. The shared library and command-line client should compile and run on any platform that supports ISO/ANSI C and has the libraries listed in Table C.1. If the GNU readline library is found on the host system this will be used for gretlcli, providing a much enhanced editable command line. See the readline homepage.

    Library    purpose                  website
    zlib       data compression         info-zip.org
    libxml2    XML manipulation         xmlsoft.org
    LAPACK     linear algebra           netlib.org
    FFTW3      Fast Fourier Transform   fftw.org
    glib-2.0   Numerous utilities       gtk.org

Table C.1: Libraries required for building gretl

The graphical client program should compile and run on any system
100. 5548

    Adjusted R-squared = 0.995417
    Degrees of freedom = 34
    Durbin-Watson statistic = 0.513696
    First-order autocorrelation coeff. = 0.768301
    Log-likelihood = -240.616
    Akaike information criterion (AIC) = 485.232
    Schwarz Bayesian criterion (BIC) = 488.399
    Hannan-Quinn criterion (HQC) = 486.338

Figure 2.4: Model output window

The output window contains menus that allow you to inspect or graph the residuals and fitted values, and to run various diagnostic tests on the model.

For most models there is also an option to print the regression output in LaTeX format. See Chapter 24 for details.

To import gretl output into a word processor, you may copy and paste from an output window, using its Edit menu (or Copy button, in some contexts) to the target program. Many (not all) gretl windows offer the option of copying in RTF (Microsoft's Rich Text Format) or as LaTeX. If you are pasting into a word processor, RTF may be a good option because the tabular formatting of the output is preserved. Alternatively, you can save the output to a (plain text) file then import the file into the target program. When you finish a gretl session you are given the option of saving all the output from the session to a single file.

Note that on the gnome desktop and under MS Windows, the File menu includes a command to send the output directly to a printer.

Footnote: When pasting or importing plain text gretl output into a word processor, select a monospaced or
101.

$$\hat{\Omega}_k(\theta) = \sum_{i=-k}^{k} \left[ w_i \left( \frac{1}{T} \sum_{t=i}^{T} f_t(\theta) f_{t-i}(\theta)' \right) \right] \qquad (18.10)$$

Gretl computes the HAC covariance matrix by default when a GMM model is estimated on time series data. You can control the kernel and the bandwidth (that is, the value of k in 18.10) using the set command. See chapter 14 for further discussion of HAC estimation. You can also ask gretl not to use the HAC version by saying

    set force_hc on

Example 18.3: TSLS via GMM: partial output

    Model 1: TSLS estimates using the 48 observations 1-48
    Dependent variable: lpackpc
    Instruments: rtaxso rtax
    Heteroskedasticity-robust standard errors, variant HC0

      VARIABLE    COEFFICIENT    STDERROR    T STAT    P-VALUE
      const         9.89496      0.928758    10.654   <0.00001
      lravgprs     -1.27742      0.241684    -5.286   <0.00001
      lperinc       0.280405     0.245828     1.141    0.25401

    Model 2: 1-step GMM estimates using the 48 observations 1-48
    e = lpackpc - b0 - b1*lravgprs - b2*lperinc

      PARAMETER    ESTIMATE      STDERROR    T STAT    P-VALUE
      b0            9.89496      0.928758    10.654   <0.00001
      b1           -1.27742      0.241684    -5.286   <0.00001
      b2            0.280405     0.245828     1.141    0.25401

    GMM criterion = 0.0110046

18.5 A real example: the Consumption Based Asset Pricing Model

To illustrate gretl's implementation of GMM, we will replicate the example given in chapter 3 of Hall (2005). The model to estimate is a classic application of GMM, and provides an example of a case when orthogonality conditions do not stem from statistical co
102. CME in gretl.

If you wish to use HC1, HC2 or HC3 you can arrange for this in either of two ways. In script mode, you can do, for example,

    set hc_version 2

In the GUI program you can go to the HCCME configuration dialog, as noted above, and choose any of these variants to be the default.

14.3 Time series data and HAC covariance matrices

Heteroskedasticity may be an issue with time series data too, but it is unlikely to be the only, or even the primary, concern.

One form of heteroskedasticity is common in macroeconomic time series, but is fairly easily dealt with. That is, in the case of strongly trending series such as Gross Domestic Product, aggregate consumption, aggregate investment, and so on, higher levels of the variable in question are likely to be associated with higher variability in absolute terms. The obvious fix, employed in many macroeconometric studies, is to use the logs of such series rather than the raw levels. Provided the proportional variability of such series remains roughly constant over time, the log transformation is effective.

Other forms of heteroskedasticity may resist the log transformation, but may demand a special treatment distinct from the calculation of robust standard errors. We have in mind here autoregressive conditional heteroskedasticity, for example in the behavior of asset prices, where large disturbances to the market may usher in periods of increased volatility. Such phenomena call for specific
103. CML method minimizes the sum of squared one-step-ahead prediction errors generated by the model for the observations $t_0, \ldots, T$. The starting point $t_0$ depends on the orders of the AR polynomials in the model. The numerical maximization method used is BHHH, and the covariance matrix is computed using a Gauss-Newton regression.

The CML method is nearly equivalent to maximum likelihood under the hypothesis of normality; the difference is that the first $(t_0 - 1)$ observations are considered fixed and only enter the likelihood function as conditioning variables. As a consequence, the two methods are asymptotically equivalent under standard conditions, except for the fact, discussed above, that our CML implementation treats the constant and exogenous variables as per equation (20.3).

The two methods can be compared as in the following example

    open data10-1
    arma 1 1 ; r
    arma 1 1 ; r --conditional

which produces the estimates shown in Table 20.1. As you can see, the estimates of $\phi$ and $\theta$ are quite similar. The reported constants differ widely, as expected; see the discussion following equations (20.4) and (20.5). However, dividing the CML constant by $1 - \phi$ we get 7.38, which is not far from the ML estimate of 6.93.

Chapter 20 Time series models 150

Table 20.1: ML and CML estimates (standard errors in parentheses)

    Parameter    ML                       CML
    mu           6.93042  (0.923882)      1.07322  (0.488661)
    phi          0.855360 (0.0511842)     0.852772 (0.0450252)
    theta        0.588056 (0.0986096)     0.591838 (0.0456662)
104. Command Reference. If such a file is found, its contents will be used as the preamble to the TeX source. The default value of the preamble is as follows:

    \documentclass[11pt]{article}
    \usepackage[latin1]{inputenc} %% but see below
    \usepackage{amsmath}
    \usepackage{dcolumn,longtable}
    \begin{document}
    \thispagestyle{empty}

Chapter 24 Gretl and Tex 192

Table 24.2: Example of model table output

    OLS estimates
    Dependent variable: ENROLL

                Model 1       Model 2       Model 3
    const       0.2907        0.2411        0.08557
                (0.07853)     (0.06602)     (0.05794)
    CATHOL      0.2216        0.2235        0.2065
                (0.04584)     (0.04597)     (0.05160)
    PUPIL       0.003035      0.003382      0.001697
                (0.002727)    (0.002720)    (0.003025)
    WHITE       0.1482        0.1526
                (0.04074)     (0.04071)
    ADMEXP      0.1551
                (0.1342)
    n           51            51            51
    R-bar^2     0.4502        0.4462        0.2956
    l           96.09         95.36         88.69

    Standard errors in parentheses
    * indicates significance at the 10 percent level
    ** indicates significance at the 5 percent level

Note that the amsmath and dcolumn packages are required. (For some sorts of output the longtable package is also needed.) Beyond that you can, for instance, change the type size or the font by altering the documentclass declaration or including an alternative font package.

The line \usepackage[latin1]{inputenc} is automatically changed if gretl finds itself running on a system where UTF-8 is the default character encoding; see section 24.4 below.

In addition, if you should wish to typeset gretl output in m
105. Discrete and censored dependent variables 183

Example 22.6: Heckit model

    open mroz87.gdt

    genr EXP2 = AX^2
    genr WA2 = WA^2
    genr KIDS = (KL6+K618)>0

    # Greene's specification

    list X = const AX EXP2 WE CIT
    list Z = const WA WA2 FAMINC KIDS WE

    heckit WW X ; LFP Z --two-step
    heckit WW X ; LFP Z

    # Wooldridge's specification

    series NWINC = FAMINC - WW*WHRS
    series lww = log(WW)
    list X = const WE AX EXP2
    list Z = X NWINC WA KL6 K618

    heckit lww X ; LFP Z --two-step

Chapter 23 Quantile regression

23.1 Introduction

In Ordinary Least Squares (OLS) regression, the fitted values, $\hat{y}_i = X_i \hat{\beta}$, represent the conditional mean of the dependent variable: conditional, that is, on the regression function and the values of the independent variables. In median regression, by contrast and as the name implies, fitted values represent the conditional median of the dependent variable. It turns out that the principle of estimation for median regression is easily stated (though not so easily computed), namely, choose $\hat{\beta}$ so as to minimize the sum of absolute residuals. Hence the method is known as Least Absolute Deviations or LAD. While the OLS problem has a straightforward analytical solution, LAD is a linear programming problem.

Quantile regression is a generalization of median regression: the regression function predicts the conditional $\tau$-quantile of the dependent variable, for example the first quartile ($\tau = .25$) or the ninth decile ($\tau$
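The idea that minimizing absolute residuals yields the median, and that a tilted version of the same loss yields other quantiles, can be illustrated for the simplest case of a constant-only model. The following Python sketch is not gretl code. The asymmetric loss in `check_loss` is the standard one from the quantile regression literature, assumed here rather than taken from this page, and both function names are hypothetical.

```python
def check_loss(residual, tau):
    """Asymmetric absolute loss: tau*|r| if r >= 0, (1 - tau)*|r| otherwise."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def best_constant(y, tau):
    """Constant c, searched over the data points, that minimizes the summed loss."""
    return min(sorted(y), key=lambda c: sum(check_loss(v - c, tau) for v in y))

# A small sample with one large outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(y, 0.5)   # tau = 0.5 reduces to least absolute deviations
upper_fit = best_constant(y, 0.9)    # tau = 0.9 weights positive residuals more heavily
print(median_fit, upper_fit)
```

With tau = 0.5 the minimizer is the sample median, 3.0, which is unaffected by the outlier; the sample mean of the same data is 22.0. With regressors the search over candidate values is replaced by a linear programming problem, as noted in the text.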
106.

- Send to: Send the current data set as an e-mail attachment.
- New data set: Allows you to create a blank data set, ready for typing in values or for importing series from a database. See below for more on databases.
- Clear data set: Clear the current data set out of memory. Generally you don't have to do this (since opening a new data file automatically clears the old one) but sometimes it's useful.
- Script files: A script is a file containing a sequence of gretl commands. This item contains entries that let you open a script you have created previously (User file), open a sample script, or open an editor window in which you can create a new script.
- Session files: A session file contains a snapshot of a previous gretl session, including the data set used and any models or graphs that you saved. Under this item you can open a saved session or save the current session.
- Databases: Allows you to browse various large databases, either on your own computer or, if you are connected to the internet, on the gretl database server. See Section 4.3 for details.
- Function files: Handles function packages (see Section 10.5), which allow you to access functions written by other users and share the ones written by you.
- Exit: Quit the program. You'll be prompted to save any unsaved work.

Tools menu

- Statistical tables: Look up critical values for commonly used distributions (normal or Gaussian, t, chi-square, F and Durbin-Watso
107. FGS (Broyden, Fletcher, Goldfarb and Shanno) method. This technique is used in most econometric and statistical packages, as it is well-established and remarkably powerful. Clearly, in order to make this technique operational, it must be possible to compute the vector $g(\theta)$ for any value of $\theta$. In some cases this vector can be written explicitly as a function of Y. If this is not possible or too difficult the gradient may be evaluated numerically.

The choice of the starting value, $\theta_0$, is crucial in some contexts and inconsequential in others. In general, however, it is advisable to start the algorithm from sensible values whenever possible. If a consistent estimator is available, this is usually a safe and efficient choice: this ensures that in large samples the starting point will likely be close to $\hat{\theta}$ and convergence can be achieved in few iterations.

The maximum number of iterations allowed for the BFGS procedure, and the relative tolerance for assessing convergence, can be adjusted using the set command: the relevant variables are bfgs_maxiter (default value 500) and bfgs_toler (default value, the machine precision to the power 3/4).

Covariance matrix and standard errors

By default the covariance matrix of the parameter estimates is based on the Outer Product of the Gradient. That is,

$$\widehat{\mathrm{Var}}_{\mathrm{OPG}}(\hat{\theta}) = \left( G'(\hat{\theta}) \, G(\hat{\theta}) \right)^{-1}$$

where $G(\hat{\theta})$ is the $T \times k$ matrix of contributions to the gradient. Two other options are available. If the --hessian flag is give
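To make the Outer Product of the Gradient formula concrete, here is a minimal Python sketch for a one-parameter model (k = 1), so that G is a column of T gradient contributions and the inverse of G'G is a scalar reciprocal. This is not gretl code. The exponential log-likelihood, the data and the function name are illustrative assumptions, chosen only because the ML estimate has a closed form.

```python
import math

def exp_scores(x, theta):
    """Per-observation gradient of log(theta) - theta * x_t with respect to theta."""
    return [1.0 / theta - xt for xt in x]

# Hypothetical sample; the exponential rate theta has ML estimate 1 / mean(x).
x = [0.5, 1.0, 1.5, 2.0, 5.0]
theta_hat = len(x) / sum(x)

G = exp_scores(x, theta_hat)                  # T x 1 matrix of gradient contributions
opg_variance = 1.0 / sum(g * g for g in G)    # (G'G)^{-1} when k = 1
opg_stderr = math.sqrt(opg_variance)

print(theta_hat, opg_variance, opg_stderr)
```

At the ML estimate the contributions sum to zero, which is a quick check that the maximum has been reached. With more than one parameter, G'G becomes a k x k matrix and must be inverted as such.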
108. Free Software Foundation, for his support of free software in general and for agreeing to adopt gretl as a GNU program in particular.

Many users of gretl have submitted useful suggestions and bug reports. In this connection particular thanks are due to Ignacio Díaz-Emparanza, Tadeusz Kufel, Pawel Kufel, Alan Isaac, Cri Rigamonti, Sven Schreiber, Talha Yalta, Andreas Rosenblad, and Dirk Eddelbuettel, who maintains the gretl package for Debian GNU/Linux.

1.3 Installing the programs

Linux

On the Linux platform you have the choice of compiling the gretl code yourself or making use of a pre-built package. Building gretl from the source is necessary if you want to access the development version or customize gretl to your needs, but this takes quite a few skills; most users will want to go for a pre-built package.

Some Linux distributions feature gretl as part of their standard offering: Debian, for example, or Ubuntu (in the universe repository). If this is the case, all you need to do is install gretl through your package manager of choice (e.g. synaptic).

Ready-to-run packages are available in rpm format (suitable for Red Hat Linux and related systems) on the gretl webpage, http://gretl.sourceforge.net.

However, we're hopeful that some users with coding skills may consider gretl sufficiently interesting to be worth improving and extending. The documentation of the libgretl API is by no means complete, but you can find some details by
109.

    Gnu Multiple Precision support: yes
    MPFR support: no
    LAPACK support: yes
    FFTW3 support: yes
    Build with GTK version: 2.0
    Script syntax highlighting: yes
    Use installed gtksourceview: yes
    Build with gnome support: no
    Build gretl documentation: yes
    Build message catalogs: yes
    Gnome installation prefix: NA
    X-12-ARIMA support: yes
    TRAMO/SEATS support: yes
    Experimental audio support: no

    Now type 'make' to build gretl.

Build and install

We are now ready to undertake the compilation proper: this is done by running the make command, which takes care of compiling all the necessary source files in the correct order. All you need to do is type

    make

This step will likely take several minutes to complete; a lot of output will be produced on screen. Once this is done, you can install your freshly baked copy of gretl on your system via

    make install

On most systems, the make install command requires you to have administrative privileges. Hence, either you log in as root before launching make install or you may want to use the sudo utility:

    sudo make install

Appendix D Numerical accuracy

Gretl uses double-precision arithmetic throughout, except for the multiple-precision plugin invoked by the menu item Model, Other linear models, High precision OLS, which represents floating-point values using a number of bits given by the environment variable GRETL_MP_BITS (default value 256).

The normal equations of Least Squares are by def
110. Gretl User's Guide

Gnu Regression, Econometrics and Time-series

Allin Cottrell
Department of Economics
Wake Forest University

Riccardo "Jack" Lucchetti
Dipartimento di Economia
Università Politecnica delle Marche

December, 2008

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation (see http://www.gnu.org/licenses/fdl.html).

Contents

1 Introduction
    1.1 Features at a glance
    1.2 Acknowledgements
    1.3 Installing the programs

I Running the program

2 Getting started
    2.1 Let's run a regression
    2.2 Estimation output
    2.3 The main window menus
    2.4 Keyboard shortcuts
    2.5 The gretl toolbar

3 Modes of working
    3.1 Command scripts
    3.2 Saving script objects
    3.3 The gretl console
    3.4 The Session concept

4 Data files
    4.1 Native format
    4.2 Other data file formats
    4.3 Binary databases
    4.4 Creating a data file from scratch
    4.5 Structuring a dataset
    4.6 Missing data v
111.

    Library     command
    GLIB        apt-get install libglib2.0-dev
    GTK 2.0     apt-get install libgtk2.0-dev
    PNG         apt-get install libpng12-dev
    XSLT        apt-get install libxslt1-dev
    LAPACK      apt-get install lapack3-dev
    FFTW        apt-get install fftw3-dev
    READLINE    apt-get install libreadline5-dev
    GMP         apt-get install libgmp3-dev

(GMP is optional, but recommended.) The dev packages for these libraries are necessary to compile gretl; you'll also need the plain, non-dev library packages to run gretl, but most of these should already be part of a standard installation. In order to enable other optional features, like audio support, you may need to install more libraries.

Getting the source: release or CVS

At this point, it is possible to build from the source. You have two options here: obtain the latest released source package, or retrieve the current CVS version of gretl (CVS = Concurrent Versions System). The usual caveat applies to the CVS version, namely, that it may not build correctly and may contain experimental code; on the other hand, CVS often contains bug-fixes relative to the released version. If you want to help with testing and to contribute bug reports, we recommend using CVS gretl.

To work with the released source:

1. Download the gretl source package from gretl.sourceforge.net.
2. Unzip and untar the package. On a system with the GNU utilities available, the command would be tar xvfz gretl-N.tar.gz (replace N with th
112. MA. These programs employ a fixed-size memory allocation, and can't handle series of more than 600 observations.

4.8 Data file collections

If you're using gretl in a teaching context you may be interested in adding a collection of data files and/or scripts that relate specifically to your course, in such a way that students can browse and access them easily.

There are three ways to access such collections of files:

- For data files: select the menu item File, Open data, Sample file, or click on the folder icon on the gretl toolbar.
- For script files: select the menu item File, Script files, Practice file.

When a user selects one of the items:

- The data or script files included in the gretl distribution are automatically shown (this includes files relating to Ramanathan's Introductory Econometrics and Greene's Econometric Analysis).
- The program looks for certain known collections of data files available as optional extras, for instance the datafiles from various econometrics textbooks (Davidson and MacKinnon, Gujarati, Stock and Watson, Verbeek, Wooldridge) and the Penn World Table (PWT 5.6). (See the data page at the gretl website for information on these collections.) If the additional files are found, they are added to the selection windows.

Footnote 4: genr also offers the inverse function to misszero, namely zeromiss, which replaces zeros in a given series with the missing observation code.

Chapter 4 Data files 28
113. The ADF test

The Augmented Dickey-Fuller (ADF) test is, as implemented in gretl, the t-statistic on $\varphi$ in the following regression:

$$\Delta y_t = \mu_t + \varphi y_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta y_{t-i} + \epsilon_t \qquad (20.8)$$

This test statistic is probably the best-known and most widely used unit root test. It is a one-sided test whose null hypothesis is $\varphi = 0$ versus the alternative $\varphi < 0$. Under the null, $y_t$ must be differenced at least once to achieve stationarity; under the alternative, $y_t$ is already stationary and no differencing is required. Hence, large negative values of the test statistic lead to the rejection of the null.

One peculiar aspect of this test is that its limit distribution is non-standard under the null hypothesis: moreover, the shape of the distribution, and consequently the critical values for the test, depends on the form of the $\mu_t$ term. A full analysis of the various cases is inappropriate here: Hamilton (1994) contains an excellent discussion, but any recent time series textbook covers this topic. Suffice it to say that gretl allows the user to choose the specification for $\mu_t$ among four different alternatives:

    mu_t                                   command option
    0                                      --nc
    mu_0                                   --c
    mu_0 + mu_1 t                          --ct
    mu_0 + mu_1 t + mu_2 t^2               --ctt

Footnote 1: See in particular their Program 4 on p. 505ff.

These options are not mutually exclusive; when they are used together the statistic will be reported separately for each case. By default, gretl uses the combinatio
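The test statistic in equation (20.8) is an ordinary OLS t-ratio; only its reference distribution is unusual. The following Python sketch computes it for the simplest special case, with no deterministic term and no lagged differences (p = 0). It is not gretl code; the function name, the simulated AR(1) series and the random seed are illustrative assumptions.

```python
import math
import random

def df_tstat_nc(y):
    """t-ratio on phi in  dy_t = phi * y_{t-1} + e_t  (no constant, no lags)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(v * v for v in lagged)
    phi = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    resid = [d - phi * v for v, d in zip(lagged, diffs)]
    s2 = sum(e * e for e in resid) / (len(diffs) - 1)   # one estimated parameter
    return phi / math.sqrt(s2 / sxx)

# Simulate a clearly stationary AR(1) series: y_t = 0.5 * y_{t-1} + e_t.
random.seed(12345)
stationary = [0.0]
for _ in range(500):
    stationary.append(0.5 * stationary[-1] + random.gauss(0.0, 1.0))

t_stat = df_tstat_nc(stationary)
print(t_stat)
```

For a series like this the statistic is strongly negative, which is the direction that leads to rejection of the unit-root null. The value must be compared with Dickey-Fuller critical values rather than with the usual t or normal tables, for the reason given in the text.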
114. The two functions mread and mwrite can be used for basic matrix input/output. This can be useful to enable gretl to exchange data with other programs.

The mread function accepts one string parameter: the name of the (plain text) file from which the matrix is to be read. The file in question must conform to the following rules:

1. The columns must be separated by spaces or tab characters.
2. The decimal separator must be the dot character.
3. The first line in the file must contain two integers, separated by a space or a tab, indicating the number of rows and columns, respectively.

Footnote: This is not the only definition of the SVD: some writers define U as $m \times m$, $\Sigma$ as $m \times n$ (with k non-zero diagonal elements) and V as $n \times n$.

Chapter 12 Matrix manipulation 92

Should an error occur (such as the file being badly formatted or inaccessible), an empty matrix (see section 12.2) is returned.

The complementary function mwrite produces text files formatted as described above. The column separator is the tab character, so import into spreadsheets should be straightforward. Usage is illustrated in example 12.2. Matrices stored via the mwrite command can be easily read by other programs; the following table summarizes the appropriate commands for reading a matrix A from a file called a.mat in some widely-used programs.

    Program    Sample code
    GAUSS      tmp[] = load a.mat;
               A = reshape(tmp[3:rows(tmp)],tmp[1],tmp[2]);
    Octave     fd = fopen("a.mat", "r");
               [r,c] = f
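For readers who want to exchange matrices with a program not listed in the table, the three file rules above are simple to implement directly. The following Python sketch writes and reads that layout. It is an illustration under stated assumptions, not a description of gretl's own code: the function names are hypothetical, values are assumed to be stored row by row, and `None` stands in for the empty matrix that mread returns on error.

```python
import os
import tempfile

def write_matrix(path, rows):
    """Write a list of equal-length rows in the plain-text layout described above."""
    with open(path, "w") as fh:
        fh.write("%d\t%d\n" % (len(rows), len(rows[0])))   # header: rows, columns
        for row in rows:
            fh.write("\t".join(repr(float(v)) for v in row) + "\n")

def read_matrix(path):
    """Read the file back; return None if the contents do not match the header."""
    with open(path) as fh:
        tokens = fh.read().split()     # spaces and tabs both separate fields
    if len(tokens) < 2:
        return None
    n_rows, n_cols = int(tokens[0]), int(tokens[1])
    values = [float(t) for t in tokens[2:]]
    if len(values) != n_rows * n_cols:
        return None
    return [values[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]

A = [[1.0, 2.5, -3.0], [4.0, 0.125, 6.0]]
path = os.path.join(tempfile.mkdtemp(), "a.mat")
write_matrix(path, A)
B = read_matrix(path)
print(B)
```

Python's float formatting always uses the dot as decimal separator, which satisfies rule 2 regardless of locale.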
115. [...]

which incorporates the intercept into the cointegration vector. This is known as the "restricted constant" case.

3. $m = 0$ and $k = 0$. This case is the most restrictive: clearly, neither $x_t$ nor $y_t$ are trended, and the mean distance between them is zero. The vector $\mu_0$ is also 0, which explains why this case is referred to as "no constant".

In most cases, the choice between these three possibilities is based on a mix of empirical observation and economic reasoning. If the variables under consideration seem to follow a linear trend then we should not place any restriction on the intercept. Otherwise, the question arises of whether it makes sense to specify a cointegration relationship which includes a non-zero intercept. One example where this is appropriate is the relationship between two interest rates: generally these are not trended, but the VAR might still have an intercept because the difference between the two (the

Chapter 21 Cointegration and Vector Error Correction Models 161

interest rate spread) might be stationary around a non-zero mean (for example, because of a risk or liquidity premium).

The previous example can be generalized in three directions:

1. If a VAR of order greater than 1 is considered, the algebra gets more convoluted but the conclusions are identical.
2. If the VAR includes more than two endogenous variables the cointegration rank r can be greater than 1. In this
116. Contents

    18.5 A real example: the Consumption Based Asset Pricing Model
    18.6 Caveats

19 Model selection criteria
    19.1 Introduction
    19.2 Information criteria

20 Time series models
    20.1 Introduction
    20.2 ARMA models
    20.3 Unit root tests
    20.4 ARCH and GARCH

21 Cointegration and Vector Error Correction Models
    21.1 Introduction
    21.2 Vector Error Correction Models as representation of a cointegrated system
    21.3 Interpretation of the deterministic components
    21.4 The Johansen cointegration tests
    21.5 Identification of the cointegration vectors
    21.6 Over-identifying restrictions
    21.7 Numerical solution methods

22 Discrete and censored dependent variables
    22.1 Logit and probit models
    22.2 Ordered response models
    22.3 Multinomial logit
    22.4 The Tobit model
    22.5 Interval regression
    22.6 Sample selection model
117. al Assumptions", Econometrica 55, pp. 765-99.

Nerlove, M. (1999) "Properties of Alternative Estimators of Dynamic Panel Models: An Empirical Analysis of Cross-Country Data for the Study of Economic Growth", in Hsiao, C., Lahiri, K., Lee, L.-F. and Pesaran, M. H. (eds) Analysis of Panels and Limited Dependent Variable Models, Cambridge: Cambridge University Press.

Neter, J., Wasserman, W. and Kutner, M. H. (1990) Applied Linear Statistical Models, 3rd edition, Boston, MA: Irwin.

Bibliography 227

Newey, W. K. and West, K. D. (1987) "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix", Econometrica 55, pp. 703-8.

Newey, W. K. and West, K. D. (1994) "Automatic Lag Selection in Covariance Matrix Estimation", Review of Economic Studies 61, pp. 631-53.

Pesaran, M. H. and Taylor, L. W. (1999) "Diagnostics for IV Regressions", Oxford Bulletin of Economics and Statistics 61/2, pp. 255-81.

Portnoy, S. and Koenker, R. (1997) "The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators", Statistical Science 12/4, pp. 279-300.

R Core Development Team (2000) An Introduction to R, version 1.1.1.

Ramanathan, Ramu (2002) Introductory Econometrics with Applications, 5th edition, Fort Worth: Harcourt.

Schwarz, G. (1978) "Estimating the dimension of a model", Annals of Statistics 6, pp. 461-64.

Shapiro, S. and Chen, L. (2001)
118. al estimation method, which encompasses practically all the parametric estimation techniques used in econometrics. It was introduced in Hansen (1982) and Hansen and Singleton (1982); an excellent and thorough treatment is given in Davidson and MacKinnon (1993), chapter 17.

The basic principle on which GMM is built is rather straightforward. Suppose we wish to estimate a scalar parameter $\theta$ based on a sample $x_1, x_2, \ldots, x_T$. Let $\theta_0$ indicate the true value of $\theta$. Theoretical considerations (either of statistical or economic nature) may suggest that a relationship like the following holds:

$$E\left[ x_t - g(\theta) \right] = 0 \iff \theta = \theta_0, \qquad (18.1)$$

with $g(\cdot)$ a continuous and invertible function. That is to say, there exists a function of the data and the parameter, with the property that it has expectation zero if and only if it is evaluated at the true parameter value. For example, economic models with rational expectations lead to expressions like (18.1) quite naturally.

If the sampling model for the $x_t$s is such that some version of the Law of Large Numbers holds, then

$$\bar{X} = \frac{1}{T} \sum_{t=1}^{T} x_t \; \xrightarrow{p} \; g(\theta_0);$$

hence, since $g(\cdot)$ is invertible, the statistic

$$\hat{\theta} = g^{-1}(\bar{X}) \; \xrightarrow{p} \; \theta_0,$$

so $\hat{\theta}$ is a consistent estimator of $\theta$. A different way to obtain the same outcome is to choose, as an estimator of $\theta$, the value that minimizes the objective function

$$F(\theta) = \left[ \frac{1}{T} \sum_{t=1}^{T} \left( x_t - g(\theta) \right) \right]^2 = \left[ \bar{X} - g(\theta) \right]^2; \qquad (18.2)$$

the minimum is trivially reached at $\hat{\theta} = g^{-1}(\bar{X})$, since the expression in square brackets eq
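The equivalence between inverting g at the sample mean and minimizing the objective function in (18.2) can be seen in a small numerical example. The Python sketch below is not gretl code. The moment function g(theta) = 1/theta, the data and the grid are all illustrative assumptions; any continuous, invertible g would serve.

```python
def g(theta):
    """Hypothetical moment function: E[x] = 1/theta (e.g. an exponential rate)."""
    return 1.0 / theta

def objective(theta, sample):
    """F(theta) = (mean(x) - g(theta))^2, as in equation (18.2)."""
    xbar = sum(sample) / len(sample)
    return (xbar - g(theta)) ** 2

sample = [0.5, 1.0, 1.5, 2.0, 5.0]
xbar = sum(sample) / len(sample)
theta_closed = 1.0 / xbar              # g^{-1}(xbar) for this choice of g

# Crude grid search over theta in (0, 2], to confirm the minimizer numerically.
grid = [k / 1000.0 for k in range(1, 2001)]
theta_grid = min(grid, key=lambda th: objective(th, sample))

print(theta_closed, theta_grid, objective(theta_closed, sample))
```

Both routes give the same estimate, and the objective is exactly zero there. The minimization route becomes the useful one when there are more moment conditions than parameters, so that no value of theta sets every condition to zero.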
119. alues
    4.7 Maximum size of data sets
    4.8 Data file collections

5 Special functions in genr
    5.1 Introduction
    5.2 Long-run variance
    5.3 Time-series filters
    5.4 Panel data specifics
    5.5 Resampling and bootstrapping
    5.6 Cumulative densities and p-values
    5.7 Handling missing values
    5.8 Retrieving internal variables
    5.9 Numerical procedures
    5.10 The discrete Fourier transform

6 Sub-sampling a dataset
    6.1 Introduction
    6.2 Setting the sample
    6.3 Restricting the sample
    6.4 Random sampling
    6.5 The Sample menu items

7 Graphs and plots
    7.1 Gnuplot graphs
    7.2 Boxplots

8 Discrete variables
    8.1 Declaring variables as discrete
    8.2 Commands for discrete variables

9 Loop constructs
    9.1 Introduction
    9.2 Loop control variants
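120. ample, the Cobb-Douglas cost function arises when $C^*_i$ is a linear function of the logarithms of the input prices and the output quantities. The stochastic frontier model is a linear model of the form $y_i = x_i \beta + \varepsilon_i$ in which the error term $\varepsilon_i$ is the sum of $u_i$ and $v_i$. A common postulate is that $u_i \sim |N(0, \sigma_u^2)|$ and $v_i \sim N(0, \sigma_v^2)$. If independence between $u_i$ and $v_i$ is also assumed, then it is possible to show that the density function of $\varepsilon_i$ has the form:
    $f(\varepsilon_i) = \sqrt{\frac{2}{\pi}} \, \Phi\left(\frac{\lambda \varepsilon_i}{\sigma}\right) \frac{1}{\sigma} \, \phi\left(\frac{\varepsilon_i}{\sigma}\right)$    (17.4)
where $\Phi(\cdot)$ and $\phi(\cdot)$ are, respectively, the distribution and density function of the standard normal, $\sigma = \sqrt{\sigma_u^2 + \sigma_v^2}$ and $\lambda = \sigma_u / \sigma_v$. As a consequence, the log-likelihood for one observation takes the form (apart from an irrelevant constant)
    $\ell_i = \log \Phi\left(\frac{\lambda \varepsilon_i}{\sigma}\right) - \left[ \log(\sigma) + \frac{\varepsilon_i^2}{2 \sigma^2} \right]$
Therefore, a Cobb-Douglas cost function with stochastic frontier is the model described by the following equations:
    $\log C_i = \log C^*_i + \varepsilon_i$
    $\log C^*_i = c + \sum_{j=1}^{m} \beta_j \log y_{ij} + \sum_{j=1}^{n} \alpha_j \log p_{ij}$
    $\varepsilon_i = u_i + v_i$
    $u_i \sim |N(0, \sigma_u^2)|$
    $v_i \sim N(0, \sigma_v^2)$
In most cases, one wants to ensure that the homogeneity of the cost function with respect to the prices holds by construction. Since this requirement is equivalent to $\sum_{j=1}^{n} \alpha_j = 1$, the above equation for $C^*_i$ can be rewritten as
    $\log C_i - \log p_{in} = c + \sum_{j=1}^{m} \beta_j \log y_{ij} + \sum_{j=2}^{n} \alpha_j (\log p_{ij} - \log p_{in}) + \varepsilon_i$    (17.5)
The above equation could be estimated by OLS, but it would suffer from two draw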
121. ample sizes:

    Observations        Bins
    8 <= n < 16         5
    16 <= n < 50        7
    50 <= n <= 850      ceil(sqrt(n))
    n > 850             29

    discrete TUCE # mark TUCE as discrete
    freq TUCE

yields

    Read datafile /usr/local/share/gretl/data/greene/greene19_1.gdt
    periodicity: 1, maxobs: 32, observations range: 1-32

    Listing 5 variables:
      0) const    1) GPA    2) TUCE    3) PSI    4) GRADE

    ? freq TUCE

    Frequency distribution for TUCE, obs 1-32
    number of bins = 7, mean = 21.9375, sd = 3.90151

      interval             midpt   frequency    rel.     cum.
            <  13.417     12.000        1       3.12     3.12
      13.417 - 16.250     14.833        1       3.12     6.25
      16.250 - 19.083     17.667        6      18.75    25.00
      19.083 - 21.917     20.500        6      18.75    43.75
      21.917 - 24.750     23.333        9      28.12    71.88
      24.750 - 27.583     26.167        7      21.88    93.75
            >= 27.583     29.000        2       6.25   100.00

    Test for null hypothesis of normal distribution:
    Chi-square(2) = 1.872 with p-value 0.39211

    ? discrete TUCE # mark TUCE as discrete
    ? freq TUCE

    Frequency distribution for TUCE, obs 1-32

             frequency    rel.     cum.
      12         1        3.12     3.12
      14         1        3.12     6.25
      17         3        9.38    15.62
      19         3        9.38    25.00
      20         2        6.25    31.25
      21         4       12.50    43.75
      22         2        6.25    50.00
      23         4       12.50    62.50
      24         3        9.38    71.88
      25         4       12.50    84.38
      26         2        6.25    90.62
      27         1        3.12    93.75
      28         1        3.12    96.88
      29         1        3.12   100.00

    Test for null hypothesis of normal
122. arameters) from all successful runs, in which the gretl estimate agreed with the certified value to at least the 6 significant figures which are printed by default in the gretl regression output.

Table 16.1: Nonlinear regression: the NIST tests

                                                          Analytical     Numerical
                                                          derivatives    derivatives
    Failures in 54 tests                                       4              5
    Average iterations                                        32            127
    Mean of min. correct figures, parameters                8.120          6.980
    Worst of min. correct figures, parameters                  4              3
    Mean of min. correct figures, standard errors           8.000          5.673
    Worst of min. correct figures, standard errors             5              2
    Percent correct to at least 6 figures, parameters        96.5           91.9
    Percent correct to at least 6 figures, standard errors   97.7           77.3

Using analytical derivatives, the worst-case values for both parameters and standard errors were

Footnote 3: The data shown in the table were gathered from a pre-release build of gretl version 1.0.9, compiled with gcc 3.3, linked against glibc 2.3.2, and run under Linux on an i686 PC (IBM ThinkPad A21m).

Footnote 4: For the standard errors, I excluded one outlier from the statistics shown in the table, namely Lanczos1. This is an odd case, using generated data with an almost-exact fit: the standard errors are 9 or 10 orders of magnitude smaller than the coefficients. In this instance gretl could reproduce the certified standard errors to only 3 figures (analytical derivatives) and 2 figures (numerical derivatives).

improved to 6 correct figures
123. as selecting "Edit attributes" from the "Variable" menu.
• F2: Same as e. Included for compatibility with other programs.
• g: Has the same effect as selecting "Define new variable" from the "Variable" menu (which maps onto the genr command).
• h: Opens a help window for gretl commands.
• F1: Same as h. Included for compatibility with other programs.
• r: Refreshes the variable list in the main window; has the same effect as selecting "Refresh window" from the "Data" menu.
• t: Graphs the selected variable; a line graph is used for time-series datasets, whereas a distribution plot is used for cross-sectional data.

2.5 The gretl toolbar

At the bottom left of the main window sits the toolbar. The icons have the following functions, reading from left to right:
1. Launch a calculator program. A convenience function in case you want quick access to a calculator when you're working in gretl. The default program is calc.exe under MS Windows, or xcalc under the X window system. You can change the program under the "Tools, Preferences, General" menu, "Programs" tab.
2. Start a new script. Opens an editor window in which you can type a series of commands to be sent to the program as a batch.
3. Open the gretl console. A shortcut to the "Gretl console" menu item (Section 2.3 above).
4. Open the gretl session icon window.
5. Open a window displaying available gretl function packa
124. asy to set up an omitted-variables test equivalent to the familiar LM test in the context of a linear regression; example 22.2 shows how to perform a variable addition test.

Example 22.2: Variable addition test in a probit model

    open greene19_1
    probit GRADE const GPA PSI
    series u = $uhat
    ols u const GPA PSI TUCE -q
    printf "Variable addition test for TUCE:\n"
    printf "Rsq * T = %g (p. val. = %g)\n", $trsq, pvalue(X,1,$trsq)

The perfect prediction problem

One curious characteristic of logit and probit models is that (quite paradoxically) estimation is not feasible if a model fits the data perfectly; this is called the perfect prediction problem. The reason why this problem arises is easy to see by considering equation (22.6): if for some vector $\beta$ and scalar $k$ it's the case that $z_i < k$ whenever $y_i = 0$ and $z_i > k$ whenever $y_i = 1$, the same thing is true for any multiple of $\beta$. Hence, $L(\beta)$ can be made arbitrarily close to 0 simply by choosing enormous values for $\beta$. As a consequence, the log-likelihood has no maximum, despite being bounded. Gretl has a mechanism for preventing the algorithm from iterating endlessly in search of a non-existent maximum. One sub-case of interest is when the perfect prediction problem arises because of a single binary explanatory variable. In this case, the offending variable is dropped from the model and estimation proceeds with the reduced specification. Nevertheless, it may happen that no single pe
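The perfect prediction argument can be illustrated with a small numerical sketch. The Python code below is a hypothetical illustration, not gretl's internal procedure: it assumes a logit model with a single regressor, no intercept and threshold k = 0, and the data set is invented so that the sign of x separates the two outcomes exactly.

```python
import math

def logit_loglik(beta, data):
    """Log-likelihood of a logit model with index z_i = beta * x_i."""
    total = 0.0
    for x, y in data:
        z = beta * x
        if z >= 0:
            log_p = -math.log1p(math.exp(-z))       # log F(z)
            log_q = -z - math.log1p(math.exp(-z))   # log(1 - F(z))
        else:
            log_p = z - math.log1p(math.exp(z))
            log_q = -math.log1p(math.exp(z))
        total += log_p if y == 1 else log_q
    return total

# Invented, perfectly separated data: y = 1 exactly when x > 0.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (3.0, 1)]

# Scaling beta up keeps the separation intact and pushes the
# log-likelihood towards its upper bound of zero without reaching it.
for scale in (1.0, 10.0, 100.0, 1000.0):
    print(scale, logit_loglik(scale, data))
```

Each larger multiple of the coefficient gives a log-likelihood closer to zero, so no finite maximizer exists for this sample.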
125. at you need to feed a series into triple1, as in triple1(myseries), while triple2 must be supplied a pointer to a series, as in triple2(&myseries).

Why make the distinction, then? There are two main reasons for doing so: modularity and performance.

By modularity we mean the insulation of a function from the rest of the script which calls it. One of the many benefits of this approach is that your functions are easily reusable in other contexts. To achieve modularity, variables created within a function are local to that function, and are destroyed when the function exits, unless they are made available as return values and these values are "picked up" or assigned by the caller. In addition, functions do not have access to variables in "outer scope" (that is, variables that exist in the script from which the function is called) except insofar as these are explicitly passed to the function as arguments.

By default, when a variable is passed to a function as an argument, what the function actually "gets" is a copy of the outer variable, which means that the value of the outer variable is not modified by whatever goes on inside the function. But the use of pointers allows a function and its caller to "cooperate" such that an outer variable can be modified by the function. In effect, this allows a function to "return" more than one value (although only one variable can be returned directly; see below). The parameter in question is
126. ata --quiet
      ... R commands ...
    end foreign

and achieves the same effect as submitting the enclosed R commands via the GUI in the non-interactive mode (see section 25.3 above). The --send-data option arranges for auto-loading of the data present in the gretl session. The --quiet option prevents the output from R from being echoed in the gretl output.

Using this method, replicating the example in the previous subsection is rather easy: basically, all it takes is encapsulating the content of the R script in a foreign...end foreign block (see example 25.1).

Footnote: In future this facility may be extended to handle interaction with other programs, but for the present only R commands are accepted.

Example 25.1: Estimation of the Basic Structural Model (simple)

    open bjg.gdt
    foreign language=R --send-data
      y <- gretldata[,"lg"]
      strmod <- StructTS(y)
      compon <- as.ts(tsSmooth(strmod))
      vars <- as.matrix(strmod$coef)
      gretl.export(compon)
      gretl.export(vars)
    end foreign
    append @dotdir/compon.csv
    rename level lg_level
    rename slope lg_slope
    rename sea lg_seas
    vars = mread("@dotdir/vars.mat")

Example 25.2: Estimation of the Basic Structural Model (via a function)

    function RStructTS(series myseries)
      smpl ok(myseries) --restrict
      sx = argname(myseries)
      foreign language=R --send-data --quiet
        @sx <- gretldata[,"myseries"]
        strmod <- StructTS(@sx)
        compon <- as.ts(tsSmooth(strmod))
        gre
127. ating vectors (the cointegrating rank of the system).
2. Estimate a VECM with the appropriate rank, but subject to no further restrictions.
3. Probe the interpretation of the cointegrating vectors as equilibrium conditions by means of restrictions on the elements of these vectors.

The following sections expand on each of these points, giving further econometric details and explaining how to implement the analysis using gretl.

21.2 Vector Error Correction Models as representation of a cointegrated system

Consider a VAR of order $p$ with a deterministic part given by $\mu_t$ (typically, a polynomial in time). One can write the $n$-variate process $y_t$ as
    $y_t = \mu_t + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + \varepsilon_t$    (21.1)
But since $y_{t-1} = y_t - \Delta y_t$ and $y_{t-i} = y_{t-1} - (\Delta y_{t-1} + \Delta y_{t-2} + \cdots + \Delta y_{t-i+1})$, we can re-write the above as
    $\Delta y_t = \mu_t + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \varepsilon_t$    (21.2)
where $\Pi = \sum_{i=1}^{p} A_i - I$ and $\Gamma_k = -\sum_{i=k+1}^{p} A_i$. This is the VECM representation of (21.1).

The interpretation of (21.2) depends crucially on $r$, the rank of the matrix $\Pi$.
• If $r = 0$, the processes are all I(1) and not cointegrated.
• If $r = n$, then $\Pi$ is invertible and the processes are all I(0).
• Cointegration occurs in between, when $0 < r < n$ and $\Pi$ can be written as $\alpha \beta'$. In this case, $y_t$ is I(1), but the combination $z_t = \beta' y_t$ is I(0). If, for example, $r = 1$ and the first element of $\beta$ was $-1$, then one could
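The mapping from the VAR coefficient matrices to the VECM matrices in (21.2) can be checked mechanically. The Python sketch below is an illustration under stated assumptions: it uses plain nested lists rather than a matrix library, it omits the deterministic term and the error term, and the coefficient values and helper names are invented for the example.

```python
def zeros(n):
    return [[0.0] * n for _ in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vec_add(u, v):
    return [a + b for a, b in zip(u, v)]

def vec_sub(u, v):
    return [a - b for a, b in zip(u, v)]

def var_to_vecm(A_list):
    """Return (Pi, [Gamma_1, ..., Gamma_{p-1}]) for VAR matrices A_1..A_p."""
    n = len(A_list[0])
    # Pi = A_1 + ... + A_p - I
    Pi = [[-1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for A in A_list:
        Pi = mat_add(Pi, A)
    # Gamma_k = -(A_{k+1} + ... + A_p); A_list[k] holds A_{k+1}
    Gammas = []
    for k in range(1, len(A_list)):
        G = zeros(n)
        for A in A_list[k:]:
            G = mat_sub(G, A)
        Gammas.append(G)
    return Pi, Gammas

def var_step(A_list, lags):
    """y_t from lags = [y_{t-1}, ..., y_{t-p}], with no constant and no shock."""
    y = [0.0] * len(lags[0])
    for A, lag in zip(A_list, lags):
        y = vec_add(y, mat_vec(A, lag))
    return y

def vecm_step(Pi, Gammas, lags):
    """Delta y_t from the VECM form, using the same lagged levels."""
    dy = mat_vec(Pi, lags[0])
    for i, G in enumerate(Gammas):
        # Gammas[i] is Gamma_{i+1}; it multiplies y_{t-i-1} - y_{t-i-2}
        dy = vec_add(dy, mat_vec(G, vec_sub(lags[i], lags[i + 1])))
    return dy

# Invented bivariate VAR(3) coefficients and lagged values.
A_list = [
    [[0.5, 0.1], [0.0, 0.4]],
    [[0.2, 0.0], [0.1, 0.3]],
    [[0.1, -0.1], [0.0, 0.2]],
]
lags = [[1.0, 2.0], [0.5, 1.5], [0.2, 1.0]]

Pi, Gammas = var_to_vecm(A_list)
y_var = var_step(A_list, lags)
dy_vecm = vecm_step(Pi, Gammas, lags)
print(vec_sub(y_var, lags[0]), dy_vecm)
```

The two printed vectors should coincide up to rounding, since the VECM is only an algebraic rearrangement of the VAR.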
128. ation are used to establish the rank of $\beta$; in other words, how many cointegration vectors the system has. These are the "λ-max" test, for hypotheses on individual eigenvalues, and the "trace" test, for joint hypotheses. Suppose that the eigenvalues $\lambda_i$ are sorted from largest to smallest. The null hypothesis for the "λ-max" test on the $i$-th eigenvalue is that $\lambda_i = 0$. The corresponding trace test, instead, considers the hypothesis $\lambda_j = 0$ for all $j \geq i$.

The gretl command coint2 performs these two tests. The corresponding menu entry in the GUI is "Model, Time Series, Cointegration Test, Johansen". As in the ADF test, the asymptotic distribution of the tests varies with the deterministic component $\mu_t$ one includes in the VAR (see section 21.3 above). The following code uses the denmark data file, supplied with gretl, to replicate Johansen's example found in his 1995 book.

    open denmark
    coint2 2 LRM LRY IBO IDE --rc --seasonal

In this case, the vector $y_t$ in equation (21.2) comprises the four variables LRM, LRY, IBO, IDE. The number of lags equals $p$ in (21.2) (that is, the number of lags of the model written in VAR form). Part of the output is reported below:

    Johansen test:
    Number of equations = 4
    Lag order = 2
    Estimation period: 1974:3 - 1987:3 (T = 53)

    Case 2: Restricted constant
    Rank Eigenvalue Trace test p-value  Lmax test p-value
       0    0.43317     49.144 [0.1284]
129. ations does not exceed the value of the internal variable loop_maxiter. By default this equals 250, but you can specify a different value via the set command (see the Gretl Command Reference).

Index loop

A third form of loop control uses an index variable, for example i. In this case you specify starting and ending values for the index, which is incremented by one each time round the loop. The syntax looks like this: loop i=1..20.

The index variable may be a pre-existing scalar; if this is not the case, the variable is created automatically and is destroyed on exit from the loop. The index may be used within the loop body in either of two ways: you can access the integer value of i (see Example 9.4) or you can use its string representation, $i (see Example 9.5).

The starting and ending values for the index can be given in numerical form, or by reference to predefined scalar variables. In the latter case the variables are evaluated once, at the start of the loop. In addition, with time series data you can give the starting and ending values in the form of dates, as in loop i=1950:1..1999:4.

This form of loop control is intended to be quick and easy, and as such it is subject to certain limitations. You cannot do arithmetic within the loop control expression, as in

    loop i=k..2*k # won't work

But one extension is permitted for convenience: you can inflect a loop control variable with a minus sign, as in

    loop k=-lag..lag # OK

Also no
130. ault solved via Cholesky decomposition, which is highly accurate provided the matrix of cross-products of the regressors, X'X, is not very ill conditioned. If this problem is detected, gretl automatically switches to use QR decomposition.

The program has been tested rather thoroughly on the statistical reference datasets provided by NIST (the U.S. National Institute of Standards and Technology) and a full account of the results may be found on the gretl website (follow the link "Numerical accuracy").

To date, two published reviews have discussed gretl's accuracy: Giovanni Baiocchi and Walter Distaso (2003), and Talha Yalta and Yasemin Yalta (2007). We are grateful to these authors for their careful examination of the program. Their comments have prompted several modifications, including the use of Stephen Moshier's cephes code for computing p-values and other quantities relating to probability distributions (see netlib.org), changes to the formatting of regression output to ensure that the program displays a consistent number of significant digits, and attention to compiler issues in producing the MS Windows version of gretl (which at one time was slightly less accurate than the Linux version).

Gretl now includes a "plugin" that runs the NIST linear regression test suite. You can find this under the "Tools" menu in the main window. When you run this test, the introductory text explains the expected result. If you run this test and see
131. backs: first, the OLS estimator for the intercept $c$ is inconsistent, because the disturbance term has a non-zero expected value; second, the OLS estimators for the other parameters are consistent, but inefficient in view of the non-normality of $\varepsilon_i$. Both issues can be addressed by estimating (17.5) by maximum likelihood. Nevertheless, OLS estimation is a quick and convenient way to provide starting values for the MLE algorithm.

Example 17.1 shows how to implement the model described so far. The banks91 file contains part of the data used in Lucchetti, Papi and Zazzaro (2001).

17.4 GARCH models

GARCH models are handled by gretl via a native function. However, it is instructive to see how they can be estimated through the mle command. The following equations provide the simplest example of a GARCH(1,1) model:
    $y_t = \mu + \varepsilon_t$
    $\varepsilon_t = u_t \cdot \sigma_t$
    $u_t \sim N(0, 1)$
    $h_t = \omega + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1}$
Since the variance of $y_t$ depends on past values, writing down the log-likelihood function is not simply a matter of summing the log densities for individual observations. As is common in time series models, $y_t$ cannot be considered independent of the other observations in our sample, and consequently the density function for the whole sample (the joint density for all observations) is not just the product of the marginal densities. Maximum likelihood estimation, in these cases, is achieved by considering conditional densities, so what we maximize is a conditional likelihood function.
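A conditional log-likelihood of this kind can be built recursively. The Python sketch below is a minimal illustration and not gretl's mle script: it assumes that $h_t$ denotes the conditional variance $\sigma_t^2$, and the start-up value `h0` for the recursion (defaulting to the sample variance of the residuals) is a choice made for this example.

```python
import math

def garch11_variances(eps, omega, alpha, beta, h0):
    """Conditional variances h_t = omega + alpha*eps_{t-1}^2 + beta*h_{t-1}."""
    h = [h0]
    for t in range(1, len(eps)):
        h.append(omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1])
    return h

def garch11_loglik(y, mu, omega, alpha, beta, h0=None):
    """Sum of conditional Gaussian log densities of y_t given the past."""
    eps = [v - mu for v in y]
    if h0 is None:
        # Assumed start-up value: sample variance of the residuals.
        h0 = sum(e * e for e in eps) / len(eps)
    h = garch11_variances(eps, omega, alpha, beta, h0)
    ll = 0.0
    for e, ht in zip(eps, h):
        ll += -0.5 * (math.log(2.0 * math.pi) + math.log(ht) + e * e / ht)
    return ll

# Invented data, used only to exercise the functions.
y = [0.1, -0.2, 0.3, 0.05, -0.15, 0.4, -0.35]
print(garch11_loglik(y, mu=0.0, omega=0.01, alpha=0.1, beta=0.8))
```

With alpha and beta both set to zero the recursion collapses to a constant variance, and the function reduces to the ordinary i.i.d. normal log-likelihood, which is a convenient sanity check.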
132. between alternative models based on a formal hypothesis test. For example, one might choose a more general model over a more restricted one if the restriction in question can be formulated as a testable null hypothesis, and the null is rejected on an appropriate test.

In other contexts one sometimes seeks a criterion for model selection that somehow measures the balance between goodness of fit or likelihood, on the one hand, and parsimony on the other. The balancing is necessary because the addition of extra variables to a model cannot reduce the degree of fit or likelihood, and is very likely to increase it somewhat even if the additional variables are not truly relevant to the data-generating process.

The best known such criterion, for linear models estimated via least squares, is the adjusted $R^2$,
    $\bar{R}^2 = 1 - \frac{SSR/(n-k)}{TSS/(n-1)}$
where $n$ is the number of observations in the sample, $k$ denotes the number of parameters estimated, and SSR and TSS denote the sum of squared residuals and the total sum of squares for the dependent variable, respectively. Compared to the ordinary coefficient of determination or unadjusted $R^2$,
    $R^2 = 1 - \frac{SSR}{TSS}$
the "adjusted" calculation penalizes the inclusion of additional parameters, other things equal.

19.2 Information criteria

A more general criterion in a similar spirit is Akaike's (1974) "Information Criterion" (AIC). The original formulation of this measure is AIC 2 0
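The penalty built into the adjusted measure is easy to see with numbers. In this short Python sketch the function names and the SSR and TSS values are invented purely for illustration.

```python
def r_squared(ssr, tss):
    """Unadjusted R^2 = 1 - SSR/TSS."""
    return 1.0 - ssr / tss

def adjusted_r_squared(ssr, tss, n, k):
    """Adjusted R^2 = 1 - [SSR/(n-k)] / [TSS/(n-1)]."""
    return 1.0 - (ssr / (n - k)) / (tss / (n - 1))

# Invented example: adding a sixth parameter lowers SSR only slightly.
base = (r_squared(20.0, 100.0), adjusted_r_squared(20.0, 100.0, n=50, k=5))
bigger = (r_squared(19.9, 100.0), adjusted_r_squared(19.9, 100.0, n=50, k=6))
print(base, bigger)
```

In this example the unadjusted measure rises when the extra parameter is added while the adjusted measure falls, which is the balancing behaviour described in the text.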
133. ble you to understand your options better. The first step is the creation of a plain text "source" file, containing the text or mathematics to be typeset, interspersed with mark-up that defines how it should be formatted. The second step is to run the source through a processing engine that does the actual formatting. Typically this is either:
• a program called latex that generates so-called DVI (device-independent) output, or
• a program called pdflatex that generates PDF output.

For previewing, one uses either a DVI viewer (typically xdvi on GNU/Linux systems) or a PDF viewer (for example, Adobe's Acrobat Reader or xpdf), depending on how the source was processed. If the DVI route is taken, there's then a third step to produce printable output, typically using the program dvips to generate a PostScript file. If the PDF route is taken, the output is ready for printing without any further processing.

On the MS Windows and Mac OS X platforms, gretl calls pdflatex to process the source file, and expects the operating system to be able to find the default viewer for PDF output; DVI is not supported. On GNU/Linux the default is to take the DVI route, but if you prefer to use PDF you can do the following: select the menu item "Tools, Preferences, General" then the "Programs" tab. Find the item titled "Command to compile TeX files", and set this to pdflatex. Make sure the "Command to view PDF files" is set to something appropriate.
134. bles, that is, variables from which we subtract a fraction $\theta$ of their average. Notice that for $\sigma^2_\varepsilon \to 0$, $\theta \to 1$, while for $\sigma^2_v \to 0$, $\theta \to 0$. This means that if all the variance is attributable to the individual effects, then the fixed effects estimator is optimal; if, on the other hand, individual effects are negligible, then pooled OLS turns out, unsurprisingly, to be the optimal estimator.

To implement the GLS approach we need to calculate $\theta$, which in turn requires estimates of the variances $\sigma^2_\varepsilon$ and $\sigma^2_v$. (These are often referred to as the "within" and "between" variances respectively, since the former refers to variation within each cross-sectional unit and the latter to variation between the units.) Several means of estimating these magnitudes have been suggested in the literature (see Baltagi, 1995); gretl uses the method of Swamy and Arora (1972): $\sigma^2_\varepsilon$ is estimated by the residual variance from the fixed effects model, and the sum $\sigma^2_\varepsilon + T \sigma^2_v$ is estimated as $T$ times the residual variance from the "between" estimator,
    $\bar{y}_i = \bar{X}_i \beta + e_i$
The latter regression is implemented by constructing a data set consisting of the group means of all the relevant variables.

Choice of estimator

Which panel method should one use, fixed effects or random effects? One way of answering this question is in relation to the nature of the data set. If the panel comprises observations on a fixed and relatively small set of units of interest (say, the member states
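The two limiting cases can be checked with a few lines of Python. This is a sketch under an explicit assumption: the formula for the quasi-demeaning fraction does not appear in this passage, so the code uses the standard balanced-panel expression, 1 minus the square root of the within variance over the sum of the within variance and T times the between variance, which is consistent with the limits stated above. The function names are hypothetical.

```python
import math

def quasi_demeaning_theta(sigma2_eps, sigma2_v, T):
    """Assumed standard form: theta = 1 - sqrt(s2_eps / (s2_eps + T * s2_v))."""
    return 1.0 - math.sqrt(sigma2_eps / (sigma2_eps + T * sigma2_v))

def quasi_demean(values, theta):
    """Subtract a fraction theta of the unit's own mean from each observation."""
    mean = sum(values) / len(values)
    return [v - theta * mean for v in values]

# Individual effects dominate: theta is close to 1 (fixed-effects limit).
print(quasi_demeaning_theta(1e-12, 1.0, T=5))
# No individual effects: theta is 0 (pooled OLS limit).
print(quasi_demeaning_theta(1.0, 0.0, T=5))
# Intermediate case applied to one unit's observations.
print(quasi_demean([1.0, 2.0, 3.0], quasi_demeaning_theta(1.0, 1.0, T=3)))
```

With theta equal to 1 the transformation is the usual within (demeaning) transformation, and with theta equal to 0 the data are left unchanged.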
135. cAleer and L. Oxley, Practical Issues in Cointegration Analysis, Oxford: Blackwell, 1999.
Doornik, J. A. and Hansen, H. (1994) "An Omnibus Test for Univariate and Multivariate Normality", working paper, Nuffield College, Oxford.
Edgerton, D. and Wells, C. (1994) "Critical Values for the Cusumsq Statistic in Medium and Large Sized Samples", Oxford Bulletin of Economics and Statistics, 56, pp. 355-65.
Elliott, G., Rothenberg, T. J. and Stock, J. H. (1996) "Efficient Tests for an Autoregressive Unit Root", Econometrica, 64, pp. 813-36.
Fiorentini, G., Calzolari, G. and Panattoni, L. (1996) "Analytic Derivatives and the Computation of GARCH Estimates", Journal of Applied Econometrics, 11, pp. 399-417.
Frigo, M. and Johnson, S. G. (2005) "The Design and Implementation of FFTW3", Proceedings of the IEEE 93, 2, pp. 216-231 (Invited paper, Special Issue on Program Generation, Optimization, and Platform Adaptation).
Godfrey, L. G. (1994) "Testing for Serial Correlation by Variable Addition in Dynamic Models Estimated by Instrumental Variables", The Review of Economics and Statistics, 76, 3, pp. 550-59.
Golub, G. H. and Van Loan, C. F. (1996) Matrix Computations, 3rd edition, Baltimore and London: The Johns Hopkins University Press.
Goossens, M., Mittelbach, F. and Samarin, A. (2004) The LaTeX Companion, 2nd edition, Boston: Addison-Wesley.
Gourieroux, C., Monfort, A., Renault, E. and Trognon, A. (1987) "Generalized Residua
136. ccount of the current dataset, including the summary information and any specific information on each of the variables.
• Add case markers: Prompts for the name of a text file containing "case markers" (short strings identifying the individual observations) and adds this information to the data set. See Chapter 4.
• Remove case markers: Active only if the dataset has case markers identifying the observations; removes these case markers.
• Dataset structure: Invokes a series of dialog boxes which allow you to change the structural interpretation of the current dataset. For example, if data were read in as a cross section you can get the program to interpret them as time series or as a panel. See also section 4.5.
• Compact data: For time-series data of higher than annual frequency, gives you the option of compacting the data to a lower frequency, using one of four compaction methods (average, sum, start of period or end of period).
• Expand data: For time-series data, gives you the option of expanding the data to a higher frequency.
• Transpose data: Turn each observation into a variable and vice versa (or in other words, each row of the data matrix becomes a column in the modified data matrix); can be useful with imported data that have been read in "sideways".

View menu
• Icon view: Opens a window showing the content of the current session as a set of icons; see section 3.4.
• Graph specifie
137. ce spanned by A are meaningful. Legal operations on empty matrices are listed in Table 12.1. (All other matrix operations generate an error when an empty matrix is given as an argument.) In line with the above interpretation, some matrix functions return an empty matrix under certain conditions: the functions diag, vec, vech, unvech when the argument is an empty matrix; the functions I, ones, zeros, mnormal, muniform when one or more of the arguments is 0; and the function nullspace when its argument has full column rank.

Table 12.1: Valid functions on an empty matrix, A

    Function            Return value
    A', transp(A)
    rows(A)
    cols(A)
    rank(A)
    det(A)
    ldet(A)
    tr(A)
    onenorm(A)
    infnorm(A)
    rcond(A)

12.3 Selecting sub-matrices

You can select sub-matrices of a given matrix using the syntax A[rows,cols], where rows can take any of these forms:

    1. empty                                   selects all rows
    2. a single integer                        selects the single specified row
    3. two integers separated by a colon       selects a range of rows
    4. the name of a matrix                    selects the specified rows

With regard to option 2, the integer value can be given numerically, as the name of an existing scalar variable, or as an expression that evaluates to a scalar. With the option 4, the index matrix given in the rows field must be either p x 1 or 1 x p, and should contain integer values in the range 1 to n, where n is the number of rows in the matrix from which the
138. ce matrices
14.4 Special issues with panel data
15 Panel data
15.1 Estimation of panel models
15.2 Dynamic panel models
15.3 Panel illustration: the Penn World Table
16 Nonlinear least squares
16.1 Introduction and examples
16.2 Initializing the parameters
16.3 NLS dialog window
16.4 Analytical and numerical derivatives
16.5 Controlling termination
16.6 Details on the code
16.7 Numerical accuracy
17 Maximum likelihood estimation
17.1 Generic ML estimation with gretl
17.2 Gamma estimation
17.3 Stochastic frontier cost function
17.4 GARCH models
17.5 Analytical derivatives
17.6 Debugging ML scripts
17.7 Using functions
18 GMM estimation
18.1 Introduction and terminology
18.2 OLS as GMM
18.3 TSLS as GMM
18.4 Covariance matrix options
139. ce $x_t \beta = \beta_0 + x_{t,1} \beta_1 + x_{t,2} \beta_2$. Appending the --conditional switch, as in

    arma p q ; y const x1 x2 --conditional

would estimate the following model:
    $y_t = x_t \gamma + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$
Ideally, the issue broached above could be made moot by writing a more general specification that nests the alternatives; that is
    $\phi(L)(y_t - x_t \beta) = z_t \gamma + \theta(L) \varepsilon_t$;    (20.6)
we would like to generalize the arma command so that the user could specify, for any estimation method, whether certain exogenous variables should be treated as $x_t$s or $z_t$s, but we're not yet at that point (and neither are most other software packages).

Seasonal models

A more flexible lag structure is desirable when analyzing time series that display strong seasonal patterns. Model (20.1) can be expanded to
    $\phi(L) \Phi(L^s) y_t = \theta(L) \Theta(L^s) \varepsilon_t$.    (20.7)
For such cases, a fuller form of the syntax is available, namely,

    arma p q ; P Q ; y

where p and q represent the non-seasonal AR and MA orders, and P and Q the seasonal orders. For example,

    arma 1 1 ; 1 1 ; y

would be used to estimate the following model:
    $(1 - \phi L)(1 - \Phi L^s)(y_t - \mu) = (1 + \theta L)(1 + \Theta L^s) \varepsilon_t$
If $y_t$ is a quarterly series (and therefore $s = 4$), the above equation can be written more explicitly as
    $y_t - \mu = \phi (y_{t-1} - \mu) + \Phi (y_{t-4} - \mu) - (\phi \cdot \Phi)(y_{t-5} - \mu) + \varepsilon_t + \theta \varepsilon_{t-1} + \Theta \varepsilon_{t-4} + (\theta \cdot \Theta) \varepsilon_{t-5}$
Such a model is known as a "multiplicative seasonal ARMA model".

Gaps i
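The explicit quarterly form follows from multiplying out the non-seasonal and seasonal lag polynomials. The Python sketch below carries out that multiplication for the AR side; the helper names and the parameter values are invented for illustration.

```python
def poly_mul(a, b):
    """Multiply two lag polynomials given as coefficient lists (index = power of L)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def seasonal_ar_poly(phi, Phi, s):
    """Coefficients of (1 - phi*L)(1 - Phi*L^s)."""
    nonseasonal = [1.0, -phi]
    seasonal = [1.0] + [0.0] * (s - 1) + [-Phi]
    return poly_mul(nonseasonal, seasonal)

# Quarterly case (s = 4) with invented parameter values.
coeffs = seasonal_ar_poly(phi=0.5, Phi=0.8, s=4)
print(coeffs)
```

The coefficient on L^5 is the product of the two AR parameters with a positive sign in the lag polynomial, which is why the term in $y_{t-5}$ enters the explicit equation with a minus sign once it is moved to the right-hand side.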
140. ckspace over the whole line, erasing as you go. Just hop to the start and add or delete characters. If you type the first letters of a command name then press the Tab key, readline will attempt to complete the command name for you. If there's a unique completion it will be put in place automatically. If there's more than one completion, pressing Tab a second time brings up a list.

Probably the most useful mode for heavy-duty work with gretlcli is batch (non-interactive) mode, in which the program reads and processes a script, and sends the output to file. For example

    gretlcli -b scriptfile > outputfile

Note that scriptfile is treated as a program argument; only the output file requires redirection (>). Don't forget the -b (batch) switch, otherwise the program will wait for user input after executing the script (and if output is redirected, the program will appear to "hang").

Footnote 1: Actually, the key bindings shown below are only the defaults; they can be customized. See the readline manual.

Part IV: Appendices

Appendix A: Data file details

A.1 Basic native format

In gretl's native data format, a data set is stored in XML (extensible mark-up language). Data files correspond to the simple DTD (document type definition) given in gretldata.dtd, which is supplied with the gretl distribution and is installed in the system data directory (e.g. /usr/share/gretl/data on Linux). Data files may be plain text or gzipped. They contai
141. columns equal to the chosen cointegration rank; therefore, the product

    matrix Pi = $jalpha * $jbeta'

returns the reduced-rank estimate of A(1). Since $\beta$ is automatically identified via the Phillips normalization (see section 21.5), its unrestricted elements do have a proper covariance matrix, which can be retrieved through the $jvbeta accessor.

12.8 Namespace issues

Matrices share a common namespace with data series and scalar variables. In other words, no two objects of any of these types can have the same name. It is an error to attempt to change the type of an existing variable, for example:

    scalar x = 3
    matrix x = ones(2,2) # wrong!

It is possible, however, to delete or rename an existing variable then reuse the name for a variable of a different type:

    scalar x = 3
    delete x
    matrix x = ones(2,2) # OK

12.9 Creating a data series from a matrix

Section 12.1 above describes how to create a matrix from a data series or set of series. You may sometimes wish to go in the opposite direction, that is, to copy values from a matrix into a regular data series. The syntax for this operation is

    series sname = mspec

where sname is the name of the series to create and mspec is the name of the matrix to copy from, possibly followed by a matrix selection expression. Here are two examples.

    series s = x
    series u1 = U[,1]

It is assumed that x and U are pre-existing matrices. In the second example the s
142. commands print and store, and certain estimation commands, in a manner that may be useful with Monte Carlo analyses (see Section 9.3). The following sections explain the various forms of the loop control expression and provide some examples of use of loops.

Note: If you are carrying out a substantial Monte Carlo analysis with many thousands of repetitions, memory capacity and processing time may be an issue. To minimize the use of computer resources, run your script using the command-line program, gretlcli, with output redirected to a file.

9.2 Loop control variants

Count loop

The simplest form of loop control is a direct specification of the number of times the loop should be repeated. We refer to this as a "count loop". The number of repetitions may be a numerical constant, as in loop 1000, or may be read from a scalar variable, as in loop replics.

In the case where the loop count is given by a variable, say replics, in concept replics is an integer; if the value is not integral, it is converted to an integer by truncation. Note that replics is evaluated only once, when the loop is initially compiled.

While loop

A second sort of control expression takes the form of the keyword while followed by a boolean expression. For example,

    loop while essdiff > .00001

Execution of the commands within the loop will continue so long as (a) the specified condition evaluates as true and (b) the number of iter
143. conditions are not very restrictive. $\hat{\Omega}$ is in this context a diagonal matrix, whose non-zero elements may be estimated using squared OLS residuals. White referred to (14.5) as a heteroskedasticity-consistent covariance matrix estimator (HCCME).

Davidson and MacKinnon (2004, chapter 5) offer a useful discussion of several variants on White's HCCME theme. They refer to the original variant of (14.5), in which the diagonal elements of $\hat{\Omega}$ are estimated directly by the squared OLS residuals $\hat{u}_t^2$, as HC0. (The associated standard errors are often called "White's standard errors".) The various refinements of White's proposal share a common point of departure, namely the idea that the squared OLS residuals are likely to be "too small" on average. This point is quite intuitive. The OLS parameter estimates, $\hat{\beta}$, satisfy by design the criterion that the sum of squared residuals,
    $\sum \hat{u}_t^2 = \sum (y_t - X_t \hat{\beta})^2$
is minimized for given X and y. Suppose that $\hat{\beta} \neq \beta$. This is almost certain to be the case: even if OLS is not biased, it would be a miracle if the $\hat{\beta}$ calculated from any finite sample were exactly equal to $\beta$. But in that case the sum of squares of the true, unobserved errors, $\sum u_t^2 = \sum (y_t - X_t \beta)^2$, is bound to be greater than $\sum \hat{u}_t^2$. The e

Footnote 1: In some specialized contexts spatial autocorrelation may be an issue. Gretl does not have any built-in methods to handle this and we will not discuss it here.
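The claim that squared OLS residuals are too small on average rests on the minimization property of OLS, which a small simulation makes concrete. The Python sketch below is an illustration only: the simple regression with one regressor, the heteroskedastic error design and the function names are assumptions chosen for the example.

```python
import random

def ols_simple(x, y):
    """Closed-form OLS for y = b0 + b1*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

def sum_sq(x, y, b0, b1):
    """Sum of squared deviations of y from the line b0 + b1*x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def compare(n=30, seed=7, b0_true=1.0, b1_true=2.0):
    """Return (sum of squared OLS residuals, sum of squared true errors)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Assumed design: error standard deviation grows with x.
    u = [rng.gauss(0.0, 1.0 + 0.3 * xi) for xi in x]
    y = [b0_true + b1_true * xi + ui for xi, ui in zip(x, u)]
    b0_hat, b1_hat = ols_simple(x, y)
    return sum_sq(x, y, b0_hat, b1_hat), sum_sq(x, y, b0_true, b1_true)

for seed in range(5):
    print(compare(seed=seed))
```

In every replication the first number cannot exceed the second, because the OLS estimates minimize the sum of squares over all candidate coefficient values, including the true ones.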
144. consumption function

    open greene11_3.gdt
    # run initial OLS
    ols C 0 Y
    genr essbak = $ess
    genr essdiff = 1
    genr beta = $coeff(Y)
    genr gamma = 1
    # iterate OLS till the error sum of squares converges
    loop while essdiff > .00001
       # form the linearized variables
       genr C0 = C + gamma * beta * Y^gamma * log(Y)
       genr x1 = Y^gamma
       genr x2 = beta * Y^gamma * log(Y)
       # run OLS
       ols C0 0 x1 x2 --print-final --no-df-corr --vcv
       genr beta = $coeff(x1)
       genr gamma = $coeff(x2)
       genr ess = $ess
       genr essdiff = abs(ess - essbak)/essbak
       genr essbak = ess
    endloop
    # print parameter estimates using their "proper names"
    noecho
    printf "alpha = %g\n", $coeff(0)
    printf "beta = %g\n", beta
    printf "gamma = %g\n", gamma

Indexed loop examples

Example 9.4 shows an indexed loop in which the smpl is keyed to the index variable i. Suppose we have a panel dataset with observations on a number of hospitals for the years 1991 to 2000 (where the year of the observation is indicated by a variable named year). We restrict the sample to each of these years in turn and print cross-sectional summary statistics for variables 1 through 4.

Example 9.5 illustrates string substitution in an indexed loop. The first time round this loop the variable V will be set to equal COMP1987 and the dependent variable for the ols will be PBT1987. The next time round V will be redefined as equal to COMP1988 and the dependent variable in the regression will be PBT1988. And so on.

Chapter 9 Loop c
145. ction Chapter 5 Special functions in genr Example 5 3 Delta Method function MPC matrix param matrix Y beta param 2 gamma param 3 y Y 1 matrix ret beta gamma yA gamma 1 return matrix ret end function William Greene Econometric Analysis 5e Chapter 9 set echo off set messages off open greene5_1 gdt Use OLS to initialize the parameters ols realcons O realdpi quiet genr a coeff 0 genr b coeff realdpi genr g 0 Run NLS with analytical derivatives nls realcons a b CrealdpiAg deriva 1 deriv b realdpiAg deriv g b realdpiAg log realdpi end nls matrix Y realdpi 2000 4 matrix theta coeff matrix V vcv mpc MPC theta amp Y matrix Jac fdjac theta MPC amp theta amp Y Sigma qform Jac V printf nmpc g std err g n mpc sqrtCSigma scalar teststat mpc 1 sqrt Sigma printf nTest for MPC 1 g p value g n teststat pvalue n abs teststat 41 Chapter 5 Special functions in genr Example 5 4 Periodogram via the Fourier transform nulldata 50 generate an AR 1 process series e normal series x 0 x 0 9 x 1 e compute the periodogram scale 2 pi nobs X x F fft X S sumr F A2 S S 2 nobs 2 1 scale omega seq 1 nobs 2 2 pi nobs omega omega S compare the built in command pergm x print omega Chapter 6 Sub sampling a dataset 6 1 Introduction Som
146. ction we review some aspects of genr functions that apply specifically to matrices. A full account of each function is available in the Gretl Command Reference.

Table 12.3: Matrix functions by category

    Creation and I/O:             colnames diag I lower makemask mnormal mread muniform mwrite ones seq unvech upper vec vech zeros
    Shape/size/arrangement:       cols dsort mshape msortby rows selifc selifr sort trimr
    Matrix algebra:               cdiv cholesky cmult det eigengen eigensym fft ffti ginv infnorm inv invpd ldet mexp nullspace onenorm polroots qform qrdecomp rank rcond svd toepsolv tr transp
    Statistics/transformations:   cdemean cum imaxc imaxr iminc iminr maxc maxr mcorr mcov mcovg meanc meanr minc minr mlag mols mpols mxtab princomp quantile resample sdc sumc sumr values
    Numerical methods:            BFGSmax fdjac
    Transformations:              lincomb

[Footnote 1: Note that to find the "matrix square root" you need the cholesky function (see below); moreover, the exp function computes the exponential element by element, and therefore does not return the matrix exponential unless the matrix is diagonal. To get the matrix exponential, use mexp.]

[Chapter 12: Matrix manipulation]

Matrix reshaping

In addition to the methods discussed in sections 12.1 and 12.3, a matrix can also be created by re-arranging the elements of a pre-existing matrix. This is accomplished via the mshape function. It takes three arguments: the input matrix, A, and the rows and columns of the
147. culate some value having to do with a regression, but are not interested in the full results of the regression, you may wish to use the --quiet flag with the estimation command, as shown above.

A second example shows how to write a function call that assigns a return value to a variable in the caller:

    # function definition
    function get_uhat (series y, list xvars)
       ols y 0 xvars --quiet
       series uh = $uhat
       return series uh
    end function

    # main script
    open data4-1
    list xlist = 2 3 4
    # function call
    series resid = get_uhat(price, xlist)

10.3 Deleting a function

If you have defined a function and subsequently wish to clear it out of memory, you can do so using the keywords delete or clear, as in

    function myfunc delete
    function get_uhat clear

Note, however, that if myfunc is already a defined function, providing a new definition automatically overwrites the previous one, so it should rarely be necessary to delete functions explicitly.

10.4 Function programming details

Variables versus pointers

Series, scalar, and matrix arguments to functions can be passed in two ways: "as they are", or as pointers. For example, consider the following:

    function triple1 (series x)
       series ret = 3*x
       return series ret
    end function

[Chapter 10: User-defined functions]

    function triple2 (series *x)
       series ret = 3*x
       return series ret
    end function

These two functions are nearly identical (and yield the same result); the only difference is th
148. d by variable. Suppose we have data on two variables, x1 and x2, for each of 50 states in each of 5 years (giving a total of 250 observations per variable). One textual representation of such a data set would start with a block for x1, with 50 rows corresponding to the states and 5 columns corresponding to the years. This would be followed, vertically, by a block with the same structure for variable x2. A fragment of such a data file is shown below, with quinquennial observations 1965-1985. Imagine the table continued for 48 more states, followed by another 50 rows for variable x2.

    x1
           1965    1970    1975    1980    1985
    AR    100.0   110.5   118.7   131.2   160.4
    AZ    100.0   104.3   113.8   120.9   140.6

If a datafile with this sort of structure is read into gretl, the program will interpret the columns as distinct variables, so the data will not be usable "as is". But there is a mechanism for correcting the situation, namely the stack function within the genr command.

Consider the first data column in the fragment above: the first 50 rows of this column constitute a cross-section for the variable x1 in the year 1965. If we could create a new variable by stacking the

[Footnote 3: Note that you will have to modify such a datafile slightly before it can be read at all. The line containing the variable name (in this example x1) will have to be removed, and so will the initial row containing the years, otherwise they will be taken as numerical data.]

[Chapter 4: Data files] f
149. d sequence with unit variance. $X_t$ is a matrix of regressors (or, in the simplest case, just a vector of 1s allowing for a non-zero mean of $y_t$). Note that if $p = 0$, GARCH collapses to ARCH($q$): the generalization is embodied in the $\delta_i$ terms that multiply previous values of the error variance.

In principle the underlying innovation, $\varepsilon_t$, could follow any suitable probability distribution, and besides the obvious candidate of the normal or Gaussian distribution, the t distribution has been used in this context. Currently gretl only handles the case where $\varepsilon_t$ is assumed to be Gaussian. However, when the --robust option to the garch command is given, the estimator gretl uses for the covariance matrix can be considered Quasi-Maximum Likelihood even with non-normal disturbances. See below for more on the options regarding the GARCH covariance matrix.

Example:

    garch p q ; y const x

where $p \ge 0$ and $q > 0$ denote the respective lag orders as shown in equation (20.12). These values can be supplied in numerical form or as the names of pre-defined scalar variables.

GARCH estimation

Estimation of the parameters of a GARCH model is by no means a straightforward task. Consider equation (20.12): the conditional variance at any point in time, $\sigma_t^2$, depends on the conditional variance in earlier periods, but $\sigma_t^2$ is not observed, and must be inferred by some sort of Maximum Likelihood procedure. Gretl uses the method proposed by Fiorentini, Calzolari and Panatton
150. d.table("/home/jack/.gretl/Rdata.tmp", header=TRUE)
    gretldata <- ts(gretldata, start=c(1949, 1), frequency = 12)

Since our data were defined in gretl as time series, we use an R time-series object (ts for short) for the transfer. In this way we can retain in R useful information such as the periodicity of the data and the sample limits. The downside is that the names of individual series, as defined in gretl, are not valid identifiers. In order to extract the variable lg, one needs to use the syntax lg <- gretldata[, "lg"].

ARIMA estimation can be carried out by issuing the following two R commands:

[Chapter 25: Gretl and R]

    lg <- gretldata[, "lg"]
    arima(lg, c(0,1,1), seasonal=c(0,1,1))

which yield

    Coefficients:
              ma1     sma1
          -0.4018  -0.5569
    s.e.   0.0896   0.0731

    sigma^2 estimated as 0.001348:  log likelihood = 244.7,  aic = -483.4

Happily, the estimates again coincide.

25.3 Running an R script

Opening an R window and keying in commands is a convenient method when the job is small. In some cases, however, it would be preferable to have R execute a script prepared in advance. One way to do this is via the source() command in R. Alternatively, gretl offers the facility to edit an R script and run it, having the current dataset pre-loaded automatically. This feature can be accessed via the File, Script Files menu entry. By selecting "User file", one can load a pre-existing R script; if you want to create a new script instead
151. d vars: Gives a choice between a time series plot, a regular X-Y scatter plot, an X-Y plot using impulses (vertical bars), an X-Y plot "with factor separation" (i.e. with the points colored differently depending on the value of a given dummy variable), boxplots, and a 3-D graph. Serves up a dialog box where you specify the variables to graph. See Chapter 7 for details.
  - Multiple graphs: Allows you to compose a set of up to six small graphs, either pairwise scatter-plots or time-series graphs. These are displayed together in a single window.
  - Summary statistics: Shows a full set of descriptive statistics for the variables selected in the main window.
  - Correlation matrix: Shows the pairwise correlation coefficients for the selected variables.
  - Cross Tabulation: Shows a cross-tabulation of the selected variables. This works only if at least two variables in the data set have been marked as discrete (see Chapter 8).
  - Principal components: Produces a Principal Components Analysis for the selected variables.
  - Mahalanobis distances: Computes the Mahalanobis distance of each observation from the centroid of the selected set of variables.
  - Cross-correlogram: Computes and graphs the cross-correlogram for two selected variables.
• Add menu: Offers various standard transformations of variables (logs, lags, squares, etc.) that you may wish to add to the data set. Also gives the option of adding random variables, and (for time-series data) adding seasona
152. data are contained in the bjg sample dataset The following gretl code Chapter 25 Gretl and R 197 gt model lt Im price sqft bedrms baths gt summary model Call Im formula price sqft bedrms baths Residuals Min 10 Median 30 Max 55 533 16 219 6 093 22 432 68 703 Coefficients Estimate Std Error t value Pritil Intercept 129 06163 988 30326 1 462 0 174559 sqft 0 15480 0 03194 4 847 0 000675 bedrms 21 58752 27 02933 0 799 0 443037 baths 12 19276 43 25000 0 282 0 783758 Signif codes 0 0 001 0 01 0 05 7 0 1 7 1 Residual standard error 40 87 on 10 degrees of freedom Multiple R squared 0 836 Adjusted R squared 0 7868 F statistict 16 99 on 3 and 10 DF p value 0 0002986 gt t Figure 25 2 OLS regression on house prices via R open bjg arima 0 11 011 1g nc produces the estimates shown in Table 25 2 Table 25 2 Airline model from Box and Jenkins 1976 selected portion of gretl s estimates Variable Coefficient Std Error t statistic p value 01 0 401824 0 0896421 4 4825 0 0000 01 0 556936 0 0731044 7 6184 0 0000 Variance of innovations 0 00134810 Log likelihood 244 696 Akaike information criterion 483 39 If we now open an R session as described in the previous subsection the data passing mechanism is slightly different The R commands that read the data from gretl are in this case load data from gretl gretldata lt rea
153. data file in the GUI. Explore the data: generate graphs, run regressions, perform tests. Then open the Command log, edit out any redundant commands, and save it under a specific name. Run the script to generate a single file containing a concise record of your work.
• Start by establishing a new script file. Type in any commands that may be required to set up transformations of the data (see the genr command in the Gretl Command Reference). Typically this sort of thing can be accomplished more efficiently via commands assembled with forethought rather than point-and-click. Then save and run the script: the GUI data window will be updated accordingly. Now you can carry out further exploration of the data via the GUI. To revisit the data at a later point, open and rerun the "preparatory" script first.

Scripts and data files

One common way of doing econometric research with gretl is as follows: compose a script; execute the script; inspect the output; modify the script; run it again, with the last three steps repeated as many times as necessary. In this context, note that when you open a data file this clears out most of gretl's internal state. It's therefore probably a good idea to have your script start with an open command: the data file will be re-opened each time, and you can be confident you're getting "fresh" results.

One further point should be noted. When you go to open a new data file via the graphical interface, you are alwa
154. data on career choice from Keane and Wolpin 1997 The dependent variable is the occupational status of an individual 0 in school 1 not in school and not working 2 working and the explanatory variables are education and work experience linear and square plus a black binary variable The full data set is a panel here the analysis is confined to a cross section for 1987 For explanations of the matrix methods employed in the script see chapter 12 Example 22 4 Multinomial logit function mlogitlogprobs series y matrix X matrix theta scalar n max y scalar k cols X matrix b mshape theta k n matrix tmp X b series ret In 1 sumr exp tmp loop for i 1 n quiet series x tmp i ret y 1 x 0 end loop return series ret end function open Keane gdt status status 1 dep var must be O based smpl year 87 ok status restrict matrix X educ exper expersq black const scalar k cols X matrix theta zeros 2 k 1 mle loglik mlogitlogprobs status X theta params theta end mle verbose hessian Chapter 22 Discrete and censored dependent variables 179 22 4 The Tobit model The Tobit model is used when the dependent variable of a model is censored Assume a latent variable y can be described as k Vi Dd xijbj i j 1 where e N 0 0 If y were observable the model s parameters could be estimated via ordinary least squares On the contrary sup
155. distribution: Chi-square(2) = 1.872 with p-value 0.39211

As can be seen from the sample output, a Doornik-Hansen test for normality is computed automatically. This test is suppressed for discrete variables where the number of distinct values is less than 10.

This command accepts two options: --quiet, to avoid generation of the histogram when invoked from the command line, and --gamma, for replacing the normality test with Locke's nonparametric test, whose null hypothesis is that the data follow a Gamma distribution.

If the distinct values of a discrete variable need to be saved, the values() matrix construct can be used (see chapter 12).

The xtab command

The xtab command can be invoked in either of the following ways. First,

    xtab ylist ; xlist

where ylist and xlist are lists of discrete variables. This produces cross-tabulations (two-way frequencies) of each of the variables in ylist (by row) against each of the variables in xlist (by column). Or second,

    xtab xlist

In the second case a full set of cross-tabulations is generated; that is, each variable in xlist is tabulated against each other variable in the list. In the graphical interface, this command is represented by the "Cross Tabulation" item under the View menu, which is active if at least two variables are selected.

Here is an example of use:

    open greene22_2
    discrete Z*   # mark Z1-Z8 as discrete
    xtab Z1 Z4 ; Z5 Z6

which produces

Cross tabulation of Z1 (rows) agai
156. [Figure 16.1: NLS dialog box. The screenshot shows a specification typed into the dialog: genr lines setting alpha, beta and gamma, the function C = alpha + beta * Y^gamma, and deriv lines for alpha, beta and gamma, with check boxes for "Show details of iterations" and "Robust standard errors".]

16.4 Analytical and numerical derivatives

If you are able to figure out the derivatives of the regression function with respect to the parameters, it is advisable to supply those derivatives as shown in the examples above. If that is not possible, gretl will compute approximate numerical derivatives. The properties of the NLS algorithm may not be so good in this case (see Section 16.7).

If analytical derivatives are supplied, they are checked for consistency with the given nonlinear function. If the derivatives are clearly incorrect, estimation is aborted with an error message. If the derivatives are "suspicious", a warning message is issued but estimation proceeds. This warning may sometimes be triggered by incorrect derivatives, but it may also be triggered by a high degree of collinearity among the derivatives.

[Chapter 16: Nonlinear least squares]

Note that you cannot mix analytical and numerical derivatives: you should supply expressions for all of the derivatives or none.

16.5 Controlling termination

The NLS estimation procedure is an iterative process. Iteration is terminated when the criterion for convergence is met or whe
157. e Note that the params statement has been replaced by a series of deriv statements; these have the double function of identifying the parameters over which to optimize and providing an analytical expression for their respective score elements.

17.6 Debugging ML scripts

We have discussed above the main sorts of statements that are permitted within an mle block, namely
• auxiliary commands to generate helper variables;
• deriv statements to specify the gradient with respect to each of the parameters; and
• a params statement to identify the parameters in case analytical derivatives are not given.

For the purpose of debugging ML estimators, one additional sort of statement is allowed: you can print the value of a relevant variable at each step of the iteration. This facility is more restricted than the regular print command. The command word print should be followed by the name of just one variable (a scalar, series or matrix).

[Chapter 17: Maximum likelihood estimation]

In the last example above, a key variable named tmp was generated, forming the basis for the analytical derivatives. To track the progress of this variable one could add a print statement within the ML block, as in

    series tmp = dnorm(ndx)*(y/P - (1-y)/(1-P))
    print tmp

17.7 Using functions

The mle command allows you to estimate models that gretl does not provide natively: in some cases, it may be a good idea to wrap up the mle block in a user-defined function (see Chapt
158. e Using functions we can simplify this task considerably and eventually be able to write something easy like list X const x zipty X Let s see how this can be done First we need to define a function called zip that will take two arguments a dependent variable y and a list of explanatory variables X An example of such function can be seen in script 17 2 By inspecting the function code you can see that the actual estimation does not happen here rather the zipO function merely formats and prints out the results coming from another user written function namely zip_estimateQ 3The actual ZIP model is in fact a bit more general than the one presented here The specialized version discussed in this section was chosen for the sake of simplicity For futher details see Greene 2003 Chapter 17 Maximum likelihood estimation 132 Example 17 2 Zero inflated Poisson Model user level function user level function estimate the model and print out the results function zip series y list X matrix ret zip_estimate y X matrix coef ret 1 matrix vcv ret 2 cols ret printf nZero inflated Poisson model n n scalar c coef 1 scalar se sqrt vcv 1 1 scalar zs c se scalar pv 2 pvalue n zs printf alpha 9 4f 9 4f 8 3f 8 3f n C se ZS pv k 2 loop foreach i X q sprintf s i scalar c coef k scalar se sqrt vcev k k scalar zs c se scalar pv 2 pvalue n zs printf 10s 9 4f 9
159. 9.3 Progressive mode
9.4 Loop examples
10 User-defined functions
    10.1 Defining a function
    10.2 Calling a function
    10.3 Deleting a function
    10.4 Function programming details
    10.5 Function packages
11 Named lists and strings
    11.1 Named lists
    11.2 Named strings
12 Matrix manipulation
    12.1 Creating matrices
    12.2 Empty matrices
    12.3 Selecting sub-matrices
    12.4 Matrix operators
    12.5 Matrix-scalar operators
    12.6 Matrix functions
    12.7 Matrix accessors
    12.8 Namespace issues
    12.9 Creating a data series from a matrix
    12.10 Matrices and lists
    12.11 Deleting a matrix
    12.12 Printing a matrix
    12.13 Example: OLS using matrices
13 Cheat sheet
    13.1 Dataset handling
    13.2 Creating/modifying variables
    13.3 Neat tricks
II Econometric methods
14 Robust covariance matrix estimation
    14.1 Introduction
    14.2 Cross-sectional data and the HCCME
    14.3 Time series data and HAC covarian
160. e example may be rewritten more cleanly as

Gretl side:

    matrix A = mshape(seq(3,14),4,3)
    err = mwrite(A, "@dotdir/mymatfile.mat")

R side:

    fname <- paste(gretl.dotdir, "mymatfile.mat", sep="")
    A <- as.matrix(read.table(fname, skip=1))

Passing data from R to gretl

For passing data in the opposite direction, gretl defines a special function that can be used in the R environment. An R object will be written as a temporary file in gretl's dotdir directory, from where it can be easily retrieved from within gretl.

[Figure 25.5: Output from a non-interactive R run]

    # load data from gretl
    gretldata <- read.table("/home/jack/.gretl/Rdata.tmp", header=TRUE)
    gretldata <- ts(gretldata, start=c(1949, 1), frequency = 12)
    # load script from gretl
    > lg <- gretldata[, "lg"]
    > arima(lg, c(0,1,1), seasonal=c(0,1,1))

    Call:
    arima(x = lg, order = c(0, 1, 1), seasonal = c(0, 1, 1))

    Coefficients:
              ma1     sma1
          -0.4018  -0.5569
    s.e.   0.0896   0.0731

    sigma^2 estimated as 0.001348:  log likelihood = 244.7,  aic = -483.4

The name of this function is gretl.export(), and it accepts one argument, the object to be exported. At present, the objects that can be exported with this method are matrices, data frames and time-series objects. The function creates a text file, with the same name as the exported object, in gretl's temporary directory. Data frames and t
161. e free parameter vectors associated with $\beta$ and $\alpha$ respectively. We may refer to the free parameters collectively as $\theta$ (the column vector formed by concatenating $\phi$ and $\psi$). Gretl uses this representation internally when testing the restrictions.

If the list of restrictions that is passed to the restrict command contains more constraints than necessary to achieve identification, then an LR test is performed; moreover, the restrict command can be given the --full switch, in which case full estimates for the restricted system are printed (including the $\Gamma_i$ terms), and the system thus restricted becomes the "current model" for the purposes of further tests. Thus you are able to carry out cumulative tests, as in Chapter 7 of Johansen (1995).

Syntax

The full syntax for specifying the restriction is an extension of the one exemplified in the previous section. Inside a restrict ... end restrict block, valid statements are of the form

    parameter linear combination = scalar

where a parameter linear combination involves a weighted sum of individual elements of $\beta$ or $\alpha$ (but not both in the same combination); the scalar on the right-hand side must be 0 for combinations involving $\alpha$, but can be any real number for combinations involving $\beta$. Below, we give a few examples of valid restrictions:

    b[1,1] = 1.618
    b[1,4] + 2*b[2,5] = 0
    a[1,3] = 0
    a[1,1] - a[1,2] = 0

A special syntax is reserved for the case when a certain constraint should be applied to a
162. e number of non-obvious choices one has to make when using GMM is high, and in finite samples each of these can have dramatic consequences on the eventual output. Some of the factors that may affect the results are:

1. Orthogonality conditions can be written in more than one way: for example, if $E(x_t - \mu) = 0$, then $E(x_t/\mu - 1) = 0$ holds too. It is possible that a different specification of the moment conditions leads to different results.
2. As with all other numerical optimization algorithms, weird things may happen when the objective function is nearly flat in some directions or has multiple minima. BFGS is usually quite good, but there is no guarantee that it always delivers a sensible solution, if one at all.
3. The 1-step and, to a lesser extent, the 2-step estimators may be sensitive to apparently trivial details, like the re-scaling of the instruments. Different choices for the initial weights matrix can also have noticeable consequences.
4. With time-series data, there is no hard rule on the appropriate number of lags to use when computing the long-run covariance matrix (see section 18.4). Our advice is to go by trial and error, since results may be greatly influenced by a poor choice. Future versions of gretl will include more options on covariance matrix estimation.

One of the consequences of this state of things is that replicating various well-known published studies may be extremely difficult. Any non-trivial result is virtually i
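Point 1 can be made concrete with a small Python sketch (made-up data, not a GMM estimation). Both sample moments are zero at the same value of the parameter, but away from it they differ by a factor that depends on the trial value, which is one reason the two specifications can behave differently in finite samples.

```python
x = [2.0, 4.0, 6.0, 8.0]          # made-up sample with mean 5


def g_diff(mu):
    """Sample analogue of E(x - mu)."""
    return sum(xi - mu for xi in x) / len(x)


def g_ratio(mu):
    """Sample analogue of E(x/mu - 1)."""
    return sum(xi / mu - 1.0 for xi in x) / len(x)


mean = sum(x) / len(x)
print(g_diff(mean), g_ratio(mean))   # 0.0 0.0: both conditions hold at the mean
print(g_diff(4.0), g_ratio(4.0))     # 1.0 0.25: same sign, different scale
```

The second moment equals the first divided by the trial value of mu, so the two criterion surfaces have the same root but different shapes.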
163. e of the refinements noted below are available only in gretl 1.7.2 and higher.

To define a string variable, you can use either of two commands, string or sprintf. The string command is simpler: you can type, for example,

    string s1 = "some stuff I want to save"
    string s2 = getenv("HOME")
    string s3 = s1 + 11

[Chapter 11: Named lists and strings]

The first field after string is the name under which the string should be saved, then comes an equals sign, then comes a specification of the string to be saved. This can be the keyword null, to produce an empty string, or may take any of the following forms:
• a string literal (enclosed in double quotes); or
• the name of an existing string variable; or
• a function that returns a string (see below); or
• any of the above followed by + and an integer offset.

The role of the integer offset is to use a substring of the preceding element, starting at the given character offset. An empty string is returned if the offset is greater than the length of the string in question.

To add to the end of an existing string you can use the operator +=, as in

    string s1 = "some stuff I want to "
    string s1 += "save"

or you can use the ~ operator to join two or more strings, as in

    string s1 = "sweet"
    string s2 = "Home, " ~ s1 ~ " home"

The sprintf command is more flexible. It works exactly as gretl's printf command except that the format string must be preceded by the name of a string variable. For examp
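The integer-offset rule can be mimicked in Python. This sketch assumes the offset counts the characters to skip from the start of the string (zero-based), which the text above does not spell out; the helper name `offset_substring` is hypothetical.

```python
def offset_substring(s, offset):
    """Return the part of s starting at the given character offset.

    An offset beyond the end of the string gives an empty string.
    """
    if offset > len(s):
        return ""
    return s[offset:]


s1 = "some stuff I want to save"
s3 = offset_substring(s1, 11)
print(s3)                            # I want to save
print(repr(offset_substring(s1, 99)))  # ''
```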
164. e specific version number of the file you downloaded at step 1.
3. Change directory to the gretl source directory created at step 2 (e.g. gretl-1.6.6).
4. Proceed to the next section, "Configure and make".

To work with CVS you'll first need to install the cvs client program if it's not already on your system. Relevant resources you may wish to consult include the CVS website at www.nongnu.org/cvs, general information on sourceforge CVS on the SourceForge CVS page, and instructions specific to gretl at the SF gretl CVS page.

When grabbing the CVS sources for the first time, you should first decide where you want to store the code. For example, you might create a directory called cvs under your home directory. Open a terminal window, cd into this directory, and type the following commands:

    cvs -d:pserver:anonymous@gretl.cvs.sourceforge.net:/cvsroot/gretl login
    cvs -z3 -d:pserver:anonymous@gretl.cvs.sourceforge.net:/cvsroot/gretl co -P gretl

After the first command you will be prompted for a password: just hit the Enter key. After the second command, cvs should create a subdirectory named gretl and fill it with the current sources.

When you want to update the source, this is very simple: just move into the gretl directory and type

    cvs update -d -P

Assuming you're now in the CVS gretl directory, you can proceed in the same manner as with the released source package.

Configure the source

The next command you need is ./configure; this
165. e subtle issues can arise here. This chapter attempts to explain the issues.

A sub-sample may be defined in relation to a full data set in two different ways: we will refer to these as "setting" the sample and "restricting" the sample respectively.

6.2 Setting the sample

By "setting" the sample we mean defining a sub-sample simply by means of adjusting the starting and/or ending point of the current sample range. This is likely to be most relevant for time-series data. For example, one has quarterly data from 1960:1 to 2003:4, and one wants to run a regression using only data from the 1970s. A suitable command is then

    smpl 1970:1 1979:4

Or one wishes to set aside a block of observations at the end of the data period for out-of-sample forecasting. In that case one might do

    smpl ; 2000:4

where the semicolon is shorthand for "leave the starting observation unchanged". (The semicolon may also be used in place of the second parameter, to mean that the ending observation should be unchanged.) By "unchanged" here, we mean unchanged relative to the last smpl setting, or relative to the full dataset if no sub-sample has been defined up to this point. For example, after

    smpl 1970:1 2003:4
    smpl ; 2000:4

the sample range will be 1970:1 to 2000:4.

An incremental or relative form of setting the sample range is also supported. In this case a relative offset should be given, in the form of a signed integer (or a semicolon to indicate
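The "unchanged" semantics of the semicolon can be modelled in a few lines of Python. This is only a sketch of the rule as stated above, not gretl code: `set_sample` is a hypothetical helper, `None` stands in for the semicolon, and observation labels are kept as plain strings.

```python
def set_sample(current, start=None, end=None):
    """Return a new (start, end) range; None means 'leave unchanged'."""
    cur_start, cur_end = current
    new_start = cur_start if start is None else start
    new_end = cur_end if end is None else end
    return (new_start, new_end)


full = ("1960:1", "2003:4")                  # full dataset range
rng = set_sample(full, "1970:1", "2003:4")   # smpl 1970:1 2003:4
rng = set_sample(rng, None, "2000:4")        # smpl ; 2000:4
print(rng)  # ('1970:1', '2000:4')
```

Each call works relative to the range left by the previous call, which is why the starting point 1970:1 survives the second command.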
166. e sum of specified coefficients; cusum (the Harvey-Collier t-statistic); kpss (KPSS stationarity test, no p-value available); lmtest (see below); meantest (test for difference of means); omit (joint test for the significance of variables omitted from a model); reset (Ramsey's RESET); restrict (general linear restriction); runs (runs test for randomness); testuhat (test for normality of residual); and vartest (test for difference of variances). In most cases both a $test and a $pvalue are stored; the exception is the KPSS test, for which a p-value is not currently available.

An important point to notice about this mechanism is that the internal variables $test and $pvalue are over-written each time one of the tests listed above is performed. If you want to reference these values, you must do so at the correct point in the sequence of gretl commands.

A related point is that some of the test commands generate, by default, more than one test statistic and p-value; in these cases only the last values are stored. To get proper control over the retrieval of values via $test and $pvalue you should formulate the test command in such a way that the result is unambiguous. This comment applies in particular to the adf and lmtest commands.

• By default, the adf command generates three variants of the Dickey-Fuller test: one based on a regression including a constant, one using a constant and linear trend, and one using a constant and a quadratic trend. When you
167. e the covariance matrix of the parameter estimates is printed.

16.2 Initializing the parameters

The parameters of the regression function must be given initial values prior to the nls command. This can be done using the genr command (or, in the GUI program, via the menu item "Variable, Define new variable").

In some cases, where the nonlinear function is a generalization of (or a restricted form of) a linear model, it may be convenient to run an ols and initialize the parameters from the OLS coefficient estimates. In relation to the first example above, one might do:

    ols C 0 Y
    genr alpha = $coeff(0)
    genr beta = $coeff(Y)
    genr gamma = 1

And in relation to the second example one might do:

    ols y 0 x1 x2
    genr alpha = $coeff(0)
    genr beta = $coeff(x1)

16.3 NLS dialog window

It is probably most convenient to compose the commands for NLS estimation in the form of a gretl script, but you can also do so interactively, by selecting the item "Nonlinear Least Squares" under the Model, Nonlinear models menu. This opens a dialog box where you can type the function specification (possibly prefaced by genr lines to set the initial parameter values) and the derivatives, if available. An example of this is shown in Figure 16.1. Note that in this context you do not have to supply the nls and end nls tags.

[Dialog text: NLS: Specify function, and derivatives if possible. Please refer to "Help" for guidanc
168. eadsheet.

4. Select data series from a suitable database.

5. Use your favorite text editor or other software tools to create a data file in gretl format independently.

Here are a few comments and details on these methods.

Common points on imported data

Options (1) and (2) involve using gretl's "import" mechanism. For gretl to read such data successfully, certain general conditions must be satisfied:

• The first row must contain valid variable names. A valid variable name is of 15 characters maximum, starts with a letter, and contains nothing but letters, numbers and the underscore character, _. Longer variable names will be truncated to 15 characters. Qualifications to the above: First, in the case of an ASCII or CSV import, if the file contains no row with variable names the program will automatically add names, v1, v2 and so on. Second, by "the first row" is meant the first relevant row. In the case of ASCII and CSV imports, blank rows and rows beginning with a hash mark, #, are ignored. In the case of Excel and Gnumeric imports, you are presented with a dialog box where you can select an offset into the spreadsheet, so that gretl will ignore a specified number of rows and/or columns.

• Data values: these should constitute a rectangular block, with one variable per column (and one observation per row). The number of variables (data columns) must match the number of variable names given. See also sect
169. eal m x n matrix A. Let $k = \min(m, n)$. The decomposition is

$$A = U \Sigma V'$$

where U is an m x k orthogonal matrix, $\Sigma$ is a k x k diagonal matrix and V is a k x n orthogonal matrix. The diagonal elements of $\Sigma$ are the singular values of A; they are real and non-negative, and are returned in descending order. The first k columns of U and V are the left and right singular vectors of A.

The svd function returns the singular values, in a vector of length k. The left and/or right singular vectors may be obtained by supplying non-null values for the second and/or third arguments respectively. For example:

    matrix s = svd(A, &U, &V)
    matrix s = svd(A, null, null)
    matrix s = svd(A, null, &V)

In the first case both sets of singular vectors are obtained; in the second case only the singular values are obtained; and in the third, the right singular vectors are obtained but U is not computed. Please note: when the third argument is non-null, it is actually V' that is provided. To reconstitute the original matrix from its SVD, one can do:

    matrix s = svd(A, &U, &V)
    matrix B = (U .* s) * V

Finally, the syntax for mols is

    matrix B = mols(Y, X, &U)

This function returns the OLS estimates obtained by regressing the T x n matrix Y on the T x k matrix X, that is, a k x n matrix holding $(X'X)^{-1} X'Y$. The Cholesky decomposition is used. The matrix U, if not null, is used to store the residuals.

Reading and writing matrices from/to text files
170. eate a package. Start the GUI program and take a look at the "File / Function files" menu. This menu contains four items: "On local machine", "On server", "Edit package" and "New package".

Select "New package". (This will produce an error message unless at least one user-defined function is currently loaded in memory; see the previous point.) In the first dialog you get to select:

• A public function to package.

• Zero or more "private" helper functions.

Public functions are directly available to users; private functions are part of the "behind the scenes" mechanism in a function package.

On clicking "OK" a second dialog should appear (see Figure 10.2), where you get to enter the package information (author, version, date, and a short description). You can also enter help text for the public interface. You have a further chance to edit the code of the function(s) to be packaged, by clicking on "Edit function code". (If the package contains more than one function, a drop-down selector will be shown.) And you get to add a sample script that exercises your package. This will be helpful for potential users, and also for testing. A sample script is required if you want to upload the package to the gretl server (for which a check-box is supplied).

You won't need it right now, but the button labeled "Save as script" allows you to "reverse engineer" a function package, writing out a script that contains all
171. ecification one can estimate Transfer Function models, which generalize ARMA by adding the effects of exogenous variables distributed across time.

Gretl provides a way to estimate both forms. Models written as in (20.2) are estimated by maximum likelihood; models written as in (20.3) are estimated by conditional maximum likelihood. (For more on these options see the section on "Estimation" below.)

In the special case when $x_t = 1$ (that is, the models include a constant but no exogenous variables) the two specifications discussed above reduce to

$$\phi(L)(y_t - \mu) = \theta(L)\epsilon_t \qquad (20.4)$$

and

$$\phi(L) y_t = \alpha + \theta(L)\epsilon_t \qquad (20.5)$$

respectively. These formulations are essentially equivalent, but if they represent one and the same process $\mu$ and $\alpha$ are, fairly obviously, not numerically identical; rather

$$\alpha = (1 - \phi_1 - \dots - \phi_p)\,\mu$$

The gretl syntax for estimating (20.4) is simply

    arma p q ; y

The AR and MA lag orders, p and q, can be given either as numbers or as pre-defined scalars. The parameter $\mu$ can be dropped if necessary by appending the option --nc ("no constant") to the command. If estimation of (20.5) is needed, the switch --conditional must be appended to the command, as in

    arma p q ; y --conditional

Generalizing this principle to the estimation of (20.2) or (20.3), you get that

    arma p q ; y const x1 x2

would estimate the following model:

$$y_t - x_t\beta = \phi_1 (y_{t-1} - x_{t-1}\beta) + \dots + \phi_p (y_{t-p} - x_{t-p}\beta) + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}$$

where in this instan
172. ed as time series, further issues may arise relating to the frequency of time-series data. In a gretl time-series data set, all the series must have the same frequency. Suppose you wish to make a combined dataset using series that, in their original state, are not all of the same frequency. For example, some series are monthly and some are quarterly.

Your first step is to formulate a strategy: do you want to end up with a quarterly or a monthly data set? A basic point to note here is that "compacting" data from a higher frequency (e.g. monthly) to a lower frequency (e.g. quarterly) is usually unproblematic. You lose information in doing so, but in general it is perfectly legitimate to take (say) the average of three monthly observations to create a quarterly observation. On the other hand, "expanding" data from a lower to a higher frequency is not, in general, a valid operation.

In most cases, then, the best strategy is to start by creating a data set of the lower frequency, and then to compact the higher frequency data to match. When you import higher-frequency data from a database into the current data set, you are given a choice of compaction method (average, sum, start of period, or end of period). In most instances "average" is likely to be appropriate.

You can also import lower-frequency data into a high-frequency data set, but this is generally not recommended. What gretl does in this case is simply replicate the values of the lower fr
173. ed by a k-vector of unknown parameters $\theta$, which we assume is contained in a set $\Theta$, and which can be used to evaluate the probability of observing a sample with any given characteristics.

After observing the data, the values Y are given, and this function can be evaluated for any legitimate value of $\theta$. In this case, we prefer to call it the likelihood function; the need for another name stems from the fact that this function works as a density when we use the $y_t$s as arguments and $\theta$ as parameters, whereas in this context $\theta$ is taken as the function's argument, and the data Y only have the role of determining its shape.

In standard cases, this function has a unique maximum. The location of the maximum is unaffected if we consider the logarithm of the likelihood (or log-likelihood for short): this function will be denoted as

$$\ell(\theta) = \log f(Y; \theta)$$

The log-likelihood functions that gretl can handle are those where $\ell(\theta)$ can be written as

$$\ell(\theta) = \sum_{t=1}^{T} \ell_t(\theta)$$

which is true in most cases of interest. The functions $\ell_t(\theta)$ are called the log-likelihood contributions.

Moreover, the location of the maximum is obviously determined by the data Y. This means that the value

$$\hat{\theta}(Y) = \mathop{\mathrm{Argmax}}_{\theta \in \Theta} \ell(\theta) \qquad (17.1)$$

is some function of the observed data (a statistic), which has the property, under mild conditions, of being a consistent, asymptotically normal and asymptotically efficient estimator of $\theta$.

Sometimes it is possible to write down explicitly t
174. ed in terms of the two econometric desiderata, efficiency and consistency.

From a purely statistical viewpoint, we could say that there is a tradeoff between robustness and efficiency. In the fixed effects approach, we do not make any hypotheses on the "group effects" (that is, the time-invariant differences in mean between the groups) beyond the fact that they exist, and that can be tested; see below. As a consequence, once these effects are swept out by taking deviations from the group means, the remaining parameters can be estimated.

On the other hand, the random effects approach attempts to model the group effects as drawings from a probability distribution instead of removing them. This requires that individual effects are representable as a legitimate part of the disturbance term, that is, zero-mean random variables, uncorrelated with the regressors.

As a consequence, the fixed-effects estimator "always works", but at the cost of not being able to estimate the effect of time-invariant regressors. The richer hypothesis set of the random-effects estimator ensures that parameters for time-invariant regressors can be estimated, and that estimation of the parameters for time-varying regressors is carried out more efficiently. These advantages, though, are tied to the validity of the additional hypotheses. If, for example, there is reason to think that individual effects may be correlated with some of the ex
175. ed of Fortran code written by Roger Koenker; this is accompanied by various driver and auxiliary functions written in the R language by Koenker and Martin Mächler. The latter functions have been re-worked in C for gretl. We have added some guards against potential numerical problems in small samples.

By default, standard errors are computed according to the asymptotic formula given by Koenker and Bassett (1978). Alternatively, if the --robust option is given, we use the sandwich estimator developed in Koenker and Zhao (1994).

23.3 Confidence intervals

An option --intervals is available. When this is given we print confidence intervals for the parameter estimates instead of standard errors. These intervals are computed using the rank inversion method and in general they are asymmetrical about the point estimates; that is, they are not simply "plus or minus so many standard errors". The specifics of the calculation are affected by the --robust option: without this, the intervals are computed on the assumption of IID errors (Koenker, 1994); with it, they use the heteroskedasticity-robust estimator developed by Koenker and Machado (1999).

By default, 90 percent intervals are produced. You can change this by appending a confidence value (expressed as a decimal fraction) to the intervals option, as in

    quantreg tau reglist --intervals=.95

When the confidence intervals option is selected, the parame
176. ee an icon representing your graph. Right-click on that and select "Edit plot commands" from the pop-up menu. This opens an editing window with the actual gnuplot commands displayed. You can edit these commands and either save them for future processing or send them to gnuplot directly, using the Execute (cogwheel) button on the toolbar in the plot commands editing window.

To find out more about gnuplot visit www.gnuplot.info. This site has documentation for the current version of the program in various formats.

See also the entry for gnuplot in the Gretl Command Reference, and the graph and plot commands for "quick and dirty" ASCII graphs.

[Figure 7.1: gretl's gnuplot controller]

7.2 Boxplots

These plots (after Tukey and Chambers) display the distribution of a variable. The central box encloses the middle 50 percent of the data, i.e. it is bounded by the first and third quartiles. The "whiskers" extend to the minimum and maximum values. A line is drawn across the box at the median and a "+" sign identifies the mean.

In the case of boxplots with confidence intervals, dotted lines show the limits of an approximate 90 percent confidence inte
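177. efined by

$$y_t = k + x_t + u_t$$

where, again, k is a real number and $u_t$ is a white noise process. Since $u_t$ is stationary by definition, $x_t$ and $y_t$ cointegrate: that is, their difference

$$z_t = y_t - x_t = k + u_t$$

is a stationary process. For k = 0, $z_t$ is simple zero-mean white noise, whereas for $k \neq 0$ the process $z_t$ is white noise with a non-zero mean.

After some simple substitutions, the two equations above can be represented jointly as a VAR(1) system

$$\begin{bmatrix} y_t \\ x_t \end{bmatrix} = \begin{bmatrix} k + m \\ m \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} u_t + \epsilon_t \\ \epsilon_t \end{bmatrix}$$

or in VECM form

$$\begin{bmatrix} \Delta y_t \\ \Delta x_t \end{bmatrix} = \begin{bmatrix} k + m \\ m \end{bmatrix} + \begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \begin{bmatrix} u_t + \epsilon_t \\ \epsilon_t \end{bmatrix} = \mu_0 + \alpha\beta' \begin{bmatrix} y_{t-1} \\ x_{t-1} \end{bmatrix} + \eta_t = \mu_0 + \alpha z_{t-1} + \eta_t$$

where $\beta$ is the cointegration vector and $\alpha$ is the "loadings" or "adjustments" vector.

We are now ready to consider three possible cases:

1. $m \neq 0$: In this case $x_t$ is trended, as we just saw; it follows that $y_t$ also follows a linear trend because on average it keeps at a fixed distance k from $x_t$. The vector $\mu_0$ is unrestricted.

2. $m = 0$ and $k \neq 0$: In this case, $x_t$ is not trended and as a consequence neither is $y_t$. However, the mean distance between $y_t$ and $x_t$ is non-zero. The vector $\mu_0$ is given by

$$\mu_0 = \begin{bmatrix} k \\ 0 \end{bmatrix}$$

which is not null and therefore the VECM shown above does have a constant term. The constant, however, is subject to the restriction that its second element must be 0. More generally, $\mu_0$ is a multiple of the vector $\alpha$. Note that the VECM could also be written as Yt 1 Ayt 1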
178. em "File / Databases". For details on the format of gretl databases, see Appendix A.

Online access to databases

As of version 0.40, gretl is able to access databases via the internet. Several databases are available from Wake Forest University. Your computer must be connected to the internet for this option to work. Please see the description of the "data" command under gretl's Help menu.

Visit the gretl data page for details and updates on available data.

Foreign database formats

Thanks to Thomas Doan of Estima, who made available the specification of the database format used by RATS 4 (Regression Analysis of Time Series), gretl can handle such databases, or at least a subset of same, namely time-series databases containing monthly and quarterly series.

Gretl can also import data from PcGive databases. These take the form of a pair of files, one containing the actual data (with suffix .bn7) and one containing supplementary information (.in7).

4.4 Creating a data file from scratch

There are several ways of doing this:

1. Find, or create using a text editor, a plain text data file and open it with gretl's "Import ASCII" option.

2. Use your favorite spreadsheet to establish the data file, save it in Comma Separated Values format if necessary (this should not be necessary if the spreadsheet format is MS Excel, Gnumeric or Open Document), then use one of gretl's "Import" options.

3. Use gretl's built-in spr
179. em of autocorrelation by including relevant lagged variables in a time series model, or in other words, by specifying the dynamics of the model more fully. HAC estimation should not be seen as the first resort in dealing with an autocorrelated error process.

That said, the "obvious" extension of White's HCCME to the case of autocorrelated errors would seem to be this: estimate the off-diagonal elements of $\hat{\Omega}$ (that is, the autocovariances, $E(u_t u_s)$) using, once again, the appropriate OLS residuals: $\hat{\omega}_{ts} = \hat{u}_t \hat{u}_s$. This is basically right, but demands an important amendment. We seek a consistent estimator, one that converges towards the true $\Omega$ as the sample size tends towards infinity. This can't work if we allow unbounded serial dependence. Bigger samples will enable us to estimate more of the true $\omega_{ts}$ elements (that is, for t and s more widely separated in time) but will not contribute ever-increasing information regarding the maximally separated $\omega_{ts}$ pairs, since the maximal separation itself grows with the sample size. To ensure consistency, we have to confine our attention to processes exhibiting temporally limited dependence, or in other words cut off the computation of the $\hat{\omega}_{ts}$ values at some maximum value of $p = t - s$ (where p is treated as an increasing function of the sample size, T, although it cannot increase in proportion to T).

The simplest variant of this idea is to truncate the computation at some finite lag order p, where p grows as
180. ement, we nonetheless allow the annealing to randomize the starting point. Experiments indicated that the latter effect can be helpful.

Besides annealing, a further alternative is manual initialization. This is done by passing a predefined vector to the set command with parameter initvals, as in

    set initvals myvec

The details depend on whether the switching algorithm or LBFGS is used. For the switching algorithm, there are two options for specifying the initial values. The more user-friendly one (for most people, we suppose) is to specify a matrix that contains $\mathrm{vec}(\beta)$ followed by $\mathrm{vec}(\alpha)$. For example:

    open denmark.gdt
    vecm 2 1 LRM LRY IBO IDE --rc --seasonals

    matrix BA = {1, -1, 6, -6, -6, -0.2, 0.1, 0.02, 0.03}
    set initvals BA
    restrict
      b[1] = 1
      b[1] + b[2] = 0
      b[3] + b[4] = 0
    end restrict

In this example, from Johansen (1995), the cointegration rank is 1 and there are 4 variables. However, the model includes a restricted constant (the --rc flag) so that $\beta$ has 5 elements. The $\alpha$ matrix has 4 elements, one per equation. So the matrix BA may be read as

$$(\beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \alpha_1, \alpha_2, \alpha_3, \alpha_4)$$

The other option, which is compulsory when using LBFGS, is to specify the initial values in terms of the free parameters, $\phi$ and $\psi$. Getting this right is somewhat less obvious. As mentioned above, the implicit-form restriction $R\,\mathrm{vec}(\beta) = q$ has explicit form $\mathrm{vec}(\beta) = H\phi + h_0$, where $H = R_\perp$, the right nullspace of R. The vector $\phi$ is shorter by
181. ements, it is numerically more efficient to group them into a matrix rather than invoking fft for each vector separately.

As an example, consider the multiplication of two polynomials:

$$a(x) = 1 + 0.5x$$
$$b(x) = 1 + 0.3x - 0.8x^2$$
$$c(x) = a(x) \cdot b(x) = 1 + 0.8x - 0.65x^2 - 0.4x^3$$

The coefficients of the polynomial c(x) are the convolution of the coefficients of a(x) and b(x); the following gretl code fragment illustrates how to compute the coefficients of c(x):

    # define the two polynomials
    a = { 1, 0.5, 0, 0 }'
    b = { 1, 0.3, -0.8, 0 }'
    # perform the transforms
    fa = fft(a)
    fb = fft(b)
    # complex-multiply the two transforms
    fc = cmult(fa, fb)
    # compute the coefficients of c via the inverse transform
    c = ffti(fc)

Maximum efficiency would have been achieved by grouping a and b into a matrix. The computational advantage is so little in this case that the exercise is a bit silly, but the following alternative may be preferable for a large number of rows/columns:

    # define the two polynomials
    a = { 1, 0.5, 0, 0 }'
    b = { 1, 0.3, -0.8, 0 }'
    # perform the transforms jointly
    f = fft(a ~ b)
    # complex-multiply the two transforms
    fc = cmult(f[,1:2], f[,3:4])
    # compute the coefficients of c via the inverse transform
    c = ffti(fc)

Traditionally, the Fourier transform in econometrics has been mostly used in time-series analysis, the periodogram being the best known example. Example script 5.4 shows how to compute the periodogram of a time series via the fft fun
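The convolution property used in the polynomial example can be checked outside gretl. The following is a minimal sketch in Python (standard library only), not gretl script: the helper names `dft` and `poly_multiply` are hypothetical, and a naive O(n^2) transform stands in for a real FFT. It zero-pads both coefficient vectors to the length of the product, multiplies the transforms element by element, and inverts.

```python
import cmath

def dft(values, inverse=False):
    """Naive discrete Fourier transform (illustration only, O(n^2))."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # the inverse transform carries the 1/n scaling
        out.append(total / n if inverse else total)
    return out

def poly_multiply(a, b):
    """Multiply polynomials given as coefficient lists, lowest power first."""
    n = len(a) + len(b) - 1                  # coefficients in the product
    fa = dft(list(a) + [0.0] * (n - len(a)))  # zero-pad to length n
    fb = dft(list(b) + [0.0] * (n - len(b)))
    fc = [x * y for x, y in zip(fa, fb)]      # element-wise complex product
    return [round(z.real, 10) for z in dft(fc, inverse=True)]

# a(x) = 1 + 0.5x, b(x) = 1 + 0.3x - 0.8x^2
c = poly_multiply([1, 0.5], [1, 0.3, -0.8])
print(c)
```

The printed coefficients should agree with c(x) = 1 + 0.8x - 0.65x^2 - 0.4x^3 up to rounding.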
182.
    end function

    nulldata 10
    set echo off
    set messages off
    X = zeros(2000,2000)
    r = 0

    set stopwatch
    loop 100
      r = a(X)
    end loop
    fa = $stopwatch

    set stopwatch
    loop 100
      r = b(&X)
    end loop
    fb = $stopwatch

    printf "Elapsed time:\n\
    \twithout pointers (copy) = %g seconds,\n\
    \twith pointers (no copy) = %g seconds.\n", fa, fb

where $i gets the name of the variable at position i in the list, and sd_$i gets its standard deviation. But inside a function, working on a list supplied as an argument, if we want to reference an individual variable in the list we must use the syntax listname.varname. Hence in the example above we write sd(X.$i).

This is necessary to avoid possible collisions between the name-space of the function and the name-space of the caller script. For example, suppose we have a function that takes a list argument, and that defines a local variable called y. Now suppose that this function is passed a list containing a variable named y. If the two name-spaces were not separated either we'd get an error, or the external variable y would be silently over-written by the local one. It is important, therefore, that list-argument variables should not be "visible" by name within functions. To "get hold of" such variables you need to use the form of identification just mentioned: the name of the list, followed by a dot, followed by the name of the variable.

The treatment of list-argument variables described ab
183. ending dates should be given in the form YYYY/MM/DD. This format must be respected exactly.

Optionally, the first line of the index file may contain a short comment (up to 64 characters) on the source and nature of the data, following a hash mark. For example:

    # Federal Reserve Board (interest rates)

The corresponding binary database file holds the data values, represented as "floats", that is, single-precision floating-point numbers, typically taking four bytes apiece. The numbers are packed "by variable", so that the first n numbers are the observations of variable 1, the next m the observations on variable 2, and so on.

Appendix B

Data import via ODBC

Since version 1.7.5, gretl provides a method for retrieving data from databases which support the ODBC standard. Most users won't be interested in this, but there may be some for whom this feature matters a lot: typically, those who work in an environment where huge data collections are accessible via a Data Base Management System (DBMS).

ODBC is the de facto standard for interacting with such systems. In the next section we provide some background information on how ODBC works. What you actually need to do to have gretl retrieve data from a database is explained in section B.2.

B.1 ODBC base concepts

ODBC is short for Open DataBase Connectivity, a group of software methods that enable a client to interact with a database server. The most common operation is when the client fetch
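The "packed by variable" layout of the binary file can be illustrated with a short sketch. This is Python using only the standard `struct` module, not gretl code; the function names are hypothetical and the little-endian byte order (`<`) is an assumption made for the example, since the text above does not specify byte order.

```python
import struct

def pack_series(series_list):
    """Write each series in turn as 4-byte floats: all of variable 1,
    then all of variable 2, and so on (byte order '<' is an assumption)."""
    blob = b""
    for obs in series_list:
        blob += struct.pack("<%df" % len(obs), *obs)
    return blob

def unpack_series(blob, lengths):
    """Recover the series, given the number of observations of each one
    (in a real database this would come from the index file)."""
    out, offset = [], 0
    for n in lengths:
        out.append(list(struct.unpack_from("<%df" % n, blob, offset)))
        offset += 4 * n          # four bytes per single-precision value
    return out

# variable 1 has n = 3 observations, variable 2 has m = 2
blob = pack_series([[1.5, 2.25, 3.0], [10.0, 20.5]])
print(len(blob))                     # 4 bytes * (3 + 2) values
print(unpack_series(blob, [3, 2]))
```

Because values are stored in single precision, numbers that are not exactly representable in four bytes will come back slightly rounded.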
184. ent expressions, separated by semicolons and surrounded by parentheses. The three components are as follows:

1. Initialization: this is evaluated only once, at the start of the loop. Common example: setting a scalar control variable to some starting value.

2. Continuation condition: this is evaluated at the top of each iteration (including the first). If the expression evaluates as true (non-zero), iteration continues, otherwise it stops. Common example: an inequality expressing a bound on a control variable.

3. Modifier: an expression which modifies the value of some variable. This is evaluated prior to checking the continuation condition, on each iteration after the first. Common example: a control variable is incremented or decremented.

Here's a simple example:

    loop for (r=0.01; r<.991; r+=.01)

In this example the variable r will take on the values 0.01, 0.02, ..., 0.99 across the 99 iterations. Note that due to the finite precision of floating point arithmetic on computers it may be necessary to use a continuation condition such as the above, r<.991, rather than the more "natural" r<=.99. (Using double-precision numbers on an x86 processor, at the point where you would expect r to equal 0.99 it may in fact have value 0.990000000000001.)

Any or all of the three expressions governing a for loop may be omitted; the minimal form is (; ;). If the continuation test is omitted it is implicitly true
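The floating-point caveat is not specific to gretl. The sketch below is Python, with a hypothetical helper `loop_values` that mimics the three-part for loop using repeated addition; it shows why a bound with a small margin (.991) gives a predictable iteration count.

```python
def loop_values(limit, start=0.01, step=0.01):
    """Mimic 'for (r=start; r<limit; r+=step)' and collect the values of r."""
    values = []
    r = start
    while r < limit:       # continuation condition, checked before each pass
        values.append(r)
        r += step          # modifier: accumulates rounding error
    return values

vals = loop_values(0.991)
print(len(vals))           # number of iterations with the "safe" bound
print(repr(vals[-1]))      # last value of r: close to, not necessarily equal to, 0.99
```

With the margin, the accumulated rounding error (on the order of 1e-15) cannot change the number of iterations; with a bound placed exactly at .99 the outcome would depend on which side of 0.99 the accumulated value happens to fall.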
185. entry "Start GNU R". This writes out an R version of the current gretl data set (in the user's gretl directory), and sources it into a new R session. The particular way R is invoked depends on the internal gretl variable Rcommand, whose value may be set under the "Tools / Preferences" menu. The default command is RGui.exe under MS Windows. Under X it is xterm -e R. Please note that at most three space-separated elements in this command string will be processed; any extra elements are ignored.

Appendix F

Listing of URLs

Below is a listing of the full URLs of websites mentioned in the text.

Estima (RATS): http://www.estima.com/
FFTW3: http://www.fftw.org/
Gnome desktop homepage: http://www.gnome.org/
GNU Multiple Precision (GMP) library: http://swox.com/gmp/
GNU Octave homepage: http://www.octave.org/
GNU R homepage: http://www.r-project.org/
GNU R manual: http://cran.r-project.org/doc/manuals/R-intro.pdf
Gnuplot homepage: http://www.gnuplot.info/
Gnuplot manual: http://ricardo.ecn.wfu.edu/gnuplot.html
Gretl data page: http://gretl.sourceforge.net/gretl_data.html
Gretl homepage: http://gretl.sourceforge.net/
GTK+ homepage: http://www.gtk.org/
GTK+ port for win32: http://www.gimp.org/~tml/gimp/win32/
Gtkextra homepage: http://gtkextra.sourceforge.net/
InfoZip homepage: http://www.info-zip.org/pub/infozip/zlib/
JMulTi homepage: http://www.jmulti.de/
JRSoftware: http://www.jrsoftware.org/
Mingw (gcc for win32) homepage: http ww ming
186. ents match the types specified in the definition of the function. An error is flagged if either of these conditions is violated. One qualification: allowance is made for omitting arguments at the end of the list, provided that default values are specified in the function definition. To be precise, the check is that the number of arguments is at least equal to the number of required parameters, and is no greater than the total number of parameters.

A scalar, series or matrix argument to a function may be given either as the name of a pre-existing variable or as an expression which evaluates to a variable of the appropriate type. Scalar arguments may also be given as numerical values. List arguments must be specified by name.

The following trivial example illustrates a function call that correctly matches the function definition.

    # function definition
    function ols_ess (series y, list xvars)
      ols y 0 xvars --quiet
      scalar myess = $ess
      printf "ESS = %g\n", myess
      return scalar myess
    end function
    # main script
    open data4-1
    list xlist = 2 3 4
    # function call (the return value is ignored here)
    ols_ess(price, xlist)

The function call gives two arguments: the first is a data series specified by name and the second is a named list of regressors. Note that while the function offers the variable myess as a return value, it is ignored by the caller in this instance. (As a side note here, if you want a function to cal
187. equency series as many times as required. For example, suppose we have a quarterly series with the value 35.5 in 1990:1, the first quarter of 1990. On expansion to monthly, the value 35.5 will be assigned to the observations for January, February and March of 1990. The expanded variable is therefore useless for fine-grained time-series analysis, outside of the special case where you know that the variable in question does in fact remain constant over the sub-periods.

When the current data frequency is appropriate, gretl offers both "Compact data" and "Expand data" options under the "Data" menu. These options operate on the whole data set, compacting or expanding all series. They should be considered "expert" options and should be used with caution.

Panel data

Panel data are inherently three dimensional, the dimensions being variable, cross-sectional unit, and time-period. For example, a particular number in a panel data set might be identified as the observation on capital stock for General Motors in 1980. (A note on terminology: we use the terms "cross-sectional unit", "unit" and "group" interchangeably below to refer to the entities that compose the cross-sectional dimension of the panel. These might, for instance, be firms, countries or persons.)

For representation in a textual computer file (and also for gretl's internal calculations) the three dimensions must somehow be flattened into two. This "flattening" inv
188. er, On database server"). You can browse these remotely; you also have the option of installing them onto your own computer. The initial remote databases window has an item showing, for each file, whether it is already installed locally (and if so, if the local version is up to date with the version at Wake Forest).

Assuming you have managed to open a database, you can import selected series into gretl's workspace by using the "Series / Import" menu item in the database window, or via the popup menu that appears if you click the right mouse button, or by dragging the series into the program's main window.

Creating a gretl data file independently

It is possible to create a data file in one or other of gretl's own formats using a text editor or software tools such as awk, sed or perl. This may be a good choice if you have large amounts of data already in machine readable form. You will, of course, need to study the gretl data formats (XML format or "traditional" format) as described in Appendix A.

4.5 Structuring a dataset

Once your data are read by gretl, it may be necessary to supply some information on the nature of the data. We distinguish between three kinds of datasets:

1. Cross section

2. Time series

3. Panel data

The primary tool for doing this is the "Data / Dataset structure" menu entry in the graphical interface, or the setobs command for scripts and the command-line interface.

Cross sectional data

By a cross
189. er 10) so as to extend gretl's capabilities in a modular and flexible way.

As an example, we will take a simple case of a model that gretl does not yet provide natively: the zero-inflated Poisson model, or ZIP for short. In this model, we assume that we observe a mixed population: for some individuals, the variable $y_t$ is (conditionally on a vector of exogenous covariates $x_t$) distributed as a Poisson random variate; for some others, $y_t$ is identically 0. The trouble is, we don't know which category a given individual belongs to.

For instance, suppose we have a sample of women, and the variable $y_t$ represents the number of children that woman t has. There may be a certain proportion, $\alpha$, of women for whom $y_t = 0$ with certainty (maybe out of a personal choice, or due to physical impossibility). But there may be other women for whom $y_t = 0$ just as a matter of chance: they haven't happened to have any children at the time of observation.

In formulae:

$$P(y_t = k \mid x_t) = \alpha d_t + (1 - \alpha) \left[ e^{-\mu_t} \frac{\mu_t^{y_t}}{y_t!} \right]$$

$$\mu_t = \exp(x_t \beta)$$

$$d_t = \begin{cases} 1 & \text{for } y_t = 0 \\ 0 & \text{for } y_t > 0 \end{cases}$$

Writing a mle block for this model is not difficult:

    mle ll = logprob
      series xb = exp(b0 + b1 * x)
      series d = (y=0)
      series poiprob = exp(-xb) * xb^y / gamma(y+1)
      series logprob = (alpha>0) && (alpha<1) ? \
        log(alpha*d + (1-alpha)*poiprob) : NA
      params alpha b0 b1
    end mle -v

However, the code above has to be modified each time we change our specification by, say, adding an explanatory variabl
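To make the mixture formula concrete, here is a minimal sketch of the zero-inflated Poisson probability and log-likelihood in Python (standard library only). It is an illustration of the formula, not gretl's mle machinery; the function names and the single-regressor form b0 + b1*x are assumptions chosen to mirror the mle block above.

```python
import math

def zip_prob(k, mu, alpha):
    """P(y = k): a point mass at zero with weight alpha, mixed with a
    Poisson(mu) probability with weight (1 - alpha)."""
    d = 1.0 if k == 0 else 0.0
    # Poisson probability computed in logs; lgamma(k + 1) equals log(k!)
    poisson = math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
    return alpha * d + (1.0 - alpha) * poisson

def zip_loglik(ys, xs, alpha, b0, b1):
    """Sum of log-likelihood contributions; NaN outside 0 < alpha < 1,
    mirroring the NA branch of the mle block."""
    if not (0.0 < alpha < 1.0):
        return float("nan")
    total = 0.0
    for y, x in zip(ys, xs):
        mu = math.exp(b0 + b1 * x)   # mu_t = exp(x_t * beta)
        total += math.log(zip_prob(y, mu, alpha))
    return total

# the probabilities over k = 0, 1, 2, ... should sum to (nearly) one
print(sum(zip_prob(k, 2.0, 0.3) for k in range(60)))
print(zip_loglik([0, 0, 1, 3], [0.1, 0.5, 0.2, 0.9], 0.3, 0.2, 0.5))
```

Working in logs for the Poisson term avoids overflow in the factorial for large counts.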
190. eries u1 is formed from the first column of the matrix U. For this operation to work, the matrix (or matrix selection) must be a vector with length equal to either the full length of the current dataset, n, or the length of the current sample range, n'. If n' < n then only n' elements are drawn from the matrix; if the matrix or selection comprises n elements, the n' values starting at element $t_1$ are used, where $t_1$ represents the starting observation of the sample range. Any values in the series that are not assigned from the matrix are set to the missing code.

12.10 Matrices and lists

To facilitate the manipulation of named lists of variables (see Chapter 11), it is possible to convert between matrices and lists. In section 12.1 above we mentioned the facility for creating a matrix from a list of variables, as in

    matrix M = { listname }

That formulation, with the name of the list enclosed in braces, builds a matrix whose columns hold the variables referenced in the list. What we are now describing is a different matter: if we say

    matrix M = listname

(without the braces), we get a row vector whose elements are the ID numbers of the variables in the list. This special case of matrix generation cannot be embedded in a compound expression. The syntax must be as shown above, namely simple assignment of a list to a matrix.

To go in the other direction, you can include a matrix on the right-hand side of an expression that defines a list
191. ernative is $p = 4(T/100)^{2/9}$, as in Wooldridge (2002b). In each case one takes the integer part of the result. These variants are labeled nw1 and nw2 respectively, in the context of the set command with the hac_lag parameter. That is, you can switch to the version given by Wooldridge with

    set hac_lag nw2

As shown in Table 14.1 the choice between nw1 and nw2 does not make a great deal of difference.

    T      p (nw1)   p (nw2)
    50     2         3
    100    3         4
    150    3         4
    200    4         4
    300    5         5
    400    5         5

    Table 14.1: HAC bandwidth: two rules of thumb

You also have the option of specifying a fixed numerical value for p, as in

    set hac_lag 6

In addition you can set a distinct bandwidth for use with the Quadratic Spectral kernel (since this need not be an integer). For example,

    set qs_bandwidth 3.5

Prewhitening and data-based bandwidth selection

An alternative approach is to deal with residual autocorrelation by attacking the problem from two sides. The intuition behind the technique known as VAR prewhitening (Andrews and Monahan, 1992) can be illustrated by a simple example. Let $x_t$ be a sequence of first-order autocorrelated random variables

$$x_t = \rho x_{t-1} + u_t$$

The long-run variance of $x_t$ can be shown to be

$$V_{LR}(x_t) = \frac{V_{LR}(u_t)}{(1 - \rho)^2}$$

In most cases, $u_t$ is likely to be less autocorrelated than $x_t$, so a smaller bandwidth should suffice. Estimation of $V_{LR}(x_t)$ can therefore proceed in three steps: (1) estimate $\rho$; (2) obtain a HAC esti
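The two rules of thumb are easy to tabulate. The sketch below is Python rather than gretl script, with hypothetical function names. The nw2 rule is the one stated above; the nw1 rule is taken here to be the integer part of 0.75 T^(1/3), which is an assumption on our part since that formula falls outside this passage, so the output is best read alongside Table 14.1.

```python
def hac_lag_nw1(T):
    """Assumed nw1 rule: integer part of 0.75 * T^(1/3)."""
    return int(0.75 * T ** (1.0 / 3.0))

def hac_lag_nw2(T):
    """nw2 rule from the text: integer part of 4 * (T/100)^(2/9)."""
    return int(4.0 * (T / 100.0) ** (2.0 / 9.0))

print("T     nw1  nw2")
for T in (50, 100, 150, 200, 300, 400):
    print("%-5d %-4d %d" % (T, hac_lag_nw1(T), hac_lag_nw2(T)))
```

Both rules grow slowly with T, which is the point of the comparison: over this range of sample sizes the two bandwidths never differ by more than one lag.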
192. erties hold under the alternative. The test itself is based on the following statistic:
    $\eta = \frac{\sum_{t=1}^{T} S_t^2}{T^2 \hat{\sigma}^2}$    (20.9)
where $S_t = \sum_{s=1}^{t} e_s$ and $\hat{\sigma}^2$ is an estimate of the long-run variance of $e_t = (y_t - \bar{y})$. Under the null, this statistic has a well-defined (nonstandard) asymptotic distribution, which is free of nuisance parameters and has been tabulated by simulation. Under the alternative, the statistic diverges. As a consequence, it is possible to construct a one-sided test based on $\eta$, where $H_0$ is rejected if $\eta$ is bigger than the appropriate critical value; gretl provides the 90%, 95%, 97.5% and 99% quantiles. Usage example:
    kpss m y
where m is an integer representing the bandwidth or window size used in the formula for estimating the long-run variance:
    $\hat{\sigma}^2(m) = \sum_{i=-m}^{m} \left(1 - \frac{|i|}{m+1}\right) \hat{\gamma}_i$
The $\hat{\gamma}_i$ terms denote the empirical autocovariances of $e_t$ from order $-m$ through $m$. For this estimator to be consistent, $m$ must be large enough to accommodate the short-run persistence of $e_t$, but not too large compared to the sample size $T$. In the GUI interface of gretl, this value defaults to the integer part of $4\left(\frac{T}{100}\right)^{1/4}$. The above concept can be generalized to the case where $y_t$ is thought to be stationary around a deterministic trend. In this case, formula (20.9) remains unchanged, but the series $e_t$ is defined as the residuals from an OLS regression of $y_t$ on a constant and a linear trend. This second form of the test is obtained by appending the --trend op
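To make the pieces of the KPSS statistic concrete, here is a minimal sketch in Python (not gretl script) of the level-stationarity case: partial sums of the de-meaned series, a Bartlett-weighted long-run variance with window m, and the ratio of the two. The names kpss_stat and default_bandwidth are hypothetical, and dividing the autocovariances by T (rather than by T minus the lag) is an assumption of this sketch, not a statement about gretl's internals.

```python
def kpss_stat(y, m):
    """KPSS statistic for stationarity around a constant, window size m."""
    T = len(y)
    ybar = sum(y) / T
    e = [v - ybar for v in y]          # e_t = y_t - mean(y)

    # Partial sums S_t = e_1 + ... + e_t
    S, running = [], 0.0
    for v in e:
        running += v
        S.append(running)

    def gamma(i):
        """Empirical autocovariance of e at lag |i| (divisor T: an assumption)."""
        i = abs(i)
        return sum(e[t] * e[t - i] for t in range(i, T)) / T

    # Long-run variance with Bartlett weights 1 - |i|/(m+1)
    lrv = sum((1.0 - abs(i) / (m + 1.0)) * gamma(i) for i in range(-m, m + 1))
    return sum(s * s for s in S) / (T * T * lrv)

def default_bandwidth(T):
    """Integer part of 4 * (T/100)^(1/4), the GUI default described above."""
    return int(4.0 * (T / 100.0) ** 0.25)

print(kpss_stat([1.0, -1.0, 1.0, -1.0], 0))
print(default_bandwidth(100))
```

For the trend-stationary form one would replace the de-meaned series by the residuals from a regression on a constant and a linear trend, leaving the rest unchanged.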
193. es in analytical derivatives mode, two were due to non-convergence of the Levenberg-Marquardt algorithm after the maximum number of iterations (on MGH09 and Bennett5, both described by NIST as of "Higher difficulty") and two were due to generation of range errors (out-of-bounds floating point values) when computing the Jacobian (on BoxBOD and MGH17, described as of "Higher difficulty" and "Average difficulty" respectively). The additional failure in numerical approximation mode was on MGH10 ("Higher difficulty", maximum number of iterations reached). The table gives information on several aspects of the tests: the number of outright failures, the average number of iterations taken to produce a solution, and two sorts of measure of the accuracy of the estimates for both the parameters and the standard errors of the parameters. For each of the 54 runs in each mode, if the run produced a solution the parameter estimates obtained by gretl were compared with the NIST certified values. We define the "minimum correct figures" for a given run as the number of significant figures to which the least accurate gretl estimate agreed with the certified value, for that run. The table shows both the average and the worst-case value of this variable across all the runs that produced a solution. The same information is shown for the estimated standard errors. The second measure of accuracy shown is the percentage of cases, taking into account all p
194. es some data from the server. ODBC acts as an intermediate layer between client and server, so the client "talks" to ODBC rather than accessing the server directly (see Figure B.1).
Figure B.1: Retrieving data via ODBC
For the above mechanism to work, it is necessary that the relevant ODBC software is installed and working on the client machine (contact your DB administrator for details). At this point, the database (or databases) that the server provides will be accessible to the client as a data source with a specific identifier (a Data Source Name or DSN); in most cases, a username and a password are required to connect to the data source. Once the connection is established, the user sends a query to ODBC, which contacts the database manager, collects the results and sends them back to the user. The query is almost invariably formulated in a special language used for the purpose, namely SQL. We will not provide here an SQL tutorial (there are many such tutorials on the Net); besides, each database manager tends to support its own SQL dialect, so the precise form of an SQL query may vary slightly if the DBMS on the other end is Oracle, MySQL, PostgreSQL or something else. Suffice it to say that the main statement for retrieving data is the SELECT statement. Within a DBMS, data are organized in tables, which are roughly equivalent to spreadsheets. The SELECT statement returns a subset of a table, which is itself a
195. estimation strategies, such as GARCH (see chapter 20).
[Chapter 14, Robust covariance matrix estimation, page 105]
Despite the points made above, some residual degree of heteroskedasticity may be present in time series data: the key point is that in most cases it is likely to be combined with serial correlation (autocorrelation), hence demanding a special treatment. In White's approach, $\hat{\Omega}$, the estimated covariance matrix of the $u_t$, remains conveniently diagonal: the variances, $E(u_t^2)$, may differ by $t$ but the covariances, $E(u_t u_s)$, are all zero. Autocorrelation in time series data means that at least some of the off-diagonal elements of $\hat{\Omega}$ should be non-zero. This introduces a substantial complication and requires another piece of terminology: estimates of the covariance matrix that are asymptotically valid in face of both heteroskedasticity and autocorrelation of the error process are termed HAC (heteroskedasticity and autocorrelation consistent). The issue of HAC estimation is treated in more technical terms in chapter 18. Here we try to convey some of the intuition at a more basic level. We begin with a general comment: residual autocorrelation is not so much a property of the data as a symptom of an inadequate model. Data may be persistent through time, and if we fit a model that does not take this aspect into account properly, we end up with a model with autocorrelated disturbances. Conversely, it is often possible to mitigate or even eliminate the probl
196. et initvals start
    arma 1 1 ; y
The specified matrix should have just as many parameters as the model: in the example above there are three parameters, since the model implicitly includes a constant. The constant, if present, is always given first; otherwise the order in which the parameters are expected is the same as the order of specification in the arma or arima command. In the example the constant is set to zero, $\phi_1$ to 0.85, and $\theta_1$ to 0.34. You can get gretl to revert to automatic initialization via the command
    set initvals auto
Estimation via X-12-ARIMA
As an alternative to estimating ARMA models using "native" code, gretl offers the option of using the external program X-12-ARIMA. This is the seasonal adjustment software produced and maintained by the U.S. Census Bureau; it is used for all official seasonal adjustments at the Bureau. Gretl includes a module which interfaces with X-12-ARIMA: it translates arma commands using the syntax outlined above into a form recognized by X-12-ARIMA, executes the program, and retrieves the results for viewing and further analysis within gretl. To use this facility you have to install X-12-ARIMA separately. Packages for both MS Windows and GNU/Linux are available from the gretl website, http://gretl.sourceforge.net/.
[Chapter 20, Time series models, page 151]
To invoke X-12-ARIMA as the estimation engine, append the flag --x-12-arima, as in
    arma p q ; y --x-12-arima
As with native estimation, the de
197. eter (obs-format) is absent, as in the above example, the SQL query should return exactly one column of data, which is used to fill the variable x sequentially. It may be necessary to include a smpl command before the data command to set up the right "window" for the incoming data. In addition, if one cannot assume that the data will be delivered in the correct order (typically, chronological order), the SQL query should contain an appropriate ORDER BY clause. The optional format string is used for those cases when there is no certainty that the data from the query will arrive in the same order as the gretl dataset. This may happen when missing values are interspersed within a column, or with data that do not have a natural ordering, e.g. cross-sectional data. In this case, the SQL statement should return a table with $n$ columns, where the first $n-1$ columns are used to identify which observation the value in the $n$-th column belongs to. The format string is used to translate the first $n-1$ fields into a string which matches the string gretl uses to identify observations in the currently open dataset. At present, $n$ should be between 2 and 4, which should cover most, if not all, cases. For example, consider the following fictitious case: we have a 5-days-per-week dataset, to which we want to add the stock index for the Verdurian market; it so happens that in Verduria Saturdays are working days but Wednesdays are not. We want a column which does not c
198. ethod (based on Boswijk, 1995), but two further options are available: the initialization may be adjusted using simulated annealing, or the user may supply an explicit initial value for $\theta$. The default initialization method is:
1. Calculate the unrestricted ML $\hat{\beta}$ using the Johansen procedure.
2. If the restriction on $\beta$ is non-homogeneous, use the method proposed by Boswijk (1995):
    $\phi_0 = -[(I_r \otimes \hat{\beta}_\perp)' H]^{+} (I_r \otimes \hat{\beta}_\perp)' h_0$    (21.9)
   where $\hat{\beta}_\perp' \hat{\beta} = 0$ and $A^{+}$ denotes the Moore-Penrose inverse of $A$. Otherwise
    $\phi_0 = (H'H)^{-1} H' \mathrm{vec}(\hat{\beta})$    (21.10)
3. $\mathrm{vec}(\beta_0) = H\phi_0 + h_0$.
4. Calculate the unrestricted ML $\hat{\alpha}$ conditional on $\beta_0$, as per Johansen:
    $\hat{\alpha} = S_{01}\beta_0 (\beta_0' S_{11} \beta_0)^{-1}$    (21.11)
5. If $\alpha$ is restricted by $\mathrm{vec}(\alpha') = G\psi$, then $\psi_0 = (G'G)^{-1} G' \mathrm{vec}(\hat{\alpha}')$ and $\mathrm{vec}(\alpha_0') = G\psi_0$.
Footnote 7: The exception is restrictions that are homogeneous, common to all $\beta$ or all $\alpha$ (in case $r > 1$), and involve either $\beta$ only or $\alpha$ only. Such restrictions are handled via the modified eigenvalues method set out by Johansen (1995). We solve directly for the ML estimator, without any need for iterative methods.
Footnote 8: In developing gretl's VECM testing facilities we have considered a fair number of "tricky cases" from various sources. We'd like to thank Luca Fanelli of the University of Bologna and Sven Schreiber of Goethe University Frankfurt for their help in devising torture-tests for gretl's VECM code.
[Chapter 21, Cointegration and Vector Error Correction Models, page 171]
Alternative initializat
199. f "%10.3f", Id
         1.000     0.000
         0.000     1.000
For presentation purposes you may wish to give titles to the columns of a matrix. For this you can use the colnames function: the first argument is a matrix and the second is either a named list of variables, whose names will be used as headings, or a string that contains as many space-separated substrings as the matrix has columns. For example,
    matrix M = mnormal(3,3)
    colnames(M, "foo bar baz")
    print M
    M (3 x 3)
           foo        bar        baz
        1.7102    0.76072   0.089406
       0.99780     1.9003    0.25123
       0.91762    0.39237     1.6114
12.13 Example: OLS using matrices
Example 12.4 shows how matrix methods can be used to replicate gretl's built-in OLS functionality.
Example 12.4: OLS via matrix methods
    open data4-1
    matrix X = { const, sqft }
    matrix y = { price }
    matrix b = invpd(X'X) * X'y
    print "estimated coefficient vector"
    b
    matrix u = y - X*b
    scalar SSR = u'u
    scalar s2 = SSR / (rows(X) - rows(b))
    matrix V = s2 * inv(X'X)
    V
    matrix se = sqrt(diag(V))
    print "estimated standard errors"
    se
    # compare with built-in function
    ols price const sqft --vcv
Chapter 13: Cheat sheet
This chapter explains how to perform some common (and some not so common) tasks in gretl's scripting language. Some but not all of the techniques listed here are also available through the graphical interface. Although the graphical interface may be more intuitive and less
200. f(X, 50, 30)
    Generated scalar p1 (ID 2) = 0.0111648
    genr p2 = pvalue(X, 50, 30)
    Generated scalar p2 (ID 3) = 0.988835
    genr test = 1 - p2
    Generated scalar test (ID 4) = 0.0111648
But the moral is that if you want to examine extreme values you should be careful in selecting the function you need, in the knowledge that values very close to zero can be represented as doubles, while values very close to 1 cannot.
5.7 Handling missing values
Four special functions are available for the handling of missing values. The boolean function missing() takes the name of a variable as its single argument; it returns a series with value 1 for each observation at which the given variable has a missing value, and value 0 otherwise (that is, if the given variable has a valid value at that observation). The function ok() is complementary to missing; it is just a shorthand for !missing (where ! is the boolean NOT operator). For example, one can count the missing values for variable x using
[Chapter 5, Special functions in genr, page 36]
    genr nmiss_x = sum(missing(x))
The function zeromiss(), which again takes a single series as its argument, returns a series where all zero values are set to the missing code. This should be used with caution (one does not want to confuse missing values and zeros) but it can be useful in some contexts. For example, one can determine the first valid observation for a variable x using
    genr time
    genr x0 = min(zeromiss(time * ok
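The logic of these helper functions can be mimicked outside gretl. The following is a minimal sketch in Python (not gretl script), assuming that a missing value is represented by None and that the "first valid observation" idiom multiplies a 1-based time index by the ok() indicator before applying zeromiss and taking the minimum; the Python functions merely borrow the gretl names for readability.

```python
def missing(x):
    """1 where the observation is missing (None here), 0 otherwise."""
    return [1 if v is None else 0 for v in x]

def ok(x):
    """Complement of missing(): 1 where the observation is valid."""
    return [1 - m for m in missing(x)]

def zeromiss(x):
    """Turn zeros into the missing code, leaving other values untouched."""
    return [None if v == 0 else v for v in x]

x = [None, None, 3.5, 0.0, None, 7.1]

# Count the missing values of x.
nmiss_x = sum(missing(x))

# First valid observation: time index times ok(x), zeros set to missing,
# then the minimum over the non-missing entries.
time = list(range(1, len(x) + 1))
masked = zeromiss([t * o for t, o in zip(time, ok(x))])
x0 = min(v for v in masked if v is not None)

print(nmiss_x, x0)
```

Note that the genuine zero at the fourth observation of x counts as valid: only the product of the time index and the indicator is passed through zeromiss, so data zeros are not confused with missing values.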
201. f new variables is by default suppressed when a function is being executed. If you want more verbose output from a particular function you can use either or both of the following commands within the function:
    set echo on
    set messages on
Alternatively, you can achieve this effect for all functions via the command set debug 1. Usually when you set the value of a state variable using the set command, the effect applies only to the current level of function execution. For instance, if you do set messages on within function f1, which in turn calls function f2, then messages will be printed for f1 but not f2. The debug variable, however, acts globally; all functions become verbose regardless of their level. Further, you can do set debug 2: in addition to command echo and the printing of messages, this is equivalent to setting max_verbose (which produces verbose output from the BFGS maximizer) at all levels of function execution.
10.5 Function packages
As of gretl 1.6.0, there is a mechanism to package functions and make them available to other users of gretl. Here is a walk-through of the process.
Load a function in memory
There are several ways to load a function:
• If you have a script file containing function definitions, open that file and run it.
• Create a script file from scratch. Include at least one function definition, and run the script.
• Open the GUI console and type a function definition interactively. This method is not partic
202. f the $\alpha_i$s using
    $\hat{\alpha}_i = \frac{1}{T} \sum_{t=1}^{T} (y_{it} - X_{it}\hat{\beta})$
These two methods (LSDV, and using de-meaned data) are numerically equivalent. Gretl takes the approach of de-meaning the data. If you have a small number of cross-sectional units, a large number of time-series observations per unit, and a large number of regressors, it is more economical in terms of computer memory to use LSDV. If need be you can easily implement this manually. For example,
    genr unitdum
    ols y x du_*
(See Chapter 5 for details on unitdum.) The $\hat{\alpha}_i$ estimates are not printed as part of the standard model output in gretl (there may be a large number of these, and typically they are not of much inherent interest). However you can retrieve them after estimation of the fixed effects model if you wish. In the graphical interface, go to the "Save" menu in the model window and select "per-unit constants". In command-line mode, you can do
    genr newname = $ahat
where newname is the name you want to give the series. For the random effects model we write $u_{it} = v_i + \varepsilon_{it}$, so the model becomes
    $y_{it} = X_{it}\beta + v_i + \varepsilon_{it}$    (15.3)
In contrast to the fixed effects model, the $v_i$s are not treated as fixed parameters, but as random drawings from a given probability distribution. The celebrated Gauss-Markov theorem, according to which OLS is the best linear unbiased estimator (BLUE), depends on the assumption that the error term is independently and identically distributed (IID). In the panel
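The de-meaning approach and the recovery of the per-unit constants can be traced through on a toy panel. The sketch below is in Python (not gretl script) and handles a single regressor only; the function name within_estimator and the made-up data are assumptions for illustration, not gretl's implementation.

```python
def within_estimator(y, x):
    """Fixed-effects slope from de-meaned data, plus per-unit constants.

    y and x map each unit identifier to its list of time-series
    observations (one regressor only, to keep the algebra scalar).
    """
    sxy = 0.0
    sxx = 0.0
    for i in y:
        ybar = sum(y[i]) / len(y[i])
        xbar = sum(x[i]) / len(x[i])
        for yt, xt in zip(y[i], x[i]):
            sxy += (xt - xbar) * (yt - ybar)   # cross-product of deviations
            sxx += (xt - xbar) ** 2            # squared deviations of x
    beta = sxy / sxx
    # alpha_i = (1/T) * sum_t (y_it - x_it * beta)
    alpha = {
        i: sum(yt - xt * beta for yt, xt in zip(y[i], x[i])) / len(y[i])
        for i in y
    }
    return beta, alpha

# Toy data generated without noise as y_it = alpha_i + 2 * x_it,
# with alpha = 1 for unit "a" and alpha = 5 for unit "b".
x = {"a": [1.0, 2.0, 3.0], "b": [2.0, 4.0, 9.0]}
y = {"a": [3.0, 5.0, 7.0], "b": [9.0, 13.0, 23.0]}

beta, alpha = within_estimator(y, x)
print(beta, alpha)
```

Because the toy data contain no noise, the slope and both constants are recovered exactly; regressing y on x and a full set of unit dummies would give the same numbers.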
203. f the variable "reasonably round", where this is taken to mean that they are all integer multiples of 0.25. If this criterion is met, we then ask whether the variable takes on a "fairly small" set of distinct values, where "fairly small" is defined as less than or equal to 8. If both conditions are satisfied, the variable is automatically considered discrete. To mark a variable as discrete you have two options.
1. From the graphical interface, select "Variable, Edit Attributes" from the menu. A dialog box will appear and, if the variable seems suitable, you will see a tick box labeled "Treat this variable as discrete". This dialog box can also be invoked via the context menu (right-click on a variable) or by pressing the F2 key.
2. From the command-line interface, via the discrete command. The command takes one or more arguments, which can be either variables or lists of variables. For example:
    list xlist = x1 x2 x3
    discrete z1 xlist z2
This syntax makes it possible to declare as discrete many variables at once, which cannot presently be done via the graphical interface. The switch --reverse reverses the declaration of a variable as discrete, or in other words marks it as continuous. For example:
    discrete foo            # now foo is discrete
    discrete foo --reverse  # now foo is continuous
The command-line variant is more powerful, in that you can mark a variable as discrete even if it does not seem to be suitable for this treatment
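The two-part heuristic described above is simple enough to sketch. The following Python fragment (not gretl code) assumes missing values are represented by None and ignored, and that an empty series is not treated as discrete; both choices, and the name looks_discrete, are assumptions of the sketch rather than documented gretl behavior.

```python
def looks_discrete(values, step=0.25, max_distinct=8):
    """Apply the two checks: 'reasonably round' and 'fairly small' support."""
    vals = [v for v in values if v is not None]   # ignore missing values
    if not vals:
        return False                              # assumption: nothing to judge
    # Check 1: every value is an integer multiple of `step`.
    reasonably_round = all(
        abs(v / step - round(v / step)) < 1e-9 for v in vals
    )
    # Check 2: the number of distinct values is at most `max_distinct`.
    fairly_small = len(set(vals)) <= max_distinct
    return reasonably_round and fairly_small

print(looks_discrete([1, 2, 2, 3, 1.5]))    # round values, few distinct
print(looks_discrete([0.1, 0.2, 0.1]))      # not multiples of 0.25
print(looks_discrete(list(range(20))))      # round, but too many distinct values
```

A series that fails this screen can still be declared discrete explicitly with the discrete command.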
204. f1 JMulTi data files. When you import data from the ASCII or CSV formats, gretl opens a "diagnostic" window, reporting on its progress in reading the data. If you encounter a problem with ill-formatted data, the messages in this window should give you a handle on fixing the problem. As of version 1.7.5, gretl also offers ODBC connectivity. Be warned: this is a recent feature meant for somewhat advanced users; it may still have a few rough edges and there is no GUI interface for this yet. Interested readers will find more info in appendix B. For the convenience of anyone wanting to carry out more complex data analysis, gretl has a facility for writing out data in the native formats of GNU R, Octave, JMulTi and PcGive (see Appendix E). In the GUI client this option is found under the "File, Export data" menu; in the command-line client use the store command with the appropriate option flag.
4.3 Binary databases
For working with large amounts of data, gretl is supplied with a database-handling routine. A database, as opposed to a data file, is not read directly into the program's workspace. A database can contain series of mixed frequencies and sample ranges. You open the database and select series to import into the working dataset. You can then save those series in a native format data file if you wish. Databases can be accessed via gretl's menu it
Footnote 1: See http://www.ecn.wfu.edu/eviews_format/.
[Chapter 4, Data files, page 20]
205. fault is to use exact ML but there is the option of using conditional ML with the --conditional flag. However, please note that when X-12-ARIMA is used in conditional ML mode, the comments above regarding the variant treatments of the mean of the process $y_t$ do not apply. That is, when you use X-12-ARIMA the model that is estimated is (20.2), regardless of whether estimation is by exact ML or conditional ML.
Forecasting
ARMA models are often used for forecasting purposes. The autoregressive component, in particular, offers the possibility of forecasting a process "out of sample" over a substantial time horizon. Gretl supports forecasting on the basis of ARMA models using the method set out by Box and Jenkins (1976). The Box and Jenkins algorithm produces a set of integrated AR coefficients which take into account any differencing of the dependent variable (seasonal and/or non-seasonal) in the ARIMA context, thus making it possible to generate a forecast for the level of the original variable. By contrast, if you first difference a series manually and then apply ARMA to the differenced series, forecasts will be for the differenced series, not the level. This point is illustrated in Example 20.1. The parameter estimates are identical for the two models. The forecasts differ but are mutually consistent: the variable fcdiff emulates the ARMA forecast (static, one step ahead within the sample range, and dynamic out of sample).
20.3 Unit root tests
206. fferenced before performing the analysis the model is known as ARIMA ("I" for Integrated); for this reason, gretl provides the arima command as an alias for arma. Seasonal differencing is handled similarly, with the syntax
    arma p d q ; P D Q ; y
where D is the order for seasonal differencing. Thus, the command
    arma 1 0 0 ; 1 1 1 ; y
would produce the same parameter estimates as
    genr dsy = sdiff(y)
    arma 1 0 ; 1 1 ; dsy
where we use the sdiff function to create a seasonal difference (e.g. for quarterly data, $y_t - y_{t-4}$).
Estimation
The default estimation method for ARMA models is exact maximum likelihood estimation (under the assumption that the error term is normally distributed), using the Kalman filter in conjunction with the BFGS maximization algorithm. The gradient of the log-likelihood with respect to the parameter estimates is approximated numerically. This method produces results that are directly comparable with many other software packages. The constant, and any exogenous variables, are treated as in equation (20.2). The covariance matrix for the parameters is computed using a numerical approximation to the Hessian at convergence. The alternative method, invoked with the --conditional switch, is conditional maximum likelihood (CML), also known as "conditional sum of squares" (see Hamilton 1994, p. 132). This method was exemplified in the script 9.3, and only a brief description will be given here. Given a sample of size $T$, the
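The seasonal difference used in the equivalence above is just a lag-$s$ subtraction. As a minimal illustration in Python (not gretl script), assuming a periodicity argument s that gretl would normally take from the dataset, and None standing in for the missing code at the start of the sample:

```python
def sdiff(y, s):
    """Seasonal difference y_t - y_{t-s}; the first s values are missing."""
    return [None] * s + [y[t] - y[t - s] for t in range(s, len(y))]

# Two years of made-up quarterly observations.
y = [1, 2, 3, 4, 6, 8, 10, 12]
print(sdiff(y, 4))
```

The differenced series loses its first $s$ observations, which is why estimation on a manually differenced series starts later in the sample than the data themselves.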
207. following the link "Libgretl API docs" on the gretl home page. People interested in the gretl development are welcome to subscribe to the gretl-devel mailing list. If you prefer to compile your own (or are using a unix system for which pre-built packages are not available), instructions on building gretl can be found in Appendix C.
MS Windows
The MS Windows version comes as a self-extracting executable. Installation is just a matter of downloading gretl_install.exe and running this program. You will be prompted for a location to install the package.
Footnote 1: In this manual we use "Linux" as shorthand to refer to the GNU/Linux operating system. What is said herein about Linux mostly applies to other unix-type systems too, though some local modifications may be needed.
[Chapter 1, Introduction, page 3]
Updating
If your computer is connected to the Internet, then on start-up gretl can query its home website at Wake Forest University to see if any program updates are available; if so, a window will open up informing you of that fact. If you want to activate this feature, check the box marked "Tell me about gretl updates" under gretl's "Tools, Preferences, General" menu. The MS Windows version of the program goes a step further: it tells you that you can update gretl automatically if you wish. To do this, follow the instructions in the popup window: close gretl then run the program titled "gretl updater" (you should find this along with
208. g length or bandwidth, $p$, of the HAC estimator? And how exactly are the weights $w_j$ to be determined? We will return to the (difficult) question of the bandwidth shortly. As regards the weights, Gretl offers three variants. The default is the Bartlett kernel, as used by Newey and West. This sets
    $w_j = 1 - \frac{j}{p+1}$ for $j \le p$, and $w_j = 0$ for $j > p$
so the weights decline linearly as $j$ increases. The other two options are the Parzen kernel and the Quadratic Spectral (QS) kernel. For the Parzen kernel,
    $w_j = 1 - 6a_j^2 + 6a_j^3$ for $0 \le a_j \le 0.5$
    $w_j = 2(1 - a_j)^3$ for $0.5 < a_j \le 1$
    $w_j = 0$ for $a_j > 1$
where $a_j = j/(p+1)$, and for the QS kernel,
    $w_j = \frac{25}{12\pi^2 d_j^2} \left( \frac{\sin m_j}{m_j} - \cos m_j \right)$
where $d_j = j/p$ and $m_j = 6\pi d_j/5$. Figure 14.1 shows the weights generated by these kernels, for $p = 4$ and $j$ = 1 to 9.
Figure 14.1: Three HAC kernels (Bartlett, Parzen, QS)
In gretl you select the kernel using the set command with the hac_kernel parameter:
    set hac_kernel parzen
    set hac_kernel qs
    set hac_kernel bartlett
Selecting the HAC bandwidth
The asymptotic theory developed by Newey, West and others tells us in general terms how the HAC bandwidth, $p$, should grow with the sample size, $T$; that is, $p$ should grow in proportion to some fractional power of $T$. Unfortunately this is of little help to the applied econometrician, working with a given dataset of fixed size. Various rules of thumb have been suggested, and gretl implements two such. The default is $p = 0.75\,T^{1/3}$, as recommended by Stock and Watson (2003). An alt
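The weights plotted in Figure 14.1 can be tabulated directly from the three kernel definitions. The following is a minimal sketch in Python (not gretl script), assuming the formulas as written above with $a_j = j/(p+1)$, $d_j = j/p$ and $m_j = 6\pi d_j/5$; the function names are chosen here for illustration, and the QS function assumes $j \ge 1$ so that $d_j$ is never zero.

```python
import math

def bartlett(j, p):
    """Bartlett weight: linear decline, zero beyond lag p."""
    return 1.0 - j / (p + 1.0) if j <= p else 0.0

def parzen(j, p):
    """Parzen weight, defined piecewise in a_j = j/(p+1)."""
    a = j / (p + 1.0)
    if a <= 0.5:
        return 1.0 - 6.0 * a ** 2 + 6.0 * a ** 3
    if a <= 1.0:
        return 2.0 * (1.0 - a) ** 3
    return 0.0

def qs(j, p):
    """Quadratic Spectral weight (j >= 1 assumed; p need not be an integer)."""
    d = j / float(p)
    m = 6.0 * math.pi * d / 5.0
    return 25.0 / (12.0 * math.pi ** 2 * d ** 2) * (math.sin(m) / m - math.cos(m))

# Weights for p = 4 and j = 1..9, the configuration used in the figure.
p = 4
for j in range(1, 10):
    print(j, round(bartlett(j, p), 4), round(parzen(j, p), 4), round(qs(j, p), 4))
```

Unlike the other two kernels, the QS weights do not cut off at a finite lag, which is why the QS bandwidth is allowed to be non-integer.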
209. ges.
6. Open this manual in PDF format.
7. Open the help item for script commands syntax (i.e. a listing with details of all available commands).
8. Open the dialog box for defining a graph.
9. Open the dialog box for estimating a model using ordinary least squares.
10. Open a window listing the sample datasets supplied with gretl, and any other data file collections that have been installed.
Chapter 3: Modes of working
3.1 Command scripts
As you execute commands in gretl, using the GUI and filling in dialog entries, those commands are recorded in the form of a "script" or batch file. Such scripts can be edited and re-run, using either gretl or the command-line client, gretlcli. To view the current state of the script at any point in a gretl session, choose "Command log" under the Tools menu. This log file is called session.inp and it is overwritten whenever you start a new session. To preserve it, save the script under a different name. Script files will be found most easily, using the GUI file selector, if you name them with the extension ".inp". To open a script you have written independently, use the "File, Script files" menu item; to create a script from scratch use the "File, Script files, New script" item or the "new script" toolbar button. In either case a script window will open (see Figure 3.1).
[Figure 3.1 script window, gretl: mrw.inp, beginning "Replicate Table 1, Estimation of the Textbook Solow model, in Mankiw, Romer a"]
210. hall be known within the function. An example follows:
    function myfunc (series y, list xvars, bool verbose)
Each of the type-specifiers, with the exception of list and string, may be modified by prepending an asterisk to the associated parameter name, as in
    function myfunc (series *y, scalar *b)
The meaning of this modification is explained below (see section 10.4); it is related to the use of pointer arguments in the C programming language.
Function parameters: optional refinements
Besides the required elements mentioned above, the specification of a function parameter may include some additional fields.
[Chapter 10, User-defined functions, page 63]
For a parameter of type scalar or int, a minimum, maximum and default value may be specified. These values should directly follow the name of the parameter, enclosed in square brackets and with the individual elements separated by colons. For example, suppose we have an integer parameter order for which we wish to specify a minimum of 1, a maximum of 12, and a default of 4. We can write
    int order[1:12:4]
If you wish to omit any of the three specifiers, leave the corresponding field empty. For example [1::4] would specify a minimum of 1 and a default of 4 while leaving the maximum unlimited. For a parameter of type bool, you can specify a default of 1 (true) or 0 (false), as in
    bool verbose[0]
Finally, for a parameter of any type you can append a short descriptive string. This will show
211. he Engel data there are two issues to consider. First, Engel's famous "law" claims an income-elasticity of food consumption that is less than one, and talk of elasticities suggests a logarithmic formulation of the model. Second, there are two apparently anomalous observations in the data set: household 105 has the third-highest income but unexpectedly low expenditure on food (as judged from a simple scatter plot), while household 138 (which also has unexpectedly low food consumption) has much the highest income, almost twice that of the next highest. With $n = 235$ it seems reasonable to consider dropping these observations. If we do so, and adopt a log-log formulation, we get the plot shown in Figure 23.2. The quantile estimates still cross the OLS estimate, but the "evidence against OLS" is much less compelling: the 90 percent confidence bands of the respective estimates overlap at all the quantiles considered.
23.5 Large datasets
As noted above, when you give the --intervals option with the quantreg command, which calls for estimation of confidence intervals via rank inversion, gretl switches from the default Frisch-Newton algorithm to the Barrodale-Roberts simplex method.
[Chapter 23, Quantile regression, page 187]
[Plot: coefficient on log(income) against tau; quantile estimates with 90% band, OLS estimate with 90% band]
212. he function $\hat{\theta}(Y)$; in general, it need not be so. In these circumstances, the maximum can be found by means of numerical techniques. These often rely on the fact that the log-likelihood is a smooth function of $\theta$, and therefore on the maximum its partial derivatives should all be 0. The gradient vector, or score vector, is a function that enjoys many interesting statistical properties in its own right; it will be denoted here as $g(\theta)$. It is a $k$-vector with typical element
    $g_i(\theta) = \frac{\partial \ell(\theta)}{\partial \theta_i} = \sum_{t=1}^{T} \frac{\partial \ell_t(\theta)}{\partial \theta_i}$
Footnote 1: We are supposing here that our data are a realization of continuous random variables. For discrete random variables, everything continues to apply by referring to the probability function instead of the density. In both cases, the distribution may be conditional on some exogenous variables.
[Chapter 17, Maximum likelihood estimation, page 124]
Gradient-based methods can be shortly illustrated as follows:
1. pick a point $\theta_0 \in \Theta$;
2. evaluate $g(\theta_0)$;
3. if $g(\theta_0)$ is "small", stop. Otherwise, compute a direction vector $d(g(\theta_0))$;
4. evaluate $\theta_1 = \theta_0 + d(g(\theta_0))$;
5. substitute $\theta_0$ with $\theta_1$;
6. restart from 2.
Many algorithms of this kind exist; they basically differ from one another in the way they compute the direction vector $d(g(\theta_0))$, to ensure that $\ell(\theta_1) > \ell(\theta_0)$ (so that we eventually end up on the maximum). The method gretl uses to maximize the log-likelihood is a gradient-based algorithm known as the B
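The six-step loop above can be written out for a problem small enough to check by hand. The sketch below is in Python (not gretl script) and uses plain fixed-step gradient ascent, which is deliberately simpler than the quasi-Newton method gretl actually employs; the model (normal observations with unit variance and unknown mean), the step size and the name maximize_mean_loglik are all assumptions made for illustration.

```python
def maximize_mean_loglik(y, theta0=0.0, tol=1e-8, max_iter=1000):
    """Maximize l(theta) = -0.5 * sum (y_t - theta)^2 by gradient ascent."""
    T = len(y)
    step = 0.5 / T                 # fixed step length (an arbitrary choice)
    theta = theta0                 # step 1: pick a starting point
    for _ in range(max_iter):
        # step 2: evaluate the score g(theta) = sum_t (y_t - theta)
        g = sum(v - theta for v in y)
        # step 3: stop if the gradient is "small"
        if abs(g) < tol:
            break
        # steps 3-5: direction d = step * g, then move to the new point
        theta = theta + step * g
        # step 6: the loop restarts from the gradient evaluation
    return theta

y = [1.0, 2.0, 3.0, 6.0]
print(maximize_mean_loglik(y))     # approaches the sample mean
```

With this step length each iteration halves the distance to the sample mean, so the stopping rule is reached after a few dozen iterations; more refined algorithms differ mainly in how cleverly they choose the direction and step.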
213. i (1996), which was adopted as a benchmark in the study of GARCH results by McCullough and Renfro (1998). It employs analytical first and second derivatives of the log-likelihood, and uses a mixed-gradient algorithm, exploiting the information matrix in the early iterations and then switching to the Hessian in the neighborhood of the maximum likelihood. (This progress can be observed if you append the --verbose option to gretl's garch command.)
Footnote 2: The algorithm is based on Fortran code deposited in the archive of the Journal of Applied Econometrics by the authors, and is used by kind permission of Professor Fiorentini.
[Chapter 20, Time series models, page 155]
Several options are available for computing the covariance matrix of the parameter estimates in connection with the garch command. At a first level, one can choose between a "standard" and a "robust" estimator. By default, the Hessian is used unless the --robust option is given, in which case the QML estimator is used. A finer choice is available via the set command, as shown in Table 20.2.
Table 20.2: Options for the GARCH covariance matrix
    command                   effect
    set garch_vcv hessian     Use the Hessian
    set garch_vcv im          Use the Information Matrix
    set garch_vcv op          Use the Outer Product of the Gradient
    set garch_vcv qml         QML estimator
    set garch_vcv bw          Bollerslev-Wooldridge "sandwich" estimator
It is not uncommon, when one estimates a GARCH model for an arbitrary time series, to find
214. ile contains data pertaining to a classic econometric "chestnut", the consumption function. The data window should now display the name of the current data file, the overall data range and sample range, and the names of the variables along with brief descriptive tags (see Figure 2.2).
Figure 2.2: Main window, with a practice data file open
OK, what can we do now? Hopefully the various menu options should be fairly self-explanatory. For now we'll dip into the Model menu; a brief tour of all the main window menus is given in Section 2.3 below. gretl's Model menu offers numerous econometric estimation routines. The simplest and most standard is Ordinary Least Squares (OLS). Selecting OLS pops up a dialog box calling for a model specification (see Figure 2.3).
Figure 2.3: Model specification dialog
To select the dependent variable, highlight the variable you want in the list on the left and click the "Choose" butt
215. imation to the relevant Jacobian to construct a covariance matrix for your estimates.

Another example is the delta method: if you have a consistent estimator of a vector of parameters $\hat\theta$, and a consistent estimate of its covariance matrix $\Sigma$, you may need to compute estimates for a nonlinear continuous transformation $\psi = g(\theta)$. In this case, a standard result in asymptotic theory is that

$$\sqrt{T}\,(\hat\theta - \theta_0) \xrightarrow{d} N(0, \Sigma) \;\Longrightarrow\; \sqrt{T}\,\bigl(g(\hat\theta) - g(\theta_0)\bigr) \xrightarrow{d} N(0, J \Sigma J'), \qquad J = \left.\frac{\partial g(x)}{\partial x}\right|_{x = \theta_0}$$

where $T$ is the sample size and $J$ is the Jacobian.

Chapter 5. Special functions in genr

Script 5.3 exemplifies such a case: the example is taken from Greene (2003), section 9.3.1. The slight differences between the results reported in the original source and what gretl returns are due to the fact that the Jacobian is computed numerically, rather than analytically as in the book.

5.10 The discrete Fourier transform

The discrete Fourier transform can be best thought of as a linear, invertible transform of a complex vector. Hence, if $x$ is an $n$-dimensional vector whose $k$-th element is $x_k = a_k + i b_k$, then the output of the discrete Fourier transform is a vector $f = F(x)$ whose $k$-th element is

$$f_k = \sum_{j=0}^{n-1} e^{-i\,\omega(j,k)}\, x_j$$

where $\omega(j,k) = 2\pi \frac{jk}{n}$. Since the transformation is invertible, the vector $x$ can be recovered from $f$ via the so-called inverse transform, $x = F^{-1}(f)$.

The Fourier transform is used in many diverse situations on account of this key property: the convolut
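The delta method described in this passage can be sketched numerically. The following is a minimal Python illustration, not gretl code: the function names, the central-difference step and the example values are all assumptions made for the sake of the example. It approximates the Jacobian $J$ by finite differences and forms $J \Sigma J'$, here taking the covariance matrix of $\hat\theta$ itself as input.

```python
def numerical_jacobian(g, theta, h=1e-6):
    """Central-difference Jacobian of g: R^k -> R^m, evaluated at theta."""
    m, k = len(g(theta)), len(theta)
    J = [[0.0] * k for _ in range(m)]
    for j in range(k):
        up = list(theta)
        dn = list(theta)
        up[j] += h
        dn[j] -= h
        g_up, g_dn = g(up), g(dn)
        for i in range(m):
            J[i][j] = (g_up[i] - g_dn[i]) / (2.0 * h)
    return J


def delta_method_cov(g, theta_hat, cov_theta):
    """Approximate covariance of g(theta_hat) as J * cov_theta * J'."""
    J = numerical_jacobian(g, theta_hat)
    m, k = len(J), len(theta_hat)
    # JS = J * cov_theta
    JS = [[sum(J[i][a] * cov_theta[a][b] for a in range(k)) for b in range(k)]
          for i in range(m)]
    # (J * cov_theta) * J'
    return [[sum(JS[i][b] * J[l][b] for b in range(k)) for l in range(m)]
            for i in range(m)]


def ratio(theta):
    """Hypothetical transformation: the ratio of two parameters."""
    return [theta[0] / theta[1]]


# Hypothetical estimates and covariance matrix, chosen only for illustration
theta_hat = [2.0, 4.0]
cov_theta = [[0.04, 0.01],
             [0.01, 0.09]]
cov_g = delta_method_cov(ratio, theta_hat, cov_theta)
```

For this transformation the analytical Jacobian is $[1/\theta_2,\; -\theta_1/\theta_2^2] = [0.25, -0.125]$, so the numerical result can be checked by hand; small discrepancies of the kind mentioned in the text arise from the finite-difference approximation.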
216. ime series objects are stored as CSV files, and can be retrieved by using gretl's append command. Matrices are stored in a special text format that is understood by gretl (see section 12.6); the file suffix is in this case .mat, and to read the matrix in gretl you must use the mread() function.

As an example, we take the airline data and use them to estimate a structural time series model à la Harvey (1989). The model we will use is the Basic Structural Model (BSM), in which a time series is decomposed into three terms:

$$y_t = \mu_t + \gamma_t + \varepsilon_t$$

where $\mu_t$ is a trend component, $\gamma_t$ is a seasonal component and $\varepsilon_t$ is a noise term. In turn, the following is assumed to hold:

$$\Delta\mu_t = \beta_{t-1} + \eta_t$$
$$\Delta\beta_t = \zeta_t$$
$$\Delta_s\gamma_t = \Delta\omega_t$$

where $\Delta_s$ is the seasonal differencing operator, $(1 - L^s)$, and $\eta_t$, $\zeta_t$ and $\omega_t$ are mutually uncorrelated white noise processes. The object of the analysis is to estimate the variances of the noise components (which may be zero) and to recover estimates of the latent processes $\mu_t$ (the "level"), $\beta_t$ (the "slope") and $\gamma_t$.

Gretl does not yet provide a command for estimating this class of models, so we will use R's StructTS command and import the results back into gretl. Once the bjg dataset is loaded in gretl, we pass the data to R and execute the following script:

    # extract the log series

Chapter 25. Gretl and R

    y <- gretldata[, "lg"]
    # estimate the model
    strmod <- StructTS(y)
    # save the fitted components (smoothed)
    compon
217. ing TeX. This is not the place for a detailed exposition of these matters, but here are a few pointers.

So far as we know, every GNU/Linux distribution has a package or set of packages for TeX, and in fact these are likely to be installed by default. Check the documentation for your distribution. For MS Windows, several packaged versions of TeX are available: one of the most popular is MiKTeX at http://www.miktex.org/. For Mac OS X a nice implementation is iTeXMac, at http://itexmac.sourceforge.net/. An essential starting point for online TeX resources is the Comprehensive TeX Archive Network (CTAN) at http://www.ctan.org/.

As for learning TeX, many useful resources are available both online and in print. Among online guides, Tony Roberts' "LaTeX: from quick and dirty to style and finesse" is very helpful, at http://www.sci.usq.edu.au/staff/robertsa/LaTeX/latexintro.html. An excellent source for advanced material is The LaTeX Companion (Goossens et al., 2004).

Chapter 25. Gretl and R

25.1 Introduction

R is by far the largest free statistical project. Like gretl, it is a GNU project and the two have a lot in common; however, gretl's approach focuses on ease of use much more than R, which instead aims to encompass the widest possible range of statistical procedures. As is natural in the free software ecosystem, we don't view ourselves as competitors to R, but rather as projects sharing a common goal who should support each other
218. inside the (possibly truncated) sample range for a regression, the result depends on the character of the dataset and the estimator chosen. In many cases, the program will automatically skip the missing observations when calculating the regression results. In this situation a message is printed stating how many observations were dropped. On the other hand, the skipping of missing observations is not supported for all procedures: exceptions include all autoregressive estimators, system estimators such as SUR, and nonlinear least squares. In the case of panel data, the skipping of missing observations is supported only if their omission leaves a balanced panel. If missing observations are found in cases where they are not supported, gretl gives an error message and refuses to produce estimates.

In case missing values in the middle of a dataset present a problem, the misszero function (use with care!) is provided under the genr command. By doing

    genr foo = misszero(bar)

you can produce a series foo which is identical to bar except that any missing values become zeros. Then

Chapter 4. Data files

you can use (carefully constructed) dummy variables to, in effect, drop the missing observations from the regression while retaining the surrounding sample range.

4.7 Maximum size of data sets

Basically, the size of data sets (both the number of variables and the number of observations per variable) is limited only by the characteristics of your computer. G
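The misszero idea described above can be illustrated outside gretl. The following Python sketch is only an analogy, under the assumption that missing values are represented by None: it zero-fills a series and builds one indicator (dummy) series per missing observation, which is one way of constructing the dummies the text refers to. The function names are hypothetical and are not gretl functions.

```python
def misszero(series):
    """Return a copy of the series with missing values (None) set to zero."""
    return [0.0 if v is None else v for v in series]


def missing_dummies(series):
    """Return one 0/1 dummy series per missing observation.

    Each dummy equals 1 at exactly one missing observation and 0 elsewhere.
    Including such dummies as regressors removes the influence of the
    zero-filled observations on the other coefficient estimates.
    """
    dummies = []
    for t, v in enumerate(series):
        if v is None:
            dummies.append([1 if s == t else 0 for s in range(len(series))])
    return dummies


# Hypothetical series with two missing observations
bar = [1.5, None, 2.0, 3.5, None, 4.0]
foo = misszero(bar)
dums = missing_dummies(bar)
```

Here foo is identical to bar except at the two missing observations, and dums holds two dummy series, one for each of them. The "use with care" warning applies: the zeros are placeholders, not data.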
219. invoke a popup menu which enables you to add a new variable (column), to add an observation (append a row at the foot of the sheet), or to insert an observation at the selected point (move the data down and insert a blank row).

Once you have entered data into the spreadsheet you import these into gretl's workspace using the spreadsheet's Apply changes button.

Please note that gretl's spreadsheet is quite basic and has no support for functions or formulas. Data transformations are done via the Add or Variable menus in the main gretl window.

Selecting from a database

Another alternative is to establish your dataset by selecting variables from a database.

Chapter 4. Data files

Begin with gretl's File, Databases menu item. This has four forks: Gretl native, RATS 4, PcGive and On database server.

You should be able to find the file fedstl.bin in the file selector that opens if you choose the Gretl native option; this file, which contains a large collection of US macroeconomic time series, is supplied with the distribution.

You won't find anything under RATS 4 unless you have purchased RATS data. If you do possess RATS data you should go into gretl's Tools, Preferences, General dialog, select the Databases tab, and fill in the correct path to your RATS files.

If your computer is connected to the internet you should find several databases (at Wake Forest University) und
220. ion 4.6. Numeric data are expected, but in the case of importing from ASCII/CSV, the program offers limited handling of character (string) data: if a given column contains character data only, consecutive numeric codes are substituted for the strings, and once the import is complete a table is printed showing the correspondence between the strings and the codes.

Dates (or observation labels): Optionally, the first column may contain strings such as dates, or labels for cross-sectional observations. Such strings have a maximum of 8 characters (as with variable names, longer strings will be truncated). A column of this sort should be headed with the string obs or date, or the first row entry may be left blank.

For dates to be recognized as such, the date strings must adhere to one or other of a set of specific formats, as follows. For annual data: 4-digit years. For quarterly data: a 4-digit year, followed by a separator (either a period, a colon, or the letter Q), followed by a 1-digit quarter. Examples: 1997.1, 2002:3, 1947Q1. For monthly data: a 4-digit year, followed by a period or a colon, followed by a two-digit month. Examples: 1997.01, 2002:10.

CSV files can use comma, space or tab as the column separator. When you use the Import CSV menu item you are prompted to specify the separator. In the case of Import ASCII the program attempts to auto-detect the separator that was used.

If you use a spreadsheet to prepare your data you are ab
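The date formats listed above are simple enough to express as patterns. The following Python sketch encodes the three rules as stated in the text (annual, quarterly, monthly); it is an illustration of the rules only and makes no claim about how gretl's own importer is implemented. The function name and the return convention are assumptions.

```python
import re


def parse_obs_label(label):
    """Classify a date string according to the formats described in the text.

    Returns (frequency, year, subperiod), with subperiod None for annual
    data, or None if the label matches none of the formats.
    """
    s = label.strip()
    # Annual: a 4-digit year
    if re.fullmatch(r"\d{4}", s):
        return ("annual", int(s), None)
    # Quarterly: 4-digit year, separator (period, colon or Q), 1-digit quarter
    m = re.fullmatch(r"(\d{4})[.:Q]([1-4])", s)
    if m:
        return ("quarterly", int(m.group(1)), int(m.group(2)))
    # Monthly: 4-digit year, period or colon, two-digit month
    m = re.fullmatch(r"(\d{4})[.:](\d{2})", s)
    if m and 1 <= int(m.group(2)) <= 12:
        return ("monthly", int(m.group(1)), int(m.group(2)))
    return None


# The examples given in the text
examples = ["1997.1", "2002:3", "1947Q1", "1997.01", "2002:10", "1997"]
parsed = [parse_obs_label(e) for e in examples]
```

Note that the number of digits after the separator is what distinguishes quarterly from monthly labels: 1997.1 is the first quarter of 1997, whereas 1997.01 is January 1997.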
221. ion methods. As mentioned above, gretl offers the option of adjusting the initialization using simulated annealing. This is invoked by adding the --jitter option to the restrict command.

The basic idea is this: we start at a certain point in the parameter space, and for each of $n$ iterations (currently $n = 4096$) we randomly select a new point within a certain radius of the previous one, and determine the likelihood at the new point. If the likelihood is higher, we jump to the new point; otherwise, we jump with probability $P$ (and remain at the previous point with probability $1 - P$). As the iterations proceed, the system gradually "cools": that is, the radius of the random perturbation is reduced, as is the probability of making a jump when the likelihood fails to increase.

In the course of this procedure many points in the parameter space are evaluated, starting with the point arrived at by the deterministic method, which we'll call $\theta_0$. One of these points will be "best" in the sense of yielding the highest likelihood: call it $\theta^*$. This point may or may not have a greater likelihood than $\theta_0$. And the procedure has an end point, $\theta_n$, which may or may not be "best".

The rule followed by gretl in selecting an initial value for $\theta$ based on simulated annealing is this: use $\theta^*$ if $\theta^* > \theta_0$, otherwise use $\theta_n$. That is, if we get an improvement in the likelihood via annealing, we make full use of this; on the other hand, if we fail to get an improv
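The annealing procedure and the selection rule just described can be sketched as follows. This is a hedged Python illustration, not gretl's implementation: the perturbation is drawn uniformly from a box rather than a ball, and the initial radius, the initial jump probability, the cooling factor and the toy log-likelihood are all assumed values that the text does not specify.

```python
import random


def anneal_start(loglik, theta0, n_iter=4096, radius=1.0, jump_prob=0.5,
                 cooling=0.999, seed=12345):
    """Simulated-annealing search for a starting value.

    radius, jump_prob and cooling are assumed settings, not gretl's.
    """
    rng = random.Random(seed)
    ll_start = loglik(theta0)
    current, ll_current = list(theta0), ll_start
    best, ll_best = list(theta0), ll_start
    for _ in range(n_iter):
        # Random new point near the previous one
        cand = [c + rng.uniform(-radius, radius) for c in current]
        ll_cand = loglik(cand)
        # Jump if the likelihood is higher, otherwise jump with probability P
        if ll_cand > ll_current or rng.random() < jump_prob:
            current, ll_current = cand, ll_cand
            if ll_current > ll_best:
                best, ll_best = list(current), ll_current
        # "Cooling": shrink both the radius and the jump probability
        radius *= cooling
        jump_prob *= cooling
    # Selection rule from the text: use theta* if it beats theta0, else theta_n
    chosen = best if ll_best > ll_start else current
    return {"chosen": chosen, "best": best, "end": current,
            "ll_start": ll_start, "ll_best": ll_best}


def toy_loglik(theta):
    """Hypothetical concave log-likelihood with its maximum at (3, -1)."""
    return -(theta[0] - 3.0) ** 2 - (theta[1] + 1.0) ** 2


result = anneal_start(toy_loglik, [0.0, 0.0])
```

By construction the best point found can never have a lower likelihood than the deterministic starting point, since the starting point is itself among the points evaluated.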
222. ion of the use of named string arguments with printf:

    ? string vstr = "variance"
    Generated string vstr
    ? printf "vstr: %12s\n", vstr
    vstr:     variance

Note that vstr should not be put in quotes in this context. Similarly with

    ? string vstr_copy = vstr

Built-in strings

Apart from any strings that the user may define, some string variables are defined by gretl itself. These may be useful for people writing functions that include shell commands. The built-in strings are as shown in Table 11.2.

    gretldir    the gretl installation directory
    workdir     user's current gretl working directory
    dotdir      the directory gretl uses for temporary files
    gnuplot     path to, or name of, the gnuplot executable
    tramo       path to, or name of, the tramo executable
    x12a        path to, or name of, the x-12-arima executable
    tramodir    tramo data directory
    x12adir     x-12-arima data directory

Table 11.2: Built-in string variables

Reading strings from the environment

In addition, it is possible to read into gretl's named strings values that are defined in the external environment. To do this you use the function getenv, which takes the name of an environment variable as its argument. For example:

    ? string user = getenv("USER")
    Saved string as 'user'
    ? string home = getenv("HOME")
    Saved string as 'home'
    ? print "@user's home directory is @home"
    cottrell's home directory is /home/cottrell

To check whether you got a non-empty value from a given call to getenv, you ca
223. ion of two vectors can be performed efficiently by multiplying the elements of their Fourier transforms and inverting the result. If

$$z_k = \sum_{j=1}^{n} x_j\, y_{k-j},$$

then

$$F(z) = F(x) \odot F(y).$$

That is, $F(z)_k = F(x)_k\, F(y)_k$.

For computing the Fourier transform, gretl uses the external library fftw3: see Frigo and Johnson (2003). This guarantees extreme speed and accuracy. In fact, the CPU time needed to perform the transform is $O(n \log n)$ for any $n$. This is why the array of numerical techniques employed in fftw3 is commonly known as the Fast Fourier Transform.

Gretl provides two matrix functions (footnote 1: see chapter 12) for performing the Fourier transform and its inverse: fft and ffti. In fact, gretl's implementation of the Fourier transform is somewhat more specialized: the input to the fft function is understood to be real. Conversely, ffti takes a complex argument and delivers a real result. For example:

    x1 = { 1 ; 2 ; 3 }
    # perform the transform
    f = fft(x1)
    # perform the inverse transform
    x2 = ffti(f)

yields

    x1 = [ 1 ]    f = [  6      0     ]    x2 = [ 1 ]
         [ 2 ]        [ -1.5    0.866 ]         [ 2 ]
         [ 3 ]        [ -1.5   -0.866 ]         [ 3 ]

where the first column of f holds the real part and the second holds the imaginary part. In general, if the input to fft has $n$ columns, the output has $2n$ columns, where the real parts are stored in

Chapter 5. Special functions in genr

the odd columns and the imaginary parts in the even ones. Should it be necessary to compute the Fourier transform on several vectors with the same number of el
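The convolution property stated above can be verified numerically on a small case. The Python sketch below uses a naive $O(n^2)$ transform written directly from the definition of the discrete Fourier transform, not the fast algorithms of fftw3, and it assumes that the index $k - j$ wraps around modulo $n$ (circular convolution). The vector y is a hypothetical example; x matches the input used in the gretl example.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform: f_k = sum_j exp(-2*pi*i*j*k/n) x_j."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def idft(f):
    """Inverse transform: recovers x from f."""
    n = len(f)
    return [sum(f[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]


def circular_convolution(x, y):
    """z_k = sum_j x_j * y_{(k - j) mod n}, computed directly."""
    n = len(x)
    return [sum(x[j] * y[(k - j) % n] for j in range(n)) for k in range(n)]


x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 2.0]   # hypothetical second vector

f = dft(x)
z_direct = circular_convolution(x, y)
# Multiply the transforms element by element, then invert
product = [a * b for a, b in zip(dft(x), dft(y))]
z_fourier = [v.real for v in idft(product)]
```

For x = (1, 2, 3) the transform f has real parts (6, -1.5, -1.5) and imaginary parts (0, 0.866, -0.866), in agreement with the two columns shown in the gretl output, and z_fourier agrees with z_direct up to rounding error.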
224. ional details regarding covariance matrix estimation in the context of GMM are given in chapter 18.

We close this introduction with a brief statement of what "robust standard errors" can and cannot achieve. They can provide for asymptotically valid statistical inference in models that are basically correctly specified, but in which the errors are not iid. The "asymptotic" part means that they may be of little use in small samples. The "correct specification" part means that they are not a magic bullet: if the error term is correlated with the regressors, so that the parameter estimates themselves are biased and inconsistent, robust standard errors will not save the day.

14.2 Cross-sectional data and the HCCME

With cross-sectional data, the most likely departure from iid errors is heteroskedasticity (non-constant variance). In some cases one may be able to arrive at a judgment regarding the likely form of the heteroskedasticity, and hence to apply a specific correction. The more common case, however, is where the heteroskedasticity is of unknown form. We seek an estimator of the covariance matrix of the parameter estimates that retains its validity, at least asymptotically, in face of unspecified heteroskedasticity. It is not obvious, a priori, that this should be possible, but White (1980) showed that

$$\widehat{\mathrm{Var}}_h(\hat\beta) = (X'X)^{-1} X' \hat\Omega X (X'X)^{-1} \qquad (14.5)$$

does the trick. (As usual in statistics, we need to say "under certain conditions", but the
225. ions in the norm', Communications of the ACM, 17, pp. 319-320.

Baxter, M. and King, R. G. (1999) 'Measuring Business Cycles: Approximate Band-Pass Filters for Economic Time Series', The Review of Economics and Statistics, 81(4), pp. 575-593.

Beck, N. and Katz, J. N. (1995) 'What to do (and not to do) with Time-Series Cross-Section Data', The American Political Science Review, 89, pp. 634-47.

Belsley, D., Kuh, E. and Welsch, R. (1980) Regression Diagnostics, New York: Wiley.

Berndt, E., Hall, B., Hall, R. and Hausman, J. (1974) 'Estimation and Inference in Nonlinear Structural Models', Annals of Economic and Social Measurement, 3/4, pp. 653-65.

Blundell, R. and Bond, S. (1998) 'Initial Conditions and Moment Restrictions in Dynamic Panel Data Models', Journal of Econometrics, 87, pp. 115-43.

Bollerslev, T. and Ghysels, E. (1996) 'Periodic Autoregressive Conditional Heteroscedasticity', Journal of Business and Economic Statistics, 14, pp. 139-51.

Boswijk, H. Peter (1995) 'Identifiability of Cointegrated Systems', Tinbergen Institute Discussion Paper 95-78, http://www.ase.uva.nl/pp/bin/258fulltext.pdf

Boswijk, H. Peter and Doornik, Jurgen A. (2004) 'Identifying, estimating and testing restricted cointegrated systems: An overview', Statistica Neerlandica, 58(4), pp. 440-465.

Box, G. E. P. and Jenkins, G. (1976) Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day.

Box, G. E. P. and Mul
226. irst 50 entries in the second column underneath the first 50 entries in the first, we would be on the way to making a data set "by observation" (in the first of the two forms mentioned above, stacked cross-sections). That is, we'd have a column comprising a cross-section for x1 in 1965, followed by a cross-section for the same variable in 1970.

The following gretl script illustrates how we can accomplish the stacking, for both x1 and x2. We assume that the original data file is called panel.txt, and that in this file the columns are headed with "variable names" p1, p2, ..., p5. (The columns are not really variables, but in the first instance we "pretend" that they are.)

    open panel.txt
    genr x1 = stack(p1..p5) --length=50
    genr x2 = stack(p1..p5) --offset=50 --length=50
    setobs 50 1:1 --stacked-cross-section
    store panel.gdt x1 x2

The second line illustrates the syntax of the stack function. The double dots within the parentheses indicate a range of variables to be stacked: here we want to stack all 5 columns (for all 5 years). The full data set contains 100 rows; in the stacking of variable x1 we wish to read only the first 50 rows from each column: we achieve this by adding --length=50. Note that if you want to stack a non-contiguous set of columns you can put a comma-separated list within the parentheses, as in

    genr x = stack(p1,p3,p5)

On line 3 we do the stacking for variable x2. Again we want a length of 50 for the components of
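The effect of the length and offset options can be illustrated with a small Python analogue of the stacking operation. This is a sketch of the idea only: the function below is hypothetical, operates on plain lists and uses tiny columns of 4 rows (2 rows per variable) in place of the 100-row file discussed in the text.

```python
def stack(columns, length=None, offset=0):
    """Stack the columns vertically, taking `length` rows from each column
    starting at row `offset` (all remaining rows if length is None)."""
    out = []
    for col in columns:
        end = len(col) if length is None else offset + length
        out.extend(col[offset:end])
    return out


# Hypothetical file layout: in each column, rows 0-1 hold x1 and rows 2-3 hold x2
p1 = [11, 12, 13, 14]
p2 = [21, 22, 23, 24]
p3 = [31, 32, 33, 34]

x1 = stack([p1, p2, p3], length=2)             # first 2 rows of each column
x2 = stack([p1, p2, p3], offset=2, length=2)   # next 2 rows of each column
x_sub = stack([p1, p3])                        # non-contiguous set of columns
```

In the stacked result each block of `length` rows is one cross-section, and successive blocks correspond to successive columns (years) of the original file.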
227. is code available under the GNU General Public Licence and for helping to steer gretl's early development.

Chapter 1. Introduction

We are also grateful to the authors of several econometrics textbooks for permission to package for gretl various datasets associated with their texts. This list currently includes William Greene, author of Econometric Analysis; Jeffrey Wooldridge (Introductory Econometrics: A Modern Approach); James Stock and Mark Watson (Introduction to Econometrics); Damodar Gujarati (Basic Econometrics); Russell Davidson and James MacKinnon (Econometric Theory and Methods); and Marno Verbeek (A Guide to Modern Econometrics).

GARCH estimation in gretl is based on code deposited in the archive of the Journal of Applied Econometrics by Professors Fiorentini, Calzolari and Panattoni, and the code to generate p-values for Dickey-Fuller tests is due to James MacKinnon. In each case we are grateful to the authors for permission to use their work.

With regard to the internationalization of gretl, thanks go to Ignacio Díaz-Emparanza (Spanish), Michel Robitaille and Florent Bresson (French), Cristian Rigamonti (Italian), Tadeusz Kufel and Pawel Kufel (Polish), Markus Hahn and Sven Schreiber (German), Hélio Guilherme (Portuguese) and Susan Orbe (Basque).

Gretl has benefitted greatly from the work of numerous developers of free, open-source software: for specifics please see Appendix C. Our thanks are due to Richard Stallman of the
228. l dummy variables (e.g. quarterly dummy variables for quarterly data).

• Sample menu

  - Set range: Select a different starting and/or ending point for the current sample, within the range of data available.
  - Restore full range: self-explanatory.
  - Define, based on dummy: Given a dummy (indicator) variable with values 0 or 1, this drops from the current sample all observations for which the dummy variable has value 0.
  - Restrict, based on criterion: Similar to the item above, except that you don't need a pre-defined variable: you supply a Boolean expression (e.g. sqft > 1400) and the sample is restricted to observations satisfying that condition. See the entry for genr in the Gretl Command Reference for details on the Boolean operators that can be used.
  - Random sub-sample: Draw a random sample from the full dataset.
  - Drop all obs with missing values: Drop from the current sample all observations for which at least one variable has a missing value (see Section 4.6).
  - Count missing values: Give a report on observations where data values are missing. May be useful in examining a panel data set, where it's quite common to encounter missing values.
  - Set missing value code: Set a numerical value that will be interpreted as "missing" or "not available". This is intended for use with imported data, when gretl has not recognized the missing-value code used.

• Variable menu: Most items under here operate on a single variable at a time. The
229. l hypothesis is that the variance of $v_i$ in equation (15.3) equals zero; if this hypothesis is not rejected, then again we conclude that the simple pooled model is adequate.

The Hausman test probes the consistency of the GLS estimates. The null hypothesis is that these estimates are consistent, that is, that the requirement of orthogonality of the $v_i$ and the $X_i$ is satisfied. The test is based on a measure, $H$, of the "distance" between the fixed-effects and random-effects estimates, constructed such that under the null it follows the $\chi^2$ distribution with degrees of freedom equal to the number of time-varying regressors in the matrix $X$. If the value of $H$ is "large" this suggests that the random effects estimator is not consistent and the fixed-effects model is preferable.

There are two ways of calculating $H$, the matrix-difference method and the regression method. The procedure for the matrix-difference method is this:

• Collect the fixed-effects estimates in a vector $\tilde\beta$ and the corresponding random-effects estimates in $\hat\beta$, then form the difference vector $(\tilde\beta - \hat\beta)$.

• Form the covariance matrix of the difference vector as $\mathrm{Var}(\tilde\beta - \hat\beta) = \mathrm{Var}(\tilde\beta) - \mathrm{Var}(\hat\beta) = \Psi$, where $\mathrm{Var}(\tilde\beta)$ and $\mathrm{Var}(\hat\beta)$ are estimated by the sample variance matrices of the fixed- and random-effects models respectively. (Footnote 2: Hausman (1978) showed that the covariance of the difference takes this simple form when $\hat\beta$ is an efficient estimator.)

Chapter 15. Panel data

• Com
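The matrix-difference calculation can be sketched for the case of two time-varying regressors. The Python code below forms the difference vector and the matrix $\Psi$ as in the steps above, and then evaluates the quadratic form $d'\Psi^{-1}d$, which is the standard form of the Hausman statistic; that last step is stated here as the textbook formula, an assumption as far as this passage is concerned. The estimates and variance matrices are hypothetical numbers, and the closed-form p-value applies only with 2 degrees of freedom.

```python
import math


def hausman_2x2(b_fe, b_re, v_fe, v_re):
    """Hausman statistic by the matrix-difference method, two regressors.

    b_fe, b_re: fixed- and random-effects coefficient estimates
    v_fe, v_re: their estimated 2x2 covariance matrices
    """
    # Difference vector between the two sets of estimates
    d = [b_fe[0] - b_re[0], b_fe[1] - b_re[1]]
    # Psi = Var(FE) - Var(RE)
    psi = [[v_fe[i][j] - v_re[i][j] for j in range(2)] for i in range(2)]
    det = psi[0][0] * psi[1][1] - psi[0][1] * psi[1][0]
    if psi[0][0] <= 0 or det <= 0:
        raise ValueError("Psi is not positive definite in this sample")
    inv = [[psi[1][1] / det, -psi[0][1] / det],
           [-psi[1][0] / det, psi[0][0] / det]]
    # Quadratic form d' * inv(Psi) * d
    return sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))


# Hypothetical estimates, for illustration only
b_fe = [1.0, 0.5]
b_re = [0.9, 0.6]
v_fe = [[0.05, 0.0], [0.0, 0.04]]
v_re = [[0.03, 0.0], [0.0, 0.02]]

H = hausman_2x2(b_fe, b_re, v_fe, v_re)
# With 2 degrees of freedom the chi-square upper tail is exp(-H/2)
p_value = math.exp(-H / 2.0)
```

The explicit check on $\Psi$ reflects a practical weakness of the matrix-difference method: in finite samples the difference of the two variance matrices need not be positive definite, in which case the statistic cannot be computed this way.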
230. l is available to anyone who wants to critique it, patch it, or extend it. See Appendix C.

Sophisticated: Gretl offers a full range of least-squares based estimators, both for single equations and for systems, including vector autoregressions and vector error correction models. Several specific maximum likelihood estimators (e.g. probit, ARIMA, GARCH) are also provided natively; more advanced estimation methods can be implemented by the user via generic maximum likelihood or nonlinear GMM.

Extensible: Users can enhance gretl by writing their own functions and procedures in gretl's scripting language, which includes a reasonably wide range of matrix functions.

Accurate: Gretl has been thoroughly tested on several benchmarks, among which the NIST reference datasets. See Appendix D.

Internet ready: Gretl can access and fetch databases from a server at Wake Forest University. The MS Windows version comes with an updater program which will detect when a new version is available and offer the option of auto-updating.

International: Gretl will produce its output in English, French, Italian, Spanish, Polish, Portuguese, German or Basque, depending on your computer's native language setting.

1.2 Acknowledgements

The gretl code base originally derived from the program ESL ("Econometrics Software Library"), written by Professor Ramu Ramanathan of the University of California, San Diego. We are much in debt to Professor Ramanathan for making th
231. l look like, for example, AR:1965, where the two-letter state code and the year of the observation are spliced together with a colon.

4.6 Missing data values

These are represented internally as DBL_MAX, the largest floating-point number that can be represented on the system (which is likely to be at least 10 to the power 300, and so should not be confused with legitimate data values). In a native-format data file they should be represented as NA. When importing CSV data gretl accepts several common representations of missing values including -999, the string NA (in upper or lower case), a single dot, or simply a blank cell. Blank cells should, of course, be properly delimited, e.g. 120.6,,5.38, in which the middle value is presumed missing.

As for handling of missing values in the course of statistical analysis, gretl does the following:

• In calculating descriptive statistics (mean, standard deviation, etc.) under the summary command, missing values are simply skipped and the sample size adjusted appropriately.

• In running regressions gretl first adjusts the beginning and end of the sample range, truncating the sample if need be. Missing values at the beginning of the sample are common in time series work due to the inclusion of lags, first differences and so on; missing values at the end of the range are not uncommon due to differential updating of series and possibly the inclusion of leads.

If gretl detects any missing values
232. l of the one given by Greene. Gretl began by using the Ramanathan variant, but since version 1.3.1 the program has used the original Akaike formula (19.1), and more specifically (19.3) for models estimated via least squares.

Although the Akaike criterion is designed to favor parsimony, arguably it does not go far enough in that direction. For instance, if we have two nested models with $k - 1$ and $k$ parameters respectively, and if the null hypothesis that parameter $k$ equals 0 is true, in large samples the AIC will nonetheless tend to select the less parsimonious model about 16 percent of the time (see Davidson and MacKinnon, 2004, chapter 15).

An alternative to the AIC which avoids this problem is the Schwarz (1978) "Bayesian information criterion" (BIC). The BIC can be written (in line with Akaike's formulation of the AIC) as

$$\mathrm{BIC} = -2\ell(\hat\theta) + k \log n$$

The multiplication of $k$ by $\log n$ in the BIC means that the penalty for adding extra parameters grows with the sample size. This ensures that, asymptotically, one will not select a larger model over a correctly specified parsimonious model.

A further alternative to AIC, which again tends to select more parsimonious models than AIC, is the Hannan-Quinn criterion or HQC (Hannan and Quinn, 1979). Written consistently with the formulations above, this is

$$\mathrm{HQC} = -2\ell(\hat\theta) + 2k \log\log n$$

The Hannan-Quinn calculation is based on the law of the iterated logarithm (note that the last term is the log
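The three criteria are easy to compare side by side. The Python sketch below computes them from a maximized log-likelihood, taking the AIC in its standard form $-2\ell(\hat\theta) + 2k$ (assumed here to be what formula 19.1 denotes), and also reproduces the "about 16 percent" figure: adding one redundant parameter lowers the AIC whenever the likelihood-ratio statistic exceeds 2, and under the null that statistic is asymptotically $\chi^2$ with 1 degree of freedom. The log-likelihood value and sample size are hypothetical.

```python
import math


def info_criteria(loglik, k, n):
    """Return (AIC, BIC, HQC) for log-likelihood loglik, k parameters, n obs."""
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    hqc = -2.0 * loglik + 2.0 * k * math.log(math.log(n))
    return aic, bic, hqc


# Hypothetical model: log-likelihood -100 with 3 parameters and 50 observations
aic, bic, hqc = info_criteria(-100.0, 3, 50)

# P(chi-square(1) > 2) = P(|Z| > sqrt(2)) = erfc(1) for standard normal Z
p_aic_overfits = math.erfc(1.0)
```

In this example the penalties are ordered AIC < HQC < BIC, which is the typical ordering for samples of moderate size, and p_aic_overfits is approximately 0.157.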
233. l server by selecting File, Function files, On server.

Once your package is installed on your local machine, you can use the function it contains via the graphical interface as described above, or by using the CLI, namely in a script or through the

Chapter 10. User-defined functions

console. In the latter case, you load the function via the include command, specifying the package file as the argument, complete with the .gfn extension.

[Figure 10.3: Using your package]

To continue with our example, load the file np.gdt (supplied with gretl among the sample datasets). Suppose you want to compute the rate of change for the variable iprod via your new function and store the result in a series named foo. Go to File, Function files, On local machine. You will be shown a list of the installed packages, including the one you have just created. If you select it and click on Execute (or double-click on the name of the function package), a window similar to the one shown in figure 10.3 will appear. Notice that the description string "Series to process", supplied with the function definition, appears to the left of the top series chooser. Click Ok and the series foo will be generated (see figure 10.4). You may have to g
234. lable again if you re-open the session later.

[Figure 3.2: Icon view: one model and one graph have been added to the default icons]

If you start gretl and open a data set, then select Icon view from the View menu, you should see the basic default set of icons: these give you quick access to information on the data set (if any), correlation matrix (Correlations) and descriptive summary statistics (Summary). All of these are activated by double-clicking the relevant icon. The Data set icon is a little more complex: double-clicking opens up the data in the built-in spreadsheet, but you can also right-click on the icon for a menu of other actions.

To add a model to the Icon view, first estimate it using the Model menu. Then pull down the File menu in the model window and select Save to session as icon or Save as icon and close. Simply hitting the S key over the model window is a shortcut to the latter action.

To add a graph, first create it (under the View menu, Graph specified vars, or via one of gretl's other graph-generating commands). Click on the graph window to bring up the graph menu, and select Save to session as icon.

Once a model or graph is added its icon will appear in the Icon view window. Double-clicking on the icon redisplays the object
235. laborated variants on HC0 take this point on board as follows:

• HC1: Applies a degrees-of-freedom correction, multiplying the HC0 matrix by $T/(T - k)$.

• HC2: Instead of using $\hat u_t^2$ for the diagonal elements of $\hat\Omega$, uses $\hat u_t^2/(1 - h_t)$, where $h_t = X_t (X'X)^{-1} X_t'$, the $t$-th diagonal element of the projection matrix, $P$, which has the property that $P y = \hat y$. The relevance of $h_t$ is that if the variance of all the $u_t$ is $\sigma^2$, the expectation of $\hat u_t^2$ is $\sigma^2 (1 - h_t)$, or in other words, the ratio $\hat u_t^2/(1 - h_t)$ has expectation $\sigma^2$. As Davidson and MacKinnon show, $0 \le h_t < 1$ for all $t$, so this adjustment cannot reduce the diagonal elements of $\hat\Omega$ and in general revises them upward.

• HC3: Uses $\hat u_t^2/(1 - h_t)^2$. The additional factor of $(1 - h_t)$ in the denominator, relative to HC2, may be justified on the grounds that observations with large variances tend to exert a lot of influence on the OLS estimates, so that the corresponding residuals tend to be under-estimated. See Davidson and MacKinnon for a fuller explanation.

The relative merits of these variants have been explored by means of both simulations and theoretical analysis. Unfortunately there is not a clear consensus on which is "best". Davidson and MacKinnon argue that the original HC0 is likely to perform worse than the others; nonetheless, "White's standard errors" are reported more often than the more sophisticated variants and therefore, for reasons of comparability, HC0 is the default HC
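The difference between the variants comes down to how each squared residual is rescaled. The Python sketch below computes the diagonal elements used by HC0 to HC3 for a simple regression of y on a constant and one regressor, so that $k = 2$. It uses the fact that in this special case $h_t = 1/T + (x_t - \bar x)^2 / \sum_s (x_s - \bar x)^2$. The data are hypothetical, and HC1 is represented by scaling the squared residuals by $T/(T-k)$, which is equivalent to scaling the whole HC0 matrix.

```python
def hc_diagonals(x, y):
    """Squared-residual terms for HC0..HC3 in a regression of y on (1, x)."""
    T, k = len(x), 2
    xbar = sum(x) / T
    ybar = sum(y) / T
    sxx = sum((xi - xbar) ** 2 for xi in x)
    # OLS slope, intercept and residuals
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Diagonal of the projection matrix for this two-regressor case
    h = [1.0 / T + (xi - xbar) ** 2 / sxx for xi in x]
    return {
        "h": h,
        "HC0": [ut ** 2 for ut in u],
        "HC1": [T / (T - k) * ut ** 2 for ut in u],
        "HC2": [ut ** 2 / (1.0 - ht) for ut, ht in zip(u, h)],
        "HC3": [ut ** 2 / (1.0 - ht) ** 2 for ut, ht in zip(u, h)],
    }


# Hypothetical data with one high-leverage observation (x = 10)
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.8, 10.5]
d = hc_diagonals(x, y)
```

The $h_t$ sum to $k$, and the high-leverage observation has by far the largest $h_t$ (0.92 here), so its squared residual is inflated the most under HC2 and HC3.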
236. le:

    scalar x = 8
    sprintf foo "var%d", x

To use the value of a string variable in a command, give the name of the variable preceded by the "at" sign, @. This notation is treated as a "macro". That is, if a sequence of characters in a gretl command following the symbol @ is recognized as the name of a string variable, the value of that variable is substituted literally into the command line before the regular parsing of the command is carried out. This is illustrated in the following interactive session:

    ? scalar x = 8
     scalar x = 8
    Generated scalar x (ID 2) = 8
    ? sprintf foo "var%d", x
    Saved string as 'foo'
    ? print "@foo"
    var8

Note the effect of the quotation marks in the line print "@foo". The line

    ? print @foo

would not print a literal "var8" as above. After pre-processing the line would read

    print var8

It would therefore print the value(s) of the variable var8, if such a variable exists, or would generate an error otherwise.

In some contexts, however, one wants to treat string variables as variables in their own right: to do this, give the name of the variable without a leading @ symbol. This is the way to handle such variables in the following contexts:

Chapter 11. Named lists and strings

• When they appear among the arguments to the commands printf and sprintf.

• On the right-hand side of a string assignment.

• When they appear as an argument to the function taking a string argument.

Here is an illustrat
237. le to carry out various transformations of the "raw" data with ease (adding things up, taking percentages or whatever): note, however, that you can also do this sort of thing easily, perhaps more easily, within gretl, by using the tools under the Add menu.

Appending imported data

You may wish to establish a gretl dataset piece by piece, by incremental importation of data from other sources. This is supported via the File, Append data menu items: gretl will check the new data for conformability with the existing dataset and, if everything seems OK, will merge the data. You can add new variables in this way, provided the data frequency matches that of the existing dataset. Or you can append new observations for data series that are already present; in this case the variable names must match up correctly. Note that by default (that is, if you choose Open data rather than Append data), opening a new data file closes the current one.

Using the built-in spreadsheet

Under gretl's File, New data set menu you can choose the sort of dataset you want to establish (e.g. quarterly time series, cross-sectional). You will then be prompted for starting and ending dates (or observation numbers) and the name of the first variable to add to the dataset. After supplying this information you will be faced with a simple spreadsheet into which you can type data values. In the spreadsheet window, clicking the right mouse button will
238. leftmost column of the model table and add it to the table, either by dragging its icon onto the Model table icon, or by right-clicking on the model icon and selecting Add to model table from the pop-up menu.

Repeat step 4 for the other models you wish to include in the table. The second model selected will appear in the second column from the left, and so on.

When you are finished composing the model table, display it by double-clicking on its icon. Under the Edit menu in the window which appears, you have the option of copying the table to the clipboard in various formats.

If the ordering of the models in the table is not what you wanted, right-click on the model table icon and select Clear table. Then go back to step 4 above and try again.

A simple instance of gretl's model table is shown in Figure 3.3.

                      OLS estimates
                Dependent variable: price

                 Model 1      Model 2      Model 3
      const       129.1        121.2        52.35
                 (88.30)      (80.18)      (37.29)
      sqft        0.1548**     0.1483**     0.1388**
                 (0.03194)    (0.02121)    (0.01873)
      bedrms     -21.59       -23.91
                 (27.03)      (24.64)
      baths      -12.19
                 (43.25)
      n            14           14           14
      Adj. R**2   0.7868       0.8046       0.8056

      Standard errors in parentheses
      * indicates significance at the 10 percent level
      ** indicates significance at the 5 percent level

Figure 3.3: Example of model table

Footnote 2: The model table can also be built non-interactively, in script mode. For details on how to do this, see the entry for model
239. ler, M. E. (1958) "A Note on the Generation of Random Normal Deviates", Annals of Mathematical Statistics 29, pp. 610-11.
Brand, C. and Cassola, N. (2004) "A money demand system for euro area M3", Applied Economics 36(8), pp. 817-838.
Breusch, T. S. and Pagan, A. R. (1979) "A Simple Test for Heteroscedasticity and Random Coefficient Variation", Econometrica 47, pp. 1287-94.
Cameron, A. C. and Trivedi, P. K. (2005) Microeconometrics: Methods and Applications, Cambridge: Cambridge University Press.
Chesher, A. and Irish, M. (1987) "Residual Analysis in the Grouped and Censored Normal Linear Model", Journal of Econometrics 34, pp. 33-61.
Cureton, E. (1967) "The Normal Approximation to the Signed-Rank Sampling Distribution when Zero Differences are Present", Journal of the American Statistical Association 62, pp. 1068-1069.
Davidson, R. and MacKinnon, J. G. (1993) Estimation and Inference in Econometrics, New York: Oxford University Press.
Davidson, R. and MacKinnon, J. G. (2004) Econometric Theory and Methods, New York: Oxford University Press.
Doornik, Jurgen A. (1995) "Testing general restrictions on the cointegrating space", Discussion Paper, Nuffield College. http://www.doornik.com/research/coigen.pdf
Doornik, J. A. (1998) "Approximations to the Asymptotic Distribution of Cointegration Tests", Journal of Economic Surveys 12, pp. 573-93. Reprinted with corrections in M. M
240. ll be replaced by the ML estimates. The starting value is 1 for both; this is arbitrary and does not matter much in this example (more on this later).

The above code can be made more readable, and marginally more efficient, by defining a variable to hold $\alpha \cdot x_t$. This command can be embedded into the mle block as follows:

    scalar alpha = 1
    scalar p = 1
    mle logl = p*ln(ax) - lngamma(p) - ln(x) - ax
        series ax = alpha*x
        params alpha p
    end mle

In this case, it is necessary to include the line params alpha p to set the symbols p and alpha apart from ax, which is a temporarily generated variable and not a parameter to be estimated.

In a simple example like this, the choice of the starting values is almost inconsequential; the algorithm is likely to converge no matter what the starting values are. However, consistent method-of-moments estimators of $p$ and $\alpha$ can be simply recovered from the sample mean $m$ and variance $V$: since it can be shown that

$$E(x_t) = p/\alpha, \qquad V(x_t) = p/\alpha^2$$

it follows that the following estimators

$$\bar{\alpha} = m/V, \qquad \bar{p} = m \cdot \bar{\alpha}$$

are consistent, and therefore suitable to be used as a starting point for the algorithm. The gretl script code then becomes

    scalar m = mean(x)
    scalar alpha = m/var(x)
    scalar p = m*alpha
    mle logl = p*ln(ax) - lngamma(p) - ln(x) - ax
        series ax = alpha*x
        params alpha p
    end mle

Another thing to note is that sometimes parameters are constrained within certai
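The method-of-moments starting values described above can also be checked outside gretl. The following is a minimal Python sketch, not part of gretl: the function name `gamma_start_values` is hypothetical, and it assumes that the variance in question is the usual sample variance with an n - 1 denominator.

```python
import statistics


def gamma_start_values(x):
    """Method-of-moments starting values (alpha, p) for a Gamma sample.

    Uses E(x) = p/alpha and V(x) = p/alpha^2, so that
    alpha = m/V and p = m*alpha, where m and V are the sample
    mean and the sample variance (n - 1 denominator, an assumption).
    """
    m = statistics.mean(x)
    v = statistics.variance(x)
    alpha = m / v
    p = m * alpha
    return alpha, p


if __name__ == "__main__":
    alpha0, p0 = gamma_start_values([1, 2, 3, 4, 5])
    print("alpha =", alpha0, "p =", p0)
```

These values only need to be in a sensible region of the parameter space, since the likelihood maximizer refines them afterwards.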
241. ll columns of $\beta$: in this case, one index is given for each b term and the square brackets are dropped. Hence, the following syntax

    restrict
        b1 + b2 = 0
    end restrict

corresponds to

$$\beta = \begin{bmatrix} \beta_{11} & \beta_{21} \\ -\beta_{11} & -\beta_{21} \\ \beta_{13} & \beta_{23} \\ \beta_{14} & \beta_{24} \end{bmatrix}$$

The same convention is used for $\alpha$: when only one index is given for each a term, the restriction is presumed to apply to all $r$ rows of $\alpha$, or in other words the given variables are weakly exogenous. For instance, the formulation

    restrict
        a3 = 0
        a4 = 0
    end restrict

specifies that variables 3 and 4 do not respond to the deviation from equilibrium in the previous period.

Finally, a short-cut is available for setting up complex restrictions (but currently only in relation to $\beta$): you can specify $R_b$ and $q$, as in $R_b \mathrm{vec}(\beta) = q$, by giving the names of previously defined matrices. For example,

    matrix I4 = I(4)
    matrix vR = I4**(I4~zeros(4,1))
    matrix vq = mshape(I4,16,1)
    restrict
        R = vR
        q = vq
    end restrict

which manually imposes Phillips normalization on the $\beta$ estimates for a system with cointegrating rank 4.

An example

Brand and Cassola (2004) propose a money demand system for the Euro area, in which they postulate three long-run equilibrium relationships:

    money demand                            $m = \beta_l l + \beta_y y$
    Fisher equation                         $\pi = \phi l$
    Expectation theory of interest rates    $l = s$

where $m$ is real money demand, $l$ and $s$ are long- and short-term interest rates, $y$ is outp
242. ls, Journal of Econometrics 34, pp. 5-32.
Greene, William H. (2000) Econometric Analysis, 4th edition, Upper Saddle River, NJ: Prentice-Hall.
Greene, William H. (2003) Econometric Analysis, 5th edition, Upper Saddle River, NJ: Prentice-Hall.
Gujarati, Damodar N. (2003) Basic Econometrics, 4th edition, Boston, MA: McGraw-Hill.
Hall, Alastair R. (2005) Generalized Method of Moments, Oxford: Oxford University Press.
Hamilton, James D. (1994) Time Series Analysis, Princeton, NJ: Princeton University Press.
Hannan, E. J. and Quinn, B. G. (1979) "The Determination of the Order of an Autoregression", Journal of the Royal Statistical Society, B 41, pp. 190-95.
Hansen, L. P. (1982) "Large Sample Properties of Generalized Method of Moments Estimation", Econometrica 50, pp. 1029-1054.
Hansen, L. P. and Singleton, K. J. (1982) "Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models", Econometrica 50, pp. 1269-86.
Harvey, Andrew C. (1989) Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge: Cambridge University Press.
Hausman, J. A. (1978) "Specification Tests in Econometrics", Econometrica 46, pp. 1251-71.
Heckman, J. (1979) "Sample Selection Bias as a Specification Error", Econometrica 47, pp. 153-61.
Hodrick, Robert and Prescott, Edward C. (1997) "Postwar U.S. Business Cycles: An Empirical Investigation", Journal of Money, Credit and Banking
243. ls from the graph pop-up menu. If you want the labels to be affixed permanently (so they will show up when the graph is printed or copied), you have two options.

• To affix the labels currently shown on the graph, select "Freeze data labels" from the graph pop-up menu.
• To affix labels for all points in the graph, select "Edit" from the graph pop-up and check the box titled "Show all data labels". This option is available only if there are fewer than 55 data points, and it is unlikely to produce good results if the points are tightly clustered, since the labels will tend to overlap.

To remove labels that have been affixed in either of these ways, select "Edit" from the graph pop-up and uncheck "Show all data labels".

Footnote 1: For an example of such a dataset, see the Ramanathan file data4-10: this contains data on private school enrollment for the 50 states of the USA plus Washington, DC; the case markers are the two-letter codes for the states.

Advanced options

If you know something about gnuplot and wish to get finer control over the appearance of a graph than is available via the graphical controller ("Edit" option), here's what to do. In the graph display window, right-click and choose "Save to session as icon". Then open the icon view window, either via the menu item "View, Icon view", or by clicking the "session icon view" button on the main-window toolbar. You should s
244. <- as.ts(tsSmooth(strmod))
    # save the estimated variances
    vars <- as.matrix(strmod$coef)
    # export into gretl's temp dir
    gretl.export(compon)
    gretl.export(vars)

In this case, running the above in R produces nothing more than the echoing of commands:

    > # load data from gretl
    > gretldata <- read.table("/home/jack/.gretl/Rdata.tmp", header = TRUE)
    > gretldata <- ts(gretldata, start = c(1949, 1), frequency = 12)
    > # load script from gretl
    > # extract the log series
    > y <- gretldata[,"lg"]
    > # estimate the model
    > strmod <- StructTS(y)
    > # save the fitted components (smoothed)
    > compon <- as.ts(tsSmooth(strmod))
    > # save the estimated variances
    > vars <- as.matrix(strmod$coef)
    > # export into gretl's temp dir
    > gretl.export(compon)
    > gretl.export(vars)

However, we see from the output that the two gretl.export() commands ran without errors. Hence, we are ready to pull the results back into gretl by executing the following commands, either from the console or by creating a small script:

    append @dotdir/compon.csv
    vars = mread("@dotdir/vars.mat")

The first command reads the estimated time-series components from a CSV file, which is the format that the passing mechanism employs for series. The matrix vars is read from the file vars.mat. After the above commands have been executed, three new series will have appeared in the gretl workspace, namely the estima
245. lts. You can, however, arrange for models estimated in a script to be "captured", so that you can examine them interactively when the script is finished. Here is an example of the syntax for achieving this effect:

    Model1 <- ols Ct 0 Yt

That is, you type a name for the model to be saved under, then a back-pointing "assignment arrow", then the model command. You may use names that have embedded spaces if you like, but such names must be wrapped in double quotes:

    "Model 1" <- ols Ct 0 Yt

Models saved in this way will appear as icons in the gretl icon view window (see Section 3.4) after the script is executed. In addition, you can arrange to have a named model displayed (in its own window) automatically as follows:

    Model1.show

Again, if the name contains spaces it must be quoted:

    "Model 1".show

The same facility can be used for graphs. For example, the following will create a plot of Ct against Yt, save it under the name "CrossPlot" (it will appear under this name in the icon view window), and have it displayed:

    CrossPlot <- gnuplot Ct Yt
    CrossPlot.show

You can also save the output from selected commands as named pieces of text (again, these will appear in the session icon window, from where you can open them later). For example, this command sends the output from an augmented Dickey-Fuller test to a "text object" named ADF1 and displays it in a window:

    ADF1 <- adf 2 x1
    ADF1.show

Objects saved in this way
246. mate of $\hat{u}_t = x_t - \hat{\rho} x_{t-1}$; and (3) divide the result by $(1 - \hat{\rho})^2$ to recover $V_{LR}(x_t)$.

The application of the above concept to our problem implies estimating a finite-order Vector Autoregression (VAR) on the vector variables $\xi_t = X_t \hat{u}_t$. In general, the VAR can be of any order, but in most cases 1 is sufficient; the aim is not to build a watertight model for $\xi_t$, but just to "mop up" a substantial part of the autocorrelation. Hence, the following VAR is estimated:

$$\xi_t = A \xi_{t-1} + \varepsilon_t$$

Then an estimate of the matrix $X'\Omega X$ can be recovered via

$$(I - \hat{A})^{-1} \hat{\Sigma}_\varepsilon (I - \hat{A}')^{-1}$$

where $\hat{\Sigma}_\varepsilon$ is any HAC estimator, applied to the VAR residuals.

You can ask for prewhitening in gretl using

    set hac_prewhiten on

There is at present no mechanism for specifying an order other than 1 for the initial VAR.

A further refinement is available in this context, namely data-based bandwidth selection. It makes intuitive sense that the HAC bandwidth should not simply be based on the size of the sample, but should somehow take into account the time-series properties of the data (and also the kernel chosen). A nonparametric method for doing this was proposed by Newey and West (1994); a good concise account of the method is given in Hall (2005). This option can be invoked in gretl via

    set hac_lag nw3

This option is the default when prewhitening is selected, but you can override it by giving a specific numerical value for hac_lag.

Even
247. mmand

    genr pmx = pmean(x)

creates a series of this form: the first 8 values (corresponding to unit 1) contain the mean of x for unit 1, the next 8 values contain the mean for unit 2, and so on. The psd() function works in a similar manner. The sample standard deviation for group $i$ is computed as

$$s_i = \sqrt{\frac{\sum (x - \bar{x}_i)^2}{T_i - 1}}$$

where $T_i$ denotes the number of valid observations on x for the given unit, $\bar{x}_i$ denotes the group mean, and the summation is across valid observations for the group. If $T_i < 2$, however, the standard deviation is recorded as 0.

One particular use of psd() may be worth noting. If you want to form a sub-sample of a panel that contains only those units for which the variable x is time-varying, you can either use

    smpl (pmin(x) < pmax(x)) --restrict

or

    smpl (psd(x) > 0) --restrict

Special functions for data manipulation

Besides the functions discussed above, there are some facilities in genr designed specifically for manipulating panel data, in particular for the case where the data have been read into the program from a third-party source and they are not in the correct form for panel analysis. These facilities are explained in Chapter 4.

5.5 Resampling and bootstrapping

Another specialized function is the resampling, with replacement, of a series. Given an original data series x, the command

    genr xr = resample(x)

creates a new series each of whose elements is drawn at rand
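The per-unit standard deviation rule described above (sample standard deviation over valid observations, recorded as 0 when a unit has fewer than two of them) can be sketched in Python as follows. This is an illustration only, not gretl's implementation: the function name `panel_sd` and the use of `None` to mark missing values are assumptions made for the example.

```python
import math


def panel_sd(values, unit_ids):
    """Per-unit sample standard deviation, repeated for every observation.

    values   : observations in stacked panel order (None marks a missing value)
    unit_ids : unit identifier for each observation
    """
    # Collect the valid observations for each unit.
    groups = {}
    for v, u in zip(values, unit_ids):
        if v is not None:
            groups.setdefault(u, []).append(v)

    # Compute s_i for each unit; fewer than 2 valid observations gives 0.
    sd = {}
    for u in set(unit_ids):
        obs = groups.get(u, [])
        if len(obs) < 2:
            sd[u] = 0.0
        else:
            mean = sum(obs) / len(obs)
            ss = sum((v - mean) ** 2 for v in obs)
            sd[u] = math.sqrt(ss / (len(obs) - 1))

    return [sd[u] for u in unit_ids]


if __name__ == "__main__":
    x = [1, 2, 3, 5, 5, 5, 7, None, None]
    unit = [1, 1, 1, 2, 2, 2, 3, 3, 3]
    s = panel_sd(x, unit)
    # Units with s > 0 are the ones where x is time-varying.
    print(sorted({u for u, si in zip(unit, s) if si > 0}))
```

Selecting the units with a positive value mirrors the sub-sampling use mentioned in the text.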
248. model is to be applied to real data, it is necessary to include some term to handle the possibility that $y_t$ has non-zero mean. There are two possible ways to represent processes with nonzero mean: one is to define $\mu_t$ as the unconditional mean of $y_t$, namely the central value of its marginal distribution. Therefore, the series $\tilde{y}_t = y_t - \mu_t$ has mean 0, and the model (20.1) applies to $\tilde{y}_t$. In practice, assuming that $\mu_t$ is a linear function of some observable variables $x_t$, the model becomes

$$\phi(L)(y_t - x_t\beta) = \theta(L)\varepsilon_t \qquad (20.2)$$

This is sometimes known as a "regression model with ARMA errors"; its structure may be more apparent if we represent it using two equations:

$$y_t = x_t\beta + u_t$$
$$\phi(L)u_t = \theta(L)\varepsilon_t$$

The model just presented is also sometimes known as "ARMAX" (ARMA + eXogenous variables). It seems to us, however, that this label is more appropriately applied to a different model: another way to include a mean term in (20.1) is to base the representation on the conditional mean of $y_t$, that is the central value of the distribution of $y_t$ given its own past. Assuming, again, that this can be represented as a linear combination of some observable variables $z_t$, the model would expand to

$$\phi(L)y_t = z_t\gamma + \theta(L)\varepsilon_t \qquad (20.3)$$

The formulation (20.3) has the advantage that $\gamma$ can be immediately interpreted as the vector of marginal effects of the $z_t$ variables on the conditional mean of $y_t$. And by adding lags of $z_t$ to this sp
249. model $y_t = x_t\beta + u_t$. Although most of us are used to reading it as the sum of a hazily defined "systematic part" plus an equally hazy "disturbance", a more rigorous interpretation of this familiar expression comes from the hypothesis that the conditional mean $E(y_t|x_t)$ is linear and the definition of $u_t$ as $y_t - E(y_t|x_t)$.

From the definition of $u_t$, it follows that $E(u_t|x_t) = 0$. The following orthogonality condition is therefore available:

$$E[f(\beta)] = 0 \qquad (18.7)$$

where $f(\beta) = (y_t - x_t\beta)x_t$. The definitions given in the previous section therefore specialize here to:

• $\theta$ is $\beta$;
• the instrument is $x_t$;
• $f_{i,j,t}(\theta)$ is $(y_t - x_t\beta)x_t = u_t x_t$; the orthogonality condition is interpretable as the requirement that the regressors should be uncorrelated with the disturbances;
• $W$ can be any symmetric positive definite matrix, since the number of parameters equals the number of orthogonality conditions. Let's say we choose $I$.
• The function $F(\theta, W)$ is in this case

$$F(\theta, W) = \left[\frac{1}{T}\sum_{t=1}^{T}\hat{u}_t x_t\right]' \left[\frac{1}{T}\sum_{t=1}^{T}\hat{u}_t x_t\right]$$

and it is easy to see why OLS and GMM coincide here: the GMM objective function has the same minimizer as the objective function of OLS, the residual sum of squares. Note, however, that the two functions are not equal to one another: at the minimum, $F(\theta, W) = 0$ while the minimized sum of squared residuals is zero only in the special case of a perfect linear fit.

The code snippet contained in Example 18.1 uses gretl's gmm comma
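The point that the GMM criterion is zero at the OLS estimates, while the residual sum of squares generally is not, can be verified numerically. The sketch below is plain Python for a regression on a constant and one regressor, with the identity matrix as weights; the function names `ols_simple` and `gmm_criterion` and the small dataset are invented for the illustration and are not gretl functions.

```python
def ols_simple(x, y):
    """Closed-form OLS for y = b0 + b1*x + u."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1


def gmm_criterion(b0, b1, x, y):
    """Return (F, SSR) where F = gbar'gbar with W = I.

    The instruments are the regressors themselves, (1, x_t), so the
    sample moments are the averages of u_t and of u_t * x_t.
    """
    n = len(x)
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    g0 = sum(u) / n
    g1 = sum(ui * xi for ui, xi in zip(u, x)) / n
    criterion = g0 * g0 + g1 * g1
    ssr = sum(ui * ui for ui in u)
    return criterion, ssr


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    b0, b1 = ols_simple(x, y)
    print(gmm_criterion(b0, b1, x, y))
```

At the OLS estimates the two sample moments are the normal equations, so the criterion vanishes up to rounding error even though the fit is not perfect.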
250. mpossible to reproduce unless all details of the estimation procedure are carefully recorded Chapter 18 GMM estimation 143 Example 18 5 Estimation of the Consumption Based Asset Pricing Model output Model 1 1 step GMM estimates using the 465 observations 1959 04 1997 12 e d ewr consratA alpha 1 1 PARAMETER ESTIMATE STDERROR T STAT P VALUE alpha 3 14475 6 84439 0 459 0 64590 d 0 999215 0 0121044 82 549 lt 0 00001 GMM criterion 2778 08 Model 2 1 step GMM estimates using the 465 observations 1959 04 1997 12 e d ewr consratA alpha 1 1 PARAMETER ESTIMATE STDERROR T STAT P VALUE alpha 0 398194 2 26359 0 176 0 86036 d 0 993180 0 00439367 226 048 lt 0 00001 GMM criterion 14 247 Model 3 Iterated GMM estimates using the 465 observations 1959 04 1997 12 e d ewr consratA alpha 1 1 PARAMETER ESTIMATE STDERROR T STAT P VALUE alpha 0 344325 2 21458 0 155 0 87644 d 0 991566 0 00423620 234 070 lt 0 00001 GMM criterion 5491 78 J test Chi square 3 11 8103 p value 0 0081 Model 4 Iterated GMM estimates using the 465 observations 1959 04 1997 12 e d ewr consratA alpha 1 1 PARAMETER ESTIMATE STDERROR T STAT P VALUE alpha 0 344315 2 21359 0 156 0 87639 d 0 991566 0 00423469 234 153 lt 0 00001 GMM criterion 5491 78 J test Chi square 3 11 8103 p value 0 0081 Chapter 19 Model selection criteria 19 1 Introduction In some contexts the econometrician chooses
251. n

• P-value finder: Look up p-values from the Gaussian, t, chi-square, F, gamma, binomial or Poisson distributions. See also the pvalue command in the Gretl Command Reference.
• Distribution graphs: Produce graphs of various probability distributions. In the resulting graph window, the pop-up menu includes an item "Add another curve", which enables you to superimpose a further plot (for example, you can draw the t distribution with various different degrees of freedom).
• Test statistic calculator: Calculate test statistics and p-values for a range of common hypothesis tests (population mean, variance and proportion; difference of means, variances and proportions).
• Nonparametric tests: Calculate test statistics for various nonparametric tests (Sign test, Wilcoxon rank sum test, Wilcoxon signed rank test, Runs test).
• Seed for random numbers: Set the seed for the random number generator (by default this is set based on the system time when the program is started).
• Command log: Open a window containing a record of the commands executed so far.
• Gretl console: Open a console window into which you can type commands as you would using the command-line program, gretlcli (as opposed to using point-and-click).
• Start Gnu R: Start R (if it is installed on your system), and load a copy of the data set currently open in gretl. See Appendix E.
• Sort variables: Rearrange the listing of variables in the main
252. n --c --ct --ctt. For each case, approximate p-values are calculated by means of the algorithm developed in MacKinnon (1996).

The gretl command used to perform the test is adf; for example

    adf 4 x1 --c --ct

would compute the test statistic as the t-statistic for the coefficient on $y_{t-1}$ in equation 20.8 with $p = 4$ in the two cases $\mu_t = \mu_0$ and $\mu_t = \mu_0 + \mu_1 t$.

The number of lags ($p$ in equation 20.8) should be chosen so as to ensure that (20.8) is a parametrization flexible enough to represent adequately the short-run persistence of $\Delta y_t$. Setting $p$ too low results in size distortions in the test, whereas setting $p$ too high would lead to low power. As a convenience to the user, the parameter $p$ can be automatically determined. Setting $p$ to a negative number triggers a sequential procedure that starts with $|p|$ lags and decrements $p$ until the t-statistic for the coefficient on the last lag exceeds 1.645 in absolute value.

The KPSS test

The KPSS test (Kwiatkowski, Phillips, Schmidt and Shin, 1992) is a unit-root test in which the null hypothesis is opposite to that in the ADF test: under the null, the series in question is stationary; the alternative is that the series is $I(1)$.

The basic intuition behind this test statistic is very simple: if $y_t$ can be written as $y_t = \mu + u_t$, where $u_t$ is some zero-mean stationary process, then not only does the sample average of the $y_t$'s provide a consistent estimator of $\mu$, but the long-run variance of $u_t$ is a well-defined, finite number. Neither of these prop
253. n, please see the documentation for the arbond command in the Gretl Command Reference and the arbond91 example file supplied with gretl.

15.3 Panel illustration: the Penn World Table

The Penn World Table (homepage at pwt.econ.upenn.edu) is a rich macroeconomic panel dataset, spanning 152 countries over the years 1950-1992. The data are available in gretl format; please see the gretl data site (this is a free download, although it is not included in the main gretl package).

Example 15.2 opens pwt56_60_89.gdt, a subset of the PWT containing data on 120 countries, 1960-89, for 20 variables, with no missing observations (the full data set, which is also supplied in the pwt package for gretl, has many missing observations). Total growth of real GDP, 1960-89, is calculated for each country and regressed against the 1960 level of real GDP, to see if there is evidence for "convergence" (i.e. faster growth on the part of countries starting from a low base).

Example 15.2: Use of the Penn World Table

    open pwt56_60_89.gdt
    # for 1989 (the last obs), lag 29 gives 1960, the first obs
    genr gdp60 = RGDPL(-29)
    # find total growth of real GDP over 30 years
    genr gdpgro = (RGDPL - gdp60)/gdp60
    # restrict the sample to a 1989 cross-section
    smpl --restrict YEAR=1989
    # convergence: did countries with a lower base grow faster?
    ols gdpgro const gdp60
    # result: No! Try an inverse relationship
    genr gdp60inv = 1/gdp60
    ols gdpgro co
254. n the covariance matrix is computed from a numerical approximation to the Hessian at convergence. If the --robust option is selected, the quasi-ML "sandwich" estimator is used:

$$\widehat{\mathrm{Var}}_{QML}(\hat{\theta}) = H(\hat{\theta})^{-1} G'(\hat{\theta}) G(\hat{\theta}) H(\hat{\theta})^{-1}$$

where $H$ denotes the numerical approximation to the Hessian.

17.2 Gamma estimation

Suppose we have a sample of $T$ independent and identically distributed observations from a Gamma distribution. The density function for each observation $x_t$ is

$$f(x_t) = \frac{\alpha^p}{\Gamma(p)} x_t^{p-1} \exp(-\alpha x_t) \qquad (17.2)$$

The log-likelihood for the entire sample can be written as the logarithm of the joint density of all the observations. Since these are independent and identical, the joint density is the product of the individual densities, and hence its log is

$$L(\alpha, p) = \sum_{t=1}^{T} \log\left[\frac{\alpha^p}{\Gamma(p)} x_t^{p-1} \exp(-\alpha x_t)\right] = \sum_{t=1}^{T} \ell_t \qquad (17.3)$$

where

$$\ell_t = p \cdot \log(\alpha x_t) - \gamma(p) - \log x_t - \alpha x_t$$

and $\gamma(\cdot)$ is the log of the gamma function. In order to estimate the parameters $\alpha$ and $p$ via ML, we need to maximize (17.3) with respect to them. The corresponding gretl code snippet is

    scalar alpha = 1
    scalar p = 1
    mle logl = p*ln(alpha * x) - lngamma(p) - ln(x) - alpha * x
    end mle

The two statements alpha = 1 and p = 1 are necessary to ensure that the variables p and alpha exist before the computation of logl is attempted. The values of these variables will be changed by the execution of the mle command; upon successful completion, they wi
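As a cross-check on the per-observation log-likelihood term used in the gretl snippet, here is a small Python sketch that evaluates the sample log-likelihood for given parameter values. It is only an illustration: the function name `gamma_loglik` is hypothetical, and `math.lgamma` plays the role of the log of the gamma function.

```python
import math


def gamma_loglik(alpha, p, x):
    """Sample log-likelihood of i.i.d. Gamma observations.

    Each term is p*log(alpha*x_t) - lgamma(p) - log(x_t) - alpha*x_t,
    which is the log of alpha^p / Gamma(p) * x_t^(p-1) * exp(-alpha*x_t).
    Requires alpha > 0, p > 0 and all x_t > 0.
    """
    return sum(
        p * math.log(alpha * xt) - math.lgamma(p) - math.log(xt) - alpha * xt
        for xt in x
    )


if __name__ == "__main__":
    # With p = 1 and alpha = 1 the density is exp(-x), so each term is -x_t.
    print(gamma_loglik(1.0, 1.0, [0.5, 2.0]))
```

The special case p = 1, alpha = 1 (the unit exponential) gives a quick sanity check, because every term then reduces to minus the observation.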
255. n append a single quote to obtain the transpose.

To specify a matrix in terms of data series the syntax is, for example,

    matrix A = { x1, x2, x3 }

where the names of the variables are separated by commas. Besides names of existing variables, you can use expressions that evaluate to a series. For example, given a series x you could do

    matrix A = { x, x^2 }

Each variable occupies a column (and there can only be one variable per column). You cannot use the semicolon as a row separator in this case: if you want the series arranged in rows, append the transpose symbol. The range of data values included in the matrix depends on the current setting of the sample range.

While gretl's built-in statistical functions for data series are capable of handling missing values, the matrix arithmetic functions are not. When you build a matrix from series that include missing values, observations for which at least one series has a missing value are skipped.

Instead of giving an explicit list of variables, you may instead provide the name of a saved list (see Chapter 11), as in

    list xlist = x1 x2 x3
    matrix A = { xlist }

When you provide a named list, the data series are by default placed in columns, as is natural in an econometric context: if you want them in rows, append the transpose symbol.

As a special case of constructing a matrix from a list of variables, you can say

    matrix A = { dataset }

Thi
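The rule for missing values (an observation is dropped from the matrix as soon as any one of the series is missing there) can be illustrated with a short Python sketch. The function name `series_to_matrix` and the use of `None` for a missing value are assumptions of this example, not gretl conventions.

```python
def series_to_matrix(*series):
    """Stack equal-length series as columns of a matrix (list of rows).

    Rows in which at least one series has a missing value (None)
    are skipped entirely, so the result contains complete cases only.
    """
    rows = []
    for obs in zip(*series):
        if any(v is None for v in obs):
            continue  # drop the whole observation
        rows.append(list(obs))
    return rows


if __name__ == "__main__":
    x1 = [1, 2, None, 4]
    x2 = [10, None, 30, 40]
    print(series_to_matrix(x1, x2))
```

Note that the resulting matrix can have fewer rows than the current sample has observations, which matters when its rows are later matched against other series.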
256. n boundaries: in this case, for example, both $\alpha$ and $p$ must be positive numbers. Gretl does not check for this: it is the user's responsibility to ensure that the function is always evaluated at an admissible point in the parameter space during the iterative search for the maximum. An effective technique is to define a variable for checking that the parameters are admissible and setting the log-likelihood as undefined if the check fails. An example, which uses the conditional assignment operator, follows:

    scalar m = mean(x)
    scalar alpha = m/var(x)
    scalar p = m*alpha
    mle logl = check ? p*ln(ax) - lngamma(p) - ln(x) - ax : NA
        series ax = alpha*x
        scalar check = (alpha>0) & (p>0)
        params alpha p
    end mle

17.3 Stochastic frontier cost function

When modeling a cost function, it is sometimes worthwhile to incorporate explicitly into the statistical model the notion that firms may be inefficient, so that the observed cost deviates from the theoretical figure not only because of unobserved heterogeneity between firms, but also because two firms could be operating at a different efficiency level, despite being identical in all other respects. In this case we may write

$$C_i = C_i^* + u_i + v_i$$

where $C_i$ is some variable cost indicator, $C_i^*$ is its "theoretical" value, $u_i$ is a zero-mean disturbance term and $v_i$ is the inefficiency term, which is supposed to be nonnegative by its very nature.

A linear specification for $C_i^*$ is often chosen. For ex
257. n case a non-null second argument is given, the specified matrix will be over-written with the auxiliary result. (It is not required that the existing matrix be of the right dimensions to receive the result.)

The function eigensym computes the eigenvalues, and optionally the right eigenvectors, of a symmetric $n \times n$ matrix. The eigenvalues are returned directly in a column vector of length $n$; if the eigenvectors are required, they are returned in an $n \times n$ matrix. For example:

    matrix V
    matrix E = eigensym(M, &V)
    matrix E = eigensym(M, null)

In the first case E holds the eigenvalues of M and V holds the eigenvectors. In the second, E holds the eigenvalues but the eigenvectors are not computed.

The function eigengen computes the eigenvalues, and optionally the eigenvectors, of a general $n \times n$ matrix. The eigenvalues are returned directly in an $n \times 2$ matrix, the first column holding the real components and the second column the imaginary components.

If the eigenvectors are required (that is, if the second argument to eigengen is not null), they are returned in an $n \times n$ matrix. The column arrangement of this matrix is somewhat non-trivial: the eigenvectors are stored in the same order as the eigenvalues, but the real eigenvectors occupy one column, whereas complex eigenvectors take two (the real part comes first); the total number of columns is still $n$, because the conjugate eigenvector is skipped.

Example 12.1
258. n for Arkansas. That's all very well, but these markers don't tell us anything about the date of the observation. To rectify this we could do:

    genr time
    genr year = 1960 + (5 * time)
    genr markers = "%s:%d", marker, year

The first line generates a 1-based index representing the period of each observation, and the second line uses the time variable to generate a variable representing the year of the observation. The third line contains this special feature: if (and only if) the name of the new "variable" to generate is markers, the portion of the command following the equals sign is taken as a C-style format string (which must be wrapped in double quotes), followed by a comma-separated list of arguments. The arguments will be printed according to the given format to create a new set of observation markers. Valid arguments are either the names of variables in the dataset, or the string marker, which denotes the pre-existing observation marker. The format specifiers which are likely to be useful in this context are %s for a string and %d for an integer. Strings can be truncated: for example %.3s will use just the first three characters of the string. To chop initial characters off an existing observation marker when constructing a new one, you can use the syntax marker + n, where n is a positive integer: in that case the first n characters will be skipped.

After the commands above are processed, then, the observation markers wil
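The marker-building rules described above map closely onto C-style formatting in other languages. The following Python sketch imitates them with the `%` operator; the function name `make_markers` is hypothetical, and the year series is assumed to be computed as 1960 plus five times a 1-based period index.

```python
def make_markers(markers, years, fmt="%s:%d", skip=0):
    """Build new observation markers from old markers and a year series.

    fmt  : C-style format string, e.g. "%s:%d" or "%.3s:%d" to truncate
    skip : number of initial characters to chop off the old marker
           (the counterpart of the 'marker + n' syntax in the text)
    """
    return [fmt % (m[skip:], y) for m, y in zip(markers, years)]


if __name__ == "__main__":
    old = ["AR", "AR", "AR"]
    # Assumed year rule: 1960 + 5 * period, with periods numbered from 1.
    years = [1960 + 5 * t for t in (1, 2, 3)]
    print(make_markers(old, years))
    print(make_markers(["Arkansas"], [1965], fmt="%.3s:%d"))
    print(make_markers(["Arkansas"], [1965], skip=3))
```

Truncation (`%.3s`) keeps the leading characters of a marker, whereas the skip argument drops them, so the two options are complementary.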
259. n the actual data values plus additional information such as the names and descriptions of variables, the frequency of the data, and so on.

Most users will probably not have need to read or write such files other than via gretl itself, but if you want to manipulate them using other software tools you should examine the DTD, and also take a look at a few of the supplied practice data files: data4-1.gdt gives a simple example; data4-10.gdt is an example where observation labels are included.

A.2 Traditional ESL format

For backward compatibility, gretl can also handle data files in the "traditional" format inherited from Ramanathan's ESL program. In this format (which was the default in gretl prior to version 0.98) a data set is represented by two files. One contains the actual data and the other information on how the data should be read. To be more specific:

1. Actual data: A rectangular matrix of white-space separated numbers. Each column represents a variable, each row an observation on each of the variables (spreadsheet style). Data columns can be separated by spaces or tabs. The filename should have the suffix .gdt. By default the data file is ASCII (plain text). Optionally it can be gzip-compressed to save disk space. You can insert comments into a data file: if a line begins with the hash mark, #, the entire line is ignored. This is consistent with gnuplot and octave data files.

2. Header: The data file must be accompanied by a header file
260. n the lag structure

The standard way to specify an ARMA model in gretl is via the AR and MA orders, p and q respectively. In this case all lags from 1 to the given order are included. In some cases one may wish to include only certain specific AR and/or MA lags. This can be done in either of two ways.

• One can construct a matrix containing the desired lags (positive integer values) and supply the name of this matrix in place of p or q.
• One can give a space-separated list of lags, enclosed in braces, in place of p or q.

The following code illustrates these options:

    matrix pvec = {1, 4}
    arma pvec 1 ; y
    arma {1 4} 1 ; y

Both forms above specify an ARMA model in which AR lags 1 and 4 are used (but not 2 and 3).

This facility is available only for the non-seasonal component of the ARMA specification.

Differencing and ARIMA

The above discussion presupposes that the time series $y_t$ has already been subjected to all the transformations deemed necessary for ensuring stationarity (see also section 20.3). Differencing is the most common of these transformations, and gretl provides a mechanism to include this step into the arma command: the syntax

    arma p d q ; y

would estimate an ARMA(p, q) model on $\Delta^d y_t$. It is functionally equivalent to

    series tmp = y
    loop for i=1..d
        tmp = diff(tmp)
    end loop
    arma p q ; tmp

except with regard to forecasting after estimation (see below).

When the series $y_t$ is di
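The repeated-differencing loop that the text gives as the equivalent of the differencing order can be sketched in Python as follows. This is an illustration only: the helper names `diff` and `difference` are hypothetical, and unlike a gretl series (which keeps its length and records missing values at the start) the lists here simply get shorter by one element per pass.

```python
def diff(series):
    """First difference: element t minus element t-1."""
    return [b - a for a, b in zip(series, series[1:])]


def difference(series, d):
    """Apply the first-difference operator d times."""
    tmp = list(series)
    for _ in range(d):
        tmp = diff(tmp)
    return tmp


if __name__ == "__main__":
    y = [1, 4, 9, 16, 25]       # y_t = t^2
    print(difference(y, 1))     # first differences
    print(difference(y, 2))     # second differences are constant
```

A quadratic trend becomes constant after two differences, which is the kind of non-stationarity that a differencing order of 2 is meant to remove.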
261. n the maximum number of iterations is reached, whichever comes first.

Let $k$ denote the number of parameters being estimated. The maximum number of iterations is $100 \times (k+1)$ when analytical derivatives are given, and $200 \times (k+1)$ when numerical derivatives are used.

Let $\epsilon$ denote a small number. The iteration is deemed to have converged if at least one of the following conditions is satisfied:

• Both the actual and predicted relative reductions in the error sum of squares are at most $\epsilon$.
• The relative error between two consecutive iterates is at most $\epsilon$.

The default value of $\epsilon$ is the machine precision to the power 3/4, but it can be adjusted using the set command with the parameter nls_toler. For example

    set nls_toler .0001

will relax the value of $\epsilon$ to 0.0001.

16.6 Details on the code

The underlying engine for NLS estimation is based on the minpack suite of functions, available from netlib.org. Specifically, the following minpack functions are called:

    lmder    Levenberg-Marquardt algorithm with analytical derivatives
    chkder   Check the supplied analytical derivatives
    lmdif    Levenberg-Marquardt algorithm with numerical derivatives
    fdjac2   Compute final approximate Jacobian when using numerical derivatives
    dpmpar   Determine the machine precision

On successful completion of the Levenberg-Marquardt iteration, a Gauss-Newton regression is used to calculate the covariance matrix for the parameter estimates. If the --robust flag is given a
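The iteration limit and the two-part convergence rule described above can be written down compactly. The Python sketch below is not gretl or minpack code: the function names are hypothetical, and the machine precision is assumed to be the IEEE double-precision epsilon reported by `sys.float_info.epsilon`.

```python
import sys


def max_iterations(k, analytical):
    """Iteration limit: 100*(k+1) with analytical derivatives, else 200*(k+1)."""
    return (100 if analytical else 200) * (k + 1)


def default_tolerance():
    """Machine precision raised to the power 3/4 (assumes IEEE doubles)."""
    return sys.float_info.epsilon ** 0.75


def converged(actual_reduction, predicted_reduction, relative_step, tol):
    """True if at least one of the two convergence conditions holds."""
    small_reduction = actual_reduction <= tol and predicted_reduction <= tol
    small_step = relative_step <= tol
    return small_reduction or small_step


if __name__ == "__main__":
    print(max_iterations(3, analytical=True))
    print(default_tolerance())
    print(converged(1e-13, 1e-13, 1.0, default_tolerance()))
```

Under this assumption the default tolerance is of the order of 1e-12, which shows how much looser a setting such as 0.0001 is.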
262. n use the function strlen, which retrieves the length of the string, as in

    string temp = getenv("TEMP")
    Saved empty string as 'temp'
    scalar x = strlen(temp)
    Generated scalar x (ID 2) = 0

The function isstring returns 1 if its argument is the name of a string variable, 0 otherwise. However, if the return is 1 the string may still be empty.

At present the getenv function can only be used on the right-hand side of a string assignment, as in the above illustrations.

Capturing strings via the shell

If shell commands are enabled in gretl, you can capture the output from such commands using the syntax

    string stringname = $(shellcommand)

That is, you enclose a shell command in parentheses, preceded by a dollar sign.

Reading from a file into a string

You can read the content of a file into a string variable using the syntax

    string stringname = readfile(filename)

The filename field may include components that are string variables. For example

    string foo = readfile(@x12adir/QNC.rts)

The strstr function

Invocation of this function takes the form

    string stringname = strstr(s1, s2)

The effect is to search s1 for the first occurrence of s2. If no such occurrence is found, an empty string is returned; otherwise the portion of s1 starting with s2 is returned. For example:

    string hw = "hello world"
    Saved string as 'hw'
    string w = strstr(hw, "o")
    Saved string as 'w'
    print Qu
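The search behaviour described for strstr (return the portion of the first string that starts at the first occurrence of the second, or an empty string if there is none) can be mimicked in a few lines of Python. This is only a sketch of the described semantics, with a function of the same name defined locally for the example.

```python
def strstr(s1, s2):
    """Return the part of s1 starting at the first occurrence of s2.

    An empty string is returned when s2 does not occur in s1.
    """
    pos = s1.find(s2)
    return "" if pos < 0 else s1[pos:]


if __name__ == "__main__":
    hw = "hello world"
    print(strstr(hw, "o"))   # starts at the first "o", in "hello"
    print(repr(strstr(hw, "z")))
```

Because the search stops at the first match, the result for "o" begins inside "hello" rather than at "world".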
263. ncoeff
    matrix b = $coeff[1:k-2]
    a1 = $coeff[k-1]
    a2 = $coeff[k]
    /* Wooldridge illustrates the 'choice' effect in the ordered probit
       by reference to a single, non-black male aged 60, with 13.5 years
       of education, income in the range $50K - $75K and wealth of $200K,
       participating in a plan with profit sharing. */
    matrix X = {60, 13.5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 200, 1}
    # with 'choice' = 0
    scalar Xb = (0 ~ X) * b
    P0 = cdf(N, a1 - Xb)
    P50 = cdf(N, a2 - Xb) - P0
    P100 = 1 - cdf(N, a2 - Xb)
    E0 = 50 * P50 + 100 * P100
    # with 'choice' = 1
    Xb = (1 ~ X) * b
    P0 = cdf(N, a1 - Xb)
    P50 = cdf(N, a2 - Xb) - P0
    P100 = 1 - cdf(N, a2 - Xb)
    E1 = 50 * P50 + 100 * P100
    printf "\nWith choice, E(y) = %.2f, without E(y) = %.2f\n", E1, E0
    printf "Estimated choice effect via ML = %.2f, OLS = %.2f\n", E1 - E0, choice_ols
Chapter 22 Discrete and censored dependent variables
can be handled via the mle command (see chapter 17). We give here an example of a multinomial logit model. Let the dependent variable, $y_i$, take on integer values $0, 1, \ldots, p$. The probability that $y_i = k$ is given by
$$P(y_i = k \mid x_i) = \frac{\exp(x_i \beta_k)}{\sum_{j=0}^{p} \exp(x_i \beta_j)}$$
For the purpose of identification one of the outcomes must be taken as the "baseline"; it is usually assumed that $\beta_0 = 0$, in which case
$$P(y_i = k \mid x_i) = \frac{\exp(x_i \beta_k)}{1 + \sum_{j=1}^{p} \exp(x_i \beta_j)}$$
and
$$P(y_i = 0 \mid x_i) = \frac{1}{1 + \sum_{j=1}^{p} \exp(x_i \beta_j)}.$$
Example 22.4 reproduces Table 15.2 in Wooldridge (2002a), based on
264. nd Weil, QJE 1992
    open mrw.gdt
    genr lny = log(gdp85)
    genr ngd = 0.05 + (popgrow/100.0)
    genr lngd = log(ngd)
    genr linv = log(inv/100.0)
    # generate variable for testing Solow restriction
    genr x3 = linv - lngd
    # set sample to non-oil producing countries
    smpl nonoil --dummy
    model1 <- ols lny const linv lngd
    genr essu = $ess
    genr dfu1 = $df
    # restricted regression
    ols lny const x3
    genr F1 = ($ess - essu)/(essu/dfu1)
    # set sample to the "better data" countries
    smpl intermed --dummy --replace
    model2 <- ols lny const linv lngd
    genr essu = $ess
    genr dfu2 = $df
Figure 3.1: Script window, editing a command file
The toolbar at the top of the script window offers the following functions (left to right):
1. Save the file
2. Save the file under a specified name
3. Print the file (this option is not available on all platforms)
4. Execute the commands in the file
5. Copy selected text
6. Paste the selected text
7. Find and replace text
8. Undo the last Paste or Replace action
9. Help (if you place the cursor in a command word and press the question mark you will get help on that command)
10. Close the window
When you execute the script, by clicking on the Execute icon or by pressing Ctrl-r, all output is directed to a single window, where it can be edited, saved or copied to the clipboard. To learn more about the possibilities of scripting, take a look at the gretl Help item "Command reference".
265. nd don't mind waiting for the results) you can increase the limit using the set command with parameter rq_maxiter, as in
    set rq_maxiter 5000
Part III Technical details
Chapter 24 Gretl and TeX
24.1 Introduction
TeX, initially developed by Donald Knuth of Stanford University and since enhanced by hundreds of contributors around the world, is the gold standard of scientific typesetting. Gretl provides various hooks that enable you to preview and print econometric results using the TeX engine, and to save output in a form suitable for further processing with TeX.
This chapter explains the finer points of gretl's TeX-related functionality. The next section describes the relevant menu items; section 24.3 discusses ways of fine-tuning TeX output; section 24.4 explains how to handle the encoding of characters not found in English; and section 24.5 gives some pointers on installing (and learning) TeX if you do not already have it on your computer. (Just to be clear: TeX is not included with the gretl distribution; it is a separate package, including several programs and a large number of supporting files.)
Before proceeding, however, it may be useful to set out briefly the stages of production of a final document using TeX. For the most part you don't have to worry about these details, since, in regard to previewing at any rate, gretl handles them for you. But having some grasp of what is going on behind the scenes will ena
266. nd to make the above operational.
Example 18.1: OLS via GMM
    /* initialize stuff */
    series e = 0
    scalar beta = 0
    matrix V = I(1)
    /* proceed with estimation */
    gmm
      series e = y - x*beta
      orthog e ; x
      weights V
      params beta
    end gmm
We feed gretl the necessary ingredients for GMM estimation in a command block, starting with gmm and ending with end gmm. After the end gmm statement two mutually exclusive options can be specified: --two-step or --iterate, whose meaning should be obvious.
Three elements are compulsory within a gmm block:
1. one or more orthog statements
2. one weights statement
3. one params statement
The three elements should be given in the stated order.
The orthog statements are used to specify the orthogonality conditions. They must follow the syntax
    orthog x ; Z
where x may be a series, matrix or list of series and Z may also be a series, matrix or list. In example 18.1, the series e holds the "residuals" and the series x holds the regressor. If x had been a list (a matrix), the orthog statement would have generated one orthogonality condition for each element (column) of x. Note the structure of the orthogonality condition: it is assumed that the term to the left of the semicolon represents a quantity that depends on the estimated parameters (and so must be updated in the process of iterative estimation), while the term on the right is a constant function of the data.
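As a hedged illustration of the remark about lists, the sketch below simulates a small dataset and places a list of three series to the right of the semicolon, so that a single orthog statement yields three orthogonality conditions. The simulated data, the names x1, x2, Z, b0, b1, b2 and W, and the parameter values are all assumptions made for the example; the residual is defined on the gmm line itself, in the style of Example 18.2. Since the system is exactly identified, the GMM estimates should agree with the OLS ones up to the optimizer's tolerance.

```gretl
# Sketch only: simulated data, illustrative names and values
nulldata 200
set seed 20240
series x1 = normal()
series x2 = normal()
series y = 1 + 0.5*x1 - 0.3*x2 + normal()

# right-hand term of the orthogonality conditions: a list of three series
list Z = const x1 x2

# starting values for the residual, the parameters and the weights matrix
series e = 0
scalar b0 = 0
scalar b1 = 0
scalar b2 = 0
matrix W = I(3)

# one orthog statement, three conditions: E(e*const) = E(e*x1) = E(e*x2) = 0
gmm e = y - b0 - b1*x1 - b2*x2
  orthog e ; Z
  weights W
  params b0 b1 b2
end gmm

# for comparison: OLS on the same regressors
ols y Z
```

Note that the weights matrix must have as many rows and columns as there are orthogonality conditions, which is why W is 3 x 3 here while V was 1 x 1 in Example 18.1.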
267. ne replicates Table 22.7 in Greene (2003), while the second one replicates table 17.1 in Wooldridge (2002a).
(Footnote 2) Note that the estimates given by gretl do not coincide with those found in the printed volume. They do, however, match those found on the errata web page for Greene's book: http://pages.stern.nyu.edu/~wgreene/Text/Errata/ERRATA5.htm
Example 22.5: Interval model on artificial data
Input:
    nulldata 100
    # generate artificial data
    set seed 201449
    x = normal()
    epsilon = 0.2*normal()
    ystar = 1 + x + epsilon
    lo_bound = floor(ystar)
    hi_bound = ceil(ystar)
    # run the interval model
    intreg lo_bound hi_bound const x
    # estimate ystar
    gen_resid = $uhat
    yhat = $yhat + gen_resid
    corr ystar yhat
Output (selected portions):
    Model 1: Interval estimates using the 100 observations 1-100
    Lower limit: lo_bound, Upper limit: hi_bound

                 coefficient   std. error   t-ratio    p-value
      const       0.993762     0.0338325     29.37    1.22e-189
      x           0.986662     0.0319959     30.84    8.34e-209

    Chi-square(1)        950.9270   p-value            8.3e-209
    Log-likelihood      -44.21258   Akaike criterion   94.42517
    Schwarz criterion    102.2407   Hannan-Quinn       97.58824

    sigma = 0.223273
    Left-unbounded observations: 0
    Right-unbounded observations: 0
    Bounded observations: 100
    Point observations: 0

    corr(ystar, yhat) = 0.98960092
    Under the null hypothesis of no correlation:
     t(98) = 68.1071, with two-tailed p-value 0.0000
268. next step is to specify the number of cross-sectional units in the data set. The third option, "Use index variables", is applicable if the data set contains two variables that index the units and the time periods respectively; the next step is then to select those variables. For example, a data file might contain a country code variable and a variable representing the year of the observation. In that case gretl can reconstruct the panel structure of the data regardless of how the observation rows are organized.
The setobs command has options that parallel those in the graphical interface. If suitable index variables are available you can do, for example
    setobs unitvar timevar --panel-vars
where unitvar is a variable that indexes the units and timevar is a variable indexing the periods. Alternatively you can use the form setobs freq 1:1 structure, where freq is replaced by the "block size" of the data (that is, the number of periods in the case of stacked time series, or the number of units in the case of stacked cross-sections) and structure is either --stacked-time-series or --stacked-cross-section. Two examples are given below: the first is suitable for a panel in the form of stacked time series with observations from 20 periods; the second for stacked cross sections with 5 units.
    setobs 20 1:1 --stacked-time-series
    setobs 5 1:1 --stacked-cross-section
Panel data arranged by variable
Publicly available panel data sometimes come arrange
269. ng the first column of Table 6 on page 825 of Brand and Cassola (2004). The results show that weak exogeneity might perhaps be accepted for the long-term interest rate and real GDP (p-values 0.07 and 0.08 respectively).
Identification and testability
One point regarding VECM restrictions that can be confusing at first is that identification (does the restriction identify the system?) and testability (is the restriction testable?) are quite separate matters. Restrictions can be identifying but not testable; less obviously, they can be testable but not identifying.
This can be seen quite easily in relation to a rank-1 system. The restriction $\beta_1 = 1$ is identifying (it pins down the scale of $\beta$) but, being a pure scaling, it is not testable. On the other hand, the restriction $\beta_1 + \beta_2 = 0$ is testable (the system with this requirement imposed will almost certainly have a lower maximized likelihood) but it is not identifying; it still leaves open the scale of $\beta$.
We said above that the number of restrictions must equal at least $r^2$, where r is the cointegrating
Chapter 21 Cointegration and Vector Error Correction Models
Example 21.3: Testing for weak exogeneity
Input:
    restrict
      a1 = 0
    end restrict
    ts_m = 2*(ll0 - $rlnl)
    restrict
      a2 = 0
    end restrict
    ts_p = 2*(ll0 - $rlnl)
    restrict
      a3 = 0
    end restrict
    ts_l = 2*(ll0 - $rlnl)
    restrict
      a4 = 0
    end restrict
    ts_s = 2*(ll0 - $rlnl)
    restrict
      a5 = 0
    end restrict
    ts_y = 2*(ll0
270. no change, for both the starting and ending point. For example
    smpl +1 ;
will advance the starting observation by one while preserving the ending observation, and
    smpl +2 -1
will both advance the starting observation by two and retard the ending observation by one.
An important feature of "setting the sample" as described above is that it necessarily results in the selection of a subset of observations that are contiguous in the full dataset. The structure of the dataset is therefore unaffected (for example, if it is a quarterly time series before setting the sample, it remains a quarterly time series afterwards).
Chapter 6 Sub-sampling a dataset
6.3 Restricting the sample
By "restricting the sample" we mean selecting observations on the basis of some Boolean (logical) criterion, or by means of a random number generator. This is likely to be most relevant for cross-sectional or panel data.
Suppose we have data on a cross-section of individuals, recording their gender, income and other characteristics. We wish to select for analysis only the women. If we have a gender dummy variable with value 1 for men and 0 for women we could do
    smpl gender=0 --restrict
to this effect. Or suppose we want to restrict the sample to respondents with incomes over $50,000. Then we could use
    smpl income>50000 --restrict
A question arises here. If we issue the two commands above in sequence, what do we end up with in our sub-sample?
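One way to explore the question is to count observations after each step on simulated data. The sketch below is only an illustration: the dataset, the series gender and income and their values are made up, and it assumes a reasonably recent gretl (hence the == comparison operator). To the best of our knowledge successive --restrict commands are cumulative, so the second restriction is applied on top of the first, while the --replace option discards any restriction already in force.

```gretl
# Sketch only: simulated cross-section, illustrative names and values
nulldata 500
set seed 97531
series gender = uniform() > 0.5          # 1 = man, 0 = woman
series income = 20000 + 60000*uniform()

# two restrictions in sequence: the second is applied within the first
smpl gender == 0 --restrict
smpl income > 50000 --restrict
scalar n_both = $nobs

# start again from the full dataset, income criterion only
smpl --full
smpl income > 50000 --restrict
scalar n_income = $nobs

# same criterion, but replacing an existing restriction
smpl --full
smpl gender == 0 --restrict
smpl income > 50000 --restrict --replace
scalar n_replace = $nobs

smpl --full
printf "women with income over 50000: %d\n", n_both
printf "all respondents with income over 50000: %d\n", n_income
printf "same, obtained via --replace: %d\n", n_replace
```

If the cumulative reading is right, the first count should be smaller than the other two, and the last two should coincide.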
271. ns available via genr, and some of the finer points of the command.
5.2 Long-run variance
As is well known, the variance of the average of T random variables $x_1, x_2, \ldots, x_T$ with equal variance $\sigma^2$ equals $\sigma^2/T$ if the data are uncorrelated. In this case, the sample variance of $x_t$ over the sample size provides a consistent estimator.
If, however, there is serial correlation among the $x_t$s, the variance of $\bar{X} = T^{-1} \sum_{t=1}^{T} x_t$ must be estimated differently. One of the most widely used statistics for this purpose is a nonparametric kernel estimator with the Bartlett kernel defined as
$$\hat{\omega}^2(k) = T^{-1} \sum_{t=k}^{T-k} \left[ \sum_{i=-k}^{k} w_i \,(x_t - \bar{X})(x_{t-i} - \bar{X}) \right] \qquad (5.1)$$
where the integer k is known as the window size and the $w_i$ terms are the so-called Bartlett weights, defined as $w_i = 1 - \frac{|i|}{k+1}$. It can be shown that, for k large enough, $\hat{\omega}^2(k)/T$ yields a consistent estimator of the variance of $\bar{X}$.
Gretl implements this estimator by means of the function lrvar(), which takes two arguments: the series whose long-run variance must be estimated and the scalar k. If k is negative, the popular choice $T^{1/3}$ is used.
5.3 Time-series filters
One sort of specialized function in genr is time-series filtering. In addition to the usual application of lags and differences, gretl provides fractional differencing and two filters commonly used in macroeconomics for trend-cycle decomposition: the Hodrick-Prescott filter (Hodrick and Prescott, 1997) and the Baxter-King bandpass filter (Baxter and King, 1999).
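The functions mentioned in sections 5.2 and 5.3 can be tried out on simulated data. The following is a minimal sketch under stated assumptions: the quarterly dataset, the AR(1) coefficient of 0.7, the window size of 4 and the series names are all invented for the example, and hpfilt() is assumed to return the cyclical component when called with its default settings.

```gretl
# Sketch only: simulated quarterly series, illustrative names and values
nulldata 200
setobs 4 1960:1 --time-series
set seed 1357
series u = normal()

# AR(1) process, serially correlated by construction
series x = 0
smpl +1 ;
x = 0.7*x(-1) + u
smpl --full

# long-run variance: explicit window size, then the default T^(1/3)
scalar lrv4 = lrvar(x, 4)
scalar lrvd = lrvar(x, -1)
printf "sample variance           = %g\n", var(x)
printf "long-run variance (k = 4) = %g\n", lrv4
printf "long-run variance (k < 0) = %g\n", lrvd

# Hodrick-Prescott decomposition: cycle from the filter, trend as remainder
series cyc = hpfilt(x)
series trend = x - cyc
```

With positive serial correlation one would expect both long-run variance figures to exceed the plain sample variance, which is the reason for using the kernel estimator in the first place.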
272. nsiderations, but rather from economic theory. A rational individual who must allocate his income between consumption and investment in a financial asset must in fact choose the consumption path of his whole lifetime, since investment translates into future consumption. It can be shown that an optimal consumption path should satisfy the following condition:
$$p\,U'(c_t) = \delta^k E\left[ r_{t+k}\, U'(c_{t+k}) \mid \mathcal{F}_t \right] \qquad (18.11)$$
where p is the asset price, $U(\cdot)$ is the individual's utility function, $\delta$ is the individual's subjective discount rate and $r_{t+k}$ is the asset's rate of return between time t and time t + k. $\mathcal{F}_t$ is the information set at time t; equation (18.11) says that the utility "lost" at time t by purchasing the asset instead of consumption goods must be matched by a corresponding increase in the (discounted) future utility of the consumption financed by the asset's return. Since the future is uncertain, the individual considers his expectation, conditional on what is known at the time when the choice is made.
We have said nothing about the nature of the asset, so equation (18.11) should hold whatever asset we consider; hence, it is possible to build a system of equations like (18.11) for each asset whose price we observe.
If we are willing to believe that
• the economy as a whole can be represented as a single gigantic and immortal representative individual, and
• the function $U(x) = \frac{x^{\alpha} - 1}{\alpha}$ is a faithful representation of the individual's preferences,
273. nst. The name of the list must start with a letter, and must be composed entirely of letters, numbers or the underscore character. The maximum length of the name is 15 characters; list names cannot contain spaces.
Once a named list has been created, it will be "remembered" for the duration of the gretl session, and can be used in the context of any gretl command where a list of variables is expected. One simple example is the specification of a list of regressors:
    list xlist = x1 x2 x3 x4
    ols y 0 xlist
To get rid of a list, you can use the following syntax:
    list xlist delete
Be careful: delete xlist will delete the variables contained in the list, so it implies data loss (which may not be what you want). On the other hand, list xlist delete will simply "undefine" the xlist identifier and the variables themselves will not be affected.
Lists can be modified in two ways. To redefine an existing list altogether, use the same syntax as for creating a list. For example
    list xlist = 1 2 3
    xlist = 4 5 6
After the second assignment, xlist contains just variables 4, 5 and 6.
To append or prepend variables to an existing list, we can make use of the fact that a named list stands in for a "longhand" list. For example, we can do
    list xlist = xlist 5 6 7
    xlist = 9 10 xlist 11 12
Another option for appending a term (or a list) to an existing list is to use +=, as in
    xlist += cpi
To dr
274. nst Z5 (columns)
           [   1][   2][   3][   4][   5]  TOT.
    [   0]    20    91    75    93    36    315
    [   1]    28    73    54    97    34    286
    TOTAL     48   164   129   190    70    601
    Pearson chi-square test = 5.48233 (4 df, p-value = 0.241287)
Chapter 8 Discrete variables
    Cross-tabulation of Z1 (rows) against Z6 (columns)
           [   9][  12][  14][  16][  17][  18][  20]  TOT.
    [   0]     4    36   106    70    52    45     2    315
    [   1]     3     8    48    45    37    67    78    286
    TOTAL      7    44   154   115    89   112    80    601
    Pearson chi-square test = 123.177 (6 df, p-value = 3.50375e-24)

    Cross-tabulation of Z4 (rows) against Z5 (columns)
           [   1][   2][   3][   4][   5]  TOT.
    [   0]    17    60    35    45    14    171
    [   1]    31   104    94   145    56    430
    TOTAL     48   164   129   190    70    601
    Pearson chi-square test = 11.1615 (4 df, p-value = 0.0248074)

    Cross-tabulation of Z4 (rows) against Z6 (columns)
           [   9][  12][  14][  16][  17][  18][  20]  TOT.
    [   0]     1     8    39    47    30    32    14    171
    [   1]     6    36   115    68    59    80    66    430
    TOTAL      7    44   154   115    89   112    80    601
    Pearson chi-square test = 18.3426 (6 df, p-value = 0.0054306)
Pearson's $\chi^2$ test for independence is automatically displayed, provided that all cells have expected frequencies under independence greater than $10^{-7}$. However, a common rule of thumb states that this statistic is valid only if the expected frequency is 5 or greater for at least 80 percent of the cells. If this condition is not met a warning is printed.
Additionally, the --row or --column options can be given: in this case, the output displays row or column percentages, respectively.
If you want to cut and pa
275. nst gdp60inv
    # no again! Try treating Africa as special?
    genr afdum = (CCODE = 1)
    genr afslope = afdum * gdp60
    ols gdpgro const afdum gdp60 afslope
Chapter 16 Nonlinear least squares
16.1 Introduction and examples
Gretl supports nonlinear least squares (NLS) using a variant of the Levenberg-Marquardt algorithm. The user must supply a specification of the regression function; prior to giving this specification the parameters to be estimated must be "declared" and given initial values. Optionally, the user may supply analytical derivatives of the regression function with respect to each of the parameters. The tolerance (criterion for terminating the iterative estimation procedure) can be adjusted using the set command.
The syntax for specifying the function to be estimated is the same as for the genr command. Here are two examples, with accompanying derivatives.
Example 16.1: Consumption function from Greene
    nls C = alpha + beta * Y^gamma
      deriv alpha = 1
      deriv beta = Y^gamma
      deriv gamma = beta * Y^gamma * log(Y)
    end nls
Example 16.2: Nonlinear function from Russell Davidson
    nls y = alpha + beta * x1 + (1/beta) * x2
      deriv alpha = 1
      deriv beta = x1 - x2/(beta*beta)
    end nls
Note the command words nls (which introduces the regression function), deriv (which introduces the specification of a derivative), and end nls (which terminates the specification and calls for estimation). If the --vcv flag is appended to the last lin
276. nstalled properly. For details on modifying your path please see the documentation or online help for your operating system or shell.
(Footnote 1) The exception to this rule is the invocation of gnuplot under MS Windows, where a full path to the program is given.
Chapter 27 The command line interface
The gretl package includes the command-line program gretlcli. On Linux it can be run from a terminal window (xterm, rxvt, or similar), or at the text console. Under MS Windows it can be run in a console window (sometimes inaccurately called a "DOS box"). gretlcli has its own help file, which may be accessed by typing "help" at the prompt. It can be run in batch mode, sending output directly to a file (see also the Gretl Command Reference).
If gretlcli is linked to the readline library (this is automatically the case in the MS Windows version; also see Appendix C), the command line is recallable and editable, and offers command completion. You can use the Up and Down arrow keys to cycle through previously typed commands. On a given command line, you can use the arrow keys to move around, in conjunction with Emacs editing keystrokes. The most common of these are:
    Keystroke   Effect
    Ctrl-a      go to start of line
    Ctrl-e      go to end of line
    Ctrl-d      delete character to right
where "Ctrl-a" means press the "a" key while the "Ctrl" key is also depressed. Thus if you want to change something at the beginning of a command, you don't have to ba
277. o choose sensible values for you, it also allows you to take complete control over graph details if you wish.
With a graph displayed, you can click on the graph window for a pop-up menu with the following options.
• Save as PNG: Save the graph in Portable Network Graphics format.
• Save as postscript: Save in encapsulated postscript (EPS) format.
• Save as Windows metafile: Save in Enhanced Metafile (EMF) format.
• Save to session as icon: The graph will appear in iconic form when you select "Icon view" from the View menu.
• Zoom: Lets you select an area within the graph for closer inspection (not available for all graphs).
• Print: (Gnome desktop or MS Windows only) lets you print the graph directly.
• Copy to clipboard: (MS Windows only) lets you paste the graph into Windows applications such as MS Word.
• Edit: Opens a controller for the plot which lets you adjust many aspects of its appearance.
• Close: Closes the graph window.
Displaying data labels
In the case of a simple X-Y scatterplot (with or without a line of best fit displayed), some further options are available if the dataset includes "case markers" (that is, labels identifying each observation). With a scatter plot displayed, when you move the mouse pointer over a data point its label is shown on the graph. By default these labels are transient: they do not appear in the printed or copied version of the graph. They can be removed by selecting "Clear data labe
278. o to "Data, Refresh data" in order to have your new variable show up in the main window variable list (or just press the "r" key).
Alternatively, the same could have been accomplished by the script
    include pc.gfn
    open np
    foo = pc(iprod)
Chapter 10 User-defined functions
[Figure 10.4: Percent change in industrial production]
Chapter 11 Named lists and strings
11.1 Named lists
Many gretl commands take one or more lists of series as arguments. To make this easier to handle in the context of command scripts, and in particular within user-defined functions, gretl offers the possibility of named lists.
Creating and modifying named lists
A named list is created using the keyword list, followed by the name of the list, an equals sign, and an expression that forms a list. The most basic sort of expression that works in this context is a space-separated list of variables, given either by name or by ID number. For example,
    list xlist = 1 2 3 4
    list reglist = income price
Note that the variables in question must be of the series type: you can't include scalars in a named list.
Two special forms are available:
• If you use the keyword null on the right-hand side, you get an empty list.
• If you use the keyword dataset on the right, you get a list containing all the series in the current dataset (except the pre-defined co
279. olves taking layers of the data that would naturally stack in a third dimension, and stacking them in the vertical dimension.
Gretl always expects data to be arranged "by observation", that is, such that each row represents an observation (and each variable occupies one and only one column). In this context the flattening of a panel data set can be done in either of two ways:
• Stacked time series: the successive vertical blocks each comprise a time series for a given unit.
• Stacked cross sections: the successive vertical blocks each comprise a cross-section for a given period.
You may input data in whichever arrangement is more convenient. Internally, however, gretl always stores panel data in the form of stacked time series.
Chapter 4 Data files
When you import panel data into gretl from a spreadsheet or comma separated format, the panel nature of the data will not be recognized automatically (most likely the data will be treated as "undated"). A panel interpretation can be imposed on the data using the graphical interface or via the setobs command.
In the graphical interface, use the menu item "Data, Dataset structure". In the first dialog box that appears, select "Panel". In the next dialog you have a three-way choice. The first two options, "Stacked time series" and "Stacked cross sections", are applicable if the data set is already organized in one of these two ways. If you select either of these options, the
280. om from the elements of x. If the original series has 100 observations, each element of x is selected with probability 1/100 at each drawing. Thus the effect is to "shuffle" the elements of x, with the twist that each element of x may appear more than once, or not at all, in xr.
The primary use of this function is in the construction of bootstrap confidence intervals or p-values. Here is a simple example. Suppose we estimate a simple regression of y on x via OLS and find that the slope coefficient has a reported t-ratio of 2.5 with 40 degrees of freedom. The two-tailed p-value for the null hypothesis that the slope parameter equals zero is then 0.0166, using the t(40) distribution. Depending on the context, however, we may doubt whether the ratio of coefficient to standard error truly follows the t(40) distribution. In that case we could derive a bootstrap p-value as shown in Example 5.1.
Under the null hypothesis that the slope with respect to x is zero, y is simply equal to its mean plus an error term. We simulate y by resampling the residuals from the initial OLS and re-estimate the model. We repeat this procedure a large number of times, and record the number of cases where the absolute value of the t-ratio is greater than 2.5: the proportion of such cases is our bootstrap p-value. For a good discussion of simulation-based tests and bootstrapping, see Davidson and MacKinnon (2004, chapter 4).
Example 5.1: Calculation of bootstrap p-value
ol
281. on of the coefficients of the cointegration matrix $\beta$ would be easier if a meaning could be attached to each of its columns. This is possible by hypothesizing the existence of two long-run relationships: a money demand equation
$$m = c_1 + \beta_1\,\mathrm{infl} + \beta_2\,y + \beta_3\,\mathrm{tbr}$$
and a risk premium equation
$$\mathrm{cpr} = c_2 + \beta_4\,\mathrm{infl} + \beta_5\,y + \beta_6\,\mathrm{tbr}$$
(Footnote 2) This data set is available in the verbeek data package; see http://gretl.sourceforge.net/gretl_data.html
which imply that the cointegration matrix can be normalized as
$$\beta = \begin{pmatrix} -1 & 0 \\ \beta_1 & \beta_4 \\ 0 & -1 \\ \beta_2 & \beta_5 \\ \beta_3 & \beta_6 \\ c_1 & c_2 \end{pmatrix}$$
This renormalization can be accomplished by means of the restrict command, to be given after the vecm command or, in the graphical interface, by selecting the "Test, Linear Restrictions" menu entry. The syntax for entering the restrictions should be fairly obvious:
    restrict
      b[1,1] = -1
      b[1,3] = 0
      b[2,1] = 0
      b[2,3] = -1
    end restrict
which produces
    Cointegrating vectors (standard errors in parentheses)
    m        -1.0000       0.0000
             (0.0000)     (0.0000)
    infl      0.023026     0.041039
             (0.0054666)  (0.027790)
    cpr       0.0000      -1.0000
             (0.0000)     (0.0000)
    y         0.42545      0.037414
             (0.033718)   (0.17140)
    tbr       0.027790     1.0172
             (0.0045445)  (0.023102)
    const     3.3625       0.68744
             (0.25318)    (1.2870)
21.6 Over-identifying restrictions
One purpose of imposing restrictions on a VECM system is simply to achieve identification. If these restrictions are simply normalizations
282. on that points to the Dependent variable slot. If you check the "Set as default" box this variable will be pre-selected as dependent when you next open the model dialog box. Shortcut: double-clicking on a variable on the left selects it as dependent and also sets it as the default. To select independent variables, highlight them on the left and click the "Add" button (or click the
Chapter 2 Getting started
right mouse button over the highlighted variable). To select several variables in the list box, drag the mouse over them; to select several non-contiguous variables, hold down the Ctrl key and click on the variables you want. To run a regression with consumption as the dependent variable and income as independent, click Ct into the Dependent slot and add Yt to the Independent variables list.
2.2 Estimation output
Once you've specified a model, a window displaying the regression output will appear. The output is reasonably comprehensive and in a standard format (Figure 2.4).
    gretl: model 1
    File  Edit  Tests  Save  Graphs  Analysis  LaTeX
    Model 1: OLS estimates using the 36 observations 1959-1994
    Dependent variable: Ct
    VARIABLE   COEFFICIENT   STDERROR     T STAT   P-VALUE
    const      384.105       151.330       2.538   0.01589
    Yt         0.932738      0.0106966    87.199   <0.00001
    Mean of dependent variable = 12490.9
    Standard deviation of dep. var. = 2940.03
    Sum of squared residuals = 1.34675e+06
    Standard error of residuals = 199.023
    Unadjusted R-squared = 0.99
283. on the test machine when the tolerance was tightened to 1.0e-14. Using numerical derivatives, the same tightening of the tolerance raised the worst values to 5 correct figures for the parameters and 3 figures for standard errors, at a cost of one additional failure of convergence.
Note the overall superiority of analytical derivatives: on average solutions to the test problems were obtained with substantially fewer iterations and the results were more accurate (most notably for the estimated standard errors). Note also that the six-digit results printed by gretl are not 100 percent reliable for difficult nonlinear problems (in particular when using numerical derivatives). Having registered this caveat, the percentage of cases where the results were good to six digits or better seems high enough to justify their printing in this form.
Chapter 17 Maximum likelihood estimation
17.1 Generic ML estimation with gretl
Maximum likelihood estimation is a cornerstone of modern inferential procedures. Gretl provides a way to implement this method for a wide range of estimation problems, by use of the mle command. We give here a few examples.
To give a foundation for the examples that follow, we start from a brief reminder on the basics of ML estimation. Given a sample of size T, it is possible to define the density function for the whole sample, namely the joint distribution of all the observations $f(Y; \theta)$, where $Y = \{y_1, \ldots, y_T\}$. Its shape is determin
284. ons. After running the loop, coeffs.gdt, which contains the individual coefficient estimates from all the runs, can be opened in gretl to examine the frequency distribution of the estimates in detail.
The command nulldata is useful for Monte Carlo work. Instead of opening a "real" data set, nulldata 50 (for instance) opens a dummy data set, containing just a constant and an index variable, with a series length of 50. Constructed variables can then be added using the genr command. See the set command for information on generating repeatable pseudo-random series.
Iterated least squares
Example 9.2 uses a "while" loop to replicate the estimation of a nonlinear consumption function of the form $C = \alpha + \beta Y^{\gamma} + \epsilon$ as presented in Greene (2000, Example 11.3). This script is included in the gretl distribution under the name greene11_3.inp; you can find it in gretl under the menu item "File, Script files, Practice file, Greene...".
The option --print-final for the ols command arranges matters so that the regression results will not be printed each time round the loop, but the results from the regression on the last iteration will be printed when the loop terminates.
Example 9.3 shows how a loop can be used to estimate an ARMA model, exploiting the "outer product of the gradient" (OPG) regression discussed by Davidson and MacKinnon in their Estimation and Inference in Econometrics.
Chapter 9 Loop constructs
Example 9.2: Nonlinear
285. ons and maxvar contains $M_i$, with NAs for right-unbounded observations.
By default, standard errors are computed using the negative inverse of the Hessian. If the --robust flag is given, then QML or Huber-White standard errors are calculated instead. In this case the estimated covariance matrix is a "sandwich" of the inverse of the estimated Hessian and the outer product of the gradient.
If the model specification contains regressors other than just a constant, the output includes a chi-square statistic for testing the joint null hypothesis that none of these regressors has any effect on the outcome. This is a Wald statistic based on the estimated covariance matrix. If you wish to construct a likelihood ratio test, this is easily done by estimating both the full model and the null model (containing only the constant), saving the log-likelihood in both cases via the $lnl accessor, and then referring twice the difference between the two log-likelihoods to the chi-square distribution with k degrees of freedom, where k is the number of additional regressors (see the pvalue command in the Gretl Command Reference). An example is contained in the sample script wtp.inp, provided with the gretl distribution.
As with the probit and Tobit models, after a model has been estimated the $uhat accessor returns the generalized residual, which is an estimate of $\epsilon_i$: more precisely, it equals $y_i - x_i \hat{\beta}$ for point observations and $E(\epsilon_i \mid m_i, M_i, x_i)$ otherwise. Note tha
286. onstructs
Example 9.3: ARMA 1, 1
    open armaloop.gdt
    genr c = 0
    genr a = 0.1
    genr m = 0.1
    series e = 0.0
    genr de_c = e
    genr de_a = e
    genr de_m = e
    genr crit = 1
    loop while crit > 1.0e-9
       # one-step forecast errors
       genr e = y - c - a*y(-1) - m*e(-1)
       # log-likelihood
       genr loglik = -0.5 * sum(e^2)
       print loglik
       # partials of forecast errors wrt c, a, and m
       genr de_c = -1 - m * de_c(-1)
       genr de_a = -y(-1) - m * de_a(-1)
       genr de_m = -e(-1) - m * de_m(-1)
       # partials of l wrt c, a and m
       genr sc_c = -de_c * e
       genr sc_a = -de_a * e
       genr sc_m = -de_m * e
       # OPG regression
       ols const sc_c sc_a sc_m --print-final --no-df-corr --vcv
       # Update the parameters
       genr dc = $coeff(sc_c)
       genr c = c + dc
       genr da = $coeff(sc_a)
       genr a = a + da
       genr dm = $coeff(sc_m)
       genr m = m + dm
       printf "  constant        = %.8g (gradient = %#.6g)\n", c, dc
       printf "  ar1 coefficient = %.8g (gradient = %#.6g)\n", a, da
       printf "  ma1 coefficient = %.8g (gradient = %#.6g)\n", m, dm
       genr crit = $T - $ess
       print crit
    endloop
    genr se_c = $stderr(sc_c)
    genr se_a = $stderr(sc_a)
    genr se_m = $stderr(sc_m)
    noecho
    print "
    printf "constant = %.8g (se = %#.6g, t = %.4f)\n", c, se_c, c/se_c
    printf "ar1 term = %.8g (se = %#.6g, t = %.4f)\n", a, se_a, a/se_a
    printf "ma1 term = %.8g (se = %#.6g, t = %.4f)\n", m, se_m, m/se_m
Example 9.4: Panel statistics
    open hospitals.gdt
    loop i=1991..2000
      smpl
287. ontain data on Saturdays (because we wouldn't know where to put them), but at the same time we want to place missing values on all the Wednesdays. In this case, the following syntax could be used:

    string QRY = "SELECT year,month,day,VerdSE FROM AlmeaIndexes"
    data y obs-format="%d/%d/%d" @QRY --odbc

The column VerdSE holds the data to be fetched, which will go into the gretl series y. The first three columns are used to construct a string which identifies the day. Since a string like "2008/04/26" does not correspond to any observation in our dataset (it's a Saturday), that row is simply discarded. On the other hand, since no string "2008/04/23" was found in the data coming from the DBMS (it's a Wednesday), that entry is left blank in our variable y. B.3 Examples. In the following examples, we will assume that access is available to a database known to ODBC with the data source name "AWM", with username "Otto" and password "Bingo". The database "AWM" contains quarterly data in two tables (see B.3 and B.4). The table Consump is the classic "rectangular" dataset; that is, its internal organization is the same as in a spreadsheet or in an econometrics package like gretl itself: each row is a data point and each column is a variable. On the other hand, the structure of the DATA table is different: each record is one figure, stored in the column xval, and the other fields keep track of which variable it belongs to, for which date
288. op a variable from a list, use

    xlist -= cpi

In most contexts where lists are used in gretl, it is expected that they do not contain any duplicated elements. If you form a new list by simple concatenation, as in

    list L3 = L1 L2

(where L1 and L2 are existing lists), it's possible that the result may contain duplicates. To guard against this you can form a new list as the union of two existing ones:

    list L3 = L1 || L2

The result is a list that contains all the members of L1, plus any members of L2 that are not already in L1. In the same vein, you can construct a new list as the intersection of two existing ones:

    list L3 = L1 && L2

Here L3 contains all the elements that are present in both L1 and L2. Lists and matrices. Another way of forming a list is by assignment from a matrix. The matrix in question must be interpretable as a vector containing ID numbers of (series) variables. It may be either a row or a column vector, and each of its elements must have an integer part that is no greater than the number of variables in the data set. For example:

    matrix m = {1,2,3,4}
    list L = m

The above is OK provided the data set contains at least 4 variables. Querying a list. You can determine whether an unknown variable actually represents a list using the function islist().

    series xl1 = log(x1)
    series xl2 = log(x2)
    list xlogs = xl1 xl2
    genr is1 = islist(xlogs)
    genr is2 = islist(xl1)

The first genr command above will assign a value of 1
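The union and intersection operators described in excerpt 288 can be tried out with a small self-contained script. The series a, b and c are hypothetical, generated only so that the two lists overlap in one member.

```hansl
# three artificial series (names are assumptions for illustration)
nulldata 20
set seed 11
series a = normal()
series b = normal()
series c = normal()

list L1 = a b
list L2 = b c

# union: members of L1 plus any members of L2 not already in L1
list Lunion = L1 || L2
# intersection: members present in both lists
list Lboth = L1 && L2

printf "union: %d members, intersection: %d member(s)\n", nelem(Lunion), nelem(Lboth)
```

Since b is the only series common to both lists, the union should have three members and the intersection one.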
289. open the session file by going to "File/Session files/Open session", or
- From the command line, type gretl -r sessionfile, where sessionfile is the name under which the session was saved.
(Footnote 3: For PDF output you need pdflatex and either Adobe's PDF reader or xpdf, on X11. For PostScript, you must have dvips and ghostscript installed, along with a viewer such as gv, ggv or kghostview. The default viewer for systems other than MS Windows is gv.)
Chapter 4. Data files
4.1 Native format. gretl has its own format for data files. Most users will probably not want to read or write such files outside of gretl itself, but occasionally this may be useful and full details on the file formats are given in Appendix A.
4.2 Other data file formats. gretl will read various other data formats.
- Plain text (ASCII) files. These can be brought in using gretl's "File/Open Data/Import/ASCII" menu item, or the import script command. For details on what gretl expects of such files, see Section 4.4.
- Comma-Separated Values (CSV) files. These can be imported using gretl's "File/Open Data/Import/CSV" menu item, or the import script command. See also Section 4.4.
- Spreadsheets: MS Excel, Gnumeric and Open Document (ODS). These are also brought in using gretl's "File/Open Data/Import" menu. The requirements for such files are given in Section 4.4.
- Stata data files (.dta), SPSS data files (.sav), Eviews workfiles (.w
290. oping over two paired lists. Problem: Suppose you have two lists with the same number of elements, and you want to apply some command to corresponding elements over a loop. Solution:

    list L1 = a b c
    list L2 = x y z
    k1 = 1
    loop foreach i L1 --quiet
        k2 = 1
        loop foreach j L2 --quiet
            if k1=k2
                ols $i 0 $j
            endif
            k2++
        end loop
        k1++
    end loop

Comment: The simplest way to achieve the result is to loop over all possible combinations and filter out the unneeded ones via an if condition, as above. That said, in some cases variable names can help. For example, if

    list Lx = x1 x2 x3
    list Ly = y1 y2 y3

looping over the integers is quite intuitive and certainly more elegant:

    loop for i=1..3
        ols y$i const x$i
    end loop

Part II. Econometric methods
Chapter 14. Robust covariance matrix estimation
14.1 Introduction. Consider (once again) the linear regression model
$y = X\beta + u$ (14.1)
where $y$ and $u$ are $T$-vectors, $X$ is a $T \times k$ matrix of regressors, and $\beta$ is a $k$-vector of parameters. As is well known, the estimator of $\beta$ given by Ordinary Least Squares (OLS) is
$\hat{\beta} = (X'X)^{-1} X'y$ (14.2)
If the condition $E(u \mid X) = 0$ is satisfied, this is an unbiased estimator; under somewhat weaker conditions the estimator is biased but consistent. It is straightforward to show that when the OLS estimator is unbiased (that is, when $E(\hat{\beta} - \beta) = 0$), its variance is
$\mathrm{Var}(\hat{\beta}) = E\left[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'\right] = (X'X)^{-1} X' \Omega X (X'X)^{-1}$ (14.3)
where $\Omega = E(uu')$ is the covariance matri
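As a rough check on formula (14.2) quoted in excerpt 290, the sketch below computes the OLS estimator by explicit matrix algebra and sets it beside the output of gretl's ols command. The data are artificial, and all names (x1, x2, y, XL, X, yv, b, b_ols) are assumptions introduced for the example.

```hansl
# artificial regression data (assumed for illustration)
nulldata 50
set seed 7
series x1 = normal()
series x2 = normal()
series y = 1 + 2*x1 - x2 + normal()

# regressors as a list, then as a T x k matrix
list XL = const x1 x2
matrix X = {XL}
matrix yv = {y}

# OLS by formula (14.2): (X'X)^{-1} X'y
matrix b = inv(X'X) * X'yv

# the same estimates from the built-in command
ols y XL --quiet
matrix b_ols = $coeff

print b b_ols
```

The two printed vectors should agree up to numerical precision; the explicit inverse is used here only to mirror the formula and is not how one would normally compute the estimates.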
291. or Windows users, but this is immaterial. The important point is that you have a window where you can type commands to R. If the above procedure doesn't work and no R window opens, it means that gretl was unable to launch R. You should ensure that R is installed and working on your system and that gretl knows where it is. The relevant settings can be found by selecting the "Tools/Preferences/General" menu entry, under the "Programs" tab. Assuming R was launched successfully, you will notice that two commands have been executed automatically:

    gretldata <- read.table("/home/jack/.gretl/Rdata.tmp", header=TRUE)
    attach(gretldata)

These commands have the effect of loading our dataset into the R workspace in the form of a data frame (one of several forms in which R can store data). Use of a data frame enables the subsequent attach() command, which sets things up so that the variable names defined in the gretl workspace are available as valid identifiers within R. In order to replicate gretl's OLS estimation, go into the R window and type at the prompt

    model <- lm(price ~ sqft + bedrms + baths)
    summary(model)

You should see something similar to Figure 25.2. Surprise: the estimates coincide! To get out, just close the R window or type q() at the R prompt. Time series data. We now turn to an example which uses time series data: we will compare gretl's and R's estimates of Box and Jenkins' immortal "airline" model. The
292. ore than one language, you can set up per-language preamble files. A "localized" preamble file is identified by a name of the form gretlpre_xx.tex, where xx is replaced by the first two letters of the current setting of the LANG environment variable. For example, if you are running the program in Polish, using LANG=pl_PL, then gretl will do the following when writing the preamble for a TeX source file.
1. Look for a file named gretlpre_pl.tex in the gretl user directory. If this is not found, then
2. look for a file named gretlpre.tex in the gretl user directory. If this is not found, then
3. use the default preamble.
Conversely, suppose you usually run gretl in a language other than English, and have a suitable gretlpre.tex file in place for your native language. If on some occasions you want to produce TeX output in English, then you could create an additional file gretlpre_en.tex: this file will be used for the preamble when gretl is run with a language setting of, say, en_US. Command-line options. After estimating a model via a script, or interactively via the gretl console or using the command-line program gretlcli, you can use the commands tabprint or eqnprint to print the model to file in tabular format or equation format respectively. These options are explained in the Gretl Command Reference. If you wish to alter the appearance of gretl's tabular output for models in the context of the tabprint
293. ove is new in gretl 1.7.6. The problem it addresses is quite subtle, and was discovered only recently. Existing functions that use foreach loops on list arguments may need to be modified. For a limited time, there is a special switch available that restores the old behavior (that is, it enables support for functions that do not use the listname.varname syntax). The command to use is

    set protect_lists off

But we recommend updating old functions as soon as possible. Constancy of list arguments. When a named list of variables is passed to a function, the function is actually provided with a copy of the list. The function may modify this copy (for instance, adding or removing members), but the original list at the level of the caller is not modified. Optional list arguments. If a list argument to a function is optional, this should be indicated by appending a default value of null, as in

    function myfunc (scalar y, list X[null])

In that case, if the caller gives null as the list argument (or simply omits the last argument) the named list X inside the function will be empty. This possibility can be detected using the nelem function, which returns 0 for an empty list. String arguments. String arguments can be used, for example, to provide flexibility in the naming of variables created within a function. In the following example the function mavg returns a list containing two moving averages constructed from an inpu
294. owing sub-sections we see how matrices can be exchanged, and how data can be passed from R back to gretl.
[Figure 25.4: Editing window for R scripts]
Passing matrices from gretl to R. For passing matrices from gretl to R, you can use the mwrite matrix function described in section 12.6. For example, the following gretl code fragment generates the matrix
$A = \begin{bmatrix} 3 & 7 & 11 \\ 4 & 8 & 12 \\ 5 & 9 & 13 \\ 6 & 10 & 14 \end{bmatrix}$
and stores it into the file mymatfile.mat.

    matrix A = mshape(seq(3,14),4,3)
    err = mwrite(A, "mymatfile.mat")

In order to retrieve this matrix from R, all you have to do is

    A <- as.matrix(read.table("mymatfile.mat", skip=1))

Although in principle you can give your matrix file any valid filename, a couple of conventions may prove useful. First, you may want to use an informative file suffix such as ".mat", but this is a matter of taste. More importantly, the exact location of the file created by mwrite could be an issue. By default, if no path is specified in the file name, gretl stores matrix files in the current work directory. However, it may be wise for the purpose at hand to use the directory in which gretl stores all its temporary files, whose name is stored in the built-in string dotdir (see section 11.2). The value of this string is automatically passed to R as the string variable gretl.dotdir, so the abov
295. planatory variables, then the random effects estimator would be inconsistent, while fixed effects estimates would still be valid. It is precisely on this principle that the Hausman test is built (see below): if the fixed- and random-effects estimates agree, to within the usual statistical margin of error, there is no reason to think the additional hypotheses invalid, and as a consequence, no reason not to use the more efficient RE estimator. Testing panel models. If you estimate a fixed effects or random effects model in the graphical interface, you may notice that the number of items available under the "Tests" menu in the model window is relatively limited. Panel models carry certain complications that make it difficult to implement all of the tests one expects to see for models estimated on straight time-series or cross-sectional data. Nonetheless, various panel-specific tests are printed along with the parameter estimates as a matter of course, as follows. When you estimate a model using fixed effects, you automatically get an F-test for the null hypothesis that the cross-sectional units all have a common intercept. That is to say that all the $\alpha_i$ are equal, in which case the pooled model (15.1), with a column of 1s included in the X matrix, is adequate. When you estimate using random effects, the Breusch-Pagan and Hausman tests are presented automatically. The Breusch-Pagan test is the counterpart to the F-test mentioned above. The nul
296. points are artificially generated for an ordinary probit model: $y_t$ is a binary variable, which takes the value 1 if $y_t^* = \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + \varepsilon_t > 0$ and 0 otherwise. (Footnote 2: Again, gretl does provide a native probit command (see section 22.1), but a probit model makes for a nice example here.) Therefore, $y_t = 1$ with probability $\Phi(\beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t}) = \pi_t$. The probability function for one observation can be written as
$P(y_t) = \pi_t^{y_t} (1 - \pi_t)^{1 - y_t}$
Since the observations are independent and identically distributed, the log-likelihood is simply the sum of the individual contributions. Hence
$\ell = \sum_{t=1}^{T} \left[ y_t \log(\pi_t) + (1 - y_t) \log(1 - \pi_t) \right]$
The --verbose switch at the end of the end mle statement produces a detailed account of the iterations done by the BFGS algorithm. In this case, numerical differentiation works rather well; nevertheless, computation of the analytical score is straightforward, since the derivative $\partial \ell / \partial \beta_i$ can be written as
$\frac{\partial \ell}{\partial \beta_i} = \frac{\partial \ell}{\partial \pi_t} \cdot \frac{\partial \pi_t}{\partial \beta_i}$
via the chain rule, and it is easy to see that
$\frac{\partial \ell}{\partial \pi_t} = \frac{y_t}{\pi_t} - \frac{1 - y_t}{1 - \pi_t}, \qquad \frac{\partial \pi_t}{\partial \beta_i} = \phi(\beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t}) \cdot x_{it}$
The mle block in the above script can therefore be modified as follows:

    mle logl = y*ln(P) + (1-y)*ln(1-P)
        series ndx = b0 + b1*x1 + b2*x2 + b3*x3
        series P = cnorm(ndx)
        series tmp = dnorm(ndx)*(y/P - (1-y)/(1-P))
        deriv b0 = tmp
        deriv b1 = tmp*x1
        deriv b2 = tmp*x2
        deriv b3 = tmp*x3
    end mle --verbos
297. ponding data file contains three columns of data, each having 90 entries. Three further features of the "traditional" data format may be noted.
1. If the BYOBS keyword is replaced by BYVAR, and followed by the keyword BINARY, this indicates that the corresponding data file is in binary format. Such data files can be written from gretlcli using the store command with the -s flag (single precision) or the -o flag (double precision).
2. If BYOBS is followed by the keyword MARKERS, gretl expects a data file in which the first column contains strings (8 characters maximum) used to identify the observations. This may be handy in the case of cross-sectional data where the units of observation are identifiable: countries, states, cities or whatever. It can also be useful for irregular time series data, such as daily stock price data where some days are not trading days; in this case the observations can be marked with a date string such as 10/01/98. (Remember the 8-character maximum.) Note that BINARY and MARKERS are mutually exclusive flags. Also note that the "markers" are not considered to be a variable: this column does not have a corresponding entry in the list of variable names in the header file.
3. If a file with the same base name as the data file and header files, but with the suffix .lbl, is found, it is read to fill out the descriptive labels for the data series. The format of the label file is simple: each line contains the name of one
298. pose that we observe $y_i$, defined as
$y_i = \begin{cases} y_i^* & \text{for } y_i^* > 0 \\ 0 & \text{for } y_i^* \le 0 \end{cases}$ (22.9)
In this case, regressing $y_i$ on the $x_i$s does not yield consistent estimates of the parameters $\beta$, because the conditional mean $E(y_i \mid x_i)$ is not equal to $\sum_j x_{ij}\beta_j$. It can be shown that restricting the sample to non-zero observations would not yield consistent estimates either. The solution is to estimate the parameters via maximum likelihood. The syntax is simply

    tobit depvar indvars

As usual, progress of the maximization algorithm can be tracked via the --verbose switch, while $uhat returns the generalized residuals. Note that in this case the generalized residual is defined as $\hat{u}_i = E(\varepsilon_i \mid y_i = 0)$ for censored observations, so the familiar equality $\hat{u}_i = y_i - \hat{y}_i$ only holds for uncensored observations, that is, when $y_i > 0$. An important difference between the Tobit estimator and OLS is that the consequences of non-normality of the disturbance term are much more severe: non-normality implies inconsistency for the Tobit estimator. For this reason, the output for the tobit model includes the Chesher-Irish (1987) normality test by default.
22.5 Interval regression. The interval regression model arises when the dependent variable is unobserved for some (possibly all) observations; what we observe instead is an interval in which the dependent variable lies. In other words, the data generating process is assumed to be $y_i^* = x_i\beta + \varepsilon_i$ but we only know that $m_i \le$
299. probability that individual $i$ exhibits response $j$, conditional on the characteristics $x_i$, is then given by
$P(y_i = j \mid x_i) = \begin{cases} P(y^* \le \alpha_1 \mid x_i) = F(\alpha_1 - z_i) & \text{for } j = 0 \\ P(\alpha_j < y^* \le \alpha_{j+1} \mid x_i) = F(\alpha_{j+1} - z_i) - F(\alpha_j - z_i) & \text{for } 0 < j < J \\ P(y^* > \alpha_J \mid x_i) = 1 - F(\alpha_J - z_i) & \text{for } j = J \end{cases}$ (22.8)
The unknown parameters $\alpha_j$ are estimated jointly with the $\beta$s via maximum likelihood. The $\hat{\alpha}_j$ estimates are reported by gretl as cut1, cut2 and so on. In order to apply these models in gretl, the dependent variable must either take on only non-negative integer values, or be explicitly marked as discrete. (In case the variable has non-integer values, it will be recoded internally.) Note that gretl does not provide a separate command for ordered models: the logit and probit commands automatically estimate the ordered version if the dependent variable is acceptable, but not binary. Example 22.3 reproduces the results presented in section 15.10 of Wooldridge (2002a). The question of interest in this analysis is what difference it makes, to the allocation of assets in pension funds, whether individual plan participants have a choice in the matter. The response variable is an ordinal measure of the weight of stocks in the pension portfolio. Having reported the results of estimation of the ordered model, Wooldridge illustrates the effect of the choice variable by reference to an "average" participant. The example script shows how one can compute this effect in gretl
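To make the response probabilities in (22.8) concrete, here is a minimal sketch for the ordered probit case with three outcomes (J = 2), where F is the standard normal CDF. The index value z and the two cut points are hypothetical numbers chosen for illustration; they are not estimates from Example 22.3.

```hansl
# hypothetical index value z_i = x_i * beta and cut points
scalar z = 0.3
scalar cut1 = -0.5
scalar cut2 = 0.8

# response probabilities as in (22.8), with F = cnorm
scalar p0 = cnorm(cut1 - z)                     # j = 0
scalar p1 = cnorm(cut2 - z) - cnorm(cut1 - z)   # 0 < j < J
scalar p2 = 1 - cnorm(cut2 - z)                 # j = J

printf "P(y=0) = %.4f\nP(y=1) = %.4f\nP(y=2) = %.4f\n", p0, p1, p2
printf "sum = %.4f\n", p0 + p1 + p2
```

The three probabilities sum to one by construction. After an actual ordered probit estimation, the cut points would be taken from the reported cut1 and cut2 values instead of being set by hand.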
300. process. The simplest (first-order) case of the latter can be written as
$u_t = \rho u_{t-1} + \varepsilon_t, \qquad -1 < \rho < 1$
where the $\varepsilon_t$s are independently and identically distributed with mean zero and variance $\sigma^2$. With an AR(1) error, if $\rho$ is positive then a positive value of $u_t$ will tend to be followed, with probability greater than 0.5, by a positive $u_{t+1}$. With an ARCH error process, a disturbance $u_t$ of large absolute value will tend to be followed by further large absolute values, but with no presumption that the successive values will be of the same sign. ARCH in asset prices is a "stylized fact" and is consistent with market efficiency; on the other hand, autoregressive behavior of asset prices would violate market efficiency. One can test for ARCH of order q in the following way:
1. Estimate the model of interest via OLS and save the squared residuals, $\hat{u}_t^2$.
2. Perform an auxiliary regression in which the current squared residual is regressed on a constant and q lags of itself.
3. Find the $TR^2$ value (sample size times unadjusted $R^2$) for the auxiliary regression.
4. Refer the $TR^2$ value to the $\chi^2$ distribution with q degrees of freedom, and if the p-value is "small enough" reject the null hypothesis of homoskedasticity in favor of the alternative of ARCH(q).
This test is implemented in gretl via the arch command. This command may be issued following the estimation of a time-series model by OLS, or by selection from the "Tests" menu in the model
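The four steps of the ARCH test listed in excerpt 300 can be carried out by hand, as in the sketch below. It uses artificial homoskedastic data and q = 2; the series names and the data-generating process are assumptions for illustration, and the built-in test may differ in small details (such as the handling of the sample) from this manual version.

```hansl
# artificial time series (assumed for illustration)
nulldata 300
setobs 1 1 --time-series
set seed 42
series x = normal()
series y = 1 + 0.5*x + normal()

# step 1: OLS on the model of interest, save squared residuals
ols y const x --quiet
series usq = $uhat^2

# step 2: auxiliary regression on a constant and q = 2 lags
ols usq const usq(-1 to -2) --quiet

# step 3: TR^2 from the auxiliary regression
scalar TR2 = $T * $rsq

# step 4: p-value from chi-square with q = 2 degrees of freedom
scalar pv = pvalue(X, 2, TR2)
printf "TR^2 = %g, p-value = %g\n", TR2, pv
```

Here $T is the number of observations actually used in the auxiliary regression, which is smaller than the full sample because of the lags.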
301. producing consistent estimates of and p was pro posed by Anderson and Hsiao 1981 Instead of de meaning the data they suggest taking the first difference of 15 7 an alternative tactic for sweeping out the group effects AVit AXitB PAYit 1 Nit 15 8 and is inefficient Chapter 15 Panel data 115 where nit Au A Vi Eit Eit Eit 1 We re not in the clear yet given the structure of the error nit the disturbance t 1 is an influence on both nit and Ayj 1 1 Vit Vit 1 The next step is then to find an instrument for the contaminated AV t 1 Anderson and Hsiao suggest using either 1 2 or Ay 1 2 both of which will be uncorrelated with nit provided that the underlying errors Eit are not themselves serially correlated The Anderson Hsiao estimator is not provided as a built in function in gretl since gretl s sensible handling of lags and differences for panel data makes it a simple application of regression with instrumental variables see Example 15 1 which is based on a study of country growth rates by Nerlove 1999 3 Example 15 1 The Anderson Hsiao estimator for a dynamic panel model Penn World Table data as used by Nerlove open penngrow gdt Fixed effects for comparison panel Y 0 Y 1 X Random effects for comparison panel Y 0 Y 1 X random effects take differences of all variables diff Y X Anderson Hsiao using Y 2 as instrument tsls d_Y d_Y 1 d_X
302. provides a hopefully clarifying example see also subsection 12 6 Example 12 1 Complex eigenvalues and eigenvectors set seed 34756 matrix v A mnormal 3 3 do the eigen analysis 1 eigengen A v eigenvalue 1 is real 2 and 3 are complex conjugates print 1 print v column 1 contains the first eigenvector real B A v 1 111 0 vt 1 B should equal c print B print c columns 2 3 contain the real and imaginary parts of eigenvector 2 B A v 2 3 c cmult ones 3 1 1 2 v 2 3 B should equal c print B print c The qrdecomp function computes the OR decomposition of an m x n matrix A A QR where Q is an m x n orthogonal matrix and R is an nx n upper triangular matrix The matrix Q is returned directly while R can be retrieved via the second argument Here are two examples matrix R matrix Q qrdecomp M amp R matrix Q qrdecomp M null In the first example the triangular R is saved as R in the second R is discarded The first line above shows an example of a simple declaration of a matrix R is declared to be a matrix variable Chapter 12 Matrix manipulation 91 but is not given any explicit value In this case the variable is initialized as a 1 x 1 matrix whose single element equals zero The syntax for svd is matrix B func A amp C D The function svd computes all or part of the singular value decomposition of the r
303. pute $H = (\tilde{\beta} - \hat{\beta})' \Psi^{-1} (\tilde{\beta} - \hat{\beta})$. Given the relative efficiencies of $\tilde{\beta}$ and $\hat{\beta}$, the matrix $\Psi$ "should be" positive definite, in which case H is positive, but in finite samples this is not guaranteed and of course a negative $\chi^2$ value is not admissible. The regression method avoids this potential problem. The procedure is:
- Treat the random-effects model as the restricted model, and record its sum of squared residuals as $SSR_r$.
- Estimate via OLS an unrestricted model in which the dependent variable is quasi-demeaned y and the regressors include both quasi-demeaned X (as in the RE model) and the de-meaned variants of all the time-varying variables (i.e. the fixed-effects regressors); record the sum of squared residuals from this model as $SSR_u$.
- Compute $H = n (SSR_r - SSR_u) / SSR_u$, where n is the total number of observations used. On this variant H cannot be negative, since adding additional regressors to the RE model cannot raise the SSR.
By default gretl computes the Hausman test via the matrix-difference method (largely for comparability with other software), but it uses the regression method if you pass the option --hausman-reg to the panel command. Robust standard errors. For most estimators, gretl offers the option of computing an estimate of the covariance matrix that is robust with respect to heteroskedasticity and/or autocorrelation (and hence also robust standard errors). In the case of panel data, robust covariance matrix estimators a
304. rIn loop foreach i m p 1 s y quiet printf ADelta i t 6 3f 6 4f n ts_ i pvalue X 6 ts_ i end loop Output variable LR test p value Delta m 18 111 0 0060 Delta p 21 067 0 0018 Delta 1 11 819 0 0661 Delta s 16 000 0 0138 Delta y 11 335 0 0786 rank for identification This is a necessary and not a sufficient condition In fact when r gt 1 it can be quite tricky to assess whether a given set of restrictions is identifying Gretl uses the method suggested by Doornik 1995 where identification is assessed via the rank of the information ma trix It can be shown that for restrictions of the sort 21 7 and 21 8 the information matrix has the same rank as the Jacobian matrix J10 Ip e B G amp Ip H A sufficient condition for identification is that the rank of J 0 equals the number of free para meters The rank of this matrix is evaluated by examination of its singular values at a randomly selected point in the parameter space For practical purposes we treat this condition as if it were both necessary and sufficient that is we disregard the special cases where identification could be achieved without this condition being met 6See Boswijk and Doornik 2004 pp 447 8 for discussion of this point Chapter 21 Cointegration and Vector Error Correction Models 170 21 7 Numerical solution methods In general the ML estimator for the restricted VECM problem has no closed form solu
305. re available for the pooled and fixed effects model but not currently for random effects. Please see section 14.4 for details.
15.2 Dynamic panel models. Special problems arise when a lag of the dependent variable is included among the regressors in a panel model. Consider a dynamic variant of the pooled model (15.1):
$y_{it} = X_{it}\beta + \rho y_{i,t-1} + u_{it}$ (15.7)
First, if the error $u_{it}$ includes a group effect, $v_i$, then $y_{i,t-1}$ is bound to be correlated with the error, since the value of $v_i$ affects $y_i$ at all t. That means that OLS applied to (15.7) will be inconsistent as well as inefficient. The fixed-effects model sweeps out the group effects and so overcomes this particular problem, but a subtler issue remains, which applies to both fixed and random effects estimation. Consider the de-meaned representation of fixed effects, as applied to the dynamic model,
$\tilde{y}_{it} = \tilde{X}_{it}\beta + \rho \tilde{y}_{i,t-1} + \tilde{\varepsilon}_{it}$
where $\tilde{y}_{it} = y_{it} - \bar{y}_i$ and $\tilde{\varepsilon}_{it} = u_{it} - \bar{u}_i$ (or $u_{it} - \alpha_i$, using the notation of equation 15.2). The trouble is that $\tilde{y}_{i,t-1}$ will be correlated with $\tilde{\varepsilon}_{it}$ via the group mean, $\bar{y}_i$. The disturbance $\varepsilon_{it}$ influences $y_{it}$ directly, which influences $\bar{y}_i$, which, by construction, affects the value of $\tilde{y}_{it}$ for all t. The same issue arises in relation to the quasi-demeaning used for random effects. Estimators which ignore this correlation will be consistent only as $T \to \infty$ (in which case the marginal effect of $\varepsilon_{it}$ on the group mean of y tends to vanish). One strategy for handling this problem, and
306. retl allocates memory dynamically, and will ask the operating system for as much memory as your data require. Obviously, then, you are ultimately limited by the size of RAM. Aside from the multiple-precision OLS option, gretl uses double-precision floating-point numbers throughout. The size of such numbers in bytes depends on the computer platform, but is typically eight. To give a rough notion of magnitudes, suppose we have a data set with 10,000 observations on 500 variables. That's 5 million floating-point numbers or 40 million bytes. If we define the megabyte (MB) as 1024 × 1024 bytes, as is standard in talking about RAM, it's slightly over 38 MB. The program needs additional memory for workspace, but even so, handling a data set of this size should be quite feasible on a current PC, which at the time of writing is likely to have at least 256 MB of RAM. If RAM is not an issue, there is one further limitation on data size (though it's very unlikely to be a binding constraint). That is, variables and observations are indexed by signed integers, and on a typical PC these will be 32-bit values, capable of representing a maximum positive value of $2^{31} - 1 = 2{,}147{,}483{,}647$. The limits mentioned above apply to gretl's "native" functionality. There are tighter limits with regard to two third-party programs that are available as add-ons to gretl for certain sorts of time-series analysis including seasonal adjustment, namely TRAMO/SEATS and X-12-ARI
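The memory arithmetic in excerpt 306 is easy to reproduce; the short script below does so, assuming (as the text does) eight bytes per double-precision value. The scalar names are made up for the example.

```hansl
# dimensions quoted in the text
scalar nobs = 10000
scalar nvars = 500
scalar bytes_per_value = 8   # typical size of a double

scalar nvalues = nobs * nvars
scalar nbytes = nvalues * bytes_per_value
scalar mbytes = nbytes / (1024 * 1024)

printf "%d values, %d bytes, %.2f MB\n", nvalues, nbytes, mbytes
```

This gives 5,000,000 values and 40,000,000 bytes, or about 38.15 MB, in line with the "slightly over 38 MB" stated above. Changing nobs and nvars gives a quick lower bound on the RAM needed for other data sets, before allowing for workspace.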
307. rfect classifier exists among the regressors, in which case estimation is simply impossible and the algorithm stops with an error. This behavior is triggered during the iteration process if
$\max_{i: y_i = 0} z_i < \min_{i: y_i = 1} z_i$
If this happens, unless your model is trivially mis-specified (like predicting if a country is an oil exporter on the basis of oil revenues), it is normally a small-sample problem: you probably just don't have enough data to estimate your model. You may want to drop some of your explanatory variables.
22.2 Ordered response models. These models constitute a simple variation on ordinary logit/probit models, and are usually applied when the dependent variable is a discrete and ordered measurement, not simply binary, but on an ordinal rather than an interval scale. For example, this sort of model may be applied when the dependent variable is a qualitative assessment such as "Good", "Average" and "Bad". In the general case, consider an ordered response variable, y, that can take on any of the J+1 values 0, 1, 2, ..., J. We suppose, as before, that underlying the observed response is a latent variable,
$y^* = X\beta + \varepsilon = Z + \varepsilon$
Now define "cut points", $\alpha_1 < \alpha_2 < \cdots < \alpha_J$, such that
$y = 0$ if $y^* \le \alpha_1$
$y = 1$ if $\alpha_1 < y^* \le \alpha_2$
...
$y = J$ if $y^* > \alpha_J$
For example, if the response takes on three values there will be two such cut points, $\alpha_1$ and $\alpha_2$. The
308. roblem: You have a dataset with many variables and want to restrict the sample to those observations for which there are no missing observations for the variables x1, x2 and x3. Solution:

    list X = x1 x2 x3
    genr sel = ok(X)
    smpl sel --restrict

Comment: You can now save the file via a store command to preserve a subsampled version of the dataset. "By" operations. Problem: You have a discrete variable d and you want to run some commands (for example, estimate a model) by splitting the sample according to the values of d. Solution:

    matrix vd = values(d)
    m = rows(vd)
    loop for i=1..m
        scalar sel = vd[i]
        smpl (d=sel) --restrict --replace
        ols y const x
    end loop
    smpl --full

Comment: The main ingredient here is a loop. You can have gretl perform as many instructions as you want for each value of d, as long as they are allowed inside a loop.
13.2 Creating/modifying variables. Generating a dummy variable for a specific observation. Problem: Generate $d_t = 0$ for all observations but one, for which $d_t = 1$. Solution:

    genr d = (t="1984:2")

Comment: The internal variable t is used to refer to observations in string form, so if you have a cross-section sample you may just use d = (t="123"); of course, if the dataset has data labels, use the corresponding label. For example, if you open the dataset mrw.gdt, supplied with gretl among the examples, a dummy variable for Italy could be generated via genr DIta
309. rs X as in 20 10 Chapter 20 Time series models Example 20 1 ARIMA forecasting open greenel8_2 gdt log of quarterly U S nominal GNP 1950 1 to 1983 4 genr y log Y and its first difference genr dy diff y reserve 2 years for out of sample forecast smp1 1981 4 Estimate using ARIMA arima 1 1 1 y forecast over full period smpl full fcast fcl Return to sub sample and run ARMA on the first difference of y smpl 1981 4 arma 1 1 dy smpl full fcast fc2 genr fcdiff t lt 1982 1 fc1 y 1 t gt 1982 1 fc1 fc1 1 compare the forecasts over the later period smpl 1981 1 1983 4 print y fcl fc2 fcdiff byobs The output from the last command is y fcl fc2 fcdiff 1981 1 7 964086 7 940930 0 02668 0 02668 1981 2 7 978654 7 997576 0 03349 0 03349 1981 3 8 009463 7 997503 0 01885 0 01885 1981 4 8 015625 8 033695 0 02423 0 02423 1982 1 8 014997 8 029698 0 01407 0 01407 1982 2 8 026562 8 046037 0 01634 0 01634 1982 3 8 032717 8 063636 0 01760 0 01760 1982 4 8 042249 8 081935 0 01830 0 01830 1983 1 8 062685 8 100623 0 01869 0 01869 1983 2 8 091627 8 119528 0 01891 0 01891 1983 3 8 115700 8 138554 0 01903 0 01903 1983 4 8 140811 8 157646 0 01909 0 01909 156 Chapter 21 Cointegration and Vector Error Correction Models 21 1 Introduction The twin concepts of cointegration and error correction have drawn a good deal of attention in macroeconometrics over recent years
310. rval for the median. This is obtained by the bootstrap method, which can take a while if the data series is very long. After each variable specified in the boxplot command, a parenthesized boolean expression may be added, to limit the sample for the variable in question. A space must be inserted between the variable name or number and the expression. Suppose you have salary figures for men and women, and you have a dummy variable GENDER with value 1 for men and 0 for women. In that case you could draw comparative boxplots with the following line in the boxplots dialog:

    salary (GENDER=1) salary (GENDER=0)

Chapter 8. Discrete variables. When a variable can take only a finite, typically small, number of values, then the variable is said to be discrete. Some gretl commands act in a slightly different way when applied to discrete variables; moreover, gretl provides a few commands that only apply to discrete variables. Specifically, the dummify and xtab commands (see below) are available only for discrete variables, while the freq (frequency distribution) command produces different output for discrete variables.
8.1 Declaring variables as discrete. Gretl uses a simple heuristic to judge whether a given variable should be treated as discrete, but you also have the option of explicitly marking a variable as discrete, in which case the heuristic check is bypassed. The heuristic is as follows: First, are all the values o
311. B Data import via ODBC
    B.1 ODBC base concepts
    B.2 Syntax
    B.3 Examples
C Building gretl
    C.1 Requirements
    C.2 Build instructions: a step-by-step guide
D Numerical accuracy
E Related free software
F Listing of URLs
Bibliography

Chapter 1. Introduction

1.1 Features at a glance

Gretl is an econometrics package, including a shared library, a command-line client program and a graphical user interface.

User-friendly: Gretl offers an intuitive user interface; it is very easy to get up and running with econometric analysis. Thanks to its association with the econometrics textbooks by Ramu Ramanathan, Jeffrey Wooldridge, and James Stock and Mark Watson, the package offers many practice data files and command scripts. These are well annotated and accessible. Two other useful resources for gretl users are the available documentation and the gretl-users mailing list.

Flexible: You can choose your preferred point on the spectrum from interactive point-and-click to batch processing, and can easily combine these approaches.

Cross-platform: Gretl's home platform is Linux, but it is also available for MS Windows and Mac OS X, and should work on any unix-like system that has the appropriate basic libraries (see Appendix C).

Open source: The full source code for gret
312. s a $unit accessor, but not the equivalent for time. What should I use?

Solution:

    series x = time

Comment: The special construct genr time and its variants are aware of whether a dataset is a panel.

13.3 Neat tricks

Interaction dummies

Problem: You want to estimate the model $y_t = x_t \beta_1 + z_t \beta_2 + d_t \beta_3 + (d_t \cdot z_t) \beta_4 + \varepsilon_t$, where $d_t$ is a dummy variable while $x_t$ and $z_t$ are vectors of explanatory variables.

Solution:

    list X = x1 x2 x3
    list Z = z1 z2
    list dZ = null
    loop foreach i Z
        series d$i = d * $i
        list dZ = dZ d$i
    end loop
    ols y X Z d dZ

Comment: It's amazing what string substitution can do for you, isn't it?

Chapter 13. Cheat sheet

Realized volatility

Problem: Given data by the minute, you want to compute the "realized volatility" for the hour as $RV_t = \frac{1}{60} \sum_{\tau=1}^{60} y_{t:\tau}^2$. Imagine your sample starts at time 1:1.

Solution:

    smpl --full
    genr time
    genr minute = int(time/60) + 1
    genr second = time % 60
    setobs minute second --panel
    genr rv = psd(y)^2
    setobs 1 1
    smpl second=1 --restrict
    store foo rv

Comment: Here we trick gretl into thinking that our dataset is a panel dataset, where the minutes are the "units" and the seconds are the "time"; this way, we can take advantage of the special function psd(), panel standard deviation. Then we simply drop all observations but one per minute and save the resulting data (store foo rv translates as "store in the gretl datafile foo.gdt the series rv"). Lo
313. s builds a matrix using all the series in the current dataset, apart from the constant (variable 0). When this dummy list is used, it must be the sole element in the matrix definition. You can, however, create a matrix that includes the constant along with all other variables using horizontal concatenation (see below), as in

    matrix A = {const}~{dataset}

The syntax

    matrix A = {}

creates an empty matrix, that is, a matrix with zero rows and zero columns. See section 12.2 for a discussion of this object.

Names of matrices must satisfy the same requirements as names of gretl variables in general: the name can be no longer than 15 characters, must start with a letter, and must be composed of nothing but letters, numbers and the underscore character.

12.2 Empty matrices

The main purpose of the concept of an empty matrix is to enable the user to define a starting point for subsequent concatenation operations. For instance, if X is an already defined matrix of any size, the commands

    matrix A = {}
    matrix B = A ~ X

result in a matrix B identical to X.

From an algebraic point of view, one can make sense of the idea of an empty matrix in terms of vector spaces: if a matrix is an ordered set of vectors, then A={} is the empty set. As a consequence, operations involving addition and multiplications don't have any clear meaning (arguably, they have none at all), but operations involving the cardinality of this set (that is, the dimension of the spa
314. s graphical interface. For details on how to do this, see section 10.5.

Chapter 17. Maximum likelihood estimation

Example 17.3: Zero-inflated Poisson Model, internal functions

    # compute the log probabilities for the plain Poisson model
    function ln_poi_prob(series y, list X, matrix beta)
        series xb = lincomb(X, beta)
        series ret = -exp(xb) + y*xb - lngamma(y+1)
        return series ret
    end function

    # compute the log probabilities for the zero-inflated Poisson model
    function ln_zip_prob(series y, list X, matrix beta, scalar p0)
        # check if the probability is in [0,1]; otherwise, return NA
        if (p0>1) || (p0<0)
            series ret = NA
        else
            series ret = ln_poi_prob(y, X, beta) + ln(1-p0)
            series ret = (y=0) ? ln(p0 + exp(ret)) : ret
        endif
        return series ret
    end function

    # do the actual estimation (silently)
    function zip_estimate(series y, list X)
        # initialize alpha to a "sensible" value: half the frequency
        # of zeros in the sample
        scalar alpha = mean(y=0)/2
        # initialize the coeffs (we assume the first explanatory
        # variable is the constant here)
        matrix coef = zeros(nelem(X), 1)
        coef[1] = mean(y) / (1-alpha)
        # do the actual ML estimation
        mle ll = ln_zip_prob(y, X, coef, alpha)
            params alpha coef
        end mle --hessian --quiet
        matrix ret = $coeff ~ $vcv
        return matrix ret
    end function

Chapter 18. GMM estimation

18.1 Introduction and terminology

The Generalized Method of Moments (GMM) is a very powerful and gener
315. s y 0 x
    # save the residuals
    genr ui = $uhat
    scalar ybar = mean(y)
    # number of replications for bootstrap
    scalar replics = 10000
    scalar tcount = 0
    series ysim = 0
    loop replics --quiet
        # generate simulated y by resampling
        ysim = ybar + resample(ui)
        ols ysim 0 x
        scalar tsim = abs($coeff(x) / $stderr(x))
        tcount += (tsim > 2.5)
    endloop
    printf "proportion of cases with |t| > 2.5 = %g\n", tcount / replics

Chapter 5. Special functions in genr

5.6 Cumulative densities and p-values

The two functions cdf and pvalue provide complementary means of examining values from several probability distributions: the standard normal, Student's t, $\chi^2$, F, gamma, and binomial. The syntax of these functions is set out in the Gretl Command Reference; here we expand on some subtleties.

The cumulative density function, or CDF, for a random variable is the integral of the variable's density from its lower limit (typically either $-\infty$ or 0) to any specified value x. The p-value (at least the one-tailed, right-hand p-value as returned by the pvalue function) is the complementary probability, the integral from x to the upper limit of the distribution, typically $+\infty$.

In principle, therefore, there is no need for two distinct functions: given a CDF value $p_0$ you could easily find the corresponding p-value as $1 - p_0$ (or vice versa). In practice, with finite-precision computer arithmetic, the two functions are not redundant. This requires a little explanation. In
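The finite-precision point made above about cdf and pvalue can be illustrated outside gretl. The following is a minimal Python sketch, not gretl code: the helper names normal_cdf and normal_pvalue are hypothetical, and both are built on the standard library's complementary error function. It shows that far in the right tail the CDF rounds to exactly 1, so computing the p-value as one minus the CDF loses all information, while a direct tail computation does not.

```python
import math

def normal_cdf(x):
    """P(Z <= x) for a standard normal Z (hypothetical helper)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def normal_pvalue(x):
    """Right-tail probability P(Z > x), computed directly (hypothetical helper)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

x = 10.0
# The CDF at 10 differs from 1 by far less than machine epsilon,
# so it is stored as exactly 1.0 and the subtraction yields 0.0.
via_cdf = 1.0 - normal_cdf(x)
# The direct tail probability is tiny but representable.
direct = normal_pvalue(x)
print(via_cdf, direct)
```

The same reasoning applies in mirror image to very small CDF values, which is why keeping two separate functions is useful in practice.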
316. scanf(fd, "%d %d", "C"); A = reshape(fscanf(fd, "%g", r*c), c, r)'; fclose(fd);

    Ox:  decl A = loadmat("a.mat");
    R:   A <- as.matrix(read.table("a.mat", skip=1))

Example 12.2: Matrix input/output via text files

    nulldata 64
    scalar n = 3
    string f1 = "a.csv"
    string f2 = "b.csv"

    matrix a = mnormal(n,n)
    matrix b = inv(a)

    err = mwrite(a, f1)

    if err != 0
        printf "Failed to write %s\n", f1
    else
        err = mwrite(b, f2)
    endif

    if err != 0
        printf "Failed to write %s\n", f2
    else
        c = mread(f1)
        d = mread(f2)
        a = c*d
        printf "The following matrix should be an identity matrix\n"
        print a
    endif

Footnote 3: Matlab users may find the Octave example helpful, since the two programs are mostly compatible with one another.

Chapter 12. Matrix manipulation

12.7 Matrix accessors

In addition to the matrix functions discussed above, various "accessor" strings allow you to create copies of internal matrices associated with models previously estimated. These are set out in Table 12.4.

    $coeff    vector of estimated coefficients
    $compan   companion matrix (after VAR or VECM estimation)
    $jalpha   matrix $\alpha$ (loadings) from Johansen's procedure
    $jbeta    matrix $\beta$ (cointegration vectors) from Johansen's procedure
    $jvbeta   covariance matrix for the unrestricted elements of $\beta$ from Johansen's procedure
    $rho      autoregressive coefficients for error process
    $sigma    residual covariance matrix
    $stderr   vector of estimated standard errors
    $uhat     matrix of residuals
317. should impose restrictions on $\mu_0$ and $\mu_1$ that are consistent with this judgement. For example, suppose that the data do not exhibit a discernible trend. This means that $\Delta y_t$ is on average zero, so it is reasonable to assume that its expected value is also zero. Write equation (21.2) as

    $\Gamma(L)\,\Delta y_t = \mu_0 + \mu_1 \cdot t + \alpha z_{t-1} + \varepsilon_t$    (21.4)

where $z_t = \beta' y_t$ is assumed to be stationary and therefore to possess finite moments. Taking unconditional expectations, we get

    $0 = \mu_0 + \mu_1 \cdot t + \alpha m_z$.

Since the left-hand side does not depend on $t$, the restriction $\mu_1 = 0$ is a safe bet. As for $\mu_0$, there are just two ways to make the above expression true: either $\mu_0 = 0$ with $m_z = 0$, or $\mu_0$ equals $-\alpha m_z$. The latter possibility is less restrictive in that the vector $\mu_0$ may be non-zero, but is constrained to be a linear combination of the columns of $\alpha$. In that case, $\mu_0$ can be written as $\alpha \cdot c$, and one may write (21.4) as

    $\Gamma(L)\,\Delta y_t = \alpha \begin{bmatrix} \beta' & c \end{bmatrix} \begin{bmatrix} y_{t-1} \\ 1 \end{bmatrix} + \varepsilon_t$.

The long-run relationship therefore contains an intercept. This type of restriction is usually written $\alpha_{\perp}' \mu_0 = 0$, where $\alpha_{\perp}$ is the left null space of the matrix $\alpha$.

An intuitive understanding of the issue can be gained by means of a simple example. Consider a series $x_t$ which behaves as follows:

    $x_t = m + x_{t-1} + \varepsilon_t$

where $m$ is a real number and $\varepsilon_t$ is a white noise process: $x_t$ is then a random walk with drift $m$. In the special case $m = 0$, the drift disappears and $x_t$ is a pure random walk. Consider now another process $y_t$, d
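The random walk with drift used in the example above can be simulated in a few lines. This is a minimal Python sketch rather than gretl code: the function name random_walk, the seed and the unit-variance Gaussian noise are all assumptions made for illustration. It shows that the average first difference of the simulated series is close to the drift m, which is the sense in which a non-zero drift produces a trend in the level.

```python
import random

def random_walk(n, m, seed=12345):
    """Simulate x_t = m + x_{t-1} + e_t with x_0 = 0 and standard normal e_t."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        x.append(m + x[-1] + rng.gauss(0.0, 1.0))
    return x

n, m = 10000, 0.5
x = random_walk(n, m)

# First differences: each one equals m plus a white noise term.
diffs = [x[t] - x[t - 1] for t in range(1, n + 1)]
mean_diff = sum(diffs) / n
print("average first difference:", mean_diff)
```

Setting m to 0 in the call gives the pure random walk case, in which the first differences average out to roughly zero.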
318. son, artificial data
    leverage   Influential observations, artificial data
    longley    Multicollinearity, US employment

If you want to make your own data collection available to users, these are the steps:

1. Assemble the data, in whatever format is convenient.

2. Convert the data to gretl format and save as gdt files. It is probably easiest to convert the data by importing them into the program from plain text, CSV, or a spreadsheet format (MS Excel or Gnumeric), then saving them. You may wish to add descriptions of the individual variables (the "Variable, Edit attributes" menu item), and add information on the source of the data (the "Data, Edit info" menu item).

3. Write a descriptions file for the collection using a text editor.

4. Put the datafiles plus the descriptions file in a subdirectory of the gretl data directory (or user directory).

5. If the collection is to be distributed to other people, package the data files and catalog in some suitable manner, e.g. as a zipfile.

If you assemble such a collection, and the data are not proprietary, we would encourage you to submit the collection for packaging as a gretl optional extra.

Chapter 5. Special functions in genr

5.1 Introduction

The genr command provides a flexible means of defining new variables. It is documented in the Gretl Command Reference. This chapter offers a more expansive discussion of some of the special functio
319. sqft
    list cubelist = make_cubes(xlist)
    print xlist cubelist --byobs
    labels

Note that the return statement does not cause the function to return (exit) at the point where it appears within the body of the function. Rather, it specifies which variable is available for assignment when the function exits, and a function exits only when (a) the end of the function code is reached, (b) a gretl error occurs, or (c) a funcerr statement is reached.

The funcerr keyword, which may be followed by a string enclosed in double quotes, causes a function to exit with an error flagged. If a string is provided, this is printed on exit; otherwise a generic error message is printed. This mechanism enables the author of a function to pre-empt an ordinary execution error and/or offer a more specific and helpful error message. For example,

    if nelem(xlist) = 0
        funcerr "xlist must not be empty"
    endif

Error checking

When gretl first reads and "compiles" a function definition there is minimal error-checking: the only checks are that the function name is acceptable and, so far as the body is concerned, that you are not trying to define a function inside a function (see Section 10.1). Otherwise, if the function body contains invalid commands, this will become apparent only when the function is called and its commands are executed.

Chapter 10. User-defined functions

Debugging

The usual mechanism whereby gretl echoes commands and reports on the creation o
320. ste the output of xtab to some other program, e.g. a spreadsheet, you may want to use the --zeros option; this option causes cells with zero frequency to display the number 0 instead of being empty.

Chapter 9. Loop constructs

9.1 Introduction

The command loop opens a special mode in which gretl accepts a block of commands to be repeated zero or more times. This feature may be useful for, among other things, Monte Carlo simulations, bootstrapping of test statistics and iterative estimation procedures. The general form of a loop is:

    loop control-expression [ --progressive | --verbose | --quiet ]
        loop body
    endloop

Five forms of control-expression are available, as explained in section 9.2.

Not all gretl commands are available within loops. The commands that are not presently accepted in this context are shown in Table 9.1.

Table 9.1: Commands not usable in loops

    boxplot   corrgm    cusum     data      delete    eqnprint  function
    hurst     include   leverage  modeltab  nulldata  open      qlrtest
    rmplot    run       scatters  setmiss   setobs    tabprint  vif
    xcorrgm

By default, the genr command operates quietly in the context of a loop (without printing information on the variable generated). To force the printing of feedback from genr you may specify the --verbose option to loop. The --quiet option suppresses the usual printout of the number of iterations performed, which may be desirable when loops are nested.

The --progressive option to loop modifies the behavior of the
321. such that $F(\omega) = A(e^{i\omega})A(e^{-i\omega})$ is the best approximation to $F^*(\omega)$ for a given $k$. Clearly, the higher $k$ the better the approximation is, but since $2k$ observations have to be discarded, a compromise is usually sought. Moreover, the filter has also other appealing theoretical properties, among which the property that $A(1) = 0$, so a series with a single unit root is made stationary by application of the filter.

In practice, the filter is normally used with monthly or quarterly data to extract the "business cycle" component, namely the component between 6 and 36 quarters. Usual choices for $k$ are 8 or 12 (maybe higher for monthly series). The default values for the frequency bounds are 8 and 32, and the default value for the approximation order, $k$, is 8. You can adjust these values using the set command. The keyword for setting the frequency limits is bkbp_limits and the keyword for $k$ is bkbp_k. Thus for example if you were using monthly data and wanted to adjust the frequency bounds to 18 and 96, and $k$ to 24, you could do

    set bkbp_limits 18 96
    set bkbp_k 24

These values would then remain in force for calls to the bkfilt function until changed by a further use of set.

5.4 Panel data specifics

Dummy variables

In a panel study you may wish to construct dummy variables of one or both of the following sorts: (a) dummies as unique identifiers for the units or groups, and (b) dummies as unique identifiers for the time periods. The former may be used
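The band-pass filter described above, with its order $k$ and frequency bounds, can be sketched as follows. This is a Python illustration of the standard textbook Baxter-King construction, not gretl's internal code: the function names bk_weights and bk_filter are hypothetical, and the defaults simply mirror the values 8, 32 and k = 8 quoted in the text. The weights are the truncated ideal band-pass weights, shifted by a constant so that they sum to zero, which is the property $A(1) = 0$.

```python
import math

def bk_weights(k=8, p_low=8, p_high=32):
    """Return the 2k+1 symmetric weights a_{-k}, ..., a_k (hypothetical helper)."""
    w1 = 2.0 * math.pi / p_high   # lower frequency bound
    w2 = 2.0 * math.pi / p_low    # upper frequency bound
    # Ideal band-pass weights for j = 0, 1, ..., k
    b = [(w2 - w1) / math.pi]
    for j in range(1, k + 1):
        b.append((math.sin(w2 * j) - math.sin(w1 * j)) / (math.pi * j))
    # Subtract a constant so that the weights sum to zero: A(1) = 0
    total = b[0] + 2.0 * sum(b[1:])
    adj = total / (2 * k + 1)
    half = [bj - adj for bj in b]
    return half[:0:-1] + half

def bk_filter(y, k=8, p_low=8, p_high=32):
    """Apply the filter; k observations are lost at each end of the sample."""
    a = bk_weights(k, p_low, p_high)
    return [sum(a[j + k] * y[t + j] for j in range(-k, k + 1))
            for t in range(k, len(y) - k)]

# A linear trend is removed entirely, since the weights are
# symmetric and sum to zero.
trend = [3.0 + 0.5 * t for t in range(60)]
cycle = bk_filter(trend)
print(len(trend), len(cycle))
```

The shorter output series makes the cost of a larger k visible: the result has 2k fewer observations than the input.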
322. t it is possible to compute an unbiased predictor of $y_i$ by summing this estimate to $x_i \hat{\beta}$. Script 22.5 shows an example. As a further similarity with Tobit, the interval regression model may deliver inconsistent estimates if the disturbances are non-normal; hence, the Chesher-Irish (1987) test for normality is included by default here too.

22.6 Sample selection model

In the sample selection model (also known as "Tobit II" model), there are two latent variables:

    $y_i^* = \sum_{j=1}^{k} x_{ij} \beta_j + \varepsilon_i$    (22.10)

    $s_i^* = \sum_{j=1}^{p} z_{ij} \gamma_j + \eta_i$    (22.11)

and the observation rule is given by

    $y_i = y_i^*$ for $s_i^* > 0$; $y_i = \diamond$ for $s_i^* \le 0$    (22.12)

In this context, the $\diamond$ symbol indicates that for some observations we simply do not have data on $y$: $y_i$ may be 0, or missing, or anything else. A dummy variable $d_i$ is normally used to set censored observations apart.

One of the most popular applications of this model in econometrics is a wage equation coupled with a labor force participation equation: we only observe the wage for the employed. If $y_i^*$ and $s_i^*$ were (conditionally) independent, there would be no reason not to use OLS for estimating equation (22.10); otherwise, OLS does not yield consistent estimates of the parameters $\beta_j$.

Chapter 22. Discrete and censored dependent variables

Since conditional independence between $y_i^*$ and $s_i^*$ is equivalent to conditional independence between $\varepsilon_i$ and $\eta_i$, one may model the co-dependence between $\varepsilon_i$ and $\eta_i$ as
323. t its contour lines are banana-shaped. It is defined by:

    $f(x, y) = (1 - x)^2 + 100\,(y - x^2)^2$

Example 5.2: Finding the minimum of the Rosenbrock function

    function Rosenbrock(matrix param)
        scalar x = param[1]
        scalar y = param[2]
        scalar f = -(1-x)^2 - 100 * (y - x^2)^2
        return scalar f
    end function

    nulldata 10
    matrix theta = { 0, 0 }
    set max_verbose 1
    M = BFGSmax(theta, "Rosenbrock(theta)")
    print theta

The function has a global minimum at $(x, y) = (1, 1)$ where $f(x, y) = 0$. Example 5.2 shows a gretl script that discovers the minimum using BFGSmax (giving a verbose account of progress).

Computing a Jacobian

Gretl offers the possibility of differentiating numerically a user-defined function via the fdjac function. This function again takes two arguments: an $n \times 1$ matrix holding initial parameter values and a function call that calculates and returns an $m \times 1$ matrix, given the current parameter values and any other relevant data. On successful completion it returns an $m \times n$ matrix holding the Jacobian. For example,

    matrix Jac = fdjac(theta, "SumOC(&theta, &X)")

where we assume that SumOC is a user-defined function with the following structure:

    function SumOC (matrix *theta, matrix *X)
        matrix V = ... # do some computation
        return matrix V
    end function

This may come in handy in several cases: for example, if you use BFGSmax to estimate a model, you may wish to calculate a numerical approx
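The idea behind a numerically differentiated Jacobian, as discussed above for fdjac, can be sketched in Python. This is an illustration of a plain forward-difference scheme and makes no claim about how gretl implements fdjac: the names fd_jacobian and rosenbrock_vec and the step size h are assumptions. The Rosenbrock function from the text is used as the test case, so the Jacobian is a 1 x 2 matrix (its gradient).

```python
def rosenbrock(theta):
    """f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, as defined in the text."""
    x, y = theta
    return (1.0 - x) ** 2 + 100.0 * (y - x ** 2) ** 2

def rosenbrock_vec(theta):
    """Wrap the scalar function as an m x 1 vector-valued function (m = 1)."""
    return [rosenbrock(theta)]

def fd_jacobian(func, theta, h=1e-6):
    """Forward-difference Jacobian of func: R^n -> R^m, returned as m rows of n."""
    f0 = func(theta)
    m, n = len(f0), len(theta)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        step = list(theta)
        step[j] += h              # perturb one parameter at a time
        fj = func(step)
        for i in range(m):
            jac[i][j] = (fj[i] - f0[i]) / h
    return jac

J_start = fd_jacobian(rosenbrock_vec, [0.0, 0.0])  # analytic gradient is (-2, 0)
J_min = fd_jacobian(rosenbrock_vec, [1.0, 1.0])    # analytic gradient is (0, 0)
print(J_start, J_min)
```

At the minimum (1, 1) the numerical gradient is close to zero, which is a quick way to check a candidate optimum returned by a maximizer or minimizer.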
324. t series, with the names of the newly created variables governed by the string argument.

    function mavg (series y, string vname)
        series @vname_2 = (y+y(-1)) / 2
        series @vname_4 = (y+y(-1)+y(-2)+y(-3)) / 4
        list retlist = @vname_2 @vname_4
        return list retlist
    end function

    open data9-9
    list malist = mavg(nocars, "nocars")
    print malist --byobs

The last line of the script will print two variables named nocars_2 and nocars_4. For details on the handling of named strings, see chapter 11.

If a string argument is considered optional, it may be given a null default value, as in

    function foo (series y, string vname[null])

Retrieving the names of arguments

The variables given as arguments to a function are known inside the function by the names of the corresponding parameters. For example, within the function whose signature is

    function somefun (series y)

we have the series known as y. It may be useful, however, to be able to determine the names of the variables provided as arguments. This can be done using the function argname, which takes the name of a function parameter as its single argument and returns a string. Here is a simple illustration:

    function namefun (series y)
        printf "the series given as 'y' was named %s\n", argname(y)
    end function

    open data9-7
    namefun(QNC)

Chapter 10. User-defined functions

This produces the output

    the series given as 'y' was named QNC

Please note that this will not always work: the arg
325. tab in the Gretl Command Reference.

Chapter 3. Modes of working

The graph page

The "graph page" icon in the session window offers a means of putting together several graphs for printing on a single page. This facility will work only if you have the LaTeX typesetting system installed, and are able to generate and view either PDF or PostScript output.

In the Icon view window, you can drag up to eight graphs onto the graph page icon. When you double-click on the icon (or right-click and select "Display"), a page containing the selected graphs (in PDF or EPS format) will be composed and opened in your viewer. From there you should be able to print the page.

To clear the graph page, right-click on its icon and select "Clear".

On systems other than MS Windows, you may have to adjust the setting for the program used to view PostScript. Find that under the "Programs" tab in the Preferences dialog box (under the Tools menu in the main window). On Windows, you may need to adjust your file associations so that the appropriate viewer is called for the "Open" action on files with the .ps extension.

Saving and re-opening sessions

If you create models or graphs that you think you may wish to re-examine later, then before quitting gretl select "Session files, Save session" from the File menu and give a name under which to save the session. To re-open the session later, either

• Start gretl then re
326. table. For example, imagine that the database holds a table called "NatAccounts", containing the data shown in Table B.1. The SQL statement

    SELECT qtr, tradebal, gdp FROM NatAccounts WHERE year=1970;

(For background on SQL, see http://en.wikipedia.org/wiki/SQL.)

Appendix B. Data import via ODBC

    year  qtr  gdp     consump   tradebal
    1970  1    584763  344746.9  5891.01
    1970  2    597746  350176.9  7068.71
    1970  3    604270  355249.7  8379.27
    1970  4    609706  361794.7  7917.61
    1971  1    609597  362490    6274.3
    1971  2    617002  368313.6  6658.76
    1971  3    625536  372605    4795.89
    1971  4    630047  377033.9  6498.13

Table B.1: The "NatAccounts" table

produces the subset of the original data shown in Table B.2.

    qtr  tradebal  gdp
    1    5891.01   584763
    2    7068.71   597746
    3    8379.27   604270
    4    7917.61   609706

Table B.2: Result of a SELECT statement

Gretl provides a mechanism for forwarding your query to the DBMS via ODBC and including the results in your currently open dataset.

B.2 Syntax

At present, ODBC import is only possible via the command line interface. The two commands that gretl uses at present for fetching data via an ODBC connection are open and data.

The open command is used for connecting to a DBMS; its syntax is

    open dsn=database [user=username] [password=password] --odbc

The user and password items are optional; the effect of this command is to initiate an ODBC connection. It is assumed that the machine gretl runs on has a working ODBC client installed.
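The SELECT statement discussed above can be tried out without an ODBC setup. The sketch below is Python, not gretl, and uses the standard library's sqlite3 module with an in-memory database as a stand-in for a real DBMS reached over ODBC; that substitution, the column types, and the added ORDER BY clause are assumptions made for illustration. The rows are the NatAccounts values as they are printed in this section.

```python
import sqlite3

# (year, qtr, gdp, consump, tradebal), as printed in Table B.1
rows = [
    (1970, 1, 584763, 344746.9, 5891.01),
    (1970, 2, 597746, 350176.9, 7068.71),
    (1970, 3, 604270, 355249.7, 8379.27),
    (1970, 4, 609706, 361794.7, 7917.61),
    (1971, 1, 609597, 362490.0, 6274.3),
    (1971, 2, 617002, 368313.6, 6658.76),
    (1971, 3, 625536, 372605.0, 4795.89),
    (1971, 4, 630047, 377033.9, 6498.13),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE NatAccounts "
    "(year INTEGER, qtr INTEGER, gdp REAL, consump REAL, tradebal REAL)"
)
conn.executemany("INSERT INTO NatAccounts VALUES (?, ?, ?, ?, ?)", rows)

# The query from the text; ORDER BY is added so the row order is guaranteed.
result = conn.execute(
    "SELECT qtr, tradebal, gdp FROM NatAccounts WHERE year = 1970 ORDER BY qtr"
).fetchall()
conn.close()

for qtr, tradebal, gdp in result:
    print(qtr, tradebal, gdp)
```

The four rows returned correspond to Table B.2: the WHERE clause selects the observations and the column list selects and orders the variables.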
327. tave is a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab. It may also be used as a batch-oriented language."

• JMulTi (www.jmulti.de): "JMulTi was originally designed as a tool for certain econometric procedures in time series analysis that are especially difficult to use and that are not available in other packages, like Impulse Response Analysis with bootstrapped confidence intervals for VAR/VEC modelling. Now many other features have been integrated as well to make it possible to convey a comprehensive analysis." Comment: JMulTi is a java GUI program: you need a java run-time environment to make use of it.

As mentioned above, gretl offers the facility of exporting data in the formats of both Octave and R. In the case of Octave, the gretl data set is saved as a single matrix, X. You can pull the X matrix apart if you wish, once the data are loaded in Octave; see the Octave manual for details. As for R, the exported data file preserves any time series structure that is apparent to gretl. The series are saved as individual structures. The data should be brought into R using the source() command.

In addition, gretl has a convenience function for moving data quickly into R. Under gretl's Tools menu, you will find the
328. te that in this sort of loop the index variable is always incremented by one at each iteration. If, for example, you have

    loop i=m..n

where m and n are scalar variables with values m > n at the time of execution, the index will not be decremented; rather, the loop will simply be bypassed. If you need more complex loop control, see the "for" form below.

The index loop is particularly useful in conjunction with the values() matrix function when some operation must be carried out for each value of some discrete variable (see chapter 8). Consider the following example:

Footnote: It is common programming practice to use simple, one-character names for such variables. However, you may use any name that is acceptable by gretl: up to 15 characters, starting with a letter, and containing nothing but letters, numerals and the underscore character.

Chapter 9. Loop constructs

    open greene22_2
    discrete Z8
    v8 = values(Z8)
    n = rows(v8)
    loop i=1..n
        scalar xi = v8[i]
        smpl (Z8=xi) --restrict --replace
        printf "mean(Y | Z8 = %g) = %8.5f, sd(Y | Z8 = %g) = %g\n", xi, mean(Y), xi, sd(Y)
    end loop

In this case, we evaluate the conditional mean and standard deviation of the variable Y for each value of Z8.

Foreach loop

The fourth form of loop control also uses an index variable, in this case to index a specified list of strings. The loop is executed once for each string in the list. This can be useful for performing repetitive
329. ted as a matter of course. In fact, the discussion of the classical standard errors (labeled "homoskedasticity-only") is confined to an Appendix.

Against this background it may be useful to set out and discuss all the various options offered by gretl in respect of robust covariance matrix estimation. The first point to notice is that gretl produces "classical" standard errors by default (in all cases apart from GMM estimation). In script mode you can get robust standard errors by appending the --robust flag to estimation commands. In the GUI program the model specification dialog usually contains a "Robust standard errors" check box, along with a "configure" button that is activated when the box is checked. The configure button takes you to a configuration dialog (which can also be reached from the main menu bar: Tools, Preferences, General, HCCME). There you can select from a set of possible robust estimation variants, and can also choose to make robust estimation the default.

The specifics of the available options depend on the nature of the data under consideration (cross-sectional, time series or panel) and also to some extent the choice of estimator. (Although we introduced robust standard errors in the context of OLS above, they may be used in conjunction with other estimators too.) The following three sections of this chapter deal with matters that are specific to the three sorts of data just mentioned. Note that addit
330. ter estimates are calculated using the Barrodale-Roberts method. This is simply because the Frisch-Newton code does not currently support the calculation of confidence intervals.

Two further details. First, the mechanisms for generating confidence intervals for quantile estimates require that the model has at least two regressors (including the constant). If the --intervals option is given for a model containing only one regressor, an error is flagged. Second, when a model is estimated in this mode, you can retrieve the confidence intervals using the accessor $coeff_ci. This produces a $k \times 2$ matrix, where $k$ is the number of regressors. The lower bounds are in the first column, the upper bounds in the second. See also section 23.5 below.

23.4 Multiple quantiles

As a further option, you can give tau as a matrix: either the name of a predefined matrix or in numerical form, as in {.05, .25, .5, .75, .95}. The given model is estimated for all the $\tau$ values and the results are printed in a special form, as shown below (in this case the --intervals option was also given).

    Model 1: Quantile estimates using the 235 observations 1-235
    Dependent variable: foodexp
    With 90 percent confidence intervals

    VARIABLE   TAU    COEFFICIENT   LOWER      UPPER
    const      0.05   124.880       98.3021    130.517
               0.25   95.4835       73.7861    120.098
               0.50   81.4822       53.2592    114.012
               0.75   62.3966       32.7449    107.314
               0.95   64.1040       46.2649    83.5790
    income     0.05   0.343361      0.343327   0.389750
               0.25   0.474103      0.420330
331. tes of the three components: by plotting them together with the original data, you should get a graph similar to Figure 25.6. The estimates of the variances can be seen by printing the vars matrix, as in

    ? print vars
    vars (4 x 1)

      0.00077185
          0.0000
       0.0013969
          0.0000

Footnote 4: This example will work on Linux and presumably on OSX without modifications. On the Windows platform, you may have to substitute the "/" character with "\".

Chapter 25. Gretl and R

[Figure 25.6: Estimated components from BSM. Panels include lg, slope and sea, each plotted over 1949 to 1961.]

That is, $\hat{\sigma}^2_\eta = 0.00077185$, $\hat{\sigma}^2_\zeta = 0$, $\hat{\sigma}^2_\omega = 0.0013969$, $\hat{\sigma}^2_\varepsilon = 0$. Notice that, since $\hat{\sigma}^2_\zeta = 0$, the estimate for $\beta_t$ is constant and the level component is simply a random walk with a drift.

25.5 Interacting with R from the command line

Up to this point we have spoken only of interaction with R via the GUI program. In order to do the same from the command line interface, gretl provides the foreign command. This enables you to embed non-native commands within a gretl script. A "foreign" block takes the form

    foreign language=R [--send-d
332. that, in addition to the above requirements, offers GTK version 2.4.0 or higher (see gtk.org).

Gretl calls gnuplot for graphing. You can find gnuplot at gnuplot.info. As of this writing, the most recent official release is 4.2 (of March 2007). The MS Windows version of gretl comes with a Windows version of gnuplot 4.2; the gretl website also offers an rpm of gnuplot 3.8j0 for x86 Linux systems.

Some features of gretl make use of portions of Adrian Feguin's gtkextra library. The relevant parts of this package are included (in slightly modified form) with the gretl source distribution.

A binary version of the program is available for the Microsoft Windows platform (Windows 98 or higher). This version was cross-compiled under Linux using mingw (the GNU C compiler, gcc, ported for use with win32) and linked against the Microsoft C library, msvcrt.dll. It uses Tor Lillqvist's port of GTK 2.0 to win32. The (free, open-source) Windows installer program is courtesy of Jordan Russell (jrsoftware.org).

C.2 Build instructions: a step-by-step guide

In this section we give instructions detailed enough to allow a user with only a basic knowledge of a Unix-type system to build gretl. These steps were tested on a fresh installation of Debian Etch. For other Linux distributions (especially Debian-based ones, like Ubuntu and its derivatives) little should change. Other Unix-like operating systems such as MacOSX and BSD would probably require more substantial adjus
333. the Newey-West data-based method does not fully pin down the bandwidth for any particular sample. The first step involves calculating a series of residual covariances. The length of this series is given as a function of the sample size, but only up to a scalar multiple; for example, it is given as $O(T^{2/9})$ for the Bartlett kernel. Gretl uses an implied multiple of 1.

14.4 Special issues with panel data

Since panel data have both a time-series and a cross-sectional dimension one might expect that, in general, robust estimation of the covariance matrix would require handling both heteroskedasticity and autocorrelation (the HAC approach). In addition, some special features of panel data require attention.

• The variance of the error term may differ across the cross-sectional units.

• The covariance of the errors across the units may be non-zero in each time period.

• If the "between" variation is not removed, the errors may exhibit autocorrelation, not in the usual time-series sense but in the sense that the mean error for unit i may differ from that of unit j. (This is particularly relevant when estimation is by pooled OLS.)

Gretl currently offers two robust covariance matrix estimators specifically for panel data. These are available for models estimated via fixed effects, pooled OLS, and pooled two-stage least squares. The default robust estimator is that suggested by Arellano (2003), which is HAC provided the panel is of the "large n, small
334. the main gretl program item under the Programs heading in the Windows Start menu. Once the updater has completed its work you may restart gretl.

Part I. Running the program

Chapter 2. Getting started

2.1 Let's run a regression

This introduction is mostly angled towards the graphical client program; please see Chapter 27 below and the Gretl Command Reference for details on the command-line program, gretlcli.

You can supply the name of a data file to open as an argument to gretl, but for the moment let's not do that: just fire up the program. You should see a main window (which will hold information on the data set but which is at first blank) and various menus, some of them disabled at first.

What can you do at this point? You can browse the supplied data files (or databases), open a data file, create a new data file, read the help items, or open a command script. For now let's browse the supplied data files. Under the File menu choose "Open data, Sample file". A second notebook-type window will open, presenting the sets of data files supplied with the package (see Figure 2.1). Select the first tab, "Ramanathan". The numbering of the files in this section corresponds to the chapter organization of Ramanathan (2002), which contains discussion of the analysis of these data. The data will be useful for practice purposes even without the text.

[Figure 2.1: the sample data files window, with tabs Greene, Gujarati, Penn World Table, Ramanathan and Stock-Watson.]
335. the relevant function definitions. Clicking "Save" in this dialog leads you to a File Save dialog. All being well, this should be pointing towards a directory named functions, either under the gretl system directory (if you have write permission on that) or the gretl user directory. This is the recommended place to save function package files, since that is where the program will look in the special routine for opening such files (see below).

Needless to say, the menu command "File, Function files, Edit package" allows you to make changes to a local function package.

[Figure 10.2: The package editor window]

A word on the file you just saved. By default, it will have a .gfn extension. This is a "function package" file: unlike an ordinary gretl script file, it is an XML file containing both the function code and the extra information entered in the packager. Hackers might wish to write such a file from scratch rather than using the GUI packager, but most people are likely to find it awkward. Note
336. the stacked series, but this time we want gretl to start reading from the 50th row of the original data, and we specify --offset=50. Line 4 imposes a panel interpretation on the data; finally, we save the data in gretl format, with the panel interpretation, discarding the original variables p1 through p5.

The illustrative script above is appropriate when the number of variables to be processed is small. When there are many variables in the data set it will be more efficient to use a command loop to accomplish the stacking, as shown in the following script. The setup is presumed to be the same as in the previous section (50 units, 5 periods), but with 20 variables rather than 2.

    open panel.txt
    loop for i=1..20
      genr k = ($i - 1) * 50
      genr x$i = stack(p1..p5) --offset=k --length=50
    endloop
    setobs 50 1.01 --stacked-cross-section
    store panel.gdt x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 \
      x11 x12 x13 x14 x15 x16 x17 x18 x19 x20

Panel data marker strings

It can be helpful with panel data to have the observations identified by mnemonic markers. A special function in the genr command is available for this purpose. In the example above, suppose all the states are identified by two-letter codes in the left-most column of the original datafile. When the stacking operation is performed, these codes will be stacked along with the data values. If the first row is marked AR for Arkansas, then the marker AR will end up being shown on each row containing an observatio
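The offset arithmetic in the loop above can be illustrated outside gretl. The following is a minimal Python sketch, not gretl code: the helper `stack_series` and the toy data are hypothetical, and it only assumes that stacking takes `length` rows from each period column, starting at a zero-based row `offset`, and lays them end to end.

```python
def stack_series(columns, offset, length):
    """Take `length` rows from each column, starting at 0-based row
    `offset`, and place the pieces end to end (period by period)."""
    stacked = []
    for col in columns:
        stacked.extend(col[offset:offset + length])
    return stacked


# Toy layout (hypothetical): 2 variables, 3 units, 2 periods.
# Rows 0-2 of each period column hold variable 1, rows 3-5 hold variable 2.
p1 = [1, 2, 3, 10, 20, 30]
p2 = [4, 5, 6, 40, 50, 60]
n_units = 3

# Variable i (counting from 1) starts at row (i - 1) * n_units,
# mirroring "k = ($i - 1) * 50" in the script above.
x1 = stack_series([p1, p2], offset=0 * n_units, length=n_units)
x2 = stack_series([p1, p2], offset=1 * n_units, length=n_units)
print(x1)
print(x2)
```

With 50 units the same rule gives offsets 0, 50, 100 and so on, which is what the loop computes for each of the 20 variables.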
337. they are not testable and should have no effect on the maximized likelihood. In addition, however, one may wish to formulate constraints on \(\beta\) and/or \(\alpha\) that derive from the economic theory underlying the equilibrium relationships; substantive restrictions of this sort are then testable via a likelihood-ratio statistic.

Gretl is capable of testing general linear restrictions of the form

\[ R_b \,\mathrm{vec}(\beta) = q \qquad (21.5) \]

and/or

\[ R_a \,\mathrm{vec}(\alpha) = 0 \qquad (21.6) \]

Note that the \(\beta\) restriction may be non-homogeneous (\(q \neq 0\)) but the \(\alpha\) restriction must be homogeneous. Nonlinear restrictions are not supported, and neither are restrictions that cross between \(\beta\) and \(\alpha\). In the case where r > 1 such restrictions may be in common across all the columns of \(\beta\) (or \(\alpha\)) or may be specific to certain columns of these matrices. This is the case discussed in Boswijk (1995) and Boswijk and Doornik (2004, section 4.4).

[Footnote 3] Note that in this context we are bending the usual matrix indexation convention, using the leading index to refer to the column of \(\beta\) (the particular cointegrating vector). This is standard practice in the literature, and defensible insofar as it is the columns of \(\beta\) (the cointegrating relations or equilibrium errors) that are of primary interest.

The restrictions (21.5) and (21.6) may be written in explicit form as

\[ \mathrm{vec}(\beta) = H\phi + h_0 \qquad (21.7) \]

and

\[ \mathrm{vec}(\alpha) = G\psi \qquad (21.8) \]

respectively, where \(\phi\) and \(\psi\) are th
338. tion, hence the maximum must be found via numerical methods. In some cases convergence may be difficult, and gretl provides several choices to solve the problem.

Switching and LBFGS

Two maximization methods are available in gretl. The default is the switching algorithm set out in Boswijk and Doornik (2004). The alternative is a limited-memory variant of the BFGS algorithm (LBFGS), using analytical derivatives. This is invoked using the --lbfgs flag with the restrict command.

The switching algorithm works by explicitly maximizing the likelihood at each iteration, with respect to \(\phi\), \(\psi\) and \(\Omega\) (the covariance matrix of the residuals) in turn. This method shares a feature with the basic Johansen eigenvalues procedure, namely, it can handle a set of restrictions that does not fully identify the parameters.

LBFGS, on the other hand, requires that the model be fully identified. When using LBFGS, therefore, you may have to supplement the restrictions of interest with normalizations that serve to identify the parameters. For example, one might use all or part of the Phillips normalization (see section 21.5).

Neither the switching algorithm nor LBFGS is guaranteed to find the global ML solution. The optimizer may end up at a local maximum (or, in the case of the switching algorithm, at a saddle point).

The solution (or lack thereof) may be sensitive to the initial value selected for \(\theta\). By default, gretl selects a starting point using a deterministic m
339. tion to the kpss command:

    kpss n y --trend

Note that in this case the asymptotic distribution of the test is different and the critical values reported by gretl differ accordingly.

Cointegration tests

FIXME discuss Engle-Granger here, and refer forward to the next chapter for the Johansen tests.

20.4 ARCH and GARCH

Heteroskedasticity means a non-constant variance of the error term in a regression model. Autoregressive Conditional Heteroskedasticity (ARCH) is a phenomenon specific to time series models, whereby the variance of the error displays autoregressive behavior; for instance, the time series exhibits successive periods where the error variance is relatively large, and successive periods where it is relatively small. This sort of behavior is reckoned to be quite common in asset markets: an unsettling piece of news can lead to a period of increased volatility in the market.

An ARCH error process of order q can be represented as

\[ u_t = \sigma_t \varepsilon_t; \qquad \sigma_t^2 \equiv E(u_t^2 \mid \Omega_{t-1}) = \alpha_0 + \sum_{i=1}^{q} \alpha_i u_{t-i}^2 \]

where the \(\varepsilon_t\)s are independently and identically distributed (iid) with mean zero and variance 1, and where \(\sigma_t\) is taken to be the positive square root of \(\sigma_t^2\). \(\Omega_{t-1}\) denotes the information set as of time t - 1 and \(\sigma_t^2\) is the conditional variance: that is, the variance conditional on information dated t - 1 and earlier.

It is important to notice the difference between ARCH and an ordinary autoregressive error
340. tl.export(compon)
      end foreign
      append @dotdir/compon.csv
      rename level @sx_level
      rename slope @sx_slope
      rename sea @sx_seas
      list ret = @sx_level @sx_slope @sx_seas
      return list ret
    end function

    open bjg.gdt
    list X = RStructTS(lg)

The above syntax, despite being already quite useful by itself, shows its full power when it is used in conjunction with user-written functions. Example 25.2 shows how to define a gretl function that calls R internally.

A note on performance: at present, when R is called from within gretl using a foreign block, the R program is started up on each invocation, which can be quite time-consuming. For maximum performance you should organize your script so as to group together as many R operations as possible, hence minimizing the number of distinct foreign blocks.

[Footnote 5] In future we may be able to improve on this using calls to the R shared library in place of invocations of the program.

Chapter 26: Troubleshooting gretl

26.1 Bug reports

Bug reports are welcome. Hopefully, you are unlikely to find bugs in the actual calculations done by gretl (although this statement does not constitute any sort of warranty). You may, however, come across bugs or oddities in the behavior of the graphical interface. Please remember that the usefulness of bug reports is greatly enhanced if you can be as specific as possible: what exactly went wrong, under what conditions, and on what operating
341. tments

[Footnote 1] Up till version 1.5.1, gretl could also be built using GTK 1.2. Support for this was dropped at version 1.6.0 of gretl.

In this guided example, we will build gretl complete with documentation. This introduces a few more requirements, but gives you the ability to modify the documentation files as well, like the help files or the manuals.

Installing the prerequisites

We assume that the basic GNU utilities are already installed on the system, together with these other programs:

• some TeX/LaTeX system (tetex or texlive will do beautifully)
• Gnuplot
• ImageMagick

We also assume that the user has administrative privileges and knows how to install packages. The examples below are carried out using the apt-get shell command, but they can be performed with menu-based utilities like aptitude, dselect or the GUI-based program synaptic. Users of Linux distributions which employ rpm packages (e.g. Red Hat/Fedora, Mandriva, SuSE) may want to refer to the dependencies page on the gretl website.

The first step is installing the C compiler and related utilities. On a Debian system, these are contained in a bunch of packages that can be installed via the command

    apt-get install gcc autoconf automake1.9 libtool flex bison gcc-doc \
      libc6-dev libc-dev libgfortran1 libgfortran1-dev gettext pkgconfig

Then it is necessary to install the "development" (dev) packages for the libraries that gretl uses
342. to is1 since xlogs is in fact a named list. The second genr will assign 0 to is2 since xl1 is a data series, not a list.

You can also determine the number of variables or elements in a list using the function nelem().

    list xlist = 1 2 3
    nl = nelem(xlist)

The scalar variable nl will be assigned a value of 3 since xlist contains 3 members.

You can display the membership of a named list just by giving its name, as illustrated in this interactive session:

    ? list xlist = x1 x2 x3
    Added list 'xlist'
    ? xlist
    x1 x2 x3

Note that print xlist will do something different, namely print the values of all the variables in xlist (as should be expected).

Generating lists of transformed variables

Given a named list of variables, you are able to generate lists of transformations of these variables using the functions log, lags, diff, ldiff, sdiff or dummify. For example:

    list xlist = x1 x2 x3
    list lxlist = log(xlist)
    list difflist = diff(xlist)

When generating a list of lags in this way, you specify the maximum lag order inside the parentheses, before the list name and separated by a comma. For example:

    list xlist = x1 x2 x3
    list laglist = lags(2, xlist)

or

    scalar order = 4
    list laglist = lags(order, xlist)

These commands will populate laglist with the specified number of lags of the variables in xlist. You can give the name of a single series in place of a list as the second argument to
343. typewriter-style font (e.g. Courier) to preserve the output's tabular formatting. Select a small font (10-point Courier should do) to prevent the output lines from being broken in the wrong place.

[Footnote 2] Note that when you copy as RTF under MS Windows, Windows will only allow you to paste the material into applications that "understand" RTF. Thus you will be able to paste into MS Word, but not into notepad. Note also that there appears to be a bug in some versions of Windows, whereby the paste will not work properly unless the "target" application (e.g. MS Word) is already running prior to copying the material in question.

2.3 The main window menus

Reading left to right along the main window's menu bar, we find the File, Tools, Data, View, Add, Sample, Variable, Model and Help menus.

• File menu
  - Open data: Open a native gretl data file or import from other formats. See Chapter 4.
  - Append data: Add data to the current working data set, from a gretl data file, a comma-separated values file or a spreadsheet file.
  - Save data: Save the currently open native gretl data file.
  - Save data as: Write out the current data set in native format, with the option of using gzip data compression. See Chapter 4.
  - Export data: Write out the current data set in Comma Separated Values (CSV) format, or the formats of GNU R or GNU Octave. See Chapter 4 and also Appendix
344. uals 0

The above reasoning can be generalized as follows: suppose \(\theta\) is an n-vector and we have m relations like

\[ E\left[ f_i(x_t, \theta) \mid z_t \right] = 0 \quad \text{for } i = 1 \ldots m \qquad (18.3) \]

where \(E[\cdot \mid z_t]\) is a conditional expectation on a set of p variables \(z_t\), called the instruments. In the above simple example, m = 1 and \(f(x_t, \theta) = x_t - g(\theta)\), and the only instrument used is \(z_t = 1\). Then, it must also be true that

\[ E\left[ f_i(x_t, \theta) \cdot z_{j,t} \right] = E\left[ f_{i,j,t}(\theta) \right] = 0 \quad \text{for } i = 1 \ldots m \text{ and } j = 1 \ldots p; \qquad (18.4) \]

equation (18.4) is known as an orthogonality condition, or moment condition. The GMM estimator is defined as the minimum of the quadratic form

\[ F(\theta, W) = \bar{f}' W \bar{f} \qquad (18.5) \]

where \(\bar{f}\) is a \(1 \times m \cdot p\) vector holding the average of the orthogonality conditions and W is some symmetric, positive definite matrix, known as the weights matrix. A necessary condition for the minimum to exist is the order condition \(n \le m \cdot p\).

The statistic

\[ \hat{\theta} = \operatorname*{Argmin}_{\theta} F(\theta, W) \qquad (18.6) \]

is a consistent estimator of \(\theta\) whatever the choice of W. However, to achieve maximum asymptotic efficiency W must be proportional to the inverse of the long-run covariance matrix of the orthogonality conditions; if W is not known, a consistent estimator will suffice.

These considerations lead to the following empirical strategy:

1. Choose a positive definite W and compute the one-step GMM estimator \(\hat{\theta}_1\). Customary choices for W are \(I_{m \cdot p}\) or \(I_m \otimes (Z'Z)^{-1}\).
2. Use \(\hat{\theta}_1\) to estimate \(V(f_{i,j,t}(\theta))\) and use its inverse as the
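For the simplest case mentioned above (m = 1, a single instrument \(z_t = 1\)), the quadratic form (18.5) can be checked numerically. The Python sketch below is illustrative only, not gretl code: it assumes \(g(\theta) = \theta\), so that the single moment condition is \(E(x_t - \theta) = 0\), and the function name and data are hypothetical.

```python
def gmm_objective(theta, x, w=1.0):
    """F(theta, W) = fbar * W * fbar for one moment condition,
    with f_t(theta) = x_t - theta and instrument z_t = 1."""
    fbar = sum(xi - theta for xi in x) / len(x)  # average orthogonality condition
    return fbar * w * fbar


x = [2.0, 4.0, 9.0]               # hypothetical sample
theta_hat = sum(x) / len(x)       # the sample mean sets fbar to zero

print(gmm_objective(theta_hat, x))      # objective at the minimum
print(gmm_objective(4.0, x, w=2.0))     # any other theta gives F > 0
```

Because there is one parameter and one orthogonality condition, the objective is zero at the minimum and the minimizer does not depend on the (positive) weight w, in line with the exactly identified case discussed in the text.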
345. ues from 1. An alternative perspective is given by GMM. We define the residual \(\hat{u}_t\) as \(y_t - X_t \hat{\beta}\), as usual. But instead of relying on \(E(u \mid X) = 0\), as in OLS, we base estimation on the condition \(E(u \mid Z) = 0\). In this case it is natural to base the initial weighting matrix on the covariance matrix of the instruments. Example 18.2 presents a model from Stock and Watson's Introduction to Econometrics. The demand for cigarettes is modeled as a linear function of the logs of price and income; income is treated as exogenous while price is taken to be endogenous and two measures of tax are used as instruments. Since we have two instruments and one endogenous variable the model is over-identified, and therefore the weights matrix will influence the solution. Partial output from this script is shown in 18.3. The estimated standard errors from GMM are robust by default; if we supply the --robust option to the tsls command we get identical results.

18.4 Covariance matrix options

The covariance matrix of the estimated parameters depends on the choice of W through

\[ \hat{\Sigma} = (J'WJ)^{-1} J'W \Omega W J (J'WJ)^{-1} \qquad (18.8) \]

where J is a Jacobian term

\[ J_{ij} = \frac{\partial \bar{f}_i}{\partial \theta_j} \]

and \(\Omega\) is the long-run covariance matrix of the orthogonality conditions.

Gretl computes J by numeric differentiation (there is no provision for specifying a user-supplied analytical expression for J at the moment). As for \(\Omega\), a consistent estimate is needed. The simplest choice is the sample covariance matrix of the \(f_t\)s
346. uiet
      if !isnull(ess)
        ess = $ess
      endif
      series uh = $uhat
      return series uh
    end function

If the caller does not care to get the ess value, it can use null in place of a real argument:

    series resid = get_uhat_and_ess(price, xlist, null)

Alternatively, trailing function arguments that have default values may be omitted, so the following would also be a valid call:

    series resid = get_uhat_and_ess(price, xlist)

Pointer arguments may also be useful for optimizing performance: even if a variable is not modified inside the function, it may be a good idea to pass it as a pointer if it occupies a lot of memory. Otherwise, the time gretl spends transcribing the value of the variable to the local copy may be non-negligible, compared to the time the function spends doing the job it was written for.

Example 10.1 takes this to the extreme. We define two functions which return the number of rows of a matrix (a pretty fast operation). One function gets a matrix as argument, the other one a pointer to a matrix. The two functions are evaluated on a matrix with 2000 rows and 2000 columns; on a typical system, floating-point numbers take 8 bytes of memory, so the space occupied by the matrix is roughly 32 megabytes.

Running the code in example 10.1 will produce output similar to the following (the actual numbers depend on the machine you're running the example on):

    Elapsed time: without pointers (copy) with pointers
347. ularly recommended; you are probably better composing a function non-interactively.

For example, suppose you decide to package a function that returns the percentage change of a time series. Open a script file and type

    function pc(series y "Series to process")
      series foo = 100 * diff(y)/y(-1)
      return series foo
    end function

In this case, we have appended a string to the function argument, as explained in section 10.1, so as to make our interface more informative. This is not obligatory: if you omit the descriptive string, gretl will supply a predefined one.

Now run your function. You may want to make sure it works properly by running a few tests. For example, you may open the console and type

    genr x = uniform()
    genr dpcx = pc(x)
    print x dpcx --byobs

[Figure 10.1: Output of function check]

You should see something similar to figure 10.1. The function seems to work OK. Once your function is debugged, you may proceed to the next stage. Cr
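To make the arithmetic of pc explicit, here is a small Python sketch of the same percentage-change formula, \(100 \cdot (y_t - y_{t-1}) / y_{t-1}\). It is only an illustration of the calculation, not gretl code; the use of None for the first observation (where no lag exists) is an assumption standing in for a missing value.

```python
def pc(y):
    """Percentage change of a sequence: 100 * (y[t] - y[t-1]) / y[t-1].
    The first element has no predecessor, so it is reported as None."""
    out = [None]
    for prev, cur in zip(y, y[1:]):
        out.append(100.0 * (cur - prev) / prev)
    return out


series = [100.0, 110.0, 99.0]   # hypothetical data
print(pc(series))
```

A rise from 100 to 110 is a 10 percent increase, and a fall from 110 to 99 is a 10 percent decrease, which is a quick sanity check of the kind suggested in the text.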
348. uments given to functions may be anonymous variables, created on the fly, as in somefun(log(QNC)) or somefun(CPI/100). In that case the argname function fails to return a string. Function writers who wish to make use of this facility should check the return from argname using the isstring() function, which returns 1 when given the name of a string variable, 0 otherwise.

Return values

Functions can return nothing (just printing a result, perhaps), or they can return a single variable: a scalar, series, list, matrix or string. The return value, if any, is specified via a statement within the function body beginning with the keyword return, followed by a type specifier and the name of a variable (as in the listing of parameters). There can be only one such statement. An example of a valid return statement is shown below:

    return scalar SSR

Having a function return a list is one way of permitting the "return" of more than one variable. That is, you can define several variables inside a function and package them as a list; in this case they are not destroyed when the function exits. Here is a simple example, which also illustrates the possibility of setting the descriptive labels for variables generated in a function.

    function make_cubes(list xlist)
      list cubes = null
      loop foreach i xlist --quiet
        series $i3 = (xlist.$i)^3
        setinfo $i3 -d "cube of $i"
        list cubes += $i3
      end loop
      return list cubes
    end function

    open data4-1
    list xlist = price
349. ut and \(\pi\) is inflation. The names for these variables in the gretl data file are m_p, rl, rs, y and infl, respectively.

The cointegration rank assumed by the authors is 3 and there are 5 variables, giving 15 elements in the \(\beta\) matrix. 3 × 3 = 9 restrictions are required for identification, and a just-identified system would have 15 - 9 = 6 free parameters. However, the postulated long-run relationships feature only three free parameters, so the over-identification rank is 3.

Example 21.1 replicates Table 4 on page 824 of the Brand and Cassola article. Note that we use the $lnl accessor after the vecm command to store the unrestricted log-likelihood and the $rlnl accessor after restrict for its restricted counterpart.

The example continues in script 21.2, where we perform further testing to check whether (a) the income elasticity in the money demand equation is 1 (\(\beta_y = 1\)) and (b) the Fisher relation is homogeneous (\(\phi = 1\)). Since the --full switch was given to the initial restrict command, additional restrictions can be applied without having to repeat the previous ones. (The second script contains a few printf commands, which are not strictly necessary, to format the output nicely.) It turns out that both of the additional hypotheses are rejected by the data, with p-values of 0.002 and 0.004.

[Footnote 4] A traditional formulation of the Fisher equation would reverse the roles of the variables in the second equation, but this detail is immaterial in the
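350. venient way of formalizing this situation is to consider the variable \(y_i\) as a Bernoulli random variable and analyze its distribution conditional on the explanatory variables \(x_i\). That is,

\[ y_i = \begin{cases} 1 & \text{with probability } P_i \\ 0 & \text{with probability } 1 - P_i \end{cases} \qquad (22.1) \]

where \(P_i = P(y_i = 1 \mid x_i)\) is a given function of the explanatory variables \(x_i\).

In most cases, the function \(P_i\) is a cumulative distribution function F, applied to a linear combination of the \(x_i\)s. In the probit model, the normal cdf is used, while the logit model employs the logistic function \(\Lambda()\). Therefore, we have

\[ \text{probit:} \quad P_i = F(z_i) = \Phi(z_i) \qquad (22.2) \]
\[ \text{logit:} \quad P_i = F(z_i) = \Lambda(z_i) = \frac{1}{1 + e^{-z_i}} \qquad (22.3) \]
\[ z_i = \sum_{j=1}^{k} x_{ij} \beta_j \qquad (22.4) \]

where \(z_i\) is commonly known as the index function. Note that in this case the coefficients \(\beta_j\) cannot be interpreted as the partial derivatives of \(E(y_i \mid x_i)\) with respect to \(x_{ij}\). However, for a given value of \(x_i\) it is possible to compute the vector of "slopes", that is

\[ \mathrm{slope}_j(\bar{x}) = \left. \frac{\partial F(z)}{\partial x_j} \right|_{z = \bar{z}} \]

Gretl automatically computes the slopes, setting each explanatory variable at its sample mean.

Another, equivalent way of thinking about this model is in terms of an unobserved variable \(y_i^*\) which can be described thus:

\[ y_i^* = \sum_{j=1}^{k} x_{ij} \beta_j + \varepsilon_i = z_i + \varepsilon_i \qquad (22.5) \]

We observe \(y_i = 1\) whenever \(y_i^* > 0\) and \(y_i = 0\) otherwise. If \(\varepsilon_i\) is assumed to be normal, then we have the probit model. The logit model arises if we assume that the density function of \(\varepsilon_i\) is

\[ \lambda(\varepsilon_i) = \frac{\partial \Lambda(\varepsilon_i)}{\partial \varepsilon_i} = \frac{e^{-\varepsilon_i}}{(1 + e^{-\varepsilon_i})^2} \]

Both the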
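The "slopes" described above follow from the chain rule: \(\partial F(z)/\partial x_j = f(z)\,\beta_j\), where f is the density corresponding to F, evaluated at the index computed from the sample means. The Python sketch below illustrates that calculation for both models. It is not gretl's implementation; the function names, coefficient values and means are hypothetical.

```python
import math
from statistics import NormalDist


def index(xbar, beta):
    """z = sum_j xbar_j * beta_j, the index function at the sample means."""
    return sum(x * b for x, b in zip(xbar, beta))


def logit_slopes(xbar, beta):
    """Slopes for the logit model: lambda(z) * beta_j, where
    lambda(z) = Lambda(z) * (1 - Lambda(z)) is the logistic density."""
    z = index(xbar, beta)
    cdf = 1.0 / (1.0 + math.exp(-z))
    density = cdf * (1.0 - cdf)
    return [density * b for b in beta]


def probit_slopes(xbar, beta):
    """Slopes for the probit model: phi(z) * beta_j, with phi the
    standard normal density."""
    z = index(xbar, beta)
    density = NormalDist().pdf(z)
    return [density * b for b in beta]


xbar = [1.0, 2.0]     # hypothetical sample means (first entry: the constant)
beta = [-1.0, 0.5]    # hypothetical coefficients, giving z = 0

print(logit_slopes(xbar, beta))
print(probit_slopes(xbar, beta))
```

At z = 0 the logistic density is 0.25 and the normal density is about 0.399, so the same coefficients imply different slopes under the two models; this is why the raw coefficients are not directly comparable across logit and probit.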
351. w org
Minpack: http://www.netlib.org/minpack/
Penn World Table: http://pwt.econ.upenn.edu/
Readline homepage: http://cnswww.cns.cwru.edu/~chet/readline/rltop.html
Readline manual: http://cnswww.cns.cwru.edu/~chet/readline/readline.html
Xmlsoft homepage: http://xmlsoft.org/

Bibliography

Agresti, A. (1992) "A Survey of Exact Inference for Contingency Tables", Statistical Science, 7, pp. 131-53.
Akaike, H. (1974) "A New Look at the Statistical Model Identification", IEEE Transactions on Automatic Control, AC-19, pp. 716-23.
Anderson, T. W. and Hsiao, C. (1981) "Estimation of Dynamic Models with Error Components", Journal of the American Statistical Association, 76, pp. 598-606.
Andrews, D. W. K. and Monahan, J. C. (1992) "An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator", Econometrica, 60, pp. 953-66.
Arellano, M. (2003) Panel Data Econometrics, Oxford: Oxford University Press.
Arellano, M. and Bond, S. (1991) "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations", The Review of Economic Studies, 58, pp. 277-97.
Baiocchi, G. and Distaso, W. (2003) "GRETL: Econometric software for the GNU generation", Journal of Applied Econometrics, 18, pp. 105-10.
Baltagi, B. H. (1995) Econometric Analysis of Panel Data, New York: Wiley.
Barrodale, I. and Roberts, F. D. K. (1974) "Solution of an overdetermined system of equat
352. weights matrix. The resulting estimator \(\hat{\theta}_2\) is called the two-step estimator.

3. Re-estimate \(V(f_{i,j,t}(\theta))\) by means of \(\hat{\theta}_2\) and obtain \(\hat{\theta}_3\); iterate until convergence. Asymptotically, these extra steps are unnecessary, since the two-step estimator is consistent and efficient; however, the iterated estimator often has better small-sample properties and should be independent of the choice of W made at step 1.

In the special case when the number of parameters n is equal to the total number of orthogonality conditions \(m \cdot p\), the GMM estimator \(\hat{\theta}\) is the same for any choice of the weights matrix W, so the first step is sufficient; in this case, the objective function is 0 at the minimum.

If, on the contrary, \(n < m \cdot p\), the second step (or successive iterations) is needed to achieve efficiency, and the estimator so obtained can be very different, in finite samples, from the one-step estimator. Moreover, the value of the objective function at the minimum, suitably scaled by the number of observations, yields Hansen's J statistic; this statistic can be interpreted as a test statistic that has a \(\chi^2\) distribution with \(m \cdot p - n\) degrees of freedom under the null hypothesis of correct specification. See Davidson and MacKinnon (1993, section 17.6) for details.

In the following sections we will show how these ideas are implemented in gretl through some examples.

18.2 OLS as GMM

It is instructive to start with a somewhat contrived example: consider the linear
353. which has the same basename as the data file plus the suffix .hdr. This file contains, in order:

• (Optional) comments on the data, set off by the opening string (* and the closing string *), each of these strings to occur on lines by themselves.
• (Required) list of white-space separated names of the variables in the data file. Names are limited to 8 characters, must start with a letter, and are limited to alphanumeric characters plus the underscore. The list may continue over more than one line; it is terminated with a semicolon.
• (Required) observations line of the form 1 1 85. The first element gives the data frequency (1 for undated or annual data, 4 for quarterly, 12 for monthly). The second and third elements give the starting and ending observations. Generally these will be 1 and the number of observations respectively, for undated data. For time-series data one can use dates of the form 1959.1 (quarterly, one digit after the point) or 1967.03 (monthly, two digits after the point). See Chapter 15 for special use of this line in the case of panel data.
• The keyword BYOBS.

Here is an example of a well-formed data header file.

    (*
    DATA9-6:
    Data on log(money), log(income) and interest rate from US.
    Source: Stock and Watson (1993) Econometrica
    (unsmoothed data) Period is 1900-1989 (annual data).
    Data compiled by Graham Elliott.
    *)
    lmoney lincome intrate ;
    1 1900 1989 BYOBS

The corres
354. window (again, following OLS estimation). The result of the test is reported and, if the \(TR^2\) from the auxiliary regression has a p-value less than 0.10, ARCH estimates are also reported. These estimates take the form of Generalized Least Squares (GLS), specifically weighted least squares, using weights that are inversely proportional to the predicted variances of the disturbances, \(\hat{\sigma}_t\), derived from the auxiliary regression.

In addition, the ARCH test is available after estimating a vector autoregression (VAR). In this case, however, there is no provision to re-estimate the model via GLS.

GARCH

The simple ARCH(q) process is useful for introducing the general concept of conditional heteroskedasticity in time series, but it has been found to be insufficient in empirical work. The dynamics of the error variance permitted by ARCH(q) are not rich enough to represent the patterns found in financial data. The generalized ARCH or GARCH model is now more widely used.

The representation of the variance of a process in the GARCH model is somewhat (but not exactly) analogous to the ARMA representation of the level of a time series. The variance at time t is allowed to depend on both past values of the variance and past values of the realized squared disturbance, as shown in the following system of equations:

\[ y_t = X_t \beta + u_t \qquad (20.10) \]
\[ u_t = \sigma_t \varepsilon_t \qquad (20.11) \]
\[ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i u_{t-i}^2 + \sum_{j=1}^{p} \delta_j \sigma_{t-j}^2 \qquad (20.12) \]

As above, \(\varepsilon_t\) is an ii
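The recursion in (20.12) can be traced step by step. The following Python sketch computes the conditional variance series for given disturbances and parameters. It is an illustration of the formula only, not gretl's estimator: the function name is hypothetical, and replacing unavailable pre-sample values of \(u^2\) and \(\sigma^2\) with a fixed `init_var` is an assumption made here for simplicity.

```python
def garch_variance(u, alpha0, alpha, delta, init_var):
    """Conditional variances from equation (20.12):
    sigma2[t] = alpha0 + sum_i alpha[i-1] * u[t-i]^2
                       + sum_j delta[j-1] * sigma2[t-j].
    Pre-sample u^2 and sigma^2 terms are set to init_var (an assumption)."""
    q, p = len(alpha), len(delta)
    sigma2 = []
    for t in range(len(u)):
        var_t = alpha0
        for i in range(1, q + 1):          # ARCH part: lagged squared disturbances
            u2 = u[t - i] ** 2 if t - i >= 0 else init_var
            var_t += alpha[i - 1] * u2
        for j in range(1, p + 1):          # GARCH part: lagged conditional variances
            s2 = sigma2[t - j] if t - j >= 0 else init_var
            var_t += delta[j - 1] * s2
        sigma2.append(var_t)
    return sigma2


# GARCH(1,1) with hypothetical parameter values and disturbances.
u = [1.0, 2.0, 0.0]
print(garch_variance(u, alpha0=0.5, alpha=[0.25], delta=[0.5], init_var=1.0))
```

Setting `delta=[]` reduces the recursion to the pure ARCH(q) case, which makes the relationship between the two models easy to see.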
355. wish to reference $test or $pvalue in connection with this command, you can control the variant that is recorded by using one of the flags --nc, --c, --ct or --ctt with adf.

• By default, the lmtest command (which must follow an OLS regression) performs several diagnostic tests on the regression in question. To control what is recorded in $test and $pvalue you should limit the test using one of the flags --logs, --autocorr, --squares or --white.

As an aid in working with values retrieved using $test and $pvalue, the nature of the test to which these values relate is written into the descriptive label for the generated variable. You can read the label for the variable using the label command (with just one argument, the name of the variable), to check that you have retrieved the right value. The following interactive session illustrates this point.

    ? adf 4 x1 --c
    Augmented Dickey-Fuller tests, order 4, for x1
    sample size 59
    unit-root null hypothesis: a = 1
      test with constant
      model: (1 - L)y = b0 + (a-1)*y(-1) + ... + e
      estimated value of (a - 1): -0.216889
      test statistic: t = -1.83491
      asymptotic p-value 0.3638
    P-values based on MacKinnon (JAE, 1996)
    ? genr pv = $pvalue
    Generated scalar pv (ID 13) = 0.363844
    ? label pv
    pv=Dickey-Fuller pvalue (scalar)

5.9 Numerical procedures

Two special functions are available to aid in the construction of special-purpose estimators, namely BFGSmax (the BFGS maximi
356. with your analysis.

6.4 Random sampling

With very large datasets (or perhaps to study the properties of an estimator) you may wish to draw a random sample from the full dataset. This can be done using, for example,

    smpl 100 --random

to select 100 cases. If you want the sample to be reproducible, you should set the seed for the random number generator first, using set. This sort of sampling falls under the "restriction" category: a reduced copy of the dataset is made.

6.5 The Sample menu items

The discussion above has focused on the script command smpl. You can also use the items under the Sample menu in the GUI program to select a sub-sample.

The menu items work in the same way as the corresponding smpl variants. When you use the item "Sample, Restrict based on criterion", and the dataset is already sub-sampled, you are given the option of preserving or replacing the current restriction. Replacing the current restriction means, in effect, invoking the --replace option described above (Section 6.3).

Chapter 7: Graphs and plots

7.1 Gnuplot graphs

A separate program, gnuplot, is called to generate graphs. Gnuplot is a very full-featured graphing program with myriad options. It is available from www.gnuplot.info (but note that a suitable copy of gnuplot is bundled with the packaged versions of gretl for MS Windows and Mac OS X). gretl gives you direct access, via a graphical interface, to a subset of gnuplot's options and it tries t
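The point about seeding in section 6.4 applies to any pseudo-random sampler. The Python sketch below shows the general idea only: it is not gretl code and does not reproduce gretl's generator; the function name `draw_subsample` is hypothetical.

```python
import random


def draw_subsample(n_obs, size, seed):
    """Draw `size` distinct observation indices out of n_obs without
    replacement, using a generator seeded with `seed`."""
    rng = random.Random(seed)          # a private generator with a fixed seed
    return sorted(rng.sample(range(n_obs), size))


first = draw_subsample(1000, 100, seed=42)
second = draw_subsample(1000, 100, seed=42)
print(first == second)   # the same seed selects the same 100 cases
```

Fixing the seed before sampling is what makes a later re-run of the script select the same cases, so that results based on the sub-sample can be replicated.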
357. wn specific sub-directory (e.g. /usr/share/gretl/data/mydata or c:\userdata\gretl\data\mydata).

The syntax of the (plain text) description files is straightforward. Here, for example, are the first few lines of gretl's "misc" data catalog:

    # Gretl: various illustrative datafiles
    "arma","artificial data for ARMA script example"
    "ects_nls","Nonlinear least squares example"
    "hamilton","Prices and exchange rate, U.S. and Italy"

The first line, which must start with a hash mark, contains a short name, here "Gretl", which will appear as the label for this collection's tab in the data browser window, followed by a colon, followed by an optional short description of the collection.

Subsequent lines contain two elements, separated by a comma and wrapped in double quotation marks. The first is a datafile name (leave off the .gdt suffix here) and the second is a short description of the content of that datafile. There should be one such line for each datafile in the collection.

A script catalog file looks very similar, except that there are three fields in the file lines: a filename (without its .inp suffix), a brief description of the econometric point illustrated in the script, and a brief indication of the nature of the data used. Again, here are the first few lines of the supplied "misc" script catalog:

    # Gretl: various sample scripts
    "arma","ARMA modeling","artificial data"
    "ects_nls","Nonlinear least squares","David
358. x of the error terms. Under the assumption that the error terms are independently and identically distributed (iid) we can write \(\Omega = \sigma^2 I\), where \(\sigma^2\) is the common variance of the errors (and the covariances are zero). In that case (14.3) simplifies to the "classical" formula,

\[ \mathrm{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1} \qquad (14.4) \]

If the iid assumption is not satisfied, two things follow. First, it is possible in principle to construct a more efficient estimator than OLS, for instance some sort of Feasible Generalized Least Squares (FGLS). Second, the simple "classical" formula for the variance of the least squares estimator is no longer correct, and hence the conventional OLS standard errors, which are just the square roots of the diagonal elements of the matrix defined by (14.4), do not provide valid means of statistical inference.

In the recent history of econometrics there are broadly two approaches to the problem of non-iid errors. The "traditional" approach is to use an FGLS estimator. For example, if the departure from the iid condition takes the form of time-series dependence, and if one believes that this could be modeled as a case of first-order autocorrelation, one might employ an AR(1) estimation method such as Cochrane-Orcutt, Hildreth-Lu, or Prais-Winsten. If the problem is that the error variance is non-constant across observations, one might estimate the variance as a function of the independent variables and then perform weighted least squares
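The simplification that leads to (14.4) can be written out in one line. The derivation below is a sketch under an assumption: equation (14.3) is not reproduced in this excerpt, and it is taken here to be the usual "sandwich" expression for the variance of the OLS estimator with general error covariance matrix \(\Omega\).

```latex
% Assumed form of (14.3): the general sandwich expression.
\[
\mathrm{Var}(\hat{\beta}) = (X'X)^{-1} X' \Omega X (X'X)^{-1}
\]
% Substituting the iid case, \Omega = \sigma^2 I:
\[
(X'X)^{-1} X' (\sigma^2 I) X (X'X)^{-1}
  = \sigma^2 (X'X)^{-1} (X'X) (X'X)^{-1}
  = \sigma^2 (X'X)^{-1}
\]
```

The middle term cancels only because \(\Omega\) is a scalar multiple of the identity; with any other \(\Omega\) the cancellation fails, which is why the classical standard errors are no longer valid when the errors are not iid.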
359. $\times n$ matrix $A$ and an $m \times p$ matrix $B$, the result is an $m \times (n + p)$ matrix. That is, matrix C = A ~ B produces $C = \begin{bmatrix} A & B \end{bmatrix}$. Row-wise concatenation of an $m \times n$ matrix $A$ and a $p \times n$ matrix $B$ produces an $(m + p) \times n$ matrix. That is, matrix C = A | B produces $C = \begin{bmatrix} A \\ B \end{bmatrix}$.

12.5 Matrix-scalar operators

For matrix $A$ and scalar $k$, the operators shown in Table 12.2 are available. (Addition and subtraction were discussed in section 12.4 but we include them in the table for completeness.) In addition, for square $A$ and integer $k > 0$, B = A^k produces a matrix $B$ which is $A$ raised to the power $k$.

    Expression           Effect
    matrix B = A * k     $b_{ij} = k a_{ij}$
    matrix B = A / k     $b_{ij} = a_{ij} / k$
    matrix B = k / A     $b_{ij} = k / a_{ij}$
    matrix B = A + k     $b_{ij} = a_{ij} + k$
    matrix B = A - k     $b_{ij} = a_{ij} - k$
    matrix B = k - A     $b_{ij} = k - a_{ij}$
    matrix B = A % k     $b_{ij} = a_{ij}$ modulo $k$

    Table 12.2: Matrix-scalar operators

12.6 Matrix functions

Most of the gretl functions available for scalars and series also apply to matrices in an element-by-element fashion, and as such their behavior should be pretty obvious. This is the case for functions such as log, exp, sin, etc. These functions have the effects documented in relation to the genr command. For example, if a matrix A is already defined, then

    matrix B = sqrt(A)

generates a matrix such that $b_{ij} = \sqrt{a_{ij}}$. All such functions require a single matrix as argument, or an expression which evaluates to a single matrix. In this se
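The dimension rules for concatenation, the element-wise scalar operators and the integer matrix power can be checked with plain nested lists. The following Python sketch mirrors the behavior described above; it is not gretl code, and the helper names (`hconcat`, `vconcat`, `scalar_op`, `matpow`) are assumptions made for illustration.

```python
import math

def hconcat(A, B):
    """Column-wise concatenation: (m x n) and (m x p) give m x (n + p)."""
    if len(A) != len(B):
        raise ValueError("row counts differ")
    return [ra + rb for ra, rb in zip(A, B)]

def vconcat(A, B):
    """Row-wise concatenation: (m x n) and (p x n) give (m + p) x n."""
    if len(A[0]) != len(B[0]):
        raise ValueError("column counts differ")
    return [row[:] for row in A + B]

def scalar_op(A, f):
    """Apply f to every element, as the matrix-scalar operators do."""
    return [[f(a) for a in row] for row in A]

def matmul(A, B):
    """Ordinary matrix product of conformable A and B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matpow(A, k):
    """Square A raised to the integer power k > 0 by repeated multiplication."""
    if k <= 0 or len(A) != len(A[0]):
        raise ValueError("need square A and integer k > 0")
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

A = [[1, 2], [3, 4]]
C = hconcat(A, [[5], [6]])          # 2 x 2 and 2 x 1 give 2 x 3
D = vconcat(A, [[7, 8]])            # 2 x 2 and 1 x 2 give 3 x 2
print(C, D)
print(scalar_op(A, lambda a: a % 2))  # element-wise modulo
print(scalar_op(A, math.sqrt))        # element-wise function
print(matpow(A, 2))                   # A times A, not element-wise squaring
```

The last line is worth noting: raising a matrix to a power means repeated matrix multiplication, which differs from squaring each element.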
360. ys prompted: opening a new data file will lose any unsaved work, do you really want to do this? When you execute a script that opens a data file, however, you are not prompted. The assumption is that in this case you're not going to lose any work, because the work is embodied in the script itself (and it would be annoying to be prompted at each iteration of the work cycle described above). This means you should be careful if you've done work using the graphical interface and then decide to run a script: the current data file will be replaced without any questions asked, and it's your responsibility to save any changes to your data first.

[1] This feature is not unique to gretl; other econometric packages offer the same facility. However, experience shows that while this can be remarkably useful, it can also lead to writing "dinosaur" scripts that are never meant to be executed all at once, but rather used as a chaotic repository to cherry-pick snippets from. Since gretl allows you to have several script windows open at the same time, you may want to keep your scripts tidy and reasonably small.

3.2 Saving script objects

When you estimate a model using point-and-click, the model results are displayed in a separate window, offering menus which let you perform tests, draw graphs, save data from the model, and so on. Ordinarily, when you estimate a model using a script you just get a non-interactive printout of the resu
361. zer (discussed in Chapter 17) and fdjac, which produces a forward-difference approximation to the Jacobian.

The BFGS maximizer

The BFGSmax function takes two arguments: a vector holding the initial values of a set of parameters, and a call to a function that calculates the (scalar) criterion to be maximized, given the current parameter values and any other relevant data. If the object is in fact minimization, this function should return the negative of the criterion. On successful completion, BFGSmax returns the maximized value of the criterion, and the matrix given via the first argument holds the parameter values which produce the maximum. Here is an example:

    matrix X = { dataset }
    matrix theta = { 1, 100 }'
    scalar J = BFGSmax(theta, "ObjFunc(&theta, &X)")

It is assumed here that ObjFunc is a user-defined function (see Chapter 10) with the following general set-up:

    function ObjFunc (matrix *theta, matrix *X)
        scalar val = ...  # do some computation
        return scalar val
    end function

The operation of the BFGS maximizer can be adjusted using the set variables bfgs_maxiter and bfgs_toler (see Chapter 17). In addition you can provoke verbose output from the maximizer by assigning a positive value to max_verbose, again via the set command.

The Rosenbrock function is often used as a test problem for optimization algorithms. It is also known as "Rosenbrock's Valley" or "Rosenbrock's Banana Function", on account of the fact tha
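Two ideas from the passage above lend themselves to a short illustration: the forward-difference approximation to derivatives, and the convention of returning the negative of a criterion when the real goal is minimization. The Python sketch below is not gretl's fdjac or BFGSmax; it handles only a scalar criterion (the gradient, a one-row special case of the Jacobian), uses the standard two-parameter form of the Rosenbrock function, and the helper names and step size `h` are assumptions.

```python
def rosenbrock(theta):
    """Standard Rosenbrock function; its minimum of 0 is at (1, 1)."""
    x, y = theta
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2

def neg_rosenbrock(theta):
    """Criterion for a maximizer: the negative of the function to be minimized."""
    return -rosenbrock(theta)

def forward_diff_gradient(f, theta, h=1e-6):
    """Forward-difference approximation: (f(theta + h*e_i) - f(theta)) / h."""
    f0 = f(theta)
    grad = []
    for i in range(len(theta)):
        bumped = list(theta)
        bumped[i] += h  # perturb one parameter at a time
        grad.append((f(bumped) - f0) / h)
    return grad

g_start = forward_diff_gradient(neg_rosenbrock, [0.0, 0.0])
g_opt = forward_diff_gradient(neg_rosenbrock, [1.0, 1.0])
print("gradient of the negated criterion at (0, 0):", g_start)
print("gradient of the negated criterion at (1, 1):", g_opt)
```

At the optimum the approximate gradient is close to zero but not exactly zero, since a forward difference carries an error proportional to the step size.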

