{smcl}
{* *! version 1.0.0 28jun2009}{...}
{cmd:help lsfeivreg}{right:dialog: {bf:{dialog lsfeivreg}} {space 14}}
{hline}

{title:Title}

{p2colset 5 17 19 2}{...}
{p2col :{hi:[R] lsfeivreg} {hline 2} N-Way Fixed Effects IV Regression}
{p_end}
{p2colreset}{...}


{title:Syntax}

{p 8 14 2}
{cmd:lsfeivreg} {depvar} {varlist:_1} {ifin} {weight} [{cmd:,} {it:{help lsfeivreg##options:options}}]{p_end}

{marker options}{...}
{synoptset 26 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Required}
{synopt :{cmdab:absorb:(}{varlist:_2}{cmd:)}}Categorical variables to include, either through a within transformation or as dummies{p_end}
{synopt :{cmdab:endog:(}{varlist:_e}{cmd:)}}List of endogenous variables{p_end}
{synopt :{cmdab:iv:(}{varlist:_iv}{cmd:)}}List of instruments{p_end}

{syntab:Optional}
{synopt :{opt nowith:in}}Use a dummy variable approach for all categorical variables{p_end}
{synopt :{opt full:vcv}}Store and display the full VCV matrix. The default is to store and display the submatrix corresponding to {varlist:_1} and {varlist:_e}{p_end}
{synopt :{opt r:obust}}Report heteroskedasticity-robust standard errors{p_end}
{synopt :{opt pred:ict}}Create variables with predicted values{p_end}
{synoptline}
{p2colreset}{...}
{p 4 6 2}{it:depvar} and {it:varlist_1} may contain time-series operators; see {help tsvarlist}.{p_end}
{p 4 6 2}{cmd:by} and {cmd:xi} are allowed; see {help prefix}.{p_end}


{title:Description}

{pstd}
{cmd:lsfeivreg} fits a linear instrumental-variables regression while absorbing multiple categorical factors. In other words, it estimates an IV regression with N-way fixed effects. It is the IV complement of {cmd:lsfereg}. It can also be used as the IV extension of {help areg}.

{pstd}
Of the variables specified in {cmd:absorb()}, the one with the largest number of unique values is absorbed through a within transformation (i.e., de-meaning).
For the remaining variables in {cmd:absorb()}, the program creates a dummy for each unique value of each variable and adds these dummies to the set of exogenous variables.

{pstd}
The alternative to {cmd:lsfeivreg} is to fit the regression with {help xtivreg}, provided that only one categorical variable is to be included and the dataset is in panel form. If more than one categorical variable is to be included, then {help xtivreg} can also be used jointly with the {help xi} prefix command.

{pstd}
{cmd:lsfeivreg} is convenient when the data are not in panel form. It also tends to outperform {help xtivreg} in speed, at the cost of higher memory consumption. These differences are especially acute when two or more large sets of categorical variables are included.

{pstd}
{cmd:lsfeivreg} is implemented as a {help mata} function and is thus not limited by Stata's matrix size limits (see {help matsize}).

{pstd}
One caveat is that the program uses significant amounts of memory. Memory used by Mata is independent of that assigned to Stata. It is recommended that the memory allocated to Stata (see {help memory}) be as small as possible, so as to free up memory for Mata.


{title:Options}

{dlgtab:Required}

{phang}
{cmd:absorb({varlist:_2})} specifies the categorical variables to be included in the regression. They may contain either string names or numeric IDs.

{phang}
{cmd:endog({varlist:_e})} specifies the variables that are endogenous. These variables should not be included in {it:varlist_1}. At least one variable should be specified in {it:varlist_e}.

{phang}
{cmd:iv({varlist:_iv})} specifies the variables that are to be used as excluded instruments. The number of variables in {it:varlist_iv} should be no less than the number of variables specified in {it:varlist_e}, and {it:varlist_iv} should not include variables specified in {it:varlist_1}.
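
{pmore}
As an illustration only, a minimal call supplying all three required options might look like the following sketch. The variable names {cmd:y}, {cmd:x1}, {cmd:x2}, {cmd:z1}, {cmd:z2}, {cmd:firm}, and {cmd:year} are hypothetical placeholders, not part of any shipped dataset:

{phang2}{cmd:. lsfeivreg y x1, absorb(firm year) endog(x2) iv(z1 z2)}{p_end}

{pmore}
In this sketch {cmd:x2} is treated as endogenous and is therefore kept out of {it:varlist_1}, and the two excluded instruments {cmd:z1} and {cmd:z2} satisfy the requirement of at least as many instruments as endogenous variables.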

{dlgtab:Optional}

{phang}
{cmd:nowithin} specifies that no categorical variable specified in {cmd:absorb()} be included using a within transformation. Dummy variables will be created for {it:all} groups of each categorical variable. Using a dummy variable approach implies larger memory consumption and slower performance. Using this option is generally not recommended.

{phang}
{cmd:fullvcv} specifies that the full VCV matrix from the regression be stored in memory and displayed. The default is to store only the submatrix of the VCV (and b vector) corresponding to the variables in {it:varlist_1} and {it:varlist_e}. This way, the default saves memory and makes the printout easy to read. The downside is that tests cannot be performed with a subset of the VCV matrix; they require the full matrix. {cmd:fullvcv} should be specified if a {help test} is to be performed.

{phang}
{cmd:robust} specifies that the robust (a.k.a. sandwich) estimator of the variance be used. If specified, the function performs the two-stage procedure to obtain the efficient GMM IV estimator (cf. Baum, Schaffer, and Stillman 2003). The first stage estimates b and the residuals. The second stage uses the estimated residuals to build an efficient weighting matrix, with which it re-estimates b and the VCV matrix.

{pmore}
This estimator is robust to some types of misspecification so long as the observations are independent; see {cmd:[U] 20.15 Obtaining robust variance estimates}.

{pmore}
Specifying {cmd:robust} can help solve problems with close-to-singular matrices when the model is exactly identified (the number of instruments equals the number of endogenous variables). The two-stage procedure uses an estimated weighting matrix in the second stage. Because the matrix being inverted is no longer the same (it has been multiplied by an appropriate weighting matrix), numerical problems with selecting which columns to drop sometimes disappear.
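
{pmore}
The optional settings above can be combined. As a hedged sketch with hypothetical variable names ({cmd:y}, {cmd:x1}, {cmd:x2}, {cmd:z1}, {cmd:firm}, {cmd:year}), an exactly identified model with robust standard errors and the full VCV matrix stored, followed by a Wald test on {cmd:x1}, might be requested as:

{phang2}{cmd:. lsfeivreg y x1, absorb(firm year) endog(x2) iv(z1) robust fullvcv}{p_end}
{phang2}{cmd:. test x1}{p_end}

{pmore}
{cmd:fullvcv} is included here because, as described above, {help test} requires the full matrix; whether {cmd:robust} resolves a near-singularity in a given exactly identified model depends on the data.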

{phang}
{cmd:predict} generates variables with predicted values from the regression. The variables generated are {cmd:y_hat}, {cmd:xb_hat}, and, for each categorical variable specified in {cmd:absorb()}, the variable {cmd:{it:varname}_hat}. If a variable with any of these names already exists, the program aborts with an error {it:before} running the regression.

{pmore}
{cmd:y_hat} contains the predicted values of the full specification, including the effects of all fixed effects. It is equivalent to the {cmd:xbd} option in {help areg postestimation}. Subtracting {cmd:y_hat} from {it:depvar} gives the predicted residuals.

{pmore}
{cmd:xb_hat} contains the predicted values of the linear combination of {it:varlist_1}. It is equivalent to the {cmd:xb} option in {help areg postestimation}.

{pmore}
{cmd:{it:varname}_hat} contains the predicted value of the fixed effects for each categorical variable. It is similar to the {cmd:d} option in {help areg postestimation}, with the difference that it creates a separate {it:varname}_hat for each categorical variable in {cmd:absorb({it:varlist_2})}.

{pmore}
All predictions are made in sample. Out-of-sample predictions have to be calculated manually by the user.


{title:Examples}

{hline}
{phang}{cmd:. sysuse highschool}{p_end}
{phang}{cmd:. gen byte male = sex=="male"}{p_end}
{phang}{cmd:. lsfeivreg height weight, absorb(school race) endog(weight) iv(male) predict robust}{p_end}
{phang}{cmd:. summ y_hat xb_hat school_hat race_hat}{p_end}
{hline}


{title:Saved results}

{pstd}
{cmd:lsfeivreg} saves the following in {cmd:e()}:

{synoptset 15 tabbed}{...}
{p2col 5 15 19 2: Scalars}{p_end}
{synopt:{cmd:e(N)}}number of observations{p_end}
{synopt:{cmd:e(N_g)}}number of fixed effects{p_end}
{synopt:{cmd:e(df_r}{it:#}{cmd:)}}residual degrees of freedom{it:#}{p_end}
{synopt:{cmd:e(r2)}}R-squared{p_end}
{synopt:{cmd:e(r2_a)}}adjusted R-squared{p_end}

{synoptset 15 tabbed}{...}
{p2col 5 15 19 2: Matrices}{p_end}
{synopt:{cmd:e(b)}}coefficient vector{p_end}
{synopt:{cmd:e(V)}}variance-covariance matrix of the estimators{p_end}

{synoptset 15 tabbed}{...}
{p2col 5 15 19 2: Functions}{p_end}
{synopt:{cmd:e(sample)}}marks estimation sample{p_end}
{p2colreset}{...}


{title:References}

{phang}
Baum, C., Schaffer, M., and Stillman, S. 2003. {it:Instrumental Variables and GMM: Estimation and Testing}. Working paper no. 545, Boston College.

{phang}
Greene, W. 2006. {it:Econometric Analysis}. 5th edition. New York: Pearson Education.


{title:Author}

{phang}
Mauricio Varela, University of Arizona, mvarela@email.arizona.edu


{title:Also see}

{psee}
Online: {manhelp xtivreg XT}, {manhelp xtivreg_postestimation XT: xtivreg postestimation};{break}
{manhelp areg R}, {manhelp areg_postestimation R:areg postestimation};{break}
{manhelp lsfereg O}, {manhelp xi R}
{p_end}