{smcl}
{* *! version 1.0.0 28jun2009}{...}
{cmd:help lsfereg}{right:dialog: {bf:{dialog lsfereg}} {space 14}}
{hline}
{title:Title}
{p2colset 5 17 19 2}{...}
{p2col :{hi:[R] lsfereg} {hline 2} N-Way Fixed Effects OLS Regression }
{p_end}
{p2colreset}{...}
{title:Syntax}
{p 8 14 2}
{cmd:lsfereg} {depvar} {varlist:_1} {ifin} {weight}
[{cmd:,} {it:{help lsfereg##options:options}}]{p_end}
{marker options}{...}
{synoptset 26 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Required}
{synopt :{cmdab:absorb:(}{varlist:_2}{cmd:)}}Categorical variables to include either through within transformation or dummies {p_end}
{syntab:Optional}
{synopt :{opt nowith:in}}Use a dummy variable approach for all categorical variables{p_end}
{synopt :{opt full:vcv}}Store and display the full VCV matrix. The default is to store and display the sub-matrix corresponding to {varlist:_1} {p_end}
{synopt :{opt r:obust}}Report heteroskedasticity-robust standard errors{p_end}
{synopt :{cmdab:clus:ter(}{varname}{cmd:)} }Report standard errors that allow for within-cluster correlation{p_end}
{synopt :{opt pred:ict}}Create variables with predicted values {p_end}
{synoptline}
{p2colreset}{...}
{p 4 6 2}{it:depvar} and {it:varlist_1} may contain time-series operators; see
{help tsvarlist}.{p_end}
{p 4 6 2}{cmd:by} and {cmd:xi} are allowed; see {help prefix}.{p_end}
{pstd}
{title:Description}
{pstd}
{cmd:lsfereg} fits a linear regression absorbing multiple categorical factors. In other words, it
estimates an OLS regression with N-way fixed effects. It is the natural extension of {help areg}
when the specification contains more than one group of fixed effects.
{pstd}
Of the variables specified in {cmd: absorb( )}, the one with the largest number of unique values is absorbed
through a within transformation (i.e., de-meaning). For the remaining variables in {cmd: absorb( )}, the program creates a dummy for
each unique value of each variable and adds these dummies to the set of exogenous variables.
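{pstd}
As a rough illustration of the within transformation (not the Mata code that {cmd:lsfereg} actually runs), de-meaning a variable by a categorical variable can be sketched in Stata as follows. The variables {cmd:height} and {cmd:school} are borrowed from the Examples section; {cmd:mean_height} and {cmd:height_w} are hypothetical names chosen for this sketch.
{phang}{cmd:. bysort school: egen double mean_height = mean(height)}{p_end}
{phang}{cmd:. generate double height_w = height - mean_height}{p_end}
{pstd}
In a within estimator the same de-meaning is applied to the dependent variable and to every regressor, which removes the need for one dummy per value of the absorbed variable.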
{pstd} The alternative to {cmd:lsfereg} would be to implement the regression using {help areg} jointly with
the {help xi} prefix command. That approach has several downsides: it is slower than {cmd:lsfereg}, it
creates many variables that remain in the dataset after the regression is finished, its results are not easy to read,
and it offers no easy way of predicting the estimates for a specific group of fixed effects (especially
when the number of fixed effects in that group is very large).
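{pstd}
For comparison, a minimal sketch of that alternative, assuming the variables used in the Examples section, absorbs {cmd:school} with {help areg} and expands the other factors with {help xi}. The final line drops the {cmd:_I*} indicator variables that {cmd:xi} leaves in the dataset.
{phang}{cmd:. xi: areg height weight i.sex i.race, absorb(school)}{p_end}
{phang}{cmd:. drop _I*}{p_end}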
{pstd}
{cmd: lsfereg} is implemented as a {help mata} function, and is thus not limited by Stata's
matrix size limits (see {help matsize}). It also runs significantly faster than the {help areg} and {help xi} approach, especially
with large sets of fixed effects.
{pstd}
One caveat is that the program uses significant amounts of memory. Memory used by Mata
is independent of that assigned to Stata. It is recommended that the memory allocated to Stata
(see {help memory}) be as small as possible, so as to free up memory for Mata.
{title:Options}
{dlgtab:Required}
{phang}
{cmd:absorb({help varlist}) } specifies the categorical variables to be included in the regression. They may contain either
string names or numeric ids. If only one categorical variable is included, see {help areg} for a simpler implementation
that allows more options.
{dlgtab:Optional}
{phang}
{cmd: nowithin} specifies that no categorical variable specified in {cmd: absorb( )} be included using a within transformation.
Dummy variables will be created for {it: all} groups of each categorical variable. Using a dummy variable approach implies larger
memory consumption and slower performance. Using this option is generally not recommended.
{phang}
{cmd: fullvcv} specifies that the full VCV matrix from the regression be stored in memory and displayed. The default is to store
only the submatrix of the VCV (and b vector) corresponding to the variables in {it: varlist_1}. This way, the default option saves
memory and makes the printout easy to read. The downside is that statistical tests cannot, in general, be performed with a
subset of the VCV matrix, but require the full matrix. {cmd: fullvcv} should be specified if {help test} is to be performed.
{phang}
{cmd: robust} specifies that the robust (a.k.a. sandwich) estimator of the variance be used. This estimator is robust to some types of
misspecification so long as the observations are independent; see {hi:[U] 20.15 Obtaining robust variance estimates}.
{phang}
{cmd:cluster(}{it:clustvar}{cmd:)} specifies that the standard errors allow for intragroup correlation, relaxing the usual
requirement that the observations be independent. That is, the observations are independent across groups (clusters) but
not necessarily within groups. {it:clustvar} specifies to which group each observation belongs, e.g., {cmd:cluster(personid)}
in data with repeated observations on individuals. {cmd:cluster(} {it:clustvar}{cmd:)} affects the standard errors and
variance-covariance matrix of the estimators but not the estimated coefficients;
see {hi:[U] 20.15 Obtaining robust variance estimates}. {cmd:cluster()} may not be combined with {cmd:robust}.
{phang}
{cmd: predict} generates variables with predicted values from the regression. The variables generated are {cmd: y_hat}, {cmd: xb_hat},
and, for each categorical variable specified in {cmd:absorb( )}, the variables {cmd:{it: varname}_hat}. If
a variable with the same name has already been defined, the program aborts with an error {it:before} running the regression.
{pmore}
{cmd: y_hat} contains the predicted values of the full specification, including all fixed effects. It is equivalent to
the {cmd:xbd} option in {help areg postestimation}. Subtracting {cmd:y_hat} from {it:depvar} gives the residuals.
{pmore}
{cmd: xb_hat} contains the predicted values of the linear combination of {it:varlist_1}. It is equivalent to the {cmd:xb} option in
{help areg postestimation}.
{pmore}
{cmd: {it:varname}_hat} contains the predicted value of the fixed effects for the categorical variable {it:varname}. It is similar
to the {cmd:d} option in {help areg postestimation}, with the difference that it creates a separate {it:varname}{cmd:_hat} for
each categorical variable in {cmd: absorb({it:varlist_2})}.
{pmore}
All predictions are done in sample. Predictions out of sample have to be manually calculated by the user.
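{pmore}
As a sketch of the residual calculation described above, assuming the specification from the Examples section ({cmd:resid} is a hypothetical variable name chosen for this illustration):
{phang2}{cmd:. lsfereg height weight, absorb(school sex race) predict}{p_end}
{phang2}{cmd:. generate double resid = height - y_hat}{p_end}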
{title:Examples}
{hline}
{phang}{cmd:. webuse highschool}{p_end}
{phang}{cmd:. lsfereg height weight , absorb(school sex race) predict robust }{p_end}
{phang}{cmd:. summ y_hat xb_hat school_hat sex_hat race_hat }{p_end}
{hline}
{title:Saved results}
{pstd}
{cmd:lsfereg} saves the following in {cmd:e()}:
{synoptset 15 tabbed}{...}
{p2col 5 15 19 2: Scalars}{p_end}
{synopt:{cmd:e(N)}}number of observations{p_end}
{synopt:{cmd:e(N_g)}}number of fixed effects{p_end}
{synopt:{cmd:e(df_r)}}residual degrees of freedom{p_end}
{synopt:{cmd:e(r2)}}R-squared{p_end}
{synopt:{cmd:e(r2_a)}}adjusted R-squared{p_end}
{synoptset 15 tabbed}{...}
{p2col 5 15 19 2: Matrices}{p_end}
{synopt:{cmd:e(b)}}coefficient vector{p_end}
{synopt:{cmd:e(V)}}variance-covariance matrix of the estimators{p_end}
{synoptset 15 tabbed}{...}
{p2col 5 15 19 2: Functions}{p_end}
{synopt:{cmd:e(sample)}}marks estimation sample{p_end}
{p2colreset}{...}
{title:Reference}
{phang}
Greene, W., 2006.
{it:Econometric Analysis}.
5th Edition, New York: Pearson Education.
{title:Author}
{phang}
Mauricio Varela, University of Arizona, mvarela@email.arizona.edu
{title:Also see}
{psee}
Online: {manhelp areg R}, {manhelp areg_postestimation R:areg postestimation};{break}
{manhelp lsfeivreg O}, {manhelp xi R}
{p_end}