** Chapter 24.5 ** for W.H. Greene, Econometric Analysis 6th ed. *****************
* (c) Noel Roy 2003
*
* TRUNCATION, CENSORING, AND SAMPLE SELECTION
*
*===============================================================================
* 24.5 SAMPLE SELECTION
*
* Example 24.8 (p. 888) Female Labor Supply
*
* Read in the Mroz female labor participation data
*
SAMPLE 1 753
READ (TableF4-1.txt) LFP WHRS KL6 K618 Age Educatn WW RPWG HHRS HA HE HW Income MTR WMED WFED UN CIT AX /SKIPLINES=37
*
* Set up regression variables.
GENR Age2=Age**2
GENR Kids=DUM(KL6+K618)
GENR AX2=AX**2
*
* The SHAZAM Manual contains a procedure to perform the two-step
* Heckman estimation procedure, which we use here.
*
*******************************************************************
* Sample Selection-Corrected Estimation ("Heckit")
*
* Programmer:
*      David A. Jaeger
*      The University of Michigan
*
* Background:
*      Heckman (1979) discusses the bias that results from using
*      nonrandomly selected samples when estimating behavioral
*      relationships as "omitted variables" bias. He proposes
*      a simple consistent method to estimate these models,
*      using a bivariate normal model for the selection equation,
*      and ordinary least squares to estimate the behavioral
*      equation with the selected sample.
*
*      Greene (1981) notes that the standard errors in the OLS
*      stage that are typically computed can either be smaller
*      or larger than the correct standard errors, not
*      just smaller as Heckman had asserted. He then derives
*      a simple-to-compute formula for the correct
*      variance-covariance matrix of the OLS estimates.
*
* Description:
*      This program uses SHAZAM's PROBIT and OLS routines to
*      estimate the parameters of the Heckman model and SHAZAM's
*      MATRIX language to calculate the correct standard errors
*      for the second stage (OLS).
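*
* Formulas (sketch):
*      The lines below are a reading of the MATRIX commands in this
*      file, offered as a summary rather than a derivation. The
*      names are the program's own variables; "mean" and "diag" are
*      shorthand used only in this note. N is the number of selected
*      observations, THETA is the OLS coefficient on LAMBDA, and
*      SIG is the covariance matrix saved by PROBIT.
*
*         DELTA(i) = LAMBDA(i)*(LAMBDA(i)+ALPHAW(i))
*         CAPDELTA = diag(DELTA)
*         SIGSQE   = E'E/N + THETA**2 * mean(DELTA)
*         RHOSQ    = THETA**2 / SIGSQE
*         Q        = RHOSQ * (XSTAR'CAPDELTA W) SIG (W'CAPDELTA XSTAR)
*         ASYVCOV  = SIGSQE * INV(XSTAR'XSTAR)
*                    * ( XSTAR'(I - RHOSQ*CAPDELTA)XSTAR + Q )
*                    * INV(XSTAR'XSTAR)
*
*      Here W = [X1 CONSTANT] holds the probit regressors and
*      XSTAR = [LAMBDA X2 CONSTANT] holds the second-stage
*      regressors, both restricted to observations with SEL=1.
*      The corrected standard errors ASYSE are the square roots
*      of the diagonal of ASYVCOV.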
*
***********************************************************************
PAR 8000
* ================= DATA INPUT REQUIREMENTS ====================
*
* Modify this section as appropriate
*
* List of independent variables for the 1st-stage probit estimation
X1: AGE AGE2 Income EDUCATN KIDS
* List of independent variables for the 2nd-stage OLS estimation
X2: AX AX2 Educatn CIT
* Binary variable for probit estimation
RENAME LFP SEL
* Dependent variable for 2nd-stage OLS estimation
RENAME WW DEP
* ==================== END OF DATA INPUT =======================
***** First Stage: Run Probit
PROBIT SEL [X1] / INDEX=ALPHAW COV=SIG IMR=LAMBDA PCOV
***** Second Stage: Run OLS on the selected sample
SET NOWARNSKIP
SKIPIF (SEL.EQ.0)
OLS DEP [X2]
OLS DEP LAMBDA [X2] / RESID=ERR COEF=BETA STDERR=OLSSTD
GEN1 N=$N
GEN1 K=$K
GEN1 THETA=BETA:1
PRINT THETA
GENR CONSTANT=1
GENR DELTA=LAMBDA*(LAMBDA+ALPHAW)
COPY ERR E
COPY DELTA CAPDELTA
COPY [X1] CONSTANT W
COPY LAMBDA [X2] CONSTANT XSTAR
DELETE SKIP$
MATRIX CAPDELTA=DIAG(CAPDELTA)
MATRIX DELTABAR=TRACE(CAPDELTA)/N
MATRIX SIGSQE=E'E/N+THETA**2*DELTABAR
MATRIX SIGE=SQRT(SIGSQE)
***** Standard Error of 2nd Stage (OLS) corrected for selection
PRINT SIGE
GEN1 RHOSQ=THETA**2/SIGSQE
GEN1 RHO=(ABS(THETA)/THETA)*SQRT(RHOSQ)
***** Correlation Between error in regression and error in selection
PRINT RHO
MATRIX Q=RHOSQ*(XSTAR'CAPDELTA*W)*SIG*(W'*CAPDELTA*XSTAR)
MATRIX ASYVCOV=SIGSQE*INV(XSTAR'XSTAR)* &
   (XSTAR'(IDEN(N)-RHOSQ*CAPDELTA)*XSTAR + Q)*INV(XSTAR'*XSTAR)
***** Consistent Variance-Covariance Matrix of 2nd Stage (OLS)
PRINT ASYVCOV
MATRIX ASYSE=DIAG(ASYVCOV)
MATRIX ASYSE=SQRT(ASYSE)
***** Consistent Standard Errors for 2nd Stage (OLS)
SAMPLE 1 K
PRINT BETA OLSSTD ASYSE
STOP
*
*===============================================================================
* Updated December 1, 2008