** Chapter 24.5 ** for W.H. Greene, Econometric Analysis 6th ed. *****************
* (c) Noel Roy 2003
*
*                TRUNCATION, CENSORING, AND SAMPLE SELECTION
*
*===============================================================================
* 24.5 SAMPLE SELECTION
* 
* Example 24.8 (p. 888) Female Labor Supply
*
* Read in the Mroz female labor force participation data (753 observations)
*
SAMPLE 1 753
READ (TableF4-1.txt) LFP WHRS KL6 K618 Age Educatn WW RPWG HHRS HA HE HW Income MTR WMED WFED UN CIT AX /SKIPLINES=37
*
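* Set up regression variables: Age squared, a dummy equal to 1 if the
* household has any children (DUM(x)=1 if x>0, else 0), and AX squared.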
GENR Age2=Age**2
GENR Kids=DUM(KL6+K618)
GENR AX2=AX**2
*
* The SHAZAM manual supplies a procedure for Heckman's two-step
* estimator, which is used here.
*
*******************************************************************
* Sample Selection-Corrected Estimation ("Heckit")               
*                                                                 
* Programmer:
*    David A. Jaeger                                              
*    The University of Michigan  
*                                                                
* Background: 
*    Heckman (1979) treats the bias that results from using
*    nonrandomly selected samples to estimate behavioral
*    relationships as an "omitted variables" bias.  He proposes
*    a simple consistent two-step method: a probit model for the
*    selection equation (the errors of the two equations are
*    assumed bivariate normal), followed by ordinary least squares
*    on the selected sample with the inverse Mills ratio added
*    as a regressor.
*
*    Greene (1981) notes that the conventional standard errors
*    from the OLS stage can be either smaller or larger than
*    the correct standard errors, not just smaller as Heckman
*    had asserted.  He then derives a simple-to-compute formula
*    for the correct variance-covariance matrix of the OLS
*    estimates.
*                                                                
* Description:
*    This program uses SHAZAM's PROBIT and OLS routines to 
*    estimate the parameters of the Heckman model and SHAZAM's
*    MATRIX language to calculate the correct standard errors
*    for the second stage (OLS). 
*
***********************************************************************
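*
* Notation (a sketch added for orientation; the symbols z, w, gamma,
* x, beta, u and e are labels chosen here, not names used in the program):
*
*    Selection:   z* = w'gamma + u,   z = 1 if z* > 0, else z = 0
*    Regression:  y  = x'beta + e,    observed only when z = 1
*    (u, e) bivariate normal with Var(u) = 1, Var(e) = sigma_e^2,
*    and Corr(u, e) = rho
*
*    E[y | z = 1] = x'beta + theta*lambda(w'gamma)
*    theta = rho*sigma_e,   lambda(a) = phi(a)/Phi(a)
*
* In the code below, ALPHAW holds the fitted index w'gamma, LAMBDA the
* inverse Mills ratio, and THETA the OLS coefficient on LAMBDA.
*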
PAR 8000
* ================= DATA INPUT REQUIREMENTS ====================
*
* Modify this section as appropriate
*
* List of independent variables for the 1st-stage probit estimation
X1: AGE AGE2 Income EDUCATN KIDS 
* List of independent variables for the 2nd-stage OLS estimation
X2: AX AX2 Educatn CIT 
* Binary variable for probit estimation
RENAME LFP SEL
* Dependent variable for 2nd-stage OLS estimation
RENAME WW DEP
* ==================== END OF DATA INPUT =======================
 
***** First Stage:   Run Probit

PROBIT SEL [X1] / INDEX=ALPHAW COV=SIG IMR=LAMBDA PCOV
 
***** Second Stage:  Run OLS on the selected sample
SET NOWARNSKIP
SKIPIF (SEL.EQ.0)
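* Uncorrected OLS on the selected sample (no selection term)
   OLS DEP [X2]
* OLS with LAMBDA added as the first regressor, so BETA:1 is its coefficient
   OLS DEP LAMBDA [X2] / RESID=ERR COEF=BETA STDERR=OLSSTD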
   GEN1 N=$N
   GEN1 K=$K
   GEN1 THETA=BETA:1
   PRINT THETA
   GENR CONSTANT=1
   GENR DELTA=LAMBDA*(LAMBDA+ALPHAW)
   COPY ERR E
   COPY DELTA CAPDELTA
   COPY [X1] CONSTANT W
   COPY LAMBDA [X2] CONSTANT XSTAR
DELETE SKIP$
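*
* The MATRIX and GEN1 commands that follow compute the corrected
* second-stage quantities.  Read off the code, with n the number of
* selected observations and e the OLS residual vector:
*
*    delta_i   = LAMBDA_i*(LAMBDA_i + ALPHAW_i)
*    deltabar  = (1/n)*sum_i delta_i
*    sigma_e^2 = e'e/n + THETA^2*deltabar
*    rho^2     = THETA^2/sigma_e^2,   rho takes the sign of THETA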
 
MATRIX CAPDELTA=DIAG(CAPDELTA)
MATRIX DELTABAR=TRACE(CAPDELTA)/N
MATRIX SIGSQE=E'E/N+THETA**2*DELTABAR
MATRIX SIGE=SQRT(SIGSQE)
***** Standard error of the 2nd-stage regression, corrected for selection
PRINT SIGE
 
GEN1 RHOSQ=THETA**2/SIGSQE
GEN1 RHO=(ABS(THETA)/THETA)*SQRT(RHOSQ)
 
***** Correlation between the regression error and the selection error
PRINT RHO
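*
* Corrected covariance matrix, as computed by the next two commands
* (Delta is the diagonal matrix CAPDELTA, X* is XSTAR, and SIG is the
* covariance matrix of the probit coefficients):
*
*    Q       = rho^2 * (X*'Delta W) SIG (W'Delta X*)
*    ASYVCOV = sigma_e^2 * (X*'X*)^-1
*              * [ X*'(I - rho^2*Delta)X* + Q ] * (X*'X*)^-1
*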
MATRIX Q=RHOSQ*(XSTAR'CAPDELTA*W)*SIG*(W'*CAPDELTA*XSTAR)
MATRIX ASYVCOV=SIGSQE*INV(XSTAR'XSTAR)* &
       (XSTAR'(IDEN(N)-RHOSQ*CAPDELTA)*XSTAR + Q)*INV(XSTAR'*XSTAR)
 
***** Consistent Variance-Covariance Matrix of 2nd Stage (OLS)
PRINT ASYVCOV
 
MATRIX ASYSE=DIAG(ASYVCOV)
MATRIX ASYSE=SQRT(ASYSE)
 
***** Consistent Standard Errors for 2nd Stage (OLS)
SAMPLE 1 K
PRINT BETA OLSSTD ASYSE
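*
* Optional addition (a sketch, not part of the original procedure):
* t-ratios based on the corrected standard errors.  TCORR is a name
* introduced here for illustration; the sample is still 1 to K, so
* the division is applied element by element to the K coefficients.
GENR TCORR=BETA/ASYSE
PRINT BETA ASYSE TCORR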
STOP
*
*===============================================================================
* Updated December 1, 2008