«Self-Selection Models in Corporate Finance»: lecture notes


Ch. 2: Self-Selection Models in Corporate Finance
K. Li and N.R. Prabhala
Abstract
Corporate finance decisions are not made at random, but are usually deliberate decisions by firms or their managers to self-select into their preferred choices. This chapter reviews econometric models of self-selection. The review is organized into two parts. The first part reviews econometric models of self-selection, focusing on the key assumptions of different models and the types of applications they may be best suited for. Part two reviews empirical applications of selection models in the areas of corporate investment, financing, and financial intermediation. We find that self-selection is a rapidly growing area in corporate finance, partly reflecting its recognition as a pervasive feature of corporate finance decisions, but more importantly, the increasing recognition of selection models as unique tools for understanding, modeling, and testing the role of private information in corporate finance.
Keywords: selection, private information, switching regression, treatment effect, matching, propensity score, Bayesian selection methods, panel data, event study, underwriting, investment banking, diversification
Introduction
Corporate finance concerns the financing and investment choices made by firms and a broad swathe of decisions within these broad choices. For instance, firms pick their target capital structure, and to achieve the target, must make several choices including the timing of security issues, structural features of the securities issued, the investment bank chosen to underwrite it, and so on. These choices are not usually random, but are deliberate decisions by firms or their managers to self-select into their preferred choices. This chapter reviews econometric models of self-selection. We review the approaches used to model self-selection in corporate finance and the substantive findings obtained by implementing selection methods.
Self-selection has a rather mixed history in corporate finance. The fact that there is self-selection is probably not news; indeed, many papers at least implicitly acknowledge its existence. However, the literature differs on whether to account for self-selection using formal econometric methods, and why one should do so. One view of self-selection is that it is an errant nuisance, a “correction” that must be made to prevent other parameter estimates from being biased. Selection is itself of little economic interest under this view. In other applications, self-selection is itself of central economic interest, because models of self-selection represent one way of incorporating and controlling for unobservable private information that influences corporate finance decisions. Both perspectives find expression in the literature, although an increasing emphasis in recent work reflects the positive view in which selection models are used to construct interesting tests for private information. Our review is organized into two parts. Part I focuses on econometric models of self-selection. We approach selection models from the viewpoint of a corporate finance researcher who is implementing selection models in an empirical application. We formalize the notion of self-selection and overview several approaches towards modeling it, including reduced form models, structural approaches, matching methods, fixed effect estimators, and Bayesian methods. As the discussion clarifies, the notion of selection is not monolithic. No single model universally models or accounts for all forms of selection, so there is no one “fix” for selection. Instead, there are a variety of approaches, each of which makes its own economic and statistical assumptions. We focus on the substantive economic assumptions underlying the different approaches to illustrate what each can and cannot do and the type of applications a given approach may be best suited for. 
We do not say much on estimation, asymptotic inference, or computational issues, but refer the reader to excellent texts and articles on these matters.
Part II of our review examines corporate finance applications of self-selection models. We cover a range of topics such as mergers and acquisitions, stock splits, equity offerings, underwriting, analyst behavior, share repurchases, and venture capital. Our objective is to illustrate the wide range of corporate finance settings in which selection arises and the different econometric approaches employed in modeling it. Here, we focus on applications published in the last decade or so, and on articles in which self-selection is a major component of the overall results.[1]
I. MODELING SELF-SELECTION
This portion of our review discusses econometric models of self-selection. Our intention is not to summarize the entire range of available models and their estimation. Rather, we narrow our focus to models that have been applied in the corporate finance literature, and within these models, we focus on the substantive assumptions made by each specification. From the viewpoint of the empirical researcher, this is the first order issue in deciding what approach suits a given application in corporate finance. We do not touch upon asymptotic theory, estimation, and computation. These important issues are well covered in excellent textbooks.[2]
We proceed as follows. Section 1 describes the statistical issue raised by self-selection, the wedge between the population distribution and the distribution within a selected sample. Sections 2–6 develop the econometric models that can address selection.
Section 2 discusses a baseline model for self-selection, the “Heckman” selection model analyzed in Heckman (1979), a popular modeling choice in corporate finance.[3] We discuss identification issues related to the model, which are important but not frequently discussed or justified explicitly in corporate finance applications. Because the Heckman setting is so familiar in corporate finance, we use it to develop a key point of this survey, the analogy between econometric models of self-selection and private information models in corporate finance.
Section 3 considers switching regressions and structural self-selection models. While these models generalize the Heckman selection model in some ways, they also bring additional baggage in terms of economic and statistical assumptions that we discuss.
We then turn to other approaches towards modeling selection. Section 4 discusses matching models, which are methods du jour in the most recent applications. The popularity of matching models can be attributed to their relative simplicity, easy interpretation of coefficients, and minimal structure with regard to specification. However, these gains come at a price. Matching models make the strong economic assumption that unobservable private information is irrelevant. This assumption may not be realistic in many corporate finance applications. In contrast, selection models explicitly model and incorporate private information. A second point we develop is that while matching methods are often motivated by the fact that they yield easily interpretable treatment effects, selection methods also estimate treatment effects with equal ease. Our review of methodology closes by briefly touching upon fixed effect models in Section 5 and Bayesian approaches to selection in Section 6.
[1] Our attempt is to capture the overall flavor of self-selection models as they stand in corporate finance as of the writing. We apologize to any authors whose work we have overlooked: no slight is intended.
[2] The venerable reference, Maddala (1983), continues to be remarkably useful, though its notation is often (and annoyingly, to the empirical researcher) different from that used in other articles and software packages. Newer material is covered in Wooldridge (2002) and Greene (2003).
[3] Labeling any one model as “the” Heckman model surely does disservice to the many other contributions of James Heckman. We choose this label following common usage in the literature.
1. Self-selection: The statistical issue
To set up the self-selection issue, assume that we wish to estimate parameters $\beta$ of the regression
$$Y_i = X_i\beta + \epsilon_i \tag{1}$$
for a population of firms. In equation (1), $Y_i$ is the dependent variable, which is typically an outcome such as profitability or return. The variables explaining outcomes are $X_i$, and the error term is $\epsilon_i$. If $\epsilon_i$ satisfies usual classical regression conditions, standard OLS/GLS procedures consistently estimate $\beta$.
Now consider a sub-sample of firms who self-select choice E. For this sub-sample, equation (1) can be written as
$$Y_i \mid E = X_i\beta + \epsilon_i \mid E. \tag{2}$$
The difference between equations (2) and (1) is at the heart of the self-selection problem. Equation (1) is a specification written for the population but equation (2) is written for a subset of firms, those that self-select choice E. If self-selecting firms are not random subsets of the population, the usual OLS/GLS estimators applied to equation (2) are no longer consistent estimators of $\beta$.
Accounting for self-selection consists of two steps. Step 1 specifies a model for self-selection, using economic theory to model why some firms select E while others do not. While this specification step is not often discussed extensively in applications, it is critical because the assumptions involved ultimately dictate what econometric model should be used in the empirical application.
Step 2 ties the random variable(s) driving self-selection to the outcome variable $Y$.
2. The baseline Heckman selection model
2.1. The econometric model
Early corporate finance applications of self-selection are based on the model analyzed in Heckman (1979). We spend some time developing this model because most other specifications used in the finance literature can be viewed as extensions of the Heckman model in various directions.
In the conventional perspective of self-selection, the key issue is that we have a regression such as equation (1) that is well specified for a population but it must be estimated using sub-samples of firms that self-select into choice E. To estimate population parameters from self-selected subsamples, we first specify a self-selection mechanism. This usually takes the form of a probit model in which firm i chooses E if the net benefit from doing so, a scalar $W_i$, is positive. Writing the selection variable $W_i$ as a function of explanatory variables $Z_i$, which are assumed for now to be exogenous,[4] we have the system
$$C = E \equiv W_i = Z_i\gamma + \eta_i > 0, \tag{3}$$
$$C = NE \equiv W_i = Z_i\gamma + \eta_i \le 0, \tag{4}$$
$$Y_i = X_i\beta + \epsilon_i, \tag{5}$$
where $Z_i$ denotes publicly known information influencing a firm’s choice, $\gamma$ is a vector of probit coefficients, and $\eta_i$ is orthogonal to public variables $Z_i$. In the standard model, $Y_i$ is observed only when a firm picks one of E or NE (but not both), so equation (5) would require the appropriate conditioning. Assuming that $\eta_i$ and $\epsilon_i$ are bivariate normal, the likelihood function and the maximum likelihood estimators for equations (3)–(5) follow, although a simpler two-step procedure (Heckman, 1979, and Greene, 1981) is commonly used for estimation. Virtually all applied work is based on the bivariate normal structure discussed above.
2.2. Self-selection and private information
In the above setup, self-selection is a nuisance problem.
We model it because not doing so leads to inconsistent estimates of parameters $\beta$ in regression (1). Self-selection is, by itself, of little interest. However, this situation is frequently reversed in corporate finance, because tests for self-selection can be viewed as tests of private information theories. We develop this point in the context of the Heckman (1979) model outlined above, but we emphasize that this private information interpretation is more general.
We proceed as follows. Following a well-established tradition in econometrics, Section 2.2.1 presents selection as an omitted variable problem. Section 2.2.2 interprets the omitted variable as a proxy for unobserved private information. Thus, including the omitted self-selection variable controls for and tests for the significance of private information in explaining ex-post outcomes of corporate finance choices.
[4] Thus, we preclude for now the possibility that $Z$ includes the outcome variable $Y$. This restriction can be relaxed at a cost, as we show in later sections.
2.2.1. Selection: An omitted variable problem
Suppose that firm i self-selects choice E. For firm i, we can condition equation (5) on this choice and write
$$Y_i \mid E = X_i\beta + (\epsilon_i \mid Z_i\gamma + \eta_i > 0) \tag{6}$$
$$\phantom{Y_i \mid E} = X_i\beta + \pi(\eta_i \mid Z_i\gamma + \eta_i > 0) + \nu_i. \tag{7}$$
Equation (7) follows from the standard result that $\epsilon_i \mid \eta_i = \pi\eta_i + \nu_i$, where $\pi$ is the coefficient in the regression of $\epsilon_i$ on $\eta_i$, and $\nu_i$ is an orthogonal zero-mean error term.[5] Given the orthogonality and zero-mean properties of $\nu_i$, we can take expectations of equation (7) and obtain the regression model
$$E(Y_i \mid E) = X_i\beta + \pi E(\eta_i \mid Z_i\gamma + \eta_i > 0) \tag{8}$$
and a similar model for firms choosing not to announce E,
$$E(Y_i \mid NE) = X_i\beta + \pi E(\eta_i \mid Z_i\gamma + \eta_i \le 0). \tag{9}$$
Equations (8) and (9) can be compactly rewritten as
$$E(Y_i \mid C) = X_i\beta + \pi\lambda_C(Z_i\gamma) \tag{10}$$
where $C \in \{E, NE\}$ and $\lambda_C(.)$ is the conditional expectation of $\eta_i$ given C.
In particular, if $\eta$ and $\epsilon$ are bivariate normal, as is standard in the bulk of the applied work, $\lambda_E(.) = \phi(.)/\Phi(.)$ and $\lambda_{NE}(.) = -\phi(.)/(1 - \Phi(.))$ (Greene, 2003, p. 759). A comparison of equations (1) and (10) clarifies why self-selection is an omitted variable problem. In the population regression in equation (1), regressing outcome $Y$ on $X$ consistently estimates $\beta$. However, in self-selected samples, consistent estimation requires that we include an additional variable, the inverse Mills ratio $\lambda_C(.)$. Thus, the process of correction for self-selection can be viewed as including an omitted variable.
[5] Note that $\pi = \rho_{\eta\epsilon}\sigma_\epsilon$, where $\rho_{\eta\epsilon}$ is the correlation between $\epsilon$ and $\eta$, and $\sigma_\epsilon^2$ is the variance of $\epsilon$.
2.2.2. The omitted variable as private information
In the probit model (3) and (4), $\eta_i$ is the part of $W_i$ not explained by public variables $Z_i$. Thus, $\eta_i$ can be viewed as the private information driving the corporate financing decision being modeled. The ex-ante expectation of $\eta_i$ should be zero, and it is so, given that it has been defined as an error term in the probit model. Ex-post, after firm i selects $C \in \{E, NE\}$, the expectations of $\eta_i$ can be updated. The revised expectation, $E(\eta_i \mid C)$, is thus an updated estimate of the firm’s private information. If we wished to test whether the private information in a firm’s choice affected post-choice outcomes, we would regress outcome $Y$ on $E(\eta_i \mid C)$. But $E(\eta_i \mid C) = \lambda_C(.)$ is the inverse Mills ratio term that we add anyway to adjust for self-selection. Thus, correcting for self-selection is equivalent to testing for private information. The omitted variable used to correct for self-selection, $\lambda_C(.)$, is an estimate of the private information underlying a firm’s choice and testing its significance is a test of whether private information possessed by a firm explains ex-post outcomes.
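The updating of private information described in this section can be checked numerically. The following is a minimal sketch, not part of the original chapter: it draws a standard normal $\eta$ for many hypothetical firms at one assumed value of the probit index $Z_i\gamma$ (the value 0.5, the seed, and all function names are illustrative assumptions) and compares the simulated averages of $\eta$ among firms choosing E and NE with the closed forms $\phi(.)/\Phi(.)$ and $-\phi(.)/(1-\Phi(.))$.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # eta is standard normal in the probit model


def lambda_e(index):
    """E(eta | index + eta > 0) = phi(index) / Phi(index)."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


def lambda_ne(index):
    """E(eta | index + eta <= 0) = -phi(index) / (1 - Phi(index))."""
    return -STD_NORMAL.pdf(index) / (1.0 - STD_NORMAL.cdf(index))


def simulated_private_info(index, n_draws=200_000, seed=0):
    """Average eta separately over simulated firms choosing E and NE."""
    rng = random.Random(seed)
    chose_e, chose_ne = [], []
    for _ in range(n_draws):
        eta = rng.gauss(0.0, 1.0)
        # Firm picks E when the net benefit W = index + eta is positive.
        (chose_e if index + eta > 0 else chose_ne).append(eta)
    return sum(chose_e) / len(chose_e), sum(chose_ne) / len(chose_ne)


index = 0.5  # hypothetical value of Z_i * gamma
sim_e, sim_ne = simulated_private_info(index)
print(f"E : simulated {sim_e:.3f}, formula {lambda_e(index):.3f}")
print(f"NE: simulated {sim_ne:.3f}, formula {lambda_ne(index):.3f}")
```

Before the choice the expectation of $\eta$ is zero; after observing E it is revised upward and after observing NE it is revised downward, which is the sense in which the inverse Mills ratio estimates private information.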
In fact, a two-step procedure most commonly used to estimate selection models follows this logic.[6] Our main purpose of incorporating the above discussion of the Heckman model is to highlight the dual nature of self-selection “corrections”. One can think of them as a way of accounting for a statistical problem. There is nothing wrong with this view. Alternatively, one can interpret self-selection models as a way of testing private information hypotheses, which is perhaps an economically more useful perspective of selection models in corporate finance. Selection models are clearly useful if private information is one’s primary focus, but even if not, the models are useful as means of controlling for potential private information effects.
2.3. Specification issues
Implementing selection models in practice poses two key specification issues: the need for exclusion restrictions and the assumption that error terms are bivariate normal. While seemingly innocuous, these issues, particularly the exclusion question, are often important in empirical applications, and deserve some comment.
2.3.1. Exclusion restrictions
In estimating equations (3)–(5), researchers must specify two sets of variables: those determining selection ($Z$) and those determining the outcomes ($X$). A knotty issue that comes up frequently in practice is whether the two sets of variables can be identical. For instance, consider the self-selection event E in equations (3) and (4) as the decision to acquire a target and suppose that the outcome variable in equation (5) is post-diversification productivity. Variables such as firm size or the relatedness of the acquirer and the target could explain the acquisition decision. The same variables could also plausibly explain the ex-post productivity gains from the acquisition. Thus, these variables could be part of both $Z$ and $X$ in equations (3)–(5).
Similar arguments can be made for several other explanatory variables: they drive firms’ decision to self-select into diversification and the productivity gains after diversification. Do we need exclusion restrictions so that there is at least one variable driving selection, an instrument in $Z$ that is not part of $X$?
Strictly speaking, exclusion restrictions are not necessary in the Heckman selection model because the model is identified by non-linearity. The selection-adjusted outcome regression (10) regresses $Y$ on $X$ and $\lambda_C(Z'\gamma)$. If $\lambda_C(.)$ were a linear function of $Z$, we would clearly need some variables in $Z$ that are not part of $X$ or the regressors would be collinear.[7] However, under the assumption of bivariate normal errors, $\lambda_C(.)$ is a non-linear function. As Heckman and Navarro-Lozano (2004) note, collinearity between the outcome regression function (here and usually the linear function $X_i\beta$) and the selection “control” function $\lambda_C(.)$ is not a generic feature, so some degree of non-linearity will probably allow the specification to be estimated even when there are no exclusion restrictions.
[6] Step 1 estimates the probit model (3) and (4) to yield estimates of $\gamma$, say $\hat{\gamma}$, and hence the private information function $\lambda_C(Z_i\hat{\gamma})$. In step 2, we substitute the estimated private information in lieu of its true value in equation (10) and estimate it by OLS. Standard errors must be corrected for the fact that the second step uses an estimate of $\gamma$ rather than its true value, along the lines of Heckman (1979), Greene (1981), and Murphy and Topel (1985).
In practice, the identification issue is less clear cut. The problem is that while $\lambda_C(.)$ is a non-linear function, it is roughly linear in parts of its domain. Hence, it is entirely possible that $\lambda_C(Z'\gamma)$ has very little variation relative to the remaining variables in equation (10), i.e., $X$. This issue can clearly arise when the selection variables $Z$ and outcome variables $X$ are identical.
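The near-linearity concern can be made concrete with a small calculation. The sketch below is an illustration under assumed settings, not a result from the chapter: it evaluates $\lambda_E(z) = \phi(z)/\Phi(z)$ on an evenly spaced grid of probit index values and reports the $R^2$ from fitting a straight line to it. The grid ranges and the helper name `linear_fit_r2` are hypothetical choices.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lambda_e(index):
    """Inverse Mills ratio for firms choosing E."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


def linear_fit_r2(lo, hi, n_points=201):
    """R^2 from regressing lambda_E(z) on a constant and z over [lo, hi]."""
    zs = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    ys = [lambda_e(z) for z in zs]
    z_bar, y_bar = sum(zs) / n_points, sum(ys) / n_points
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    s_zy = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # For a one-regressor fit, R^2 is the squared correlation of z and y.
    return s_zy ** 2 / (s_zz * s_yy)


for lo, hi in [(-1.0, 1.0), (-3.0, 3.0)]:
    print(f"index in [{lo}, {hi}]: R^2 of linear fit = {linear_fit_r2(lo, hi):.4f}")
```

When the fitted index varies over a narrow range, a straight line explains almost all of the variation in the correction term, so with identical $Z$ and $X$ the term adds little independent variation to regression (10). A wider range of the index leaves more curvature for identification.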
However, it is important to realize that merely having extra instruments in $Z$ may not solve the problem. The quality of the instruments also matters. Near-multicollinearity could still arise when the extra instruments in $Z$ are weak and have limited explanatory power.
What should one do if there appears to be a multicollinearity issue? It is tempting to recommend that the researcher impose additional exclusion restrictions so that self-selection instruments $Z$ contain unique variables not spanned by outcome variables $X$. Matters are, of course, a little more delicate. Either the exclusions make sense, in which case these should have been imposed in the first place. Alternatively, the restrictions are not reasonable, in which case it hardly makes sense to force them on a model merely to make it estimable. In any event, as a practical matter, it seems reasonable to always run diagnostics for multicollinearity while estimating selection models whether one imposes exclusion restrictions or not.
The data often offer one degree of freedom that can be used to work around particularly thorny cases of collinearity. Recall that the identification issue arises mainly because of the 1/0 nature of the selection variable $W_i$, which implies that we do not observe the error term $\eta_i$ and we must take its expectation, which is the inverse Mills ratio term. However, if we could observe the magnitude of the selection variable $W_i$, we would introduce an independent source of variation in the selection correction term and in effect observe the private information $\eta_i$ itself and use it in the regression in lieu of the inverse Mills ratio. Exclusion restrictions would then no longer be needed. This is often more than just a theoretical possibility. For instance, in analyzing a sample of firms that have received a bank loan, we do observe the bank loan amount conditional on a loan being made.
Likewise, in analyzing equity offerings, we observe the fact that a firm made an equity offering and also the size of the offer. In hedging, we do observe (an estimate of) the extent of hedging given that a firm has hedged. This introduces an independent source of variation into the private information variable, freeing one from the reliance on non-linearity for identification.
[7] In this case, having a variable in $X$ that is not part of $Z$ does not help matters. If $\lambda_C(.)$ is indeed linear, it is spanned by $X$ whenever $Z$ is spanned by $X$. Thus, we require extra variables that explain the decision to self-select but are unrelated to the outcomes following self-selection.
2.3.2. Bivariate normality
A second specification issue is that the baseline Heckman model assumes that errors are bivariate normal. In principle, deviations from normality could introduce biases in selection models, and these could sometimes be serious (for an early illustration, see Goldberger, 1983). If non-normality is an issue, one alternative is to assume some specific non-normal distribution (Lee, 1983, and Maddala, 1983, Chapter 9.3). The problem is that theory rarely specifies a particular alternative distribution that is more appropriate. Thus, whether one uses a non-normal distribution and which type of distribution should be used are often driven by empirical features of the data. One approach that works around the need to specify parametric structures is to use semi-parametric methods (e.g., Newey, Powell and Walker, 1990). Here, exclusion restrictions are necessary for identification.
Finance applications of non-normal selection models remain scarce, so it is hard at this point of time to say whether non-normality is a first order issue deserving particular attention in finance.
In one application to calls of convertible bonds (Scruggs, 2006), the data were found to be non-normal, but non-normality made little difference to the major conclusions.
3. Extensions
We review two extensions of the baseline Heckman self-selection model, switching regressions and structural selection models. The first allows some generality in specifying regression coefficients across alternatives, while the second allows bidirectional simultaneity between self-selection and post-selection outcomes.[8] Each of these extensions generalizes the Heckman model by allowing some flexibility in specification. However, it should be emphasized that the additional flexibility that is gained does not come for free. The price is that the alternative approaches place additional demands on the data or require more stringent economic assumptions. The plausibility and feasibility of these extra requirements should be carefully considered before selecting any alternative to the Heckman model for a given empirical application.
[8] For instance, in modeling corporate diversification as a decision involving self-selection, structural models would allow self-selection to determine post-diversification productivity changes, as in the standard setup, but also allow anticipated productivity changes to impact the self-selection decision.
3.1. Switching regressions
As in Section 2, a probit model based on exogenous variables drives firms’ self-selection decisions. The difference is that the outcome is now specified separately for firms selecting E and NE, so the single outcome regression (5) in system (3)–(5) is now replaced by two regressions. The complete model is as follows:
$$C = E \equiv Z_i\gamma + \eta_i > 0, \tag{11}$$
$$C = NE \equiv Z_i\gamma + \eta_i \le 0, \tag{12}$$
$$Y_{E,i} = X_{E,i}\beta_E + \epsilon_{E,i}, \tag{13}$$
$$Y_{NE,i} = X_{NE,i}\beta_{NE} + \epsilon_{NE,i}, \tag{14}$$
where $C \in \{E, NE\}$.
Along with separate outcome regression parameter vectors $\beta_E$ and $\beta_{NE}$, there are also two covariance coefficients for the impact of private information on outcomes, the covariance between private information $\eta$ and $\epsilon_E$ and that between $\eta$ and $\epsilon_{NE}$. Two-step estimation is again straightforward, and is usually implemented assuming that the errors $\{\eta_i, \epsilon_{E,i}, \epsilon_{NE,i}\}$ are trivariate normal.[9]
Given the apparent flexibility in specifying two outcome regressions (13) and (14) compared to the one outcome regression in the standard selection model, it is natural to ask why we do not always use the switching regression specification. There are three issues involved. First, theory should say whether there is a single population regression whose LHS and RHS variables are observed conditional on selection, as in the Heckman model, or whether we have two regimes in the population and the selection mechanism dictates which of the two we observe. In some applications, the switching regression is inappropriate: for instance, it is not consistent with the equilibrium modeled in Acharya (1988). A second issue is that the switching regression model requires us to observe outcomes of firms’ choices in both regimes. This may not always be feasible because we only observe outcomes of firms self-selecting E but have little data on firms that choose not to self-select. For instance, if we were analyzing stock market responses to merger announcements as in Eckbo, Maksimovic and Williams (1990), implementing switching models literally requires us to obtain a sample of would-be acquirers that had never announced to the market and the market reaction on the dates that the markets realize that there is no merger forthcoming. These data may not always be available (Prabhala, 1997).[10] A final consideration is statistical power: imposing restrictions such as equality of coefficients $\{\beta, \pi\}$ for E and NE firms, when valid, leads to greater statistical power.
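The second step of the two-step estimator for the switching regression can be sketched on simulated data. The example below is illustrative only and rests on several assumptions that are not in the chapter: the parameter values in `TRUE_BETA` and `TRUE_PI` are hypothetical, the probit index $Z_i\gamma$ is treated as known (so the first-step probit is skipped), $X$ is correlated with $Z$ but $Z$ does not enter the outcome equations, and the small `ols` helper is a hand-written normal-equations solver. Standard errors, which would need the generated-regressor adjustment, are not computed.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
TRUE_BETA = {"E": (1.0, 2.0), "NE": (0.5, 1.0)}  # hypothetical (intercept, slope)
TRUE_PI = {"E": 0.8, "NE": -0.4}                 # hypothetical cov(eta, eps_C)


def inverse_mills(index, chose_e):
    """lambda_E = phi/Phi for firms choosing E; lambda_NE = -phi/(1 - Phi) otherwise."""
    pdf, cdf = STD_NORMAL.pdf(index), STD_NORMAL.cdf(index)
    return pdf / cdf if chose_e else -pdf / (1.0 - cdf)


def ols(rows, y):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan elimination."""
    k = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def simulate_and_estimate(n_firms=40_000, seed=7):
    rng = random.Random(seed)
    sample = {"E": ([], []), "NE": ([], [])}
    for _ in range(n_firms):
        z = rng.gauss(0.0, 1.0)
        x = 0.7 * z + 0.7 * rng.gauss(0.0, 1.0)  # X correlated with Z
        eta = rng.gauss(0.0, 1.0)                 # private information
        index = 0.2 + 1.0 * z                     # Z_i * gamma, treated as known
        regime = "E" if index + eta > 0 else "NE"
        b0, b1 = TRUE_BETA[regime]
        eps = TRUE_PI[regime] * eta + rng.gauss(0.0, 0.5)
        rows, ys = sample[regime]
        rows.append((1.0, x, inverse_mills(index, regime == "E")))
        ys.append(b0 + b1 * x + eps)
    # Regime-by-regime regression of Y on a constant, X, and lambda_C.
    corrected = {c: ols(*sample[c]) for c in sample}
    # Same regression for regime E without the lambda term, for contrast.
    naive_e = ols([r[:2] for r in sample["E"][0]], sample["E"][1])
    return corrected, naive_e


corrected, naive_e = simulate_and_estimate()
for regime, (b0, b1, pi_hat) in corrected.items():
    print(f"{regime}: intercept {b0:.3f}, slope {b1:.3f}, pi {pi_hat:.3f}")
print(f"E without the correction term: slope {naive_e[1]:.3f}")
```

Each regime gets its own coefficient vector and its own coefficient on the inverse Mills ratio, which is the extra flexibility relative to the single outcome regression of the baseline model. The uncorrected regression for regime E is included only to show the omitted variable problem in a selected sample.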
A key advantage of the switching regression framework is that we obtain more useful estimates of (unobserved) counterfactual outcomes. Specifically, if firm i chooses E, we observe outcome $Y_{E,i}$. However, we can ask what the outcome might have been had firm i chosen NE, the unobserved counterfactual, and what the gain is from firm i’s having made choice E rather than NE. The switching regression framework provides an estimate. The net benefit from choosing E is the outcome of choosing E less the counterfactual had it chosen NE, i.e., $Y_{E,i} - Y_{NE,i} = Y_{E,i} - X_i\beta_{NE} - \pi_{NE}\lambda_{NE}(Z_i\gamma)$. The expected gain for firm i is $X_i(\beta_E - \beta_{NE}) + (\pi_E\lambda_E(.) - \pi_{NE}\lambda_{NE}(.))$.[11]
We return to the counterfactuals issue when we deal with treatment effects and propensity scores. We make this point at this stage only to emphasize that selection models do estimate treatment effects. This fact is often not apparent when reading empirical applications, especially those employing matching methods.
[9] Write equations (13) and (14) in regression form as
$$Y_{C,i} = X_{C,i}\beta_C + \pi_C\lambda_C(Z_i\gamma), \tag{15}$$
where $C \in \{E, NE\}$. The two-step estimator follows: the probit model (11) and (12) gives estimates of $\gamma$ and hence the inverse Mills ratio $\lambda_C(.)$, which is fed into regression (15) to give parameters $\{\beta_E, \beta_{NE}, \pi_E, \pi_{NE}\}$. As before, standard errors in the second step regression require adjustment because $\lambda_C(Z\hat{\gamma})$ is a generated regressor (Maddala, 1983, pp. 226–227).
[10] Li and McNally (2004) and Scruggs (2006) describe how we can use Bayesian methods to update priors on counterfactuals. More details on their approach are given in Section 6.
3.2. Simultaneity in self-selection models
The models considered thus far presume that the variables $Z$ explaining the self-selection decision (equations (3) and (4) or equations (11) and (12)) are exogenous.
In particular, the bite of this assumption is to preclude the possibility that the decision to self-select choice C depends directly on the anticipated outcome from choosing C. This assumption is sometimes too strong in corporate finance applications. For instance, suppose we are interested in studying the diversification decision and that the outcome variable to be studied is firm productivity. The preceding models would assume that post-merger productivity does not influence the decision to diversify. If firms’ decisions to diversify depend on their anticipated productivity changes, as theory might suggest (Maksimovic and Phillips, 2002), the assumption that $Z$ is exogenous is incorrect.
The dependence of the decision to self-select on outcomes and the dependence of outcomes on the self-selection decision is essentially a problem of simultaneity. Structural selection models can account for simultaneity. We review two modeling choices. The Roy (1951) model places few demands on the data but it places tighter restrictions on the mechanism by which self-selection occurs. More elaborate models are less stringent on the self-selection mechanism, but they demand more of the data, specifically instruments, exactly as in conventional simultaneous equations models.
[11] This expression stands in contrast to the basic Heckman setup. There, in equation (9), $\beta_E = \beta_{NE}$ and $\pi_E = \pi_{NE}$, so the expected difference is $\pi(\lambda_E(.) - \lambda_{NE}(.))$. There, the sign of the expected difference is fixed: it must equal the sign of $\pi$ because $(\lambda_E(.) - \lambda_{NE}(.)) > 0$. Additionally, the expected difference in the setup of Section 2 does not vary with $\beta$ or variables $X$ that are not part of $Z$: here, it does. In short, the counterfactual choices that could be made but were not are less constrained in the switching regression setup.
3.2.1. The Roy model
The Roy model hard-wires the dependence of self-selection on post-selection outcomes. Firms self-select E or NE depending on which of the two alternatives yields the higher outcome. Thus, $\{E, Y_{E,i}\}$ is observed for firm i if $Y_{E,i} > Y_{NE,i}$. If, on the other hand, $Y_{NE,i} \ge Y_{E,i}$, we observe $\{NE, Y_{NE,i}\}$. The full model is
$$C = E \equiv Y_{E,i} > Y_{NE,i}, \tag{16}$$
$$C = NE \equiv Y_{E,i} \le Y_{NE,i}, \tag{17}$$
$$Y_{E,i} = X_i\beta_E + \epsilon_{E,i}, \tag{18}$$
$$Y_{NE,i} = X_i\beta_{NE} + \epsilon_{NE,i}, \tag{19}$$
where the $\epsilon$’s are (as usual) assumed to be bivariate normal. The Roy model is no more demanding of the data than standard selection models. Two-step estimation is again fairly straightforward (Maddala, 1983, Chapter 9.1).
The Roy selection mechanism is rather tightly specified on two dimensions. One, the model exogenously imposes the restriction that firms selecting E would experience worse outcomes had they chosen NE and vice versa. This is often plausible. However, it is unclear whether this should be a hypothesis that one wants to test or a restriction that one imposes on the data. Two, the outcome differential is the only driver of the self-selection decision in the Roy setup. Additional flexibility can be introduced by loosening the model of self-selection. This extra flexibility is allowed in models to be described next, but it comes at the price of requiring additional exclusion restrictions for model identification.
3.2.2. Structural self-selection models
In the standard Heckman and switching regression models, the explanatory variables in the selection equation are exogenous. At the other end of the spectrum is the Roy model of Section 3.2.1, in which self-selection is driven solely by the endogenous variable. The interim case is one where selection is driven by both exogenous and outcome variables. This specification is
$$C = E \equiv Z_i\gamma + \delta(Y_{E,i} - Y_{NE,i}) + \eta_i > 0, \tag{20}$$
$$C = NE \equiv Z_i\gamma + \delta(Y_{E,i} - Y_{NE,i}) + \eta_i \le 0, \tag{21}$$
$$Y_{E,i} = X_i\beta_E + \epsilon_{E,i}, \tag{22}$$
$$Y_{NE,i} = X_i\beta_{NE} + \epsilon_{NE,i}. \tag{23}$$
The structural model generalizes the switching regression model of Section 3.1, by incorporating the extra explanatory variable $Y_{E,i} - Y_{NE,i}$, the net outcome gain from choosing E over NE, in the selection decision, and generalizes the Roy model by permitting exogenous variables $Z_i$ to enter the selection equation. Estimation of the system (20)–(23) follows the route one typically treads in simultaneous equations systems estimation: reduced form estimation followed by a step in which we replace the dependent variables appearing in the RHS by their fitted projections. A trivariate normal assumption is standard (Maddala, 1983, pp. 223–239). While structural self-selection models have been around for a while in the labor economics literature, particularly those studying unionism and the returns to education (see Maddala, 1983, Chapter 8), applications in finance are of very recent origin.
The structural self-selection model clearly generalizes every type of selection model considered before. The question is why one should not always use it. Equivalently, what additional restrictions or demands does it place on the data? Because it is a type of switching regression model, it comes with all the baggage and informational requirements of the switching regression. As in simultaneous equations systems, instruments must be specified to identify the model. In the diversification example at the beginning of this section, the identification requirement demands that we have at least one instrument that determines whether a firm diversifies but does not determine the ex-post productivity of the diversifying firm. The quality of one’s estimates depends on the strength of the instrument, and all the caveats and discussion of Section 2.3.1 apply here.
4. Matching models and self-selection
This section reviews matching models, primarily those based on propensity scores.
Matching models are becoming increasingly common in applied work. They represent an attractive means of inference because they are simple to implement and yield readily interpretable estimates of “treatment effects.” However, matching models are based on a fundamentally different set of assumptions relative to selection models. Matching models assume that unobserved private information is irrelevant to outcomes. In contrast, unobserved private information is the essence of self-selection models. We discuss these differences between selection and matching models as well as specific techniques used to implement matching models.

To clarify the issues, consider the switching regression selection model of Section 3.1, but relabel the choices to be consistent with the matching literature. Accordingly, firms are treated and belong to group E or untreated and belong to group NE. This assignment occurs according to the probit model

$$\mathrm{pr}(E \mid Z) = \mathrm{pr}(Z\gamma + \eta > 0), \tag{24}$$

where Z denotes explanatory variables, $\gamma$ is a vector of parameters, and we drop the firm subscript i for notational convenience. The probability of being untreated is $1 - \mathrm{pr}(E \mid Z)$. We write post-selection outcomes as $Y_E$ for treated firms and $Y_{NE}$ for untreated firms, and for convenience, write

$$Y_E = X_E \beta_E + \epsilon_E, \tag{25}$$
$$Y_{NE} = X_{NE} \beta_{NE} + \epsilon_{NE}, \tag{26}$$

where (again suppressing subscript i) $\epsilon_C$ denotes error terms, $X_C$ denotes explanatory variables, $\beta_C$ denotes parameter vectors, and $C \in \{E, NE\}$. We emphasize that the basic setup is identical to that of the switching regression of Section 3.1.

4.1. Treatment effects

Matching models focus on estimating treatment effects. A treatment effect, loosely speaking, is the value added or the difference in outcome when a firm undergoes treatment E relative to not undergoing treatment, i.e., choosing NE. Selection models such as the switching regression specification (equations (11)–(14)) estimate treatment effects. Their approach is indirect.
In selection models, we estimate a vector of parameters and covariances in the selection equations and use these parameters to estimate treatment effects. In contrast, matching models go directly to treatment effect estimation, setting aside the step of estimating parameters of regression structures specified in selection models.

The key question in the matching literature is whether treatment effects are significant. In the system of equations (24)–(26), this question can be posed statistically in a number of ways.

• At the level of an individual firm i, the effectiveness of a treatment can be judged by asking whether $E(Y_{E,i} - Y_{NE,i}) = 0$.
• For the group of treated firms, the effectiveness of the treatment is assessed by testing whether the treatment effect on the treated (TT) equals zero, i.e., whether $E[(Y_E - Y_{NE}) \mid C = E] = 0$.
• For the population as a whole, whether treated or not, we test the significance of the average treatment effect (ATE) by examining whether $E(Y_E - Y_{NE}) = 0$.

The main issue in calculating any of the treatment effects discussed above, whether by selection or matching models, is the fact that unchosen counterfactuals are not observed. If a firm i chooses E, we observe the outcome of its choice, $Y_{E,i}$. However, because firm i chose E, we do not explicitly observe the outcome $Y_{NE,i}$ that would occur had the firm instead made the counterfactual choice NE. Thus, the difference $Y_{E,i} - Y_{NE,i}$ is never directly observed for any particular firm i, so its expectation (whether at the firm level, across treated firms, or across treated and untreated firms) cannot be calculated directly. Treatment effects can, however, be obtained via selection models or by matching models, using different identifying assumptions. We discuss selection methods first and then turn to matching methods.

4.2. Treatment effects from selection models

Self-selection models obtain treatment effects by first estimating parameters of the system of equations (24)–(26).
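As a minimal sketch of how ATE, TT, and a naive comparison of observed group means can differ, the system (24)–(26) can be simulated with known parameters, so that both potential outcomes are visible for every firm (something real data never offers). The parameter values, the single regressor with X = Z, the constant in the $Y_E$ equation, and the function names `simulate_firms` and `treatment_effects` are illustrative assumptions and are not taken from the chapter.

```python
import random
from statistics import fmean


def simulate_firms(n=20_000, rho=0.6, seed=0):
    """Draw (treated, Y_E, Y_NE) for n firms from equations (24)-(26).

    Assumed for illustration only: one regressor with X = Z, a constant
    of 1.0 in the Y_E equation, and the parameter values below.
    """
    rng = random.Random(seed)
    gamma, beta_e, beta_ne = 0.5, 1.0, 0.5
    firms = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        eta = rng.gauss(0.0, 1.0)  # private information in equation (24)
        # eps_E has correlation rho with eta; eps_NE is independent of eta.
        eps_e = rho * eta + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        eps_ne = rng.gauss(0.0, 1.0)
        y_e = 1.0 + beta_e * z + eps_e   # equation (25)
        y_ne = beta_ne * z + eps_ne      # equation (26)
        treated = gamma * z + eta > 0    # equation (24)
        firms.append((treated, y_e, y_ne))
    return firms


def treatment_effects(firms):
    """Return (ATE, TT, naive difference in observed group means)."""
    ate = fmean(y_e - y_ne for _, y_e, y_ne in firms)
    tt = fmean(y_e - y_ne for d, y_e, y_ne in firms if d)
    # In real data only Y_E of treated and Y_NE of untreated firms are seen.
    naive = (fmean(y_e for d, y_e, _ in firms if d)
             - fmean(y_ne for d, _, y_ne in firms if not d))
    return ate, tt, naive


if __name__ == "__main__":
    ate, tt, naive = treatment_effects(simulate_firms())
    print(f"ATE={ate:.2f}  TT={tt:.2f}  naive={naive:.2f}")
```

Under these assumed parameters the population ATE is 1.0 by construction. TT is larger because treated firms have higher Z and private information that is positively correlated with $\epsilon_E$, and the naive difference is larger still because it also picks up the gap in Z between the two groups.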
Given the parameter estimates, it is straightforward to estimate treatment effects described in Section 4.1, as illustrated, e.g., in Section 3.1 for the switching regression model. The key identifying assumption in selection models is the specification of the variables entering selection and outcome equations, i.e., variables X and Z in equations (24)–(26).

Two points deserve emphasis. The first is that the entire range of selection models discussed in Section 2 through Section 3.2 can be used to estimate treatment effects. This point deserves special mention because in received corporate finance applications, the tendency has been to report estimates of matching models and, as a robustness check, an accompanying estimate of a selection model. With virtually no exception, the selection model chosen for the robustness exercise is the Heckman model of Section 2. However, there is no a priori reason to impose this restriction: any other model, including the switching regression models or the structural models, can be used, and perhaps ought to at least get a hearing. The second point worth mentioning is that unlike matching models, selection models always explicitly test for and incorporate the effect of unobservable private information, through the inverse Mills ratio term, or more generally, through control functions that model private information (Heckman and Navarro-Lozano, 2004).

4.3. Treatment effects from matching models

In contrast to selection models, matching models begin by assuming that private information is irrelevant to outcomes.[12] Roughly speaking, this is equivalent to imposing zero correlation between private information $\eta$ and outcome $Y_E$ in equations (24)–(26). Is irrelevance of private information a reasonable assumption? It clearly depends on the specific application. The assumption is quite plausible if the decision to obtain treatment E is done through an exogenous randomization process.
It becomes less plausible when the decision to choose E is an endogenous choice of the decision-maker, which is probably close to many corporate finance applications except perhaps for exogenous shocks such as regulatory changes.[13] If private information can be ignored, matching methods offer two routes to estimate treatment effects: dimension-by-dimension matching and propensity score matching.

4.3.1. Dimension-by-dimension matching

If private information can be ignored, the differences between firms undergoing treatment E and untreated NE firms depend only on observable attributes X. Thus, the treatment effect for any firm i equals the difference between its outcome and the outcome for a firm $j(i)$ that matches it on all observable dimensions. Formally, the treatment effect equals $Y_{i,E} - Y_{j(i),NE}$, where $j(i)$ is such that $X_{i,k} = X_{j(i),k}$ for all K relevant dimensions, i.e., for all $k = 1, 2, \ldots, K$. Other measures such as TT and ATE defined in Section 4.1 follow immediately.[14]

Dimension-by-dimension matching methods have a long history of usage in empirical corporate finance, as explained in Chapter 1 (Kothari and Warner, 2007) in this book. Virtually all studies routinely match on size, industry, the book-to-market ratio, and so on. The “treatment effect” is the matched-pair difference in outcomes. There is nothing inherently wrong with these methods. They involve the same economic assumptions as other matching methods based on propensity scores used in recent applications. In fact, dimension-by-dimension matching imposes less structure and probably represents a reasonable first line of attack in typical corporate finance applications.

[12] See, e.g., Wooldridge (2002) for formal expressions of this condition.

[13] Of course, even here, if unobservable information guides company responses to such shocks, irrelevance of unobservables is still not a good assumption.

[14] One could legitimately ask why we need to match dimension by dimension when we have a regression structure such as (25) and (26). The reason is that dimension-by-dimension matching remains consistent when the data come from these regressions, but it is also consistent with other data generating mechanisms. If one is willing to specify equations (25) and (26), the treatment effect is immediately obtained as the difference between the fitted values in the two equations.

Matching on all dimensions and estimating the matched-pair differences in outcomes poses two difficulties. One is that characteristics are not always exactly matched in corporate finance applications. For instance, we often match firm size or book-to-market ratios with 30% calipers. When matches are inexact, substantial biases could build up as we traverse the different characteristics being matched. A second issue that proponents of matching methods frequently mention is dimensionality. When the number of dimensions to be matched goes up and the matching calipers become fine (e.g., size and prior performance matched within 5% rather than 30%, and 4-digit rather than 2-digit SIC matches), finding matches becomes difficult or even infeasible. When dimension-by-dimension matching is not feasible, a convenient alternative is methods based on propensity scores. We turn to these next.

4.3.2. Propensity score (PS) matching

Propensity score (PS) matching methods handle the problems caused by dimension-by-dimension matching by reducing the task to matching on a single dimension: the probability of undergoing treatment E. The probability of treatment is called the propensity score. Given a probability model such as equation (24), the treatment effect equals the outcome for the treated firm minus the outcome for an untreated firm with equal treatment probability.
The simplicity of the estimator and its straightforward interpretation make the propensity score estimator attractive.

It is useful to review the key assumptions underlying the propensity score method. Following Rosenbaum and Rubin (1983), suppose that the probability model in equation (24) satisfies

• PS1: $0 < \mathrm{pr}(E \mid Z) < 1$.
• PS2: Given Z, outcomes $Y_E$, $Y_{NE}$ do not depend on whether the firm is in group E (NE).

Assumption (PS1) requires that at each level of the explanatory variable Z, some firms should pick E and others pick NE. This constraint is frequently imposed in empirical applications by requiring that treated and untreated firms have common support. Assumption (PS2) is the strong ignorability or conditional independence condition. It requires that unobserved private information should not explain outcome differentials between firms choosing E and those choosing NE. This is a crucial assumption. As Heckman and Navarro-Lozano (2004) show, even fairly mild departures can trigger substantial biases in treatment effect estimates.

Given assumptions (PS1) and (PS2), Rosenbaum and Rubin (1983) show that the treatment effect is the difference between outcomes of treated and untreated firms having identical treatment probabilities (or propensity scores). Averaging across different treatment probabilities gives the average treatment effect across the population.[15]

4.3.3. Implementation of PS methods

In light of Rosenbaum and Rubin (1983), the treatment effect is the difference between outcomes of treated and untreated firms with identical propensity scores. One issue in implementing matching is that we need to know propensity scores, i.e., the treatment probability $\mathrm{pr}(E \mid Z)$. This quantity is not known ex ante and must be estimated from the data, using, for instance, probit, logit, or other less parametrically specified approaches.
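The estimation step can be sketched as follows, assuming a logit specification with a single explanatory variable and an intercept, fitted by Newton-Raphson in plain Python. The function names (`fit_logit`, `propensity_scores`, `on_common_support`) are hypothetical, and the min-max overlap rule used for common support is only one simple way to impose (PS1) in a sample; neither is prescribed by the chapter.

```python
import math
import random


def fit_logit(z, d, iterations=25):
    """Fit pr(E|Z) = 1 / (1 + exp(-(a + b*Z))) by Newton-Raphson.

    z: list of values of one explanatory variable.
    d: list of 0/1 flags, 1 if the firm chose E.
    """
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, di in zip(z, d):
            p = 1.0 / (1.0 + math.exp(-(a + b * zi)))
            w = p * (1.0 - p)
            g0 += di - p             # score with respect to a
            g1 += (di - p) * zi      # score with respect to b
            h00 += w                 # entries of the 2x2 information matrix
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def propensity_scores(z, a, b):
    """Fitted treatment probabilities for each firm."""
    return [1.0 / (1.0 + math.exp(-(a + b * zi))) for zi in z]


def on_common_support(scores, d):
    """Flag firms whose score lies in the overlap of E and NE score ranges."""
    e = [s for s, di in zip(scores, d) if di == 1]
    ne = [s for s, di in zip(scores, d) if di == 0]
    lo, hi = max(min(e), min(ne)), min(max(e), max(ne))
    return [lo <= s <= hi for s in scores]


if __name__ == "__main__":
    rng = random.Random(0)
    z = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    # Simulated choices with assumed true parameters a = -0.5, b = 1.0.
    d = [1 if rng.random() < 1.0 / (1.0 + math.exp(0.5 - zi)) else 0 for zi in z]
    a_hat, b_hat = fit_logit(z, d)
    scores = propensity_scores(z, a_hat, b_hat)
    kept = sum(on_common_support(scores, d))
    print(f"a={a_hat:.2f}  b={b_hat:.2f}  firms on common support={kept}")
```

The sketch does not guard against perfectly separated data, where the logit likelihood has no finite maximum and the Newton step breaks down; an applied study would normally rely on an established estimation routine instead.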
The corresponding treatment effects are therefore also estimated with error, and the literature develops standard error estimates (e.g., Heckman, Ichimura and Todd, 1998; Dehejia and Wahba, 1999; Wooldridge, 2002, Chapter 18).

A second implementation issue immediately follows. What variables do we include in estimating the probability of treatment? While self-selection models differentiate between variables determining outcomes and variables determining the probability of being treated (X and Z, respectively, in equations (24)–(26)), matching models make no such distinction. Roughly speaking, either a variable determines the treatment probability, in which case it should be used in estimating the treatment probability, or it does not, in which case it should be randomly distributed across treated and untreated firms and is averaged out in computing treatment effects. Thus, for matching models, the prescription is to use all relevant variables in estimating propensity scores.[16]

A third issue is estimation error. In principle, matching demands that treated firms be compared to untreated firms with the same treatment probability. However, treatment probabilities must be estimated, so exact matching based on the true treatment probability is usually infeasible. A popular approach, following Dehejia and Wahba (1999), divides the data into several probability bins. The treatment effect is estimated as the average difference between the outcomes of E and NE firms within each bin. Heckman, Ichimura and Todd (1998) suggest comparing each treated firm with a weighted average of untreated firms, with weights that decline with the distance between the treated and untreated firms.
For statistical reasons, Abadie and Imbens (2004) suggest that the counterfactual outcomes should be estimated not as the actual outcomes for a matched untreated firm, but as the fitted value in a regression of outcomes on explanatory variables.[17]

[15] This discussion points to another distinction between PS and selection methods. The finest level to which PS methods can go is the propensity score or the probability of treatment. Because many firms can have the same propensity score, PS methods do not estimate treatment effects at the level of the individual firm, while selection methods can do so.

[16] This statement is not, of course, a recommendation to engage in data snooping. For instance, in fitting models to estimate propensity scores, using quality of fit as a model selection criterion leads to difficulties, as pointed out by Heckman and Navarro-Lozano (2004).

[17] The statistical properties of different estimators have been extensively discussed in the econometrics literature, most recently in a review issue devoted to the topic (Symposium on the Econometrics of Matching, Review of Economics and Statistics 86 (1), 2004).
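The binning and distance-weighting ideas of Section 4.3.3 can be sketched in a few lines, taking the propensity scores as given. This is a simplified illustration, not the estimators as published: equal-width bins, weighting bins by their number of treated firms, the Gaussian weights, the bandwidth, and all function names are assumptions made here. The simulated data satisfy (PS2) by construction, with a true treatment effect of 2.0.

```python
import math
import random


def naive_difference(treated, outcomes):
    """Mean outcome of E firms minus mean outcome of NE firms, unmatched."""
    e = [y for d, y in zip(treated, outcomes) if d]
    ne = [y for d, y in zip(treated, outcomes) if not d]
    return sum(e) / len(e) - sum(ne) / len(ne)


def stratified_effect(scores, treated, outcomes, n_bins=10):
    """Average the within-bin E-minus-NE outcome gap over score bins.

    Bins are weighted by their number of treated firms (an assumption
    here, giving a TT-style estimate); bins missing a group are skipped.
    """
    bins = [([], []) for _ in range(n_bins)]
    for s, d, y in zip(scores, treated, outcomes):
        k = min(int(s * n_bins), n_bins - 1)
        bins[k][0 if d else 1].append(y)
    total, weight = 0.0, 0
    for e, ne in bins:
        if e and ne:
            total += len(e) * (sum(e) / len(e) - sum(ne) / len(ne))
            weight += len(e)
    return total / weight


def kernel_effect(scores, treated, outcomes, bandwidth=0.05):
    """Compare each treated firm with a weighted average of all untreated
    firms, with Gaussian weights that decline in propensity score distance."""
    untreated = [(s, y) for s, d, y in zip(scores, treated, outcomes) if not d]
    gaps = []
    for s, d, y in zip(scores, treated, outcomes):
        if not d:
            continue
        w = [math.exp(-0.5 * ((s - su) / bandwidth) ** 2) for su, _ in untreated]
        counterfactual = sum(wi * yu for wi, (_, yu) in zip(w, untreated)) / sum(w)
        gaps.append(y - counterfactual)
    return sum(gaps) / len(gaps)


def demo_data(n=2000, seed=0):
    """Simulated firms with no private information and a true effect of 2.0."""
    rng = random.Random(seed)
    scores, treated, outcomes = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-z))   # true propensity score
        d = rng.random() < p
        scores.append(p)
        treated.append(d)
        outcomes.append(2.0 * d + z + rng.gauss(0.0, 0.5))
    return scores, treated, outcomes


if __name__ == "__main__":
    s, d, y = demo_data()
    print(f"naive={naive_difference(d, y):.2f}")
    print(f"stratified={stratified_effect(s, d, y):.2f}")
    print(f"kernel={kernel_effect(s, d, y):.2f}")
```

In this simulation the unmatched difference overstates the effect because treated firms have higher Z, while both matched estimates land much closer to 2.0. That result depends entirely on (PS2) holding in the simulated data; if private information were correlated with outcomes, neither estimator would remove the resulting bias.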
