Regression systems for unbalanced panel data: a stepwise maximum likelihood procedure
Introduction
Systems of regression equations have a long history in econometrics. Notable examples are systems of demand equations for inputs or consumer goods derived from producers’ or consumers’ optimizing behaviour. The reduced form of a structural model also has this format. This history is substantially shorter in panel data econometrics. Starting from the textbook model for single equations with balanced panel data and random effects (see, e.g., Greene (2003, Chapter 13)), Avery (1977), Baltagi (1980), and Magnus (1982) generalized this framework and the estimation procedures to ‘seemingly unrelated’ regression systems, Magnus (1982) by considering maximum likelihood (ML). These generalizations also assumed balanced panel data, which is a very restrictive assumption from a practical point of view, since data sets in which the time series of the different units have unequal length are often the rule rather than the exception. Single equation extensions of the textbook model to models with unbalanced panel data and random individual effects were considered in Biørn (1981) and Baltagi (1985); see also Verbeek and Nijman (1996). Substantial efficiency may be lost by dropping observations from an unbalanced data set to make it balanced; see Mátyás and Lovrics (1991) and Baltagi and Chang (1994).
The purpose of this paper is to integrate, for random effects situations, the regression system ML approach to balanced panel data with the single equation approach to unbalanced panel data, when the attrition or accretion is random. As a preliminary to the ML problem, the generalized least-squares (GLS) problem is considered. Time-specific random effects, which are often of secondary interest for micro data from households, firms, etc., are ignored.
Section snippets
Model and notation
Consider a system of G regression equations, indexed by g=1,…,G, with observations from an unbalanced panel with N individuals, indexed by i=1,…,N. The individuals are observed in at least one and at most P periods. Let $N_p$ denote the number of individuals observed in p periods (not necessarily the same and not necessarily consecutive), p=1,…,P, and let n be the total number of observations, i.e., $N=\sum_{p=1}^{P} N_p$ and $n=\sum_{p=1}^{P} N_p\,p$. Assume that the individuals are ordered in P groups such that the $N_1$
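The counting notation above can be illustrated with a minimal sketch. It assumes the panel is available as a flat list with one individual identifier per observation; the function name `panel_counts` and the toy identifiers are hypothetical and not part of the paper.

```python
from collections import Counter

def panel_counts(individual_ids):
    """Return (N_p by p, N, n) for an unbalanced panel.

    individual_ids holds one entry per observation, naming the
    individual that the observation belongs to (an assumed layout).
    """
    # Number of periods in which each individual is observed.
    periods_per_individual = Counter(individual_ids)
    # N_p: number of individuals observed in exactly p periods.
    N_p = Counter(periods_per_individual.values())
    # N is the sum of N_p over p; n is the sum of N_p * p over p.
    N = sum(N_p.values())
    n = sum(p * count for p, count in N_p.items())
    return dict(N_p), N, n

# Toy panel: "a" observed 3 times, "c" twice, "b" and "d" once each.
ids = ["a", "a", "a", "b", "c", "c", "d"]
N_p, N, n = panel_counts(ids)
```

In this toy panel, two individuals fall in group 1 and one each in groups 2 and 3, and n equals the length of the observation list.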
GLS estimation
Before addressing the ML problem, we consider the GLS problem for when and are known, i.e., the problem of minimizing with respect to . Since , we can rewrite Q as
GLS estimation for the individuals observed p times: We may, as a preliminary to full GLS estimation, apply GLS on the observations for the individuals observed p times, denoted as group p, separately
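A minimal numerical sketch of group-wise GLS may help fix ideas. It is not the paper's derivation: it assumes the standard one-way random effects structure, in which the stacked disturbance vector of an individual observed p times (ordered period by period, with the G equations inside each period) has covariance I_p ⊗ Σ_u + J_p ⊗ Σ_α, where J_p is a p×p matrix of ones. The names `sigma_u`, `sigma_alpha`, `group_covariance`, and `gls_group` are hypothetical, and a common coefficient vector across individuals is assumed.

```python
import numpy as np

def group_covariance(p, sigma_u, sigma_alpha):
    """Assumed covariance of the stacked (p*G) disturbance vector of one individual."""
    return np.kron(np.eye(p), sigma_u) + np.kron(np.ones((p, p)), sigma_alpha)

def gls_group(X_list, y_list, p, sigma_u, sigma_alpha):
    """GLS on the individuals observed p times, with known covariance matrices.

    X_list[i] has shape (p*G, K) and y_list[i] has shape (p*G,).
    """
    omega_inv = np.linalg.inv(group_covariance(p, sigma_u, sigma_alpha))
    # Accumulate the weighted normal equations over individuals in the group.
    XtOX = sum(X.T @ omega_inv @ X for X in X_list)
    XtOy = sum(X.T @ omega_inv @ y for X, y in zip(X_list, y_list))
    return np.linalg.solve(XtOX, XtOy)

# Noise-free toy data: GLS should recover the coefficients exactly.
rng = np.random.default_rng(0)
G, p, K = 2, 3, 3
beta_true = np.array([1.0, -2.0, 0.5])
X_list = [rng.normal(size=(p * G, K)) for _ in range(5)]
y_list = [X @ beta_true for X in X_list]
sigma_u = np.array([[1.0, 0.3], [0.3, 2.0]])
sigma_alpha = np.array([[0.5, 0.1], [0.1, 0.4]])
beta_hat = gls_group(X_list, y_list, p, sigma_u, sigma_alpha)
```

Because the covariance matrix depends on the individual only through p, its inverse is computed once per group, which is one practical reason for treating each group separately before combining them.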
ML estimation
We next consider ML estimation of , , and when assuming normality of the individual effects and the disturbances, i.e., , . Then the 's are independent across i(p) and distributed as . The log-likelihood functions of all 's conditional on all 's for the individuals in group p and for all individuals then become, respectively, where
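As a rough illustration of how a group-wise Gaussian log-likelihood can be evaluated, the sketch below assumes the same one-way error-components covariance, I_p ⊗ Σ_u + J_p ⊗ Σ_α, for the stacked residual vector of an individual observed p times. The function names and the residual layout (one row per individual) are assumptions for illustration, not the paper's expressions.

```python
import numpy as np

def group_loglik(resid, p, sigma_u, sigma_alpha):
    """Gaussian log-likelihood for the individuals observed p times.

    resid has shape (N_p, p*G): one stacked residual vector per individual.
    """
    resid = np.asarray(resid, dtype=float)
    n_ind, m = resid.shape  # m = p*G
    # Assumed covariance of each stacked residual vector.
    omega = np.kron(np.eye(p), sigma_u) + np.kron(np.ones((p, p)), sigma_alpha)
    _, logdet = np.linalg.slogdet(omega)
    # Sum of the quadratic forms r_i' omega^{-1} r_i over individuals.
    quad = np.sum((resid @ np.linalg.inv(omega)) * resid)
    return -0.5 * (n_ind * m * np.log(2.0 * np.pi) + n_ind * logdet + quad)

def total_loglik(resid_by_group, sigma_u, sigma_alpha):
    """Sum the group log-likelihoods; resid_by_group maps p to a residual array."""
    return sum(group_loglik(r, p, sigma_u, sigma_alpha)
               for p, r in resid_by_group.items())

# One equation (G = 1), unit variances, one individual observed twice.
s_u = np.array([[1.0]])
s_a = np.array([[1.0]])
ll_p2 = group_loglik([[1.0, 1.0]], 2, s_u, s_a)
```

The overall log-likelihood is additive over groups because individuals are independent, so a stepwise procedure can alternate between updating the coefficients and the covariance matrices while reusing the per-group terms.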
Acknowledgements
I thank an editor, four referees, Terje Skjerpen, and Knut R. Wangen for helpful comments.
References (18)
Pooling cross-sections with unequal time-series lengths. Economics Letters (1985)
et al., Incomplete panels: a comparative study of alternative estimators for the unbalanced one-way error component regression model. Journal of Econometrics (1994)
et al., A monotonic property for iterative GLS in the two-way random effects model. Journal of Econometrics (1992)
Estimating economic relations from incomplete cross-section/time-series data. Journal of Econometrics (1981)
Maximum likelihood estimation of random effects models. Journal of Econometrics (1987)
Multivariate error components analysis of linear and nonlinear regression models by maximum likelihood. Journal of Econometrics (1982)
et al., Missing observations and panel data: a Monte-Carlo analysis. Economics Letters (1991)
et al., Estimation of the error components model with incomplete panels. Journal of Econometrics (1989)
Error components and seemingly unrelated regressions. Econometrica (1977)