Handling Baseline Differences and Missing Items in a Longitudinal Study of HIV Risk Among Runaway Youths

Health Services and Outcomes Research Methodology

Abstract

Many longitudinal studies in field settings present challenges due to selection bias and incomplete data. A motivating example is provided by an intervention study aimed at preventing HIV transmission among runaway youths housed at shelters in New York City. Two shelters with 167 youths received the intervention, and two shelters with 144 youths received the control treatment. The number of unprotected sexual acts in the prior three months for each youth was assessed at a baseline interview and (to the extent possible) at five follow-up time points. Among observed items, there is strong evidence of a lack of balance on baseline characteristics between the intervention and control groups; meanwhile, beyond occasional missing items among participants interviewed at baseline, three items about baseline characteristics were added after the study began, so that these items are missing for a large percentage of the study sample. Here, we outline two strategies for handling the complexities of this data set, both of which make use of propensity scores to address the imbalances across treatment groups. One approach relies on available cases and ad hoc choices to simplify the steps leading up to a linear mixed model analysis in SAS PROC MIXED; the other approach uses multiple imputation strategies to reflect uncertainty due to missing values in an analogous linear mixed model analysis. Ultimately, we did not find substantial qualitative differences in this setting between the available-case and imputed-data approaches. In both cases, however, we find that the considerable imbalance on covariates between treatment arms constrains the ability to draw inferences about the intervention effect, suggesting the importance of evaluating propensity-score distributions in quasi-experimental intervention research.
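As an illustration of how a multiple-imputation analysis can reflect uncertainty due to missing values, the following minimal sketch pools an intervention-effect estimate across several imputed data sets using Rubin's standard combining rules. It is not the authors' code: the abstract does not say how the per-imputation results were combined, and the function name `pool_rubin` and all numbers in the usage example are hypothetical.

```python
from math import inf, sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool results from m imputed data sets with Rubin's combining rules.

    estimates: one point estimate per imputed data set (e.g. the
               intervention coefficient from a mixed model fit).
    variances: the matching squared standard errors.
    Returns (pooled estimate, pooled standard error, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least 2 imputations and matching lengths")

    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 divisor)
    t = w + (1 + 1 / m) * b      # total variance

    # Reference t-distribution degrees of freedom; infinite when the
    # imputations agree exactly (no between-imputation variance).
    if b == 0:
        df = inf
    else:
        df = (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2
    return q_bar, sqrt(t), df


if __name__ == "__main__":
    # Hypothetical coefficients and squared standard errors from
    # five imputed data sets; these are NOT results from the study.
    est, se, df = pool_rubin([-0.30, -0.10, -0.20, -0.25, -0.15], [0.04] * 5)
    print(f"pooled estimate {est:.3f}, standard error {se:.3f}, df {df:.1f}")
```

The pooled standard error exceeds the square root of the average within-imputation variance whenever the imputed data sets disagree, which is how the extra uncertainty from missing items enters the final inference.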


Cite this article

Song, J., Belin, T.R., Lee, M.B. et al. Handling Baseline Differences and Missing Items in a Longitudinal Study of HIV Risk Among Runaway Youths. Health Services & Outcomes Research Methodology 2, 317–329 (2001). https://doi.org/10.1023/A:1020327530029

  • DOI: https://doi.org/10.1023/A:1020327530029
