Study Design and Methods

The databases housed at MCHP permit a wide range of population-based observational research. They are also complex and can present unique methodological challenges. This web page provides information on dealing with these challenges, including general study planning considerations.

Types of Studies

Cross-sectional, case control, and cohort studies are the most commonly used observational designs for studies using data from the MCHP Data Repository. Multiple outcome variables can be examined, and data collection tends to be relatively inexpensive. At the same time, issues can include possible loss to follow-up as well as longitudinal changes in coding, databases, and data fields. Study designs can be descriptive or analytical:

  • Descriptive studies measure disease occurrence, health service use, or risk factor prevalence in populations, seeking to identify personal characteristics, geographic locations, and time periods associated with an unusually high or low risk of events. Population-based descriptive studies provide the most accurate quantitative estimates of disease frequency/exposure prevalence or health service use in various populations. (Ecological studies are a type of descriptive study using aggregate statistics on populations. A fundamental limitation of ecological studies is that they consider only aggregate data for an entire population and not whether the disease and exposure of interest occurred in the same person.)
  • Analytical studies measure the association between disease or service use and exposure in individuals or populations, identifying the specific factors that cause increased or decreased risk.
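
As a minimal sketch of the kind of association measure an analytical study might report, the Python below computes a risk ratio and an odds ratio from a 2x2 exposure-by-outcome table. The class name, function names, and counts are hypothetical illustrations and are not drawn from MCHP data or code.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from a hypothetical exposure-by-outcome table."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int


def risk_ratio(t: TwoByTwo) -> float:
    """Risk of the outcome among the exposed divided by risk among the unexposed."""
    risk_exposed = t.exposed_cases / (t.exposed_cases + t.exposed_noncases)
    risk_unexposed = t.unexposed_cases / (t.unexposed_cases + t.unexposed_noncases)
    return risk_exposed / risk_unexposed


def odds_ratio(t: TwoByTwo) -> float:
    """Cross-product ratio; the usual measure reported by case control designs."""
    return (t.exposed_cases * t.unexposed_noncases) / (
        t.exposed_noncases * t.unexposed_cases
    )


# Made-up counts for illustration only.
table = TwoByTwo(exposed_cases=40, exposed_noncases=60,
                 unexposed_cases=20, unexposed_noncases=80)
print(f"risk ratio = {risk_ratio(table):.2f}")
print(f"odds ratio = {odds_ratio(table):.2f}")
```

The risk ratio is meaningful when the table comes from a cohort, where risks can be estimated directly. A case control study fixes the number of cases and controls by design, so the odds ratio is generally the measure used there.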

While cross-sectional studies collect information for a single point in time, case control and cohort studies can be either retrospective or prospective. A retrospective study poses a question and looks back, whereas a prospective study poses a question and looks forward (Institute for Work and Health):

  • A retrospective study "follows up" to view outcomes using data that have already been collected (frequently for other purposes) from past records or from a person's recollections. Issues for this type of study include recall bias and limits to the amount of data that can be collected on one occasion (Mann 2003).
  • A prospective study follows individuals forward through time to see if they develop the outcome of interest (Mann 2003). Intra-subject variability tends to be smaller than inter-subject variability for this type of study, which also allows for separation of aging/maturation effects from cohort effects.


  • Defining the study sample/population [Eligibility Criteria] - includes information on:
    • What are the demographics of the population/sample to be studied? Age, sex and residence might be specified, for example, as men aged 65+ resident in Manitoba.
    • What are the treatment characteristics (disease/procedure, provider)? This might be specified, for example, as hospital admissions in Winnipeg for acute myocardial infarction defined as ICD-9-CM 410.
    • Will small area analysis be needed for location of residence and/or for provider?
    • Other questions include:
      • What is the unit of analysis? Will a cohort need to be developed?
      • How will issues characteristic of the study design (e.g., loss to follow-up for longitudinal studies) be dealt with?
      • Has work been done previously using the study design on MCHP data?
    • See also the study checklist from Mann (2003) [pdf]
    • Longitudinal Design Issues - Changes in data values occurring over time need to be taken into account in longitudinal study design. They include loss to follow-up (e.g., residential moves, death), classification coding changes, family structure alterations, etc. Concepts have been developed to assist the researcher with information on handling these issues.
    • Statistical Tools/Issues
      • Statistics for Large Databases - general statistical tools and approaches for working with regression analysis, GEE (generalized estimating equations), sensitivity and specificity (ROC), survival analysis, etc.
      • MCHP Mapping Tools  (e.g., Geocoding Addresses (ArcView); Intra-urban Areas)
      • Methods for making causal inferences
        • Propensity Score Matching [Methodology and SAS code - pdf] - an alternative method for estimating the effect of receiving treatment when random assignment of treatments to subjects is not feasible.
        • Instrumental variables [Introduction - pdf] - used to control for confounding and measurement error, they allow for the possibility of making causal inferences with observational data.
      • Study bias and validity both need to be carefully explored before and during the observational study:
        • Validity - results should be checked against original or other sources for accuracy (within an acceptable percentage) and logic. Often, a Gold Standard is used. The most common gold standards are patient chart reviews, a validated database, or survey data.
        • Study Bias - "can occur in any research and reflect the potential that the sample studied is not representative of the population it was drawn from and/or the population at large" (Mann 2003). Subject selection and loss to follow-up are examples of potential causes of study bias.
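
The validity check against a gold standard described above, together with the sensitivity and specificity measures listed under Statistical Tools, can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not MCHP's method: the function name, the paired-flag input layout, and the counts are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def validation_metrics(
    pairs: Iterable[Tuple[bool, bool]]
) -> Dict[str, Optional[float]]:
    """Compare an administrative-data case flag with a gold-standard flag.

    Each pair is (admin_flag, gold_flag) for one person. Both flags are
    assumed to be already derived; how they are derived is study-specific.
    """
    tp = fp = fn = tn = 0
    for admin, gold in pairs:
        if admin and gold:
            tp += 1          # flagged in both sources
        elif admin and not gold:
            fp += 1          # flagged only in administrative data
        elif gold:
            fn += 1          # missed by administrative data
        else:
            tn += 1          # negative in both sources
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }


# Made-up validation sample: 8 true positives, 2 false negatives,
# 1 false positive, 9 true negatives.
sample = (
    [(True, True)] * 8
    + [(False, True)] * 2
    + [(True, False)] * 1
    + [(False, False)] * 9
)
for name, value in validation_metrics(sample).items():
    print(name, None if value is None else round(value, 3))
```

Returning None for an undefined ratio, rather than raising an error, is a design choice that keeps the sketch usable on small validation samples where one of the cells may be empty.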

    Last updated July 22, 2011