CHAPTER 3
Expressing the Treatment Effect

3.1 Measures of Treatment Effect
3.2 What Happens When There Is Confounding
3.3 Treatment Effect Dependent on a Background Factor
References

In an ideal hypothetical situation we could observe, on the same group of individuals, the outcome resulting both from applying and from not applying the treatment. We could then calculate the effect of the treatment by comparing the outcomes under the two conditions. We could define a measure of treatment effect for each individual as the difference between his or her outcomes with and without the treatment. If all subjects were exactly alike, this measure would be the same for each. But more commonly, differences between subjects will cause the measure to vary, possibly in relation to background factors. A treatment may, for instance, be more beneficial to younger than to older people, so the effect would vary with age. We may then wish to define a summary measure of the effect of the treatment on the entire group.

In Section 3.1 we will explore different summary measures of treatment effect. In Example 2.1, Dr. A's choice was to express the treatment effect as the average difference in blood pressure between patients who drink coffee and those who do not drink coffee. We will see that this choice was dictated partly by the nature of the risk factor and partly by the underlying model that Dr. A had in mind as to how coffee consumption affects blood pressure. In Section 3.2 we will leave our ideal situation and see how, when we use a comparison group to estimate a summary measure of treatment effect, a confounding factor may distort that estimate. Finally, in Section 3.3 we will focus on situations in which the treatment effect is not constant and show that a single summary measure of treatment effect might not be desirable.

3.1 MEASURES OF TREATMENT EFFECT

The choice of measure for treatment effect depends upon the form of the risk and outcome variables.
It is useful to make the distinction between a numerical variable and a categorical variable. The levels of a numerical variable are numbers, whereas the levels of a categorical variable are labels. Thus age expressed in years is a numerical variable, whereas age expressed as young/middle-aged/old or religion expressed as Catholic/Protestant/Jewish/other are categorical variables. Since the levels of a numerical variable are numbers, they can be combined to compute, for instance, a mean (e.g., the mean age of a group of individuals). For categorical variables, on the other hand, the levels are looked at separately (e.g., there are 45 young individuals, 30 middle-aged, and 60 old). Categorical variables with only two possible levels (e.g., intensive reading program vs. standard reading program) are called dichotomous variables.

Furthermore, we will sometimes distinguish between an ordered categorical variable, such as age, and an unordered categorical variable, such as religion. There exists for the first type an intrinsic ordering of the levels (e.g., young/middle-aged/old), whereas for the second type there is no relationship between the levels (e.g., we cannot arrange the various religions in any particular order). A numerical variable can be created from an ordered categorical variable by assigning numbers or scores to the different levels (e.g., -1 to young, 0 to middle-aged, and 1 to old).

Using numerical and categorical variables, we can distinguish four different situations, as shown in Figure 3.1. In this book we are concerned mainly with Cases 1 and 2, where the risk variable is categorical.

Risk variable       Categorical                Numerical
Outcome variable    Categorical   Numerical    Categorical   Numerical
Case                1             2            3             4

Figure 3.1 Different cases for measures of treatment effect.

Case 1: Consider first the effect of a treatment on a dichotomous outcome, specifically death or survival.
Three measures of treatment effect are commonly used (Fleiss, 1973; see also Sheps, 1959, for other proposals). We define the three measures and illustrate their use with the data given in Table 3.1. Notice that in all three examples given in Table 3.1 the treatment is harmful, since the death rate is higher in the treatment group than in the control group.

Table 3.1 Measures of Treatment Effect for Dichotomous Treatment and Outcome in Three Examples

                          Example (a)           Example (b)           Example (c)
                          Treatment  Control    Treatment  Control    Treatment  Control
Death rate                0.06       0.01       0.55       0.50       0.60       0.10
Survival rate             0.94       0.99       0.45       0.50       0.40       0.90

Difference of death
rates (Delta)             0.06 - 0.01 = 0.05    0.55 - 0.50 = 0.05    0.60 - 0.10 = 0.50
Relative risk (Theta)     0.06/0.01 = 6.00      0.55/0.50 = 1.10      0.60/0.10 = 6.00
Odds ratio (Psi)          (0.06/0.94)/(0.01/0.99) = 6.32
                                                (0.55/0.45)/(0.50/0.50) = 1.22
                                                                      (0.60/0.40)/(0.10/0.90) = 13.5

The three measures of treatment effect are:

1. The difference in death rates (Delta) between the treatment and control groups. (In epidemiology this is called the attributable risk.) In example (a) in Table 3.1, Delta = 0.05 means that the risk of dying is 0.05 greater in the treatment group.

2. The relative risk (Theta), defined as the ratio of the death rate in the treatment group to the death rate in the control group. In example (c) in Table 3.1, Theta = 6 implies that the risk of dying in the treatment group (0.60) is 6 times the risk of dying in the control group (0.10).

3. The odds ratio (Psi), or cross-product ratio, which is based on the notion of odds. The odds of an event are defined as the ratio of the probability of the event to the probability of its complement. For instance, the odds of dying in the treatment group of example (c) are equal to the death rate (0.60) divided by the survival rate (0.40), or 1.50.
When the odds of dying are greater than 1, the risk or probability of dying is greater than that of surviving. Now, the odds ratio in our example is the ratio of the odds of dying in the treatment group (1.50) to the odds of dying in the control group (0.10/0.90 = 0.11), or 13.50. The odds of dying are 13.50 times as great in the treatment group as in the control group. The odds ratio can be conveniently computed as the ratio of the products of the diagonal cells of the treatment-by-survival table, hence its alternative name, cross-product ratio. In our example,

Psi = (0.60/0.40) / (0.10/0.90)         (ratio of the odds)
    = (0.60 x 0.90) / (0.40 x 0.10)     (cross-product ratio)
    = 13.50.

The three measures of treatment effect, the difference of rates (Delta), the relative risk (Theta), and the odds ratio (Psi), are linked in the following ways:

1. If the treatment has no effect (i.e., the death rates are equal in the control and treatment groups), then Delta = 0 and Theta = Psi = 1.

2. If Delta is negative, or Theta or Psi is smaller than 1, the treatment is beneficial. Conversely, if Delta is positive, or Theta or Psi is greater than 1, the treatment is harmful.

3. If the death rates in the treatment and control groups are low, the odds ratio and relative risk are approximately equal [see, e.g., Table 3.1, example (a); see also Appendix 4A].

4. In certain types of studies (see case-control studies in Chapter 4), only the odds ratio can be meaningfully computed. In these studies the total number of deaths and the total number of survivors are fixed by the investigator, so that death rates, and hence differences of death rates and relative risks, cannot be interpreted. We shall see in Chapter 4 that the odds ratio does have a sensible interpretation in these studies.

The three examples of Table 3.1 were chosen in such a way that (a) and (b) lead to the same difference of rates and (a) and (c) to the same relative risk.
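The arithmetic behind Table 3.1 can be checked with a minimal Python sketch. The death rates are those of the table; the function names (odds, measures, cross_product_ratio) are illustrative choices and do not come from the text.

```python
# Death rates (treatment, control) for the three examples of Table 3.1.
EXAMPLES = {"a": (0.06, 0.01), "b": (0.55, 0.50), "c": (0.60, 0.10)}


def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def measures(p_treatment, p_control):
    """Return (Delta, Theta, Psi) for a pair of death rates."""
    delta = p_treatment - p_control            # difference of death rates
    theta = p_treatment / p_control            # relative risk
    psi = odds(p_treatment) / odds(p_control)  # odds ratio
    return delta, theta, psi


def cross_product_ratio(p_treatment, p_control):
    """Odds ratio computed from the diagonal cells of the 2 x 2 table."""
    return (p_treatment * (1.0 - p_control)) / ((1.0 - p_treatment) * p_control)


if __name__ == "__main__":
    for label, (pt, pc) in EXAMPLES.items():
        delta, theta, psi = measures(pt, pc)
        print(f"({label}) Delta={delta:.2f}  Theta={theta:.2f}  Psi={psi:.2f}")
```

Both routes to the odds ratio (ratio of the odds, and cross-product of the table cells) are included so that their agreement can be verified directly.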
These examples show that the value of one of the three measures has no predictable relation (other than those mentioned above) to the value of either of the other two: although (a) and (b) have the same Delta of 0.05, their relative risks (6.00 and 1.10) are widely different.

Several factors influence the choice of the measure of treatment effect. The choice may depend on how the measure is going to be used. For example, a difference in death rates would give a better idea of the impact that the treatment would have if it were applied to all diseased people (MacMahon and Pugh, 1970). Berkson (1958; also quoted in Fleiss, 1973), in looking at the effect of smoking on survival, makes this point by saying that "of course, from a strictly practical viewpoint, it is only the total number of increased deaths that matters." On the other hand, the relative risk may highlight a relationship between a risk and an outcome factor. Hill (1965) remarks that although 71 per 10,000 and 5 per 10,000 are both very low death rates, what "stands out vividly" is that the first is 14 times the second. Thus the choice of a measure may be guided by the aim of the study.

Also, the investigator may believe that one model is more appropriate than another in expressing how the treatment affects the outcome, and he or she can use the data at hand to test that belief. That particular model may suggest a measure of treatment effect. This applies to any of the four cases considered in this section. We will turn to Case 2 and illustrate there how a measure may derive from a model.

Case 2: When the outcome variable is numerical (e.g., weight, blood pressure, test score), the difference of the average of the outcome variable between the treatment and comparison groups is a natural measure of treatment effect. For instance, Dr. A can calculate the average blood pressure among coffee drinkers and among non-coffee drinkers and take the difference as a measure of treatment effect. Dr. A
may think of two different ways in which coffee might affect blood pressure. Let Y1 and Y0 be the blood pressure of a given patient with and without coffee drinking. First, coffee drinking might increase blood pressure by a certain amount Delta, which is the same for all patients:

Y1 = Y0 + Delta

for any patient (ignoring random variation). Second, coffee drinking might increase blood pressure proportionally to each patient's blood pressure. Let Pi be this coefficient of proportionality:

Y1 = Pi.Y0

for any patient. By taking logarithms on each side of this expression, we have, equivalently,

log Y1 = log Y0 + log Pi.

Notice that we have transformed a multiplicative effect (Pi) into an additive effect (log Pi) by changing the scale of the variables through the logarithmic function.

In the first case, Delta would be the measure of treatment effect suggested by the model, which Dr. A could estimate by the difference of average blood pressure between the coffee and no-coffee groups. In the second case, he could consider log Pi as a measure of treatment effect, which he could estimate by the difference of the average logarithm of blood pressure between the two groups. Or he may find Pi easier to interpret as a measure of treatment effect and transform back to the original units through the exponential function. Clearly, with the data at hand (see Table 2.1), the first model (and hence Delta) is more appropriate.

Case 3: An example of Case 3, where the risk variable is numerical and the outcome categorical, is a study of the effect of increasing doses of a drug on the chance of surviving for 1 year. The odds of dying can be defined for each dose of the drug. The effect of the drug can be assessed by looking at the change in the odds of dying as the dose increases. A model often used in such cases assumes that for any increase of the dose by 1 unit, the logarithm of the odds changes by a constant amount. This amount is taken as the measure of treatment effect.
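The Case 3 model can be illustrated with a short Python sketch. It assumes a model in which the logarithm of the odds of dying is a straight-line function of dose; the intercept and slope below are hypothetical values chosen only for illustration, not figures from the text. The sketch shows that the odds ratio between successive doses is then the same at every dose level, which is why the slope can serve as a single measure of treatment effect.

```python
import math

# Hypothetical coefficients, for illustration only (not from the text):
# log-odds of dying = INTERCEPT + SLOPE * dose.
INTERCEPT = -2.0
SLOPE = 0.5  # change in log-odds per unit increase in dose


def log_odds(dose):
    """Log-odds of dying at a given dose under the assumed linear model."""
    return INTERCEPT + SLOPE * dose


def death_rate(dose):
    """Convert the log-odds back to a probability of dying."""
    return 1.0 / (1.0 + math.exp(-log_odds(dose)))


def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)


if __name__ == "__main__":
    for dose in range(4):
        p0, p1 = death_rate(dose), death_rate(dose + 1)
        ratio = odds(p1) / odds(p0)
        print(f"dose {dose} -> {dose + 1}: "
              f"death rate {p0:.3f} -> {p1:.3f}, odds ratio {ratio:.3f}")
```

The death rates themselves change by different amounts at each step, while the odds ratio stays fixed at exp(SLOPE); this is the sense in which the effect is constant on the log-odds scale.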
Case 4: Here both the risk and outcome variables are numerical. Suppose that we want to look at the effect of increasing doses of a drug on blood pressure; if a straight line is fitted to the blood pressure-dose points, the slope of the line can be taken as a measure of the effect of the drug. It represents the change in blood pressure per unit increase in dosage. Regression techniques that can be used in this case will not be discussed in this book. This topic has been covered in many other books (see, e.g., Tufte, 1974; Mosteller and Tukey, 1977; Hanushek and Jackson, 1977; Daniel and Wood, 1971; Colton, 1974).

From the discussion of these four cases, it should be clear that a measure of treatment effect depends not only on the form of the risk and outcome variables, but also on the aim of the study, the scale of the variables, and the models judged appropriate by the investigators.

3.2 WHAT HAPPENS WHEN THERE IS CONFOUNDING

We know from previous chapters that we should be wary of confounding factors when we compare a group of treated individuals and a group of comparison individuals to assess the effect of a treatment. The purpose of this section is to show how a confounding factor distorts the estimate of the treatment effect, and how crude odds ratios or differences of average outcome are not good estimates of treatment effect in the presence of confounding.

As before, we will consider different cases, depending on how the outcome and confounding factors are measured (i.e., whether they are numerical or categorical). We will consider here only dichotomous risk variables, one level being the treatment and the other the comparison. Figure 3.2 illustrates the four possibilities.
Risk variable           Dichotomous
Outcome variable        Numerical (2)                Categorical (1)
Confounding variable    Numerical    Categorical     Numerical    Categorical
Case                    A            B               C            D

Figure 3.2 Different cases for the effect of a confounding factor.

The numbers (2) and (1) at the top of the figure refer to the case numbers in Figure 3.1, and these indicate which measures of treatment effect are appropriate for Cases A, B, C, and D.

An example of Case A is a study of the effect of smoking on blood pressure where age expressed in years would be a confounding factor. Suppose that the smoking and nonsmoking groups that we compare have the age distributions shown in Figure 3.3. Note that there are very few young smokers and very few old nonsmokers. The average age of smokers is greater than the average age of nonsmokers.

Figure 3.3 Age distribution in the smoking and nonsmoking groups. (Two histograms of the proportion of individuals against age, one for smokers and one for nonsmokers.)

In addition, suppose that a plot of blood pressure vs. age in each group suggests, as in Figure 3.4, that blood pressure is linearly related to age, with equal slopes among smokers and nonsmokers. If we denote blood pressure by Y and age by X and use the subscripts S for smokers and NS for nonsmokers, we have (ignoring random variation)

Y[S]  = Alpha[S]  + Beta.X[S]     in the smoking group
Y[NS] = Alpha[NS] + Beta.X[NS]    in the nonsmoking group.

Figure 3.4 Relationship of blood pressure with age in the smoking and nonsmoking groups. (Blood pressure on the vertical axis, age on the horizontal axis; a fitted line is drawn through the points of each group, the line for smokers lying above the line for nonsmokers.)

The same slope (Beta) appears in the two equations, but the intercepts Alpha[S] and Alpha[NS] are different.
Note that age satisfies the definition of a confounding factor given in Chapter 2: it has a different distribution in the smoking and nonsmoking groups (Figure 3.3) and it affects blood pressure within each population (Figure 3.4). If we assume that age and smoking are the only factors affecting blood pressure, we can measure the effect of smoking by the vertical distance between the two lines of Figure 3.4 (i.e., Alpha[S] - Alpha[NS]).

In the discussion of Case 2 in Section 3.1, we suggested measuring the treatment effect by the difference between the average outcomes: in our example by Ybar[S] - Ybar[NS], the difference between the average blood pressure in the smoking group and that in the nonsmoking group. Since

Ybar[S]  = Alpha[S]  + Beta.Xbar[S]
Ybar[NS] = Alpha[NS] + Beta.Xbar[NS]

(where the bar indicates that we have averaged over the group), it follows that

Ybar[S] - Ybar[NS] = (Alpha[S] + Beta.Xbar[S]) - (Alpha[NS] + Beta.Xbar[NS])
                   = (Alpha[S] - Alpha[NS]) + Beta.(Xbar[S] - Xbar[NS])
                   = treatment effect + bias.

Thus if we use the difference of average blood pressure, in our example we overestimate the treatment effect by the amount Beta.(Xbar[S] - Xbar[NS]), which we call the bias. We have represented this situation in Figure 3.5, which combines Figures 3.3 and 3.4. (In Figure 3.5 the age distributions in each group from Figure 3.3 appear at the bottom of the figure and the relationships between blood pressure and age from Figure 3.4 appear as solid lines. The vertical axis of Figure 3.3 is not explicitly shown.)

Note that if age were not a confounding factor, either the age distribution would be the same in the two groups (so that Xbar[S] - Xbar[NS] = 0) or age would not be related to blood pressure (so that Beta = 0): in both cases the bias would be 0.
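The decomposition of the crude difference into treatment effect plus bias can be verified numerically. The following Python sketch assumes the linear model of Case A without random variation; the intercepts, slope, and ages are hypothetical values chosen only for illustration, and the names (ALPHA_S, decompose, and so on) are not from the text.

```python
from statistics import mean

# Hypothetical values, for illustration only (none come from the text).
ALPHA_S, ALPHA_NS = 130.0, 120.0  # intercepts for smokers and nonsmokers
BETA = 0.5                        # common slope of blood pressure on age
AGES_S = [45, 50, 55, 60, 65]     # smokers are older on average
AGES_NS = [30, 35, 40, 45, 50]


def blood_pressure(alpha, age):
    """Blood pressure under the linear model, ignoring random variation."""
    return alpha + BETA * age


def decompose():
    """Return (crude difference, treatment effect, bias)."""
    ybar_s = mean(blood_pressure(ALPHA_S, x) for x in AGES_S)
    ybar_ns = mean(blood_pressure(ALPHA_NS, x) for x in AGES_NS)
    crude = ybar_s - ybar_ns                       # Ybar[S] - Ybar[NS]
    effect = ALPHA_S - ALPHA_NS                    # Alpha[S] - Alpha[NS]
    bias = BETA * (mean(AGES_S) - mean(AGES_NS))   # Beta.(Xbar[S] - Xbar[NS])
    return crude, effect, bias


if __name__ == "__main__":
    crude, effect, bias = decompose()
    print(f"crude difference = {crude}, treatment effect = {effect}, bias = {bias}")
```

Setting BETA to 0, or giving both groups the same ages, makes the bias vanish, which matches the two conditions under which age is not a confounding factor.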
Figure 3.5 Treatment effect and bias. (Blood pressure against age for smokers [S] and nonsmokers [NS], with the age distribution of each group shown below; the figure marks Ybar[S], Ybar[NS], the "net" difference, and the "crude" difference.)

As an example of Case B, let us consider sex as a confounding factor. If the difference in mean blood pressures for smokers vs. nonsmokers is the same for males and females, this difference may be regarded as the treatment effect (again assuming that no factors, other than smoking and sex, affect blood pressure). But if males have higher blood pressures than females and if males are more likely to smoke than females, the overall difference in average blood pressure between smokers and nonsmokers is biased as in Case A. Another example of Case B is Example 2.1.

To illustrate Case C, where the outcome is categorical and the confounding factor is numerical, let us suppose that we are interested in the effect of smoking on mortality, and once again we will consider age as a confounding factor. Assume the same age distributions as in the example for Case A (see Figure 3.3). Now consider, for instance, the smoking group: to each level of age corresponds a death rate, and a plot of death rate vs. age may suggest a simple relationship between them; similarly in the nonsmoking group. For instance, in Figure 3.6, we have assumed that the relationship between death rate and age could be described by an exponential curve in each group, or equivalently that the relationship between the logarithm of the death rate and age could be described by a straight line in each group.

There is a bit more in the chapter, but I haven't been able to fix it up just yet.