A New Approach to Measuring Socio-Spatial Economic Segregation. Sean F. Reardon Stanford University Glenn Firebaugh Pennsylvania State University David O’Sullivan University of Auckland Stephen Matthews Pennsylvania State University.
A New Approach to Measuring Socio-Spatial Economic Segregation Sean F. Reardon Stanford University Glenn Firebaugh Pennsylvania State University David O’Sullivan University of Auckland Stephen Matthews Pennsylvania State University Presentation prepared by Jacques Silber (Bar-Ilan University)
Goal of the paper • To develop an approach to measuring spatial economic segregation • To develop different measures of segregation along an ordinal dimension since income data are often reported using ordered categories • To show how these measures can be adapted to take into account the spatial or social proximity of individuals
Existing Measures of Economic Segregation • Category-Based Measures of Economic Segregation • Most common way of measuring income segregation has been to divide the population into two categories, based on some chosen income threshold • Segregation between the two groups is then computed using some segregation measure (e.g. the dissimilarity index) that assumes the existence of only two groups
Category-Based Measures of Economic Segregation Problems : - Dichotomizing the income distribution amounts to discarding a lot of information - The results of such an approach may depend on the choice of threshold Possible Solutions : - Compute the two-group segregation for all possible pairs of income categories, and then construct some summary measure of the pairwise indices - Compute segregation among multiple income category groups using for example the Theil information theory index of segregation
Category-Based Measures of Economic Segregation But still unsolved issues: • The Theil index measures segregation among a set of unordered groups (such as racial groups) and hence is insensitive to the ordinal nature of income segregation • Such an approach remains sensitive to the number and location of the thresholds used to define income categories
Existing Measures of Economic Segregation • Variation-Ratio Based Measures of Economic Segregation • An alternative approach defines segregation as a ratio of the between neighborhoods variation in income to the total population variation in income • As measures of dispersion one can use the variance of incomes, the so-called Theil-Bourguignon population-weighted decomposable inequality index, etc…
Variation-Ratio Based Measures of Economic Segregation • Advantages of such an approach : • Can use complete information on the income distribution • Does not rely on arbitrary thresholds • These measures are often invariant to certain types of changes in the income distribution
Existing Measures of Economic Segregation • Spatial Autocorrelation Measures of Economic Segregation • Most proposed measures of income segregation are aspatial in the sense that they do not account for the spatial proximity of individuals/households. • A third approach would therefore consider that segregation should measure the extent to which households near one another have more similar incomes than those that are farther from one another • This approach is the least well-developed
Measuring Segregation by an Ordinal Category • Following Reardon and Firebaugh (2002) the idea is to assume that segregation is the proportion of the total variation in a population that is due to differences in population composition of different organizational units (e.g., schools or census tracts) • Assume a measure of variation v. The segregation measure S(v) will then be expressed as :
Measuring segregation by an ordinal category $$S(v) = \sum_{j=1}^{J} \frac{t_j}{T}\left(1 - \frac{v_j}{v}\right)$$ where the subscript $j$ refers to the organizational unit (e.g. region), $t_j$ to the population in unit $j$, $T$ to the total population, $v_j$ to the variation in unit $j$ and $v$ to the variation in the total population
Measuring segregation by an ordinal category • Measuring ordinal variation • Consider an ordinal variable x that can take on any of K ordered categories 1, 2, …, K • Ordinal variation will be assumed to be maximal (e.g. equal to 1) when half the population has x=1 and half has x=K • Ordinal variation will be minimal (e.g. equal to 0) when all observations have x=k for some k ∈ {1, 2, …, K} • Measuring ordinal variation then amounts to measuring how close the distribution of x is to these minimum and maximum variation states
Measuring segregation by an ordinal category • Consider the set C of K-1 cumulative proportions ck, with C=(c1, c2, …, cK-1), where ck is the cumulative proportion of the sample with values of x in category k or below (cK = 1 by definition). • The maximal variation corresponds to C = (½, ½, ½, …, ½), which is the case where half the population has the lowest possible value and half has the highest possible value of x. • Note also that there are K possible distributions of x with no variation in x (all observations in a single category k); each of them has cj = 0 for j < k and cj = 1 for j ≥ k.
Measuring segregation by an ordinal category • Two possible measures of ordinal variation • Entropy based measure $E_0 = \frac{1}{K-1}\sum_{k=1}^{K-1}\left[c_k \log_2\frac{1}{c_k} + (1-c_k)\log_2\frac{1}{1-c_k}\right]$ • Measure based on the concept of diversity $I_0 = \frac{1}{K-1}\sum_{k=1}^{K-1} 4\,c_k(1-c_k)$
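The two variation measures on this slide can be checked against the limiting cases described earlier (maximal variation equal to 1, minimal variation equal to 0). The following is a minimal sketch, assuming the data arrive as a list of household counts per ordered income category; the function names are illustrative and do not come from the paper.

```python
from math import log2

def cumulative_proportions(counts):
    """Return c_1..c_{K-1}: share of the population in category k or below."""
    total = sum(counts)
    running, cums = 0, []
    for n in counts[:-1]:          # c_K = 1 is dropped by definition
        running += n
        cums.append(running / total)
    return cums

def binary_entropy(c):
    """c*log2(1/c) + (1-c)*log2(1/(1-c)), with 0*log2(1/0) taken as 0."""
    return sum(p * log2(1 / p) for p in (c, 1 - c) if p > 0)

def ordinal_entropy(counts):
    """Entropy based ordinal variation (E_0 on the slide)."""
    cums = cumulative_proportions(counts)
    return sum(binary_entropy(c) for c in cums) / len(cums)

def ordinal_diversity(counts):
    """Diversity based ordinal variation (I_0 on the slide)."""
    cums = cumulative_proportions(counts)
    return sum(4 * c * (1 - c) for c in cums) / len(cums)

polarized = [50, 0, 0, 50]     # half in the lowest, half in the highest category
uniform = [0, 100, 0, 0]       # everyone in a single category
print(ordinal_entropy(polarized), ordinal_diversity(polarized))
print(ordinal_entropy(uniform), ordinal_diversity(uniform))
```

Any distribution between these two extremes yields values strictly between 0 and 1 for both measures.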
Corresponding Measures of Segregation • Ordinal information theory index (an ordinal generalization of the categorical information theory index H of Theil) $H_0 = \sum_{j=1}^{J}\frac{t_j}{T}\left(1 - \frac{E_{0j}}{E_0}\right)$ • Ordinal variation ratio index (an ordinal generalization of the categorical variance ratio index, based on the concept of diversity) $R_0 = \sum_{j=1}^{J}\frac{t_j}{T}\left(1 - \frac{I_{0j}}{I_0}\right)$
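As an illustration of how the two indices behave, the sketch below computes both from a table of counts with one row per tract and one column per ordered income category. The table layout and all names are assumptions made for the example; the code also assumes every tract is non-empty and that the pooled population has non-zero variation.

```python
from math import log2

def ordinal_variation(counts, f):
    """Average f(c_k) over the K-1 cumulative proportions of `counts`."""
    total, running, values = sum(counts), 0, []
    for n in counts[:-1]:
        running += n
        values.append(f(running / total))
    return sum(values) / len(values)

def entropy(c):
    return sum(p * log2(1 / p) for p in (c, 1 - c) if p > 0)

def diversity(c):
    return 4 * c * (1 - c)

def ordinal_segregation(tracts, f):
    """sum_j (t_j / T) * (1 - v_j / v) for the variation function f."""
    pooled = [sum(col) for col in zip(*tracts)]
    T = sum(pooled)
    v = ordinal_variation(pooled, f)
    return sum(sum(t) / T * (1 - ordinal_variation(t, f) / v) for t in tracts)

# Two tracts with the same income mix as the region: no segregation.
tracts_mixed = [[10, 20, 10], [20, 40, 20]]
# Three tracts that each contain a single income category: complete segregation.
tracts_sorted = [[30, 0, 0], [0, 30, 0], [0, 0, 30]]

H0_mixed = ordinal_segregation(tracts_mixed, entropy)
R0_sorted = ordinal_segregation(tracts_sorted, diversity)
print(H0_mixed, R0_sorted)
```

Passing `entropy` gives the ordinal information theory index and passing `diversity` gives the ordinal variation ratio index, so the two measures differ only in the variation function.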
Ordinal segregation as an average of pairwise segregation • Both $H_0$ and $R_0$ can be written as weighted averages of a set of K-1 pairwise segregation indices: $H_0 = \frac{1}{K-1}\,\frac{1}{E_0}\sum_{k=1}^{K-1} E_k H_k$ and $R_0 = \frac{1}{K-1}\,\frac{1}{I_0}\sum_{k=1}^{K-1} I_k R_k$ where the subscript k indicates variation or segregation computed between the two groups defined by the kth threshold (i.e., $H_k$ and $R_k$ are segregation levels measured between a group consisting of all those with incomes in category k or below and a group consisting of all those in category k+1 or above)
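The weighted-average identity for the entropy based index can be verified numerically. The sketch below computes the index twice, once directly from the tract-level ordinal entropies and once from the K-1 two-group indices, on made-up counts; it assumes the pooled population has members on both sides of every threshold, so that each pairwise entropy is non-zero. All names are illustrative.

```python
from math import log2

def entropy(p):
    return sum(q * log2(1 / q) for q in (p, 1 - p) if q > 0)

def share_at_or_below(counts, k):
    """Proportion of `counts` in categories 1..k (k is 1-based)."""
    return sum(counts[:k]) / sum(counts)

def pairwise_H(tracts, k):
    """Two-group information theory index at threshold k, plus its entropy E_k."""
    T = sum(sum(t) for t in tracts)
    pooled = [sum(col) for col in zip(*tracts)]
    E_k = entropy(share_at_or_below(pooled, k))
    H_k = sum(sum(t) / T * (1 - entropy(share_at_or_below(t, k)) / E_k)
              for t in tracts)
    return H_k, E_k

def H0_from_pairs(tracts):
    """Entropy-weighted average of the K-1 pairwise indices."""
    K = len(tracts[0])
    pairs = [pairwise_H(tracts, k) for k in range(1, K)]
    E0 = sum(E for _, E in pairs) / (K - 1)
    return sum(E * H for H, E in pairs) / ((K - 1) * E0)

def H0_direct(tracts):
    """Ordinal index computed from tract-level ordinal entropies."""
    K = len(tracts[0])
    T = sum(sum(t) for t in tracts)
    pooled = [sum(col) for col in zip(*tracts)]

    def E_ord(c):
        return sum(entropy(share_at_or_below(c, k)) for k in range(1, K)) / (K - 1)

    return sum(sum(t) / T * (1 - E_ord(t) / E_ord(pooled)) for t in tracts)

tracts = [[40, 30, 20, 10], [10, 20, 30, 40], [25, 25, 25, 25]]
print(H0_direct(tracts), H0_from_pairs(tracts))
```

The two printed values agree up to floating-point rounding; the same check can be repeated for the diversity based index by replacing `entropy(p)` with `4 * p * (1 - p)`.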
Measuring segregation by an ordinal category • Graphical Representation • Figure 1 shows cumulative household income percentile density curves for each of the 176 census tracts in San Francisco County, in 2000 • Figure 2 shows the corresponding curves for the 613 tracts in Wayne County • In both figures, the x-axis indicates both the local (i.e., San Francisco or Wayne County) income percentiles and the 15 income thresholds used in the 2000 census • The income distribution in San Francisco is generally “higher” than in Wayne County (25% of households in San Francisco reported incomes greater than $100,000, compared to 12% of Wayne County) • If there were no income segregation in either county, each tract’s cumulative household income percentile density curve would fall exactly on the 45-degree line
Measuring segregation by an ordinal category • If income segregation had been complete, each tract’s curve would be a vertical line at some income level, indicating that within each tract all households have the same income • Thus, income segregation can be measured by the average deviation of the tract cumulative household income percentile density curves from their regional average (which is, by definition, the 45-degree line) • By this measure, Wayne County appears more segregated by income than San Francisco, since the variation of the tract cumulative density curves around the 45-degree line is greater in Wayne County
Measuring segregation by an ordinal category • Figures 3 and 4 show the pairwise household income segregation levels computed at each of the 15 Census 2000 thresholds for San Francisco and Wayne County, respectively • In addition, each figure illustrates the relative weight (dashed lines) that the pairwise segregation computed at each threshold is given in the calculation of the ordinal segregation measures (which are shown by the thin horizontal lines in each figure)
Measuring segregation by an ordinal category • Segregation (whether measured with the entropy based or the diversity based approach) is relatively flat across most of the middle of the income percentile distribution in both places, but increases or decreases sharply at the extremes of the distribution, depending on which measure is used • As expected, measured segregation at each income percentile is generally higher in Wayne County than in San Francisco, regardless of which measure is used • N.B. The ordinal segregation measures for San Francisco and Wayne County in Figures 3 and 4 are not exactly comparable to one another because the income thresholds do not fall at the same percentiles of the two distributions (the overall income distributions differ between the two counties) • Moreover the measures clearly depend on the choice of thresholds
Measuring segregation by an ordinal category • Therefore if pk is near 0 or 1, the entropy related index contains little information about the segregation experienced by an individual, since it distinguishes among individuals only at one extreme of the income distribution. Conversely, if pk is near 0.5, then the entropy related index will contain a maximal amount of information, since the distinction between the two groups takes place at the median of the distribution. • The same phenomenon occurs with a diversity based segregation index. For a given threshold k, the probability that two randomly-selected individuals from the population will have incomes on opposite sides of threshold k is 2 pk(1- pk). • Because such a probability is maximal when pk=0.5 and minimal when pk=0 or pk=1, a greater weight will be given to the case where segregation between groups is defined by the median of the income distribution than to the cases where a distinction is made between an extreme income group and the remainder.
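To make the weighting argument concrete, the short sketch below evaluates both weights at a few illustrative values of pk. The chosen percentiles are arbitrary and the function names are not from the paper.

```python
from math import log2

def entropy_weight(p):
    """Two-group entropy at a threshold with a share p of the population below it."""
    return sum(q * log2(1 / q) for q in (p, 1 - p) if q > 0)

def diversity_weight(p):
    """Probability that two random individuals fall on opposite sides of the threshold."""
    return 2 * p * (1 - p)

for p in (0.05, 0.25, 0.50, 0.75, 0.95):
    print(f"p={p:.2f}  entropy={entropy_weight(p):.3f}  2p(1-p)={diversity_weight(p):.3f}")
```

Both weights peak at the median threshold and fall symmetrically toward zero at the extremes, which is why thresholds near the tails contribute little to either ordinal index.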
Measuring segregation by an ordinal category • Incorporating spatial proximity into measures of income segregation • Spatial rank-order information theory index and spatial rank-order ratio index • The expressions are similar to those given previously for the rank-order segregation indices but take into account spatial proximity, as explained in the paper by Reardon and O’Sullivan on measures of spatial segregation
Empirical Examples • Table 1 reports estimated household income segregation levels for 6 metropolitan areas. The first column in each panel of the table reports the ordinal income segregation measures H0 and R0. • The subsequent columns report the rank-order income segregation measures estimated based on polynomial approximations of orders M=2 through M=10. • The rank-order measures are remarkably stable, regardless of the order of polynomial used. This is largely because the functions H(p) and R(p) are relatively smooth functions
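The polynomial approximation step can be sketched as follows. The slides do not give the estimation details, so this example makes several assumptions: an unweighted least-squares fit of the pairwise values Hk against the threshold percentiles pk, and a rank-order index defined as the entropy-weighted average of the fitted curve H(p) over p in (0, 1), by analogy with the Ek weights of the ordinal index. The function names and the input values are hypothetical.

```python
from math import log2

def fit_polynomial(xs, ys, order):
    """Ordinary least-squares polynomial fit via the normal equations.
    Returns coefficients b_0..b_order of b_0 + b_1*x + ... + b_order*x**order."""
    n = order + 1
    # Augmented normal-equation matrix [X'X | X'y].
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    for col in range(n):                 # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def entropy(p):
    return sum(q * log2(1 / q) for q in (p, 1 - p) if q > 0)

def rank_order_H(ps, Hs, order=4, steps=2000):
    """Entropy-weighted average of the fitted curve H(p) over p in (0, 1)."""
    coefs = fit_polynomial(ps, Hs, order)
    num = den = 0.0
    for i in range(steps):               # midpoint rule on (0, 1)
        p = (i + 0.5) / steps
        w = entropy(p)
        num += w * sum(b * p ** m for m, b in enumerate(coefs))
        den += w
    return num / den

# Hypothetical pairwise values at nine thresholds (not taken from Table 1).
ps = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
Hs = [0.16, 0.13, 0.12, 0.11, 0.11, 0.11, 0.12, 0.14, 0.18]
print([round(rank_order_H(ps, Hs, order=m), 4) for m in (2, 3, 4)])
```

Because the weights are evaluated on a common grid of percentiles rather than at the census thresholds, an index built this way does not depend on where the thresholds happen to fall in each area's distribution. For high orders the normal equations become ill-conditioned, so a production version would use an orthogonal-polynomial or QR-based fit.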
Empirical Examples • Figure 5 illustrates the values of Hk at each of the 15 thresholds for the New York metropolitan area, as well as the fitted polynomials of order M=2, 3,…, 10. Note that for polynomials of order 4 or higher, the curves fit the points extremely well through most of the range • Table 1 indicates that the ordinal measures H0 and R0 are often a reasonably good estimate of the rank-order measures HR and RR, though not always. In San Jose, for example, R0 is 10 percent lower than the values of RR. In Pittsburgh, in contrast, H0 is 3% larger than the values of HR. Moreover, Atlanta appears more segregated than Minneapolis on the basis of H0, but less segregated on the basis of HR
Empirical Examples Illustration based on spatial rank order information theory index • The authors use a biweight kernel proximity function with radii varying from 500m to 4000m. The information theory based index will then indicate how much less income variation there is in a radius of 500m (or 4000m) than in the metropolitan area as a whole • Table 2 reports values, for each of the 6 metropolitan areas, of the spatial rank order information theory index and the spatial relative diversity index, each computed at radii of 500m, 1000m, 2000m, and 4000m
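For readers unfamiliar with the biweight kernel, the sketch below shows its standard form and one way it could be used to build the income composition of a local environment. The kernel formula is the usual biweight (quartic) kernel; the `local_composition` helper, its input layout and the planar distance calculation are assumptions made for illustration, not the authors' implementation.

```python
def biweight(distance, radius):
    """Biweight (quartic) kernel: (1 - (d/r)^2)^2 inside the radius, 0 outside."""
    if distance >= radius:
        return 0.0
    return (1 - (distance / radius) ** 2) ** 2

def local_composition(points, here, radius):
    """Proximity-weighted category counts around location `here`.
    `points` is a list of ((x, y), counts) pairs with coordinates in metres."""
    K = len(points[0][1])
    local = [0.0] * K
    for (x, y), counts in points:
        d = ((x - here[0]) ** 2 + (y - here[1]) ** 2) ** 0.5
        w = biweight(d, radius)
        for k in range(K):
            local[k] += w * counts[k]
    return local

# Three hypothetical blocks with counts in two income categories.
blocks = [((0, 0), [10, 0]), ((250, 0), [0, 16]), ((1000, 0), [5, 5])]
print(local_composition(blocks, here=(0, 0), radius=500))
print(local_composition(blocks, here=(0, 0), radius=4000))
```

With a 500m radius the block 1000m away receives zero weight, while with a 4000m radius it contributes almost fully, so the local environment looks more like the metropolitan area as the radius grows.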
Empirical Examples • Table 2 indicates that income segregation declines with scale, though at different rates across metropolitan areas. In Atlanta, for example, income segregation computed using 4000m-radius local environments is two-thirds of segregation computed using a 500m radius. • We examine the patterns of income segregation in more detail for each of the metropolitan areas in Figures 6-11.
Empirical Examples • In each figure, we plot the pairwise spatial information theory index computed at the threshold defined by the various income percentiles, estimated at the four radii. • Figure 9, for example, illustrates the case of New York. It appears that the segregation gradient at the high end of the income distribution is quite steep. This indicates that the highest-income households are substantially more segregated from other households than are the lowest-income households.
DISCUSSION • This is a very interesting paper that suggests new ways of measuring segregation by income • I will not discuss the section of the paper that deals (at the end) with spatial segregation measurement because this is a vast field to which Reardon has made very important contributions • I have to confess however that I am only starting to read this literature (to which until now mostly sociologists and geographers have contributed). I do not feel that I know enough about this topic to make relevant remarks
DISCUSSION • I feel however much more comfortable with the other topics covered by this paper • I found that sociologists are not aware of important contributions in economics, just as sociologists reviewing some of my papers have drawn my attention to the fact that I was almost completely ignorant of important sociological contributions
DISCUSSION (1) • Let me start with Figures 1 and 2 which are the basis for the derivation of some of the segregation measures proposed by the authors • The curves drawn in Figures 1 and 2 are in fact what economists have called Interdistributional Lorenz Curves • This concept was first proposed by Butler and McDonald (1987)
DISCUSSION (1) • Let x be a continuous income variable with probability density f(x). The hth partial moment, given a target income $\eta$, is then defined as $\int_0^{\eta} x^h f(x)\,dx$. • The normalized incomplete moment of x for $x \le \eta$ is then defined as $\varphi(\eta,h,x) = \frac{\int_0^{\eta} x^h f(x)\,dx}{E(x^h)}$ where $E(x^h) = \lim_{\eta\to\infty}\int_0^{\eta} x^h f(x)\,dx$ • We may therefore interpret $\varphi(\eta,1,x)$ as the proportion of income received by individuals with income x smaller than or equal to $\eta$ • Butler and McDonald then proposed to plot $\varphi(\eta,h,x_A)$ against $\varphi(\eta,h,x_B)$ for h=0 or h=1, where A and B are two population subgroups • Plotting $\varphi(\eta,h,x_A)$ on the horizontal axis and $\varphi(\eta,h,x_B)$ on the vertical axis, we obtain an interdistributional Lorenz curve that will lie below the 45-degree line if subgroup A is unambiguously disadvantaged, that is if $\varphi(\eta,h,x_B) \le \varphi(\eta,h,x_A)$ for every $\eta$.
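An empirical version of this construction is easy to sketch for two samples of incomes. The example below replaces the integrals with sums over observed incomes and traces the curve as the target income rises through the pooled values; the sample data, the group labels and the function names are hypothetical.

```python
def incomplete_moment(incomes, target, h):
    """Share of the h-th moment held by incomes at or below `target`.
    h=0 gives the population share (empirical CDF); h=1 gives the income share."""
    total = sum(x ** h for x in incomes)
    return sum(x ** h for x in incomes if x <= target) / total

def interdistributional_lorenz(horizontal_group, vertical_group, h=0):
    """Points (horizontal, vertical) traced as the target income rises."""
    targets = sorted(set(horizontal_group) | set(vertical_group))
    return [(incomplete_moment(horizontal_group, t, h),
             incomplete_moment(vertical_group, t, h)) for t in targets]

poorer = [10, 15, 20, 30, 40]
richer = [20, 35, 50, 70, 90]
curve = interdistributional_lorenz(poorer, richer)
for x, y in curve:
    print(f"{x:.1f}  {y:.1f}")
```

With the poorer group on the horizontal axis every point lies on or below the 45-degree line, which is the dominance pattern described on the slide. In the paper's Figures 1 and 2 the horizontal axis is always the whole metropolitan area and each tract supplies one vertical series.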
DISCUSSION (1) • Deutsch and Silber (1999) have shown that these curves allow one to compute Pietra or Gini indices that measure the economic advantage of one group over another, a concept originally proposed by Dagum (1985) but which goes back to Gini’s notions of “Transvariazione” and “Ipertransvariazione”
DISCUSSION (1) • The novelty of the paper by Reardon et al. is that it does not limit the comparison to two groups but extends it to all relevant subgroups (e.g. Census tracts) in a given population(e.g. metropolitan area), the horizontal axis referring always to the overall population (metropolitan area).
DISCUSSION (1) • Moreover the paper suggests using all the binary comparisons between the various census tracts and the metropolitan area, which give rise to all these curves, to derive a measure of the dispersion of these curves, which amounts to computing income based segregation indices • Bishop, Chow and Zeager (2004) have derived statistical tests for these interdistributional Lorenz curves and I believe it would be worthwhile to extend their work and derive tests that would allow one to conclude more firmly, for example, whether an income based segregation index in a given metropolitan area is higher or lower than the corresponding one in another area
DISCUSSION (2) • My second basic remark has to do with the way the authors compute their income based segregation indices • Whether they used information or diversity based indices, their idea is to derive the ratio of the between tracts segregation over the total (metropolitan area) segregation. Clearly if this ratio is high, most of the segregation takes place between tracts
DISCUSSION (2) • It is interesting to note that such a ratio will always • rise when the between areas segregation increases • decrease when the within areas segregation increases • But these are precisely the two basic axioms used by Esteban and Ray as well as others when devising what they call a polarization index
DISCUSSION (2) • This comes out actually very clearly when one recalls that Reardon and his co-authors write that: • ordinal variation will be assumed to be maximal when half the population has x=1 and half has x=K (recall that there are K ordered categories 1, 2, …, K) • ordinal variation will be minimal when all observations have x=k for some k ∈ {1, 2, …, K}.
DISCUSSION (2) • So here again is a proof that there is room for much more interaction between sociologists and economists, at least in the field of segregation measurement
DISCUSSION (3) • My third remark concerns again this idea of measuring segregation when categories are ordered • I would like to draw the attention of the authors to the fact that Hutchens has recently completed a paper where he extends his square root index to the case of ordered categories (e.g. occupations ranked by prestige or income). This square root index is in fact used by Jenkins et al. in the paper they present in this session. • I would like to stress that Hutchens derived this broader index, which he has recently suggested, axiomatically