Techniques for Reallocating State Estimates of the Undocumented Immigrant Population to Small Area Geographies
Texas SDC/BIDC Conference for Data Users
May 22, 2013, Austin, TX
Rationale
• Texas is one of the fastest growing states, with migration accounting for 45% of this growth.
• Immigration, especially unauthorized or illegal migration, is a critical issue in planning because of:
  • Concerns about border security
  • Concerns about the economic impact on receiving communities
  • Concerns about resulting shifts in the social characteristics of communities
• With the exception of California, sub-state estimates of the undocumented population are not available.
Background
• Conventionally, estimates of the undocumented population are produced using the residual method (Warren 2011; Passel 2010, 2011).
  • Estimates of legal foreign-born residents are subtracted from estimates of the total foreign-born population.
• The most commonly used national and state estimates come from the Pew Hispanic Center, the Department of Homeland Security, and R. Warren.
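The subtraction behind the residual method can be sketched in a few lines. This is only an illustration: the function name and the population figures are hypothetical, not values from Warren or Passel.

```python
def residual_estimate(total_foreign_born, legal_foreign_born):
    """Residual method: unauthorized = total foreign born - legal foreign born."""
    if legal_foreign_born > total_foreign_born:
        raise ValueError("legal foreign-born count exceeds total foreign-born count")
    return total_foreign_born - legal_foreign_born


# Hypothetical state-level inputs, for illustration only.
total_fb = 4_000_000   # e.g. a survey-based estimate of all foreign-born residents
legal_fb = 2_400_000   # e.g. an administrative estimate of legally resident foreign born
print(residual_estimate(total_fb, legal_fb))
```

Both inputs must exist for the same geography, which is why the method is hard to apply below the state level.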
Background Source: Pew Hispanic Center, 2011
Background Source: Warren, 2010
Challenge
The residual method is difficult to apply at lower geographies because the required data are unavailable.
Literature Review
• Hill & Johnson (2011) employ a methodology that combines census population data with new administrative data, allowing estimation of the total unauthorized population and its distribution across sub-state geographies.
• 80 percent of unauthorized immigrants report filing federal income taxes, and about 75 percent report having payroll taxes withheld (Porter 2005; Hill et al. 2010).
• Estimates suggest that over half of unauthorized immigrants already pay income and payroll taxes through withholding, filed tax returns, or both (Orrenius and Zavodny 2012).
Literature Review
• Since immigrants without work authorization do not have valid Social Security numbers, many instead use Individual Taxpayer Identification Numbers (ITINs), issued by the Internal Revenue Service (IRS), when filing tax returns.
• Hill et al. (2011) have shown a high correlation (0.96 < r < 0.98) between ITIN filers and unauthorized immigrant estimates in the U.S.
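For readers who want to see how a correlation of this kind is computed, here is a minimal sketch of Pearson's r. The state-level figures are made up for illustration and are not the data used by Hill et al.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical counts for five states (illustration only).
itin_filers = [120_000, 45_000, 300_000, 15_000, 80_000]
unauthorized = [250_000, 100_000, 640_000, 40_000, 150_000]
print(round(pearson_r(itin_filers, unauthorized), 3))
```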
Objectives
• To reallocate Texas state estimates of the unauthorized population to the county level using ITIN data
• To expand upon this new estimation method by employing spatial prediction techniques to refine the distribution of unauthorized immigrants across the state
Data Sources
• R. Warren’s 2008 state-level estimates of the unauthorized population
• 2008 IRS Individual Taxpayer Identification Number (ITIN) administrative data
• American Community Survey (ACS) 2008 estimates of relevant sociodemographic characteristics
• U.S. Bureau of Economic Analysis (BEA) local employment data for 2008
Methodology
• Not all unauthorized immigrants file tax returns, and not all ITIN filers are unauthorized (Hill et al. 2011).
• Hill & Johnson use regression analysis, incorporating economic and sociodemographic characteristics related to unauthorized immigrant status, to predict a state-level ratio of ITIN filers to unauthorized immigrants.
Methodology
• Model 1 borrows the Hill & Johnson method to identify parameters useful for modeling the ratio of ITIN filers to the state unauthorized estimate.
• Model 2 is a simple OLS regression that uses the parameters identified in Model 1 to estimate the ITIN-to-unauthorized ratio at the county level.
• Model 3 is a geographically weighted regression (GWR) model that produces a county-specific ratio, estimating ITIN filers as a percentage of unauthorized immigrants in each county.
• The final step applies the predicted values from each of these models and scales them to Warren’s statewide estimate.
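The final scaling step can be sketched as follows. This is one plausible reading rather than the presenters' code: it assumes a raw county figure is obtained by dividing ITIN filers by the predicted ITIN-to-unauthorized ratio, and every number below (including the statewide total) is hypothetical.

```python
def scale_to_state_total(raw_county_values, state_total):
    """Rescale county values proportionally so that they sum to the state total."""
    raw_sum = sum(raw_county_values)
    if raw_sum <= 0:
        raise ValueError("raw county values must sum to a positive number")
    factor = state_total / raw_sum
    return [value * factor for value in raw_county_values]


# Hypothetical inputs for three counties (illustration only).
itin_filers = [100.0, 300.0, 0.0]
predicted_ratio = [0.5, 0.6, 0.5]   # assumed model output: ITIN filers / unauthorized

# Assumption: raw unauthorized count = ITIN filers / predicted ratio.
raw = [itin / ratio for itin, ratio in zip(itin_filers, predicted_ratio)]

statewide_estimate = 1400.0         # stand-in for the statewide total, not Warren's figure
print(scale_to_state_total(raw, statewide_estimate))
```

Proportional scaling preserves each county's share of the raw total while forcing the county figures to agree with the statewide control total.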
Methodology
• Run a weighted least squares regression, weighted by the number of foreign-born residents, using backward stepwise elimination:
  $(\mathrm{ITIN}/\mathrm{Warren\ estimate})_s = X_s\alpha + W_s\beta + Z_s\gamma + \varepsilon_s$
• The predicted ratio is then used as a factor to allocate the unauthorized population to the county level.
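A stripped-down sketch of the weighted least squares step is shown below. It deliberately simplifies the model on the slide: it uses a single hypothetical predictor instead of the covariate blocks X, W, and Z, treats the subscript s as indexing states, and omits backward elimination. All data values are invented.

```python
def wls_fit(x, y, weights):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical state-level data (illustration only).
predictor = [0.10, 0.25, 0.40, 0.30]               # e.g. a sociodemographic share
itin_ratio = [0.45, 0.55, 0.70, 0.58]              # ITIN filers / state unauthorized estimate
foreign_born = [4_000_000, 1_000_000, 500_000, 2_000_000]  # regression weights

a, b = wls_fit(predictor, itin_ratio, foreign_born)
# Predicted ratio for a unit whose predictor value is 0.20.
print(a + b * 0.20)
```

Weighting by foreign-born residents lets states with larger immigrant populations influence the fitted coefficients more than small states do.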
Methodology
OLS Model
• Space plays no role in the modeling process, and the global coefficients are constant across the entire sample.
GWR Model
• Addresses spatial non-stationarity and yields a set of spatially varying parameter estimates for each geographic location.
• Smooths the distribution and provides estimates even in counties where ITIN = 0.
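The contrast between the two models can be illustrated with a small geographically weighted regression sketch. It assumes a single predictor, a Gaussian distance kernel, and a fixed bandwidth, none of which are stated in the presentation, and the coordinates and values are hypothetical.

```python
from math import exp, hypot


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weight is 1 at distance 0 and decays toward 0 with distance."""
    return exp(-0.5 * (distance / bandwidth) ** 2)


def weighted_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def gwr_coefficients(coords, x, y, bandwidth):
    """Fit one local regression per location, weighting observations by proximity."""
    local_fits = []
    for u0, v0 in coords:
        w = [gaussian_weight(hypot(u - u0, v - v0), bandwidth) for u, v in coords]
        local_fits.append(weighted_line(x, y, w))
    return local_fits


# Hypothetical county centroids, predictor values, and ratios (illustration only).
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
x = [0.10, 0.20, 0.15, 0.40, 0.50]
y = [0.30, 0.35, 0.32, 0.60, 0.75]

for (a, b), point in zip(gwr_coefficients(coords, x, y, bandwidth=2.0), coords):
    print(point, round(a, 3), round(b, 3))
```

With a very large bandwidth every weight approaches 1 and each local fit approaches the single global OLS fit, which is the sense in which OLS ignores space.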
Results
• The GWR model fit the data better than the OLS model.
• Higher unauthorized estimates were found in areas characterized by agriculture, urbanicity, high employment, fast Hispanic population growth, and substantial foreign-born populations.
• These areas include counties in the Dallas-Fort Worth-Arlington, Houston-Baytown-Sugar Land, and Austin-Round Rock metropolitan areas, large border counties, and counties in parts of East Texas.
• When estimates are expressed as a percentage of county population, Panhandle counties and counties in the Dallas and border areas have the higher percentages.
Future Directions
• Estimate models specific to Texas
• Explore trends from available data
• Explore other spatial techniques
Acknowledgements
Laura Hill and Hans Johnson (Public Policy Institute of California) and Robert Warren
Contact
Office of the State Demographer
Office: (512) 463-8390 or (210) 458-6530
E-mail: State.Demographer@osd.state.tx.us
Website: http://osd.state.tx.us