CMGPD-LN Methodological Lecture Day 9: Family and contextual influences
Existing variables
• The various datasets already include a number of basic kinship variables, such as counts of various types of kin.
• The next slide shows an example based on parental survival.
• If you create your own kinship variables, be careful about mixing them with the existing kinship variables in calculations.
• Kinship variables are sensitive to the assumptions made in their creation, so it is best to be consistent.
Parental survival
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge 1:1 RECORD_NUMBER using "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0003\27063-0003-Data.dta"
* Keep records with non-negative values for both parental survival variables
keep if FATHER_ALIVE >= 0 & MOTHER_ALIVE >= 0
generate both_parents_alive = FATHER_ALIVE & MOTHER_ALIVE
generate one_parent_alive = (FATHER_ALIVE + MOTHER_ALIVE) == 1
generate no_parents_alive = (FATHER_ALIVE + MOTHER_ALIVE) == 0
keep if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80
keep if SEX == 2 & PRESENT
* Flag one record per age so that each age is plotted only once
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
bysort AGE_IN_SUI: egen p_both_parents_alive = mean(FATHER_ALIVE & MOTHER_ALIVE)
bysort AGE_IN_SUI: egen p_one_parent_alive = mean((FATHER_ALIVE + MOTHER_ALIVE) == 1)
bysort AGE_IN_SUI: egen p_no_parent_alive = mean((FATHER_ALIVE + MOTHER_ALIVE) == 0)
bysort AGE_IN_SUI: egen p_father_alive = mean(FATHER_ALIVE)
bysort AGE_IN_SUI: egen p_mother_alive = mean(MOTHER_ALIVE)
line p_both_parents_alive p_one_parent_alive p_no_parent_alive p_father_alive p_mother_alive AGE_IN_SUI if first_in_age, ///
    ytitle("Proportion") xtitle("Age in sui") ///
    legend(order(1 "Both parents alive" 2 "One parent alive" 3 "No parent alive" 4 "Father alive" 5 "Mother alive")) ///
    scheme(s1mono) lpattern(solid dash dot dash_dot shortdash)
Basic principles for locating descendants in same year
• Sons
  • Males in the same YEAR whose FATHER_ID is the same as the individual’s PERSON_ID
• Daughters
  • Never-married females (MARITAL_STATUS == 2) in the same YEAR whose FATHER_ID is the same as the individual’s PERSON_ID
• Married daughters-in-law
  • Married women (MARITAL_STATUS == 1 or 4) in the same YEAR whose FATHER_ID is the same as the individual’s PERSON_ID
  • For married and widowed women, the kinship identifiers for father, mother, etc. all refer to in-laws
• Widowed daughters-in-law
  • Widowed women (MARITAL_STATUS == 3) in the same YEAR whose FATHER_ID is the same as the individual’s PERSON_ID
• Grandchildren
  • Same as above, but look for values of GRANDFATHER_ID that match PERSON_ID
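The rule for daughters can be sketched on a small toy dataset. This is only an illustration: the identifiers and values are made up, and the coding SEX == 1 for females is an assumption (the lecture code only shows SEX == 2 for males). It is meant to be run as a single do-file, because it relies on temporary files.

```stata
* Toy data (hypothetical values). ASSUMPTION: SEX == 1 marks females.
clear
input str3 PERSON_ID str3 FATHER_ID YEAR SEX MARITAL_STATUS PRESENT
"F1" "-99" 1800 2 1 1
"S1" "F1"  1800 2 2 1
"D1" "F1"  1800 1 2 1
"D2" "F1"  1800 1 2 1
"W1" "F1"  1800 1 1 1
"F2" "-99" 1800 2 1 1
end
tempfile people daughters
save `people'

* Never-married, present females with a known father.
* W1 is married, so her FATHER_ID refers to her father-in-law and she is excluded.
keep if SEX == 1 & MARITAL_STATUS == 2 & PRESENT == 1 & FATHER_ID != "-99"
bysort FATHER_ID YEAR: generate daughters_alive = _N
bysort FATHER_ID YEAR: keep if _n == 1
keep FATHER_ID YEAR daughters_alive
rename FATHER_ID PERSON_ID
save `daughters'

* Attach the count to the father's own record in the same year
use `people', clear
merge m:1 PERSON_ID YEAR using `daughters', keep(match master) nogenerate
replace daughters_alive = 0 if daughters_alive == .
list PERSON_ID daughters_alive if SEX == 2
```

In this toy example F1 is credited with two daughters, and everyone else with none.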
Numbers of living sons
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if FATHER_ID != "-99"
keep if PRESENT
keep if SEX == 2
* Count present sons per father and year, then keep one record per father and year
bysort FATHER_ID YEAR: generate sons_alive = _N
bysort FATHER_ID YEAR: keep if _n == 1
keep FATHER_ID YEAR sons_alive
rename FATHER_ID PERSON_ID
save sons_alive, replace
* Attach the counts to each man's own record
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge m:1 PERSON_ID YEAR using sons_alive, keep(match master)
keep if SEX == 2
replace sons_alive = 0 if sons_alive == .
keep if AGE_IN_SUI > 0 & AGE_IN_SUI <= 80
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
bysort AGE_IN_SUI: egen mean_sons_alive = mean(sons_alive)
line mean_sons_alive AGE_IN_SUI if first_in_age, scheme(s1mono) ytitle("Mean number of living sons") xtitle("Age in sui")
bysort AGE_IN_SUI: egen p_sons_alive_0 = mean(sons_alive == 0)
bysort AGE_IN_SUI: egen p_sons_alive_1 = mean(sons_alive == 1)
bysort AGE_IN_SUI: egen p_sons_alive_2 = mean(sons_alive == 2)
bysort AGE_IN_SUI: egen p_sons_alive_3 = mean(sons_alive == 3)
bysort AGE_IN_SUI: egen p_sons_alive_4 = mean(sons_alive == 4)
bysort AGE_IN_SUI: egen p_sons_alive_gt_5 = mean(sons_alive >= 5)
line p_sons_alive_0 p_sons_alive_1 p_sons_alive_2 p_sons_alive_3 p_sons_alive_4 p_sons_alive_gt_5 AGE_IN_SUI if first_in_age, ///
    scheme(s1mono) ytitle("Prop. of men with specified # of living sons") xtitle("Age in sui") ///
    legend(order(1 "0" 2 "1" 3 "2" 4 "3" 5 "4" 6 "5+"))
Numbers of living grandsons
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if GRANDFATHER_ID != "-99"
keep if PRESENT
keep if SEX == 2
bysort GRANDFATHER_ID YEAR: generate grandsons_alive = _N
bysort GRANDFATHER_ID YEAR: keep if _n == 1
keep GRANDFATHER_ID YEAR grandsons_alive
rename GRANDFATHER_ID PERSON_ID
save grandsons_alive, replace
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge m:1 PERSON_ID YEAR using grandsons_alive, keep(match master)
keep if SEX == 2
replace grandsons_alive = 0 if grandsons_alive == .
keep if AGE_IN_SUI > 0 & AGE_IN_SUI <= 80
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
bysort AGE_IN_SUI: egen mean_gsons_alive = mean(grandsons_alive)
line mean_gsons_alive AGE_IN_SUI if first_in_age, scheme(s1mono) ytitle("Mean # living grandsons") xtitle("Age in sui")
bysort AGE_IN_SUI: egen p_gsons_alive_0 = mean(grandsons_alive == 0)
bysort AGE_IN_SUI: egen p_gsons_alive_1 = mean(grandsons_alive == 1)
bysort AGE_IN_SUI: egen p_gsons_alive_2 = mean(grandsons_alive == 2)
bysort AGE_IN_SUI: egen p_gsons_alive_3 = mean(grandsons_alive == 3)
bysort AGE_IN_SUI: egen p_gsons_alive_4 = mean(grandsons_alive == 4)
bysort AGE_IN_SUI: egen p_gsons_alive_gt_5 = mean(grandsons_alive >= 5)
line p_gsons_alive_0 p_gsons_alive_1 p_gsons_alive_2 p_gsons_alive_3 p_gsons_alive_4 p_gsons_alive_gt_5 AGE_IN_SUI if first_in_age, ///
    scheme(s1mono) ytitle("Prop. of men with # of living grandsons") xtitle("Age in sui") ///
    legend(order(1 "0" 2 "1" 3 "2" 4 "3" 5 "4" 6 "5+"))
Locating members of the same generation
• Brothers
  • Males in the same YEAR who have the same FATHER_ID
• Sisters
  • Never-married females in the same YEAR who have the same FATHER_ID
• Sisters-in-law
  • Married or widowed females in the same YEAR who have the same FATHER_ID
• Male cousins
  • Males in the same YEAR with the same GRANDFATHER_ID but a different FATHER_ID
• Male second cousins
  • Males in the same YEAR with the same F_ID_3 but a different GRANDFATHER_ID and FATHER_ID
Numbers of living brothers
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if FATHER_ID != "-99" & PRESENT & SEX == 2
* Present sons of the same father, excluding the individual himself
bysort FATHER_ID YEAR: generate brothers_alive = _N-1
bysort FATHER_ID YEAR: keep if _n == 1
keep FATHER_ID YEAR brothers_alive
save brothers_alive, replace
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if FATHER_ID != "-99" & SEX == 2
merge m:1 FATHER_ID YEAR using brothers_alive, keep(match master)
replace brothers_alive = 0 if brothers_alive == .
keep if AGE_IN_SUI > 0 & AGE_IN_SUI <= 80
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
bysort AGE_IN_SUI: egen mean_brothers_alive = mean(brothers_alive)
line mean_brothers_alive AGE_IN_SUI if first_in_age, scheme(s1mono) ytitle("Mean number of living brothers") xtitle("Age in sui")
bysort AGE_IN_SUI: egen p_brothers_alive_0 = mean(brothers_alive == 0)
bysort AGE_IN_SUI: egen p_brothers_alive_1 = mean(brothers_alive == 1)
bysort AGE_IN_SUI: egen p_brothers_alive_2 = mean(brothers_alive == 2)
bysort AGE_IN_SUI: egen p_brothers_alive_3 = mean(brothers_alive == 3)
bysort AGE_IN_SUI: egen p_brothers_alive_4 = mean(brothers_alive == 4)
bysort AGE_IN_SUI: egen p_brothers_alive_gt_5 = mean(brothers_alive >= 5)
line p_brothers_alive_0 p_brothers_alive_1 p_brothers_alive_2 p_brothers_alive_3 p_brothers_alive_4 p_brothers_alive_gt_5 AGE_IN_SUI if first_in_age, ///
    scheme(s1mono) ytitle("Prop. of men with specified # of living brothers") xtitle("Age in sui") ///
    legend(order(1 "0" 2 "1" 3 "2" 4 "3" 5 "4" 6 "5+"))
Numbers of living cousins
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if GRANDFATHER_ID != "-99" & FATHER_ID != "-99" & PRESENT == 1 & SEX == 2
* Here brothers_alive includes the individual himself
bysort FATHER_ID YEAR: generate brothers_alive = _N
bysort GRANDFATHER_ID YEAR: generate cousins_alive = _N - brothers_alive
* Cousin counts differ between sets of brothers, so keep one record per father, not per grandfather
bysort GRANDFATHER_ID FATHER_ID YEAR: keep if _n == 1
keep GRANDFATHER_ID FATHER_ID YEAR cousins_alive
save cousins_alive, replace
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if GRANDFATHER_ID != "-99" & FATHER_ID != "-99" & SEX == 2
merge m:1 GRANDFATHER_ID FATHER_ID YEAR using cousins_alive, keep(match master)
replace cousins_alive = 0 if cousins_alive == .
keep if AGE_IN_SUI > 0 & AGE_IN_SUI <= 80
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
bysort AGE_IN_SUI: egen mean_cousins_alive = mean(cousins_alive)
line mean_cousins_alive AGE_IN_SUI if first_in_age, scheme(s1mono) ytitle("Mean number of living cousins") xtitle("Age in sui")
bysort AGE_IN_SUI: egen p_cousins_alive_0 = mean(cousins_alive == 0)
bysort AGE_IN_SUI: egen p_cousins_alive_1 = mean(cousins_alive == 1)
bysort AGE_IN_SUI: egen p_cousins_alive_2 = mean(cousins_alive == 2)
bysort AGE_IN_SUI: egen p_cousins_alive_3 = mean(cousins_alive == 3)
bysort AGE_IN_SUI: egen p_cousins_alive_4 = mean(cousins_alive == 4)
bysort AGE_IN_SUI: egen p_cousins_alive_gt_5 = mean(cousins_alive >= 5)
line p_cousins_alive_0 p_cousins_alive_1 p_cousins_alive_2 p_cousins_alive_3 p_cousins_alive_4 p_cousins_alive_gt_5 AGE_IN_SUI if first_in_age, ///
    scheme(s1mono) ytitle("Prop. of men with specified # of living cousins") xtitle("Age in sui") ///
    legend(order(1 "0" 2 "1" 3 "2" 4 "3" 5 "4" 6 "5+"))
Second cousins
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge 1:1 RECORD_NUMBER using "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0004\27063-0004-Data.dta", keepusing(F_ID_3)
keep if PRESENT == 1 & SEX == 2 & F_ID_3 != "-99" & FATHER_ID != "-99" & GRANDFATHER_ID != "-99"
* cousins counts all grandsons of the same grandfather, including brothers and the individual himself
bysort GRANDFATHER_ID YEAR: generate cousins = _N
bysort F_ID_3 YEAR: generate second_cousins = _N - cousins
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
bysort AGE_IN_SUI: egen second_cousins_mean = mean(second_cousins)
line second_cousins_mean AGE_IN_SUI if first_in_age & AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80, ///
    scheme(s1mono) ytitle("Mean number of second cousins") xtitle("Age in sui")
Coresidence
• Can also group on HOUSEHOLD_ID to distinguish between sets of kin living in the same household and those living in separate households
• Likewise, could group on the village identifier to distinguish kin in the same village from kin in other villages
• Coresidence only makes sense from 1789 on
Second cousins by co-residence
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
merge 1:1 RECORD_NUMBER using "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0004\27063-0004-Data.dta", keepusing(F_ID_3)
keep if PRESENT == 1 & SEX == 2 & F_ID_3 != "-99" & FATHER_ID != "-99" & GRANDFATHER_ID != "-99" & HOUSEHOLD_ID != "-99"
bysort GRANDFATHER_ID YEAR: generate cousins = _N
bysort GRANDFATHER_ID HOUSEHOLD_ID YEAR: generate cousins_hh = _N
bysort F_ID_3 YEAR: generate second_cousins = _N - cousins
bysort F_ID_3 HOUSEHOLD_ID YEAR: generate second_cousins_hh = _N - cousins_hh
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
bysort AGE_IN_SUI: egen second_cousins_mean = mean(second_cousins)
bysort AGE_IN_SUI: egen second_cousins_hh_mean = mean(second_cousins_hh)
line second_cousins_mean second_cousins_hh_mean AGE_IN_SUI if first_in_age & AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80, ///
    scheme(s1mono) ytitle("Mean") xtitle("Age in sui") ///
    legend(order(1 "Second cousins" 2 "Second cousins in household"))
Variables measured according to location within a generation
• Can sort members grouped by GRANDFATHER_ID, FATHER_ID, etc. to measure characteristics relative to other members of the same generation
• Sort men with the same FATHER_ID by BIRTHYEAR to order brothers according to seniority
• Can count up unmarried older brothers (for example) with a running total of MARITAL_STATUS == 2 within FATHER_ID and YEAR, subtracting 1 for men who are themselves unmarried so that they are not counted as their own older brother
Older unmarried brothers
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if SEX == 2 & FATHER_ID != "-99" & PRESENT
generate byte unmarried = MARITAL_STATUS == 2
* Running total of unmarried brothers, ordered from oldest to youngest
bysort FATHER_ID YEAR (BIRTHYEAR): gen older_unmarried_brothers = sum(unmarried)
* The running total includes the individual himself if he is unmarried
replace older_unmarried_brothers = older_unmarried_brothers - 1 if MARITAL_STATUS == 2
tab older_unmarried_brothers
bysort AGE_IN_SUI: egen mean_older_unmarried_brothers = mean(older_unmarried_brothers)
bysort AGE_IN_SUI: generate byte first_in_age = _n == 1
line mean_older_unmarried_brothers AGE_IN_SUI if first_in_age & AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80, ///
    scheme(s1mono) ytitle("Mean number of older unmarried brothers") xtitle("Age")
Kin in other generations
• Uncles
  • Men whose FATHER_ID is the same as the individual’s GRANDFATHER_ID
  • But whose PERSON_ID is not the individual’s FATHER_ID
• Nephews
  • Men whose GRANDFATHER_ID is the same as the individual’s FATHER_ID
  • But whose FATHER_ID is not the individual’s PERSON_ID
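The nephew rule can be sketched on toy data as the number of grandsons of a man's father minus the man's own sons. This is an illustrative sketch, not code from the lecture: the identifiers are made up, the variable names fathers_grandsons, own_sons and nephews are hypothetical, and it assumes that a son's GRANDFATHER_ID is recorded whenever his father's FATHER_ID is. Run it as a single do-file, since it uses temporary files.

```stata
* Toy family (hypothetical values): G has sons A and B; A has A1 and A2; B has B1
clear
input str3 PERSON_ID str3 FATHER_ID str3 GRANDFATHER_ID YEAR SEX PRESENT
"G"  "-99" "-99" 1800 2 1
"A"  "G"   "-99" 1800 2 1
"B"  "G"   "-99" 1800 2 1
"A1" "A"   "G"   1800 2 1
"A2" "A"   "G"   1800 2 1
"B1" "B"   "G"   1800 2 1
end
tempfile people grandsons sons
save `people'

* Grandsons of each man's father: present males grouped by GRANDFATHER_ID
keep if SEX == 2 & PRESENT == 1 & GRANDFATHER_ID != "-99"
bysort GRANDFATHER_ID YEAR: generate fathers_grandsons = _N
bysort GRANDFATHER_ID YEAR: keep if _n == 1
keep GRANDFATHER_ID YEAR fathers_grandsons
rename GRANDFATHER_ID FATHER_ID
save `grandsons'

* Own sons: present males grouped by FATHER_ID
use `people', clear
keep if SEX == 2 & PRESENT == 1 & FATHER_ID != "-99"
bysort FATHER_ID YEAR: generate own_sons = _N
bysort FATHER_ID YEAR: keep if _n == 1
keep FATHER_ID YEAR own_sons
rename FATHER_ID PERSON_ID
save `sons'

* Nephews = father's grandsons minus own sons (undefined if father is unknown)
use `people', clear
merge m:1 FATHER_ID YEAR using `grandsons', keep(match master) nogenerate
merge m:1 PERSON_ID YEAR using `sons', keep(match master) nogenerate
replace fathers_grandsons = 0 if fathers_grandsons == .
replace own_sons = 0 if own_sons == .
generate nephews = fathers_grandsons - own_sons if FATHER_ID != "-99"
list PERSON_ID nephews
```

In the toy family, A has one nephew (B1) and B has two (A1 and A2); G's count is left missing because his own father is unknown.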
Uncles
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if SEX == 2 & PRESENT & FATHER_ID != "-99"
* Sons of each man; from the grandson's point of view these are his father plus his uncles
bysort FATHER_ID YEAR: generate uncles = _N
bysort FATHER_ID YEAR: keep if _n == _N
keep FATHER_ID YEAR uncles
rename FATHER_ID GRANDFATHER_ID
save uncles, replace
* Flag fathers who are present, so that they can be subtracted from the count later
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if SEX == 2 & PRESENT & FATHER_ID != "-99"
bysort PERSON_ID YEAR: keep if _n == _N
generate byte father_alive = 1
keep PERSON_ID FATHER_ID YEAR father_alive
rename FATHER_ID GRANDFATHER_ID
rename PERSON_ID FATHER_ID
save father, replace
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if GRANDFATHER_ID != "-99" & FATHER_ID != "-99"
keep if SEX == 2 & PRESENT
merge m:1 GRANDFATHER_ID YEAR using uncles, keep(match master)
drop _merge
merge m:1 GRANDFATHER_ID FATHER_ID YEAR using father, keep(match master)
replace father_alive = 0 if father_alive == .
replace uncles = 0 if uncles == .
tab uncles father_alive
* Remove the individual's own father from the count of his grandfather's sons
replace uncles = uncles - father_alive
bysort AGE_IN_SUI: generate first_in_age = _n == 1
bysort AGE_IN_SUI: egen mean_uncles = mean(uncles)
line mean_uncles AGE_IN_SUI if first_in_age & AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80, ///
    scheme(s1mono) xtitle("Age in sui") ytitle("Mean number of uncles")
Kin with specified characteristics
• Use egen to count up kin with specified characteristics
• The following example counts up uncles with position
• Could just as well count up uncles meeting any criteria of interest
  • Age, marital status, etc.
Uncles with position
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if SEX == 2 & PRESENT & FATHER_ID != "-99"
* Number of each man's sons who hold a position
bysort FATHER_ID YEAR: egen uncles_with_position = total(HAS_POSITION == 1)
bysort FATHER_ID YEAR: keep if _n == _N
keep FATHER_ID YEAR uncles_with_position
rename FATHER_ID GRANDFATHER_ID
save uncles_with_position, replace
* Flag fathers who are present and hold a position
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if SEX == 2 & PRESENT & FATHER_ID != "-99" & HAS_POSITION
bysort PERSON_ID YEAR: keep if _n == _N
generate byte father_has_position = 1
keep PERSON_ID FATHER_ID YEAR father_has_position
rename FATHER_ID GRANDFATHER_ID
rename PERSON_ID FATHER_ID
save father_has_position, replace
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if GRANDFATHER_ID != "-99" & FATHER_ID != "-99"
keep if SEX == 2 & PRESENT
merge m:1 GRANDFATHER_ID YEAR using uncles_with_position, keep(match master)
drop _merge
merge m:1 GRANDFATHER_ID FATHER_ID YEAR using father_has_position, keep(match master)
replace father_has_position = 0 if father_has_position == .
replace uncles_with_position = 0 if uncles_with_position == .
tab uncles_with_position father_has_position
* Remove the individual's own father from the count if he holds a position
replace uncles_with_position = uncles_with_position - father_has_position
bysort AGE_IN_SUI: generate first_in_age = _n == 1
bysort AGE_IN_SUI: egen mean_uncles_with_position = mean(uncles_with_position)
line mean_uncles_with_position AGE_IN_SUI if first_in_age & AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80, ///
    scheme(s1mono) xtitle("Age in sui") ytitle("Mean number of uncles with position")
Fixed characteristics of individuals in previous times
• So far all examples have been characteristics of kin in the same year
• For analysis of the influence of early-life characteristics, can construct measures of interest at one age (e.g. number of brothers when first observed) and copy them forward to later records
• In some cases, would like to collapse information from multiple records of father, grandfather, etc. to produce a single variable
• Typical example: did a father or grandfather hold a position at any point in his life
  • Regardless of whether he is still alive
  • Indeed, regardless of whether the grandfather died before the index individual was born
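Both ideas, copying a value from the first observation forward and collapsing several records into an "ever" indicator, can be sketched with by-group subscripting and egen on toy data. The values below are made up, and the variable names brothers_at_first_obs and ever_had_position are hypothetical.

```stata
* Toy person-year records (hypothetical values)
clear
input str3 PERSON_ID YEAR brothers_alive HAS_POSITION
"P1" 1800 2 0
"P1" 1803 1 1
"P1" 1806 0 0
"P2" 1803 0 0
"P2" 1806 1 0
end

* Order each person's records by YEAR and copy the first value to all of them
bysort PERSON_ID (YEAR): generate brothers_at_first_obs = brothers_alive[1]

* Collapse across a person's records: 1 if a position was held in any year
bysort PERSON_ID: egen byte ever_had_position = max(HAS_POSITION == 1)

list, sepby(PERSON_ID)
```

Testing HAS_POSITION == 1 rather than taking the maximum of HAS_POSITION itself keeps any negative codes from being mistaken for a position. The lecture code instead keeps one record per position holder and merges it onto sons and grandsons; the egen form is an alternative when the indicator is needed on the person's own records.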
Father, grandfather ever held position
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
* One record for each man who ever held a position
keep if SEX == 2 & HAS_POSITION == 1 & PRESENT
bysort PERSON_ID: keep if _n == 1
rename HAS_POSITION FATHER_EVER_HAD_POSITION
keep PERSON_ID FATHER_EVER_HAD_POSITION
rename PERSON_ID FATHER_ID
save father_ever_held_position, replace
* The same list, renamed so that it can be matched on GRANDFATHER_ID
rename FATHER_ID GRANDFATHER_ID
rename FATHER_EVER_HAD_POSITION GF_EVER_HAD_POSITION
save gf_ever_held_position, replace
use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta", clear
keep if SEX == 2 & FATHER_ID != "-99" & GRANDFATHER_ID != "-99"
keep if PRESENT
merge m:1 FATHER_ID using father_ever_held_position, keep(match master)
replace FATHER_EVER_HAD_POSITION = 0 if FATHER_EVER_HAD_POSITION == .
drop _merge
merge m:1 GRANDFATHER_ID using gf_ever_held_position, keep(match master)
drop _merge
replace GF_EVER_HAD_POSITION = 0 if GF_EVER_HAD_POSITION == .
generate age_group = 1 + 5*int((AGE_IN_SUI-1)/5)
generate ever_married = MARITAL_STATUS != 2
keep if MARITAL_STATUS >= 1
xi: logit HAS_POSITION i.age_group FATHER_EVER_HAD_POSITION GF_EVER_HAD_POSITION if AGE_IN_SUI >= 21 & AGE_IN_SUI <= 50 & HAS_POSITION >= 0
Logistic regression                               Number of obs   =     363157
                                                  LR chi2(7)      =    9814.74
                                                  Prob > chi2     =     0.0000
Log likelihood = -33618.005                       Pseudo R2       =     0.1274

------------------------------------------------------------------------------
HAS_POSITION |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iage_gro~_6 |  (omitted)
…
_Iage_gr~_41 |   -.163177   .0378302    -4.31   0.000    -.2373228   -.0890313
_Iage_gr~_46 |  (omitted)
…
_Iage_gr~721 |  (omitted)
FATHER_EVE~N |   2.337024   .0264432    88.38   0.000     2.285196    2.388852
GF_EVER_HA~N |   .7563838   .0323022    23.42   0.000     .6930727    .8196948
       _cons |  -3.813408   .0280527  -135.94   0.000     -3.86839   -3.758426
------------------------------------------------------------------------------
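The coefficients in the logit output are on the log-odds scale. As a small worked example, exponentiating the two kinship coefficients reported above converts them to odds ratios; the scalar names b_father and b_gf are made up for this sketch.

```stata
* Coefficients copied from the logit output above
scalar b_father = 2.337024
scalar b_gf     = .7563838

* Odds ratios are the exponentiated log-odds coefficients
display "Odds ratio, father ever held position:      " %6.2f exp(b_father)
display "Odds ratio, grandfather ever held position: " %6.2f exp(b_gf)
```

This gives odds ratios of about 10.35 for the father and 2.13 for the grandfather, net of age group. Adding the or option to the logit command reports odds ratios directly.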