set more off
capture drop _all
cd "c:\DATA\NBMaine\2008_08\Final\5Outcomes"
capture log close
log using McFadden, replace
set memory 250000
set matsize 5000

**************************************
*McFadden Base
*This is a five-outcome version that incorporates the decision to work 0 weeks
*Works with log wages
*(just enters log wage levels as summarizing the attractiveness of each positive weeks category;
* the unknown income associated with nonwork enters the constant term; implicitly we assume it's the same
* for everyone, or randomly assigned.
*Predicts Wages from all years and regions separately
*(discards top and bottom 2.5% of calculated wages first to eliminate cases of clear meas. error)
*then imputes UI benefits and counterfactual part-year income using each obs's predicted wage
*****
*Like the June 08 version, this version also calculates UI benefits at every integer level of weeks worked,
** then takes weighted averages within broad intervals using 1980 weeks-worked distributions within intervals....
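*In symbols, a sketch of the averaging described above (the notation is illustrative and is not taken
* from the original files):
*     B_k = sum over w in interval k of  s80(w|k) * B(w)
* where B(w) is the UI benefit at w integer weeks worked, and s80(w|k) is the 1980 share of workers in
* interval k who worked exactly w weeks (so the shares sum to one within each interval).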
*****
*ages 25-59 only, also excludes the self-employed and unpaid family workers

***********************
*CHOOSE MEN (0) OR WOMEN (1) HERE:
scalar femalemodel= 0
************************

use "c:\DATA\NBMaine\2008_08\Final\MENB microdata counties.dta"

****************
*SEX:
gen byte female = 0
replace female = 1 if sex_me == 2
replace female = 1 if sex_nb == 1

if femalemodel==1 {
    global sex "Women"
    keep if female==1
}
if femalemodel==0 {
    global sex "Men"
    keep if female==0
}
**************************

***************************
*COUNTRIES: Maine = 1, NB = 2
***************************
gen byte nb = country == 2
gen byte me = country == 1
label define country 1 "Maine" 2 "NB"
label values country country
tab country

*KEEP AGES 25-59 only
drop if age_me < 25 & country==1
drop if age_me > 59 & country==1
drop if age_nb < 25 & country==2
drop if age_nb > 59 & country==2

*************
*YEAR DUMMIES
*For simplicity '71/'81/'91 for NB labelled as '70/'80/'90
*************
gen year = .
replace year = 1970 if year_me == 97
replace year = 1980 if year_me == 98
replace year = 1990 if year_me == 99
replace year = 1970 if year_nb == 1970
replace year = 1980 if year_nb == 1980
replace year = 1990 if year_nb == 1990

gen byte y1970 =0
replace y1970=1 if year == 1970
gen byte y1980 =0
replace y1980=1 if year == 1980
gen byte y1990 =0
replace y1990 =1 if year == 1990

**************
*WEEKS WORKED:
**************
*NOTES ON THE WEEKS VARIABLES:
* wkswrk: categorical (0-6) ME only, all years
* wks1971: categorical (0-6) NB 1971 only
* numweeks: actual no of weeks, NB 81 and 91 only
* wkswork: actual no of weeks, ME 80 and 90 only
* weeks: categorical variable, generated below (0-4), consistent across all years and countries

*CREATE THE CONSISTENT "WEEKS" VARIABLE (Five categories, consistent across all regions and years):
d wk*
gen weeks=.
label define weeks 0 "0" 1 "1-13" 2 "14-26" 3 "27-39" 4 "40-52"
label values weeks weeks

******
*MAINE
******
tab wksint_me year if me==1, missing
tab wks_me year if me==1, missing
*0 weeks worked
replace weeks = 0 if wksint_me == 0
*1-13 weeks worked
replace weeks = 1 if wksint_me == 1
*14-26 weeks worked
replace weeks = 2 if wksint_me == 2
*27-39 weeks worked
replace weeks = 3 if wksint_me == 3
*40-52 weeks worked
replace weeks = 4 if wksint_me == 4
replace weeks = 4 if wksint_me == 5
replace weeks = 4 if wksint_me == 6
tab year weeks if me==1, missing

***
*NB
***
tab wksint71_nb if nb==1 & year==1970, missing
tab wks_nb if nb==1 & year==1980, missing
tab wks_nb if nb==1 & year==1990, missing

*1971
*0 weeks worked
replace weeks = 0 if wksint71_nb == 0 | wksint71_nb == 1
*1-13 weeks worked
replace weeks = 1 if wksint71_nb == 2
*14-26 weeks worked
replace weeks = 2 if wksint71_nb == 3
*27-39 weeks worked
replace weeks = 3 if wksint71_nb == 4
*40-52 weeks worked
replace weeks = 4 if wksint71_nb == 5
replace weeks = 4 if wksint71_nb == 6

*1981/1991
*0 weeks worked
replace weeks=0 if wks_nb == 0 | wks_nb == 99
*1-13 weeks worked
replace weeks=1 if wks_nb >=1 & wks_nb <= 13
*14-26 weeks worked
replace weeks=2 if wks_nb >=14 & wks_nb <= 26
*27-39 weeks worked
replace weeks=3 if wks_nb >=27 & wks_nb <= 39
*40-52 weeks worked
replace weeks=4 if wks_nb >=40 & wks_nb <= 52

sort year
by year: tab nb weeks, row

****************************
*GENERATE BASIC DEMOGRAPHICS
****************************
gen byte married = 0
replace married = 1 if (marital_me == 1 | marital_me == 2)
replace married = 1 if marital_nb == 2

gen age = .
replace age = age_me if me==1
replace age = age_nb if nb==1
gen age_sq = (age*age)/100

gen byte children = 0
replace children = 1 if kids_me >= 1 & kids_me <= 9
replace children = 1 if (child71_nb >= 2 & child71_nb <= 15) & year_nb == 1970 & female == 1
replace children = 1 if fsize71_nb >=2 & female == 0 & year_nb == 1970 & married == 1
replace children = 1 if house_nb ==3 & year_nb == 1980
replace children = 1 if house_nb ==4 & year_nb == 1980
replace children = 1 if house_nb ==8 & year_nb == 1980
replace children = 1 if house_nb ==3 & year_nb == 1990
replace children = 1 if house_nb ==4 & year_nb == 1990
replace children = 1 if house_nb ==7 & year_nb == 1990
replace children = 1 if house_nb ==8 & year_nb == 1990
replace children = 1 if house_nb ==9 & year_nb == 1990
replace children = 1 if house_nb ==10 & year_nb == 1990

gen byte inschool = 0
replace inschool = 1 if attend_me == 2
replace inschool = 1 if (attend_nb == 1 | attend_nb == 2) & year==1970
replace inschool = 1 if (attend_nb == 2 | attend_nb == 3) & year==1980
replace inschool = 1 if (attend_nb == 2 | attend_nb == 3) & year==1990

**********
*EDUCATION
**********
*NOTES:
*For education, we can't get the two countries completely comparable after highschool for 1970 Cdn census
*US is:
*1) Grade 11 and lower
*2) Grade 12 with _and_ without high school diploma
*3) Some college, no degree _and_ 'occupation associate degrees' (whatever that is)
*4) Academic associate degree, BA, MA, etc.
*US 1970/80 censuses have 1-3 years and 4 years with no more specific info on credentials until 1990
gen byte nohigh = 0
gen byte highschl = 0
gen byte somepost = 0
gen byte degree = 0

*ME
replace nohigh =1 if educ_me < 7 & year == 1970
replace highschl =1 if educ_me == 7 & year == 1970
replace somepost =1 if educ_me == 8 & year == 1970
replace degree =1 if educ_me == 9 & year == 1970
replace nohigh =1 if educ_me < 7 & year == 1980
replace highschl =1 if educ_me == 7 & year == 1980
replace somepost =1 if educ_me == 8 & year == 1980
replace degree =1 if educ_me == 9 & year == 1980
replace nohigh =1 if educ_me < 7 & year == 1990
replace highschl =1 if educ_me == 7 & year == 1990
replace somepost =1 if educ_me == 8 & year == 1990
replace degree =1 if educ_me == 9 & year == 1990

*NB, 1970
replace nohigh =1 if educ71_nb < 6 & year == 1970
replace highschl =1 if (educ71_nb == 6 | educ71_nb == 7) & year == 1970
*Set somepost=1 if you're 1 to 2 years or 3 to 4 years with no degree (for Canada)
*For US it's 1-3 years
replace somepost =1 if (educ71_nb == 8 | educ71_nb == 9) & year == 1970
replace degree =1 if educ71_nb >= 10 & educ71_nb <= 12 & year == 1970

*Allocate 6,7,8 to high school categories based on highest grade
*NB, 1980
replace nohigh = 1 if school_nb < 3 & year_nb == 1980
replace nohigh = 1 if (school_nb == 3 & high_nb < 6) & year_nb == 1980
replace highschl = 1 if (school_nb == 3 & high_nb > 5) & year_nb == 1980
replace highschl = 1 if school_nb == 4 & year_nb == 1980
*We're putting secondary trades certificates in here but 70% have less than grade 12 so judgement call
replace highschl = 1 if school_nb == 5 & year_nb == 1980
replace nohigh = 1 if (school_nb ==6 & high_nb < 6) & year_nb == 1980
replace highschl =1 if (school_nb == 6 & high_nb > 5) & year_nb == 1980
replace nohigh = 1 if (school_nb ==7 & high_nb < 6) & year_nb == 1980
replace highschl =1 if (school_nb == 7 & high_nb > 5) & year_nb == 1980
replace nohigh = 1 if (school_nb ==8 & high_nb < 6) & year_nb == 1980
replace highschl =1 if (school_nb == 8 & high_nb > 5) & year_nb == 1980
replace somepost = 1 if (school_nb > 8 & school_nb < 11) & year_nb == 1980
replace degree = 1 if school_nb == 11 & year_nb == 1980

*NB, 1990
replace nohigh = 1 if school_nb < 3 & year_nb == 1990
replace nohigh = 1 if (school_nb == 3 & high_nb < 6) & year_nb == 1990
replace highschl = 1 if (school_nb == 3 & high_nb > 5) & year_nb == 1990
replace highschl = 1 if school_nb == 4 & year_nb == 1990
replace highschl = 1 if school_nb == 5 & year_nb == 1990
replace nohigh = 1 if (school_nb ==6 & high_nb < 6) & year_nb == 1990
replace highschl =1 if (school_nb == 6 & high_nb > 5) & year_nb == 1990
replace nohigh = 1 if (school_nb ==7 & high_nb < 6) & year_nb == 1990
replace highschl =1 if (school_nb == 7 & high_nb > 5) & year_nb == 1990
replace nohigh = 1 if (school_nb ==8 & high_nb < 6) & year_nb == 1990
replace highschl =1 if (school_nb == 8 & high_nb > 5) & year_nb == 1990
replace somepost = 1 if (school_nb >= 9 & school_nb <= 10) & year_nb == 1990
replace degree = 1 if (school_nb >= 11 & school_nb <= 14) & year_nb == 1990
************************
************************
************************

sort year
by year: ci nohigh highschl somepost degree if nb==1
by year: ci nohigh highschl somepost degree if nb==0

***********
*Industries (for D Stats only)
***********

**************************
*Basic industry categories
*For comparability with aggregate analysis
gen byte agric = 0
gen byte primary=0
gen byte manuf=0
gen byte const=0
gen byte trans=0
gen byte trade=0
gen byte finance=0
gen byte services=0
gen byte public=0
gen byte na=0
**************************

*ME
*Looks like the only difference between the 1950 industry classification and the 1 digit SICs for Canada
*census is logging
replace na = 1 if ind50_me == 0
replace agric=1 if ind50_me == 105
replace primary =1 if ind50_me >= 116 & ind50_me <= 236
*re-classify logging
replace primary =1 if ind50_me == 306
replace const =1 if ind50_me == 246
replace manuf = 1 if ind50_me >= 307 & ind50_me <= 499
replace trans =1 if ind50_me >= 506 & ind50_me <= 598
replace trade =1 if ind50_me >=606 & ind50_me <= 699
replace finance =1 if ind50_me >=716 & ind50_me <= 756
replace services =1 if ind50_me >= 806 & ind50_me <= 899
replace public = 1 if ind50_me >= 906 & ind50_me <= 936

*NB
*1970
replace agric = 1 if ind_nb == 1 & year == 1970
replace primary = 1 if ind_nb == 2 & year == 1970
replace primary = 1 if ind_nb == 3 & year == 1970
replace primary = 1 if ind_nb == 4 & year == 1970
replace manuf =1 if ind_nb == 5 & year == 1970
replace const =1 if ind_nb == 6 & year == 1970
replace trans =1 if ind_nb == 7 & year == 1970
replace trade=1 if ind_nb == 8 & year == 1970
replace finance=1 if ind_nb == 9 & year == 1970
replace services=1 if ind_nb == 10 & year == 1970
replace public=1 if ind_nb == 11 & year == 1970
replace na = 1 if ind_nb == 12 & year == 1970

*1980
replace agric = 1 if ind_nb == 1 & year == 1980
replace primary = 1 if ind_nb == 2 & year == 1980
replace manuf =1 if ind_nb == 3 & year == 1980
replace const =1 if ind_nb == 4 & year == 1980
replace trans =1 if ind_nb == 5 & year == 1980
replace trans =1 if ind_nb == 6 & year == 1980
replace trans =1 if ind_nb == 7 & year == 1980
replace trade=1 if ind_nb == 8 & year == 1980
replace trade=1 if ind_nb == 9 & year == 1980
replace finance=1 if ind_nb ==10 & year == 1980
replace services=1 if (ind_nb >=11 & ind_nb<=17) & year == 1980
replace public=1 if ind_nb ==18 & year == 1980
replace na = 1 if ind_nb ==19 & year == 1980

*1990
replace agric = 1 if ind_nb == 1 & year == 1990
replace primary = 1 if ind_nb == 2 & year == 1990
replace manuf =1 if ind_nb == 3 & year == 1990
replace const =1 if ind_nb == 4 & year == 1990
replace trans =1 if ind_nb == 5 & year == 1990
replace trans =1 if ind_nb == 6 & year == 1990
replace trade=1 if ind_nb == 7 & year == 1990
replace trade=1 if ind_nb == 8 & year == 1990
replace finance=1 if ind_nb == 9 & year == 1990
replace services=1 if ind_nb ==10 & year == 1990
replace services=1 if (ind_nb >=13 & ind_nb <=16) & year == 1990
replace public=1 if ind_nb ==11 & year == 1990
replace public=1 if ind_nb ==12 & year == 1990
replace na = 1 if ind_nb ==17 & year == 1990

gen industry = .
replace industry = 1 if agric == 1
replace industry = 2 if primary == 1
replace industry = 3 if manuf == 1
replace industry = 4 if const == 1
replace industry = 5 if trans == 1
replace industry = 6 if trade == 1
replace industry = 7 if finance == 1
replace industry = 8 if services == 1
replace industry = 9 if public == 1
replace industry = . if na == 1
tab industry, missing
replace industry=. if weeks==0
tab industry, missing

replace agric = . if industry== .
replace primary = . if industry== .
replace manuf = . if industry== .
replace const = . if industry== .
replace trans = . if industry== .
replace trade = . if industry== .
replace finance = . if industry== .
replace services = . if industry== .
replace public = . if industry== .

label define industry 1 agr 2 prim 3 manuf 4 const 5 trans 6 trade 7 fin 8 serv 9 pub
label values industry industry

log close
log using DStats, replace

******************
*Descriptive Statistics:
******************
display "$sex"
sort me year
by me year: ci age nohigh highschl somepost degree married children inschool agric-public

*Within-category weeks worked distributions for 1980, needed to compute UI policy variable by category:
sort weeks
by weeks: tab wks_nb if year==1980 & nb==1
by weeks: tab wks_me if year==1980 & me==1
table year weeks if nb==1, c(mean wks_nb)
table year weeks if me==1, c(mean wks_me)

*Share zero weeks, for Table 6:
gen byte zeroweeks = weeks==0
sort country year
by country year: ci zeroweeks

log close
log using McFadden, append

*************
*WEEKLY WAGES
*************
*The "wages" variable gives total annual earnings in all countries/years:
sort nb year
by nb year: sum wages_nb wages_me
gen wages = .
replace wages = wages_nb if nb==1 & wages_nb~=0
replace wages = wages_me if me==1 & wages_me~=0
by nb year: sum wages

gen wklywage = .
*1980/90, ME
replace wklywage = wages/wks_me if year>1970 & country==1
table country year, c(mean wklywage)
*1981/91, NB
replace wklywage = wages/wks_nb if year>1970 & country==2
table country year, c(mean wklywage)

*In 1970, don't have continuous weeks data for either country
*The following are the 1980 within-category means of weeks (from DStats.smcl)
*MEN:
*nb: 9.78, 20.55, 32.64, 50.55
*me: 7.58, 22.26, 33.34, 50.85
*WOMEN:
*nb: 8.29, 19.97, 33.27, 50.29
*me: 7.12, 20.77, 33.98, 50.15
*When necessary (only used to calculate means for wage inflation/deflation), use 1980 midpoints to calculate
* weekly wages in 1970:

*1970, NB and ME:
if femalemodel==0 {
    replace wklywage = wages/9.78 if (weeks==1 & nb==1 & year==1970)
    replace wklywage = wages/20.55 if (weeks==2 & nb==1 & year==1970)
    replace wklywage = wages/32.64 if (weeks==3 & nb==1 & year==1970)
    replace wklywage = wages/50.55 if (weeks==4 & nb==1 & year==1970)
    replace wklywage = wages/7.58 if (weeks==1 & me==1 & year==1970)
    replace wklywage = wages/22.26 if (weeks==2 & me==1 & year==1970)
    replace wklywage = wages/33.34 if (weeks==3 & me==1 & year==1970)
    replace wklywage = wages/50.85 if (weeks==4 & me==1 & year==1970)
}
if femalemodel==1 {
    replace wklywage = wages/8.29 if (weeks==1 & nb==1 & year==1970)
    replace wklywage = wages/19.97 if (weeks==2 & nb==1 & year==1970)
    replace wklywage = wages/33.27 if (weeks==3 & nb==1 & year==1970)
    replace wklywage = wages/50.29 if (weeks==4 & nb==1 & year==1970)
    replace wklywage = wages/7.12 if (weeks==1 & me==1 & year==1970)
    replace wklywage = wages/20.77 if (weeks==2 & me==1 & year==1970)
    replace wklywage = wages/33.98 if (weeks==3 & me==1 & year==1970)
    replace wklywage = wages/50.15 if (weeks==4 & me==1 & year==1970)
}

*TRIMMING
*Drop top 2.5% and bottom 2.5% of calculated weekly wages in each country/year:
*(centile leaves the 2.5th percentile in r(c_1) and the 97.5th in r(c_2); the wklywage~=. condition keeps
* observations with missing wages, which Stata would otherwise treat as larger than any number)
centile wklywage if country==1 & year==1970, centile( 2.5 97.5 )
drop if country==1 & year==1970 & (wklywage < r(c_1) | (wklywage > r(c_2) & wklywage~=.))
centile wklywage if country==1 & year==1980, centile( 2.5 97.5 )
drop if country==1 & year==1980 & (wklywage < r(c_1) | (wklywage > r(c_2) & wklywage~=.))
centile wklywage if country==1 & year==1990, centile( 2.5 97.5 )
drop if country==1 & year==1990 & (wklywage < r(c_1) | (wklywage > r(c_2) & wklywage~=.))
centile wklywage if country==2 & year==1970, centile( 2.5 97.5 )
drop if country==2 & year==1970 & (wklywage < r(c_1) | (wklywage > r(c_2) & wklywage~=.))
centile wklywage if country==2 & year==1980, centile( 2.5 97.5 )
drop if country==2 & year==1980 & (wklywage < r(c_1) | (wklywage > r(c_2) & wklywage~=.))
centile wklywage if country==2 & year==1990, centile( 2.5 97.5 )
drop if country==2 & year==1990 & (wklywage < r(c_1) | (wklywage > r(c_2) & wklywage~=.))
***********************************

table country year, c(mean wklywage)

*Save mean wages by country and year, for later use:
sum wklywage if nb==1 & year==1970
scalar W70NB = r(mean)
sum wklywage if nb==1 & year==1980
scalar W80NB = r(mean)
sum wklywage if nb==1 & year==1990
scalar W90NB = r(mean)
sum wklywage if me==1 & year==1970
scalar W70ME = r(mean)
sum wklywage if me==1 & year==1980
scalar W80ME = r(mean)
sum wklywage if me==1 & year==1990
scalar W90ME = r(mean)

*PREDICT WEEKLY WAGES:
*This just uses the workers to predict wages for all, including the nonworkers
*sum wklywage highschl somepost degree inschool children married age age_sq if nb==1 & year==1970
reg wklywage highschl somepost degree inschool children married age age_sq if nb==1 & year==1970
predict what if nb==1 & year==1970

*sum wklywage highschl somepost degree inschool children married age age_sq if nb==1 & year==1980
reg wklywage highschl somepost degree inschool children married age age_sq if nb==1 & year==1980
predict W if nb==1 & year==1980
replace what=W if nb==1 & year==1980
drop W

*sum wklywage highschl somepost degree inschool children married age age_sq if nb==1 & year==1990
reg wklywage highschl somepost degree inschool children married age age_sq if nb==1 & year==1990
predict W if nb==1 & year==1990
replace what=W if nb==1 & year==1990
drop W

*sum wklywage highschl somepost degree inschool children married age age_sq if nb==0 & year==1970
reg wklywage highschl somepost degree inschool children married age age_sq if nb==0 & year==1970
predict W if nb==0 & year==1970
replace what=W if nb==0 & year==1970
drop W

*sum wklywage highschl somepost degree inschool children married age age_sq if nb==0 & year==1980
reg wklywage highschl somepost degree inschool children married age age_sq if nb==0 & year==1980
predict W if nb==0 & year==1980
replace what=W if nb==0 & year==1980
drop W

*sum wklywage highschl somepost degree inschool children married age age_sq if nb==0 & year==1990
reg wklywage highschl somepost degree inschool children married age age_sq if nb==0 & year==1990
predict W if nb==0 & year==1990
replace what=W if nb==0 & year==1990
drop W

rename wklywage indiv_wklywage
gen wklywage = what
gen wklywageW = what if indiv_wklywage~=.

sort year country
by year country: sum indiv_wklywage wklywage wklywageW
********************************

****************************
*UI Rules:
do UIRules
****************************

***********************************
*Counterfactual versions of the income differential variable, for policy experiments:
do AltUIRules
***********************************

do EstModel
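
*****
*ILLUSTRATIVE SKETCH ONLY (not part of the original analysis):
*The TRIMMING step earlier in this file repeats one centile/drop pair for each country/year cell.
*Under the assumption that the same 2.5/97.5 rule applies to every cell, the pattern could be written
* as a loop, as sketched below on made-up data.
*Everything in the toy dataset (sample size, seed, lognormal wage draw) is hypothetical; preserve/restore
* keeps the sketch from touching the estimation sample.
*****
preserve
clear
set obs 600
set seed 20080801
*Two hypothetical countries (1, 2) and three census years (1970, 1980, 1990):
gen byte country = 1 + (_n > 300)
gen year = 1970 + 10*mod(_n, 3)
*Made-up weekly wages, lognormal:
gen wklywage = exp(5 + invnormal(uniform()))
local n0 = _N
forvalues c = 1/2 {
    foreach y in 1970 1980 1990 {
        *r(c_1) and r(c_2) hold the 2.5th and 97.5th percentiles of this cell
        quietly centile wklywage if country==`c' & year==`y', centile( 2.5 97.5 )
        drop if country==`c' & year==`y' & (wklywage < r(c_1) | (wklywage > r(c_2) & wklywage~=.))
    }
}
display "Toy sample: kept " _N " of `n0' observations after trimming"
restore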