Leveraging Administrative Data for Agricultural Statistics: Insights from the 2010 Agricultural Census
Explore how administrative data was utilized to compile agricultural statistics in the 2010 Census of Agriculture, including data sources, analysis methods, merging processes, lessons learned, and future directions.
Presentation Transcript
Using administrative data to compile agricultural statistics: Experiences from the Census of Agriculture 2010 • Fiona O’Callaghan, CSO • 29th September 2011
Outline • Background • Available data • Analysis/aggregation • Merging of data • Lessons learnt • Future developments • Summary
Background • CSO Statement of Strategy includes the following high-level goals • Minimise response burden, and extend the statistical use of administrative records • Improve the scope, quality & timeliness of our statistics • Achieve greater efficiencies using best practices • SPAR Report - Statistical Potential of Administrative Records
Background • Up until 2010, CSO conducted 2 annual farm surveys (June & Dec) with sample sizes ranging from 15,000 to 20,000 farms. The Farm Structure Survey covered approx. 50,000 farms. • June survey: average response burden approx. 30 mins. • December survey: average response burden approx. 18 mins. • COA 2010: average response burden 26 mins.
Available Data • CSO identified six major data holdings within DAFF (Department of Agriculture, Fisheries and Food) which contained relevant information • Animal Identification and Movement System (AIM) • Single Payment System (SPS) • Organic Database • REPS database • Animal Health Computer System (AHCS) • Corporate Client System (CCS)
Available Data • Three of these databases were used for COA 2010 • CCS was used in developing the register • AIM was used for cattle • SPS was used for crops/cereals
Analysis/aggregation • Corporate Client System • Contains name, address, DOB, herd number etc. • This was merged with the existing CSO Agriculture Register to form a new Register for COA 2010 • Issues with the “unique” identifier, Herd Number, resulted in duplicates • Result – 153,904 Census forms issued
Analysis/aggregation • AIM – involves the use of electronic means to capture data on animal movements through computer links established at livestock markets, meat plants, and export points • Data available since 2002 • At an aggregate level, the AIM figures for the bovine population have been consistently higher than the CSO estimates for the corresponding date (on average approx. 5% higher) • Preliminary comparative analysis performed by CSO in 2008 and 2009 • Eliminated 11 cattle questions from Census form
Analysis/aggregation • AIM data consists of the following variables • Tag number • Herd number • DOB • Gender • Breed • Breed Type (Beef/Dairy) • Animal Class (Cow, bull, etc.) • Date of calving event
Analysis/aggregation • Need to convert this information into totals for the following categories • Breeding Cattle: Dairy Cows; Other Cows; Dairy Heifers*; Other Heifers; Bulls • Other Cattle: Male, 2 years and over; Female, 2 years and over; Male, 1-2 years; Female, 1-2 years; Male, under 1 year; Female, under 1 year • * Heifers in calf intended for the dairy herd
Analysis/aggregation • Information on heifers-in-calf not available in AIM database, but a proxy can be estimated. • Other categories can be derived directly using gender, DOB, animal class. • Eurostat requires further breakdown of categories into animals for slaughter – currently developing a methodology to model this.
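The direct derivation from gender, DOB and animal class could look something like the sketch below. It is illustrative only: the field names (`animal_class`, `gender`, `breed_type`, `dob`), their coded values, and the reference date are assumptions, not the actual AIM layout or the rules CSO applied. Heifers in calf are left out because, as noted above, AIM does not hold that information.

```python
from collections import Counter
from datetime import date

def age_in_years(dob: date, ref: date) -> int:
    """Completed years between date of birth and the reference date."""
    return ref.year - dob.year - ((ref.month, ref.day) < (dob.month, dob.day))

def classify(animal: dict, ref: date) -> str:
    """Map one AIM-style record to a census category (assumed rules)."""
    if animal["animal_class"] == "cow":
        return "Dairy Cows" if animal["breed_type"] == "dairy" else "Other Cows"
    if animal["animal_class"] == "bull":
        return "Bulls"
    # Everything else is banded by sex and age on the reference date.
    age = age_in_years(animal["dob"], ref)
    band = "2 years and over" if age >= 2 else "1-2 years" if age == 1 else "under 1 year"
    sex = "Male" if animal["gender"] == "M" else "Female"
    return f"{sex}: {band}"

def category_totals(animals, ref: date) -> Counter:
    """Count animals per category."""
    return Counter(classify(a, ref) for a in animals)

# Hypothetical records and reference date, for illustration only.
REFERENCE_DATE = date(2010, 6, 1)
sample_animals = [
    {"animal_class": "cow", "gender": "F", "breed_type": "dairy", "dob": date(2005, 3, 1)},
    {"animal_class": "cow", "gender": "F", "breed_type": "beef", "dob": date(2006, 3, 1)},
    {"animal_class": "bull", "gender": "M", "breed_type": "beef", "dob": date(2007, 1, 1)},
    {"animal_class": "other", "gender": "M", "breed_type": "beef", "dob": date(2009, 3, 1)},
    {"animal_class": "other", "gender": "F", "breed_type": "dairy", "dob": date(2010, 2, 1)},
]
sample_totals = category_totals(sample_animals, REFERENCE_DATE)
```

In practice the totals would be produced per herd number before being attached to a holding.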
Analysis/aggregation • SPS • Information on every eligible parcel of land • XY coordinates, Herd Number, Area, Use • Preliminary analysis performed in 2009 • Eliminated 14 crops/cereals questions from Census form
Analysis/aggregation • SPS • XY coordinates used to assign NUTS Region codes at farm level • Approx. 45% of farms are spread over more than one DED (District Electoral Division) - in these instances the DED containing the largest area owned was assigned
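The largest-area rule can be sketched as follows. This assumes each parcel record already carries a DED code (the step that derives it from the XY coordinates is not shown), and the field names `herd_number`, `ded` and `area_ha` are hypothetical. Ties are resolved by whichever DED is seen first, which is a choice made for this sketch and not something stated in the presentation.

```python
from collections import defaultdict

def dominant_ded(parcels):
    """Return {herd_number: DED code with the largest total parcel area}."""
    area = defaultdict(lambda: defaultdict(float))
    for p in parcels:
        area[p["herd_number"]][p["ded"]] += p["area_ha"]
    # For each herd, pick the DED whose summed area is largest.
    return {herd: max(by_ded, key=by_ded.get) for herd, by_ded in area.items()}

# Hypothetical parcels: herd H1 has land in two DEDs, herd H2 in one.
sample_parcels = [
    {"herd_number": "H1", "ded": "D01", "area_ha": 10.0},
    {"herd_number": "H1", "ded": "D02", "area_ha": 6.0},
    {"herd_number": "H1", "ded": "D02", "area_ha": 5.5},
    {"herd_number": "H2", "ded": "D03", "area_ha": 4.0},
]
sample_assignment = dominant_ded(sample_parcels)
```

Here H1 is assigned to D02, because its two parcels there total 11.5 ha against 10.0 ha in D01.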
Merging of data • Three separate data files • Census returns • AIM data • SPS data • “Unique” identifier – Herd Number – but many instances where one farmer could be associated with more than one Herd Number
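One way to handle a farmer with several Herd Numbers is to roll the administrative records up to a holding identifier before joining them to the census returns. The sketch below assumes a lookup table from herd number to holding exists; that table, the holding identifiers, and all field names are hypothetical and are not taken from the CSO or DAFF systems.

```python
from collections import defaultdict

def aggregate_by_holding(records, herd_to_holding, value_key):
    """Sum value_key per holding; return (totals, unmatched herd numbers)."""
    totals = defaultdict(float)
    unmatched = []
    for r in records:
        holding = herd_to_holding.get(r["herd_number"])
        if holding is None:
            unmatched.append(r["herd_number"])  # left for manual follow-up
        else:
            totals[holding] += r[value_key]
    return dict(totals), unmatched

def merge_holdings(census, cattle, crops):
    """Attach admin totals to each census return, defaulting to zero."""
    return {
        h: {**row, "cattle": cattle.get(h, 0.0), "crop_area_ha": crops.get(h, 0.0)}
        for h, row in census.items()
    }

# Hypothetical data: holding F001 operates under two herd numbers.
herd_to_holding = {"A1": "F001", "A2": "F001", "B1": "F002"}
aim_rows = [
    {"herd_number": "A1", "head": 40},
    {"herd_number": "A2", "head": 15},
    {"herd_number": "Z9", "head": 3},
]
sps_rows = [
    {"herd_number": "A1", "area_ha": 30.5},
    {"herd_number": "B1", "area_ha": 12.0},
]
census_rows = {"F001": {"labour_units": 2}, "F002": {"labour_units": 1}}

cattle_totals, unmatched_herds = aggregate_by_holding(aim_rows, herd_to_holding, "head")
crop_totals, _ = aggregate_by_holding(sps_rows, herd_to_holding, "area_ha")
merged = merge_holdings(census_rows, cattle_totals, crop_totals)
```

Herd numbers with no holding, such as Z9 here, are returned separately so they can be checked by hand.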
Merging of data • Labour intensive task of matching by name & address • Issues with Father & Son with same name & address • Different versions of names on different databases – Seamus/James, Sean/John etc. • Non-unique addresses • Farms that returned to CSO as Retired/Dead/Not a Farm etc. but active on admin. data
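Part of the name-and-address matching could in principle be automated, with only the doubtful cases left for clerical review. The sketch below is an assumption about how that might be done, not a description of the CSO process: the variant table holds only the two name pairs mentioned above, the field names are hypothetical, and any census record with zero or several candidates (for example a father and son sharing a name and address) is sent to review.

```python
import unicodedata
from collections import defaultdict

# Hypothetical variant table; only the two pairs named in the slide.
NAME_VARIANTS = {"seamus": "james", "sean": "john"}

def strip_accents(text: str) -> str:
    """Remove diacritics so that e.g. 'Seán' and 'Sean' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def normalise_name(name: str) -> str:
    tokens = strip_accents(name).lower().replace(".", " ").split()
    return " ".join(NAME_VARIANTS.get(t, t) for t in tokens)

def match_key(record: dict) -> tuple:
    address = " ".join(strip_accents(record["address"]).lower().replace(",", " ").split())
    return (normalise_name(record["name"]), address)

def match(census_rows, admin_rows):
    """Return ({census_id: herd_number}, [census_ids needing manual review])."""
    index = defaultdict(list)
    for a in admin_rows:
        index[match_key(a)].append(a["herd_number"])
    matched, review = {}, []
    for c in census_rows:
        candidates = index.get(match_key(c), [])
        if len(candidates) == 1:
            matched[c["census_id"]] = candidates[0]
        else:
            review.append(c["census_id"])  # no candidate, or more than one
    return matched, review

# Hypothetical records for illustration.
sample_census = [
    {"census_id": 1, "name": "Séamus Murphy", "address": "Main St, Ballina"},
    {"census_id": 2, "name": "John Kelly", "address": "Hill Farm, Tuam"},
]
sample_admin = [
    {"herd_number": "A1", "name": "James Murphy", "address": "Main St Ballina"},
    {"herd_number": "B1", "name": "Sean Kelly", "address": "Hill Farm, Tuam"},
    {"herd_number": "B2", "name": "John Kelly", "address": "Hill Farm, Tuam"},
]
sample_matched, sample_review = match(sample_census, sample_admin)
```

Exact matching on normalised keys is deliberately conservative: it will miss misspellings, but it avoids silently linking two different people.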
Lessons Learnt • More collaboration between DAFF & CSO • Confidentiality – one-way transfer of information • Parallel pilot run in 2009
Future Developments • Beyond the SPAR Initiative – cross-sector efficiencies • Piggy-backing on SFP online applications to collect remaining survey items on June Survey • Exploiting geo-coordinates • To produce interactive maps • To link with other databases • Create area frame designs
Summary • Positive development for Farmers & CSO • Reduction in response burden – 25 questions dropped • Reduction in editing & processing of data • Result - a high quality register of agricultural holdings, and high quality data