This pilot project explores the use of big data in producing the Mining and Manufacturing Production Index. The presentation covers the Korean Government's big data roadmap, early cases of big data use, the countermeasures taken by KOSTAT, and the pilot project itself.
Mining and Manufacturing Production Index Pilot Project for Big Data Use
April 2013
Ki-bong Park
Contents
1. Establishment of a Big Data Roadmap of the Korean Government
2. Beginning of Big Data Use
3. KOSTAT Countermeasures against Big Data Use
4. KOSTAT Pilot Project for Big Data Use in Official Statistics
5. Future Plans
1. Establishment of a Big Data Roadmap of the Korean Government
• International trend of establishing big data-related strategies at the national level
  - U.S.A.: ‘Big Data Research & Development Initiatives’, investment in big data research & development (March 2012)
  - England: ‘Open Data Strategy’, strengthened data access (March 2012)
  - Japan: Published ‘Strategy for Big Data Use’, selected as a major goal for ‘Active Japan’ (May 2012)
• Establishment of a ‘Big Data Master Plan’ of the Korean Government
  • Cope with big data issues
    - March 15, 2012: Big data national strategy forum (participation by industries, universities, research institutes and governmental agencies)
    - Sep. 16, 2012: Big data forum (participation by the public and private sectors)
  • Establish a ‘Big Data Master Plan’ (National Science & Technology Commission, Nov. 28, 2012)
    - Step-by-step big data use by considering benefits and utilities
    - Establishment of infrastructure for big data sharing, provision of technical support and expert training
2. Beginning of Big Data Use
• Public sector: big data use when analyzing welfare and complaints from the general public
  • Beginning of big data use in terms of policy support and customized services
    - (Policy support) Link and share data that is held by several ministries, and support policy-making via data analysis
    - (Customized services) Provide user-focused services via an integrated analysis (e.g. qualification and history)
• Private sector
  • Holding world-class ‘IT infrastructure’ and ‘data production and dissemination’
    - World-best infrastructure including high-speed Internet, 3G smartphones and LTE phones
    - Sharp increase in mobile data due to the popularity of smartphones and SNS
  • Poor data use by businesses, considering the huge amount of data produced and disseminated
    - Just a few enterprises properly understand and use big data.
    - Mobile and portal service providers have started to use big data based on their own data.
2. Beginning of Big Data Use - Case 1
• Complaints analysis system of the Anti-Corruption and Civil Rights Commission
• Collect, classify, analyze and forecast online complaints of the public to facilitate communication between the government and the public
• Produce a complaints calendar by month and region from an analysis of past complaints, and provide a complaints map showing social issues
• Strengthen policies
[Figure: complaints map. Labels: Daejeon and Chungcheong; Seoul and Gyeonggi; unemployment benefits; use of public transportation; apartment sales]
2. Beginning of Big Data Use - Case 2
• Give assistance when making policies for adolescents by analyzing social data
• Policies for adolescents are too passive to prevent their extreme behaviors.
• Help to make policies to prevent adolescent suicides by analyzing buzz patterns related to suicide in social data and reflecting adolescents' situations and psychology
  - Identify danger signs via a suicide context analysis (e.g. suicide context pool)
  - Identify harmful content by analyzing URLs that are spread via SNS (e.g. online spread of harmful content)
  - Cooperate with influential users ("big mouths") in a network (e.g. ripple effect of big mouths in a network)
(Source) Pilot Project of the National Information Society Agency (July ~ Sep. 2012)
3. KOSTAT Countermeasures against Big Data Use
• KOSTAT countermeasures
  - Research on big data use in official statistics and pilot production of statistics
  - Establish a task force responsible for big data use and train human resources
• Pilot project for big data use: “Earlier dissemination of the Mining and Manufacturing Production Index by using media data”
  - Support rapid and accurate production of the Mining and Manufacturing Production Index
  - Collection and analysis of Internet data: media data is automatically collected and used for objective editing (checking changes in establishments and items, finding outliers) before finalizing the Index.
  - Provision of visualized analysis function: easier understanding of time-series data (index and volume) through visualized analysis
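The slides mention "finding outliers" as part of editing but do not say how outliers are detected. As one possible illustration only, the sketch below flags unusual monthly volumes with a modified z-score based on the median absolute deviation. The function name `flag_outliers`, the 3.5 threshold and the sample figures are assumptions, not part of the KOSTAT system.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`.

    Assumption: this robust rule (median and MAD) is a stand-in;
    the slides do not describe the actual editing rule.
    """
    med = median(values)
    # Median absolute deviation from the median.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations are (mostly) zero; nothing can be scored.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

# Hypothetical monthly volumes for one establishment and item.
volumes = [100, 102, 98, 101, 99, 250]
flags = flag_outliers(volumes)
```

A flagged month would then be a candidate for checking against collected media data or for an inquiry to the establishment.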
4. KOSTAT Pilot Project for Big Data Use in Official Statistics (1/5)
• AS-IS and TO-BE of the Mining and Manufacturing Production Index Workflow
• Monthly workflow
  - Data collection and input (3rd ~ 19th every month)
  - Finalization of input and the preliminary index (20th every month)
  - Edit and inquiry (20th ~ 23rd every month)
  - Finalization of the index and data upload (23rd every month)
  - Analysis of the index and publication of a report (24th ~ 26th every month)
• AS-IS
  - Volume table-focused edit
  - Inquiry over the phone
  - Individual verification of Internet data
• TO-BE (Big data use)
  - Visualized analysis of survey data: volume information by establishment and item, index information by industry and item
  - Analysis of unstructured ("non-typical") media data: integrated Internet information on items and establishments
4. KOSTAT Pilot Project for Big Data Use in Official Statistics (2/5)
• Analysis range
  - (Media data) Notices on the related websites and Internet news articles
    ※ 4 industry groups (C21, C24, C26, C28; 162 items, 1,438 establishments) that are surveyed in the Monthly Survey of Mining and Manufacturing
  - (Survey data) Time-series index and volume data by industry, item and establishment from 2005 to the current month for the Monthly Survey of Mining and Manufacturing
4. KOSTAT Pilot Project for Big Data Use in Official Statistics (3/5)
• Collection and analysis of media data
• Data collection: crawling of articles and documents on the Internet from the previous month to the current month
  - Collect Internet data that contains words such as "increase" and "decrease" for survey items and establishments
  - Crawl Internet news on a real-time basis
  - Analyze attached documents in PDF or MS Word format on the websites and load the data into the analysis server
• Data analysis: analysis of unstructured ("non-typical") data, with integrated analysis of attached documents from the websites and Internet news articles
  - Improve retrieval accuracy by registering search words (e.g. items) in advance
  - Present website and Internet news results in order of accuracy
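The collection and ranking steps above can be sketched roughly as follows. This is a minimal illustration, not the pilot system: the `Article` type, the scoring rule (counting matches of pre-registered search words) and the sample texts are all assumptions; only the direction words "increase" and "decrease" come from the slides.

```python
import re
from dataclasses import dataclass

# Direction words named on the slide; a real list would likely be longer.
DIRECTION_WORDS = ("increase", "decrease")

@dataclass
class Article:
    title: str
    body: str

def score(article, search_words):
    """Count matches of registered search words (hypothetical accuracy score).

    Articles without any direction word score 0 and are dropped later.
    """
    text = f"{article.title} {article.body}".lower()
    if not any(word in text for word in DIRECTION_WORDS):
        return 0
    return sum(
        len(re.findall(r"\b" + re.escape(w.lower()) + r"\b", text))
        for w in search_words
    )

def rank_articles(articles, search_words):
    """Return relevant articles, highest score first."""
    scored = [(score(a, search_words), a) for a in articles]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in relevant]

# Hypothetical sample data.
articles = [
    Article("Semiconductor", "Semiconductor shipments decrease."),
    Article("Weather", "Rain expected."),
    Article("Steel output", "Steel production saw an increase; steel plants expanded."),
]
ranked = rank_articles(articles, ["steel", "semiconductor"])
```

Registering the search words in advance, as the slide suggests, keeps matching limited to surveyed items and establishments instead of free-text queries.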
4. KOSTAT Pilot Project for Big Data Use in Official Statistics (4/5)
• Collection and analysis of survey data
• Data collection: online link to the DB for the Mining and Manufacturing Survey System
• Data analysis: visualization of time-series index and volume data by industry, establishment and item (e.g. in graphs)
• Visualization of structured ("typical") data
  - Volume graph by item and establishment (Volume/month-on-month/year-on-year)
  - Index graph by industry and item (Volume/month-on-month/year-on-year)
[Figure panels: Index by industry; Index by item; Volume by establishment; Volume by item]
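The month-on-month and year-on-year views in these graphs are percent changes against the previous month and against the same month one year earlier. A minimal sketch of that arithmetic, assuming a plain list of monthly values in chronological order (the function names and sample figures are hypothetical):

```python
def pct_change(series, lag):
    """Percent change versus the value `lag` months earlier.

    Returns None where no earlier value exists or where it is zero.
    """
    changes = []
    for i, value in enumerate(series):
        if i < lag or series[i - lag] == 0:
            changes.append(None)
        else:
            base = series[i - lag]
            changes.append((value - base) / base * 100)
    return changes

def month_on_month(series):
    return pct_change(series, 1)

def year_on_year(series):
    return pct_change(series, 12)

# Hypothetical volumes: twelve flat months, then a rise to 110.
volumes = [100.0] * 12 + [110.0]
mom = month_on_month(volumes)
yoy = year_on_year(volumes)
```

With thirteen months of data, only the last month has a year-on-year value; earlier entries are None because there is no observation twelve months before them.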
4. KOSTAT Pilot Project for Big Data Use in Official Statistics (5/5)
5. Future Plans
• Big data governance is recommended to ensure better production and dissemination of official statistics by using big data.
  - Define business processes and responsibilities
• Results of pilot projects
  - Examine applicability and efficiency in statistical production, and accumulate big data use techniques
  - Establishment of infrastructure for statistical production via big data use, and ongoing research
• Expand the system (e.g. expand analysis range)
  - Expand to all industries in the Monthly Survey of Mining and Manufacturing
  - Expand to the Monthly Service Industry Survey