Using Data Science as Evidence in Public Policy With Big Data and Elections

Dr. Brand Niemann, Director and Senior Enterprise Architect – Data Scientist, Semantic Community (http://semanticommunity.info/); AOL Government Blogger (http://gov.aol.com/bloggers/brand-niemann/). November 1-2, 2012


Presentation Transcript


  1. Using Data Science as Evidence in Public Policy With Big Data and Elections. Dr. Brand Niemann, Director and Senior Enterprise Architect – Data Scientist, Semantic Community (http://semanticommunity.info/); AOL Government Blogger (http://gov.aol.com/bloggers/brand-niemann/). November 1-2, 2012. http://semanticommunity.info/CNSTAT

  2. Start by Asking Questions
     • Which data sets are by state, which by congressional district, and which by time?
     • Which is the easiest to reformat?
     • Which is the most interesting?
     • Where have the candidates been?
     • Which data is free?
     • Etc.
     Note: Drew Conway (@drewconway) speaking about the joys, challenges, and power of data science: "Data science, as a discipline, is fundamentally about human behavior." http://semanticommunity.info/AOL_Government/2012_Recorded_Future_User_Conference

  3. Then Look for the Evidence
     • Brainstorm:
        • What Have I Done Before?
     • 2012 Annual Statistical Abstract:
        • Chapter 7. Elections
     • Google Searches:
        • Election and Voting Data
     • Conferences:
        • National Academy Seminars
     • Television:
        • Debates, etc.

  4. Begin With the End In Mind (Stephen Covey)
     • Story (publicity and money)
     • Research Notes (document what I did and learned)
     • Conditioned Data Sets (added value)
     • Spotfire Dashboard (cool visualizations)
     • Lecture to Students at George Mason University (help them learn what a data scientist/data journalist does)

  5. My 5-Step Method
     What I like to do to illustrate (data science) and explain (data journalism) is the following, like a recipe:
     • Put the Best Content into a Knowledge Base (e.g. MindTouch)
        • The 2012 Annual Statistical Abstract, CNSTAT, etc.
     • Put the Knowledge Base into a Spreadsheet (Excel)
        • Linked Data to Subparts of the Knowledge Base
     • Put the Spreadsheet into a Dashboard (Spotfire)
        • Data Integration and Interoperability Interface
     • Put the Dashboard into a Semantic Model (Excel)
        • Data Dictionaries and Models
     • Put the Semantic Model into Dynamic Case Management (Be Informed)
        • Structured Process for Updating Data in the Dashboard
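Step 2 of the recipe, a spreadsheet whose rows link back to subparts of the knowledge base, might look roughly like the sketch below. It is only an illustration, not the author's actual workbook: the column names, the row labels, and the `build_link_sheet` helper are hypothetical, and only the URLs come from the slides.

```python
import csv
import io

# Knowledge-base subparts whose URLs appear in the slides; labels are illustrative.
KNOWLEDGE_BASE_LINKS = [
    ("CNSTAT story", "http://semanticommunity.info/CNSTAT#Story"),
    ("FedStats.net story", "http://semanticommunity.info/FedStats.net#Story"),
    ("Statistical Abstract, Section 7 Elections",
     "http://semanticommunity.info/FedStats.net#Section_7_ELECTIONS"),
]


def build_link_sheet(links):
    """Return CSV text with one row per knowledge-base subpart (hypothetical layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["subpart", "page", "anchor", "url"])
    for label, url in links:
        # Split the URL into the wiki page and the anchor of the subpart.
        page, _, anchor = url.partition("#")
        writer.writerow([label, page, anchor, url])
    return buffer.getvalue()


sheet = build_link_sheet(KNOWLEDGE_BASE_LINKS)
print(sheet)
```

The resulting CSV text opens directly in Excel, which would keep each data row traceable to the page section it came from.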

  6. Knowledge Base http://semanticommunity.info/CNSTAT

  7. 2012 Annual Statistical Abstract: Chapter 7. Elections (Visualizations) http://semanticommunity.info/FedStats.net#Section_7_ELECTIONS

  8. 2012 Annual Statistical Abstract: Chapter 7. Elections (Metadata) http://semanticommunity.info/FedStats.net#Section_7._Elections

  9. FedStats.net: Commemorating over 135 years of making statistics available to citizens everywhere http://semanticommunity.info/FedStats.net#Story

  10. FedStats.gov Remains Rich Source Of Government Data For Citizens http://gov.aol.com/2012/07/26/fedstats-gov-remains-rich-source-of-government-data-for-citizens/

  11. 2012 Annual Statistical Abstract http://www.census.gov/compendia/statab/

  12. Data From CD-ROM to My Server http://semanticommunity.net/StatAbs2012/

  13. Spreadsheet http://semanticommunity.info/@api/deki/files/19606/Elections2012.xls

  14. Welcome to the Campaign 2012 Interactive Dashboard. My Note: Not like the next slide! http://campaign2012.c-span.org/electoral-college-map

  15. CNN Electoral Map http://www.cnn.com/ELECTION/2012/ecalculator

  16. CNN Electoral Map in Excel http://semanticommunity.info/@api/deki/files/19606/Elections2012.xls
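An electoral calculator like the one rebuilt in Excel comes down to summing each state's electoral votes under the candidate assumed to carry it, then comparing the totals with the 270 needed to win (a majority of 538). The sketch below is a hedged illustration rather than the contents of the workbook: it lists only seven states with their 2012 electoral vote counts, and the scenario assigning states to "Candidate A" and "Candidate B" is invented for the example. It is neither a result nor a forecast.

```python
from collections import Counter

TOTAL_ELECTORAL_VOTES = 538
VOTES_TO_WIN = TOTAL_ELECTORAL_VOTES // 2 + 1  # simple majority

# 2012 electoral votes for a handful of states only (not the full map).
ELECTORAL_VOTES = {
    "CA": 55, "TX": 38, "FL": 29, "NY": 29, "IL": 20, "PA": 20, "OH": 18,
}


def tally(assignments, electoral_votes=ELECTORAL_VOTES):
    """Sum electoral votes per candidate for a state -> candidate mapping."""
    totals = Counter()
    for state, candidate in assignments.items():
        totals[candidate] += electoral_votes[state]
    return totals


# Hypothetical what-if scenario, invented purely for illustration.
scenario = {
    "CA": "Candidate A", "NY": "Candidate A", "IL": "Candidate A", "PA": "Candidate A",
    "TX": "Candidate B", "FL": "Candidate B", "OH": "Candidate B",
}

totals = tally(scenario)
for candidate, votes in totals.most_common():
    remaining = max(VOTES_TO_WIN - votes, 0)
    print(f"{candidate}: {votes} electoral votes, {remaining} short of {VOTES_TO_WIN}")
```

Changing one entry in `scenario` and re-running the tally is the same what-if interaction the map offers by clicking a state.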

  17. CNN Electoral Map in Spotfire https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  18. Data Set Inventory and Results http://semanticommunity.info/CNSTAT#Story

  19. 2012 Annual Statistical Abstract Election Tables Metadata https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  20. Table 397. Participation in Elections for President and U.S. Representatives and Table 402. Vote Cast for President, by Major Political Party https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  21. Table 405. Electoral Vote Cast for President by Major Political Party--States https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  22. Table 408. Apportionment of Membership in House of Representatives, by State https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  23. Table 410. Vote Cast by Congressional Districts: 2010 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  24. Cover Page https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  25. Conclusions and Suggestions
     • I recently had the pleasure of attending three interesting, related professional statistical meetings, which showed that statisticians really care about current issues.
     • This made me appreciate that elections are a big data problem that is approached in three basic ways: historical elections data, collection and modeling of polling survey data before the election, and use of social media.
     • So I inventoried the historical and polling survey data (that I could get) to aid in selection and visualization in a dashboard, and found I needed both Congressional and State boundary files, as shown in a table.
     • So imagine an election season in which we had fewer or no polls to influence voters, so they could focus on the candidates and the issues. Then, just after the polls closed (by gentleman's agreement with Congress), we would get an amazing example of big data processing that we could all participate in, by seeing the precinct voting results posted to Twitter and processed by the many apps that developers had built to bring us interesting and useful results. I am eager to see that happen in 2014 and 2016!
     • I will be updating these results with the final 2012 elections data and providing another story.
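The inventory step described in the conclusions can be sketched as a small lookup that reads each table title and notes which kind of boundary file a map of it would need. This is an assumption-laden illustration, not the author's inventory: the table numbers and titles are taken from the slides, but the keyword heuristic and the `boundary_file_for` helper are hypothetical, and a title that names no geography is simply left unmapped.

```python
import re

# Table titles as they appear in the slides.
TABLE_TITLES = {
    397: "Participation in Elections for President and U.S. Representatives",
    402: "Vote Cast for President, by Major Political Party",
    405: "Electoral Vote Cast for President by Major Political Party--States",
    408: "Apportionment of Membership in House of Representatives, by State",
    410: "Vote Cast by Congressional Districts: 2010",
}


def boundary_file_for(title):
    """Guess the boundary type a table needs from its title (keyword heuristic)."""
    lowered = title.lower()
    if re.search(r"\bcongressional districts?\b", lowered):
        return "congressional district"
    if re.search(r"\bstates?\b", lowered):
        return "state"
    return None  # the title names no geography


inventory = {number: boundary_file_for(title) for number, title in TABLE_TITLES.items()}
needed = sorted({kind for kind in inventory.values() if kind is not None})

for number, kind in inventory.items():
    print(f"Table {number}: {kind or 'no boundary file implied by title'}")
print("Boundary files needed:", ", ".join(needed))
```

Under this heuristic the five tables call for both state and congressional district boundaries, which matches the conclusion that both kinds of boundary file were needed.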

  26. Extra Slides
     • Boundary Files:
        • US States Repositioned
        • US Counties Repositioned
        • US Congressional Districts 1
        • US Congressional Districts 2
     • Sources:
        • Spotfire
           • https://silverspotfire.tibco.com/us/library
        • US Census
           • http://www.census.gov/cgi-bin/geo/shapefiles2010/main

  27. US States Repositioned https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  28. US Counties Repositioned https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  29. US Congressional Districts 1 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire

  30. US Congressional Districts 2 https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
