
Open Data – reflections from behind the Big Firewall

Open Data – reflections from behind the Big Firewall. Or, may you be cursed to live in interesting times. Open Contributed Content will become a core, strategic, economic resource – and the most accessible & scalable resource we possess.

eagan

Presentation Transcript


  1. Open Data – reflections from behind the Big Firewall. Or, may you be cursed to live in interesting times.

  2. Open Data … Why bother?
     • Open Contributed Content will become a core, strategic, economic resource – and the most accessible & scalable resource we possess.
     • Mobility, Openness & Connection will matter more than Presence & Rigid Structures.
     • In 2013 expect generation of >850 Exabytes of Internet data, mostly user contributed content (versus traditional enterprise sources).
     • On-demand interaction will increasingly be the norm for a global community of virtual innovators … who expect their user experience to be as simple as ‘using an appliance’.
     • Global access to technology is already driving trends like ‘virtual citizenship’, ‘virtual employment’ & ‘social innovation’.

  3. Open Data and Economics, or … ‘Greater Fool Investing’!!
     • Open data is a potential new 'raw material' for economic growth. It requires effort to produce and maintain.
     • Unlike traditional raw materials like oil, gas and minerals, its value increases fastest when it is open and shareable.
     • Open Data alone does not generate direct economic benefit sufficient to offset production & operational costs … the question is … can it generate sufficient ‘value’ to be sustainable?
     • Incentives must be in place to sustain “economically significant” amounts of Open Data.
     • Some bright lights … but we need answers before we run out of steam!!
     • Bubble … "trade in high volumes at prices that are considerably at variance with intrinsic values".

  4. How Private is Private?
     • Privacy is not absolute, it is a balance between Risk and Utility
     • Open Data usage is inherently contradictory
       • Social media usage -> Maximize Utility + (Largely) Ignore Risk
       • Enterprise usage -> Maximize Utility + Minimize Risk
     • Who carries liability in case of dispute?
     Uncertainty in usage policies is a substantial form of business risk.
     Recognize in policy and legislation that privacy is mutable, based on context.
     ✔ Available Open Data useful to identify & characterize group behaviors
     ✖ Negative usage for ‘nuisance’ providers to identify high-value targets
       • {∃(high value residences)} ∩ {∃(long emergency response time)} ∩ {∃(many local area crimes)}
       • {area where people might buy home security products}
       • (all available on open data sites near you)
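The set expression on this slide can be made concrete with a short sketch. This is only an illustration: the area identifiers and the three sets are hypothetical stand-ins for what separate open datasets might contain, and `likely_targets` is a name introduced here, not something from the presentation.

```python
# Hypothetical area identifiers, as if taken from three separate open datasets.
high_value_residences = {"area_a", "area_b", "area_c"}
long_emergency_response = {"area_b", "area_c", "area_d"}
many_local_crimes = {"area_c", "area_d", "area_e"}


def likely_targets(*area_sets):
    """Return the areas that appear in every input set (plain set intersection)."""
    if not area_sets:
        return set()
    return set.intersection(*(set(s) for s in area_sets))


targets = likely_targets(
    high_value_residences, long_emergency_response, many_local_crimes
)
print(sorted(targets))  # prints ['area_c']
```

Each dataset is harmless on its own; the privacy risk the slide points at comes from how cheap the combination step is.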

  5. A Fun Use Case

  6. Challenges for Privacy in an Open Data World
     And I haven’t even mentioned Trust, Provenance, Security, …

  7. Research impact: what we have learned so far
     There are plenty of interesting challenges!!
     Selected research results:
       - Live deployment in Dublin
       - Won prize in Semantic Web Challenge
       - Paper at ISWC
       - Paper at Hypertext
       - Invited paper at Journal of Web Semantics
     • Data
       • 100s of datasets, 1000s of files
       • Very open domain(s)
       • Very expensive to normalize
       • Scaling complexity from high dimensionality
     • Approach
       • Pay-as-you-go approach, only process what you need
       • Do not stick to a common model, use any you can find
       • Generate interesting views and feed them to “analytics”
     • Lessons learned
       • Multiple models, depending on context
       • Need to do things incrementally
       • Lightweight generally better than heavyweight
     Diagram labels: Documents + Metadata Links Views Structure Entities Insight
     Pay-as-you-go, Gain-as-you-go
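One way to read "pay-as-you-go, only process what you need" is lazy, cached processing: a dataset is normalized only when some view first asks for it. The sketch below is an assumption about what that could look like, not the authors' implementation; `LazyCatalog`, the loader functions, and the dataset names are all hypothetical.

```python
from typing import Callable, Dict, List


class LazyCatalog:
    """Process a dataset only on first request, then reuse the cached result."""

    def __init__(self, loaders: Dict[str, Callable[[], List[dict]]]):
        self._loaders = loaders                   # name -> function that loads and normalizes
        self._cache: Dict[str, List[dict]] = {}   # results of datasets already processed
        self.processed: List[str] = []            # order in which datasets were paid for

    def get(self, name: str) -> List[dict]:
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()  # the (expensive) step runs here
            self.processed.append(name)
        return self._cache[name]


# Hypothetical loaders standing in for expensive normalization jobs.
catalog = LazyCatalog({
    "bus_stops": lambda: [{"stop": "A"}, {"stop": "B"}],
    "road_works": lambda: [{"road": "R1"}],
    "air_quality": lambda: [{"site": "S1", "pm10": 21}],
})

view = catalog.get("bus_stops")   # only this dataset is processed
view_again = catalog.get("bus_stops")  # served from the cache, no reprocessing
print(catalog.processed)  # prints ['bus_stops']
```

The other two datasets are never touched unless a later view requests them, which matches the "need to do things incrementally" lesson on the slide.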

  8. Dublinked - Towards a robust test-bed for Open Data Research
     Diagram labels (flattened from the slide graphic, in source order):
     • Open REST Web Services API
     • Research Testbed
     • Challenges include ..
     • Privacy & Security
     • Publication & Annotation
     • Scalable privacy and security of resources
     • Automated assimilation and sharing of resources
     • Search & Query
     • Catalog & Navigation
     • Visualization & Analytics
     • Knowledge Representation & Reasoning
     • Robust models to organize and represent resources and their context
     • Represent knowledge efficiently for continuous machine reasoning and diagnosis
     • Enterprise Platform
     • Dublin City
     • IBM IOC
     • Interaction with Industry Solutions
     • IBM Connections
     • Social Media & Collaboration
     • Compose resources for development, mash-up & visualization
     • IBM Research
     • IBM Enterprise Cloud
     • Scalable compute, storage & network infrastructure
     • Key IBM Products & Services
     • Provider 1…N
     • Enterprise
     • Citizen
     • Partners & People

  9. What we do: Learning Systems to Help Diagnose the City
     Problem
     How can we provide City decision makers with explanations and diagnoses for events by applying machine reasoning techniques to a fusion of massive, rich, complex and dynamic data? How can we move from explanation to prediction?
     • Challenges
       • Identifying relevant data and information
       • Capturing and representing anomalies
       • Correlating knowledge on heterogeneous data sources
       • Advanced fusion of heterogeneous data from multiple sources
     • Goals
       • Identification of the nature and cause of changes
       • Explaining logical connection of knowledge across space and time
       • Move from explanation to prediction
     Detection to Diagnosis?
     Anomaly Detected: Delayed buses, congested roads

  10. Outline Research Roadmap
     Dynamic Distributed Information Analytics
     • Life analytics (social/health/public safety)
     • High-risk/time-critical alerting
     • Cross-agency Alerting Use Cases
     • Cross Web-Enterprise Analytics
     • Many-agency Analytics
     • Public Safety Integrator
     • Fine-grain Access Control
     • Streaming Analytics
     • Distributed Reasoning
     • Context Mining
     • Linked Data Cloud Context Retrieval
     • Cross-agency Context Retrieval
     • Cross-agency Analytics
     • Provenance
     • Privacy
     • High-volume distributed querying
     • Wide-scale distributed querying
     • Distributed Entity Linking
     • Lightweight Distributed Information Access
     • Contextual Access
     • Basic Access Control
     • Distributed Entity Consolidation
     • Graph Access Technology Data Warehouse
