
WP6: Software Platform and Tools

WP6: Software Platform and Tools. Lead: UDE. Partners: UMA, CICE, FrontiersIn. Month 1 - Month 30. Overview: bundles all activities related to the provision of a software platform hosting tools and services for data mining, crawling and social network analysis.

debra



Presentation Transcript


  1. WP6: Software Platform and Tools
     Lead: UDE
     Partners: UMA, CICE, FrontiersIn
     Month 1 - Month 30

  2. Overview
     • Bundles all activities related to the provision of a software platform hosting tools and services for data mining, crawling and social network analysis (SNA)
     • Relies on existing tools, either free and open-source software or tools owned by the partners
     • First part: definition of the crawling, data mining and storage strategy
     • Second part: data transformation for SNA, definition of network-based role models and evaluation of these models

  3. Specific objectives
     • Selection and evaluation of mining strategies
     • Specification of crawling approach and integration of crawlers
     • Specification and configuration of a software platform
     • Preparation / transformation of data for SNA
     • Specification and modelling of roles and constellations (SNA)
     • Data analyses and evaluation
     • Model revision and software adaptation

  4. T6.1 Crawler and mining strategy
     • Specify requirements for crawling and data mining based on the targeted data sources and social models
     • Flexible with respect to crawling strategies, so that it can also be adapted to the needs of other work packages (esp. the case studies)
     • Integrated into and controlled by a framework which handles the storage of retrieved web objects and the notification of newly found relevant data and of changes in the data sources
     Responsible: UMA (2PM)
     Contributors: UDE (1PM), CICE (1PM)

  5. T6.2 Semantic evaluation and filtering
     • Categorize and filter data retrieved from the various data sources
     • Relies on techniques adopted from the field of knowledge discovery in databases (KDD)
     • Encompasses the pre-processing of the given data: statistical sampling, cleaning, and transformation of the data into adequate representations for the subsequent algorithms
     Responsible: UDE (2PM)
     Contributors: UMA (1PM)
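
The pre-processing steps named in T6.2 (cleaning, sampling, transformation) can be illustrated with a minimal sketch. It assumes the retrieved objects are plain text or HTML strings and uses a bag-of-words mapping as the target representation; the function names and the choice of representation are illustrative assumptions, not part of the work package description.

```python
import random
import re


def clean(text):
    """Strip markup remnants, collapse whitespace and lowercase (a crude cleaner)."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def sample(records, k, seed=0):
    """Draw a reproducible simple random sample of at most k records."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def to_bag_of_words(text):
    """Transform cleaned text into a term -> count mapping."""
    counts = {}
    for token in re.findall(r"[a-z0-9]+", text):
        counts[token] = counts.get(token, 0) + 1
    return counts


def preprocess(records, k, seed=0):
    """Clean non-empty records, sample k of them, and transform each one."""
    cleaned = [clean(r) for r in records if r and r.strip()]
    return [to_bag_of_words(t) for t in sample(cleaned, k, seed)]
```

A real pipeline would likely replace the bag-of-words step with whatever representation the downstream mining algorithm expects.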

  6. T6.3 Framework for storage, notification and triggering
     • Re-trigger the crawler in response to changes in the data corpus over time
     • Re-triggering based on a "when appropriate" strategy
     • Recognition of specific events such as new conference announcements or the availability of proceedings
     • Notify users about new and relevant findings
     Responsible: UDE (2PM)
     Contributors: UMA (2PM), CICE (2PM)
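
One possible reading of the storage, notification and re-triggering loop in T6.3 is sketched below. It assumes that a change in a web object can be detected by comparing content hashes; `ChangeTracker`, `process` and the `notify` callback are hypothetical names introduced for illustration only, and the "when appropriate" strategy itself is not modelled.

```python
import hashlib


class ChangeTracker:
    """Stores a content fingerprint per URL and reports new or changed objects."""

    def __init__(self):
        self._fingerprints = {}  # url -> SHA-256 hex digest of last seen content

    def observe(self, url, content):
        """Return 'new', 'changed' or 'unchanged' and remember the content."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self._fingerprints.get(url)
        self._fingerprints[url] = digest
        if previous is None:
            return "new"
        return "changed" if previous != digest else "unchanged"


def process(tracker, fetched, notify):
    """Call notify(url, status) for each new or changed object.

    Returns the list of URLs whose surroundings may be worth re-crawling.
    """
    retrigger = []
    for url, content in fetched.items():
        status = tracker.observe(url, content)
        if status != "unchanged":
            notify(url, status)
            retrigger.append(url)
    return retrigger
```

Recognising specific events, such as a new conference announcement, would require a content-level check on top of this purely hash-based comparison.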

  7. T6.4 Data transformation and structural modeling for SNA
     • Define a common data format for sharing within the consortium, based on the identification of relevant communities and their "traces" (communication, co-publications etc.) and on the general conceptual model (WP 2)
     • Define and specify typical roles and constellations (e.g. broker) based on SNA techniques (e.g. blockmodeling)
     • Continuous verification of social indicators
     Responsible: UDE (2PM)
     Contributors: UMA (2PM)
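
To make the notions of "traces" and "broker" in T6.4 concrete, the sketch below builds a co-publication network from author lists and scores each author by the number of neighbour pairs that are not directly connected to each other. This is a simple open-triad count used here as a stand-in; it is not blockmodeling, and the data layout (one author list per publication) is an assumption.

```python
from itertools import combinations


def co_publication_network(publications):
    """Build an undirected graph: authors are nodes, shared papers create edges."""
    graph = {}
    for authors in publications:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def brokerage_scores(graph):
    """For each node, count pairs of its neighbours that share no direct edge."""
    scores = {}
    for node, neighbours in graph.items():
        scores[node] = sum(
            1 for a, b in combinations(sorted(neighbours), 2) if b not in graph[a]
        )
    return scores
```

Authors with a high score sit between otherwise unconnected collaborators, which is the intuition behind the broker role.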

  8. T6.5 Software platform
     • Configure an integrated software platform for crawling/data mining and SNA based on the initial specifications
     • Input relates to the transformation from relevant data sources (specified in T6.4)
     • Output is concerned with visualisation and reporting
     • Revise and adapt the platform according to emerging issues and needs (esp. considering the case studies)
     Uses freely available (open) software and software owned by the partners (mainly UDE)
     Responsible: UDE (7PM)
     Contributors: UMA (4PM), CICE (1PM)

  9. T6.6 Data analysis and evaluation
     • Test the platform with standard cases based on the specifications of WP 4 (Measurements and Social Indicators)
     • Early phase: test the functioning of the platform and its components (from T6.5) and the adequacy of the semantic filters (T6.2) and structural definitions (T6.4)
     • Later stage: evaluate actual performance and community developments in association with the case studies and with WP 4
     Responsible: UDE (3PM)
     Contributors: FrontiersIn (2PM), UMA (1PM), CICE (1PM)

  10. Deliverables and Milestones
      Deliverables
      • 6.1 Mining strategy and requirements specification for the software platform (RP: UDE, RV: UMA, C: all in / M5)
      • 6.2 First version of structural definitions (RP: UDE, RV: UMA, C: all in / M10)
      • 6.3 Configuration, test of the platform and first evaluation report (RP: UDE, RV: CICE, C: all in / M22)
      • 6.4 Final report and system (RP: UDE, RV: CICE, C: all in / M30)
      Milestones
      • MS2, SISOB System first prototype, month 15
      • MS3, SISOB Final System, month 30

  11. Tools
      • Open Source Crawler
      • DMD – Data-Multiplexer-Demultiplexer
      • WOS2Pajek, Pajek, and UCINET
      • CFinder
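
Since Pajek is among the listed tools, a short sketch of exporting a weighted, undirected network to Pajek's plain-text `.net` format may be useful. The function `to_pajek` and its input layout (a mapping from node-label pairs to weights) are illustrative assumptions; the sketch also assumes labels contain no double quotes.

```python
def to_pajek(weighted_edges):
    """Serialise {(label_a, label_b): weight} as a Pajek .net string.

    Vertices are numbered from 1 in alphabetical label order, and the
    *Edges section marks the listed links as undirected.
    """
    labels = sorted({node for edge in weighted_edges for node in edge})
    index = {label: i for i, label in enumerate(labels, start=1)}
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{index[label]} "{label}"' for label in labels]
    lines.append("*Edges")
    for (a, b), weight in sorted(weighted_edges.items()):
        lines.append(f"{index[a]} {index[b]} {weight}")
    return "\n".join(lines) + "\n"
```

For a directed network, the `*Edges` section header would be replaced by `*Arcs`.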

  12. Challenges
      • Data model adequate to different data sources
      • Data model supporting multilevel analysis, in line with the multivocality of the project
      • Merging different types of data
      • Cleaning data
          • e.g. researchers having different email addresses
          • e.g. researchers writing their names in different ways
      • How to get data from Web 2.0 platforms like Mendeley
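
The data-cleaning challenge (one researcher appearing under several name spellings and email addresses) can be illustrated with a rough sketch. It groups records by a key made of the surname and the first initial, ignoring accents and case. This heuristic is an assumption chosen for brevity: it will wrongly merge distinct people who share a surname and initial, so real use would need further evidence such as affiliations or co-authors.

```python
import unicodedata


def name_key(name):
    """Reduce a personal name to (surname, first initial), ignoring accents and case."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    if "," in ascii_name:
        # "Surname, Given" form
        surname, given = [part.strip() for part in ascii_name.split(",", 1)]
    else:
        # "Given Surname" form; the last token is taken as the surname
        parts = ascii_name.split()
        if not parts:
            return ("", "")
        surname, given = parts[-1], " ".join(parts[:-1])
    return (surname.lower(), given[:1].lower())


def group_researchers(records):
    """Group (name, email) records that share the same name key."""
    groups = {}
    for name, email in records:
        entry = groups.setdefault(name_key(name), {"names": set(), "emails": set()})
        entry["names"].add(name)
        entry["emails"].add(email.lower())
    return groups
```

Each resulting group collects all spellings and addresses seen for one presumed person, which can then be reviewed before the records are merged.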
