Exploring Data Collection and Analysis Techniques Using Web Crawling and API Integration

Eddie Aronovich eddiea@cs.tau.ac.il Tools presentation

Once upon a time

Evolution of the input
• “command line” input
• Files
• Web crawling (pull)
• Web sensors (using API - push)

Evolution of the output (multiple dimensions)
• LinkedIn MAP
• Gapminder
- http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html
- http://www.ted.com/talks/nicholas_christakis_the_hidden_influence_of_social_networks.html

API examples
• Twitter
• http://api.twitter.com/1/users/show.json?screen_name=TheMarker
• Format the output (json) https://dev.twitter.com/docs/api/1/get/search
• FB
• /usr/bin/python fbconsole.py fql("SELECT uid FROM user WHERE username='ariel.bardavid.5'") https://developers.facebook.com/docs/reference/apis/
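The Twitter example can be sketched in Python as two steps: build the request URL, then decode the JSON body that comes back. This is a minimal sketch, not the presenter's code. The helper names build_url and parse_user are hypothetical, the sample response is made up for illustration, and the field followers_count is an assumption about the response shape. The v1 endpoint on the slide has since been retired by Twitter, so the sketch does not send a request.

```python
import json
from urllib.parse import urlencode


def build_url(base, **params):
    """Return the base URL with params encoded as a query string."""
    return base + "?" + urlencode(params)


def parse_user(raw):
    """Decode a JSON response body and keep a few fields of interest."""
    doc = json.loads(raw)
    return {
        "screen_name": doc.get("screen_name"),
        "followers": doc.get("followers_count"),  # assumed field name
    }


# URL taken from the slide; the response below is invented sample data.
url = build_url("http://api.twitter.com/1/users/show.json", screen_name="TheMarker")
sample_response = '{"screen_name": "TheMarker", "followers_count": 123}'

print(url)
print(parse_user(sample_response))
```

Keeping URL construction and response parsing separate means the parsing step can be exercised on saved responses without touching the network.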

Python code for json format

    import json
    from pprint import pprint

    # Load the JSON document stored in the file 'json_data' and pretty-print it
    with open('json_data') as json_file:
        data = json.load(json_file)
    pprint(data)
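When the JSON arrives as a string (for example, an API response body) rather than a file, json.loads and json.dumps can do the same formatting job. The sketch below is one possible variant; the function name format_json and the sample string are assumptions made for illustration.

```python
import json


def format_json(raw):
    """Parse a JSON string and return it re-serialized with indentation."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True)


# Invented sample input, only to show the formatting.
print(format_json('{"b": [1, 2], "a": {"c": null}}'))
```

Unlike pprint, which prints the Python representation of the data, json.dumps emits valid JSON that another tool can read back.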

Web crawling
• wget + parser (html2txt)
• ETL (Extract, Transform, Load)
• Structured vs. Unstructured data
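The parser step and the ETL idea can be illustrated with the Python standard library alone. This is a rough sketch under stated assumptions, not the tooling used in the talk: html_to_text stands in for an html2txt-style parser applied to pages already downloaded (for example with wget), and the word_counts table, the etl function, and the sample page are all hypothetical.

```python
import sqlite3
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Turn unstructured HTML into plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def etl(pages, conn):
    """Extract text from each page, transform to word counts, load into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_counts (url TEXT, word TEXT, n INTEGER)"
    )
    for url, html in pages.items():
        text = html_to_text(html)                # extract
        counts = Counter(text.lower().split())   # transform
        conn.executemany(                        # load
            "INSERT INTO word_counts VALUES (?, ?, ?)",
            [(url, word, n) for word, n in counts.items()],
        )
    conn.commit()


# Invented page content; a real run would read files fetched by the crawler.
conn = sqlite3.connect(":memory:")
etl({"http://example.com/a": "<p>data data model</p>"}, conn)
print(conn.execute("SELECT word, n FROM word_counts ORDER BY word").fetchall())
```

The load step is where unstructured page text becomes structured rows that can be queried.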

Some general tools
• Scripting
• bash
• sed
• awk
• cron (and scratch space)
• Hadoop
• Condor

Overview
• Collect Data (and extract it)
• Analyze Data
• Build a model
• Run the model
• Collect more data
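The cycle in the overview can be sketched as a loop in which each new batch of data triggers a fresh analysis and a rebuilt model. Everything in this sketch is a placeholder chosen for illustration: the function names are hypothetical, the "model" is just the running mean, and the batches are invented numbers.

```python
from statistics import mean


def collect(batch):
    """Stand-in for crawling or API calls: return the batch as a list."""
    return list(batch)


def analyze(data):
    """Summarize everything collected so far."""
    return {"n": len(data), "mean": mean(data)}


def build_model(summary):
    """Toy model: always predict the mean observed so far."""
    return lambda: summary["mean"]


def run_pipeline(batches):
    """Collect, analyze, build, run, then repeat with more data."""
    data, predictions = [], []
    for batch in batches:
        data.extend(collect(batch))
        model = build_model(analyze(data))
        predictions.append(model())
    return predictions


# Invented batches; the prediction shifts as more data is collected.
print(run_pipeline([[1, 3], [5, 7]]))
```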
