Modeling YouTube QoE based on Crowdsourcing and Laboratory User Studies
Tobias Hoßfeld, Raimund Schatz
STSM 15.8.–30.9.2011
http://www3.informatik.uni-wuerzburg.de/research/fia
http://www3.informatik.uni-wuerzburg.de/staff/hossfeld
QoE Issue: Waiting, Waiting, Waiting…
[Figure: waiting time perception and stalling during video playback]
Research Activities Related to STSM

Application-Level Measurements
• bottleneck scenario with constant bandwidth B
• video characteristics: variable video bit rate V; high stalling frequency for V = B
• realistic stalling patterns: video player parameters (initial buffer 2 sec); stalling lengths of 1–6 sec used in the tests

Monitoring and Stalling Detector
• heuristics fit the QoS
• information-extraction approach leads to exact QoE results

QoE Modeling
• only stalling is relevant, not content, demographics, etc.
• users "accept" almost no or only short stalling
• crowdsourcing complements the i:Lab studies

QoE Management
• stalling as the key influence factor
• mapping between QoS (e.g. bandwidth B) and QoE, as sketched below

Optimization and Dimensioning
• initial delay (GI/GI/1): T0/D < 5%
• bandwidth provisioning: 120% of the video bit rate V
• TCP performs better than UDP in the bottleneck
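The QoS-to-QoE mapping named above can be illustrated with an exponential relationship between the stalling parameters and the MOS, in line with the fundamental relationships fitted in the studies. A minimal Python sketch of such a mapping follows; the coefficients alpha, beta, and gamma are illustrative placeholders, not the values fitted in the study.

```python
import math

def mos_from_stalling(n_events, alpha=3.5, beta=0.5, gamma=1.5):
    """Exponential QoS-to-QoE mapping: MOS = alpha * exp(-beta * n) + gamma.

    n_events: number of stalling events during playback.
    alpha, beta, gamma: illustrative placeholders; the real values
    have to be fitted to subjective ratings.
    """
    return alpha * math.exp(-beta * n_events) + gamma

# Even a few stalling events push the MOS towards the lower end of the
# 5-point scale (5.0 for no stalling, approaching 1.5 for many events).
for n in range(5):
    print(n, round(mos_from_stalling(n), 2))
```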
Executive Summary of STSM

Developed Test Design & Application Measurements
• remote users
• 'reliability' questions
• application/user monitoring
• preloading of data
• realistic parameters for the temporal stimuli

Conducted Crowdsourcing Tests
• data analysis
• identification of reliable users
• key influence factors identified via machine learning
• fitting with fundamental relationships (see the sketch after this list)

Laboratory Study
• reliable users
• different demographics
• different test settings, e.g. longer user tests

Derived QoE Model
• mapping function between stalling and QoE
• acceptance vs. perception
• comparison of crowdsourcing with laboratory results
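A minimal sketch of the fitting step, assuming per-condition MOS values have already been computed; the data points and the exponential model form below are hypothetical illustrations, not the study's results.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(n, alpha, beta, gamma):
    # Candidate fundamental relationship: exponential decay of the
    # MOS with the number of stalling events n.
    return alpha * np.exp(-beta * n) + gamma

# Hypothetical per-condition data: stalling events vs. mean opinion score.
n_stalls = np.array([0, 1, 2, 3, 4, 6])
mos      = np.array([4.8, 3.9, 3.1, 2.6, 2.2, 1.8])

params, _ = curve_fit(model, n_stalls, mos, p0=(3.5, 0.5, 1.5))
print("alpha=%.2f, beta=%.2f, gamma=%.2f" % tuple(params))
```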
Crowdsourcing Workflow
• Challenge: identify unreliable QoE results
• Countermeasures:
  - proper test design (gold-standard data, consistency questions, content questions, application monitoring)
  - filtering the data and analyzing the QoE results (a minimal filtering sketch follows below)
• The methods are also applicable to, e.g., field trials!
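The second countermeasure, filtering the data, boils down to discarding all ratings of a worker who fails any of the checks built into the test design. A minimal sketch of that logic; the field names are hypothetical:

```python
def is_reliable(worker):
    """Keep a worker's ratings only if every reliability check passes."""
    return (worker["content_answers_ok"]       # content questions
            and worker["consistent_answers"]   # consistency questions
            and worker["gold_standard_ok"]     # gold-standard data
            and worker["watched_completely"])  # application monitoring

# Hypothetical per-worker flags produced by the test application.
workers = [
    {"id": 1, "content_answers_ok": True, "consistent_answers": True,
     "gold_standard_ok": True, "watched_completely": True},
    {"id": 2, "content_answers_ok": True, "consistent_answers": False,
     "gold_standard_ok": True, "watched_completely": True},
]

reliable = [w for w in workers if is_reliable(w)]
print([w["id"] for w in reliable])  # -> [1]
```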
Crowdsourcing: Unreliable Workers
• LEVEL 1: 'reliability' questions
  - wrong answers to content questions
  - different answers to the same questions
  - always selected the same option
  - consistency questions: specified the wrong country/continent
• LEVEL 2: 'QoE' questions
  - did not notice stalling
  - perceived non-existent stalling
• LEVEL 3: 'application/user' monitoring
  - did not watch all videos completely
Findings:
• the SOS hypothesis indicates unreliable tests (see the sketch after this list)
• many user ratings were rejected; further improvements are required
• after introducing user warnings ("Test not done carefully"), the rejection rate decreased by about 50%
• the filtering may be too strict; application-layer monitoring is not fully reliable
[Figure: rejection rates for campaigns C1–C7 and Facebook]
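The SOS hypothesis (Hoßfeld et al.) relates the standard deviation of opinion scores (SOS) to the MOS on a 5-point scale via SOS² = a·(−MOS² + 6·MOS − 5); a test whose estimated parameter a lies far above the values of comparable lab studies hints at unreliable ratings. A minimal sketch with hypothetical per-condition statistics:

```python
import numpy as np

def sos_parameter(mos, sos):
    """Least-squares estimate of 'a' in the SOS hypothesis
    SOS^2 = a * (-MOS^2 + 6*MOS - 5) for a 5-point rating scale."""
    x = -mos**2 + 6 * mos - 5
    return float(np.sum(x * sos**2) / np.sum(x**2))

# Hypothetical per-condition statistics (MOS, standard deviation of scores).
mos = np.array([4.5, 3.8, 3.0, 2.2, 1.6])
sos = np.array([0.60, 0.90, 1.10, 0.90, 0.70])

print("SOS parameter a = %.2f" % sos_parameter(mos, sos))
```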
Crowdsourcing vs. Laboratory Studies
• Key influence factors on YouTube QoE: stalling frequency and stalling duration determine the user-perceived quality
• Lab studies conducted within ACE 2.0 at FTW's i:Lab
• Similar shapes of the rating curves in the laboratory and crowdsourcing studies (a small comparison sketch follows below)
[Figure: rating curves from both studies; annotation: 4 seconds of stalling]
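One way to make the "similar shapes" observation quantitative is to correlate the per-condition MOS values of the two studies. A minimal sketch; the numbers are hypothetical placeholders, not the measured results:

```python
import numpy as np

# Hypothetical per-condition MOS values for identical stalling conditions.
mos_lab   = np.array([4.6, 3.7, 3.0, 2.4, 2.0])
mos_crowd = np.array([4.4, 3.5, 2.9, 2.5, 2.1])

# A Pearson correlation close to 1 indicates similarly shaped curves,
# even if the absolute rating levels differ between the two settings.
r = np.corrcoef(mos_lab, mos_crowd)[0, 1]
print("Pearson correlation r = %.3f" % r)
```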
Conclusions
• Most of the relevant stimuli of Internet applications are of a temporal nature (WG1: "Web and cloud apps")
• QoE models have to be extended in the temporal dimension: stalling, waiting times, service interruptions
• Gap between user perception and user acceptance; differences between lab and crowdsourcing results (WG3)
• 'Failed' subjective studies can be used for the analysis of reliability (WG4)
• Standards are needed to detect unreliable subjects (WG5)
• Crowdsourcing appears promising (WG2: "Crowdsourcing")
  - tests are conducted fast and at low cost
  - possibility to access different user groups (in terms of expectations/social background)
  - but new challenges are imposed
Outcome of STSM
• "Quantification of YouTube QoE via Crowdsourcing" by Tobias Hoßfeld, Raimund Schatz, Michael Seufert, Matthias Hirth, Thomas Zinner, Phuoc Tran-Gia. IEEE International Workshop on Multimedia Quality of Experience – Modeling, Evaluation, and Directions (MQoE 2011), Dana Point, CA, USA, December 2011.
• "FoG and Clouds: On Optimizing QoE for YouTube" by Tobias Hoßfeld, Florian Liers, Thomas Volkert, Raimund Schatz. Accepted at the 5th KuVS GI/ITG Workshop "NG Service Delivery Platforms", DOCOMO Euro-Labs, Munich, Germany.
• "Quality of Experience of YouTube Video Streaming for Current Internet Transport Protocols" by Tobias Hoßfeld and Raimund Schatz. Currently under submission at ACM Computer Communication Review; the numerical results are available in University of Würzburg Technical Report No. 482, "Transport Protocol Influences on YouTube QoE", July 2011.
• "'Time is Bandwidth'? Narrowing the Gap between Subjective Time Perception and Quality of Experience" by Sebastian Egger, Peter Reichl, Tobias Hoßfeld, Raimund Schatz. Submitted to IEEE ICC 2012, Communication QoS, Reliability and Modeling Symposium.
• "Challenges of QoE Management for Cloud Applications" by Tobias Hoßfeld, Raimund Schatz, Martin Varela, Christian Timmerer. Submitted to IEEE Communications Magazine, Special Issue on QoE management in emerging multimedia services.
• "Recommendations and Comparison of Subjective User Tests via Crowdsourcing and Laboratories for Online Video Streaming", intended for submission.
• "Impact of Fake User Ratings on QoE", intended for journal submission.