
E3 Finish-up; Intro to Clustering & Unsup.


Presentation Transcript


  1. E3 Finish-up; Intro to Clustering & Unsup. • Kearns & Singh, “Near-Optimal Reinforcement Learning in Polynomial Time.” Machine Learning 49, 2002. • Class Text: Sec 3.9; 5.9; 5.10; 6.6

  2. Administrivia • Office hours next week (Nov 24) • Truncated on early end • From “whenever I get in” ‘til noon • Or by appt.

  3. Oral presentations: Tips #1 • Not too much text on any slide • No paragraphs!!! • Not even full sentences (usually) • Be sure text is readable • Fonts big enough • Beware of Serifed Fonts

  4. Oral presentations: Tips #1 This is a deliberately bad example of presentation style. Note that the text is very dense, there’s a lot of it, the font is way too small, and the font is somewhat difficult to read (the serifs are very narrow and the kerning is too tight, so the letters tend to smear together when viewed from a distance). It’s essentially impossible for your audience to follow this text while you’re speaking. (Except for a few speedreaders who happen to be sitting close to the screen.) In general, don’t expect your audience to read the text on your presentation -- it’s mostly there as a reminder to keep them on track while you’re talking and remind them what you’re talking about when they fall asleep. Note that these rules of thumb also apply well to posters. Unless you want your poster to completely standalone (no human there to describe it), it’s best to avoid large blocks of dense text.

  5. Oral presentations: Tips #1 • Also, don’t switch slides too quickly...

  6. Exercise • Given: MDP M=〈S,A,T,R〉; discount factor γ; max absolute reward Rmax = max_s{|R(s)|} • Find: A planning horizon H = H(γ,ε) such that, if the agent plans only about events that take place within the next H steps, it is guaranteed to miss no more than ε • I.e., for any trajectory of length H, h_H, the value difference between h_H and its infinite continuation h_∞ is less than ε (one way to work this out is sketched below)
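One standard way to work this exercise (a sketch, not taken from the slides) is to bound the discounted reward that can possibly accrue after step H and require that tail to be at most ε:

```latex
% Everything after step H contributes at most gamma^H * Rmax / (1 - gamma)
% to the discounted return, so forcing that tail below epsilon pins down H.
\[
\bigl| V(h_\infty) - V(h_H) \bigr|
  \;\le\; \sum_{t=H}^{\infty} \gamma^{t} R_{\max}
  \;=\; \frac{\gamma^{H} R_{\max}}{1-\gamma}
  \;\le\; \epsilon
\qquad\Longrightarrow\qquad
H \;\ge\; \log_{\gamma}\!\left(\frac{\epsilon\,(1-\gamma)}{R_{\max}}\right).
\]
```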

  7. E3 • Efficient Explore & Exploit algorithm • Kearns & Singh, Machine Learning 49, 2002 • Explicitly keeps a T matrix and an R table • Plan (policy iteration) w/ current T & R → current π • Every state/action entry in T and R: • Can be marked known or unknown • Has a #visits counter, nv(s,a) • After every 〈s,a,r,s’〉 tuple, update T & R (running average) • When nv(s,a) > NVthresh, mark cell as known & re-plan • When all states are known, learning is done & we have π*

  8. The E3 algorithm • Algorithm: E3_learn_sketch // only an overview • Inputs: S, A, γ (0 ≤ γ < 1), NVthresh, Rmax, Varmax • Outputs: T, R, π* • Initialization: • R(s)=Rmax // for all s • T(s,a,s’)=1/|S| // for all s,a,s’ • known(s,a)=0; nv(s,a)=0; // for all s,a • π=policy_iter(S,A,T,R)

  9. The E3 algorithm • Algorithm: E3_learn_sketch // cont’d • Repeat { • s=get_current_world_state() • a=π(s) • (r,s’)=act_in_world(a) • T(s,a,s’)=(1+T(s,a,s’)*nv(s,a))/(nv(s,a)+1) // running average for the observed s’ • nv(s,a)++; • if (nv(s,a)>NVthresh) { • known(s,a)=1; • π=policy_iter(S,A,T,R) • } • } Until (all (s,a) known)
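The slide pseudocode leaves a couple of details implicit (the running-average R update, and rescaling the other T(s,a,·) entries so each row stays a distribution). Below is a minimal runnable Python sketch of the same bookkeeping loop. It assumes states and actions are integer indices and that the helpers named on the slides (policy_iter, get_current_world_state, act_in_world) are supplied by the caller with the signatures used there; it illustrates the model-update bookkeeping only, not the full E3 explore/exploit analysis.

```python
import numpy as np

def e3_learn_sketch(S, A, gamma, NV_thresh, R_max,
                    policy_iter, get_current_world_state, act_in_world):
    """Sketch of the E3 bookkeeping loop from the slides (not the full algorithm)."""
    n_s, n_a = len(S), len(A)

    # Initialization, as on slide 8.
    R = np.full(n_s, R_max)                    # R(s) = Rmax for all s
    T = np.full((n_s, n_a, n_s), 1.0 / n_s)    # T(s,a,s') = 1/|S|
    known = np.zeros((n_s, n_a), dtype=bool)
    nv = np.zeros((n_s, n_a), dtype=int)

    pi = policy_iter(S, A, T, R, gamma)        # assumed: returns state -> action map

    while not known.all():
        s = get_current_world_state()
        a = pi[s]
        r, s_next = act_in_world(a)

        # Running-average update of the row T(s,a,·): scale every entry down,
        # then bump the observed next state, so the row stays normalized.
        n = nv[s, a]
        T[s, a, :] *= n / (n + 1.0)
        T[s, a, s_next] += 1.0 / (n + 1.0)

        # Running-average reward estimate (kept deliberately simple here).
        R[s] = (R[s] * n + r) / (n + 1.0)

        nv[s, a] = n + 1
        if nv[s, a] > NV_thresh and not known[s, a]:
            known[s, a] = True
            pi = policy_iter(S, A, T, R, gamma)   # re-plan with the updated model

    return T, R, pi
```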

  10. Choosing NVthresh • Critical parameter in E3: NVthresh • Affects how much experience the agent needs before it is confident in saying a T(s,a,s’) value is known • How to pick this param? • Want to ensure that the current estimate T̂(s,a,s’) is close to the true T(s,a,s’) with high probability, i.e., Pr( |T̂(s,a,s’) − T(s,a,s’)| > δ ) ≤ ε • How to do that?

  11. 5 minutes of math... • General problem: • Given a binomially distributed random variable, X, what is the probability that it deviates very far from its true mean? • R.v. could be: • Sum of many coin flips: X = Σ_i x_i, with each x_i ∈ {0,1} • Average of many samples from a transition function: T̂(s,a,s’) = (1/n) Σ_i 1[s’_i = s’]

  12. 5 minutes of math... • Theorem (Chernoff bound, additive/Hoeffding form): Given a binomially distributed random variable X generated from a sequence of n events with per-event mean p, the probability that the empirical average X/n is very far from its true mean p is bounded by: Pr( |X/n − p| ≥ δ ) ≤ 2 e^{−2nδ²}

  13. 5 minutes of math... • Consequence of the Chernoff bound (informal): • With a bit of fiddling, you can show that: • The probability that the estimated mean for a binomially distributed random variable falls very far from the true mean falls off exponentially quickly with the size of the sample set
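To make the exponential falloff concrete, here is a small numeric check (an illustration, not from the slides): it compares the bound above against a Monte-Carlo estimate of the deviation probability for a coin of bias p = 0.3 and deviation δ = 0.1.

```python
import numpy as np

rng = np.random.default_rng(0)
p, delta = 0.3, 0.1          # true mean and allowed deviation (slide 14's delta)

for n in (10, 100, 1000):
    # Chernoff/Hoeffding-style bound on Pr(|X/n - p| >= delta)
    bound = 2 * np.exp(-2 * n * delta**2)
    # quick Monte-Carlo check of the actual deviation probability
    samples = rng.binomial(n, p, size=100_000) / n
    empirical = np.mean(np.abs(samples - p) >= delta)
    print(f"n={n:5d}  bound={bound:.2e}  empirical={empirical:.4f}")
```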

  14. Chernoff bound & NVthresh • Using the Chernoff bound, can show that a transition can be considered “known” once nv(s,a) exceeds a threshold (a simple instantiation is sketched below) that depends on: • N=number of states in M, |S| • δ=amount you’re willing to be wrong by • ε=prob that you got it wrong by more than δ
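The slide's actual threshold formula did not survive the transcript, so here is one simple way (an assumption for illustration, not the Kearns & Singh constants) to turn the bound into a visit-count threshold: invert 2e^{−2nδ²} ≤ ε for n, and optionally take a union bound so that every one of the N·|A|·N transition entries is within δ simultaneously. The threshold used in the actual analysis also folds in further quantities (e.g. Rmax and the Varmax input from slide 8) and is considerably more conservative.

```python
import math

def nv_threshold(delta, eps):
    """Visits needed so Pr(|T_hat - T| >= delta) <= eps for one (s,a,s') entry,
    from the Chernoff/Hoeffding bound 2*exp(-2*n*delta^2) <= eps."""
    return math.ceil(math.log(2.0 / eps) / (2.0 * delta**2))

def nv_threshold_all_entries(delta, eps, n_states, n_actions):
    """Union bound: make every T(s,a,s') entry accurate to within delta
    simultaneously with probability at least 1 - eps."""
    per_entry_eps = eps / (n_states * n_actions * n_states)
    return nv_threshold(delta, per_entry_eps)

print(nv_threshold(delta=0.05, eps=0.01))                        # ~1060 visits per (s,a)
print(nv_threshold_all_entries(0.05, 0.01, n_states=20, n_actions=4))
```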

  15. Poly time RL • A further consequence (once you layer on a bunch of math & assumptions): • Can learn the complete model in a number of steps that is at most polynomial in the problem parameters • Notes: • Polynomial in N, 1/ε, and 1/δ • BIG polynomial, nasty constants

  16. Take-home messages • Model-based RL is a different way to think of the goals of RL • Get a better understanding of the world • (Sometimes) provides stronger theoretical leverage • There exists a provably poly-time algorithm for RL • Nasty polynomial, though • Doesn’t work well in practice • Still, a nice explanation of why some forms of RL work

  17. Unsupervised Learning: Clustering & Model Fitting

  18. The unsupervised problem • Given: • Set of data points • Find: • Good description of the data

  19. Typical tasks • Given: many measurements of flowers • What different breeds are there? • Given: many microarray measurements • What genes act the same? • Given: a bunch of documents • What topics are there? How are they related? Which are “good” essays and which are “bad”? • Given: long sequences of GUI events • What tasks was the user working on? Are they “flat” or hierarchical? (a tiny clustering sketch for the flower example follows)
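As a concrete (if simplistic) illustration of the first task, here is a minimal k-means sketch in plain NumPy. K-means is just one standard clustering method, not one singled out on these slides, and the three-"breed" flower data below is synthetic.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means: alternate assigning points to the nearest centroid
    and recomputing each centroid as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # distance from every point to every centroid, then nearest-centroid labels
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Synthetic "flower measurements": three breeds with different mean petal/sepal sizes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, scale=0.3, size=(50, 2))
               for m in ([1, 1], [3, 1], [2, 3])])
labels, centroids = kmeans(X, k=3)
print(centroids)   # should land near [1,1], [3,1], [2,3] (up to cluster ordering)
```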
