
Artificial Intelligence in Game Design

Presentation Transcript


  1. Artificial Intelligence in Game Design: Introduction to Learning

  2. Learning and Games • Learning in AI: • Creating new rules automatically • Observation of the world • Examples of good/bad actions to take • A major goal of AI • Would be very useful in gaming • Automatic adaptation to player tactics • Infinite replayability • Would be impossible for the player to create a strategy that would win forever • ("You can defeat me now, but I shall return smarter than ever!")

  3. Learning and Games • Fairer and more plausible NPC behavior • Characters should have same learning curve as players • Start out inexperienced • Become more competent over time • Example: Simple “cannon fire game” • Could use physics to compute exact angle, but would win first turn! • Character should miss badly at first • Should “learn” to get closer over time

  4. Learning in AI • Basic components: • Current Rules – take inputs from the environment and indicate which actions to perform • Critic – determines how good or bad the action was, often in terms of some error • Learning Element – determines how to change the rules in order to decrease the error
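To make the data flow concrete, here is a minimal, hedged sketch of how these three components could be wired together in Python; all class, method, and parameter names (RuleSet, Critic, LearningElement, aggression) are illustrative assumptions, not from the slides.

```python
# Minimal sketch of the learning loop above; all names are illustrative.

class RuleSet:
    """Current rules: map inputs from the environment to an action."""
    def __init__(self, aggression=0.5):
        self.aggression = aggression            # example tunable parameter

    def choose_action(self, enemy_distance):
        return "attack" if enemy_distance < self.aggression * 10 else "defend"

class Critic:
    """Determine how good or bad the last action was, as an error value."""
    def evaluate(self, damage_taken, damage_dealt):
        return damage_taken - damage_dealt      # lower is better

class LearningElement:
    """Change the rules in a direction that should decrease the error."""
    def update(self, rules, error, rate=0.05):
        rules.aggression += -rate if error > 0 else rate

rules, critic, learner = RuleSet(), Critic(), LearningElement()
action = rules.choose_action(enemy_distance=4.0)
error = critic.evaluate(damage_taken=8, damage_dealt=3)
learner.update(rules, error)
```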

  5. Learning in AI • Learning algorithms in AI: • Neural networks • Probabilistic learning • Genetic learning algorithms • Common attributes: • Requires time – usually thousands of cycles • Results are unpredictable – will create any rules that decrease error, not necessarily the ones that make the most sense in a game • Still very limited – no algorithm to automatically generate something as complex as an FSM • ("Create an opponent that can defeat Data" – not what you want to happen in a game!)

  6. Online Learning • Learning is most useful if it occurs during the game • Must be as efficient as possible • Simple methods are best: • Hill climbing • N-gram prediction • Decision tree learning • The most successful methods are often specific to the game • Example: negative influences at character destruction locations – other units steer around that area • (diagram: our unit destroyed by an unknown enemy unit; other units steer around this area)
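As a rough sketch of that example, a destruction event could write negative influence into the cells of an influence map so that other units' steering scores that area as undesirable; the grid size, radius, and falloff used here are made-up values for illustration.

```python
# Sketch: mark the area around a destroyed unit with negative influence
# so other units' pathfinding scores those cells as undesirable.
# Grid dimensions, radius, and weights are illustrative.

def add_negative_influence(influence, x, y, radius=3, strength=-10.0):
    height, width = len(influence), len(influence[0])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cx, cy = x + dx, y + dy
            if 0 <= cx < width and 0 <= cy < height:
                dist = max(abs(dx), abs(dy))
                # Influence falls off linearly with distance from the loss.
                influence[cy][cx] += strength * (1 - dist / (radius + 1))

influence_map = [[0.0] * 20 for _ in range(20)]
add_negative_influence(influence_map, x=5, y=7)   # our unit destroyed here
```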

  7. Scripted Learning • Can "fake" the appearance of learning • Player performs an action • The game AI knows the best counteraction, but does not perform it • The game AI allows the player a certain number of that action before beginning to perform the counteraction • Like a timeout; the number could be chosen at random • Gives the appearance that the character has "learned" to perform the counteraction • (diagram: the player is allowed to attack from the right for a certain number of turns; after that point the AI begins to defend from the right)
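One hedged way to implement this is a simple counter with a randomly chosen threshold; the names below are illustrative.

```python
import random

class ScriptedCounter:
    """Allow an attack a few times before the AI 'learns' the counter."""
    def __init__(self, min_allowed=3, max_allowed=6):
        # A random threshold makes the "learning moment" less predictable.
        self.allowed = random.randint(min_allowed, max_allowed)
        self.seen = 0

    def should_counter(self):
        self.seen += 1
        return self.seen > self.allowed

right_attack = ScriptedCounter()
for turn in range(8):
    if right_attack.should_counter():
        print(f"Turn {turn}: AI defends the right side")
    else:
        print(f"Turn {turn}: AI lets the right-side attack through")
```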

  8. Scripted Learning • Scripting in the cannon game: • Compute the actual best trajectory using physics • Add an error factor E to the computation • Decrease the error E over time at rate ΔE • Test different values of ΔE to make sure it learns at the "same rate" as a typical player • Can also use different values of ΔE to set the "difficulty" level • (diagram: correct trajectory vs. shots with small E and large E)
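A sketch of the scripted cannon, assuming flat ground and the standard projectile-range formula; the constants (initial E, ΔE, launch speed) are illustrative and would be tuned per game.

```python
import math
import random

GRAVITY = 9.81

def ideal_angle(distance, speed):
    # Standard projectile-range solution; assumes flat ground and no drag.
    return 0.5 * math.asin(min(1.0, GRAVITY * distance / speed ** 2))

def scripted_shot(distance, speed, error, delta_e):
    """Return (angle_to_fire, new_error) for one turn of the cannon game."""
    angle = ideal_angle(distance, speed)
    # Perturb the perfect answer by a random amount up to +/- error.
    fired = angle + random.uniform(-error, error)
    # The character "learns": the allowed error shrinks each turn by delta_e.
    return fired, max(0.0, error - delta_e)

error = 0.4      # initial error E (radians) -- sets the starting difficulty
delta_e = 0.08   # learning rate dE -- tune to match a typical player
for turn in range(5):
    angle, error = scripted_shot(distance=120.0, speed=40.0,
                                 error=error, delta_e=delta_e)
    print(f"turn {turn}: fire at {math.degrees(angle):.1f} degrees")
```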

  9. Hill Climbing • Simple technique for learning optimal parameter values • Character AI described in terms of a configuration of parameter values V = (v1, v2, … vn) • Example: action probabilities for Oswald, V = (Pleft, Pright, Pdefend) • Oswald's current V = (0.45, 0.30, 0.25)

  10. Hill Climbing • Each configuration of parameter values V = (v1, v2, … vn) has an error measure E(V) • Often an estimate based on the success of the last action(s) • Example: total damage taken by Oswald – total damage caused by Oswald over his last 3 actions • Good enough for hill climbing • Goal of learning: find V such that E(V) is minimized • Or at least "good enough" – a configuration with a low error measure
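As a sketch, Oswald's configuration V and its error estimate might look like this; the damage bookkeeping structure is an assumption for illustration.

```python
# Sketch: a configuration V of action probabilities and its error estimate.
# The 'recent_events' bookkeeping is an assumption for illustration.

oswald_v = {"attack_left": 0.45, "attack_right": 0.30, "defend": 0.25}

def error_measure(recent_events):
    """E(V): damage taken minus damage dealt over the last few actions.
    Lower is better; a rough estimate is good enough for hill climbing."""
    taken = sum(e["damage_taken"] for e in recent_events)
    dealt = sum(e["damage_dealt"] for e in recent_events)
    return taken - dealt

last_three = [
    {"damage_taken": 12, "damage_dealt": 5},
    {"damage_taken": 3,  "damage_dealt": 9},
    {"damage_taken": 8,  "damage_dealt": 2},
]
print(error_measure(last_three))   # 23 - 16 = 7; positive means Oswald is losing
```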

  11. Hill Climbing • Hill climbing works best with: • A single parameter • A correctness measure that is easy to compute • Example: the cannon game • Only parameter: angle Ө of the cannon • Error measure: distance between the target and the actual landing point

  12. Error Space • Graphical representation of the relationship between parameter value and correctness • Hill climbing = finding the "lowest point" in this space • (graph: error vs. Ө, reaching error = 0 / maximum correctness at the optimal Ө)

  13. Hill Climbing Algorithm • Assumption: if a small change in one direction increases correctness, then we will eventually reach the optimal value if we keep changing in that direction • (graph: successive values Ө1, Ө2, Ө3 stepping in the direction of decreasing error)

  14. Hill Climbing Algorithm • Estimate the direction of the slope in the local area of the error space • Must sample values near E(Ө): E(Ө + ε) and E(Ө – ε) • Move in the direction of decreasing error, increasing/decreasing Ө by some given step size δ • If E(Ө + ε) < E(Ө – ε) then Ө = Ө + δ • Else Ө = Ө – δ
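A minimal sketch of one such hill-climbing step; the toy error function below stands in for "distance between target and landing point" and is purely illustrative.

```python
def hill_climb_step(theta, error_fn, epsilon=0.01, delta=0.05):
    """One step: sample the error just above and just below theta,
    then move theta by delta in the direction of decreasing error."""
    if error_fn(theta + epsilon) < error_fn(theta - epsilon):
        return theta + delta
    return theta - delta

# Toy error standing in for "distance between target and landing point";
# its true optimum is at theta = 0.6.
def landing_error(theta):
    return abs(theta - 0.6) * 100.0

theta = 0.1
for shot in range(12):
    theta = hill_climb_step(theta, landing_error)
print(round(theta, 2))   # converges toward 0.6, oscillating within one step size
```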

  15. Multidimensional Error Space • Exploring multiple parameters simultaneously • Examples: probabilities for Attack Left, Attack Right, Defend; or the ability to control the "powder charge" C for the cannon as well as the angle Ө • Vary the parameters slightly in all dimensions: E(Ө + ε, C + ε), E(Ө + ε, C – ε), E(Ө – ε, C + ε), E(Ө – ε, C – ε) • Choose the combination with the lowest error • ("I need to increase both the angle and the charge")
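A sketch of the same step in two dimensions (angle Ө and charge C), probing the four ±ε combinations and moving toward the lowest-error one; the toy error function is again illustrative.

```python
def hill_climb_step_2d(theta, charge, error_fn, epsilon=0.01, delta=0.05):
    """Probe the four (±epsilon, ±epsilon) combinations around the current
    (theta, charge) and step by delta toward the lowest-error one."""
    candidates = [(ds, dc) for ds in (+1, -1) for dc in (+1, -1)]
    best = min(candidates,
               key=lambda d: error_fn(theta + d[0] * epsilon,
                                      charge + d[1] * epsilon))
    return theta + best[0] * delta, charge + best[1] * delta

# Toy error with independent effects of angle and charge (optimum at 0.6, 2.0).
def error(theta, charge):
    return abs(theta - 0.6) + abs(charge - 2.0)

theta, charge = 0.1, 3.0
for _ in range(20):
    theta, charge = hill_climb_step_2d(theta, charge, error)
print(round(theta, 2), round(charge, 2))   # approaches (0.6, 2.0)
```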

  16. Multidimensional Error Space • Can have too many parameters • n parameters = n-dimensional error space • Will usually "wander" the space, never finding good values • If using learning, keep the problem simple • Few parameters (one or two is best) • Make sure the parameters have an independent effect on the error (increased charge and increased angle both increase distance) • ("I could also move up a hill, or check the wind direction…")

  17. Hill Climbing Step Size • Choosing a good step size δ • Too small: learning takes too long • Too large: learning will "jump over" the optimal value • (Either way, the player's reaction is "This guy is an idiot!")

  18. Hill Climbing Step Size • Adaptive resolution • Keep track of the previous error E(ӨT-1) • If E(ӨT) < E(ӨT-1), assume we are moving in the correct direction • Increase the step size to get there faster: δ = δ + κ

  19. Hill Climbing Step Size • If E(ӨT) > E(ӨT-1), assume we overshot the optimal value • Decrease the step size to avoid overshooting on the way back: δ = δ × ρ, with ρ < 1 • Idea: decrease the step size quickly • Main goal: make the character's actions plausible to the player • Should make large changes if it misses badly, and small changes if it is near the target
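Combining slides 18 and 19, an adaptive step size could be a small helper like this sketch; the κ and ρ values are illustrative.

```python
def adapt_step_size(delta, current_error, previous_error, kappa=0.02, rho=0.5):
    """Adaptive resolution: grow the step while the error keeps dropping,
    shrink it quickly once the search appears to have overshot."""
    if current_error < previous_error:
        return delta + kappa       # still improving: delta = delta + kappa
    return delta * rho             # overshot: delta = delta * rho, rho < 1

# Used inside the hill-climbing loop, e.g.:
#   delta = adapt_step_size(delta, landing_error(theta), landing_error(prev_theta))
```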

  20. Local Minima in Error Space • Major assumption: the error space decreases monotonically as we move towards the goal • Other factors may cause the error to increase in local areas • (diagram: the second shot lands farther off and appears to be worse than the first shot!)

  21. Local Minima in Error Space • Local minima in error space: places where the apparent error increases as we get closer to the optimum value • Simple hill climbing can get stuck • (graph: a local minimum in the error curve before the optimal Ө – hill climbing will not escape!)

  22. Local Minima in Error Space • Solution: momentum term – the current change is based on previous changes as well as the current error • Define a momentum term α, the proportion of the previous change carried into the current change, with α < 1 • Previous change: ΔӨT-1 • Current change C based on the error: either δ or –δ • Current change ΔӨT = α·ΔӨT-1 + (1 – α)·C
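A sketch of that momentum update; the wiring into the hill-climbing loop and the α, ε, δ values are assumptions for illustration.

```python
def momentum_step(theta, prev_change, error_fn,
                  alpha=0.7, epsilon=0.01, delta=0.05):
    """Change_t = alpha * Change_(t-1) + (1 - alpha) * C,
    where C is +delta or -delta chosen from the local slope."""
    if error_fn(theta + epsilon) < error_fn(theta - epsilon):
        c = +delta
    else:
        c = -delta
    change = alpha * prev_change + (1 - alpha) * c
    return theta + change, change

# Usage inside the loop:
#   theta, last_change = momentum_step(theta, last_change, landing_error)
```

Because the previous change keeps contributing, several steps in the same direction build up speed that can carry the search through a shallow local minimum.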

  23. Local Minima in Error Space • "Speeds up" with multiple changes in the same direction • Will continue to go in the same direction for several steps even if the error indicates a change in the other direction • Idea: momentum will "run through" local minima • (graph: momentum builds going downhill, decreases in the local minimum, but still carries the search out of it)

  24. Local Minima in Error Space • May need to restart with a different initial value • Use randomness – something very different from the last starting point • Plausible behavior: if the current actions are not working, try something new • (diagram: multiple shots with the same result, then a very different result after the restart)
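A restart could be as simple as drawing a new starting angle that is far from the previous one; the range and minimum gap here are made up.

```python
import random

def restart_angle(last_start, low=0.0, high=1.5, min_gap=0.5):
    """Pick a fresh starting angle clearly different from the last one,
    so the new attempt looks like the character is trying something new."""
    while True:
        candidate = random.uniform(low, high)
        if abs(candidate - last_start) >= min_gap:
            return candidate
```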

  25. Memory and Learning • What if the player moves? • Should not have to restart the learning process • Should keep the appearance that the character is slowly improving its aim • Should be able to quickly adapt to changes in player strategy

  26. Memory and Learning • Remember previous actions and their effects • Store each angle Ө tried and the resulting distance D(Ө) • If the player moves to location L, start from the Ө whose D(Ө) is closest to L • (diagram: of the stored shots D(Ө1), D(Ө2), D(Ө3), the closest to the new player location is Ө2)
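A sketch of that memory: store each (Ө, D(Ө)) pair and, when the player moves, resume hill climbing from the remembered angle whose landing distance is closest to the new location; the numbers are illustrative.

```python
shot_memory = {}                      # angle tried -> distance the shot landed at

def remember_shot(theta, landed_at):
    shot_memory[theta] = landed_at

def best_starting_angle(new_target_distance, default=0.3):
    """Resume from the remembered angle whose landing point is closest
    to the player's new location, instead of restarting the learning."""
    if not shot_memory:
        return default
    return min(shot_memory,
               key=lambda t: abs(shot_memory[t] - new_target_distance))

remember_shot(0.35, 90.0)
remember_shot(0.45, 115.0)
remember_shot(0.55, 140.0)
print(best_starting_angle(120.0))     # 0.45, whose shot landed nearest 120
```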
