
Trading Agent Competition (TAC)

Trading Agent Competition (TAC). Jon Lerner, Silas Xu, Wilfred Yeung CS286r, 3 March 2004. TAC Overview. International Competition Intended to spur research into trading agent design First held in July 2000 TAC Classic and TAC SCM Scenarios. TAC Classic.


Presentation Transcript


  1. Trading Agent Competition (TAC) Jon Lerner, Silas Xu, Wilfred Yeung CS286r, 3 March 2004

  2. TAC Overview • International Competition • Intended to spur research into trading agent design • First held in July 2000 • TAC Classic and TAC SCM Scenarios

  3. TAC Classic • Each team is in charge of a virtual travel agent • Agents try to find travel packages for virtual clients • All clients wish to travel over the same five-day period • Clients are not all equal; each has different preferences for certain types of travel packages

  4. Travel Packages • Each contains flight info, hotel type, and entertainment tickets • To gain positive utility from client, agents must construct feasible packages. Feasible means: • Arrival date strictly less than departure date • Same hotel reserved during all intermediate nights • At most one entertainment event per night • At most one of each type of entertainment ticket
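
The four feasibility conditions on this slide can be sketched as a single check. This is a minimal illustration, not the TAC server's code: the `Package` class, its field names, and the extra requirement that entertainment falls on a night of the stay are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Package:
    arrival: int        # day of the inbound flight
    departure: int      # day of the outbound flight
    hotel_nights: dict  # night -> hotel name, e.g. {1: "high", 2: "high"}
    entertainment: list = field(default_factory=list)  # (night, ticket_type) pairs

def is_feasible(p: Package) -> bool:
    # Arrival date strictly less than departure date.
    if p.arrival >= p.departure:
        return False
    nights = range(p.arrival, p.departure)
    # Same hotel reserved on every night of the stay (a missing night shows up as None).
    hotels = {p.hotel_nights.get(n) for n in nights}
    if len(hotels) != 1 or None in hotels:
        return False
    ent_nights = [n for n, _ in p.entertainment]
    ent_types = [t for _, t in p.entertainment]
    # At most one entertainment event per night.
    if len(set(ent_nights)) != len(ent_nights):
        return False
    # At most one ticket of each entertainment type.
    if len(set(ent_types)) != len(ent_types):
        return False
    # Assumption for this sketch: events must fall on a night of the stay.
    return all(n in nights for n in ent_nights)
```

For example, `is_feasible(Package(1, 3, {1: "high", 2: "low"}))` is rejected because the hotel changes mid-stay.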

  5. Flights • Clients have preferences for ideal arrival/departure dates • Infinite supply of flights sold through continuously clearing auctions • Prices set by a random walk • Prices later set to drift upwards to discourage waiting • No resale or exchange of flights permitted
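
The price process described here (a random walk, later given an upward drift) can be sketched as follows. The step size, the drift value, and the floor at zero are illustrative assumptions; the slide does not give the actual TAC parameters.

```python
import random

def simulate_flight_price(start, steps, max_step=10.0, drift=0.0, seed=None):
    """Return a list of prices following a random walk with optional upward drift.

    max_step and drift are hypothetical parameters chosen for illustration.
    """
    rng = random.Random(seed)
    prices = [start]
    for _ in range(steps):
        # Symmetric random perturbation plus a constant upward drift.
        change = rng.uniform(-max_step, max_step) + drift
        prices.append(max(0.0, prices[-1] + change))
    return prices
```

With `drift=0.0` waiting is costless on average; with a positive drift the expected price rises every step, which is what discourages agents from delaying flight purchases.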

  6. Hotels • Two hotels – high quality and low quality, 16 rooms per hotel per night • Sold through ascending, multi-unit, sixteenth-price auctions: one auction for all rooms for single hotel on single night • Periodically a random auction closes to encourage agents to bid • Clients have different values for high and low quality hotels
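
A sixteenth-price auction can be illustrated with a small clearing function: the 16 highest unit bids win, and every winner pays the 16th-highest bid. This is a sketch only. The `(agent, price)` bid representation and the price of zero when fewer than 16 bids arrive are assumptions, and tie-breaking is left to sort order.

```python
def clear_hotel_auction(bids, rooms=16):
    """bids: list of (agent, price) unit bids for one hotel on one night.

    Returns (clearing_price, winning_agents). Assumes a price of 0.0
    when demand does not fill all rooms.
    """
    ranked = sorted(bids, key=lambda b: b[1], reverse=True)
    winners = ranked[:rooms]
    # Every winner pays the lowest winning bid (the 16th-highest price).
    price = winners[-1][1] if len(ranked) >= rooms else 0.0
    return price, [agent for agent, _ in winners]
```

Because all winners pay the same price, a very high bid secures a room without directly setting the price paid, unless that bid happens to be the 16th highest.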

  7. Entertainment • Three types of entertainment available • Clients have value for each type • Each agent has initial endowment of tickets • Buy and sell tickets through continuous double auction

  8. Agent Themes • Agents have to address: • When to Bid • What to Bid On • How Much to Bid • Combinatorial preferences, but not combinatorial auctions

  9. Strategies • What strategies come to mind? • What AI techniques might be useful? • Simple vs. Complicated Strategies • How quickly should you adapt as game progresses? • Use of historical data vs. Focus on current game only • Play the game vs. Play the players

  10. living agents (Living Systems AG) Winner: TAC 2001 • Makes two assumptions • 1. Steadily increasing flight prices favor early decisions on flight tickets. • 2. The well-performing teams in particular follow a strategy of maximizing their own utility; they do not take risks to reduce other teams’ utility. • Simple strategy • Makes substantial use of historical data. • Barely any monitoring of or adapting to changing conditions • Benefits from other agents’ complicated algorithms to control price; Open-loop, Play the Players

  11. living agents: Determining Hotel and Flight Bids • Assume hotel auction will clear at historical levels • Using these as hotel prices, initial flight prices, and client preferences, determine optimal client trips • Immediately place bids based on this optimum • Purchase corresponding flights immediately • Place offers for required hotels at prices high enough to ensure successful acquisition
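
The "determine optimal client trips" step can be sketched as a brute-force search over arrival day, departure day, and hotel type, using predicted hotel prices in place of real ones. The utility formula below (1000 base, a penalty of 100 per day of deviation from the preferred dates, plus a bonus for the high-quality hotel) is an assumption for illustration and is not stated on the slides; entertainment is ignored, and all names are hypothetical.

```python
from itertools import product

def best_trip(pref_arr, pref_dep, hotel_bonus, in_price, out_price, hotel_price):
    """Return (net_value, arrival, departure, hotel) maximizing utility minus cost.

    in_price / out_price: day -> current flight price.
    hotel_price: hotel -> night -> predicted (historical) clearing price.
    """
    best = None
    for a, d, h in product(range(1, 5), range(2, 6), ("high", "low")):
        if a >= d:
            continue  # arrival must precede departure
        # Assumed utility: base value minus a penalty for deviating from preferred days.
        utility = 1000 - 100 * (abs(a - pref_arr) + abs(d - pref_dep))
        if h == "high":
            utility += hotel_bonus
        cost = in_price[a] + out_price[d] + sum(hotel_price[h][n] for n in range(a, d))
        net = utility - cost
        if best is None or net > best[0]:
            best = (net, a, d, h)
    return best
```

An open-loop agent in this style would run the search once per client at the start of the game, buy the chosen flights, and bid for the chosen hotel nights without revisiting the plan.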

  12. Entertainment Auction • Immediately makes a fixed decision as to which entertainment to attempt to buy/sell, assuming the historical clearing price of about $80. • Opportunistically buys and sells around this point • Puts in final reservation prices at the seven-minute mark.

  13. How good is living agents? • Risky • If hotel bids are not high enough, fails to complete trips, resulting in huge loss of points. • If hotel clears at living agents’ bid, potentially pays much more than necessary • After placing initial bid, does not monitor hotel or flight auctions at all • Clearly not all agents could use this strategy (Hotel auctions) • Simple • Buys flights immediately, avoiding cost of waiting • Relies on historical data • Contains information from many games • But how sensitive is evolution of game to changes in client preferences, or changes in opponents’ strategy?

  14. Applicability • Use of historical data for predictive information • Feasibility of simple strategies that ignore feedback • Play against the players (not prices), under the assumption that other agents keep things relatively efficient.

  15. ATTac (AT&T Research) Winner: TAC 2002 • Uses sophisticated machine-learning techniques to predict future hotel prices based on the current situation • Buys flights based on cost-benefit analysis of committing versus waiting • Minute-by-minute reoptimization of bids based on holdings and predictions

  16. The heart of ATTac • Assumption: Because of many unknowns, exactly predicting the price of a hotel room is hopeless. • Instead, regard the closing price as a random variable that needs to be estimated, conditional on our current state of knowledge • Number of minutes remaining in game • Ask price of each hotel • Flight prices • Historical data • Construct a model of the probability distribution over clearing prices (based on a boosting algorithm), stochastically sample prices, and compute expected profit
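
The "sample prices and compute expected profit" step is a Monte Carlo average. The sketch below assumes a single client trip and a caller-supplied `sample_hotel_price(rng, night)` function standing in for the learned price distribution; both the function name and the flat profit formula are hypothetical simplifications, not ATTac's actual code.

```python
import random

def expected_profit(utility, flight_cost, nights, sample_hotel_price,
                    n_samples=1000, seed=0):
    """Average profit of one trip over sampled hotel closing prices.

    sample_hotel_price(rng, night) is a stand-in for drawing from the
    predicted distribution over closing prices (assumed interface).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # One sampled scenario: draw a closing price for every night of the stay.
        hotel_cost = sum(sample_hotel_price(rng, night) for night in nights)
        total += utility - flight_cost - hotel_cost
    return total / n_samples
```

For instance, `expected_profit(1000.0, 600.0, [2, 3], lambda rng, night: rng.uniform(50, 150))` averages over hotel prices drawn uniformly between 50 and 150 per night.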

  17. The high-level algorithm • Denote the most profitable allocation of goods at any time by G* • When first flight quotes are posted: • Compute G* with current holdings and expected prices • Buy the flights in G* for which the expected cost of postponing commitment exceeds the expected benefit of postponing commitment • Starting 1 minute before each hotel close: • Compute G* with current holdings and expected prices • Buy the flights in G* for which expected cost of postponing commitment exceeds expected benefit of postponing commitment • Bid hotel room expected marginal values given holdings, new flights, and expected hotel purchases • Last minute: Buy remaining flights as needed by G* • In parallel (continuously): Buy/sell entertainment tickets based on their expected values

  18. The boosting algorithm: solving conditional density estimation problems • Start with ordered pairs (x,y), with x being a vector that describes auction-specific features, y being the difference between closing price and current price • Aim of boosting is, given current x, to estimate the conditional distribution of y • Construct a conditional distribution function that minimizes the sum of negative log-likelihoods of y given x, over all training samples. • Use this conditional distribution function to map x to a distribution over y
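
The training objective on this slide can be written compactly. The symbols $m$ (number of training pairs) and $p$ (a candidate conditional distribution) are notation introduced here for illustration; the slide only describes the objective in words.

```latex
\hat{p} \;=\; \arg\min_{p} \; \sum_{i=1}^{m} -\log p\left(y_i \mid x_i\right)
```

Minimizing this sum is equivalent to maximizing the likelihood of the observed price differences $y_i$ given their feature vectors $x_i$.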

  19. living agents vs. ATTac • Two very different approaches • Statistically insignificant difference in scores in TAC 2001

  20. Open and Closed Loop Processes • Closed-loop: the system feeds information back into itself and examines the world in an effort to validate its world model. • Appropriate for real-world environments in which feedback is necessary to validate agent actions. • Open-loop: no feedback from the environment to the agent. Output from processes is considered complete upon execution. • Appropriate for simulated rather than real environments (in real environments, tasks are generally not performed perfectly by the agent). • Generally more efficient for the same reason.

  21. Walverine: (Closed-loop) • Model Based: Flight and Hotel • Predicts hotel prices by Walrasian equilibrium • Derives expected demand from 64 clients’ preferences and initial flight prices, which influence clients’ choice of travel days • Constructs bids that maximize the expected value of the bid • Model Free: Entertainment • Q-Learning from thousands of auction instances (aside on model-based vs. model-free learning) • No empirically tuned parameters
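
A Walrasian equilibrium price is one at which demand equals supply. One standard way to approximate it is tâtonnement, which raises the price when demand exceeds supply and lowers it otherwise. The sketch below is a generic single-good illustration under assumed names and parameters; the slides do not say how Walverine actually computes its equilibrium.

```python
def tatonnement(demand, supply=16, price=0.0, step=1.0, iters=1000, tol=1e-6):
    """Adjust price until excess demand is (nearly) zero.

    demand: function price -> expected number of rooms demanded (hypothetical).
    step, iters, and tol are illustrative tuning parameters.
    """
    for _ in range(iters):
        excess = demand(price) - supply
        if abs(excess) < tol:
            break
        # Raise the price under excess demand, lower it under excess supply.
        price = max(0.0, price + step * excess)
    return price
```

With a toy linear demand such as `lambda p: 64 - 0.2 * p` and 16 rooms, the iteration settles near the price at which exactly 16 rooms are demanded.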

  22. SouthamptonTAC: (Closed-loop) • Adaptive agent, varies strategy according to market conditions • 3 classifications for environments: • Non-competitive (agent gets hotel at low prices) • Semi-competitive (medium prices) • Competitive (prices of hotels high) • Based on current game and outcomes of recent games • Non-competitive: • Buys all flights at beginning of game • Never changes itinerary of clients

  23. SouthamptonTAC: (Closed-loop) • Competitive: • Rapidly rising prices – buy at beginning • Stagnant prices – buy near the end • Fuzzy reasoning to predict hotel clearing prices • 3 rule bases • Factors include: price of hotel, price of counterpart hotel, price change in previous minute, price change in counterpart hotel in previous minute • Continuously assesses game type

  24. ROXY-BOT: (Open-loop) • Two phase bidding policy: • Solve completion problem • Optimization based on a tree structure using beam search that only partially expands the tree. [Greenwald] • Valuate goods in that set • Marginal utility calculator MU(x) = V(N) – V(N|x) • Computing Prices: (historical data) • Point estimates (’00) • Estimated price distributions (’01) • Averaging MU across many samples of estimated price distributions • Monte-Carlo simulation to evaluate bidding policy (’02)
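
Averaging marginal utility over sampled prices can be sketched as below. The slide's notation is terse, so this example reads MU(x) as "value of the holdings with good x minus value without it", which is an interpretation rather than a confirmed definition. The valuation function `toy_value` and all good names are hypothetical.

```python
def marginal_utility(value, holdings, good, prices):
    """Value with the good minus value without it, under one price scenario."""
    return value(holdings | {good}, prices) - value(holdings - {good}, prices)

def average_mu(value, holdings, good, price_samples):
    """Average the marginal utility across sampled price scenarios."""
    total = sum(marginal_utility(value, holdings, good, p) for p in price_samples)
    return total / len(price_samples)

def toy_value(holdings, prices):
    # Hypothetical valuation: a complete trip is worth 1000; missing goods
    # are bought at the scenario's prices, or the trip is abandoned (value 0).
    needed = {"flight_in", "flight_out", "hotel"}
    cost = sum(prices[g] for g in needed - holdings)
    return max(0, 1000 - cost)
```

For an agent holding both flights, the hotel room's marginal utility equals the scenario's hotel price when buying is still worthwhile, and the full trip value when the price would make the trip unprofitable.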

  25. Whitebear (Winner in ’02, Open-loop) • Flights: • A: buy everything • B: buy only what is absolutely necessary • Combination: buy everything except dangerous tickets • Hotels: (predictions simply historical averages) • A: bid small increment greater than current prices • B: bid marginal utility • Combination: Use A, unless MU is high, then use B • Domain specific, extensive experimentation • Not necessarily an optimal set of goods, no learning
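
The combined hotel rule (use A unless marginal utility is high, then use B) can be sketched in a few lines. The slide does not say what counts as "high" or how large the increment is, so `high_mu_threshold` and `increment` are hypothetical parameters.

```python
def hotel_bid(current_price, marginal_utility, increment=1.0, high_mu_threshold=500.0):
    """Combination strategy: bid just above the current price (A),
    unless the room's marginal utility is high, then bid that value (B).

    increment and high_mu_threshold are illustrative, not Whitebear's values.
    """
    if marginal_utility >= high_mu_threshold:
        return marginal_utility          # strategy B
    return current_price + increment     # strategy A
```

Strategy A keeps prices low when a room is replaceable, while B protects rooms whose loss would break an otherwise valuable trip.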

  26. Summary: Open vs Closed • All else equal, an open-loop strategy is better: • Simple • Avoids waiting costs (higher prices) • Predictability of price is the determining factor • Perfectly predictable – open-loop • Large price variance – closed-loop • Open-loop picks the good at the start and may pay a lot • Small price variance – optimal closed loop • But complexity for potentially small benefit
