 
                
Chapter 7 Nonlinear Optimization Models
Introduction • In many complex optimization problems, the objective and/or the constraints are nonlinear functions of the decision variables. Such optimization problems are called nonlinear programming (NLP) problems. In this chapter, we discuss a variety of interesting problems with inherent nonlinearities, from product pricing to portfolio optimization to rating sports teams.
Introduction continued • A model can become nonlinear for several reasons, including the following: • There are nonconstant returns to scale, which means that the effect of some input on some output is nonlinear. • In pricing models, where the goal is to maximize revenue (or profit), revenue is price multiplied by quantity sold, and price is typically the decision variable. Because quantity sold is related to price through a demand function, revenue is really price multiplied by a function of price, and this product is a nonlinear function of price.
Introduction continued • Analysts often try to find the model that best fits observed data. To measure the goodness of the fit, they typically sum the squared differences between the observed values and the model’s predicted values. Then they attempt to minimize this sum of squared differences. The squaring introduces nonlinearity. • In one of the most used financial models, the portfolio optimization model, financial analysts try to invest in various securities to achieve high return and low risk. The risk is typically measured as the variance (or standard deviation) of the portfolio, and it is inherently a nonlinear function of the decision variables (the investment amounts).
Introduction continued • As these examples illustrate, nonlinear models are common in the real world. In fact, it is probably more accurate to state that truly linear models are hard to find. • The real world often behaves in a nonlinear manner, so when you model a problem with LP, you are typically approximating reality. • By allowing nonlinearities in your models, you can often create more realistic models. Unfortunately, this comes at a price - nonlinear optimization models are more difficult to solve.
Basic ideas of nonlinear optimization • When you solve an LP problem with Solver, you are guaranteed that the Solver solution is optimal. • When you solve an NLP problem, however, Solver sometimes obtains a suboptimal solution.
Basic ideas of nonlinear optimization continued • For the figure graphed below, points A and C are called local maxima because the function is larger at A and C than at nearby points. • However, only point A maximizes the function; it is called the global maximum. • The problem is that Solver can get stuck near point C, concluding that C maximizes the function.
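The local-versus-global issue can be reproduced with a small sketch. The function below is a hypothetical stand-in with two peaks (it is not the function from the figure), and the gradient-ascent routine is only an illustrative local search, not Solver's own algorithm. Which peak the search reaches depends entirely on where it starts.

```python
def f(x):
    """Hypothetical function with two local maxima (not the one in the figure)."""
    return -x**4 + 2 * x**2 + 0.5 * x

def slope(x):
    """Derivative of f."""
    return -4 * x**3 + 4 * x + 0.5

def climb(start, step=0.01, iterations=5000):
    """Plain gradient ascent: keep moving uphill from the starting point."""
    x = start
    for _ in range(iterations):
        x += step * slope(x)
    return x

x_left = climb(-1.5)   # stops on the lower peak, near x = -0.93
x_right = climb(1.5)   # stops on the higher peak, near x = 1.06
print(x_left, f(x_left))
print(x_right, f(x_right))
```

Both runs stop where the slope is zero, but only the second one reaches the global maximum. A local search has no way of telling the two outcomes apart from the inside.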
Convex and concave functions • Solver is guaranteed to solve certain types of NLPs correctly. • To describe these NLPs, we need to define convex and concave functions. • A function of one variable is convex in a region if its slope (rate of change) in that region is always nondecreasing. Equivalently, a function of one variable is convex if a line drawn connecting two points on the curve never lies below the curve.
Convex and concave functions continued • In contrast, the function is concave if its slope is always nonincreasing, or equivalently, if a line connecting two points on the curve never lies above the curve.
Convex and concave functions continued • It can be shown that the sum of convex functions is convex and the sum of concave functions is concave. • If you multiply any convex function by a positive constant, the result is still convex, and if you multiply any concave function by a positive constant, the result is still concave. • However, if you multiply a convex function by a negative constant, the result is concave, and if you multiply a concave function by a negative constant, the result is convex.
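A rough numerical check of these definitions can be sketched as follows. The helper names `is_convex` and `is_concave` are hypothetical, and the test only samples the function on a grid, so a pass is evidence of convexity on that interval rather than a proof. It relies on the fact that a nondecreasing slope means the second difference f(x-h) - 2f(x) + f(x+h) is never negative.

```python
import math

def is_convex(f, lo, hi, n=200, tol=1e-9):
    """Grid check: every second difference must be nonnegative (up to tol)."""
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(1, n)]
    return all(f(x - h) - 2 * f(x) + f(x + h) >= -tol for x in xs)

def is_concave(f, lo, hi, n=200, tol=1e-9):
    """A function is concave exactly when its negative is convex."""
    return is_convex(lambda x: -f(x), lo, hi, n, tol)

square = lambda x: x * x
print(is_convex(square, -3, 3))                            # x^2 is convex
print(is_concave(math.log, 0.1, 5))                        # log is concave
print(is_convex(lambda x: square(x) + math.exp(x), -3, 3)) # sum of convex functions
print(is_concave(lambda x: -2 * square(x), -3, 3))         # negative multiple of a convex function
print(is_convex(math.sin, 0, 6))                           # sin is neither on [0, 6]
```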
Problems that solvers always solve correctly • In some situations, if certain conditions hold, Solver is guaranteed to find the global optimum. • Conditions for maximization problems: both conditions below have to be true. • The objective function is concave or the logarithm of the objective function is concave, and • The constraints are linear. • Conditions for minimization problems: • The objective function is convex, and • The constraints are linear.
When assumptions do not hold • There are many problems for which the conditions outlined previously do not hold or cannot be verified. • Because there is then some doubt whether Solver’s solution is the optimal solution, the best strategy is to • Try several possible starting values for the changing cells, • Run Solver from each of these, and • Take the best solution Solver finds.
When assumptions do not hold continued • In general, if you try several starting combinations for the changing cells and Solver obtains the same optimal solution in all cases, you can be fairly confident - but still not absolutely sure - that you have found the optimal solution to the NLP. • On the other hand, if you try different starting values for the changing cells and obtain several different solutions, then all you can do is keep the best solution you have found and hope that it is indeed optimal.
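The try-several-starting-values strategy can be sketched in a few lines. The objective `g`, the list of starting values, and the simple gradient-descent `local_search` are all assumptions made for illustration. Solver uses its own nonlinear algorithm, and the point here is only the outer loop that reruns a local search and keeps the best result.

```python
def g(x):
    """Hypothetical nonconvex objective to minimize (two local minima)."""
    return x**4 - 3 * x**2 + x

def g_slope(x):
    """Derivative of g."""
    return 4 * x**3 - 6 * x + 1

def local_search(start, step=0.01, iterations=2000):
    """Plain gradient descent from one starting value."""
    x = start
    for _ in range(iterations):
        x -= step * g_slope(x)
    return x

starts = [-2.0, -1.0, 0.0, 1.0, 2.0]            # several starting values
solutions = [local_search(s) for s in starts]   # one local search per start
best = min(solutions, key=g)                    # keep the best solution found

for s, x in zip(starts, solutions):
    print(f"start {s:5.1f} -> x = {x:.4f}, g(x) = {g(x):.4f}")
print("best x:", round(best, 4))
```

The starting values lead to two different answers, which is exactly the warning sign described above. Keeping the smaller objective value gives the better of the two, with no guarantee in general that it is the global minimum.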
Multistart option • There is a welcome new feature in Solver for Excel 2010, the Multistart option. • Because it is difficult to know where to start, the Multistart option provides an automatic way of starting from a number of starting solutions. • It selects several starting solutions automatically, runs the GRG nonlinear algorithm from each, and reports the best solution it finds.
Multistart option continued • To use the Multistart option, select the GRG Nonlinear method in the Solver dialog box, click on Options and then on the GRG Nonlinear tab. You can then check the Use Multistart box, as shown here.
Pricing models • Setting prices on products and services is becoming a critical decision for many companies. • A good example is pricing hotel rooms and airline tickets. To many airline customers, ticket pricing appears to be madness on the part of the airlines (how can it cost less to fly thousands of miles to London than to fly a couple of hundred miles within the United States?), but there is a method to the madness. • In this section, we examine several pricing problems that can be modeled as NLPs.
Multiple product purchases • Many products generate add-on sales of other products. • For example, if you own a men’s clothing store, you should recognize that when a person buys a suit, he often buys a shirt or a tie as well. • Failure to take this into account causes you to price your suits too high and lose potential sales of shirts and ties. • Example 7.3 illustrates the idea.
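A minimal sketch of the add-on effect follows. It assumes a hypothetical linear demand curve, a unit cost of 200, and an average add-on margin of 50 per suit sold; none of these numbers come from Example 7.3. Profit is a margin multiplied by a demand that itself depends on price, so it is a nonlinear function of the price.

```python
def demand(price):
    """Hypothetical linear demand for suits: units sold at a given price."""
    return max(1000 - 2 * price, 0)

def profit(price, unit_cost=200, addon_margin=0):
    """Profit when each suit sold also earns addon_margin on shirts and ties."""
    return (price - unit_cost + addon_margin) * demand(price)

prices = range(200, 501)  # candidate whole-dollar prices
price_ignoring_addons = max(prices, key=lambda p: profit(p))
price_with_addons = max(prices, key=lambda p: profit(p, addon_margin=50))

print(price_ignoring_addons, profit(price_ignoring_addons))
print(price_with_addons, profit(price_with_addons, addon_margin=50))
```

Under these assumptions the best suit price drops from 350 to 325 once the add-on margin is counted, which is the overpricing effect described above.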
Peak-load and off-peak demands • In many situations, there are peak-load and off-peak demands for a product. • In such a situation, it might be optimal for a producer to charge a higher price for peak-load service than for off-peak service. • Example 7.4 illustrates this situation.
Advertising response and selection models • In Chapter 4, we discussed an advertising allocation model (Example 4.1), where the problem was basically to decide how many ads to place on various television shows to reach the required number of viewers. • One assumption of that model was that the “advertising response” - that is, the number of exposures - is linear in the number of ads. This means that if one ad gains, say, one million exposures, then 10 ads will gain 10 million exposures. • This is a questionable assumption at best.
Advertising response and selection models continued • More likely, there is a decreasing marginal effect at work, where each extra ad gains fewer exposures than the previous ad. • In fact, there might even be a saturation effect, where there is an upper limit on the number of exposures possible and, after sufficiently many ads, this saturation level is reached.
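One functional form with both properties is an exponential saturation curve, exposures = a(1 - e^(-bn)) for n ads. This form and the parameter values below (`saturation` for a, `rate` for b) are assumptions chosen for illustration; the slides do not specify the response function.

```python
import math

def exposures(ads, saturation=100.0, rate=0.1):
    """Assumed response curve: saturation * (1 - exp(-rate * ads))."""
    return saturation * (1 - math.exp(-rate * ads))

# Extra exposures gained by the 1st, 2nd, ..., 10th ad.
gains = [exposures(n) - exposures(n - 1) for n in range(1, 11)]

for n, gain in enumerate(gains, start=1):
    print(f"ad {n:2d}: total {exposures(n):6.2f}, gain {gain:5.2f}")
print("50 ads:", round(exposures(50), 2), "of a possible", 100.0)
```

Each additional ad adds less than the one before it, and the total approaches the saturation level without exceeding it.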
Advertising response and selection models continued • In this section, we look at two related examples. • In the first example, a company uses historical data to estimate its advertising response function - the number of exposures it gains from a given number of ads. This is a nonlinear optimization model. • This type of advertising response function is used in the second example to solve a nonlinear version of the advertising selection problem from Chapter 4. Because the advertising response functions are nonlinear, the advertising selection problem is also nonlinear. • The estimation model is demonstrated in Example 7.5.
Advertising selection model • Now that you know how a company can estimate the advertising response function for any type of ad to any group of customers, you can use this type of response function in an advertising selection model. • This model is shown in Example 7.6.
Facility location models • Suppose you need to find a location for a facility such as a warehouse, a tool crib in a factory, or a fire station. • Your goal is to locate the facility to minimize the total distance that must be traveled to provide required services. • Facility location problems such as these can usually be set up as NLP models. • Example 7.7 is typical.
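A minimal sketch of such a model follows. It assumes straight-line (Euclidean) distances and customers given as (x, y, trips) triples, with hypothetical data. The iteration used is Weiszfeld's method, a standard technique for this objective, swapped in here for Solver. The total distance is a sum of convex functions, so the objective is convex and a local search is enough.

```python
import math

def total_distance(x, y, customers):
    """Sum over customers of trips * Euclidean distance to the facility."""
    return sum(w * math.hypot(x - cx, y - cy) for cx, cy, w in customers)

def locate_facility(customers, iterations=200, eps=1e-12):
    """Weiszfeld iteration, started from the trip-weighted centroid."""
    total_w = sum(w for _, _, w in customers)
    x = sum(cx * w for cx, _, w in customers) / total_w
    y = sum(cy * w for _, cy, w in customers) / total_w
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for cx, cy, w in customers:
            d = max(math.hypot(x - cx, y - cy), eps)  # guard against d = 0
            num_x += w * cx / d
            num_y += w * cy / d
            denom += w / d
        x, y = num_x / denom, num_y / denom
    return x, y

# Hypothetical customers: (x, y, annual trips).
customers = [(5, 10, 200), (10, 5, 150), (0, 12, 200), (12, 0, 300)]
best_x, best_y = locate_facility(customers)
print(round(best_x, 3), round(best_y, 3),
      round(total_distance(best_x, best_y, customers), 1))
```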
Models for rating sports teams • Sports fans always wonder which team is best in a given sport. Was USC, LSU, or Oklahoma number one during the 2003 NCAA football season? • You might be surprised to learn that Solver can be used to rate sports teams. • We illustrate one method for doing this in Example 7.8.
Methodology for rating sports teams • We first need to explain the methodology used to rate teams. Suppose that a team plays at home against another team. Then our prediction for the point spread of the game (home team score minus visitor team score) is: Predicted point spread = Home team rating - Visitor team rating + Home team advantage • The home team advantage is the number of extra points the home team gains from the psychological (or physical) advantage of playing on its home field. Football experts claim that this home team advantage in the NFL is about 3 points. However, we will estimate it, as well as the ratings.
Methodology for rating sports teams continued • We define the prediction error to be: Prediction error = Actual point spread - Predicted point spread • We determine ratings that minimize the sum of squared prediction errors. To get a unique answer to the problem, we need to “normalize” the ratings - that is, fix the average rating at some nominal value.
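The rating method can be sketched as follows. The three teams, the six game results, the nominal average of 85, and the gradient-descent settings are all hypothetical; the chapter uses Solver to carry out the minimization, and gradient descent is swapped in here only to keep the sketch self-contained. The point spreads were generated from ratings of 90, 85, and 80 with a 3-point home advantage, so the fit should recover those values.

```python
def fit_ratings(games, teams, mean_rating=85.0, lr=0.02, iterations=2000):
    """Minimize the sum of squared prediction errors by gradient descent."""
    ratings = {t: mean_rating for t in teams}
    home_adv = 0.0
    for _ in range(iterations):
        grad = {t: 0.0 for t in teams}
        grad_home = 0.0
        for home, visitor, spread in games:
            predicted = ratings[home] - ratings[visitor] + home_adv
            error = spread - predicted           # actual minus predicted
            grad[home] -= 2 * error
            grad[visitor] += 2 * error
            grad_home -= 2 * error
        for t in teams:
            ratings[t] -= lr * grad[t]
        home_adv -= lr * grad_home
        # Normalize: keep the average rating at the nominal value.
        shift = mean_rating - sum(ratings.values()) / len(teams)
        for t in teams:
            ratings[t] += shift
    return ratings, home_adv

# Hypothetical results: (home team, visiting team, home score minus visitor score).
games = [("A", "B", 8), ("B", "C", 8), ("C", "A", -7),
         ("B", "A", -2), ("C", "B", -2), ("A", "C", 13)]
ratings, home_adv = fit_ratings(games, teams=["A", "B", "C"])
print({t: round(r, 2) for t, r in ratings.items()}, round(home_adv, 2))
```

Without the normalization step, adding the same constant to every rating would leave all predictions unchanged, which is why the average has to be fixed to get a unique answer.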
Portfolio optimization models • Given a set of investments, how do financial analysts determine the portfolio that has the lowest risk and yields a high expected return? • This question was answered by Harry Markowitz in the 1950s. For his work on this and other investment topics, he received the Nobel Prize in economics in 1990.
Portfolio optimization models continued • The ideas discussed in this section are the basis for most methods of asset allocation used by Wall Street firms. • Asset allocation models are used, for example, to determine the percentage of assets to invest in stocks, gold, and Treasury bills.
Portfolio selection models • Most investors have two objectives in forming portfolios: to obtain a large expected return and to obtain a small variance (to minimize risk). • The problem is inherently nonlinear because variance is a nonlinear function of the investment amounts. • The most common way of handling this two-objective problem is to require a minimal expected return and then minimize the variance subject to the constraint on the expected return. • Example 7.9 illustrates how to accomplish this in Excel.
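The minimize-variance-subject-to-a-return-floor formulation can be sketched for two assets. The expected returns, the covariance matrix, and the 10% required return are hypothetical, and a coarse grid search over the weights stands in for Solver. Portfolio variance is the double sum of w_i w_j cov_ij, which is quadratic, and therefore nonlinear, in the weights.

```python
def expected_return(weights, means):
    """Weighted average of the assets' expected returns."""
    return sum(w * m for w, m in zip(weights, means))

def portfolio_variance(weights, cov):
    """Sum over all pairs i, j of w_i * w_j * cov[i][j]."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

means = [0.14, 0.08]                       # hypothetical expected returns
cov = [[0.04, 0.005], [0.005, 0.01]]       # hypothetical covariance matrix
required_return = 0.10

# Candidate portfolios: fraction i/100 in asset 1, the rest in asset 2.
candidates = [[i / 100, 1 - i / 100] for i in range(101)]
feasible = [w for w in candidates
            if expected_return(w, means) >= required_return]
best = min(feasible, key=lambda w: portfolio_variance(w, cov))

print(best, expected_return(best, means), portfolio_variance(best, cov))
```

With these numbers the return constraint is binding: the lowest-variance mix that ignores it would hold less of the first asset, so the search settles on the smallest feasible fraction.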
Estimating the beta of a stock • For financial analysts, it is important to be able to predict the return on a stock from the return on the market, that is, on a market index such as the S&P 500 index. • Here, the return on an investment over a time period is the percentage change in its value over the time period. • The beta (β) of a stock measures the responsiveness of the stock’s return to changes in the market return. Its true value is never known; it can only be estimated. • The returns on stocks with large positive or negative betas are highly sensitive to the business cycle.
Estimating the beta of a stock continued • Sharpe’s capital asset pricing model (CAPM) implies that stocks with large beta values are riskier and therefore must yield higher returns than those with small beta values. • This implies that if you can estimate beta values more accurately than people on Wall Street, you can better identify overvalued and undervalued stocks and make a lot of money. • There are four possible criteria for choosing the unknown estimates of beta.
Estimating the beta of a stock continued • Criterion 1: Sum of squared errors (Least Squares) Here the objective is to minimize the sum of the squared errors over all observations, the same criterion used elsewhere in this chapter. The sum of the squared errors is a convex function of the estimates of α and β, so Solver is guaranteed to find the (unique) estimates of α and β that minimize the sum of squared errors. The main problem with the least squares criterion is that outliers, points for which the error in Equation (7.12) is especially large, exert a disproportionate influence on the estimates of α and β.
Estimating the beta of a stock continued • Criterion 2: Weighted sum of squared errors Criterion 1 gives equal weights to older and more recent observations. It seems reasonable that more recent observations have more to say about the beta of a stock, at least for future predictions, than older observations. To incorporate this idea, a smaller weight is attached to the squared errors for older observations. Although this method usually leads to more accurate predictions of the future than least squares, the least squares method has many desirable statistical properties that weighted least squares estimates do not possess.
Estimating the beta of a stock continued • Criterion 3: Sum of absolute errors (SAE) Instead of minimizing the sum of the squared errors, it makes sense to minimize the sum of the absolute errors for all observations. This is often called the sum of absolute errors (SAE) approach. This method has the advantage of not being greatly affected by outliers. Unfortunately, less is known about the statistical properties of SAE estimates. Another drawback to SAE is that there can be more than one combination of α and β that minimizes SAE. However, SAE estimates have the advantage that they can be obtained with linear programming.
Estimating the beta of a stock continued • Criterion 4: Minimax A final possibility is to minimize the maximum absolute error over all observations. This method might be appropriate for a highly risk-averse decision maker. This minimax criterion can also be implemented using LP. • Example 7.10 illustrates how Solver can be used to obtain estimates of α and β for these four criteria.
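The four criteria can be compared on the same set of errors with a short sketch. It assumes the error for each observation is the stock return minus (α + β × market return), uses hypothetical return data, and picks a geometric `decay` factor for the weights in Criterion 2; the slides say only that older observations get smaller weights. The closed-form `least_squares` function is the standard simple-regression formula for Criterion 1.

```python
def errors(alpha, beta, market, stock):
    """Assumed error: stock return minus (alpha + beta * market return)."""
    return [s - (alpha + beta * m) for m, s in zip(market, stock)]

def sse(errs):
    """Criterion 1: sum of squared errors."""
    return sum(e * e for e in errs)

def weighted_sse(errs, decay=0.5):
    """Criterion 2: older observations (earlier in the list) get smaller weights."""
    n = len(errs)
    return sum(decay ** (n - 1 - i) * e * e for i, e in enumerate(errs))

def sae(errs):
    """Criterion 3: sum of absolute errors."""
    return sum(abs(e) for e in errs)

def max_abs_error(errs):
    """Criterion 4: the quantity that minimax tries to make small."""
    return max(abs(e) for e in errs)

def least_squares(market, stock):
    """Closed-form estimates that minimize Criterion 1."""
    n = len(market)
    mean_m, mean_s = sum(market) / n, sum(stock) / n
    sxy = sum((m - mean_m) * (s - mean_s) for m, s in zip(market, stock))
    sxx = sum((m - mean_m) ** 2 for m in market)
    beta = sxy / sxx
    return mean_s - beta * mean_m, beta

market = [1, 2, 3, 4]    # hypothetical market returns (%), oldest first
stock = [2, 3, 5, 10]    # hypothetical stock returns (%); the last is an outlier
errs = errors(0, 2, market, stock)   # trial estimates alpha = 0, beta = 2
print(sse(errs), weighted_sse(errs), sae(errs), max_abs_error(errs))
print(least_squares(market, stock))
```

The single large final return pulls the least squares estimate of β well above the trial value of 2, which illustrates the sensitivity to outliers noted under Criterion 1.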
Conclusion • A large number of real-world problems can be approximated well by linear models. • However, many problems are also inherently nonlinear. • We have illustrated several such problems in this chapter, including the important class of portfolio selection problems where the risk, usually measured by portfolio variance, is a nonlinear function of the decision variables.