Linear Programming: Foundations and Extensions. Robert J. Vanderbei, Dept. of Operations Research and Financial Engineering, Princeton University.
Let N denote the set of nonbasic indices. Suppose at this point that the sequence x^k has two or more limit points. This is a contradiction, and consequently there can be only one limit point.

Stopping Rules. In the previous section we described the behavior of the algorithm as it approaches a limit point. In this section we analyze this behavior in more detail in a small neighborhood of a limit point. This information is used to establish stopping rules for our algorithm.
If (12) is bounded, then M is finite. Note that this assumption is stronger than the boundedness assumption in the previous section; the proposition then follows. A conservative approach would be to choose a very large number, which makes the precise inequality in Proposition 6 into an approximate inequality. From (18) we see that w^T b is an estimate of the optimal value of the objective function.
For the rest of this section we show that this tends to be a better estimate than c^T x. For practical purposes, it may be useful to monitor the duality gap, c^T x - w^T b.
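A duality-gap monitor of this kind is easy to sketch. This is a minimal illustration, assuming a minimization primal (min c^T x) with dual objective w^T b, so the gap is c^T x - w^T b; the function names are hypothetical.

```python
def duality_gap(c, x, b, w):
    # Primal objective c^T x minus dual estimate w^T b (minimization primal).
    primal = sum(cj * xj for cj, xj in zip(c, x))
    dual = sum(wi * bi for wi, bi in zip(w, b))
    return primal - dual

def small_gap(c, x, b, w, tol=1e-8):
    # Relative stopping test: gap small compared with the objective's scale.
    gap = duality_gap(c, x, b, w)
    scale = 1.0 + abs(sum(cj * xj for cj, xj in zip(c, x)))
    return gap <= tol * scale
```

In practice one would stop iterating as soon as `small_gap` returns True.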
For the remainder of this section we impose Assumptions 1-3 of the previous section. Hence we see that c_B^T B^{-1} b is the optimal value of the objective function. Substituting this into (22) and simplifying, we get (23). Note that this neighborhood can be small if the problem is nearly degenerate.

Upper Bound Constraints. If x ∈ S ∩ F, then it is easy to show that x is optimal (cf. Proposition 3 under Assumptions 1-3 of Section 4).

Algebraic Comparison with the Simplex Method. There are significant parallels, and differences, between our algorithm and the simplex method.
We have found the relationship between the two algorithms noteworthy. The parallels are summarized in Table 1. This is the vector of residuals from the weighted least-squares problem of minimizing ||D(c - A^T w)|| with respect to w.
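The dual estimate w can be computed from this weighted least-squares problem. One way is via the normal equations (A D^2 A^T) w = A D^2 c; the sketch below uses a tiny Gaussian-elimination solver and is intended for small dense systems only (a QR factorization is numerically preferable). The function names are hypothetical.

```python
def solve(M, v):
    # Solve M y = v by Gaussian elimination with partial pivoting.
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for j in range(k, n + 1):
                M[r][j] -= f * M[k][j]
    y = [0.0] * n
    for k in range(n - 1, -1, -1):
        y[k] = (M[k][n] - sum(M[k][j] * y[j] for j in range(k + 1, n))) / M[k][k]
    return y

def dual_estimate(A, c, d):
    # w minimizing ||D(c - A^T w)||, D = diag(d): solve (A D^2 A^T) w = A D^2 c.
    m, n = len(A), len(c)
    AD2AT = [[sum(A[i][j] * d[j] ** 2 * A[k][j] for j in range(n))
              for k in range(m)] for i in range(m)]
    AD2c = [sum(A[i][j] * d[j] ** 2 * c[j] for j in range(n)) for i in range(m)]
    return solve(AD2AT, AD2c)
```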
Table 1. Comparison between the simplex method and our algorithm. (For the simplex method, the initial x is a feasible vertex.) See Proposition 6 for an ε-optimal stopping rule.
The main difference lies in how the reduced costs are used to generate a direction p. The simplex method dismisses some information contained in the reduced costs, while our algorithm does not. The price paid is that each iteration of the simplex method takes O(mn) computations, while each iteration of our algorithm takes O(m²n) computations.

Computational Experience. We have coded our algorithm and the revised simplex method in Pascal.
Both algorithms solve the canonical linear program 1.
To get an interior feasible point for our algorithm, the procedure adds a column to the A matrix as described in Section 2. At each iteration we check the variable that would become zero if we took a step all the way to the wall (i.e., to the boundary of the feasible region).
We have only considered dense problems in our comparisons. For such problems, perhaps the best algorithm for computing the vector w of dual variables is the QR algorithm for solving least-squares problems.
This is the algorithm we have implemented, borrowing the code from the reference cited below, to which we refer the reader for details. In both algorithms we have used a large constant for infinity. We generated random problems as follows. The elements of the constraint matrix A are independent real values that are uniform on the interval [0, 1].
All random variables were generated using the random procedure in Pascal. Choosing all elements of A to be nonnegative guarantees that the problem is bounded. The elements of the cost vector c were also independent and uniform on [0, 1]. The right-hand side was generated so as to guarantee that the problem is feasible.
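The generation scheme can be sketched as follows. How b was produced is not stated above, so the choice b = A·(1,…,1) below is an assumption made purely to guarantee feasibility (it makes x = (1,…,1) feasible); the function name is hypothetical.

```python
import random

def random_dense_lp(m, n, seed=0):
    # A and c uniform on [0, 1], as in the text; b chosen (an assumption)
    # so that x = (1, ..., 1) is feasible, guaranteeing feasibility.
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n)] for _ in range(m)]
    c = [rng.random() for _ in range(n)]
    b = [sum(row) for row in A]
    return A, b, c
```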
To get an idea of how our algorithm will do on larger problems, we generated random problems of various sizes. We collected the time and the number of iterations for each problem and for each algorithm. We did a regression of the logarithm of the time (and of the number of iterations) as a linear function of the logarithms of m and n. The results are shown in Table 2.
The regression is based on data collected from the computer programs, running on a Tandy computer. Table 2 reports the performance comparison for dense constraint matrices (run time in minutes and iteration counts for each method). The most important observation concerns the regression on run time: roughly speaking, it says that to get a relative improvement of a factor of 10 of our algorithm over the simplex method, one must increase m and n by a substantial factor. Two final observations are that the extrapolated crossover point is sensitive to the shape of the constraint matrix and typically lies beyond the range of our data.
Lastly, these results may not apply to large sparse problems or problems with special structure. Further algorithmic work and computational testing are needed before conclusions regarding the practical impact of Karmarkar's algorithm can be drawn.
We thank N. Karmarkar and M. Todd (Cornell University) for helpful discussions. In particular, Proposition 5 is due to M. Todd. We are also grateful to N. Megiddo for carefully reading the paper and for suggesting several significant improvements.

References
L. Atkinson and P.
Cavalier and A.
N. Karmarkar, A new polynomial-time algorithm for linear programming, Combinatorica 4 (1984).
M. Todd and B. Burrell, An extension of Karmarkar's algorithm for linear programming using dual variables, Technical Report No.

The Phase I algorithm either proves that the problem is infeasible or produces a basic feasible solution.
The Phase II algorithm either discovers that the problem is unbounded or finds a basic optimal solution. These statements depend, of course, on applying a variant of the simplex method that does not cycle, which we now know to exist.
Geometry. As we saw in the previous chapter, the set of feasible solutions for a problem in two dimensions is the intersection of a number of halfplanes, i.e., a polygon. In three dimensions, the situation is similar. Consider, for example, a problem in three variables with five inequality constraints. Each inequality confines the feasible solutions to one side of a plane, and the same is true for each of the other four inequalities.
The feasible set consists of those points in space that satisfy all five inequalities, i. This set is the polyhedron shown in Figure 3. This polyhedron is bordered by five facets, each facet being a portion of one of the planes that was defined by replacing a constraint inequality with an equation.
The facets acquire a particularly simple description if we introduce slack variables into the problem. Indeed, each facet corresponds precisely to some variable (either original or slack) vanishing.
The correspondences can be continued. Indeed, each edge of the polyhedron corresponds to a pair of variables vanishing. Going further yet, each vertex of the polyhedron corresponds to three variables vanishing. Every basic feasible solution involves two basic variables and three nonbasic variables. Furthermore, the three nonbasic variables are, by definition, zero in the basic feasible solution.
Therefore, for this example, the basic feasible solutions stand in one-to-one correspondence with the vertices of the polyhedron. But consider now the following problem: Algebraically, the only difference between this problem and the previous one is that the right-hand side of the first inequality is now a 2 instead of a 3.
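The vertex/basic-solution correspondence is easy to check computationally. The sketch below works in two dimensions to stay short (the chapter's example is three-dimensional, but the idea is the same): each candidate vertex is the intersection of two tight constraints, i.e., two vanishing variables; a degenerate vertex has more constraints tight than its dimension, so several variable choices map to the same point. The function name is hypothetical.

```python
from itertools import combinations

def vertices_2d(A, b):
    # Vertices of {x in R^2 : A x <= b, x >= 0}: each vertex is the
    # intersection of two tight constraints (nonnegativity included).
    rows = [r[:] for r in A] + [[-1.0, 0.0], [0.0, -1.0]]
    rhs = list(b) + [0.0, 0.0]
    verts = []
    for i, j in combinations(range(len(rows)), 2):
        a11, a12 = rows[i]
        a21, a22 = rows[j]
        det = a11 * a22 - a12 * a21
        if abs(det) < 1e-12:
            continue  # the two lines are parallel
        x1 = (rhs[i] * a22 - a12 * rhs[j]) / det
        x2 = (a11 * rhs[j] - rhs[i] * a21) / det
        feasible = all(r[0] * x1 + r[1] * x2 <= rh + 1e-9
                       for r, rh in zip(rows, rhs))
        # Deduplicate: a degenerate vertex arises from several pairs.
        if feasible and not any(abs(x1 - v[0]) + abs(x2 - v[1]) < 1e-9
                                for v in verts):
            verts.append((x1, x2))
    return verts
```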
But the geometry is quite different. The set of feasible solutions for the degenerate problem has a vertex through which more than three of the defining planes pass. This vertex does not correspond to one basic feasible solution; rather, there are four degenerate basic feasible solutions, each representing it. We end by considering the geometric effect of the perturbation method for resolving degeneracy.
By perturbing the right-hand sides, one moves the planes that determine the facets. If the moves are random or chosen with vastly different magnitudes (all small), then one would expect that each vertex in the perturbed problem would be determined by exactly three planes.
That is, degenerate vertices from the original problem get split into multiple nearby vertices in the perturbed problem. Note how the degenerate vertex from the original figure is split apart in the perturbed one. The simple pivot tool with the Lexicographic labels can be used to check your arithmetic.

Exercises. Show that this problem has feasible solutions but no vertex solutions. How does this reconcile with the fundamental theorem of linear programming?
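The perturbation described above can be sketched in a couple of lines; ε is a hypothetical small parameter, and taking successive powers of it gives the "vastly different magnitudes, all small" that the text calls for.

```python
def perturb_rhs(b, eps=1e-4):
    # Perturb right-hand sides by vastly different magnitudes,
    # b_i -> b_i + eps^(i+1), so no two constraints move by the same amount.
    return [bi + eps ** (i + 1) for i, bi in enumerate(b)]
```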
Notes. The first example of cycling was given by Hoffman (1953). Early proofs of the fundamental theorem of linear programming relied on variants of the simplex method that do not cycle; hence, finding such variants occupied the attention of early researchers in linear programming. The perturbation method was first suggested by A. Orden and developed independently. The essentially equivalent lexicographic method first appeared in Dantzig et al. For an extensive treatment of degeneracy issues see Gal.

CHAPTER 4. Efficiency of the Simplex Method

In the previous chapter, we saw that the simplex method (with appropriate pivoting rules to guarantee no cycling) will solve any linear programming problem for which an optimal solution exists.
In this chapter, we investigate just how fast it will solve a problem of a given size.

Performance Measures. Performance measures can be broadly divided into two types: worst case and average case. A worst-case analysis asks how much effort is needed to solve the hardest instance of a given size, while an average-case analysis looks at the average amount of effort, averaging over all problems of a given size. Worst-case analyses are generally easier than average-case analyses. The reason is that, for worst-case analyses, one simply needs to give an upper bound on how much effort is required and then exhibit a specific example that attains this bound.
For average-case analyses, there are two serious difficulties. The first is that it is not at all clear how one should model the space of random problems. Secondly, given such a model, one must be able to evaluate the amount of effort required to solve every problem in the sample space. Therefore, worst-case analysis is more tractable than average-case analysis, but it is also less relevant to a person who needs to solve real problems.
In this chapter, we shall give a worst-case analysis of the simplex method. Later, in Chapter 12, we shall present results of empirical studies that indicate the average behavior over finite sets of real problems. Such studies act as a surrogate for a true average-case analysis.
Measuring the Size of a Problem. Before looking at worst cases, we must discuss two issues. First, how do we specify the size of a problem? Two parameters come naturally to mind: m and n, the number of constraints and the number of variables. However, we should mention some drawbacks associated with this choice. First of all, it would be preferable to use only one number to indicate size. Since the data for a problem consist of the constraint coefficients together with the right-hand side and objective function coefficients, perhaps we should use the total number of data elements, which is roughly mn.
However, the constraint matrices of many real problems contain a large proportion of zeros. Efficient implementations do indeed take advantage of the presence of lots of zeros, and so an analysis should also account for this. Hence, a good measure might be simply the number of nonzero data elements.
This would definitely be an improvement, but one can go further. On a computer, floating-point numbers are all the same size and can be multiplied in the same amount of time. But if a person is to solve a problem by hand (or use unlimited-precision computation on a computer), then certainly multiplying 23 by 7 is a lot easier than multiplying two numbers having many digits each. This suggests measuring the size of a problem by the number of bits needed to write down all of the data. This measure is popular among most computer scientists and is usually denoted by L.
However, with a little further abstraction, the size of the data, L, is seen to be ambiguous. As we saw in Chapter 1, real-world problems, while generally large and sparse, usually can be described quite simply and involve only a small amount of true input data that gets greatly expanded when setting the problem up with a constraint matrix, right-hand side, and objective function.
So should L represent the number of bits needed to specify the nonzero constraint coefficients, objective coefficients, and right-hand sides, or should it be the number of bits in the original data set plus the number of bits in the description of how this data represents a linear programming problem? No one currently uses this last notion of problem size, but it seems fairly reasonable that they should or at least that they should seriously consider it. Anyway, our purpose here is merely to mention that these important issues are lurking about, but, as stated above, we shall simply focus on m and n to characterize the size of a problem.
Measuring the Effort to Solve a Problem. The second issue to discuss is how one should measure the amount of work required to solve a problem. Unfortunately, there are (hopefully) many readers of this text, not all of whom use the exact same computer.
Even if they did, computer technology changes rapidly, and a few years down the road everyone would be using something entirely different. So the time needed to solve a problem, while the most desirable measure, is not the most practical one here.
Fortunately, there is a fairly reasonable substitute. Algorithms are generally iterative processes, and the time to solve a problem can be factored into the number of iterations required to solve the problem times the amount of time required to do each iteration. The first factor, the number of iterations, does not depend on the computer and so is a reasonable surrogate for the actual time. This surrogate is useful when comparing various algorithms within the same general class of algorithms, in which the time per iteration can be expected to be about the same among the algorithms; however, it becomes meaningless when one wishes to compare two entirely different algorithms.
For now, we shall measure the amount of effort to solve a linear programming problem by counting the number of iterations needed to solve it. So how many iterations might be needed in the worst case? Well, we have already seen that for some pivoting rules the simplex method can cycle, and hence the worst-case solution time for such variants is infinite.
However, what about noncycling variants of the simplex method? Is their worst case finite, and if so, how big is it? It should be noted that, even though typographically compact, the expression 2^n is huge even when n is not very big. We shall now give an example, first discovered by V. Klee and G. J. Minty, on which the simplex method with the largest-coefficient rule takes an exponential number of iterations. The example is quite simple to state. It is instructive to look closely at the constraints. The first constraint simply says that x1 is no bigger than one. With this in mind, the second constraint says that x2 has an upper bound of about 100, depending on how big x1 is.
Similarly, the third constraint says that x3 is roughly no bigger than 10,000 (again, this statement needs some adjustment depending on the sizes of x1 and x2). Therefore, the constraints are approximately just a set of upper bounds, which means that the feasible region is virtually an n-dimensional hypercube. For this reason, the feasible region for the Klee-Minty problem is often referred to as the Klee-Minty cube. An n-dimensional hypercube has 2^n vertices, and, as we shall see, the simplex method with the largest-coefficient rule will start at one of these vertices and visit every vertex before finally finding the optimal solution.
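The constraint data can be generated programmatically. The constants below follow one common textbook form of the Klee-Minty problem and are an assumption, since the exact coefficients are not reproduced above: maximize Σ_j 10^(n-j) x_j subject to 2·Σ_{j<i} 10^(i-j) x_j + x_i ≤ 100^(i-1) for i = 1, …, n, with x ≥ 0.

```python
def klee_minty(n):
    # One common form of the Klee-Minty problem (constants are an
    # assumption; texts differ in the exact coefficients):
    #   max  sum_j 10^(n-j) x_j
    #   s.t. 2*sum_{j<i} 10^(i-j) x_j + x_i <= 100^(i-1),  i = 1..n,  x >= 0.
    c = [10 ** (n - j) for j in range(1, n + 1)]
    A = [[2 * 10 ** (i - j) if j < i else (1 if j == i else 0)
          for j in range(1, n + 1)] for i in range(1, n + 1)]
    b = [100 ** (i - 1) for i in range(1, n + 1)]
    return A, b, c
```

For n = 3 this reproduces the bounds discussed above: x1 ≤ 1, x2 ≤ about 100, x3 ≤ about 10,000.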
The right-hand sides still grow by huge amounts as i increases. Finally, we wish to add a constant to the objective function so that the Klee-Minty problem can be written in the maximization form given above; this reformulation is taken up in the exercises. Using the largest-coefficient rule, the entering variable is x1. From the fact that each subsequent b_i is huge compared with its predecessor, it follows that w1 is the leaving variable.
A few observations should be made. First, every pivot is the swap of an x_j with the corresponding w_j. Also note that the final dictionary could have been reached from the initial dictionary in just one pivot if we had selected x3 to be the entering variable. But the largest-coefficient rule dictated selecting x1. It is natural to wonder whether the largest-coefficient rule could be replaced by some other pivot rule for which the worst-case behavior would be much better than the 2^n behavior of the largest-coefficient rule.
So far no one has found such a pivot rule. However, no one has proved that such a rule does not exist either. Finally, we mention that one desirable property of an algorithm is that it be scale invariant. This means that should the units in which one measures the decision variables in a problem be changed, the algorithm would still behave in exactly the same manner. The simplex method with the largest-coefficient rule is not scale invariant.
Now, the largest-coefficient rule picks variable x3 to enter. Variable w3 leaves, and the method steps to the optimal solution in just one iteration. There exist pivot rules for the simplex method that are scale invariant.
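The failure of scale invariance is easy to demonstrate with hypothetical numbers: rescaling one variable's units rescales its objective coefficient, which can change which coefficient is largest and hence which variable the rule picks.

```python
def largest_coefficient_choice(c):
    # Entering-variable index under the largest-coefficient rule.
    return max(range(len(c)), key=lambda j: c[j])

# Hypothetical objective coefficients before any rescaling:
before = [4.0, 3.0, 1.0]
# Measuring the third variable in units 100x smaller (x3' = x3 / 100)
# multiplies its objective coefficient by 100:
after = [4.0, 3.0, 100.0]
```

With `before` the rule enters the first variable; with `after` it enters the third, even though the underlying problem is unchanged.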
But Klee-Minty-like examples have been found for most proposed alternative pivot rules, whether scale invariant or not. In fact, it is an open question whether there exist pivot rules for which one can prove that no problem instance requires an exponential number of iterations as a function of m or n.

Exercises. Fix k and consider the pivot in which x_k enters the basis and w_k leaves the basis. Show that the resulting dictionary is of the same form as before. Show that, by ignoring feasibility preservation of intermediate dictionaries, this dictionary can be arrived at in exactly k pivots.
For a survey of probabilistic methods, the reader should consult Borgwardt. For many years it was unknown whether linear programming had polynomial complexity; the Klee-Minty examples show that the simplex method with the largest-coefficient rule does not achieve it. In 1979, Khachian gave a new algorithm for linear programming, called the ellipsoid method, which is polynomial and therefore established once and for all that linear programming has polynomial complexity. The collection of all problem classes having polynomial complexity is usually denoted by P.
The study of how difficult it is to solve a class of problems is called complexity theory. An important problem in theoretical computer science is to determine whether or not P is a strict subset of NP.
CHAPTER 5. Duality Theory

Associated with every linear programming problem is a related problem called its dual. The dual of this dual linear program is the original linear program, which is then referred to as the primal linear program. It turns out that every feasible solution for one of these two linear programs gives a bound on the optimal objective function value for the other. These ideas are important and form a subject called duality theory, which is the topic we shall study in this chapter.
Motivation: Finding Upper Bounds. We begin with an example. Any feasible solution provides a lower bound on the optimal objective function value; but how good is this bound? Is it close to the optimal value? To answer, we need upper bounds, which we can find as follows.
We have localized the search to somewhere between 9 and the upper bound just derived. These bounds leave a gap (within which the optimal solution lies), but they are better than nothing. Furthermore, they can be improved. To get a better upper bound, we again apply the same upper-bounding technique, but we replace the specific numbers we used before with variables and then try to find the values of those variables that give us the best upper bound.
So we start by multiplying the two constraints by nonnegative numbers y1 and y2, respectively. The fact that these numbers are nonnegative implies that they preserve the direction of the inequalities. Adding the scaled constraints produces an upper bound on the objective whenever the combined coefficients dominate the objective coefficients, and minimizing that bound leads naturally to the following optimization problem. This problem is called the dual linear programming problem associated with the given linear programming problem.
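For a standard-form primal max c^T x subject to A x ≤ b, x ≥ 0, this recipe yields the dual min b^T y subject to A^T y ≥ c, y ≥ 0. Forming the dual's data is just a transpose plus a swap of b and c; a minimal sketch (hypothetical function name):

```python
def dual_of_standard_form(A, b, c):
    # Dual of  max c^T x  s.t.  A x <= b, x >= 0:
    #          min b^T y  s.t.  A^T y >= c, y >= 0.
    # Returns (constraint matrix, rhs, objective) of the dual.
    m, n = len(A), len(A[0])
    A_T = [[A[i][j] for i in range(m)] for j in range(n)]
    return A_T, c, b
```

Applying the function twice returns the original data, mirroring the fact that the dual of the dual is the primal.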
In the next section, we will define the dual linear programming problem in general.

The Dual Problem. Given a linear programming problem in standard form, the associated dual problem is defined as in the previous section. Since we started with the primal problem and derived the dual, our first order of business is to show that taking the dual of the dual returns us to the primal. To see this, we first must write the dual problem in standard form. That is, we must change the minimization into a maximization, and we must change the first set of greater-than-or-equal-to constraints into less-than-or-equal-to.
Of course, we must effect these changes without altering the problem. To change a minimization into a maximization, we note that minimizing something is equivalent to maximizing its negative and then negating the answer.

The Weak Duality Theorem. As we saw in our example, the dual problem provides upper bounds for the primal objective function value. This result is true in general and is referred to as the Weak Duality Theorem: if x is feasible for the primal and y is feasible for the dual, then Σ_j c_j x_j ≤ Σ_i b_i y_i. The proof is a simple chain of obvious inequalities:

Σ_j c_j x_j ≤ Σ_j (Σ_i y_i a_ij) x_j = Σ_i (Σ_j a_ij x_j) y_i ≤ Σ_i b_i y_i,

where the first inequality follows from dual feasibility (Σ_i y_i a_ij ≥ c_j) together with x_j ≥ 0. The second inequality, of course, holds for similar reasons.
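The weak duality chain can be verified numerically for any particular pair of feasible solutions; a plain-Python sketch (hypothetical function name):

```python
def weak_duality_holds(A, b, c, x, y, tol=1e-9):
    # For primal-feasible x and dual-feasible y, check the chain
    #   c^T x  <=  y^T A x  <=  b^T y.
    n = len(c)
    cx = sum(c[j] * x[j] for j in range(n))
    yAx = sum(y[i] * sum(A[i][j] * x[j] for j in range(n))
              for i in range(len(b)))
    by = sum(b[i] * y[i] for i in range(len(b)))
    return cx <= yAx + tol and yAx <= by + tol
```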
The weak duality theorem tells us that the set of primal values lies entirely to the left of the set of dual values: the primal objective values are all less than the dual objective values. An important question is whether or not there is a gap between the largest primal value and the smallest dual value.
As we shall see shortly, these sets are both closed intervals (perhaps of infinite extent), and the right endpoint of the primal set butts up against the left endpoint of the dual set (see the figure). That is, there is no gap between the optimal objective function value for the primal and for the dual. The lack of a gap between primal and dual objective values provides a convenient tool for verifying optimality. To see that the primal solution is optimal, consider any other feasible solution (x1, x2, ...).
An analogous argument shows that the dual solution is also optimal for the dual problem. Both these solutions are feasible, and both yield the same objective value. Hence, the weak duality theorem says that these solutions are optimal.

The Strong Duality Theorem. The fact that for linear programming there is never a gap between the primal and the dual optimal objective values is usually referred to as the Strong Duality Theorem: if the primal problem has an optimal solution, then so does the dual, and the two optimal objective values coincide. A theorem like this can be hard to appreciate in the abstract; in such cases, it is better to illustrate the idea with a simple example.
Anyone who has taken a course in linear algebra probably already appreciates such a statement. In any case, it is true here, as we now explain. The main idea that we wish to illustrate is that, as the simplex method solves the primal problem, it also implicitly solves the dual problem, and it does so in such a way that the two optimal objective values coincide. To see what we mean, let us return to the example discussed earlier. Since the inequality constraints in the dual problem are greater-than constraints, each dual slack is defined as a left-hand side minus the corresponding right-hand side.
Therefore, the primal and dual dictionaries are written out accordingly. Also note that the numbers in the dual dictionary are simply the negatives of the numbers in the primal dictionary, arranged with the rows and columns interchanged. Our goal now is to apply the simplex method to the primal problem and at the same time perform the analogous pivots on the dual problem.
We shall discover that the negative-transpose property persists throughout the iterations. Since the primal dictionary is feasible, no Phase I procedure is necessary.
For the first pivot, we pick x3 as the entering variable (x1 has the largest coefficient, but x3 provides the greatest one-step increase in the objective).
With this choice, the leaving variable must be w2. Similarly, row w2 in the primal corresponds to column y2 in the dual. Hence, to make an analogous pivot in the dual dictionary, we select y2 as the entering variable and z3 as the leaving variable.
While this choice of entering and leaving variable may seem odd compared to how we have chosen entering and leaving variables before, we should note that our earlier choice was guided by the desire to increase the objective function while preserving feasibility. Here, the dual dictionary is not even feasible, and so such considerations are meaningless. Once we give up those rules for the choice of entering and leaving variables, it is easy to see that a pivot can be performed with any choice of entering and leaving variables, provided only that the coefficient on the entering variable in the constraint of the leaving variable does not vanish.
Such is the case with the current choice. Hence, we do the pivot in both dictionaries. Note that these two dictionaries still have the property of being negative transposes of each other. For the next pivot, the entering variable in the primal dictionary is x2 (this time there is no choice) and the leaving variable is w1. In the dual dictionary, the corresponding entering variable is y1 and the leaving variable is z2. This primal dictionary is optimal, since the coefficients in the objective row are all negative.
Looking at the dual dictionary, we see that it is now feasible for the analogous reason. In fact, it is optimal too. Finally, the primal and dual objective function values coincide. The situation should now be clear. Given a linear programming problem, which is assumed to possess an optimal solution, first apply the Phase I procedure to get a basic feasible starting dictionary for Phase II. Then apply the simplex method to find an optimal solution.
Each primal dictionary generated by the simplex method implicitly defines a corresponding dual dictionary as follows: As long as the primal dictionary is not optimal, the implicitly defined dual dictionary will be infeasible. But once an optimal primal dictionary is found, the corresponding dual dictionary will be feasible.
Since its objective coefficients are always nonpositive, this feasible dual dictionary is, in fact, optimal. Furthermore, at each iteration, the current primal objective function value coincides with the current dual objective function value. To keep notation uncluttered, we shall consider only four generic entries in a table of coefficients. A little thought (and perhaps some staring at the examples above) reveals the changes that a pivot produces, and these effects can be summarized on our generic table. By induction we then conclude that, if we start with the negative-transpose property, it will be preserved throughout the solution process.
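The negative-transpose relationship between the primal and dual coefficient tables can be checked mechanically after each pivot; a minimal sketch using list-of-lists tables (hypothetical helper name):

```python
def is_negative_transpose(P, D, tol=1e-9):
    # Check that D[j][i] == -P[i][j] for every entry: the dual table is
    # the negative of the primal table with rows and columns interchanged.
    return all(abs(D[j][i] + P[i][j]) <= tol
               for i in range(len(P)) for j in range(len(P[0])))
```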
Since the strong duality theorem is the most important theorem in this book, we present here a careful proof. Those readers who are satisfied with the above discussion may skip the proof. Suppose we apply the simplex method. We know that the simplex method produces an optimal solution whenever one exists, and we have assumed that one does indeed exist.
Hence, the final dictionary will be an optimal dictionary for the primal problem. To this end, we write the objective function in two ways and equate the coefficients; we can also equate the constant terms on the two sides. These last two sets of inequalities are precisely the conditions that guarantee dual feasibility, and this completes the proof. But what if the primal problem does not have an optimal solution?
For example, suppose that it is unbounded. The unboundedness of the primal together with the weak duality theorem tells us immediately that the dual problem must be infeasible. Similarly, if the dual problem is unbounded, then the primal problem must be infeasible.
But it turns out that there is a fourth possibility that sometimes occurs: it can happen that both the primal and the dual problems are infeasible. For example, one can write down a small problem for which it is easy to see that both the problem and its dual are infeasible.
Duality theory is often useful in that it provides a certificate of optimality. For example, suppose that you were asked to solve a really huge and difficult linear program. Now, how are you going to convince your boss that your solution is correct?
Do you really want to ask her to verify the correctness of your computer programs? The answer is probably not. And in fact it is not necessary. Certificates of optimality have also been known to dramatically reduce the amount of time certain underpaid professors have to devote to grading homework assignments!
Since the dual of the dual is the primal, applying the simplex method to the dual also solves both the primal and the dual problem. Sometimes it is easier to apply the simplex method to the dual, for example, if the dual has an obvious basic feasible solution but the primal does not.
We take up this topic in the next chapter.

Complementary Slackness. Sometimes it is necessary to recover an optimal dual solution when only an optimal primal solution is known.
The following theorem, known as the Complementary Slackness Theorem, can help in this regard. Then x and y are optimal for their respective problems if and only if, for every index, the product of each primal variable with the corresponding dual slack vanishes, and the product of each dual variable with the corresponding primal slack vanishes. We begin by revisiting the chain of inequalities used to prove the weak duality theorem. Of course, the statement that at least one of these two numbers vanishes can be succinctly expressed by saying that the product vanishes. An analogous analysis applies to the second inequality in the chain.
This then completes the proof. In fact, since the primal solution is assumed to be nondegenerate, it follows that the m basic variables will be strictly positive. The complementary slackness theorem then tells us that the corresponding dual variables must vanish.
We are then left with just n equations in n unknowns, which we would expect to have a unique solution. If there is a unique solution, all the components should be nonnegative.

The Dual Simplex Method. In this section, we study what happens if we apply the simplex method to the dual problem. As we saw in our discussion of the strong duality theorem, one can actually apply the simplex method to the dual problem without ever writing down the dual problem or its dictionaries.
Instead, the so-called dual simplex method is seen simply as a new way of picking the entering and leaving variables in a sequence of primal dictionaries. We begin with an example: As before, we have recorded the negative of the dual objective function, since we prefer to maximize the objective function appearing in a dictionary. More importantly, note that the dual dictionary is feasible, whereas the primal one is not. This suggests that it would be sensible to apply the simplex method to the dual.
Let us do so, but as we go we shall keep track of the analogous pivots applied to the primal dictionary. For example, the entering variable in the initial dual dictionary is y2, and the leaving variable then is z1. Of course, since w2 is basic and x1 is nonbasic, w2 must be the leaving variable and x1 the entering variable in the primal; that is, the roles are exactly reversed. Continuing to work on the dual, we now see that y3 is the entering variable and y2 leaves. Hence, for the primal we use w3 and w2 as the leaving and entering variable, respectively.
Now we notice that both dictionaries are optimal. Of course, in each of the above dictionaries, the table of numbers in each dual dictionary is the negative-transpose of the corresponding primal table. Therefore, we never need to write the dual dictionary; the dual simplex method can be entirely described in terms of the primal dictionaries.
Indeed, first we note that the dictionary must be dual feasible. This means that all the coefficients of the nonbasic variables in the primal objective function must be nonpositive.
Given this, we proceed as follows. First we select the leaving variable by picking that basic variable whose constant term in the dictionary is the most negative if there are none, then the current dictionary is optimal. Then we pick the entering variable by scanning across this row of the dictionary and comparing ratios of the coefficients in this row to the corresponding coefficients in the objective row, looking for the largest negated ratio just as we did in the primal simplex method.
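The selection rule just described can be sketched as follows. The sign conventions here are an assumption, since they depend on exactly how the dictionary is written; `basis_consts` holds the constant terms of the basic rows, `row_coeffs` each row's coefficients on the nonbasic variables, and `obj_coeffs` the objective row (all nonpositive when the dictionary is dual feasible).

```python
def dual_simplex_pivot(basis_consts, row_coeffs, obj_coeffs):
    # Leaving variable: the basic variable with the most negative constant.
    leave = min(range(len(basis_consts)), key=lambda i: basis_consts[i])
    if basis_consts[leave] >= 0:
        return None  # all constants nonnegative: current dictionary optimal
    row = row_coeffs[leave]
    # Entering variable: among columns with a negative row coefficient,
    # take the extreme ratio of objective to row coefficient (one
    # conventional choice; the exact rule depends on the dictionary layout).
    candidates = [j for j in range(len(row)) if row[j] < 0]
    if not candidates:
        return "infeasible"  # no entering variable: primal infeasible
    enter = max(candidates, key=lambda j: obj_coeffs[j] / row[j])
    return (leave, enter)
```

Iterating this selection plus the usual pivot step gives the dual simplex method on the primal dictionaries alone.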
Once the entering and leaving variable are identified, we pivot to the next dictionary and continue from there. The reader is encouraged to trace through these pivots.

A Dual-Based Phase I Algorithm

The dual simplex method described in the previous section provides us with a new Phase I algorithm, which, if nothing else, is at least more elegant than the one we gave in Chapter 2.
Let us illustrate it using an example. Clearly, neither the primal nor the dual dictionary is feasible. But by changing the primal objective function, we can easily produce a dual feasible dictionary: the corresponding initial dual dictionary is then feasible. In fact, it coincides with the dual dictionary we considered in the previous section, so we already know the optimal solution. This primal dictionary is optimal for the modified problem but not for the original problem.
However, it is feasible for the original problem, and we can now simply reinstate the intended objective function and continue with Phase II. The entering variable is x2. Looking for a leaving variable, we discover that this problem is unbounded. Of course, more typically one would expect to have to do several iterations of Phase II to find the optimal solution or show unboundedness. Here we just got lucky that the game ended so soon. It is interesting to note how we detect infeasibility with this new Phase I algorithm.
The modified problem is guaranteed always to be dual feasible. It is easy to see that the primal problem is infeasible if and only if the modified problem is dual unbounded, which the dual simplex method will detect just as the primal simplex method detects primal unboundedness.
The two-phase algorithm we have just presented can be thought of as a dual— primal algorithm, since we first apply the dual simplex method to a modified dual feasible problem and then finish off by applying the primal simplex method to the original problem, starting from the feasible dictionary produced by Phase I. One could consider turning this around and doing a primal—dual two-phase algorithm.
Here, the right-hand side of the primal problem would be modified to produce an obvious primal feasible solution. The primal simplex method would then be applied. The optimal solution to this primal problem will then be feasible for the original dual problem but not optimal for it.
But then the dual simplex method can be applied, starting with this dual feasible basis, until an optimal solution for the dual problem is obtained.

The Dual of a Problem in General Form

In Chapter 1, we saw that linear programming problems can be formulated in a variety of ways. In this section, we shall derive the form of the dual when the primal problem is not necessarily presented in standard form. First, let us consider the case where the linear constraints are equalities and the variables are nonnegative. Writing each equality constraint as a pair of opposing inequalities gives two sets of m inequality constraints, so we need two sets of m dual variables.
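Concretely, writing each equality as a pair of opposing inequalities with nonnegative multiplier vectors y⁺ and y⁻ (our labels), the derivation can be sketched as:

```latex
\begin{aligned}
\max\ c^{T}x \ \ \text{s.t. } Ax = b,\ x \ge 0
\quad&\Longrightarrow\quad
\max\ c^{T}x \ \ \text{s.t. } Ax \le b,\ -Ax \le -b,\ x \ge 0,\\
\text{dual:}\quad
\min\ b^{T}(y^{+}-y^{-}) \ \ &\text{s.t. } A^{T}(y^{+}-y^{-}) \ge c,\quad y^{+},\,y^{-} \ge 0 .
\end{aligned}
```

Setting y = y⁺ − y⁻, which is free in sign, the dual is simply: minimize bᵀy subject to Aᵀy ≥ c.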
Note what has changed from when we were considering problems in standard form: the dual variables, each being the difference of two nonnegative variables, are now free in sign. And that is the message: equality constraints in the primal give rise to free variables in the dual. Employing the symmetry between the primal and the dual, we can say more: free primal variables give rise to equality constraints in the dual. These rules are summarized in the table "Rules for forming the dual."

Resource Allocation Problems

Let us return to the production facility problem studied in Chapter 1. In the real world, the market is always essentially in equilibrium.
Nonetheless, it continually experiences small perturbations that ripple through it and move the equilibrium to new levels. These perturbations can be from several causes, an important one being innovation. One possible innovation is to improve the production process.
Now, suddenly there is a windfall profit for each unit of product j produced. To be concrete, let us assume that the time lag is about one month (depending on the industry, this lag time could be considered too short or too long). Suppose also that the production manager decides to produce xj units of product j and that all units produced are sold immediately at their market value.
Then the total revenue depends on the chosen production levels. (One could take the prices of raw materials as fixed and argue that the value of the final products will fall; the point is simply that the difference between the price of the raw materials and the price of the final products must narrow due to this innovation.) Before studying these optimizations, let us first rewrite the windfall in a more convenient form.
As in Chapter 1, let yi denote the increase in the price of raw material i. Now let us return to the competing optimizations.
Looking at the first of these optimizations, we see that it is just our usual primal linear programming problem. This is the problem that the production manager needs to solve in anticipation of adversarial suppliers. Rearranging the terms of the windfall, the second optimization becomes the dual linear programming problem. This means that an equilibrium can be reestablished by setting the production levels and the price hikes according to the optimal solutions to these two linear programming problems.
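With the usual notation (xj production levels, yi price increases, and data aij, bi, cj), the windfall and its rearrangement presumably read as follows; this is a sketch based on the standard resource-allocation setup, not a verbatim reconstruction of the dropped display:

```latex
\sum_{j} c_j x_j \;-\; \sum_{i} y_i\Big(\sum_{j} a_{ij}x_j - b_i\Big)
\;=\;
\sum_{j}\Big(c_j - \sum_{i} y_i a_{ij}\Big)x_j \;+\; \sum_{i} b_i y_i .
```

Maximizing over x ≥ 0 in anticipation of an adversarial choice of y ≥ 0 yields the production manager's primal problem, while the suppliers' min-max problem yields the dual.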
Lagrangian Duality The analysis of the preceding section is an example of a general technique that forms the foundation of a subject called Lagrangian duality, which we shall briefly describe.
Let us start by summarizing what we did. It was quite simple. In the general case, one needs to consider each step carefully. The max-min problem is called the primal problem, and the min-max problem is called the dual problem. However, it may or may not be true that these two problems have the same optimal objective values. In fact, the subject is interesting because one can indeed state specific, verifiable conditions under which the two problems do agree. Also, one would like to be able to solve the inner optimizations explicitly so that the primal problem can be stated as a pure maximization problem and the dual can be stated as a pure minimization problem.
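One relation that always holds between the two problems is the weak duality inequality; for any function L(x, y),

```latex
\max_{x}\,\min_{y}\,L(x,y) \;\le\; \min_{y}\,\max_{x}\,L(x,y),
```

since for any fixed x̄ and ȳ we have min over y of L(x̄, y) ≤ L(x̄, ȳ) ≤ max over x of L(x, ȳ). Whether equality holds (no duality gap) requires extra conditions, such as those that hold in linear programming.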
This, too, is often doable. There are various ways in which one can extend the notions of duality beyond the context of linear programming. The one just described is referred to as Lagrangian duality. It is perhaps the most important such extension.

Exercises

In solving the following problems, the advanced pivot tool can be used to check your arithmetic. Suppose that, in solving this problem, you have arrived at the following dictionary. Which variables are basic? Which are nonbasic? Is it feasible?
Is it degenerate? Which will leave? Will the pivot be degenerate? Then replace the minimization in the dual with a maximization to get a new linear programming problem, which we can write in standard form as follows: But the first and the last entry in this chain of inequalities are equal. Therefore, all these inequalities would seem to be equalities. What is the error in this logic? Can you state a correct nontrivial theorem that follows from this line of reasoning?
Can you give an example where the four inequalities are indeed all equalities? Assume also that the problem is feasible. An MIT graduate student was trying to make ends meet on a very small stipend. Next, he made a list of his favorite foods, which (except for pizza, and due mostly to laziness and ineptitude in the kitchen) consisted almost entirely of frozen prepared meals.
He then went to the local grocery store and made a list of the unit price for each of his favorite foods. In addition to prices, he also looked at the labels and collected information about how much of the critical nutrients are contained in one serving of each food.
Let us denote by aij the amount of nutrient i contained in food j. Fortunately, he was able to call his favorite pizza delivery service and get similar information from them. In terms of this notation, formulate the student's problem as a linear program. Can you introduce another person into the above story whose problem would naturally be to solve the dual? Using elementary calculus, carry out the optimizations; be sure to check the signs of the second derivatives for both the inner and the outer optimizations.
Using elementary calculus, show that Lh is strongly convex. Show that the Legendre transform of the Legendre transform of a function is the function itself. This can be proved from scratch but it is easier to use the result of part 2 above.
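As a concrete check of this involution property (a worked example, not part of the original exercise statements), take the strongly convex function f(x) = (a/2)x² with a > 0:

```latex
f^{*}(y) \;=\; \sup_{x}\Big(xy - \tfrac{a}{2}x^{2}\Big) \;=\; \frac{y^{2}}{2a},
\qquad
f^{**}(x) \;=\; \sup_{y}\Big(xy - \frac{y^{2}}{2a}\Big) \;=\; \frac{a}{2}x^{2} \;=\; f(x).
```

In each supremum the maximizer is found by setting the derivative to zero (x = y/a and y = ax, respectively), and the second derivative is negative, confirming a maximum.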
Notes

The idea behind the strong duality theorem can be traced back to conversations between G.B. Dantzig and J. von Neumann. The term primal problem was coined by G. Dantzig's father, T. Dantzig. The dual simplex method was first proposed by Lemke.

In this chapter, we shall recast everything into matrix notation. At the same time, we will emphasize the close relations between the primal and the dual problems.

Matrix Notation

As usual, we begin our discussion with the standard-form linear programming problem, and as before we introduce slack variables. We denote by B the set of indices corresponding to the basic variables, and we denote by N the remaining nonbasic indices.
In component notation, the ith component of Ax can be broken up into a basic part and a nonbasic part: the sum of aij xj over the basic indices j ∈ B plus the corresponding sum over the nonbasic indices j ∈ N. We then write A in partitioned-matrix form as [ B N ]. Strictly speaking, this is not the matrix A itself; instead, it is the A matrix with its columns rearranged in such a manner that all the columns associated with basic variables are listed first, followed by the nonbasic columns.
Nonetheless, with this convention, the following separation of Ax into a sum of two terms is true and captures the same separation into basic and nonbasic parts as before: Ax = B xB + N xN.

The Primal Simplex Method

A dictionary has the property that the basic variables are written as functions of the nonbasic variables. The fact that the basic variables xB can be written as a function of the nonbasic variables xN is equivalent to the fact that the matrix B is invertible.
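The displays referred to above presumably take the familiar form (a reconstruction consistent with the surrounding discussion; B and N denote both the index sets and the corresponding submatrices):

```latex
Ax \;=\; Bx_{B} + Nx_{N},
\qquad
x_{B} \;=\; B^{-1}b - B^{-1}N\,x_{N},
\qquad
\zeta \;=\; c_{B}^{T}B^{-1}b - \big((B^{-1}N)^{T}c_{B} - c_{N}\big)^{T}x_{N}.
```

The middle equation is the matrix form of the dictionary, and the coefficient vector multiplying xN in the objective is the vector of dual slacks zN.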
The fact that B is invertible means that its m column vectors are linearly independent and therefore form a basis for Rm — this is why the basic variables are called basic, in case you were wondering.
Similarly, the objective function can be written in terms of the nonbasic variables; comparing against the component-form notation of Chapter 2, we see that the two presentations agree. The basic solution associated with this dictionary is obtained by setting the nonbasic variables to zero. However, to have the negative-transpose property, it is important to correctly associate complementary pairs of variables. So first we recall that, for the current discussion, we have appended the primal slack variables to the end of the original variables. Also recall that the dual slack variables are complementary to the original primal variables and that the original dual variables are complementary to the primal slack variables.
Therefore, to maintain the desired complementarity condition between like indices in the primal and the dual, we need to relabel the dual variables and append them to the end of the dual slacks. With this relabeling of the dual variables, we can write down the dual dictionary corresponding to the primal one.
The associated dual dictionary then has a very symmetric appearance. The primal simplex method can be described briefly as follows. Two bases are said to be adjacent to each other if they differ in only one index. That is, given a basis B, an adjacent basis is determined by removing one basic index and replacing it with a nonbasic index. The index that gets removed corresponds to the leaving variable, whereas the index that gets added corresponds to the entering variable.
One step of the simplex method is called an iteration. We now elaborate further on the details by describing one iteration as a sequence of specific steps.

Step 1. Check for Optimality. If zN ≥ 0, the current solution is optimal. To see this, first note that the simplex method always maintains primal feasibility and complementarity. Hence, all that is required for optimality is dual feasibility.
Step 2. Select Entering Variable. Pick an index j ∈ N for which zj < 0.

Step 3. Compute Primal Step Direction.

Step 4. Compute Primal Step Length. If the maximum used in the ratio computation is less than or equal to zero, we can stop here—the primal is unbounded.

Step 5. Select Leaving Variable.

Step 6. Compute Dual Step Direction. To see how, it is convenient to look at the dual dictionary.

Step 7. Compute Dual Step Length.

Step 8. Update Current Primal and Dual Solutions. We now have everything we need to update the data in the dictionary.

Step 9. Update Basis.

We close this section with the important remark that the simplex method as presented here, while it may look different from the component-form presentation given in Chapter 2, is in fact mathematically identical to it.
That is, given the same starting dictionary, the two algorithms generate exactly the same sequence of dictionaries.

An Example

In case the reader is feeling at this point that there are too many letters and not enough numbers, here is an example that illustrates the matrix approach to the simplex method. Corresponding to the sets B and N, we have the submatrices of A.

First Iteration.
Since zN has some negative components, the current solution is not optimal, and we carry out Steps 2 through 9 to pivot to an adjacent basis. The second, third, and fourth iterations proceed in the same way (we omit the arithmetic), and after the fourth iteration zN has all nonnegative components, so the current solution is optimal. It is undoubtedly clear at this point that the matrix approach, as we have presented it, is quite a bit more tedious than the dictionary manipulations with which we are quite familiar. The reason is that, with the dictionary approach, dictionary entries get updated from one iteration to the next and the updating process is fairly easy, whereas with the matrix approach, we continually compute everything from scratch and therefore end up solving many systems of equations.
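To make the from-scratch flavor of the matrix approach concrete, here is a minimal sketch (the names and tolerances are ours, not the book's); it solves max c·x subject to Ax ≤ b, x ≥ 0, assuming b ≥ 0 so that the all-slack basis is primal feasible and no Phase I is needed:

```python
import numpy as np

def revised_simplex(A, b, c, max_iter=100):
    """From-scratch matrix simplex: every iteration re-solves the systems
    involving the current basis matrix B instead of updating a dictionary."""
    m, n = A.shape
    A_full = np.hstack([A, np.eye(m)])          # append slack columns
    c_full = np.concatenate([c, np.zeros(m)])
    basis = list(range(n, n + m))               # start from the slack basis
    nonbasis = list(range(n))
    for _ in range(max_iter):
        B = A_full[:, basis]
        x_B = np.linalg.solve(B, b)                         # basic solution
        y = np.linalg.solve(B.T, c_full[basis])             # dual solution
        z_N = A_full[:, nonbasis].T @ y - c_full[nonbasis]  # dual slacks
        if np.all(z_N >= -1e-9):                 # Step 1: optimality check
            x = np.zeros(n + m)
            x[basis] = x_B
            return float(c_full @ x), x[:n]
        j = int(np.argmin(z_N))                  # Step 2: entering variable
        dx_B = np.linalg.solve(B, A_full[:, nonbasis[j]])   # Step 3: direction
        if np.all(dx_B <= 1e-9):
            raise ValueError("problem is unbounded")
        ratios = np.full(m, np.inf)              # Steps 4-5: ratio test
        mask = dx_B > 1e-9
        ratios[mask] = x_B[mask] / dx_B[mask]
        i = int(np.argmin(ratios))               # leaving variable's row
        basis[i], nonbasis[j] = nonbasis[j], basis[i]       # Step 9: new basis
    raise RuntimeError("iteration limit reached")
```

Note that each pass solves three linear systems with the basis matrix B, which is exactly the repeated work the dictionary approach avoids.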
However, before we take up such practical considerations, let us finish our general discussion of the simplex method by casting the dual simplex method into matrix notation and discussing some related issues. The Dual Simplex Method In the presentation of the primal simplex method given in the previous section, we tried to make the symmetry between the primal and the dual problems as evident as possible. One advantage of this approach is that we can now easily write down the dual simplex method.
Note that for the dual simplex method, dual feasibility and complementarity are maintained from the beginning, and the algorithm terminates once a primal feasible solution is discovered. The individual steps mirror those of the primal method with the roles of the primal and dual quantities interchanged (for instance, one now computes a primal step length from the dual data), and the iteration again ends with Step 9, the basis update. To further emphasize the similarities between the primal and the dual simplex methods, the two algorithms can be displayed side by side, as in the figure "The primal and the dual simplex methods."

Two-Phase Methods

Let us summarize the algorithm obtained by applying the dual simplex method as a Phase I procedure followed by the primal simplex method as a Phase II.
Hence, the initial dictionary reads: If b has all nonnegative components and cN has all nonpositive components, then this dictionary is optimal—the problem was trivial. Suppose, however, that one of these two vectors but not both has components of the wrong sign.
For example, suppose that b is okay (all nonnegative components) but cN has some positive components. Then this dictionary is primal feasible, and we can start immediately with the primal simplex method.
On the other hand, suppose that cN has all nonpositive components but b has some negative ones. Then the starting dictionary is dual feasible, and we can commence immediately with the dual simplex algorithm. The last, and most common, case is where both b and cN have components of the wrong sign. In this case, we must employ a two-phase procedure. There are two choices. We could temporarily replace cN with another vector that is nonpositive.
Then the modified problem is dual feasible, and so we can apply the dual simplex method to find an optimal solution of this modified problem. After that, the original objective function could be reinstated. With the original objective function, the optimal solution from Phase I is most likely not optimal, but it is feasible, and therefore the primal simplex method can be used to find the optimal solution to the original problem.
The other choice would be to modify b instead of cN, thereby obtaining a primal feasible solution to a modified problem. Then we would use the primal simplex method on the modified problem to obtain its optimal solution, which will then be dual feasible for the original problem, and so the dual simplex method can be used to finish the problem.

Negative Transpose Property

In our discussion of duality in Chapter 5, we emphasized the symmetry between the primal problem and its dual.
This symmetry can be easily summarized by saying that the dual of a standard-form linear programming problem is the negative transpose of the primal problem.
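Concretely, for the standard-form pair (using the book's usual data A, b, c):

```latex
\begin{aligned}
\text{primal:}\quad & \max\ c^{T}x \quad \text{s.t. } Ax \le b,\ x \ge 0,\\
\text{dual:}\quad & \min\ b^{T}y \quad \text{s.t. } A^{T}y \ge c,\ y \ge 0,\\
\text{dual rewritten:}\quad & -\max\ (-b)^{T}y \quad \text{s.t. } (-A^{T})\,y \le -c,\ y \ge 0 .
\end{aligned}
```

The data of the rewritten dual, (−Aᵀ, −c, −b), are obtained from the primal data (A, b, c) by negating and transposing, which is the negative transpose property.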
Now, in this chapter, the symmetry appears to have been lost, which seems strange at first: the primal and the dual dictionaries are expressed in terms of two different basis matrices. How are these two basis matrices related? The purpose of this section is to make this connection clear. For the primal problem, the first n columns of the constraint matrix are associated with the initial nonbasic variables and the last m columns with the initial basic variables. For the dual problem, the first n columns are associated with the initial basic variables and the last m columns with the initial nonbasic variables.
Which variables are basic? Is the primal solution associated with this dictionary feasible? Is it optimal? Consider the situation in which x3 and x5 are basic and all other variables are nonbasic, and write down the associated matrices and basic solutions. Consider the following max-min problem and write its inner optimization explicitly.

Notes

In this chapter, we have accomplished two tasks: we have expressed the simplex method in matrix notation, and we have emphasized the symmetry between the primal and the dual problems. There are times when one thing has two names.
So far in this book, we have discussed essentially only one algorithm: the simplex method. But this one algorithm is sometimes referred to as the simplex method and at other times as the revised simplex method. The distinction being made with this new name has nothing to do with the algorithm; rather, it refers to the specifics of an implementation. This chapter takes up two subjects that build on an optimal solution once it has been found. The first, called sensitivity analysis (or postoptimality analysis), addresses the following question: having solved a problem to optimality, how does the solution change if some of the data change? The second subject addresses situations in which one wishes to solve not just one linear program, but a whole family of problems parametrized by a single real variable.
We shall study parametric analysis in a very specific context in which we wish to find the optimal solution to a given linear programming problem by starting from a problem whose solution is trivially known and then deforming this problem back to the original problem, maintaining as we go optimality of the current solution. The result of this deformation approach to solving a linear programming problem is a new variant of the simplex method, which is called the parametric self-dual simplex method.
We will see in later chapters that this variant of the simplex method resembles, in certain respects, the interior-point methods that we shall study.

Sensitivity Analysis

One often needs to solve not just one linear programming problem but several closely related problems. There are many reasons that this need might arise. For example, the data that define the problem may have been rather uncertain and one may wish to consider various possible data scenarios.
Or perhaps the data are known accurately but change from day to day, and the problem must be resolved for each new day. Whatever the reason, this situation is quite common. So one is led to ask whether it is possible to exploit the knowledge of a previously obtained optimal solution to obtain more quickly the optimal solution to the problem at hand.
Of course, the answer is often yes, and this is the subject of this section. We shall treat a number of possible situations. All of them assume that a problem has been solved to optimality. This means that we have at our disposal the final, optimal dictionary. It is natural to ask how the dictionary at hand could be adjusted to become a valid dictionary for the new problem.
Suppose first that only the objective function changes. The basic solution is unaffected, so after recomputing zN the dictionary remains primal feasible, and there is no need for a Phase I procedure: we can apply the primal simplex method directly. Suppose instead that only the right-hand side changes. Then zN is unaffected, the new dictionary will be dual feasible, and so we can apply the dual simplex method to arrive at the new optimal solution fairly directly. Therefore, changing just the objective function or just the right-hand side results in a new dictionary having nice feasibility properties. But what if everything changes, even the constraint matrix too?
In this case, everything changes. Nonetheless, as long as the new basis matrix B is nonsingular, we can make a new dictionary that preserves the old classification into basic and nonbasic variables.
The new dictionary will most likely be neither primal feasible nor dual feasible, but if the changes in the data are fairly small in magnitude, one would still expect that this starting dictionary will get us to an optimal solution in fewer iterations than simply starting from scratch. While there is no guarantee that any of these so-called warm-starts will end up in fewer iterations to optimality, extensive empirical evidence indicates that this procedure often makes a substantial improvement. Often one does not wish to solve a modification of the original problem, but instead just wants to ask a hypothetical question: by how much could a single piece of data change without destroying optimality of the current basis? Let us illustrate these calculations with an example.
Consider the following linear programming problem. Suppose we want to know how much the coefficient of x1 in the objective function (currently 5) can change without altering the optimality of this basis.
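This kind of range computation can be carried out numerically. The helper below is hypothetical (not the book's code or notation); it assumes a problem of the form max c·x subject to Ax ≤ b, x ≥ 0 whose optimal basis is known, and returns the interval of perturbations t of one objective coefficient over which the dual slacks zN stay nonnegative:

```python
import numpy as np

def objective_coeff_range(A, c, basis, j):
    """Range of t for which c[j] -> c[j] + t keeps `basis` optimal, for
    max c.x s.t. A x <= b, x >= 0, with slack variables appended after
    the original variables.  A hypothetical helper, names are ours."""
    m, n = A.shape
    A_full = np.hstack([A, np.eye(m)])
    c_full = np.concatenate([c, np.zeros(m)])
    nonbasis = [k for k in range(n + m) if k not in basis]
    B_inv_N = np.linalg.solve(A_full[:, basis], A_full[:, nonbasis])
    z_N = B_inv_N.T @ c_full[basis] - c_full[nonbasis]  # current dual slacks
    lo, hi = -np.inf, np.inf
    if j in basis:
        # Basic coefficient: z_N(t) = z_N + t * (row of B^{-1}N for x_j) >= 0.
        row = B_inv_N[basis.index(j), :]
        for zk, rk in zip(z_N, row):
            if rk > 1e-12:
                lo = max(lo, -zk / rk)
            elif rk < -1e-12:
                hi = min(hi, -zk / rk)
    else:
        # Nonbasic coefficient: only z_j changes, z_j(t) = z_j - t >= 0.
        hi = z_N[nonbasis.index(j)]
    return lo, hi
```

For instance, for the made-up problem max 3x1 + 2x2 subject to x1 + x2 ≤ 4 and x1 + 3x2 ≤ 6, the optimal basis consists of x1 and the second slack, and the coefficient 3 on x1 may decrease by at most 1 (and increase without bound) without losing optimality.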
We partition c according to the final optimal basis.

Parametric Analysis and the Homotopy Method

In this section, we illustrate the notion of parametric analysis by applying a technique called the homotopy method to get a new algorithm for solving linear programming problems.
The homotopy method is a general technique in which one creates a continuous deformation that changes a given difficult problem into a related but trivially solved problem and then attempts to work backwards from the trivial problem to the difficult problem by solving hopefully without too much effort all the problems in between.
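In the parametric self-dual variant that follows, one standard choice of deformation perturbs both the right-hand side and the objective by a single parameter μ:

```latex
b_i \;\longrightarrow\; b_i + \mu,
\qquad
c_j \;\longrightarrow\; c_j - \mu .
```

For μ large enough, every perturbed bi is nonnegative and every perturbed cj is nonpositive, so the initial dictionary is simultaneously primal and dual feasible (hence optimal). One then decreases μ toward zero, pivoting each time a feasibility condition is about to fail; at μ = 0 the original problem is recovered.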
We start with an example. Suppose we wish to solve the following linear programming problem, whose initial dictionary is neither primal nor dual feasible. The remaining violation suggests that we perform a dual pivot with x4 serving as the leaving variable. The algorithm we have just illustrated is called the parametric self-dual simplex method. It has some attractive features.