Optimization Methods
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Linear Programming
• A linear programming (LP) problem may be defined as the problem of maximizing or minimizing a linear function subject to linear constraints; the constraints may be equalities or inequalities
• Many optimization problems, especially in economics, can be formulated as linear programs
• The most popular solution method of linear programming is the simplex method
• The simplex method was created by George Dantzig; related work on optimal resource allocation by Kantorovich and Koopmans received the 1975 Nobel prize in economics
Formulations (1)
x = [x_1, x_2, …, x_n]^T   column vector of variables (n elements)
b = [b_1, b_2, …, b_m]^T   column vector of constants (m elements)
c = [c_1, c_2, …, c_n]     row vector of constants (n elements)
A = [ a_11 … a_1n
      a_21 … a_2n
      …
      a_m1 … a_mn ]        matrix of constants (m x n elements)
Formulations (2)
min cx
subject to
Ax = b
x ≥ 0
Formulations (3)
indices
j = 1,2,…,n   variables
i = 1,2,…,m   constraints
constants
a_ij   coefficient of variable j in constraint i
b_i    right-hand side of constraint i
c_j    cost of variable j
Formulations (4)
variables
x_j   variable j
objective
min z = ∑_j c_j x_j
constraints
∑_j a_ij x_j = b_i   for i = 1,2,…,m
x_j ≥ 0   for j = 1,2,…,n
Formulations (5)
Inequality constraints of type
∑_j a_ij x_j ≤ b_i   for i = 1,2,…,m
can be written as equality constraints using non-negative slack variables in the following way
∑_j a_ij x_j + z_i = b_i,   z_i ≥ 0   for i = 1,2,…,m
Inequality constraints of type
∑_j a_ij x_j ≥ b_i   for i = 1,2,…,m
can be written as equality constraints using non-negative slack variables in the following way
∑_j a_ij x_j – z_i = b_i,   z_i ≥ 0   for i = 1,2,…,m
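As an illustration, a minimal NumPy sketch of the first transformation (the matrix A and vector b are assumed example data):

import numpy as np

# Convert  A x <= b  into  [A | I] [x; z] = b  with slack variables z >= 0
A = np.array([[1.0, 1.0], [-2.0, 1.0]])    # assumed example data
b = np.array([6.0, 3.0])
A_eq = np.hstack([A, np.eye(len(b))])      # columns of x followed by slacks z
print(A_eq)
# For x = (6, 0) the slacks are z = b - A @ x = (0, 15), both nonnegative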
Linear Programming Properties
Theorem. The set of feasible solutions of a linear program is a convex set
Theorem. Optimal points of the linear programming problem are located at extreme points (vertices) of the polyhedron of feasible solutions. If the optimum is reached at more than one vertex, the same value is obtained at any convex combination of these vertices
Example (1)
max 2x_1 + x_2
s.t.
x_1 + x_2 ≤ 6
-2x_1 + x_2 ≤ 3
x_1 ≥ 0
x_2 ≥ 0
solution
x_1 = 6, x_2 = 0
Example (2)
max x_1 + 2x_2
s.t.
x_1 + x_2 ≤ 6
-2x_1 + x_2 ≤ 3
x_1 ≥ 0
x_2 ≥ 0
solution
x_1 = 1, x_2 = 5
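Both examples can be verified with an off-the-shelf LP solver. A minimal sketch using scipy.optimize.linprog (linprog minimizes, so the maximized objectives are negated):

from scipy.optimize import linprog

# Shared feasible region: x1 + x2 <= 6, -2*x1 + x2 <= 3, x1, x2 >= 0
A_ub = [[1, 1], [-2, 1]]
b_ub = [6, 3]
bounds = [(0, None), (0, None)]

# Example (1): max 2*x1 + x2  ->  min -(2*x1 + x2)
res1 = linprog([-2, -1], A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res1.x)   # [6. 0.]

# Example (2): max x1 + 2*x2  ->  min -(x1 + 2*x2)
res2 = linprog([-1, -2], A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res2.x)   # [1. 5.]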
Simplex Algorithm
• The simplex method was developed in 1947 by George Dantzig
• The main idea is a search along vertices and edges of the polytope of feasible solutions
• The algorithm moves to vertices with a better objective function value
• If a local optimum is reached, by convexity it is also the global optimum and the algorithm terminates
• The algorithm can also terminate when an unbounded edge is visited, concluding that the problem is unbounded (has no finite optimum)
• With an anti-cycling pivot rule the algorithm always terminates, since the number of vertices of the polytope is finite
Duality of Linear Programming (1)
• Many optimization algorithms apply the duality of linear programming
• For an LP problem, called the primal problem, a dual problem can be formulated
• Correspondences between properties of the primal problem and the dual problem are used in the construction of optimization methods in order to reduce the solution time
Duality of Linear Programming (2)
Primal problem
variables
x_j   variable j
objective
min ∑_j c_j x_j
constraints
∑_j a_ij x_j ≥ b_i   for i = 1,2,…,m
x_j ≥ 0   for j = 1,2,…,n
Dual problem
variables
π_i   dual variable i
objective
max ∑_i b_i π_i
constraints
∑_i a_ij π_i ≤ c_j   for j = 1,2,…,n
π_i ≥ 0   for i = 1,2,…,m
Duality of Linear Programming (3)
Primal problem
min cx
s.t.
Ax ≥ b
x ≥ 0
Dual problem
max πb
s.t.
πA ≤ c
π ≥ 0
Duality of Linear Programming (4)
Theorem. The dual problem of the dual problem is the primal problem
Theorem. If x is a feasible solution of the primal problem and π is a feasible solution of the dual problem, then the value of the primal problem objective function cannot be lower than the value of the dual problem objective function
cx ≥ πb
Duality of Linear Programming (5)
Theorem (strong duality). If a pair of feasible solutions (x⁰, π⁰) satisfies the condition cx⁰ = π⁰b, then (x⁰, π⁰) is a pair of optimal solutions
Theorem. Either both problems (primal and dual) are solvable simultaneously, or one of the two is degenerate in the sense that it does not admit feasible vectors
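The duality theorems can be observed numerically. A small sketch solving a primal-dual pair with scipy.optimize.linprog (the data A, b, c are assumed example values):

import numpy as np
from scipy.optimize import linprog

# Primal: min c x  s.t.  A x >= b, x >= 0
A = np.array([[1.0, 1.0], [-2.0, 1.0]])
b = np.array([2.0, 1.0])
c = np.array([3.0, 2.0])

# linprog takes <= rows, so A x >= b is passed as -A x <= -b
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)

# Dual: max pi b  s.t.  pi A <= c, pi >= 0  ->  min -b pi
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)

print(primal.fun, -dual.fun)   # both 4.0: cx = pi b at the optimum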
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Integer Problems
• Problems with some continuous variables and some integer variables are called Mixed Integer Programming (MIP) problems
• Problems with all variables integer are called Integer Programming (IP) problems
• IP and MIP problems are mostly NP-hard
• A special class of integer problems are problems with binary variables
• The basic method to solve integer problems is the branch-and-bound approach
MIP Formulation (1)
indices
j = 1,2,…,n   variables
i = 1,2,…,m   constraints
constants
a_ij   coefficient of variable j in constraint i
b_i    right-hand side of constraint i
c_j    cost of variable j
MIP Formulation (2)
variables
x_j   variable j
objective
min z = ∑_j c_j x_j
constraints
∑_j a_ij x_j = b_i   for i = 1,2,…,m
x_j ∈ {0,1}   for j = 1,2,…,k
x_j ≥ 0   for j = k+1, k+2,…,n
IP Formulation
variables
x_j   variable j
objective
min z = ∑_j c_j x_j
constraints
∑_j a_ij x_j = b_i   for i = 1,2,…,m
x_j ∈ {0,1,…,M}   for j = 1,2,…,n
Binary MIP Relaxation
Let N_0 and N_1 denote the sets of variables with values fixed to 0 and 1, respectively. Let N_U be the set of variables whose values are not fixed
A MIP problem can be relaxed to an LP problem as follows
x_j continuous, 0 ≤ x_j ≤ 1   for j ∈ N_U
x_j = 0   for j ∈ N_0
x_j = 1   for j ∈ N_1
The solution of the above problem (if it exists) is a lower bound on the original problem
Example (1)
min z = -6x_1 – 5x_2
s.t. 3x_1 + x_2 ≤ 11
-x_1 + 2x_2 ≤ 5
x_1, x_2 ≥ 0
LP solution (not integer)
x_1 = 2 3/7, x_2 = 3 5/7, z = -33 1/7
Example (2)
min z = -6x_1 – 5x_2
s.t. 3x_1 + x_2 ≤ 11
-x_1 + 2x_2 ≤ 5
x_1, x_2 ≥ 0, integer
solution
x_1 = 3, x_2 = 2
z = -28
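The integer optimum can be reproduced with scipy.optimize.milp (available in SciPy 1.9+), which runs a branch-and-bound search internally; a minimal sketch:

import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

c = np.array([-6.0, -5.0])                        # min -6*x1 - 5*x2
cons = LinearConstraint([[3, 1], [-1, 2]], ub=[11, 5])
res = milp(c, constraints=cons, integrality=np.ones(2), bounds=Bounds(lb=0))
print(res.x, res.fun)                             # [3. 2.] -28.0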
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Branch and Bound (1)
• Branch and bound is a general algorithm that can be applied to integer and mixed integer problems
• The main idea of the approach is a systematic search over the solution space and elimination of subsets of the solution space that cannot provide a better solution than the one already found
• The main elements of the algorithm are bounding and branching
Branch and Bound (2)
• Bounding verifies whether the current subproblem can yield an objective value better than the current best value
• To find the lower (upper) bound, usually a relaxed version of the problem is solved
• Branching selects a variable that is used to split the problem into two smaller subproblems
• The efficiency of the method depends strongly on the branching and bounding functions
• The subsequent steps of the algorithm can be presented in the form of a solution tree
Branch and Bound (3)
procedure BBB(N_U, N_0, N_1)
begin
  solution(N_U, N_0, N_1, x, z); {solve the relaxed linear problem}
  if N_U = ∅ or x_i is binary for all i ∈ N_U then
    if z < z_best then begin z_best := z; x_best := x end
  else {in the current solution x_i is not binary for some i ∈ N_U}
    if z ≥ z_best then
      return {bounding}
    else
      begin {branching}
        choose i ∈ N_U such that x_i is fractional;
        BBB(N_U \ {i}, N_0 ∪ {i}, N_1);
        BBB(N_U \ {i}, N_0, N_1 ∪ {i})
      end
end {procedure}
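A runnable Python sketch of the BBB procedure, using scipy.optimize.linprog for the relaxed subproblems (the function name bb_binary and the 1e-6 integrality tolerance are our own choices):

import math
import numpy as np
from scipy.optimize import linprog

def bb_binary(c, A_ub, b_ub):
    """Branch and bound for  min c x  s.t.  A_ub x <= b_ub,  x in {0,1}^n."""
    n = len(c)
    best = {"z": math.inf, "x": None}

    def node(fixed):                      # fixed maps variable index -> 0 or 1
        # LP relaxation: fixed variables pinned, free ones relaxed to [0, 1]
        bounds = [(fixed.get(j, 0), fixed.get(j, 1)) for j in range(n)]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        if not res.success:               # infeasible subproblem
            return
        if res.fun >= best["z"]:          # bounding: relaxation cannot improve
            return
        frac = [j for j in range(n) if 1e-6 < res.x[j] < 1 - 1e-6]
        if not frac:                      # all binary: new incumbent
            best["z"], best["x"] = res.fun, np.round(res.x)
            return
        i = frac[0]                       # branching on a fractional variable
        node({**fixed, i: 0})             # BBB(NU \ {i}, N0 + {i}, N1)
        node({**fixed, i: 1})             # BBB(NU \ {i}, N0, N1 + {i})

    node({})
    return best["x"], best["z"]

# toy usage: min -x1 - x2 - 2*x3  s.t.  x1 + x2 + x3 <= 2, x binary  ->  z = -3
print(bb_binary(np.array([-1.0, -1.0, -2.0]), [[1, 1, 1]], [2]))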
Branch and Cut
• Branch and cut is an improved branch-and-bound method
• The effectiveness of the branch-and-cut method depends strongly on the bounds
• Additional constraints called cuts or valid inequalities are introduced to the problem
• Cut inequalities enable calculation of tighter, more effective bounds
• Global cuts are valid for all feasible integer solutions and are added to the problem formulation
• Local cuts are valid only for the currently considered branch-and-bound subtree
Example (1)
min z = -6x_1 – 5x_2
s.t. 3x_1 + x_2 ≤ 11
-x_1 + 2x_2 ≤ 5
x_1, x_2 ≥ 0, integer
LP solution (not integer)
x_1 = 2 3/7, x_2 = 3 5/7, z = -33 1/7
Example (2)
The problem is split on variable x_1, i.e., two subproblems are created with the following constraints
x_1 ≥ 3
x_1 ≤ 2
Example (3)
min z = -6x_1 – 5x_2
s.t.
3x_1 + x_2 ≤ 11
-x_1 + 2x_2 ≤ 5
x_1 ≥ 3
x_1, x_2 ≥ 0, integer
LP solution (integer)
x_1 = 3, x_2 = 2, z = -28
Completed subproblem
Example (4)
min z = -6x_1 – 5x_2
s.t. 3x_1 + x_2 ≤ 11
-x_1 + 2x_2 ≤ 5
x_1 ≤ 2
x_1, x_2 ≥ 0, integer
LP solution (not integer)
x_1 = 2, x_2 = 3.5, z = -29.5
Since the solution is not integer, the algorithm continues
Example (5)
New cut is introduced
min z = -6x_1 – 5x_2
s.t.
3x_1 + x_2 ≤ 11
-x_1 + 2x_2 ≤ 5
x_1 ≤ 2
2x_1 + x_2 ≤ 7 (cut)
x_1, x_2 ≥ 0, integer
LP solution (not integer)
x_1 = 1.8, x_2 = 3.4, z = -27.8
The objective is larger than the best solution. The algorithm stops
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Nonlinear Programming
• A nonlinear programming (NLP) problem may be defined as the problem of maximizing or minimizing a function subject to constraints, where at least the objective or one of the constraints is nonlinear
• If the objective function and constraints have special properties (e.g., they are convex), optimality criteria for the problem can be formulated
• Linearization of the nonlinear functions can be applied as an approximate solution method
Formulations (1)
NLP with inequality constraints
min f(x), x ∈ R^n
s.t.
g_i(x) ≤ 0   for i ∈ I
NLP with equality constraints
min f(x), x ∈ R^n
s.t.
h_j(x) = 0   for j ∈ E
Formulations (2)
NLP with inequality and equality constraints
min f(x), x ∈ R^n
s.t.
g_i(x) ≤ 0   for i ∈ I
h_j(x) = 0   for j ∈ E
NLP Optimality (1)
Definition. All points that satisfy the constraints of the NLP are called feasible points
Definition. A direction d from a point x satisfying all constraints is feasible if there exists ε > 0 such that for any 0 < α ≤ ε the point x + αd is feasible (satisfies all constraints of the NLP)
Definition. At a feasible point x the constraints of the NLP can be
• active, included in the set A(x) = {i ∈ I : g_i(x) = 0}
• inactive, for which g_i(x) < 0
NLP Optimality (2)
Let the functions f : X = R^n → R^1 and g_i : X = R^n → R^1 be differentiable and let x* be a local minimum of the NLP
Let A(x*) be the set of constraints active at x*
A feasible direction d at x* must satisfy
g_i(x* + αd) ≤ 0   for each i ∈ A(x*)
Using Taylor's theorem we can write
g_i(x* + αd) = g_i(x*) + α ∇g_i(x*)^T d + o(α) ≤ 0
Thus, a necessary condition for d to be feasible is
∇g_i(x*)^T d ≤ 0   for each i ∈ A(x*)
NLP Optimality (3)
Directions satisfying the above condition are feasible only when some additional conditions, called regularity conditions, hold
If the point x* is a minimum, then in every feasible direction the objective function must not decrease
Therefore, at the optimal point x* for each feasible direction d the following condition must be satisfied
∇f(x*)^T d ≥ 0
NLP Optimality (4)
Farkas' lemma. The conditions ∇g_i(x*)^T d ≤ 0 for each i ∈ A(x*) and ∇f(x*)^T d ≥ 0 mean that the gradient of the objective function ∇f(x*) lies in the convex cone defined by the vectors {-∇g_i(x*), i ∈ A(x*)}
In other words, the gradient of the objective function at the optimum point can be written as a nonnegative combination of the constraint gradients {-∇g_i(x*), i ∈ A(x*)} at this point
∇f(x*) = -∑_{i∈A(x*)} λ_i ∇g_i(x*),   λ_i ≥ 0
The coefficients λ_i are called Lagrange multipliers
For each inactive constraint, the multiplier is 0
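A toy numerical check of this cone condition (the problem, its optimum (1, 1), and the multiplier value 2 are assumed illustration data):

import numpy as np

# min f(x) = (x1-2)^2 + (x2-2)^2  s.t.  g(x) = x1 + x2 - 2 <= 0;  x* = (1, 1)
x = np.array([1.0, 1.0])
grad_f = 2 * (x - 2)                  # grad f(x*) = (-2, -2)
grad_g = np.array([1.0, 1.0])         # grad g(x*); g is active at x*
lam = 2.0                             # Lagrange multiplier, lam >= 0
print(grad_f + lam * grad_g)          # [0. 0.], i.e. grad f = -lam * grad g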
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Kuhn-Tucker Conditions
• The Kuhn-Tucker theorem is a theorem in nonlinear programming
• The Kuhn-Tucker theorem states that if a regularity condition holds and the objective function f and the functions g_i and h_j are convex, then a point x that satisfies the Kuhn-Tucker conditions for some vector of multipliers λ is a global minimum
• The Kuhn-Tucker theorem is a generalization of the method of Lagrange multipliers
• Farkas' lemma is key in proving this theorem
Regularity conditions
• All constraints g_i are linear functions
• All constraints g_i are convex functions and there is a point x such that g_i(x) < 0 holds for all i ∈ I (Slater condition)
• The gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent
Formulations (1)
Necessary Kuhn-Tucker conditions for a problem with inequality constraints
1. ∇f(x) + ∑_{i∈I} λ_i ∇g_i(x) = 0
2. g_i(x) ≤ 0   for i ∈ I
3. λ_i ≥ 0   for i ∈ I
4. λ_i g_i(x) = 0 for i ∈ I, or equivalently ∑_{i∈I} λ_i g_i(x) = 0
In order for a solution point to satisfy the above KKT conditions, it should satisfy some regularity condition
Formulations (2)
Necessary Kuhn-Tucker conditions for a problem with equality constraints
1. ∇f(x) + ∑_{j∈E} μ_j ∇h_j(x) = 0
2. h_j(x) = 0   for j ∈ E
In order for a solution point to satisfy the above KKT conditions, it should satisfy some regularity condition
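For the equality-constrained case the conditions can be solved symbolically. A minimal SymPy sketch on an assumed toy problem (min x1² + x2² subject to x1 + x2 = 1):

import sympy as sp

x1, x2, mu = sp.symbols("x1 x2 mu", real=True)
f = x1**2 + x2**2                     # assumed toy objective
h = x1 + x2 - 1                       # assumed equality constraint h(x) = 0

# Condition 1: grad f + mu * grad h = 0;  condition 2: h(x) = 0
eqs = [sp.diff(f, x1) + mu * sp.diff(h, x1),
       sp.diff(f, x2) + mu * sp.diff(h, x2),
       h]
print(sp.solve(eqs, [x1, x2, mu]))    # {x1: 1/2, x2: 1/2, mu: -1}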
Formulations (3)
Necessary Kuhn-Tucker conditions for a problem with equality and inequality constraints
1. ∇f(x) + ∑_{j∈E} μ_j ∇h_j(x) + ∑_{i∈I} λ_i ∇g_i(x) = 0
2. g_i(x) ≤ 0   for i ∈ I
3. h_j(x) = 0   for j ∈ E
4. λ_i ≥ 0   for i ∈ I
5. λ_i g_i(x) = 0   for i ∈ I
In order for a solution point to satisfy the above KKT conditions, it should satisfy some regularity condition
Example (1)
indices
i = 1,2,…,n   links (variables)
constants
c_i   capacity of link i
D     sum of all link flows
variables
f_i   flow in link i
Example (2)
objective
minimize T = ∑_i f_i/(c_i – f_i)
constraints
f_i ≤ c_i   for i = 1,2,…,n   (constraint g_i)
D – ∑_i f_i = 0   (constraint g)
Example (3)
Each constraint is linear and the objective function is convex, thus the Kuhn-Tucker conditions are sufficient and the global minimum can be found
The KT conditions can be written as follows
c_i/(f_i – c_i)² – λ + λ_i = 0   for i = 1,2,…,n   (objective)
(f_i – c_i) λ_i = 0   for i = 1,2,…,n   (constraint g_i)
(D – ∑_i f_i) = 0   (constraint g)
λ_i ≥ 0   for i = 1,2,…,n
Example (4)
Using the first condition we obtain
f_i = c_i ± √(c_i/(λ – λ_i))
If we omit "+", we have f_i ≤ c_i, which means that constraint g_i is satisfied for i = 1,2,…,n
Due to the construction of the objective function we must have f_i < c_i, which yields the following condition
λ_i = 0   for i = 1,2,…,n
Consequently, we can write
f_i = c_i – √(c_i/λ)
Example (5)
Using this formula in constraint g we have
λ = ( ∑_i √c_i / (∑_i c_i – D) )²
Next, using the above formula we obtain
f_i = c_i – √c_i (∑_j c_j – D) / ∑_j √c_j
The solution satisfies all Kuhn-Tucker conditions, thus it is optimal
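The closed-form solution can be cross-checked against a generic numerical optimizer. A sketch with assumed capacities c = (10, 20, 40) and demand D = 35:

import numpy as np
from scipy.optimize import minimize

c = np.array([10.0, 20.0, 40.0])      # assumed link capacities
D = 35.0                              # assumed total flow, D < sum(c)

# Closed-form KKT solution derived above
f_kkt = c - np.sqrt(c) * (c.sum() - D) / np.sqrt(c).sum()

# Numerical check: min T = sum(f_i/(c_i - f_i)) s.t. sum(f) = D, 0 <= f_i < c_i
res = minimize(lambda f: np.sum(f / (c - f)),
               x0=D * c / c.sum(),    # feasible starting point
               bounds=[(0, ci - 1e-9) for ci in c],
               constraints={"type": "eq", "fun": lambda f: f.sum() - D})
print(f_kkt)                          # approx [ 2.07  8.79 24.14]
print(res.x)                          # matches the closed form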
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Lagrangian Function (1)
The Lagrangian function is defined in the following way
L(x,λ) = f(x) + ∑_{j∈E} μ_j h_j(x) + ∑_{i∈I} λ_i g_i(x)
where μ and λ are vectors of multipliers, λ ∈ R^m
Definition. A point ⟨x*,λ*⟩, λ* ≥ 0, λ* ∈ R^m, x* ∈ X is a saddle point of the function L(x,λ) if
L(x*,λ*) ≤ L(x,λ*)   for all x ∈ X
L(x*,λ*) ≥ L(x*,λ)   for all λ ≥ 0, λ ∈ R^m
Lagrangian Function (2)
Theorem. Let λ* ≥ 0 and x* ∈ X. The point ⟨x*,λ*⟩ is a saddle point of L(x,λ) if and only if
• x* is a minimum of L(x,λ*) over all x ∈ X
• g_i(x*) ≤ 0 for each i ∈ I
• λ*_i g_i(x*) = 0 for each i ∈ I
Theorem. If the point ⟨x*,λ*⟩ is a saddle point of the Lagrangian function L(x,λ), then x* is an optimal solution of the NLP
Lagrange Duality (1)
Let L(x,λ) be a Lagrangian function
L(x,λ) = f(x) + λ g(x)
where x ∈ X ⊆ R^n, λ ∈ Λ = {λ : λ ∈ R^m, λ ≥ 0}, f : X → R^1 and g : X → R^m
Definition. L_P(x) is the primal function defined as follows
L_P(x) = f(x) if g_i(x) ≤ 0 for each i ∈ I
L_P(x) = +∞ otherwise
Definition. L_D(λ) is the dual function defined as
L_D(λ) = inf_{x∈X} L(x,λ)
Lagrange Duality (2)
Theorem (weak duality). The value of the dual function at any feasible solution is always lower than or equal to the value of the primal function at any feasible solution
L_P(x) ≥ L_D(λ)
Definition. Let x* and λ* be optimal solutions of the primal and dual problems, respectively. The difference L_P(x*) – L_D(λ*) is called the duality gap
Theorem (strong duality). If the objective and the inequality constraints are convex, the equality constraints are linear, and a regularity condition (e.g., the Slater condition) holds, then the duality gap is 0
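A toy illustration of the dual function and a zero duality gap (the one-dimensional problem below is an assumed example):

import numpy as np

# Convex toy problem: min f(x) = x**2  s.t.  g(x) = 1 - x <= 0; optimum f* = 1
# L(x, lam) = x**2 + lam*(1 - x); the inner minimum is at x = lam/2, hence
# L_D(lam) = lam - lam**2 / 4
lam = np.linspace(0.0, 4.0, 401)
L_D = lam - lam**2 / 4
print(lam[np.argmax(L_D)], L_D.max())   # 2.0 1.0 -> max L_D equals f*, gap 0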
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Computational Intelligence (1)
• Computational intelligence comprises heuristic algorithms combining various approaches, mostly inspired by nature
• Examples of computational intelligence methods
– Evolutionary algorithm
– Local search
– Simulated annealing
– Tabu search
– Ant algorithm
– GRASP
Computational Intelligence (2)
• Computational intelligence methods are mostly stochastic, since they use some random elements
• Computational intelligence algorithms yield suboptimal results without an optimality guarantee
• Computational intelligence algorithms solve a discrete problem
min F(x) for x ∈ S
• where S is a finite space of feasible solutions
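As a concrete illustration of such a stochastic method, a minimal simulated annealing sketch for min F(x), x ∈ S (the geometric cooling schedule and the toy bit-string problem are assumed choices):

import math
import random

def simulated_annealing(F, x0, neighbor, T0=1.0, alpha=0.995, steps=10000):
    x = best = x0
    T = T0
    for _ in range(steps):
        y = neighbor(x)                     # random move within S
        d = F(y) - F(x)
        # accept improvements always, worse moves with probability exp(-d/T)
        if d <= 0 or random.random() < math.exp(-d / T):
            x = y
            if F(x) < F(best):
                best = x
        T *= alpha                          # geometric cooling
    return best

# toy usage: minimize the number of 1-bits in an 8-bit string (optimum 0)
flip = lambda x: x ^ (1 << random.randrange(8))
print(simulated_annealing(lambda x: bin(x).count("1"), 255, flip))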
Lecture Outline
• Linear Programming
• Integer Problems
• Branch and Bound Algorithms
• Nonlinear Problems
• Kuhn-Tucker Conditions
• Lagrangian Function
• Computational Intelligence
• Concluding Remarks
Concluding Remarks
• Many optimization problems are linear with continuous variables and can be solved by the simplex method
• Problems with some integer variables are mostly NP-hard
• To find optimal solutions of such problems, branch-and-bound or branch-and-cut methods must be used
• Some problems with specific properties can be attacked using classical optimization approaches like duality, Kuhn-Tucker conditions, and Lagrangian relaxation
• Computational intelligence provides effective algorithms that can be adapted to most optimization problems