# Optimization problems

Lecture notes on optimization problems. Author: @asyncze
An optimization problem can be solved numerically by adjusting the input (the decision variables) to some function: the aim is to find the decision variables that give the minimum or maximum output.
A minimization problem is the inverse of a maximization problem, so finding y = MAX[x(10 - x)] is equivalent to finding y = MIN[-x(10 - x)].
```python
y = lambda x: x * (10 - x)

def MAX(fn, res, range_):
    # evaluate fn at each candidate input and collect (input, output) pairs
    for n in range_:
        res.append((n, fn(n)))
    # print the pair with the largest output
    print(max(res, key=lambda pair: pair[1]))

MAX(y, [], range(100))  # (5, 25)
```
```python
y = lambda x: -x * (10 - x)

def MIN(fn, res, range_):
    # evaluate fn at each candidate input and collect (input, output) pairs
    for n in range_:
        res.append((n, fn(n)))
    # print the pair with the smallest output
    print(min(res, key=lambda pair: pair[1]))

MIN(y, [], range(100))  # (5, -25)
```
Optimization problems can be function-based (using a cost function) or trial-and-error, static or dynamic (subject to change), and constrained or unconstrained. However, most optimization problems are function-based and constrained.
A penalty function can be added to the cost function to penalize infeasible solutions, i.e. solutions that violate constraints: fP(x) = f(x) + SUM[i = 1..m](C_i(x)), where fP(x) is the penalized cost function, f(x) the unpenalized cost function, m the number of constraints, and C_i(x) the penalty imposed for violating constraint i.
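A minimal sketch of this, assuming a single hypothetical constraint x >= 8 imposed on the earlier cost function:

```python
f = lambda x: -x * (10 - x)                # unpenalized cost function, minimum at x = 5
C = lambda x: 1000 * max(0, 8 - x)         # hypothetical constraint x >= 8, penalty grows with violation
fP = lambda x: f(x) + C(x)                 # penalized cost function

res = [(x, fP(x)) for x in range(100)]
print(min(res, key=lambda pair: pair[1]))  # (8, -16), the constraint shifts the minimum from 5 to 8
```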
Multiple-objective optimization refers to optimizing over a set of cost functions whose objectives conflict:
The weighted sum approach is a simple method for optimizing multiple objectives: the cost functions are combined into a single cost function by weighting, g(x) = SUM[i](w_i * f_i(x)), where each w_i > 0 and SUM[i](w_i) = 1.
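A minimal sketch with two hypothetical conflicting objectives (the functions and weights here are illustrative):

```python
f1 = lambda x: x ** 2                      # first objective, prefers x = 0
f2 = lambda x: (x - 10) ** 2               # second objective, prefers x = 10
w1, w2 = 0.5, 0.5                          # weights, w > 0 and SUM[w] = 1
g = lambda x: w1 * f1(x) + w2 * f2(x)      # combined cost function

res = [(x, g(x)) for x in range(100)]
print(min(res, key=lambda pair: pair[1]))  # (5, 25.0), a compromise between 0 and 10
```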
The exhaustive search method, i.e. brute force, is a numerical method that searches a large but finite solution space by evaluating combinations of the different decision variables.
Brute-force methods do not get stuck in a local minimum (given sufficiently fine sampling) and work with both continuous and discontinuous functions (no derivatives needed). However, brute force can take a long time to find the global best, which can also be missed entirely due to under-sampling, and it is only practical with a small number of variables in a limited search space.
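A minimal sketch of exhaustive search over two decision variables on a small grid (the cost function is hypothetical):

```python
from itertools import product

f = lambda x, y: (x - 3) ** 2 + (y - 7) ** 2   # hypothetical cost function, minimum at (3, 7)

# evaluate every combination of the two decision variables on a small grid
res = [((x, y), f(x, y)) for x, y in product(range(10), range(10))]
print(min(res, key=lambda pair: pair[1]))      # ((3, 7), 0)
```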
The Nelder-Mead method is a downhill simplex method that can be used to find a local minimum of a function of several variables (a simplex in dimension n is an elementary shape with n + 1 vertices, such as a triangle in 2D, where the points are referred to as best, B, good, G, and worst, W).
The search for a local minimum is repeated until the error is acceptable, and the best-good-worst points (or simply BGW) are updated whenever better points are found.
For example: the function f(x, y) = x + y with x = 2 and y = 3 gives f(2, 3) = 5, and the point W = (4, 5) gives f(4, 5) = 9, so move towards (2, 3), which is smaller (when optimizing for the minimum).
The midpoint of the best side is M = (B + G) / 2, the reflection point is R = M + (M - W) = 2M - W, and the expansion point is E = R + (R - M) = 2R - M. The contraction points are C1 = (W + M) / 2 and C2 = (M + R) / 2, and shrinking towards B gives the shrink point S = (B + W) / 2.
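A minimal sketch of a single Nelder-Mead step built from these points, with a hypothetical cost function; the acceptance logic is condensed compared to the full method:

```python
import numpy as np

f = lambda p: (p[0] - 2) ** 2 + (p[1] - 3) ** 2    # hypothetical cost function, minimum at (2, 3)

def step(B, G, W):
    # assumes f(B) <= f(G) <= f(W); returns a candidate point to replace W
    M = (B + G) / 2                                # midpoint of the best side
    R = 2 * M - W                                  # reflection point
    if f(R) < f(G):
        E = 2 * R - M                              # expansion point
        return E if f(E) < f(R) else R
    C1, C2 = (W + M) / 2, (M + R) / 2              # contraction points
    C = C1 if f(C1) < f(C2) else C2
    return C if f(C) < f(W) else (B + W) / 2       # otherwise shrink towards B

B, G, W = np.array([2.0, 3.5]), np.array([3.0, 3.0]), np.array([4.0, 5.0])
print(step(B, G, W))                               # [1.75  2.375], better than W
```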
A line minimization method, i.e. line search, starts at a random point and selects a direction to move in until the cost function changes (increases or decreases, depending on the goal). Line search methods are generally robust and converge quickly, but they use gradients, so the function needs to be differentiable. Line search methods are also not guaranteed to find the global minimum:
x[k+1] = x[k] - f'(x[k]) / f''(x[k])
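A minimal sketch of this update applied to f(x) = x**2 - 10*x (the minimization form of the earlier example); the derivatives are hard-coded for brevity:

```python
f1 = lambda x: 2 * x - 10    # f'(x) for f(x) = x**2 - 10*x
f2 = lambda x: 2             # f''(x)

x = 0.0                      # initial point
for k in range(10):
    x = x - f1(x) / f2(x)    # x[k+1] = x[k] - f'(x[k]) / f''(x[k])
print(x)                     # 5.0, the minimum of x**2 - 10*x
```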
The random walk method uses random numbers to move in different directions, which is particularly useful for avoiding getting stuck in a local minimum, for exploring different regions to locate additional minima, or for concentrating the search in a smaller region to refine a solution:
x[k+1] = x[k] + D[k] * h[k]
where D[k] is a random search direction matrix and h[k] is a step size vector.

Random walk methods, and other methods that use random numbers, work well with both discrete and continuous variables, as well as with discontinuous and non-differentiable functions (no derivatives needed), and they are not as sensitive to the initial point selection. However, random walks are more sensitive to the control parameters (direction and step size).
Note that solutions found using random numbers are not as easily repeatable, so multiple runs might be needed to verify results.
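A minimal sketch of a greedy random walk with a hypothetical cost function; the seed is fixed so the run is repeatable:

```python
import random

f = lambda x, y: (x - 2) ** 2 + (y - 3) ** 2     # hypothetical cost function, minimum at (2, 3)

random.seed(42)                                  # fixed seed makes this run repeatable
x, h = [0.0, 0.0], 0.5                           # initial point and step size
best = f(*x)
for k in range(1000):
    D = [random.uniform(-1, 1) for _ in x]       # random search direction
    cand = [xi + Di * h for xi, Di in zip(x, D)]
    if f(*cand) < best:                          # accept the move only if it improves
        x, best = cand, f(*cand)
print(x, best)                                   # close to (2, 3) with cost near 0
```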