Optimization: Calculus of Variations
To find the optimal values of a function f, we solve for the points where its derivative f’ equals zero.
The calculus of variations optimizes functionals, functions that take other functions as input. One application is finding the path of shortest travel time. Here, a path is a function y = f(x), and the travel time T is expressed in terms of f. The optimal path, denoted f*, is the function that minimizes T among all possible paths.
Let’s investigate the fastest sailing route from NYC to Morocco, taking into account the impact of trade winds and ocean currents. For a tiny displacement ds along the path, the travel time dt is equal to
Combining it with
To find the total travel tim, T, we integrate the infinitesimal time element dt along the entire path from x₁ (NYC) to x₂ (Morocco), giving us:
The function y = f(x) represents a sailing path, and T(f) calculates the corresponding travel time. This makes T a functional — a function that takes another function as input and returns a scalar value (the travel time). Our goal is to find the stationary function f *(x) of this functional, representing the path with the shortest travel time. i.e. we find y = f*(x) such that a functional I
is stationary with the boundary condition y₁= f(x₁) and y₂ = f(x₂).
Announcement
Although I haven’t published many articles in 2023 and 2024, these years have been incredibly busy for me. The rapid advancements in AI have inspired me to write a book on Generative AI, a project that has been both challenging and time-consuming. This article is part of a chapter on optimization. However, I cannot include every detail in the book, so I am providing additional information in this article.
While the book isn’t finished yet, you can stay updated on its progress by following me on Medium or connecting with me on LinkedIn. I’ll share an announcement there as soon as it’s ready!
Euler-Lagrange Equation
Suppose y(x) makes I stationary with the boundary conditions satisfied. We introduce a function η(x). It is a completely random function except its boundary points are zero.
We define another function ȳ that can represent any function except it has to fulfill the same boundary conditions of y(x).
If ε = 0, both ȳ and y(x) refer to the same stationary function.
The first and second derivatives of ȳ w.r.t. ε are
We will need that later. Next, we want to find
R.H.S. is an integral over x. Therefore I is a function of ε as everything else that depends on x will be integrated over. We just turn this problem into optimizing a single variable ε, instead of a function. Therefore, we can apply the regular calculus to find the optimal ȳ by setting the derivative of I w.r.t. ε to zero. With ε to be zero, ȳ and y(x) refer to the same stationary function. Therefore, we set the derivative of I to be zero with ε=0.
i.e.
By applying integration by parts in the second term with η(x₁) = η(x₂) = 0, we get
Now combining the result, we get
Since η is a relatively random function, for the integration to be zero, we need
This is the condition for y = f(x) in making F stationary. And it is called the Euler-Lagrange equation.
Shortest Path
Let’s use the sailing route problem to demonstrate the Euler-Lagrange equation. We’ll simplify the problem and assume that all external factors, such as ocean currents and winds, do not affect the path, and that the sailing boat moves at a constant speed. In an optimization problem, any constant factor can be ignored. Therefore, the functional III simplifies to:
To find the shortest path, we apply the Euler-Lagrange equation.
The shortest distance between two points is a straight line.