This note reviews the concepts of functionals, Euler’s equation, equilibrium, and the equivalence between the equilibrium equations and the principle of minimum potential energy.

## Functions and functionals

### Definition of Functions

Long ago, I was introduced to the concept of mathematical functions in high school algebra. My high school definition of a function is:

*A function is a unique correspondence relation \(f\) from a number \(x\) in a set of numbers \(X\) to a number \(y\) in another set of numbers \(Y\).*

A slightly more vivid translation is:

*A function \(y=f(x)\) sends a number \(x \in X\) to a number \(y \in Y\) with one requirement: each number \(x\) corresponds to exactly one number \(y\).*

At that time, the sets a function could operate on were all sets of numbers. We call the set \(X\) the *domain* of the function and the set \(Y\) its *codomain*. All the possible values of \(y\) together form a new set called the *range* of the function.

Later, in college calculus and mechanics courses, the definition of a function was expanded to allow multiple variables (i.e., multivariable functions) and to operate on sets of vectors.

An even higher-level definition of functions is based on set theory; it helped me greatly in understanding functional analysis as well as many fundamental concepts in continuum mechanics. It was also where aliases of functions such as “map” and “transform” first appeared to me.

*A map is a collection of ordered pairs with a restriction: the first item of every ordered pair in the collection is unique.*

An equivalent translation, without introducing the concepts of collection, pair, and ordered pair, is:

*A function uniquely sends an element \(x\) of a set \(X\) to an element \(y\) of a set \(Y\).*

This definition further generalizes the type of sets a function can operate on: from sets of numbers and vectors to sets of any (mathematical) entities. (Professor Suo’s lecture notes give a great explanation.)

We confine our discussion to what mechanical engineers probably use most frequently: real-valued functions of real variables.

*A real-valued function maps the set (or a subset) of real numbers to the set (or a subset) of real numbers.*

The verb “maps” here guarantees the unique correspondence. The following animation, emphasizing the action of mapping, shows the graphical representation of the real-valued function \(y=x^2+1\), which maps the set of real numbers to the set of real numbers that are greater than or equal to \(1\).
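
As a tiny illustration (a sketch of my own in plain Python, not from the original note), the map \(y=x^2+1\) can be written directly as code; the uniqueness requirement is automatic because a function call returns exactly one value per input:

```python
def f(x):
    """Map a real number x to the real number x**2 + 1."""
    return x ** 2 + 1

# Each input corresponds to exactly one output; sampling the domain
# shows every output lands in the range [1, +inf).
domain_sample = [-2.0, -1.0, 0.0, 1.0, 2.0]
range_sample = [f(x) for x in domain_sample]
print(range_sample)  # [5.0, 2.0, 1.0, 2.0, 5.0]
```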

### Definition of Functionals

Since the elements of the domain and codomain have been expanded to arbitrary mathematical entities, a set of functions can also serve as the domain, and the emergence of functionals naturally follows. A functional is a special type of function whose domain is not a set of numbers or vectors but a set of functions. If we again confine our discussion to real-valued functionals, then

*A real-valued functional maps a set of functions (i.e., a function space) to a set of real numbers.*

There are multiple ways of operating on a function to get a real number, so we confine our discussion to a special form: integral-form functionals. Suppose \(y=y(x)\) is a real-valued function defined on \(x\in[x_0, x_1]\) with a continuous first derivative \(y'(x)\) there. The operation we perform on \(y(x)\) is to integrate an expression \( f[x, y(x), y'(x)] \), built from it and its derivative, over the domain of definition \(x\in[x_0, x_1]\); we then get a real number \(I[y(x)]\) that depends on \(y(x)\):

$$I[y(x)]= \int_{x_0}^{x_1} f[x, y(x), y'(x)] \, dx. $$

For instance, for a planar curve in the Cartesian coordinate plane, the infinitesimal arc length is approximated by

$$ds = \sqrt{{dx}^2 + {dy}^2} = \sqrt{1+(\frac{dy}{dx})^2} \, dx = \sqrt{1+{y'(x)}^2}\, dx.$$

If \( f(x, y, y') \) in the integral functional takes the above form, the functional,

$$I[y(x)]= \int_{x_0}^{x_1} \sqrt{1+{y'(x)}^2} \, dx,$$

is the length of any planar curve \(y(x)\) in the interval \([x_0, x_1]\).
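
An integral-form functional is easy to approximate numerically. The sketch below (plain Python; the midpoint rule and the helper name `arc_length` are my own choices) evaluates the arc-length functional for two candidate curves on \([0, 1]\):

```python
import math

def arc_length(y, x0, x1, n=10_000):
    """Approximate I[y] = integral of sqrt(1 + y'(x)^2) dx with the
    midpoint rule, using a central difference for y'."""
    h = (x1 - x0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * h
        yp = (y(x + 1e-6) - y(x - 1e-6)) / 2e-6  # numerical derivative
        total += math.sqrt(1.0 + yp * yp) * h
    return total

print(arc_length(lambda x: x, 0.0, 1.0))       # straight line: about sqrt(2) = 1.41421...
print(arc_length(lambda x: x ** 2, 0.0, 1.0))  # parabola: about 1.4789, longer
```

The functional behaves as advertised: it sends each input *function* to a single real number, its length.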

## Extremum (minimum or maximum) problems

A typical problem that arises with the introduction of functionals is the extremum problem: find the specific function, from a function space, that extremizes (minimizes or maximizes) the value of the functional. Here we ignore the question of whether the extremum is a minimum or a maximum, and focus only on the extremum problem.

*Suppose we have a real-valued, integral-form functional \( I[y(x)] = \int_{x_0}^{x_1} f[x, y(x), y'(x)] \, dx \), whose input functions are defined on \(x \in [x_0, x_1]\) with fixed endpoint values \(y(x_0) = y_0\) and \(y(x_1)=y_1\). We want to find the specific function \(y(x)\) on \(x \in (x_0, x_1)\) that extremizes the functional \(I[y(x)]\).*

The necessary condition that solves the above extremum problem is Euler’s equation.

### Euler’s equation

Suppose \(y=y(x)\) is the function that extremizes \(I[y(x)]\); it must be so special that it wins out over all other functions. To compare it with all other functions, we write a general form representing all candidate functions: \(z(x) = y(x) + \epsilon \eta(x)\). Here \(\eta(x)\) is an arbitrary real-valued function with the same continuity as \(y(x)\), and \(\epsilon\) is a real scaling parameter. As stated in the definition of the extremum problem, \(z(x)\) needs to satisfy \(z(x_0) = y_0\) and \(z(x_1) = y_1\), so \(\eta(x_0) = \eta(x_1) = 0\). The functional, with the input function \(z(x)\), has the value:

$$\begin{eqnarray} I[z(x)] & = & \int_{x_0}^{x_1} f[x, z(x), z'(x)] \, dx \\

& = & I[y(x) + \epsilon\eta(x)] = \int_{x_0}^{x_1} f\big(x, y(x)+\epsilon\eta(x), [y(x)+\epsilon\eta(x)]'\big) \, dx \\

& = & \int_{x_0}^{x_1} f[x, y(x)+\epsilon\eta(x), y'(x)+\epsilon {\eta}'(x)] \, dx. \end{eqnarray}$$

Since \(y(x)\) wins out over all other functions, it also beats the family of functions \(y(x) + \epsilon \eta(x)\), in which \(\eta(x)\) is held fixed and scaled by \(\epsilon\). The functional then becomes an ordinary function that sends a real number \(\epsilon\) to another real number, given fixed \(y(x)\) and \(\eta(x)\):

$$g(\epsilon) = I[y(x)+\epsilon \eta(x)] \bigg|_{\text{fixed } y(x) \text{ and } \eta(x)}.$$

As an ordinary function, a necessary condition for \(g(\epsilon)\) to have an extremum is that its first derivative vanishes: \(\frac{dg(\epsilon)}{d\epsilon}=0.\)

On the other hand, the function \(g(\epsilon)\) falls back to \(I[y(x)]\) when \(\epsilon = 0\). Since \(y(x)\) extremizes \(I\), the function \(g(\epsilon)\) must attain its extremum at \(\epsilon = 0\). So we require:

$$\frac{dg(\epsilon)}{d\epsilon} \bigg|_{\epsilon = 0} = 0.$$

Manipulate a bit:

$$\begin{eqnarray}\frac{dg(\epsilon)}{d\epsilon} & = & \frac{\partial I[y(x)+\epsilon\eta(x)]}{\partial \epsilon} \\

& = & \frac{\partial I[z(x)]}{\partial \epsilon} = \frac{\partial}{\partial \epsilon} \int_{x_0}^{x_1} f[x, z(x), z'(x)] \, dx \\

& = & \int_{x_0}^{x_1} \bigg( \frac{\partial f[x, z(x), z'(x)]}{\partial z} \frac{\partial z(x)}{\partial \epsilon} + \frac{\partial f[x, z(x), z'(x)]}{\partial z'} \frac{\partial z'(x)}{\partial \epsilon} \bigg) \, dx \\

& = & \int_{x_0}^{x_1} \bigg( \frac{\partial f[x, z(x), z'(x)]}{\partial z} \frac{\partial [y(x)+\epsilon \eta(x)]}{\partial \epsilon} + \frac{\partial f[x, z(x), z'(x)]}{\partial z'} \frac{\partial [y'(x)+\epsilon \eta'(x)]}{\partial \epsilon}\bigg) \, dx \\

& = & \int_{x_0}^{x_1} \bigg( \frac{\partial f[x, z(x), z'(x)]}{\partial z} \eta(x) + \frac{\partial f[x, z(x), z'(x)]}{\partial z'} \eta'(x) \bigg) \, dx .\\

\end{eqnarray}$$

When \(\epsilon = 0\), \(z(x) = y(x) + \epsilon\eta(x) = y(x)\), and \(\partial z\) and \(\partial z'\) fall back to \(\partial y\) and \(\partial y'\), so

$$\frac{dg(\epsilon)}{d\epsilon} \bigg|_{\epsilon = 0}= \int_{x_0}^{x_1} \bigg( \frac{\partial f[x, y(x), y'(x)]}{\partial y} \eta(x) + \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \eta'(x) \bigg) \, dx.$$

The second term in the integrand can be converted by integration by parts:

$$\frac{\partial f[x, y(x), y'(x)]}{\partial y'} \eta'(x) = \big( \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \cdot \eta(x) \big)' - \frac{d}{dx}\bigg( \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \bigg) \eta(x).$$

So,

$$\frac{dg(\epsilon)}{d\epsilon} \bigg|_{\epsilon = 0}= \int_{x_0}^{x_1} \bigg( \frac{\partial f[x, y(x), y'(x)]}{\partial y} - \frac{d}{dx}\big( \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \big) \bigg) \eta(x) \, dx + \bigg( \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \cdot \eta(x) \bigg) \bigg|_{x_0}^{x_1}. $$

Since \(\eta(x_0) = \eta(x_1) = 0\), the last term vanishes, so

$$\frac{dg(\epsilon)}{d\epsilon} \bigg|_{\epsilon = 0}= \int_{x_0}^{x_1} \bigg( \frac{\partial f[x, y(x), y'(x)]}{\partial y} - \frac{d}{dx}\big( \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \big) \bigg) \eta(x) \, dx = 0. $$

Because \(\eta(x)\) is arbitrary, the above equation holds only when (this is the fundamental lemma of the calculus of variations):

$$\frac{\partial f[x, y(x), y'(x)]}{\partial y} - \frac{d}{dx}\big( \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \big)= 0.$$

Here we arrive at the famous Euler’s equation (or Euler–Lagrange equation). It states that if \(y(x)\) extremizes the functional \(I[y(x)]=\int_{x_0}^{x_1} f[x, y(x), y'(x)] \, dx\), then \(y(x)\) must satisfy Euler’s equation.

We should note that Euler’s equation is only a necessary condition for an extremum of the functional. The analogy with ordinary functions: the vanishing of the first derivative (\(y'(x)=0\)) is also only a necessary condition. To classify a critical point, we examine the second derivative there: if \(y''<0\), the function has a maximum; if \(y''>0\), a minimum. The situation for functionals is the same: we need higher-order conditions to distinguish maximum from minimum. We have been focusing on the extremum problem and ignoring whether it is specifically a maximum or a minimum; the higher-order conditions settle that.
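
The analogy with ordinary functions can be checked in a few lines. This sketch (plain Python, finite differences; the step sizes are my own choices) classifies the critical point at \(x=0\) for \(y=x^2\) and \(y=-x^2\):

```python
def d1(f, x, h=1e-5):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central-difference second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

f_min = lambda x: x ** 2    # critical point at x = 0 with y'' > 0: a minimum
f_max = lambda x: -x ** 2   # critical point at x = 0 with y'' < 0: a maximum

for g in (f_min, f_max):
    # first derivative vanishes at the critical point; the sign of the
    # second derivative decides minimum vs. maximum
    print(d1(g, 0.0), d2(g, 0.0))
```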

### Example problem

What is the shortest path connecting two planar points \((x_0, \, y_0)\) and \((x_1, \, y_1)\)? Our intuition is already too familiar with the answer: a straight line. But let’s see how functionals get us there mathematically.

We convert the problem to: what kind of function connecting \((x_0, \, y_0)\) and \((x_1, \, y_1)\) has the shortest length? We formulate the functional \(I[y(x)]\) that transforms any function (with suitable continuity) into a real number, its length:

$$I[y(x)]=\int_{x_0}^{x_1} f[x, y(x), y'(x)] \, dx=\int_{x_0}^{x_1} \sqrt{1+{y'(x)}^2} \, dx,$$

with

$$f[x, y(x), y'(x)]=\sqrt{1+{y'(x)}^2}.$$

Applying Euler’s equation to \(f[x, y(x), y'(x)]\):

$$\begin{eqnarray} \frac{\partial f[x, y(x), y'(x)]}{\partial y} - \frac{d}{dx}\big( \frac{\partial f[x, y(x), y'(x)]}{\partial y'} \big)= 0 \\

\frac{\partial \sqrt{1+{y'(x)}^2}}{\partial y} - \frac{d}{dx}\big( \frac{\partial \sqrt{1+{y'(x)}^2}}{\partial y'} \big)= 0 \\

0 - \frac{d}{dx} \big( \frac{y'(x)}{\sqrt{1+{y'(x)}^2}} \big) = 0 \\

\frac{y''(x) \sqrt{1+{y'(x)}^2} - y'(x) \frac{y'(x)y''(x)}{\sqrt{1+{y'(x)}^2}}}{1+{y'(x)}^2} = 0 \\

\frac{y''(x)}{{\big (1+{y'(x)}^2 \big) }^{\frac{3}{2}}} = 0 \\

y''(x) = 0.

\end{eqnarray}$$

Therefore, \(y(x)\) has to be a linear function, \(y(x) = ax+b\), connecting \((x_0, \, y_0)\) and \((x_1, \, y_1)\). This confirms, by the calculus of variations, that the shortest path between two points on a plane is a straight line.
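
We can also probe this conclusion numerically: add a perturbation \(\epsilon\eta(x)\) with \(\eta\) vanishing at the endpoints, and check that the length only grows. A sketch (plain Python; the integrator and names are my own) comparing the straight line from \((0, 0)\) to \((1, 1)\) with perturbed curves:

```python
import math

def length(y, n=20_000):
    """Midpoint-rule arc length of y on [0, 1] with a central-difference slope."""
    h = 1.0 / n
    s = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        yp = (y(x + 1e-6) - y(x - 1e-6)) / 2e-6
        s += math.sqrt(1.0 + yp * yp) * h
    return s

straight = lambda x: x                  # the Euler-equation solution y'' = 0
eta = lambda x: math.sin(math.pi * x)   # vanishes at both endpoints

L0 = length(straight)
for eps in (0.1, 0.05, 0.01):
    L_eps = length(lambda x, e=eps: x + e * eta(x))
    print(eps, L_eps > L0)   # every perturbation makes the curve longer
```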

## Equilibrium

We mechanics people are also very familiar with the concept of potential energy and the principle of minimum potential energy. Let’s explore the connection. We study the simplest problem: an elastic bar subjected to a constant body force (for example, gravity) and fixed at one end.

We assume the bar undergoes small deformation and its material is linearly elastic, so the deformation measure (the strain at coordinate \(x\)) and the constitutive relation are:

$$\epsilon(x) = \frac{du(x)}{dx}=u'(x)$$

$$\sigma = E \epsilon.$$

For a single piece of material at coordinate \(x\) undergoing stress \(\sigma\) and strain \(\epsilon\), the strain energy per unit volume, which is also the definition of the strain energy density, is:

$$dU=\int_{0}^{\epsilon}\sigma(\bar{\epsilon}) \, d\bar{\epsilon} = \frac{1}{2}\sigma \epsilon = \frac{1}{2} E \epsilon^{2} = \frac{1}{2} E{u'(x)}^2.$$

The strain energy of the whole bar is the sum of the strain energy of every such material piece, or in other words, the integral of the strain energy density over the whole bar (with cross-sectional area \(A\) and length \(L\)):

$$U=\int_{\Omega} \, dU= \int_{0}^{L} \frac{1}{2}E {u'(x)}^2 A \, dx. $$

For a micro-element of the bar at coordinate \(x\) with length \(dx\), the work done by the external body force (of density \(f\) per unit volume) on the micro-element is:

$$dW=fAdx \, u(x).$$

The work done by the external body force on the whole bar is the sum of the work done on every micro-element:

$$W=\int_{\Omega}dW=\int_{0}^{L}fu(x)A \, dx.$$

The potential energy of the bar is:

$$P=U-W= \int_{0}^{L} \bigg[ \frac{1}{2}E {u'(x)}^2 - fu(x) \bigg] A \, dx. $$

From the point of view of functionals, the potential energy is actually a real-valued functional that maps a set of displacement functions \(u(x)\) to real numbers, the potential energies:

$$P[u(x)] =\int_{x_0}^{x_1} f[x, u(x), u'(x)] \, dx= \int_{0}^{L} \bigg[ \frac{1}{2}E {u'(x)}^2 - fu(x) \bigg] A \, dx. $$

So from the correspondence,

$$x_0=0, \, x_1=L, \, f[x,u(x),u'(x)]=\frac{1}{2}E {u'(x)}^2-fu(x).$$

In the previous discussion of functionals, input functions had to satisfy boundary conditions at both \(x_0\) and \(x_1\). Here, \(u(x)\) also needs to satisfy its boundary conditions, but with a slight difference. The first boundary condition is of the same kind: zero displacement at the left end, \(u(0)=0\). The second boundary condition comes from the free surface (\(\sigma=0\)) at the right end, so \(\sigma(L)=Eu'(L)=0\). This differs from the above derivation because the condition is imposed on the first derivative of the displacement rather than on the function itself. But Euler’s equation still holds for this type of problem: the boundary term from the integration by parts vanishes at \(x=L\) automatically, because \(\partial f/\partial u' = Eu'\) is zero there (a natural boundary condition).

Mother Nature tends to minimize energy, so let’s find what kind of function minimizes the potential energy functional. Applying Euler’s equation to the potential functional,

$$\begin{eqnarray} \frac{\partial f[x, u(x), u'(x)]}{\partial u} - \frac{d}{dx}\big( \frac{\partial f[x, u(x), u'(x)]}{\partial u'} \big)= 0 \\

\frac{\partial}{\partial u} \bigg( \frac{1}{2}E {u'(x)}^2-fu(x) \bigg) - \frac{d}{dx} \bigg( \frac{\partial}{\partial u'} \big( \frac{1}{2}E {u'(x)}^2-fu(x) \big) \bigg) = 0 \\

-f - \frac{d}{dx} \bigg( E u'(x) \bigg) = 0 \\

f + Eu''(x) = 0.\end{eqnarray}$$

This is exactly the 1-D equilibrium equation obtained by analyzing the force balance of a micro-element of material. With the boundary conditions \(u(0)=0\) and \(Eu'(L)=0\), we obtain the analytical solution:

$$u(x) = -\frac{f}{2E}x^2+\frac{fL}{E}x.$$

The problem of finding the displacement of a 1-D elastic bar is thus transformed into the problem of finding the displacement function, from a space of functions, that minimizes the potential energy functional. This is the equivalence between the principle of minimum potential energy and equilibrium.
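
As a final sanity check (a sketch of my own, with illustrative values \(E=f=A=L=1\)), we can verify numerically that the analytical displacement yields a lower potential energy than another kinematically admissible field satisfying \(u(0)=0\):

```python
def potential_energy(u, up, E=1.0, f=1.0, A=1.0, L=1.0, n=10_000):
    """P[u] = integral over [0, L] of [ (1/2) E u'^2 - f u ] A dx, midpoint rule."""
    h = L / n
    return sum(
        (0.5 * E * up((i + 0.5) * h) ** 2 - f * u((i + 0.5) * h)) * A * h
        for i in range(n)
    )

E = f = L = 1.0
u_exact = lambda x: -f / (2 * E) * x ** 2 + f * L / E * x   # the analytical solution
up_exact = lambda x: -f / E * x + f * L / E                 # its derivative

u_alt = lambda x: x          # admissible (u(0) = 0) but not the minimizer
up_alt = lambda x: 1.0

P_exact = potential_energy(u_exact, up_exact)
P_alt = potential_energy(u_alt, up_alt)
print(P_exact, P_alt, P_exact < P_alt)   # the exact solution has lower energy
```

For these values the exact energy works out to \(-1/6\), while the linear trial field gives \(0\), consistent with the minimum principle.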

## References

As a study note, there is nothing new in this post. All insights originally come from these materials:

- Gilbert Strang, *Introduction to Applied Mathematics*
- Louis Komzsik, *Applied Calculus of Variations for Engineers*