Why Aren’t Generalized Coordinates in the Lagrangian Equation Considered Functions of Time?


Excerpt:

Ever wondered why generalized coordinates in the Lagrangian formulation are treated as independent of time, despite their obvious connection?


Note: This post is an English adaptation of my original Chinese article (URL). Some parts have been modified for clarity, cultural relevance, or to better fit the English-speaking audience.

I was once puzzled by this issue, so I’d like to briefly share my understanding now. It may not be entirely correct, but here’s my take:

Without loss of generality, let’s consider a $1$-dimensional space; the conclusions here extend directly to $n$ dimensions.

First, in the Euler-Lagrange equation, the Lagrangian $\mathcal{L}$ is defined as a multivariable function $\mathcal{L}=f\left(q_1, \cdots, q_N ; v_1, \cdots, v_N ; t\right): \mathbb{R}^{2 N+1} \rightarrow \mathbb{R}$. Therefore, when we consider expressions like $\dfrac{\partial \mathcal{L}}{\partial q_i}$, $\dfrac{\partial \mathcal{L}}{\partial v_i}$, or $\dfrac{\partial \mathcal{L}}{\partial t}$, we are treating $q_i$, $v_i$, and $t$ as three entirely independent variables, much like how we handle variables $x$, $y$, and $z$ in the function $f(x, y, z)$.

Why is that? Fundamentally, it’s because this is how the mathematical definition of partial derivatives works, even if the variables involved are interrelated. For example, consider the function $f(x(t), t)$. Clearly, $x$ is a function of $t$, but when you take the partial derivative of $f$ with respect to $t$, the first argument $x(t)$ is held fixed, and when you take the partial derivative of $f$ with respect to $x(t)$, the second argument $t$ is held fixed. This is simply how partial derivatives operate: each one varies a single argument slot of $f$ while freezing the others, as the formal definition of the partial derivative makes precise. Therefore, when we take partial derivatives of the Lagrangian $\mathcal{L}$, $q_i$, $v_i$, and $t$ are treated as independent variables.
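As a concrete illustration, take $f(x, t) = x^2 t$ together with the path $x(t) = 3t$ (both are hypothetical choices made only for this example, not tied to any physical system). The partial derivatives are:

$$
\frac{\partial f}{\partial t} = x^2, \qquad \frac{\partial f}{\partial x} = 2 x t
$$

Neither result contains $\dfrac{\mathrm{d} x}{\mathrm{d} t} = 3$. The partial derivative with respect to $t$ only sees the explicit $t$ in the second slot of $f$. At $t = 2$, where $x = 6$, this gives $\dfrac{\partial f}{\partial t} = 36$ and $\dfrac{\partial f}{\partial x} = 24$.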

However, when we consider the derivative $\dfrac{\mathrm{d} \mathcal{L}}{\mathrm{d} t}$, we are evaluating the Lagrangian along a trajectory, with $q_i = q_i(t)$ and $v_i = v_i(t)$ substituted in, so $\mathcal{L}$ is effectively a single-variable function of $t$, $\mathbb{R} \rightarrow \mathbb{R}$. Thus, we need to apply the chain rule to expand it, resulting in:

$$
\displaystyle \frac{\mathrm{d} \mathcal{L}}{\mathrm{d} t}=\frac{\partial \mathcal{L}}{\partial t}+\sum_{i=1}^N \frac{\partial \mathcal{L}}{\partial q_i} \frac{\mathrm{d} q_i}{\mathrm{d} t} +\sum_{i=1}^N \frac{\partial \mathcal{L}}{\partial v_i} \frac{\mathrm{d} v_i}{\mathrm{d} t}
$$

It is important to note that the derivative $\dfrac{\mathrm{d} \mathcal{L}}{\mathrm{d} t}$ referred to here is the total derivative of the Lagrangian $\mathcal{L}$. According to the definition of the total derivative, the function being differentiated should be a single-variable function.
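The difference between the two kinds of derivative can also be checked numerically. The sketch below is only an illustration under stated assumptions: the toy Lagrangian $\mathcal{L} = \tfrac{1}{2} v^2 - \tfrac{1}{2} q^2$, the path $q(t) = \sin t$, $v(t) = \cos t$, and the helper names `lagrangian`, `partial`, `path`, and `total_derivative` are all hypothetical choices. It estimates the partial derivatives by central differences with the other arguments frozen, then compares the chain-rule sum against a direct derivative of $\mathcal{L}$ along the path.

```python
import math


def lagrangian(q, v, t):
    # Toy Lagrangian (an assumption for illustration): L = v^2/2 - q^2/2.
    # t is an argument, but it does not appear explicitly in the formula.
    return 0.5 * v**2 - 0.5 * q**2


def partial(func, args, index, h=1e-6):
    # Central-difference estimate of the partial derivative in slot `index`,
    # holding every other argument fixed.
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2.0 * h)


def path(t):
    # Hypothetical trajectory: q(t) = sin t, v(t) = cos t, plus t itself.
    return (math.sin(t), math.cos(t), t)


def total_derivative(t, h=1e-6):
    # Differentiate the single-variable function t -> L(q(t), v(t), t).
    return (lagrangian(*path(t + h)) - lagrangian(*path(t - h))) / (2.0 * h)


t0 = 0.7
point = path(t0)

dL_dq = partial(lagrangian, point, 0)  # analytically -q
dL_dv = partial(lagrangian, point, 1)  # analytically v
dL_dt = partial(lagrangian, point, 2)  # zero: no explicit t dependence

q_dot = math.cos(t0)    # dq/dt along the path
v_dot = -math.sin(t0)   # dv/dt along the path

chain_rule = dL_dt + dL_dq * q_dot + dL_dv * v_dot
direct = total_derivative(t0)

print("partial dL/dt:", dL_dt)
print("chain rule   :", chain_rule)
print("direct dL/dt :", direct)
```

In this example the partial derivative with respect to $t$ vanishes, while the total derivative along the path does not (analytically it equals $-\sin 2t$). The two notions answer different questions about the same function.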

Therefore, the essence of solving this question lies in understanding the formal definitions of total derivatives and partial derivatives.

Updated: 2024-09-26

After reading my initial post, a friend of mine mentioned that he still didn’t quite understand why the Lagrangian behaves differently in the Euler-Lagrange equation than it does when we take its total derivative. So I decided to rewrite the explanation in more formal and rigorous mathematical language, and here is the updated version.

I used to be perplexed by this question, and now, after some time, I’d like to share my understanding of it from a purely mathematical perspective (note: this might not be the definitive answer).

Without loss of generality, let’s consider a $1$-dimensional space; the conclusions here extend directly to $n$ dimensions.

First and foremost, let’s clarify an important point: in the Euler-Lagrange equation, the Lagrangian $\mathcal{L}$ is defined as a multivariate function $\mathcal{L} = f(q_1, \cdots, q_N; v_1, \cdots, v_N; t)$, where $f: \mathbb{R}^{2N+1} \to \mathbb{R}$.

For the partial derivatives $\dfrac{\partial \mathcal{L}}{\partial q_i}$, $\dfrac{\partial \mathcal{L}}{\partial v_i}$, or $\dfrac{\partial \mathcal{L}}{\partial t}$, we treat $q_i$, $v_i$, and $t$ as three completely independent variables, much like how we treat $x$, $y$, and $z$ in a function $f(x, y, z)$.

According to the formal definition of partial derivatives (note: set $m = 1$ and $n = 2N+1$ in the definition, so that $\mathbf{f}$ maps $\mathbb{R}^{2N+1} \to \mathbb{R}$ and matches the type of the Lagrangian $\mathcal{L}$; then let $\mathbf{f}(\mathbf{x}) = \mathcal{L}$, which yields $f_i(\mathbf{x}) = \mathbf{f}(\mathbf{x}) = \mathcal{L}$ with $i = 1$):

Referenced from Baby Rudin – 9.16
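For reference, Definition 9.16 in Rudin’s *Principles of Mathematical Analysis* can be paraphrased as follows (a paraphrase, not a verbatim quote). Let $\mathbf{f}$ map an open set $E \subset \mathbb{R}^n$ into $\mathbb{R}^m$, with components $f_1, \cdots, f_m$, and let $\mathbf{e}_1, \cdots, \mathbf{e}_n$ be the standard basis of $\mathbb{R}^n$. For $\mathbf{x} \in E$, the partial derivative of $f_i$ with respect to the $j$-th input is:

$$
(D_j f_i)(\mathbf{x}) = \lim_{t \to 0} \frac{f_i(\mathbf{x} + t \mathbf{e}_j) - f_i(\mathbf{x})}{t}
$$

provided the limit exists. Note that $t$ here is just the limit parameter, not physical time, and that only the $j$-th coordinate of $\mathbf{x}$ is perturbed.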

This indicates that when calculating partial derivatives, we disregard any relationships between the input variables of the function, since each input of a multivariate function forms an independent dimension. In the difference $f_i(\mathbf{x} + t \mathbf{e}_j) - f_i(\mathbf{x})$ that appears in the definition (where $t$ is the limit parameter, not time), only the $j$-th input changes, because the directions $\mathbf{e}_j$ of the input variables are orthogonal to each other.

For example, consider the function $f(x(t), t)$. Clearly, $x$ is a function of $t$, but when you take the partial derivative of $f$ with respect to $t$, it won’t affect $x(t)$ at all, and similarly, taking the partial derivative of $f$ with respect to $x(t)$ won’t affect $t$. This is because, within this function, the dimensions formed by $x(t)$ and $t$ are orthogonal and do not influence each other.

Now, regarding the total derivative, the formal statement relating it to the partial derivatives is:

Referenced from Baby Rudin – 9.17
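For reference, Theorem 9.17 in Rudin’s *Principles of Mathematical Analysis* can be paraphrased as follows (a paraphrase, not a verbatim quote). If $\mathbf{f}$ maps an open set $E \subset \mathbb{R}^n$ into $\mathbb{R}^m$ and is differentiable at $\mathbf{x} \in E$, then all the partial derivatives $(D_j f_i)(\mathbf{x})$ exist, and

$$
\mathbf{f}'(\mathbf{x}) \, \mathbf{e}_j = \sum_{i=1}^{m} (D_j f_i)(\mathbf{x}) \, \mathbf{u}_i \qquad (1 \le j \le n)
$$

where $\mathbf{u}_1, \cdots, \mathbf{u}_m$ denotes the standard basis of $\mathbb{R}^m$. In other words, the matrix of $\mathbf{f}'(\mathbf{x})$ is built column by column from the partial derivatives.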

where $\mathbf{f}'(\mathbf{x})$ is defined as:

Referenced from Baby Rudin – 9.11
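For reference, Definition 9.11 in Rudin’s *Principles of Mathematical Analysis* can be paraphrased as follows (a paraphrase, not a verbatim quote). Let $\mathbf{f}$ map an open set $E \subset \mathbb{R}^n$ into $\mathbb{R}^m$ and let $\mathbf{x} \in E$. If there exists a linear transformation $A$ of $\mathbb{R}^n$ into $\mathbb{R}^m$ such that

$$
\lim_{\mathbf{h} \to \mathbf{0}} \frac{\left| \mathbf{f}(\mathbf{x} + \mathbf{h}) - \mathbf{f}(\mathbf{x}) - A \mathbf{h} \right|}{|\mathbf{h}|} = 0
$$

then $\mathbf{f}$ is said to be differentiable at $\mathbf{x}$, and we write $\mathbf{f}'(\mathbf{x}) = A$. Unlike a partial derivative, the increment $\mathbf{h}$ here may point in any direction of $\mathbb{R}^n$.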

Once these definitions and theorems are in place, consider the following example:

Referenced from Baby Rudin – 9.18
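For reference, Example 9.18 in Rudin’s *Principles of Mathematical Analysis* can be paraphrased as follows (a paraphrase, not a verbatim quote). Let $\gamma$ be a differentiable mapping of an interval $(a, b) \subset \mathbb{R}$ into an open set $E \subset \mathbb{R}^n$, let $f$ be a real-valued differentiable function on $E$, and define $g(t) = f(\gamma(t))$ for $a < t < b$. The chain rule then gives:

$$
g'(t) = f'(\gamma(t)) \, \gamma'(t) = \sum_{i=1}^{n} (D_i f)(\gamma(t)) \, \gamma_i'(t)
$$

Here $g$ is a single-variable function of $t$, while every $D_i f$ is still a partial derivative of the multivariate function $f$.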

Thus, when calculating the total derivative $\dfrac{\mathrm{d}\mathcal{L}}{\mathrm{d}t}$ of the Lagrangian $\mathcal{L}$, we regard $\mathcal{L}$ as the composition $\mathcal{L} = f(\gamma(t))$, where $f: \mathbb{R}^{2N+1} \to \mathbb{R}$ and $\gamma: \mathbb{R} \to \mathbb{R}^{2N+1}$.

So, if we set $g(t) = \mathcal{L} = f(\gamma(t))$ and define $\gamma(t) = \begin{bmatrix} q_1(t) & \cdots & q_N(t) & v_1(t) & \cdots & v_N(t) & t \end{bmatrix}^T$, then using the chain rule to expand the total derivative of the Lagrangian $\mathcal{L}$ gives the following formula:

$$
\displaystyle \frac{\mathrm{d} \mathcal{L}}{\mathrm{d} t}= \sum_{i=1}^{2N+1} \left ( \ (D_i f ) (\gamma (t)) \ \gamma_i'(t) \ \right ) = \frac{\partial \mathcal{L}}{\partial t}+\sum_{i=1}^N \frac{\partial \mathcal{L}}{\partial q_i} \frac{\mathrm{d} q_i}{\mathrm{d} t} +\sum_{i=1}^N \frac{\partial \mathcal{L}}{\partial v_i} \frac{\mathrm{d} v_i}{\mathrm{d} t}
$$
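To see where each term comes from, it may help to write out the case $N = 1$ explicitly (dropping the index on $q$ and $v$, a simplification made only for this illustration). The path and its derivative have three components:

$$
\gamma(t) = \begin{bmatrix} q(t) & v(t) & t \end{bmatrix}^T, \qquad \gamma'(t) = \begin{bmatrix} \dfrac{\mathrm{d} q}{\mathrm{d} t} & \dfrac{\mathrm{d} v}{\mathrm{d} t} & 1 \end{bmatrix}^T
$$

so the chain rule sum becomes:

$$
\frac{\mathrm{d} \mathcal{L}}{\mathrm{d} t} = (D_1 f)(\gamma(t)) \frac{\mathrm{d} q}{\mathrm{d} t} + (D_2 f)(\gamma(t)) \frac{\mathrm{d} v}{\mathrm{d} t} + (D_3 f)(\gamma(t)) \cdot 1 = \frac{\partial \mathcal{L}}{\partial q} \frac{\mathrm{d} q}{\mathrm{d} t} + \frac{\partial \mathcal{L}}{\partial v} \frac{\mathrm{d} v}{\mathrm{d} t} + \frac{\partial \mathcal{L}}{\partial t}
$$

The standalone $\dfrac{\partial \mathcal{L}}{\partial t}$ term appears because the last component of $\gamma$ is $t$ itself, whose derivative is $1$.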

Therefore, the essence of solving this question lies in understanding the formal definitions of total derivatives and partial derivatives.


