Reasoning behind second partial derivative test

For those of you who want to see why the second partial derivative test works, I cover a sketch of a proof here.

Background

In the last article, I gave the statement of the second partial derivative test, but I only gave a loose intuition for why it's true. This article is for those who want to dig a bit more into the math, but it is not strictly necessary if you just want to apply the second partial derivative test.

What we're building to

  • To test whether a stable point of a multivariable function is a local minimum/maximum, take a look at the quadratic approximation of the function at that point. It is easier to analyze whether this quadratic approximation has a maximum/minimum.
  • For two-variable functions, this boils down to studying expressions that look like this:
    ax^2 + 2bxy + cy^2
    These are known as quadratic forms. The rule for when a quadratic form is always positive or always negative translates directly to the second partial derivative test.

Single variable case via quadratic approximation

First, I'd like to walk through the formal reasoning behind why the single-variable second derivative test works. By formal, I mean capturing the idea of concavity into more of an airtight argument.
In single-variable calculus, when f'(a) = 0 for some function f and some input a, here's what the second derivative test looks like:
  • f has a local maximum at a if f''(a) < 0
  • f has a local minimum at a if f''(a) > 0
  • If f''(a) = 0, the second derivative alone cannot determine whether f has a maximum, minimum, or inflection point at a.
To think about why this test works, start by approximating the function with a Taylor polynomial out to the quadratic term, also known as a quadratic approximation.
f(x) ≈ f(a) + f'(a)(x - a) + (1/2) f''(a)(x - a)^2
Since f'(a) = 0, this quadratic approximation simplifies like this:
f(x) ≈ f(a) + (1/2) f''(a)(x - a)^2
The quadratic approximation at a local minimum.
Notice, (x - a)^2 ≥ 0 for all possible x, since squares are always positive or zero. That simple fact tells us everything we need to know! Why?
It means that when f''(a) > 0, we can read our approximation like this:
f(x) ≈ f(a) + (1/2) f''(a)(x - a)^2   ← the quadratic term is ≥ 0 for all values of x, and equals 0 only when x = a
Therefore a is a local minimum of our approximation. In fact, it is a global minimum, but we only care about the fact that it is a local minimum. When the quadratic approximation of a function has a local minimum at the point of approximation, the function itself must also have a local minimum there. I'll say more on this in the last section, but for now the intuition should be clear since the function and its approximation "hug" one another around the point of approximation a.
The quadratic approximation at a local maximum.
Similarly, if f''(a) < 0, we can read the approximation as
f(x) ≈ f(a) + (1/2) f''(a)(x - a)^2   ← the quadratic term is ≤ 0 for all values of x, and equals 0 only when x = a
In this case, the approximation has a local maximum at x=a, indicating that the function itself also has a local maximum there.
The quadratic approximation at the inflection point is flat.
When f''(a) = 0, our quadratic approximation always equals the constant f(a), meaning our function is in some sense too flat to be analyzed by the second derivative alone.
What to take away from this:
When f'(a) = 0, studying whether f has a local maximum or minimum at a comes down to whether the quadratic term of the Taylor approximation, (1/2) f''(a)(x - a)^2, is always positive or always negative.
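This takeaway can be checked numerically. Here is a minimal Python sketch using a central finite-difference estimate of f''; the sample function f(x) = x^3 - 3x (with critical points at x = ±1) and the step size h are my own illustrative choices, not from the article.

```python
# Numerical sketch of the single-variable second derivative test.
# The sample function and the step size h are illustrative choices.
def second_derivative(f, a, h=1e-4):
    """Central finite-difference estimate of f''(a)."""
    return (f(a + h) - 2 * f(a) + f(a - h)) / h**2

def classify(f, a):
    """Apply the second derivative test at a critical point a."""
    d2 = second_derivative(f, a)
    if d2 > 0:
        return "local min"
    if d2 < 0:
        return "local max"
    return "inconclusive"

f = lambda x: x**3 - 3 * x   # f'(x) = 3x^2 - 3, so f'(x) = 0 at x = 1 and x = -1

print(classify(f, 1.0))    # f''(1) = 6 > 0, so a local minimum
print(classify(f, -1.0))   # f''(-1) = -6 < 0, so a local maximum
```

For a cubic, the central difference is exact up to rounding, so the estimates land very close to f''(±1) = ±6.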

Two variable case, visual warmup

Now suppose you have a function f(x,y) with two inputs and one output, and you find a stable point. That is, a point where both its partial derivatives are 0,
f_x(x_0, y_0) = 0 and f_y(x_0, y_0) = 0,
which is more succinctly written as
∇f(x_0, y_0) = 0 (the zero vector)
∇f(x_0, y_0) = 0 indicates that the tangent plane at (x_0, y_0) is flat.
In order to determine whether this is a local maximum, local minimum, or neither, we look to its quadratic approximation. Let's start with a visual preview of what we want to do:
  • f will have a local minimum at a stable point (x0,y0) if the quadratic approximation at that point is a concave-up paraboloid.
    Local min
  • f will have a local maximum there if the quadratic approximation is a concave-down paraboloid:
    Local max
  • If the quadratic approximation is saddle-shaped, f has neither a maximum nor a minimum, but a saddle point.
    Saddle point
  • If the quadratic approximation is flat in one or all directions, we do not have enough information to draw conclusions about f.
    Quadratic approximation is flat in one direction.
    Quadratic approximation is constant.

Analyzing the quadratic approximation

The formula for the quadratic approximation of f, in vector form, looks like this:
Q_f(x) = f(x_0) + ∇f(x_0)·(x - x_0) + (1/2)(x - x_0)^T H_f(x_0)(x - x_0)
where the three pieces are the constant term, the linear term, and the quadratic term, respectively.
Since we care about points where the gradient is zero, we can get rid of that gradient term:
Q_f(x) = f(x_0) + (1/2)(x - x_0)^T H_f(x_0)(x - x_0)
To see this spelled out for the two-variable case, let's expand out the Hessian term,
Q_f(x, y) = f(x_0, y_0) + (1/2) f_xx(x_0, y_0)(x - x_0)^2 + f_xy(x_0, y_0)(x - x_0)(y - y_0) + (1/2) f_yy(x_0, y_0)(y - y_0)^2
(Note, if this approximation or any of the notation feels shaky or unfamiliar, consider reviewing the article on quadratic approximations).
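The quadratic approximation can also be built numerically. Here is a minimal Python sketch using finite-difference estimates of the second partials; the sample function f(x, y) = cos(x)·cos(y), which has a stable point at the origin, and the step size h are my own illustrative choices, not from the article.

```python
import math

# Sketch: build Q_f at a stable point from finite-difference second partials.
# The sample function and the step size h are illustrative choices.
def second_partials(f, x0, y0, h=1e-3):
    """Central finite-difference estimates of fxx, fxy, fyy at (x0, y0)."""
    fxx = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2
    fyy = (f(x0, y0 + h) - 2 * f(x0, y0) + f(x0, y0 - h)) / h**2
    fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
           - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)
    return fxx, fxy, fyy

def quadratic_approximation(f, x0, y0):
    """Return Q_f as a function of (x, y), assuming the gradient is 0 at (x0, y0)."""
    fxx, fxy, fyy = second_partials(f, x0, y0)
    c = f(x0, y0)
    return lambda x, y: (c
                         + 0.5 * fxx * (x - x0)**2
                         + fxy * (x - x0) * (y - y0)
                         + 0.5 * fyy * (y - y0)**2)

f = lambda x, y: math.cos(x) * math.cos(y)   # stable point at (0, 0)
Q = quadratic_approximation(f, 0.0, 0.0)
print(f(0.1, 0.2), Q(0.1, 0.2))  # the two values nearly agree near (0, 0)
```

The printed values illustrate the "hugging" mentioned earlier: near the point of approximation, f and Q_f are nearly indistinguishable.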
As I showed with the single-variable case, the strategy is to study whether the quadratic term of this approximation is always positive or always negative.
Q_f(x, y) = f(x_0, y_0) + [ (1/2) f_xx(x_0, y_0)(x - x_0)^2 + f_xy(x_0, y_0)(x - x_0)(y - y_0) + (1/2) f_yy(x_0, y_0)(y - y_0)^2 ]
For the bracketed quadratic portion we ask: Is it always ≥ 0? Is it always ≤ 0? Can it be either?
Right now, this term is a lot to write down, but we can distill its essence by studying expressions of the following form:
ax^2 + 2bxy + cy^2
Such expressions are often fancifully called "quadratic forms".
  • The word "quadratic" indicates that the terms are of order two, meaning they involve the product of two variables.
  • The word "form" always threw me off here, and it makes the idea of a quadratic form sound more complicated than it really is. Mathematicians say "quadratic form" instead of "quadratic expression" to emphasize that all terms are of order 2, and there are no linear or constant terms mucking up the expression. A phrase like "purely quadratic expression" would have been much too reasonable and understandable to adopt.
To make the notation for quadratic forms easier to generalize into higher dimensions, they are often written with respect to a symmetric matrix M:
x^T M x = [x y][a b; b c][x y]^T
Here is the crucial question:
  • How can we tell whether the expression ax2+2bxy+cy2 is always positive, always negative, or neither, just by analyzing the constants a, b and c?

Analyzing quadratic forms

If we plug in a constant value y_0 for y, we get a single-variable quadratic function:
ax^2 + 2b x y_0 + c y_0^2
The graph of this function is a parabola, and it will only cross the x-axis if this quadratic function has real roots.
A quadratic with two real roots can be both positive and negative.
Otherwise, it either stays entirely positive or entirely negative, depending on the sign of a.
A quadratic with no real roots can either be entirely positive or entirely negative.
We can apply the quadratic formula to this expression to see whether its roots are real or complex.
ax^2 + 2b x y_0 + c y_0^2
  • The leading coefficient is a.
  • The linear coefficient is 2b y_0.
  • The constant term is c y_0^2.
Applying the quadratic formula looks like this:
x = (-2b y_0 ± √((2b y_0)^2 - 4ac y_0^2)) / (2a)
  = (-2b y_0 ± 2 y_0 √(b^2 - ac)) / (2a)
  = y_0 (-b ± √(b^2 - ac)) / a
If y_0 = 0, the quadratic has a double root at x = 0, meaning the parabola barely kisses the x-axis at that point. Otherwise, whether or not these roots are real depends only on the sign of the expression b^2 - ac.
  • If b^2 - ac ≥ 0, there are real roots, so the graph of ax^2 + 2b x y_0 + c y_0^2 crosses the x-axis.
  • Otherwise, if b^2 - ac < 0, there are no real roots, so the graph of ax^2 + 2b x y_0 + c y_0^2 either stays entirely positive or entirely negative.
For example, consider the case
  • a=1
  • b=3
  • c=5
In this case, b^2 - ac = 3^2 - (1)(5) = 4 > 0, so the graph of f(x) = x^2 + 6x y_0 + 5 y_0^2 always crosses the x-axis. Here is a video showing how that graph moves around as we let the value of y_0 slowly change.
This corresponds with the fact that the graph of f(x, y) = x^2 + 6xy + 5y^2 can be both positive and negative.
In contrast, consider the case
  • a=2
  • b=2
  • c=3
Now, b^2 - ac = 2^2 - (2)(3) = -2 < 0. This means the graph of f(x) = 2x^2 + 4x y_0 + 3 y_0^2 never crosses the x-axis, although it kisses it if the constant y_0 is zero. Here is a video showing how that graph changes as we let the constant y_0 vary:
This corresponds with the fact that the multivariable function f(x, y) = 2x^2 + 4xy + 3y^2 is always positive.
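These two sign claims can also be spot-checked numerically. Here is a minimal Python sketch that samples each form on a grid of points away from the origin; the grid range and spacing are my own choices.

```python
# Spot-check the sign behavior of the two quadratic forms from the text
# on a grid of points away from the origin (the grid itself is illustrative).
mixed_sign = lambda x, y: x**2 + 6*x*y + 5*y**2    # b^2 - ac = 4 > 0
always_pos = lambda x, y: 2*x**2 + 4*x*y + 3*y**2  # b^2 - ac = -2 < 0

points = [(i / 10, j / 10)
          for i in range(-20, 21) for j in range(-20, 21)
          if (i, j) != (0, 0)]

print(any(mixed_sign(x, y) < 0 for x, y in points))  # attains negative values
print(any(mixed_sign(x, y) > 0 for x, y in points))  # ...and positive values
print(all(always_pos(x, y) > 0 for x, y in points))  # positive away from (0, 0)
```

The second form stays positive on every sampled point because 2x^2 + 4xy + 3y^2 = 2(x + y)^2 + y^2, which vanishes only at the origin.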

Rule for the sign of quadratic forms

As if to confuse students who are familiar with the quadratic formula, rules regarding quadratic forms are often phrased with respect to ac - b^2 instead of b^2 - ac. Since one is the negative of the other, this requires flipping the direction of each inequality. The reason mathematicians prefer ac - b^2 is that this is the determinant of the matrix describing the quadratic form:
det([a b; b c]) = ac - b^2
As a reminder, this is how the quadratic form looks using the matrix.
ax^2 + 2bxy + cy^2 = [x y][a b; b c][x y]^T
Tying this convention together with what we found in the previous section, we write the rule for the sign of a quadratic form as follows:
  • If ac - b^2 < 0, the quadratic form can attain both positive and negative values, and it's possible for it to equal 0 at values other than (x, y) = (0, 0).
  • If ac - b^2 > 0, the form is either always positive or always negative depending on the sign of a, but in either case it only equals 0 at (x, y) = (0, 0).
    • If a > 0, the form is always positive, so (0, 0) is a global minimum point of the form.
    • If a < 0, the form is always negative, so (0, 0) is a global maximum point of the form.
  • If ac - b^2 = 0, the form will again either be always positive or always negative, but now it's possible for it to equal 0 at values other than (x, y) = (0, 0).

Some terminology:

When ax^2 + 2bxy + cy^2 > 0 for all (x, y) other than (x, y) = (0, 0), the quadratic form and the matrix associated with it are both called positive definite.
When ax^2 + 2bxy + cy^2 < 0 for all (x, y) other than (x, y) = (0, 0), they are both negative definite.
If you replace the > and < with ≥ and ≤, the corresponding properties are positive semi-definite and negative semi-definite.
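Putting the sign rule and this terminology together, here is a minimal Python sketch of a classifier for two-variable quadratic forms; the function name and the handling of the degenerate a = 0 case are my own.

```python
# Classify ax^2 + 2bxy + cy^2 by the sign of ac - b^2 (and of a).
def classify_form(a, b, c):
    det = a * c - b * b              # determinant of [[a, b], [b, c]]
    if det < 0:
        return "indefinite"          # attains both positive and negative values
    if det > 0:
        return "positive definite" if a > 0 else "negative definite"
    # det == 0: semi-definite; the form can vanish away from the origin.
    # If a == 0 then b == 0 too, so the sign is carried by c.
    lead = a if a != 0 else c
    return "positive semi-definite" if lead >= 0 else "negative semi-definite"

print(classify_form(1, 3, 5))   # the first example above: indefinite
print(classify_form(2, 2, 3))   # the second example above: positive definite
```

For instance, classify_form(1, 1, 1) reports positive semi-definite: x^2 + 2xy + y^2 = (x + y)^2 is never negative but vanishes along the whole line y = -x.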

Applying this to Qf

Okay, zooming back out to where we started, let's write down our quadratic approximation again:
Q_f(x, y) = f(x_0, y_0) + (1/2) f_xx(x_0, y_0)(x - x_0)^2 + f_xy(x_0, y_0)(x - x_0)(y - y_0) + (1/2) f_yy(x_0, y_0)(y - y_0)^2
The quadratic portion of Q_f is written with respect to (x - x_0) and (y - y_0) instead of simply x and y, so wherever the rule for the sign of quadratic forms references the point (0, 0), we apply it instead to the point (x_0, y_0).
As with the single-variable case, when the quadratic approximation Qf has a local maximum (or minimum) at (x0,y0), it means f has a local maximum (or minimum) at that point. This means we can translate the rule for the sign of a quadratic form directly to get the second derivative test:
Suppose ∇f(x_0, y_0) = 0. Then:
  • If f_xx(x_0, y_0) f_yy(x_0, y_0) - (f_xy(x_0, y_0))^2 < 0, f has neither a minimum nor a maximum at (x_0, y_0), but instead has a saddle point.
    Saddle point
  • If f_xx(x_0, y_0) f_yy(x_0, y_0) - (f_xy(x_0, y_0))^2 > 0, f definitely has either a maximum or minimum at (x_0, y_0), and we must look at the sign of f_xx(x_0, y_0) to figure out which one it is.
    • If f_xx(x_0, y_0) > 0, f has a local minimum.
      Local min
    • If f_xx(x_0, y_0) < 0, f has a local maximum.
      Local max
  • If f_xx(x_0, y_0) f_yy(x_0, y_0) - (f_xy(x_0, y_0))^2 = 0, the second derivatives alone cannot tell us whether f has a local minimum or maximum.
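The three bullets above translate directly into code. Here is a minimal Python sketch that takes the three second partial derivatives already evaluated at the stable point; the function name is my own.

```python
# The second partial derivative test, given fxx, fxy, fyy at a stable point.
def second_partial_test(fxx, fxy, fyy):
    det = fxx * fyy - fxy**2
    if det < 0:
        return "saddle point"
    if det > 0:
        return "local min" if fxx > 0 else "local max"
    return "inconclusive"

# f(x, y) = x^2 + 6xy + 5y^2 has fxx = 2, fxy = 6, fyy = 10 at (0, 0):
print(second_partial_test(2, 6, 10))   # det = 20 - 36 < 0: saddle point

# f(x, y) = 2x^2 + 4xy + 3y^2 has fxx = 4, fxy = 4, fyy = 6 at (0, 0):
print(second_partial_test(4, 4, 6))    # det = 24 - 16 > 0, fxx > 0: local min
```

Note how the inputs are twice the a, b, c of the corresponding quadratic form, which leaves the sign of the determinant unchanged.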

Our current tools are lacking

Everything presented here almost constitutes a full proof, except for one final step.
Intuitively, it might make sense that when a quadratic approximation bends and curves in a certain way, the function should bend and curve in that same way near the point of approximation. But how do we formalize this beyond intuition?
Unfortunately, we will not do that here. Making arguments about derivatives fully rigorous requires using real analysis, the theoretical backbone of calculus.
Furthermore, you might be wondering how this generalizes to functions with more than two inputs. There is a notion of quadratic forms with multiple variables, but phrasing the rule for when such forms are always positive or always negative uses various ideas from linear algebra.
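For the curious, that linear-algebra rule looks at the eigenvalues of the Hessian matrix: all positive eigenvalues mean a local minimum, all negative a local maximum, and mixed signs a saddle point. Here is a minimal NumPy sketch; the symmetric Hessian H is a made-up example, not from the article.

```python
import numpy as np

# Classify a stable point from the eigenvalues of its (symmetric) Hessian.
def classify_hessian(H):
    eig = np.linalg.eigvalsh(H)      # eigenvalues of a symmetric matrix
    if np.all(eig > 0):
        return "local min"
    if np.all(eig < 0):
        return "local max"
    if np.any(eig == 0):
        return "inconclusive"        # a flat direction, as in the 2D case
    return "saddle point"            # mixed signs

H = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])      # a made-up symmetric Hessian

print(classify_hessian(H))
```

In two variables this recovers the test above, since the determinant f_xx f_yy - (f_xy)^2 is the product of the two eigenvalues.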

Summary

  • To test whether a stable point of a multivariable function is a local minimum/maximum, take a look at the quadratic approximation of the function at that point. It is easier to analyze whether this quadratic approximation has a maximum/minimum.
  • For two-variable functions, this boils down to studying expressions that look like this:
    ax^2 + 2bxy + cy^2
    These are known as quadratic forms. The rule for when a quadratic form is always positive or always negative translates directly to the second partial derivative test.

Want to join the conversation?

  • sauj123
    Can khan academy add videos/articles on real analysis? Perhaps add a subtopic to the mathematics topic called "Analysis", with sub-subtopics on real, complex and functional analysis.
    (89 votes)
  • Jose Molina
    When he says "If ac - b^2 < 0, then the quadratic form can attain both positive and negative values, and is zero only at (x, y) = (0, 0)", I am not sure I understand, because this implies that the form has real solutions for any y value. So there should be other points for which the form = 0.
    (8 votes)
  • Sam
    Why is it that when fxx·fyy - fxy^2 = 0, the second derivatives alone cannot tell us whether f has a local minimum or maximum?
    How does this relate to the aforementioned statement that if ac - b^2 = 0, the form will again either be always positive or always negative, but now it's possible for it to equal 0 at values other than (x, y) = (0, 0)?
    I am having trouble figuring this out. My strategy is to figure out what the parabola looks like when ac - b^2 = 0 (or fxx·fyy - fxy^2 = 0). But I don't understand intuitively or analytically the implications that "now it's possible for it to equal 0 at values other than (x, y) = (0, 0)."
    Any help will be appreciated.
    (4 votes)
    • AbhiSharm
      Hey Sam,
      You have to look at the whole function to understand this. We are saying that for a particular y, the zeros of the approximating function at (x, y) [where x is a variable] will occur at x = y(-b ± √(b^2 - ac))/a. If ac - b^2 = b^2 - ac = 0, then the only zeros of the function will occur at x = 0.
      For that to be the case, the function can NEVER cross the z axis. Why is that? It's hard to prove without using real analysis, but you can definitely picture it. We know for a fact that every zero of this function has to be on the line x = 0. Now imagine that the function crosses the x axis. Then there's some point where the function is zero. By the thing we proved earlier, it has to cross at x = 0. Now, we have to use the fact that the function that we're talking about is quadratic, and is kind of shaped like a sombrero. So if it crosses the z axis, the set of points where z = 0 is a circular shape in the x-y plane. However, there's no circle that can fit on the line x = 0.
      (2 votes)
  • John Smith
    I cannot understand why, when the quadratic form can be both negative and positive means that the function must have a saddle.

    "This corresponds with the fact that the graph of
    f(x,y) = x^2 + 6xy + 5y^2" No matter how many times I read this section, I don't really get how this makes the graph a saddle at the point?
    (3 votes)
    • 1564538
      I was stuck on this a while too. If you watch the videos and only focus on the vertex of the parabola, you will notice it traces a parabola as "b" (y_0) varies. Now notice how, in the first case, the parabola that the vertex traces has opposite concavity to the graphed parabola. This corresponds to a saddle. But in the second case, the traced parabola has the same concavity as the graphed function, and so the 3D graph is a paraboloid. You can sort of imagine the 2D graph to just be a slice of the graph along the axis of the constant variable.
      Now, the vertex of the graphed parabola must always pass through the origin, since there are no constant terms. This also holds true for the traced graph, since it's really just a projection of the function with x made constant instead of y. You can check all of this with this graph: https://www.desmos.com/calculator/qpp8ujuhvc.
      This all means that the vertex of the parabola cannot cross the x-axis: if the vertex is above y = 0, it can only be above y = 0.
      Putting this all together, if the graphed parabola's vertex is above y = 0 at any point, then you know the parabola it traces will be concave up like a cup, and concave down like a frown if the vertex is below y = 0.
      Now, if a vertex is below y = 0, then the only way it can have real x-axis intercepts is if its parabola is concave up. So the only way a function could have real x-axis intercepts in this case is if the concavities of the graphed and traced parabolas are opposite each other, because the vertex being below y = 0 inherently means that the traced parabola is concave down but requires the graphed parabola to be concave up. So by checking if the parabola has real roots, we can cleverly check if the parabolas are opposite each other and therefore whether the 3D quadratic approximation is a saddle or a paraboloid.
      (3 votes)
  • Robert
    Would it then be right to conclude that for f(x,y,z) to have a local min/max at (x0,y0,z0) you need

    fxxfyy + fxxfzz + fyyfzz - fxy² - fyz² - fxz²

    to be greater than 0, since for f to have a min/max you need a min/max over the xy-plane (which would be accounted for by z=z0, y=y0, x=variable, then the same reasoning as in the article to get fxxfyy - fxy² > 0), as well as over the yz-plane (fyyfzz - fyz² > 0) and the xz-plane (fxxfzz - fxz² > 0), and adding all of those together gives us

    fxxfyy + fyyfzz + fxxfzz - fxy² - fyz² - fxz² > 0

    , or would that only be analogous to having fxx and fyy be greater/smaller than 0 and there is still information missing?
    (2 votes)
  • Miss H
    At the beginning of the section titled "Analyzing Quadratic Forms," why can we plug in a constant value y_0 for y?
    (1 vote)
    • 1564538
      As I understand it, it's just a conceptual trick. You could just as well plug in a constant value x_0 for x and you'd get the same result.
      More specifically, y_0 is a constant in the instance the equation is describing, but you can still vary y_0. That's the point of the videos above where y_0 is varied (their titles seem to be mislabeled, saying b varies when really y_0 varies). What you then find is that, no matter how many different numbers you try setting y_0 to, the graph might sometimes only ever be positive or negative, depending on a, b, and c. This idea is captured by using the quadratic formula, see above.
      (3 votes)
  • Hemen Taleb
    Do there exist videos/articles on limits and continuity in 3D?
    (2 votes)
  • Kalev Maricq
    Why does
    "the quadratic form can attain both positive and negative values, and it's possible for it to equal 0 at values other than (x, y) = (0, 0)"
    imply
    "f has neither a minimum nor a maximum at (x_0, y_0), but instead has a saddle point"?

    I'm missing the implicit connection here.
    (2 votes)
  • kyoc2011
    If somebody can understand the entire article, please tell me how to generalize the function to three parameters or even more. Or is there any other way to get a closer approximation to the original function?

    Just like Grant mentioned in the last part, I want to dig a little bit deeper; even if I won't understand all of it, I still want to taste a little bit of it.

    Up to unit 3, I loved the way he taught those complicated concepts. First, he used videos, graphs, and simple proofs; after that, a bunch of articles wrapped things up, making things clearer, and revised all the stuff again.
    (1 vote)
    • Flo1148
      Sharing my insights on this one, please correct me if something is not quite correct :)

      The formula for quadratic approximation is not limited to 2 parameters. Looking at it in its vectorized form for f(x,y,z), the vector x will be a column vector (x,y,z) and the Hessian a 3x3 matrix. The quadratic form is still described by v^T H v, where v = x - x0. (Technically I ignored the 1/2, but it does not really matter for the following line of reasoning.)

      Now we are still interested in the same question as for the 2 parameter case: Is the expression v^T H v always positive/negative/either?

      A symmetric matrix M is called positive definite matrix if v^T M v > 0 for all real, nonzero column vectors v. It is called negative definite for v^T M v < 0. An approach for testing if a matrix is definite is the investigation of its eigenvalues. A positive resp. negative definite matrix will show purely positive resp. negative eigenvalues.

      Thus the eigenvalues of the Hessian matrix become your criterion for classification of stationary points. For a maximum the eigenvalues will all be negative, for a minimum all positive, and if the signs differ it's a saddle point (also, if one of the eigenvalues is 0 the criterion becomes indifferent).

      Now to close the loop, let's apply this to the case of 2 parameters, as this should work as well. The key insight here is the relation between the Hessian matrix and the 2nd partial derivative test of f(x,y). Notice how fxx fyy - fxy^2 is the determinant of the 2x2 Hessian matrix H_f.

      Since the hessian in this case is a 2x2 matrix it will show 2 eigenvalues. Either of them might be positive +, negative - (or 0). Now the determinant is the product of all eigenvalues (linear algebra stuff). So if we ignore the 0 case there are 3 possible outcomes:
      + ⋅ + = +
      - ⋅ - = +
      - ⋅ + = -
      As we can see, two positive or two negative eigenvalues give a positive determinant indicating a minimum or maximum. If the signs do not match we get a negative determinant, hence a saddle point according to the 2nd partial derivative test.

      Notice how the determinant cannot be used anymore if there are more than 2 eigenvalues (more than 2 parameters for f). Example: for - ⋅ - ⋅ - = -, the determinant would suggest a saddle point, whereas the eigenvalues reveal it's a maximum.

      PS: Here is another fun thing I realized. For a definite matrix we need symmetry. So for the whole criterion to work we assume a symmetric hessian, which is true if fxy = fyx. This is why in the formula we can use fxy^2 instead of fxy fyx.

      PPS: Now all of this is still based on quadratic approximation. If you want to get a closer approximation you need to go cubic or higher. However things will get hardcore very fast. I am not sure if there is any gain from this for the sake of classification of stationary points.
      (1 vote)
  • Will Simon
    This article raised a few questions for me:
    1) If a point p in R^k is not a critical point of a scalar-(or vector-)valued function f (i.e., the gradient of f at p is not 0), yet the quadratic approximation does have a local extremum at p, do we still have that f has a local extremum at p?
    2) Does a scalar- or vector-valued function f defined on R^k having a local extremum at a point p in R^k always (or at least sometimes) imply that its quadratic approximation has a local minimum there?
    3) In the context of this article, if b^2-ac > 0, how do we know the only zero of the quadratic form is at (0, 0)?
    4) Again in the context of this article, why does b^2 - ac = 0 mean both that the quadratic form will always be positive or always be negative and that it can have zeros at points other than (0, 0)?
    (1 vote)