In my opinion, the following proof is the shortest, simplest, and best proof of the Cauchy-Schwarz inequality. It is developed in The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities [1]. Below are three variations of the proof at three increasing levels of abstraction. These three variations are expressed respectively in terms of:

  • random variables [2]

  • vectors of any real inner product space [1]

  • vectors of any inner product space (real or complex) [3]

The Cauchy-Schwarz inequality was originally expressed in terms of sequences of numbers [1]. The continuous analogue is in terms of two integrable functions [4].
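Written out, these two classical forms are (standard statements, paraphrased rather than quoted from [1] and [4]):

$$\left(\sum_{k=1}^{n} a_k b_k\right)^2 \le \left(\sum_{k=1}^{n} a_k^2\right) \left(\sum_{k=1}^{n} b_k^2\right)$$

$$\left(\int f(x)\, g(x)\, dx\right)^2 \le \left(\int f(x)^2\, dx\right) \left(\int g(x)^2\, dx\right)$$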

In terms of random variables

Given any two random variables $X$ and $Y$,

$$\operatorname{E}[XY]^2 \le \operatorname{E}[X^2] \, \operatorname{E}[Y^2]$$

with equality holding iff $aX + bY = 0$ for some constants $a$ and $b$, at least one non-zero (i.e. $X$ and $Y$ are linearly dependent).

Proof

If either $\operatorname{E}[X^2] = 0$ or $\operatorname{E}[Y^2] = 0$ then $\operatorname{E}[XY] = 0$ and the inequality holds trivially. Otherwise define

$$\hat{X} := \frac{X}{\sqrt{\operatorname{E}[X^2]}} \;\text{ and }\; \hat{Y} := \frac{Y}{\sqrt{\operatorname{E}[Y^2]}}$$

for which $\operatorname{E}[\hat{X}^2] = \operatorname{E}[\hat{Y}^2] = 1$. The proof follows from the product of two numbers always being less than or equal to the average of their squares:

$$\begin{aligned}
0 &\le \operatorname{E}\big[(\hat{X} - \hat{Y})^2\big] \\
\operatorname{E}[\hat{X}\hat{Y}] &\le \operatorname{E}\left[\frac{\hat{X}^2 + \hat{Y}^2}{2}\right] \\
\frac{\operatorname{E}[XY]}{\sqrt{\operatorname{E}[X^2]}\sqrt{\operatorname{E}[Y^2]}} &\le \frac{1+1}{2} \\
\operatorname{E}[XY]^2 &\le \operatorname{E}[X^2] \, \operatorname{E}[Y^2]
\end{aligned}$$

(The final squaring step is justified because the same argument applied to $-X$ bounds $-\operatorname{E}[XY]$ as well, so $\left|\operatorname{E}[XY]\right| \le \sqrt{\operatorname{E}[X^2]\operatorname{E}[Y^2]}$.)

If both sides of the inequality are equal, linear dependence follows since either $X = 0$ or $Y = 0$ (almost surely) or $\operatorname{E}\big[(\hat{X} \mp \hat{Y})^2\big] = 0$, giving

$$\frac{1}{\sqrt{\operatorname{E}[X^2]}} X \mp \frac{1}{\sqrt{\operatorname{E}[Y^2]}} Y = \hat{X} \mp \hat{Y} = 0$$

with the upper sign when $\operatorname{E}[XY] > 0$ and the lower sign when $\operatorname{E}[XY] < 0$. Conversely, if $X$ and $Y$ are linearly dependent, either $X = kY$ or $Y = kX$ for some constant $k$; either way both sides of the inequality are equal.

QED
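As a quick numerical sanity check (not part of the proof), the inequality can be verified empirically by treating a random variable as a finite sample and $\operatorname{E}$ as the sample mean; the sample sizes and distributions below are arbitrary illustrations:

```python
import random

def expect(samples):
    """Empirical expectation: the mean of a finite sample."""
    return sum(samples) / len(samples)

random.seed(0)
# Two correlated random variables, represented by samples.
x = [random.gauss(0, 1) for _ in range(10_000)]
y = [2 * xi + random.gauss(0, 0.5) for xi in x]

lhs = expect([xi * yi for xi, yi in zip(x, y)]) ** 2
rhs = expect([xi * xi for xi in x]) * expect([yi * yi for yi in y])
assert lhs <= rhs  # E[XY]^2 <= E[X^2] E[Y^2]

# Equality case: Y = kX (linear dependence) makes both sides equal.
y_dep = [3 * xi for xi in x]
lhs_eq = expect([xi * yi for xi, yi in zip(x, y_dep)]) ** 2
rhs_eq = expect([xi * xi for xi in x]) * expect([yi * yi for yi in y_dep])
assert abs(lhs_eq - rhs_eq) <= 1e-9 * rhs_eq
```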

In terms of vectors of a real inner product space

The probabilistic proof can be generalized to any real inner product space as shown in [1].

Given any vectors $x, y$ from a real inner product space, the Cauchy-Schwarz inequality is

$$\left|\left\langle x, y \right\rangle\right| \le \left\|x\right\| \left\|y\right\|$$

with equality holding iff $x$ and $y$ are linearly dependent. (The proof below bounds $\langle x, y \rangle$ itself; applying it to $-x$ as well gives the absolute-value form.)

Proof

If either $\|x\| = 0$ or $\|y\| = 0$ then $\langle x, y \rangle = 0$ and the inequality holds trivially. Otherwise define

$$\hat{x} := \frac{x}{\|x\|} \;\text{ and }\; \hat{y} := \frac{y}{\|y\|}$$

for which $\|\hat{x}\| = \|\hat{y}\| = 1$. Then

$$\begin{aligned}
0 &\le \langle \hat{x} - \hat{y}, \hat{x} - \hat{y} \rangle \\
2 \langle \hat{x}, \hat{y} \rangle &\le \langle \hat{x}, \hat{x} \rangle + \langle \hat{y}, \hat{y} \rangle \\
2 \, \frac{\langle x, y \rangle}{\|x\| \|y\|} &\le 1 + 1 \\
\langle x, y \rangle &\le \|x\| \|y\|
\end{aligned}$$

Applying the same argument with $x$ replaced by $-x$ bounds $-\langle x, y \rangle$ as well, so $\left|\langle x, y \rangle\right| \le \|x\| \|y\|$.

If both sides of the inequality are equal, linear dependence follows since either $x = \vec{0}$ or $y = \vec{0}$ or

$$\frac{1}{\|x\|} x \mp \frac{1}{\|y\|} y = \hat{x} \mp \hat{y} = \vec{0}$$

with the upper sign when $\langle x, y \rangle > 0$ and the lower sign when $\langle x, y \rangle < 0$. Conversely, if $x$ and $y$ are linearly dependent, either $x = \lambda y$ or $y = \lambda x$ for some scalar $\lambda$; either way $\left|\langle x, y \rangle\right| = \|x\| \|y\|$.

QED
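The same inequality can be spot-checked numerically for the standard dot product on $\mathbb{R}^n$ (the vectors below are arbitrary illustrations):

```python
import math

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

x = [1.0, -2.0, 3.0]
y = [4.0, 0.5, -1.0]
assert abs(dot(x, y)) <= norm(x) * norm(y)

# Linear dependence (y = 2x) gives equality.
y_dep = [2.0 * xi for xi in x]
assert math.isclose(abs(dot(x, y_dep)), norm(x) * norm(y_dep))
```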

In terms of vectors of an inner product space

This section considers the Cauchy-Schwarz inequality for vectors of a real or complex inner product space.

The proof for real inner product spaces does not work for complex inner product spaces because $\langle y, x \rangle = \overline{\langle x, y \rangle}$ (the complex conjugate).

The proof is effectively the same as the previous proof for real inner product spaces, but the normalized vectors $\hat{x}$ and $\hat{y}$ must be “rotated” in the complex plane so that both sides of the inequality remain real. This rotation is done via a multiplier $\alpha$.

Proof

Let $\hat{x}$ and $\hat{y}$ be defined as in the proof for real inner product spaces. If $\langle \hat{x}, \hat{y} \rangle = 0$ the inequality holds, otherwise let

$$\alpha := \sqrt{\frac{\langle \hat{y}, \hat{x} \rangle}{\left|\langle \hat{x}, \hat{y} \rangle\right|}}$$

for which the following convenient properties hold:

$$\alpha \overline{\alpha} = 1 = \overline{\alpha} \alpha$$

$$\alpha^2 \langle \hat{x}, \hat{y} \rangle = \frac{\langle \hat{y}, \hat{x} \rangle \langle \hat{x}, \hat{y} \rangle}{\left|\langle \hat{x}, \hat{y} \rangle\right|} = \frac{\left|\langle \hat{x}, \hat{y} \rangle\right|^2}{\left|\langle \hat{x}, \hat{y} \rangle\right|} = \left|\langle \hat{x}, \hat{y} \rangle\right| = \overline{\alpha}^2 \langle \hat{y}, \hat{x} \rangle$$

The proof proceeds as with a real inner product space, but using $\alpha$:

$$\begin{aligned}
0 &\le \langle \alpha \hat{x} - \overline{\alpha} \hat{y}, \alpha \hat{x} - \overline{\alpha} \hat{y} \rangle \\
&= \alpha \overline{\alpha} \langle \hat{x}, \hat{x} \rangle - \alpha^2 \langle \hat{x}, \hat{y} \rangle - \overline{\alpha}^2 \langle \hat{y}, \hat{x} \rangle + \overline{\alpha} \alpha \langle \hat{y}, \hat{y} \rangle \\
2 \left|\langle \hat{x}, \hat{y} \rangle\right| &\le \langle \hat{x}, \hat{x} \rangle + \langle \hat{y}, \hat{y} \rangle \\
2 \, \frac{\left|\langle x, y \rangle\right|}{\|x\| \|y\|} &\le 1 + 1 \\
\left|\langle x, y \rangle\right| &\le \|x\| \|y\|
\end{aligned}$$

If both sides of the inequality are equal, linear dependence follows since either $x = \vec{0}$ or $y = \vec{0}$ or

$$\frac{\alpha}{\|x\|} x - \frac{\overline{\alpha}}{\|y\|} y = \alpha \hat{x} - \overline{\alpha} \hat{y} = \vec{0}$$

Conversely, if $x$ and $y$ are linearly dependent, either $x = \lambda y$ or $y = \lambda x$ for some scalar $\lambda$; either way both sides of the inequality are equal.

QED
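For the complex case, one can check both the inequality and the properties of the rotation multiplier $\alpha$ numerically, using the standard Hermitian inner product on $\mathbb{C}^n$, linear in the first argument as assumed in the proof (the vectors are arbitrary illustrations):

```python
import cmath
import math

def inner(u, v):
    """Hermitian inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
assert abs(inner(x, y)) <= norm(x) * norm(y)

# The rotation multiplier alpha from the proof.
xh = [a / norm(x) for a in x]
yh = [a / norm(y) for a in y]
p = inner(xh, yh)
alpha = cmath.sqrt(inner(yh, xh) / abs(p))
assert math.isclose(abs(alpha), 1.0)        # alpha * conj(alpha) = 1
assert cmath.isclose(alpha**2 * p, abs(p))  # alpha^2 <x^,y^> = |<x^,y^>|
```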

References

1.
Steele JM. The Cauchy-Schwarz master class: An introduction to the art of mathematical inequalities. Cambridge ; New York: Cambridge University Press; 2004.
2.
DeGroot MH, Schervish MJ. Probability and statistics. 3rd ed. Boston: Addison-Wesley; 2002.
3.
Cauchy–Schwarz inequality — Wikipedia, the free encyclopedia. 2021. Available: https://en.wikipedia.org/w/index.php?title=Cauchy-Schwarz_inequality&oldid=1024014552
4.
Weisstein EW. Schwarz’s inequality. From MathWorld–A Wolfram Web Resource. Available: https://mathworld.wolfram.com/SchwarzsInequality.html