Quadratic Form in Statistics
In the rather sterile world of multivariate statistics, we encounter a concept known as the quadratic form. Imagine you have a collection of random variables, let's call this vector $\epsilon$, of dimension $n$. Now, picture a meticulously constructed, symmetric $n \times n$ matrix $\Lambda$. When you multiply these together in a specific way – $\epsilon^T \Lambda \epsilon$ – what you get is a single scalar value. That, my friend, is a quadratic form in $\epsilon$. Simple enough, I suppose, for those who appreciate such things.
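To make the definition concrete, here is a minimal numpy sketch (the names eps and Lam, and their values, are purely illustrative choices of mine) that builds a small symmetric matrix and evaluates the resulting scalar:

```python
import numpy as np

# A minimal sketch: evaluate a quadratic form eps^T Lam eps for one
# concrete vector and one symmetric matrix (illustrative values only).
rng = np.random.default_rng(0)

n = 3
eps = rng.normal(size=n)        # a realisation of the random vector
A = rng.normal(size=(n, n))
Lam = (A + A.T) / 2             # force symmetry

q = eps @ Lam @ eps             # the quadratic form: a single scalar
print(q)
```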
Expectation: What's the Average Outcome?
Now, let's talk about what we can expect from this quadratic form. It's not rocket science, but it does require a bit of rigor. It can be proven, and I'm not going to hold your hand through it, that:

$$\operatorname{E}\left[\epsilon^T \Lambda \epsilon\right] = \operatorname{tr}(\Lambda \Sigma) + \mu^T \Lambda \mu$$

This equation tells us the expected value of our quadratic form. Here, $\mu = \operatorname{E}[\epsilon]$ represents the expected value of $\epsilon$, and $\Sigma = \operatorname{Var}[\epsilon]$ is its variance-covariance matrix. The 'tr' bit? That's the trace of a matrix, which is just the sum of its diagonal elements. The crucial point here is that this result doesn't hinge on $\epsilon$ behaving nicely, like following a multivariate normal distribution. All it needs is for $\mu$ and $\Sigma$ to exist. Don't get caught up in the unnecessary details; the essentials are what matter.
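If you would rather trust a computer than take my word for it, the following sketch (mean, covariance, and symmetric matrix are all arbitrary illustrative choices) compares a Monte Carlo estimate of the expectation against the closed form $\operatorname{tr}(\Lambda \Sigma) + \mu^T \Lambda \mu$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Illustrative mean vector, covariance matrix and symmetric Lambda.
mu = rng.normal(size=n)
B = rng.normal(size=(n, n))
Sigma = B @ B.T + n * np.eye(n)          # a positive-definite covariance
A = rng.normal(size=(n, n))
Lam = (A + A.T) / 2

# Monte Carlo estimate of E[eps^T Lam eps]; normality is not required,
# a Gaussian sampler is simply convenient here.
eps = rng.multivariate_normal(mu, Sigma, size=200_000)
mc = np.mean(np.einsum('ij,jk,ik->i', eps, Lam, eps))

# Closed form: tr(Lam Sigma) + mu^T Lam mu.
exact = np.trace(Lam @ Sigma) + mu @ Lam @ mu
print(mc, exact)                         # should agree up to Monte Carlo error
```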
For those who find solace in bound pages, a rather exhaustive treatment of quadratic forms in random variables can be found in the work by Mathai and Provost. [^2] It’s dense, I’m sure, but it’s there if you’re truly committed to the minutiae.
Proof: How Do We Know?
You want to know how we arrive at that expectation formula? Fine. Since the quadratic form is, as we established, a scalar, it's equal to its own trace:

$$\epsilon^T \Lambda \epsilon = \operatorname{tr}(\epsilon^T \Lambda \epsilon)$$
Now, a fundamental property of the trace operator is its cyclic property. This allows us to rearrange the expression inside the trace:

$$\operatorname{tr}(\epsilon^T \Lambda \epsilon) = \operatorname{tr}(\Lambda \epsilon \epsilon^T)$$
The trace, being a linear combination of matrix elements, commutes with the expectation operator due to the linearity of expectation. This leads us to:

$$\operatorname{E}\left[\operatorname{tr}(\Lambda \epsilon \epsilon^T)\right] = \operatorname{tr}\left(\Lambda \operatorname{E}[\epsilon \epsilon^T]\right)$$
A standard result in the theory of variances tells us that $\operatorname{E}[\epsilon \epsilon^T]$ is precisely the covariance matrix $\Sigma$ plus the outer product of the mean vector, $\mu \mu^T$. So, we have:

$$\operatorname{tr}\left(\Lambda \operatorname{E}[\epsilon \epsilon^T]\right) = \operatorname{tr}\left(\Lambda (\Sigma + \mu \mu^T)\right)$$
By the linearity of the trace, we can split this into two terms:

$$\operatorname{tr}\left(\Lambda (\Sigma + \mu \mu^T)\right) = \operatorname{tr}(\Lambda \Sigma) + \operatorname{tr}(\Lambda \mu \mu^T)$$
And by the cyclic property again, $\operatorname{tr}(\Lambda \mu \mu^T) = \operatorname{tr}(\mu^T \Lambda \mu)$. Since $\mu^T \Lambda \mu$ is a scalar, its trace is simply the scalar itself. This brings us back to the elegant conclusion:

$$\operatorname{E}\left[\epsilon^T \Lambda \epsilon\right] = \operatorname{tr}(\Lambda \Sigma) + \mu^T \Lambda \mu$$
There. Satisfied?
Variance in the Gaussian Case: When Things Get Predictable
The variance of a quadratic form can be a messy affair, highly dependent on the distribution of $\epsilon$. However, if $\epsilon$ happens to follow a multivariate normal distribution, then things simplify considerably, assuming $\Lambda$ is symmetric. In this fortunate scenario, the variance is given by:

$$\operatorname{Var}\left[\epsilon^T \Lambda \epsilon\right] = 2 \operatorname{tr}(\Lambda \Sigma \Lambda \Sigma) + 4 \mu^T \Lambda \Sigma \Lambda \mu$$
This formula[^3] is a cornerstone for analyzing quadratic forms under normality.
Furthermore, this can be generalized to compute the covariance between two such quadratic forms, say $\epsilon^T \Lambda \epsilon$ and $\epsilon^T \Lambda' \epsilon$, again assuming $\Lambda$ and $\Lambda'$ are symmetric:

$$\operatorname{Cov}\left[\epsilon^T \Lambda \epsilon,\; \epsilon^T \Lambda' \epsilon\right] = 2 \operatorname{tr}(\Lambda \Sigma \Lambda' \Sigma) + 4 \mu^T \Lambda \Sigma \Lambda' \mu$$
This further generalization, noted in [^4], provides a more complete picture of the relationships between these quantities.
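The same kind of numerical sanity check works for both Gaussian-case formulas. The sketch below (again with illustrative parameters of my own choosing) simulates a multivariate normal vector and compares the sample variance and covariance of two quadratic forms with the closed-form expressions above:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# Illustrative mean, covariance and two symmetric matrices.
mu = rng.normal(size=n)
B = rng.normal(size=(n, n))
Sigma = B @ B.T + n * np.eye(n)
A1, A2 = rng.normal(size=(n, n)), rng.normal(size=(n, n))
Lam1, Lam2 = (A1 + A1.T) / 2, (A2 + A2.T) / 2

# Simulate the Gaussian vector and evaluate both quadratic forms.
eps = rng.multivariate_normal(mu, Sigma, size=500_000)
q1 = np.einsum('ij,jk,ik->i', eps, Lam1, eps)
q2 = np.einsum('ij,jk,ik->i', eps, Lam2, eps)

# Variance: 2 tr(L S L S) + 4 mu^T L S L mu.
var_exact = 2 * np.trace(Lam1 @ Sigma @ Lam1 @ Sigma) + 4 * mu @ Lam1 @ Sigma @ Lam1 @ mu
# Covariance: 2 tr(L1 S L2 S) + 4 mu^T L1 S L2 mu.
cov_exact = 2 * np.trace(Lam1 @ Sigma @ Lam2 @ Sigma) + 4 * mu @ Lam1 @ Sigma @ Lam2 @ mu

print(q1.var(), var_exact)              # should agree up to Monte Carlo error
print(np.cov(q1, q2)[0, 1], cov_exact)  # likewise
```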
As a side note, when $\epsilon$ is Gaussian, a quadratic form like this follows a generalized chi-squared distribution. It's a distribution that's more complex than the standard chi-squared distribution, but it captures the essence of these forms.
Computing the Variance in the Non-Symmetric Case: When Symmetry Isn't Guaranteed
What if $\Lambda$ isn't symmetric? Does all our elegant math fall apart? Not entirely. We can exploit the fact that $\epsilon^T \Lambda \epsilon$ is identical to $\epsilon^T \Lambda^T \epsilon$, because a scalar is equal to its own transpose. This allows us to define a new, symmetric matrix:

$$\tilde{\Lambda} = \frac{1}{2}\left(\Lambda + \Lambda^T\right)$$
Then, the quadratic form $\epsilon^T \tilde{\Lambda} \epsilon$ is identical to the original quadratic form $\epsilon^T \Lambda \epsilon$. Therefore, the expressions for the mean and variance remain the same, provided we substitute our original $\Lambda$ with this new, symmetric $\tilde{\Lambda}$. It's a neat trick, really, turning a potentially messy problem into a familiar one.
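A two-line check of the trick, using a deliberately non-symmetric matrix of my own invention:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

A = rng.normal(size=(n, n))          # deliberately non-symmetric
Lam_sym = (A + A.T) / 2              # its symmetric part

eps = rng.normal(size=n)
print(eps @ A @ eps, eps @ Lam_sym @ eps)   # the two scalars are identical
```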
Examples of Quadratic Forms: Where Do We See This?
Let's ground this in something tangible. Suppose you have a set of observations, denoted by the vector $y$, and an operator matrix $H$. The residual sum of squares, a common metric in statistical modeling, can be expressed as a quadratic form in $y$:

$$\operatorname{RSS} = y^T (I - H)^T (I - H)\, y$$
Here, $I$ is the identity matrix. Now, if $H$ is not only symmetric but also idempotent (meaning $H^2 = H$), and if the errors in your model are Gaussian with a covariance matrix of $\sigma^2 I$, then the quantity $\operatorname{RSS}/\sigma^2$ follows a noncentral chi-squared distribution. The number of degrees of freedom, $k$, and the noncentrality parameter, $\lambda$, are determined by the trace and a specific quadratic form in the mean $\mu = \operatorname{E}[y]$:

$$k = \operatorname{tr}\left[(I - H)^T (I - H)\right], \qquad \lambda = \frac{\mu^T (I - H)^T (I - H)\, \mu}{\sigma^2}$$
These parameters are found by matching the first two central moments of a noncentral chi-squared random variable to the formulas we discussed earlier. A particularly interesting case arises when $Hy$ is an unbiased estimator of $\mu$. In this situation, the noncentrality parameter $\lambda$ becomes zero, and $\operatorname{RSS}/\sigma^2$ simplifies to a central chi-squared distribution with $k$ degrees of freedom. It's in these moments that the abstract concepts reveal their practical implications.
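As a rough illustration (the least-squares model, the design matrix X and the hat matrix H below are a toy setup of mine, not something prescribed above), the residual sum of squares from an ordinary least-squares fit has exactly this structure; because the fitted values are unbiased for the mean, RSS/σ² behaves like a central chi-squared variable with k = n − p degrees of freedom:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, sigma = 50, 3, 1.5

# Toy linear model y = X beta + Gaussian noise with covariance sigma^2 I.
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + sigma * rng.normal(size=n)

# Hat matrix of ordinary least squares: symmetric and idempotent.
H = X @ np.linalg.solve(X.T @ X, X.T)
R = np.eye(n) - H

rss = y @ R.T @ R @ y      # the residual sum of squares as a quadratic form in y
k = np.trace(R.T @ R)      # degrees of freedom; here n - p = 47

# rss / sigma**2 is one draw from a central chi-squared with k degrees of
# freedom, so repeated runs average out to roughly k.
print(rss / sigma**2, k)
```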