College of Science and Engineering, James Cook University
Last updated: 14 January 2022
In practice it is often impossible to know the true value of the measurand. If you knew the true value then you wouldn't bother doing a measurement! Every measurement is always subject to some level of uncertainty. Our goal this week is to learn how to rigorously handle measurement error. These concepts are essential in their own right, but also form the foundation of the sensor fusion methods that we will begin work on next week.
Recall that the central problem of sensing is to use measurements to obtain information about the measurand. This is fundamentally a question of statistical estimation. If the measurand is not changing, then each measurement is a random variable drawn from the same distribution. The randomness represents the measurement noise. (We will consider the case of a time-varying measurand next week.)
Review of probability and statistics
We will typically use a Bayesian interpretation of probability, i.e. probability represents a degree of belief. The probability reflects the degree to which a statement is supported by the available evidence.
Expected value
The expected value is the probability-weighted average of all possible outcomes (not necessarily the single most likely outcome). In the context of measurement, it is the value that we believe best represents the true measurand based upon the available evidence.
Mathematically, it is defined as follows. Let X be a random variable defined by its probability density f(x). The expected value of X is
$$E[X] = \bar{X} = \int_{-\infty}^{\infty} x f(x)\,dx.$$
Adjust the limits of integration if the probability density is defined over a different range. We will often write the expected value with a bar, e.g. $\bar{X} = E[X]$.
Sometimes we want the expected value of some calculated result instead of the expected value of the measurement itself. In this case the expected value of some function of X is given by:
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$$
For a finite sample of measurements, the expected value is just the average computed in the usual way:
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i.$$
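For instance, in Python (a sketch with made-up noisy readings of a constant 5.0 V measurand):

```python
import numpy as np

# Simulate 1000 noisy readings of a constant 5.0 V measurand
rng = np.random.default_rng(seed=0)
readings = 5.0 + 0.1 * rng.standard_normal(1000)

# The sample mean estimates the expected value E[X]
x_bar = readings.mean()
```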
Variance and standard deviation
The variance measures how far samples spread about the mean. There are two related quantities: the variance, $\sigma_X^2$, which has units of the measurement squared, and the standard deviation, $\sigma_X$, which is the square root of the variance and has the same units as the measurement.
The definition of variance is:
$$\sigma_X^2 = E[(X - \bar{X})^2].$$
Translated into words, this indicates the “average of the squared distance from the mean”.
For an entire population the variance can be calculated using
$$\sigma_X^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2.$$
Note that the above definition applies to an entire population. Most often in statistics we deal with a finite sample drawn from the larger population, in which case the correct (“unbiased”) estimator of variance is
$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2.$$
The proof of this estimator is outside the scope of this class, so refer to a statistics book for more details. The idea of the proof is to treat the sample variance as a random variable (since it depends upon a random sampling of the broader population), then find its expected value. The resulting algebra shows that dividing by $N-1$ rather than $N$ makes the expected value of the estimator equal to the true variance $\sigma_X^2$.
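In NumPy the population and unbiased estimators differ only in the `ddof` (delta degrees of freedom) argument; a short sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
sample = 2.0 + 0.5 * rng.standard_normal(500)

# Population formula: divide by N (numpy's default, ddof=0)
var_population = sample.var()

# Unbiased sample estimator: divide by N - 1 (ddof=1)
var_sample = sample.var(ddof=1)

# Sample standard deviation
s = np.sqrt(var_sample)
```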
Visual illustration of the mean and standard deviation
The impact of the mean and standard deviation can be explored using interactive Figure 1. This figure plots the normal distribution (also called the Gaussian distribution), which has probability density function
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \bar{x})^2}{2\sigma^2}\right),$$
where xˉ is the expected value and σ is the standard deviation.
Drag the sliders to adjust the mean and standard deviation.
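The density above is straightforward to evaluate in code. A minimal sketch (the helper name `gaussian_pdf` is our own) that also checks numerically that the density integrates to one:

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normal probability density with expected value `mean` and standard deviation `sigma`."""
    return np.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

# Any probability density must integrate to 1; check with a simple Riemann sum
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
area = np.sum(gaussian_pdf(x, mean=0.0, sigma=1.0)) * dx
```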
Correlation
Correlation is the tendency of two variables to exhibit a linear relationship. Mathematically:
$$\rho_{X,Y} = \frac{E[(X - \bar{X})(Y - \bar{Y})]}{\sigma_X \sigma_Y}.$$
Correlation always lies in the range $-1 \le \rho_{X,Y} \le 1$. If $\rho_{X,Y} \approx 0$ then the variables are said to be uncorrelated.
It is best explained visually (Figure 2). Drag the slider to see different levels of correlation.
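Correlation is easy to estimate from paired samples; a sketch with synthetic data constructed to have a theoretical correlation of 0.8 (the coefficients and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
x = rng.standard_normal(2000)
noise = rng.standard_normal(2000)

# y depends linearly on x plus independent noise; with these coefficients
# var(y) = 0.8**2 + 0.6**2 = 1, so the theoretical correlation is 0.8
y = 0.8 * x + 0.6 * noise

rho = np.corrcoef(x, y)[0, 1]  # sample correlation coefficient
```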
Covariance
Covariance is similar to correlation but not normalised. While correlation is a dimensionless number between −1 and +1, covariance has units and there is no upper or lower limit.
Let $\sigma_{X,Y}^2$ be the covariance between variables X and Y. Covariance is also written as $\mathrm{cov}(X, Y)$. The definition of covariance is as follows:
$$\sigma_{X,Y}^2 = E[(X - \bar{X})(Y - \bar{Y})] = \rho_{X,Y} \sigma_X \sigma_Y.$$
This definition shows that covariance is equivalent to the correlation multiplied by the standard deviation of each variable. The relationship to correlation provides an intuitive explanation, as shown in Figure 3: when two variables have nonzero covariance, the region occupied by the joint distribution shrinks, allowing for better precision in the joint measurement of the entire system.
Note that the covariance of a variable with itself is the variance:
$$\mathrm{cov}(X, X) = \sigma_X^2.$$
Since the covariance is defined between pairs of variables, it is convenient to list all pairwise combinations in a matrix. For instance, given two variables X and Y, the covariance matrix is of dimensions $2 \times 2$ and is given by:
$$\Sigma = \begin{bmatrix} \sigma_X^2 & \sigma_{X,Y}^2 \\ \sigma_{X,Y}^2 & \sigma_Y^2 \end{bmatrix}.$$
In the general case of an arbitrary number of random variables, we can form the covariance matrix as follows. Firstly define a vector V containing the variables:
$$V = \begin{bmatrix} v_1 & v_2 & \cdots \end{bmatrix}^T,$$
and then the covariance matrix is
$$\Sigma_V = E\left[(V - E[V])(V - E[V])^T\right].$$
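NumPy's `np.cov` builds this matrix directly from samples (using the unbiased $N-1$ normalisation); a sketch with two correlated synthetic variables:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
x = rng.standard_normal(5000)
y = 0.5 * x + rng.standard_normal(5000)  # theoretical cov(X, Y) = 0.5

# Rows are variables, columns are observations
Sigma = np.cov(np.vstack([x, y]))

var_x, var_y = Sigma[0, 0], Sigma[1, 1]   # diagonal: variances
cov_xy = Sigma[0, 1]                      # off-diagonal: covariance
```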
Visual illustration of the covariance matrix for a 2D Gaussian distribution
Figure 4 shows a two-dimensional Gaussian with the specified correlation coefficient $\rho$ and standard deviations $\sigma_x$ and $\sigma_y$.
Error propagation
Defining error
There are two ways that we can define error:
1. Absolute error, giving a range of possible values, e.g. $v_1 = 0.5 \pm 0.1$ V, without specifying the likelihood of values within that range. The limits can be thought of as giving worst-case scenarios.
2. Error characterised through a probability distribution, for example as a normal distribution with a given standard deviation. The probability distribution gives the relative likelihood of each amount of error. We will typically use this approach.
Introduction to error propagation
Suppose that you perform a measurement and obtain a result $x$; however, this result is subject to uncertainty. You would like to use that measurement in some arbitrary calculation $f(x)$. The problem of error propagation is to calculate the uncertainty in the calculated result $f(x)$ given the uncertainty in $x$.
An intuitive picture is shown in Figure 5. The measurement $x$ has error bars given by the standard deviation $\sigma_x$. Our task is to calculate $\sigma_y$, the uncertainty in the calculated result $y = f(x)$.
Derivation of the variance formula
Let $x$ be a random variable with expected value $\bar{x}$ and variance $\sigma_x^2$. Let $y = f(x)$ be a calculated value based on $x$. Notice that $y$ is also a random variable. We seek an expression for the variance $\sigma_y^2$.
Suppose we perform a measurement and obtain the result x0. Take a Taylor series expansion of f about x0.
$$y = f(x_0) + \left.\frac{\partial f}{\partial x}\right|_{x_0} (x - x_0) + \ldots$$
The notation $\left.\frac{\partial f}{\partial x}\right|_{x_0}$ means take the derivative and then substitute $x_0$; in other words, this quantity is just a number. Let $J = \left.\frac{\partial f}{\partial x}\right|_{x_0}$ so that $y \approx f(x_0) + J(x - x_0)$.
Our goal is to find $\sigma_y^2 = E[(y - \bar{y})^2]$. We need an expression for $\bar{y}$. Taking the expected value of the linearised expression gives $\bar{y} \approx f(x_0) + J(\bar{x} - x_0)$, since $f(x_0)$ and $J$ are constants. Subtracting the two expressions, $y - \bar{y} \approx J(x - \bar{x})$, and hence
$$\sigma_y^2 = E[J^2 (x - \bar{x})^2] = J^2 \sigma_x^2,$$
or equivalently $\sigma_y = |J| \sigma_x$.
Worked example: the energy stored in a $1\ \mu\mathrm{F}$ capacitor is $W = \frac{1}{2} C V^2$. You measure the voltage to be 3.6 V, but your measurement is noisy. You know that your voltmeter's inherent noise has a standard deviation of 0.1 V. What is the standard deviation of the energy?
Solution: the energy is
$$W = \frac{1}{2}(1 \times 10^{-6})(3.6)^2 = 6.48 \times 10^{-6}\ \mathrm{J}.$$
Calculate the derivative and substitute the known values:
$$J = \frac{\partial W}{\partial V} = CV = (1 \times 10^{-6})(3.6) = 3.6 \times 10^{-6}.$$
The standard deviation of the energy is then
$$\sigma_W = |J| \sigma_V = (3.6 \times 10^{-6})(0.1) = 3.6 \times 10^{-7}\ \mathrm{J}.$$
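A Monte Carlo check of this kind of linearised propagation (a sketch; the seed and sample size are arbitrary choices): sample many noisy voltage readings, compute the energy for each, and compare the spread against the propagated value.

```python
import numpy as np

C = 1e-6         # capacitance (F)
v_mean = 3.6     # measured voltage (V)
sigma_v = 0.1    # voltmeter noise standard deviation (V)

# Linearised propagation: sigma_W = |dW/dV| * sigma_V, with dW/dV = C*V
sigma_w_linear = C * v_mean * sigma_v

# Monte Carlo: simulate noisy voltage readings and look at the spread of W
rng = np.random.default_rng(seed=4)
v = v_mean + sigma_v * rng.standard_normal(100_000)
w = 0.5 * C * v ** 2
sigma_w_mc = w.std(ddof=1)
```

The two estimates agree closely because the noise is small relative to the mean, so the linearisation is accurate.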
There is a simplification to the above method in the case that the variables are uncorrelated. The simplification (which you will prove in the tutorial questions) is as follows:
If $y = f(x_1, x_2, \ldots)$ then the uncertainty in $y$ is given by:
$$\sigma_y = \sqrt{\left(\frac{\partial f}{\partial x_1}\right)^2 \sigma_{x_1}^2 + \left(\frac{\partial f}{\partial x_2}\right)^2 \sigma_{x_2}^2 + \ldots}$$
This is sometimes called the root sum of squares (RSS) method.
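As a sketch of the RSS formula, take a hypothetical power calculation $P = VI$ with uncorrelated uncertainties on the voltage and current (the values are made up):

```python
import numpy as np

v, sigma_v = 12.0, 0.2    # voltage (V) and its standard deviation
i, sigma_i = 1.5, 0.05    # current (A) and its standard deviation

# Partial derivatives of P = V*I: dP/dV = I and dP/dI = V
sigma_p = np.sqrt((i * sigma_v) ** 2 + (v * sigma_i) ** 2)
```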
However, if the uncertainties are treated as absolute errors (instead of standard deviations) then another formula is sometimes used. This is the absolute error combination formula:
$$\sigma_y = \left|\frac{\partial f}{\partial x_1}\right| \sigma_{x_1} + \left|\frac{\partial f}{\partial x_2}\right| \sigma_{x_2} + \ldots$$
Use the absolute error method only if the uncertainties are not derived from a standard deviation.
Introduction to calibration
Calibration is the process of estimating a sensor’s transfer function. Often there is prior knowledge of the functional form but some coefficients need to be adjusted to account for manufacturing variability, change in sensor properties over time, etc.
One point calibration
One point calibration is performed using one value of the measurand. Hence it is the simplest calibration method. Its role is to estimate bias (so it can be subtracted away).
This method assumes that the sensitivity and functional form of the transfer function are correctly known, and the only issue is with bias. Specifically this method addresses miscalibrations of the form $y_\mathrm{raw} = y_\mathrm{reference} + b$, where $y_\mathrm{raw}$ is the actual measurement, $y_\mathrm{reference}$ is the sensor response in the absence of bias, and $b$ is the bias. Refer to Figure 6 for an illustration.
We calibrate the sensor by finding the bias b. If you have raw measurements obtained from your sensor, and reference measurements of the same measurand that are known to be accurate, then the bias is
$$b = E[y_\mathrm{raw}] - E[y_\mathrm{reference}].$$
The calibration for the sensor is then given by
$$y_\mathrm{calibrated} = y_\mathrm{raw} - b.$$
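A sketch of one point calibration on synthetic data (the reference value, bias, and noise level are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Raw sensor readings of a reference held at a known value, with an
# unknown constant bias (1.7 here, which the calibration should recover)
reference = 25.0
raw = reference + 1.7 + 0.1 * rng.standard_normal(200)

# Estimate the bias as the mean difference between raw and reference
b = raw.mean() - reference

# Apply the calibration
calibrated = raw - b
```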
Two point calibration
Two point calibration is appropriate for linear sensors. The idea is to measure the sensor response at two known points, and fit a straight line between the points (Figure 7).
The measurand can be determined using another sensor, or by measuring particular calibration samples whose characteristics are already known.
The calibrated transfer function is the equation of a straight line that passes through these two points:
$$y_\mathrm{calibrated} = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x - x_1) + y_1.$$
Sensor noise can be handled by repeatedly sampling at the reference points. In this case, replace $y_1$ and $y_2$ with $E[y_1]$ and $E[y_2]$.
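A sketch of two point calibration (the reference points are invented; in practice the outputs at each reference would be averages of repeated readings):

```python
# Reference points: raw sensor output x_i recorded at known true values y_i
x1, y1 = 0.12, 0.0     # raw reading at the 0 degC reference (illustrative)
x2, y2 = 98.7, 100.0   # raw reading at the 100 degC reference (illustrative)

def calibrate(x_raw):
    """Straight line through the two reference points."""
    return (y2 - y1) / (x2 - x1) * (x_raw - x1) + y1
```

By construction the fitted line passes exactly through both reference points.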
Conclusion
Systematic analysis of a sensor system requires characterisation of its measurement uncertainty. In this subject we will specify uncertainty with a covariance matrix, which represents the variance of each variable as well as information about linear correlations between variables.
The central result of this week is the method of propagating variance through calculations. Given the covariance of the measurements, $\Sigma_x$, the covariance of any calculated result $y = f(x)$, provided that $f$ is differentiable, is
$$\Sigma_y = J \Sigma_x J^T,$$
where $J$ is the Jacobian of the function $f$.
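As a sketch of the matrix form, consider converting a polar measurement $(r, \theta)$ to Cartesian coordinates (the numbers are invented):

```python
import numpy as np

# Measured range and bearing, with uncorrelated noise
r, theta = 10.0, np.pi / 4
Sigma_x = np.diag([0.1 ** 2, 0.01 ** 2])  # var(r), var(theta)

# Jacobian of f(r, theta) = (r*cos(theta), r*sin(theta))
J = np.array([
    [np.cos(theta), -r * np.sin(theta)],
    [np.sin(theta),  r * np.cos(theta)],
])

# Propagate the covariance through the (linearised) transformation
Sigma_y = J @ Sigma_x @ J.T
```

Note that even though the polar measurements are uncorrelated, the Cartesian covariance matrix can acquire off-diagonal terms in general; the output covariance depends on the geometry through the Jacobian.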
References
Clarence W. de Silva, Sensor Systems: Fundamentals and Applications, CRC Press, 2017, Sections 6.4, 6.5 and Appendix C.