College of Science and Engineering, James Cook University
Last updated: 14 January 2022
In practice it is often impossible to know the true value of the measurand. If you knew the true value then you wouldn't bother doing a measurement! Every measurement is always subject to some level of uncertainty. Our goal this week is to learn how to rigorously handle measurement error. These concepts are essential in their own right, but also form the foundation of the sensor fusion methods that we will begin work on next week.
Recall that the central problem of sensing is to use measurements to obtain information about the measurand. This is fundamentally a question of statistical estimation. If the measurand is not changing, then each measurement is a random variable drawn from the same distribution. The randomness represents the measurement noise. (We will consider the case of a time-varying measurand next week.)
Review of probability and statistics
We will typically use a Bayesian interpretation of probability, i.e. probability represents a degree of belief. The probability reflects the degree to which a statement is supported by the available evidence.
Expected value
The expected value is the probability-weighted average of all possible outcomes (not necessarily the single most likely outcome). In the context of measurement, it is the value that we believe best represents the true measurand based upon the available evidence.
Mathematically, it is defined as follows. Let X be a random variable defined by its probability density f(x). The expected value of X is
$$E[X] = \bar{X} = \int_{-\infty}^{\infty} x f(x)\,dx.$$
Adjust the limits of integration if the probability density is defined over a different range. We will often write the expected value with a bar, e.g. $\bar{X} = E[X]$.
Sometimes we want the expected value of some calculated result instead of the expected value of the measurement itself. In this case the expected value of some function of X is given by:
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$$
For a finite sample of measurements, the expected value is just the average computed in the usual way:
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i.$$
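For instance, in Python (a sketch with made-up noisy readings of a constant 5.0 V measurand):

```python
import numpy as np

# Simulate 1000 noisy readings of a constant 5.0 V measurand
rng = np.random.default_rng(seed=0)
readings = 5.0 + 0.1 * rng.standard_normal(1000)

# The sample mean estimates the expected value E[X]
x_bar = readings.mean()
```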
Variance and standard deviation
The variance measures how far samples spread about the mean. There are two related quantities: the variance, $\sigma_X^2$, which has units of the measurement squared, and the standard deviation, $\sigma_X$, which is the square root of the variance and has the same units as the measurement.
The definition of variance is:
$$\sigma_X^2 = E[(X - \bar{X})^2].$$
Translated into words, this indicates the “average of the squared distance from the mean”.
For an entire population the variance can be calculated using
$$\sigma_X^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2.$$
Note that the above definition applies to an entire population. Most often in statistics we deal with a finite sample drawn from the larger population, in which case the correct (“unbiased”) estimator of variance is
$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2.$$
The proof of this estimator is outside the scope of this class, so refer to a statistics book for more details. The idea of the proof is to treat the sample variance as a random variable (since it depends upon a random sampling of the broader population), then find its expected value. The resulting algebra shows that dividing by $N-1$ rather than $N$ makes the expected value of the estimator equal to the true variance $\sigma_X^2$.
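In NumPy the population and unbiased estimators differ only in the `ddof` (delta degrees of freedom) argument; a short sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
sample = 2.0 + 0.5 * rng.standard_normal(500)

# Population formula: divide by N (numpy's default, ddof=0)
var_population = sample.var()

# Unbiased sample estimator: divide by N - 1 (ddof=1)
var_sample = sample.var(ddof=1)

# Sample standard deviation
s = np.sqrt(var_sample)
```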
Visual illustration of the mean and standard deviation
The impact of the mean and standard deviation can be explored using interactive Figure 1. This figure plots the normal distribution (also called the Gaussian distribution), which has probability density function
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \bar{x})^2}{2\sigma^2}\right),$$
where xˉ is the expected value and σ is the standard deviation.
Drag the sliders to adjust the mean and standard deviation.
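The density above is straightforward to evaluate in code. A minimal sketch (the helper name `gaussian_pdf` is our own) that also checks numerically that the density integrates to one:

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normal probability density with expected value `mean` and standard deviation `sigma`."""
    return np.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

# Any probability density must integrate to 1; check with a simple Riemann sum
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
area = np.sum(gaussian_pdf(x, mean=0.0, sigma=1.0)) * dx
```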
Correlation
Correlation is the tendency of two variables to exhibit a linear relationship. Mathematically:
$$\rho_{X,Y} = \frac{E[(X - \bar{X})(Y - \bar{Y})]}{\sigma_X \sigma_Y}.$$
Correlation always lies in the range $-1 \le \rho_{X,Y} \le 1$. If $\rho_{X,Y} \approx 0$ then the variables are said to be uncorrelated.
It is best explained visually (Figure 2). Drag the slider to see different levels of correlation.
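Correlation is easy to estimate from paired samples; a sketch with synthetic data constructed to have a theoretical correlation of 0.8 (the coefficients and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
x = rng.standard_normal(2000)
noise = rng.standard_normal(2000)

# y depends linearly on x plus independent noise; with these coefficients
# var(y) = 0.8**2 + 0.6**2 = 1, so the theoretical correlation is 0.8
y = 0.8 * x + 0.6 * noise

rho = np.corrcoef(x, y)[0, 1]  # sample correlation coefficient
```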
Covariance
Covariance is similar to correlation but not normalised. While correlation is a dimensionless number between −1 and +1, covariance has units and there is no upper or lower limit.
Let $\sigma_{X,Y}^2$ be the covariance between variables X and Y. Covariance is also written as $\mathrm{cov}(X, Y)$. The definition of covariance is as follows:
$$\sigma_{X,Y}^2 = E[(X - \bar{X})(Y - \bar{Y})] = \rho_{X,Y} \sigma_X \sigma_Y.$$
This definition shows that covariance is equivalent to the correlation multiplied by the standard deviation of each variable. The relationship to correlation provides an intuitive explanation, as shown in Figure 3: when two variables have nonzero covariance, the region occupied by the joint distribution shrinks, allowing for better precision in the joint measurement of the entire system.
Note that the covariance of a variable with itself is the variance:
$$\mathrm{cov}(X, X) = \sigma_X^2.$$
Since the covariance is defined between pairs of variables, it is convenient to list all pairwise combinations in a matrix. For instance, given two variables X and Y, the covariance matrix is of dimensions $2 \times 2$ and is given by:
$$\Sigma = \begin{bmatrix} \sigma_X^2 & \sigma_{X,Y}^2 \\ \sigma_{X,Y}^2 & \sigma_Y^2 \end{bmatrix}.$$
In the general case of an arbitrary number of random variables, we can form the covariance matrix as follows. Firstly define a vector V containing the variables:
$$V = \begin{bmatrix} v_1 & v_2 & \cdots \end{bmatrix}^T,$$
and then the covariance matrix is
$$\Sigma_V = E\left[(V - E[V])(V - E[V])^T\right].$$
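NumPy's `np.cov` builds this matrix directly from samples (using the unbiased $N-1$ normalisation); a sketch with two correlated synthetic variables:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
x = rng.standard_normal(5000)
y = 0.5 * x + rng.standard_normal(5000)  # theoretical cov(X, Y) = 0.5

# Rows are variables, columns are observations
Sigma = np.cov(np.vstack([x, y]))

var_x, var_y = Sigma[0, 0], Sigma[1, 1]   # diagonal: variances
cov_xy = Sigma[0, 1]                      # off-diagonal: covariance
```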
Visual illustration of the covariance matrix for a 2D Gaussian distribution
Figure 4 shows a two-dimensional Gaussian with the specified correlation coefficient $\rho$ and standard deviations $\sigma_x$ and $\sigma_y$.
Error propagation
Defining error
There are two ways that we can define error:
1. Absolute error, giving a range of possible values, e.g. $v_1 = 0.5 \pm 0.1$ V, without specifying the likelihood of values within that range. The limits can be thought of as giving worst-case scenarios.
2. Error characterised through a probability distribution, for example as a normal distribution with a given standard deviation. The probability distribution gives the relative likelihood of each amount of error. We will typically use this approach.
Introduction to error propagation
Suppose that you perform a measurement and obtain a result $x$; however, this result is subject to uncertainty. You would like to use that measurement in some arbitrary calculation $f(x)$. The problem of error propagation is to calculate the uncertainty in the calculated result $f(x)$ given the uncertainty in $x$.
An intuitive picture is shown in Figure 5. The measurement $x$ has error bars given by the standard deviation $\sigma_x$. Our task is to calculate $\sigma_y$, the uncertainty in the calculated result $y = f(x)$.
Derivation of the variance formula
Let $x$ be a random variable with expected value $\bar{x}$ and variance $\sigma_x^2$. Let $y = f(x)$ be a calculated value based on $x$. Notice that $y$ is also a random variable. We seek an expression for the variance $\sigma_y^2$.
Suppose we perform a measurement and obtain the result x0. Take a Taylor series expansion of f about x0.
$$y = f(x_0) + \left.\frac{\partial f}{\partial x}\right|_{x_0} (x - x_0) + \ldots$$
The notation $\left.\frac{\partial f}{\partial x}\right|_{x_0}$ means take the derivative and then substitute $x_0$; in other words, this quantity is just a number. Let $J = \left.\frac{\partial f}{\partial x}\right|_{x_0}$ so that $y \approx f(x_0) + J(x - x_0)$.
Our goal is to find $\sigma_y^2 = E[(y - \bar{y})^2]$. We need an expression for $\bar{y}$. Taking the expected value of the linearised expression gives $\bar{y} \approx f(x_0) + J(\bar{x} - x_0)$, since $f(x_0)$ and $J$ are constants. Subtracting the two expressions, $y - \bar{y} \approx J(x - \bar{x})$, and hence
$$\sigma_y^2 = E[J^2 (x - \bar{x})^2] = J^2 \sigma_x^2,$$
or equivalently $\sigma_y = |J| \sigma_x$.
Worked example: the energy stored in a $1\ \mu\mathrm{F}$ capacitor is $W = \frac{1}{2} C V^2$. You measure the voltage to be 3.6 V, but your measurement is noisy. You know that your voltmeter's inherent noise has a standard deviation of 0.1 V. What is the standard deviation of the energy?
Solution: the energy is
$$W = \frac{1}{2}(1 \times 10^{-6})(3.6)^2 = 6.48 \times 10^{-6}\ \mathrm{J}.$$
Calculate the derivative and substitute the known values:
$$J = \frac{\partial W}{\partial V} = CV = (1 \times 10^{-6})(3.6) = 3.6 \times 10^{-6}.$$
The standard deviation of the energy is then
$$\sigma_W = |J| \sigma_V = (3.6 \times 10^{-6})(0.1) = 3.6 \times 10^{-7}\ \mathrm{J}.$$
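A Monte Carlo check of this kind of linearised propagation (a sketch; the seed and sample size are arbitrary choices): sample many noisy voltage readings, compute the energy for each, and compare the spread against the propagated value.

```python
import numpy as np

C = 1e-6         # capacitance (F)
v_mean = 3.6     # measured voltage (V)
sigma_v = 0.1    # voltmeter noise standard deviation (V)

# Linearised propagation: sigma_W = |dW/dV| * sigma_V, with dW/dV = C*V
sigma_w_linear = C * v_mean * sigma_v

# Monte Carlo: simulate noisy voltage readings and look at the spread of W
rng = np.random.default_rng(seed=4)
v = v_mean + sigma_v * rng.standard_normal(100_000)
w = 0.5 * C * v ** 2
sigma_w_mc = w.std(ddof=1)
```

The two estimates agree closely because the noise is small relative to the mean, so the linearisation is accurate.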
There is a simplification to the above method in the case that the variables are uncorrelated. The simplification (which you will prove in the tutorial questions) is as follows:
If $y = f(x_1, x_2, \ldots)$ then the uncertainty in $y$ is given by:
$$\sigma_y = \sqrt{\left(\frac{\partial f}{\partial x_1}\right)^2 \sigma_{x_1}^2 + \left(\frac{\partial f}{\partial x_2}\right)^2 \sigma_{x_2}^2 + \ldots}$$
This is sometimes called the root sum of squares (RSS) method.
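As a sketch of the RSS formula, take a hypothetical power calculation $P = VI$ with uncorrelated uncertainties on the voltage and current (the values are made up):

```python
import numpy as np

v, sigma_v = 12.0, 0.2    # voltage (V) and its standard deviation
i, sigma_i = 1.5, 0.05    # current (A) and its standard deviation

# Partial derivatives of P = V*I: dP/dV = I and dP/dI = V
sigma_p = np.sqrt((i * sigma_v) ** 2 + (v * sigma_i) ** 2)
```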
However, if the uncertainties are treated as absolute errors (instead of standard deviations) then another formula is sometimes used. This is the absolute error combination formula:
$$\sigma_y = \left|\frac{\partial f}{\partial x_1}\right| \sigma_{x_1} + \left|\frac{\partial f}{\partial x_2}\right| \sigma_{x_2} + \ldots$$
Use the absolute error method only if the uncertainties are not derived from a standard deviation.
Introduction to calibration
Calibration is the process of estimating a sensor’s transfer function. Often there is prior knowledge of the functional form but some coefficients need to be adjusted to account for manufacturing variability, change in sensor properties over time, etc.
One point calibration
One point calibration is performed using one value of the measurand. Hence it is the simplest calibration method. Its role is to estimate bias (so it can be subtracted away).
This method assumes that the sensitivity and functional form of the transfer function are correctly known, and the only issue is with bias. Specifically this method addresses miscalibrations of the form $y_\mathrm{raw} = y_\mathrm{reference} + b$, where $y_\mathrm{raw}$ is the actual measurement, $y_\mathrm{reference}$ is the sensor response in the absence of bias, and $b$ is the bias. Refer to Figure 6 for an illustration.
We calibrate the sensor by finding the bias b. If you have raw measurements obtained from your sensor, and reference measurements of the same measurand that are known to be accurate, then the bias is
$$b = E[y_\mathrm{raw}] - E[y_\mathrm{reference}].$$
The calibration for the sensor is then given by
$$y_\mathrm{calibrated} = y_\mathrm{raw} - b.$$
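A sketch of one point calibration on synthetic data (the reference value, bias, and noise level are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Raw sensor readings of a reference held at a known value, with an
# unknown constant bias (1.7 here, which the calibration should recover)
reference = 25.0
raw = reference + 1.7 + 0.1 * rng.standard_normal(200)

# Estimate the bias as the mean difference between raw and reference
b = raw.mean() - reference

# Apply the calibration
calibrated = raw - b
```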
Two point calibration
Two point calibration is appropriate for linear sensors. The idea is to measure the sensor response at two known points, and fit a straight line between the points (Figure 7).
The measurand can be determined using another sensor, or by measuring particular calibration samples whose characteristics are already known.
The calibrated transfer function is the equation of a straight line that passes through these two points:
$$y_\mathrm{calibrated} = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x - x_1) + y_1.$$
Sensor noise can be handled by repeatedly sampling at the reference points. In this case, replace $y_1$ and $y_2$ with $E[y_1]$ and $E[y_2]$.
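A sketch of two point calibration (the reference points are invented; in practice the outputs at each reference would be averages of repeated readings):

```python
# Reference points: raw sensor output x_i recorded at known true values y_i
x1, y1 = 0.12, 0.0     # raw reading at the 0 degC reference (illustrative)
x2, y2 = 98.7, 100.0   # raw reading at the 100 degC reference (illustrative)

def calibrate(x_raw):
    """Straight line through the two reference points."""
    return (y2 - y1) / (x2 - x1) * (x_raw - x1) + y1
```

By construction the fitted line passes exactly through both reference points.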
Conclusion
Systematic analysis of a sensor system requires characterisation of its measurement uncertainty. In this subject we will specify uncertainty with a covariance matrix, which represents the variance of each variable as well as information about linear correlations between variables.
The central result of this week is the method of propagating variance through calculations. Given the covariance of the measurements, $\Sigma_x$, the covariance of any calculated result $y = f(x)$, provided that $f$ is differentiable, is
$$\Sigma_y = J \Sigma_x J^T,$$
where $J$ is the Jacobian of the function $f$.
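As a sketch of the matrix form, consider converting a polar measurement $(r, \theta)$ to Cartesian coordinates (the numbers are invented):

```python
import numpy as np

# Measured range and bearing, with uncorrelated noise
r, theta = 10.0, np.pi / 4
Sigma_x = np.diag([0.1 ** 2, 0.01 ** 2])  # var(r), var(theta)

# Jacobian of f(r, theta) = (r*cos(theta), r*sin(theta))
J = np.array([
    [np.cos(theta), -r * np.sin(theta)],
    [np.sin(theta),  r * np.cos(theta)],
])

# Propagate the covariance through the (linearised) transformation
Sigma_y = J @ Sigma_x @ J.T
```

Note that even though the polar measurements are uncorrelated, the Cartesian covariance matrix can acquire off-diagonal terms in general; the output covariance depends on the geometry through the Jacobian.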
References
Clarence W. de Silva, Sensor Systems: Fundamentals and Applications, CRC Press, 2017, Sections 6.4, 6.5 and Appendix C.