EE3901/EE5901 Sensor Technologies Chapter 1 Notes
Introduction to Sensors and Measurement Uncertainty
The goal of this chapter is to learn the basic concepts of how to characterise sensors. This chapter is heavy on definitions. As a study tool, you are strongly encouraged to write your own glossary of terms that gives concise definitions of each of the concepts introduced here. We review and expand on your knowledge of probability and statistics, and apply this to the problem of measurement. We discuss how to propagate measurement error through calculations. We also briefly introduce calibration.
Introduction to sensors
A sensor is a device that measures some property of its environment. Examples of properties that we can measure include:
- Force
- Distance
- Speed
- Sound pressure level (loudness of audio)
- Chemical composition
- … and many more.
Discussion Question: What other types of sensors can you think of?
All sensors are energy converters. Every measurement involves a transfer of energy. For example, a photodetector absorbs light energy and converts it into electric current. In this example, there is a continual transfer of energy from the light source to the sensor. However, in other cases, there may be an equilibrium condition where the net energy flow is zero. For example, consider a temperature sensor that is initially at room temperature and is then placed into a hot furnace. The temperature sensor must initially absorb heat from the furnace, which is a transfer of energy. Eventually the sensor comes to the same temperature as its new surroundings, at which point an equilibrium has been reached and there is no net flow of energy.
In this subject, we will focus on sensors that respond with an electrical signal as opposed to any other kind of response (e.g. optical, chemical, thermal, etc). For our purposes, an electrical signal is a voltage, current, charge, resistance, capacitance, or inductance. We will learn how to design interface circuits to make these electrical signals accessible to downstream electronics such as an analog-to-digital converter. The goal of these interface circuits is to amplify, convert, transmit, and ultimately make this electrical signal available in a more convenient and robust manner for use by another circuit.
Measurements
When discussing measurement, it is important to distinguish between two related quantities:
- The measurand is the physical property that we seek to observe, for example, temperature, pressure, flow rate, etc. The measurand is the “true value” that exists in the world. When we perform a measurement, we are trying to learn some information about this true value.
- The measurement is the result of using a sensor to observe the measurand. The measurement is always subject to some uncertainty. Hence we should always think of the measurement as merely being an estimate of the measurand.
If the measurand is denoted
The transfer function
The transfer function is the mathematical relationship between the sensor stimulus
The transfer function is used to convert the sensor response into the actual measurement, e.g.
Example of a transfer function
A thermocouple is a type of temperature sensor which produces a voltage
where
You could solve this equation (using the quadratic formula) to obtain the measured temperature
Digital vs analog sensors
An analog sensor outputs a continuous electrical quantity such as voltage, current, resistance or capacitance. A digital sensor always chooses one output state at a time from a fixed set of possibilities. Very often this will be a signed or unsigned integer of a given number of bits. Recall that an n-bit integer has 2^n possible values.
Sensor characteristics
Sensitivity
Figure 1: An example transfer function showing a region of higher sensitivity and a region of lower sensitivity.
Sensitivity is how large the response is for a given change in the measurand. It is the slope of the transfer function, as shown in Figure 1. Precisely, given a transfer function
where
The units of sensitivity will be (electrical quantity) per (physical quantity). For example:
- A displacement sensor may have a sensitivity of 10 V/mm.
- A pressure sensor may have a sensitivity of 80 mV/kPa.
For digital sensors, the sensitivity relates to “counts” per physical quantity or “least significant bits” (LSB) per physical quantity e.g. a light sensor with 50 counts/lux. You will also see this written as 50 digits/lux or 50 LSB/lux. This means the digital sensor’s output integer rises by 50 for every increase of 1 lux.
Resolution
The resolution is the smallest change in measurand stimulus that can be accurately detected.
For analog sensors, this is often limited by noise. A small change may be undetectable if it is indistinguishable from noise.
For digital sensors, assuming no problems of noise, the resolution will be defined by the sensitivity. Specifically, resolution = 1 / sensitivity.
The sensitivity is how many LSBs there are per unit of measurand; the resolution is how much measurand there is per LSB.
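To make the reciprocal relationship concrete, here is a minimal Python sketch using the 50 counts/lux light sensor from the sensitivity discussion. The function names and the raw reading of 1234 counts are hypothetical, and a linear transfer function with zero offset is assumed.

```python
def resolution_from_sensitivity(sensitivity_counts_per_unit):
    """Resolution (measurand per LSB) of an ideal, noise-free digital sensor."""
    return 1.0 / sensitivity_counts_per_unit


def counts_to_measurand(counts, sensitivity_counts_per_unit):
    """Convert a raw integer reading into physical units.

    Assumes a linear transfer function with zero offset (an assumption
    made for this sketch only).
    """
    return counts / sensitivity_counts_per_unit


sensitivity = 50.0  # counts per lux, from the light sensor example
resolution = resolution_from_sensitivity(sensitivity)  # lux per count
reading = counts_to_measurand(1234, sensitivity)  # hypothetical raw output

print(f"resolution = {resolution} lux/count")
print(f"reading    = {reading} lux")
```

With these numbers, one count corresponds to 0.02 lux, so no change smaller than 0.02 lux can be distinguished even in the absence of noise.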
Example 1.1
Consider a digital accelerometer. Accelerometer specifications are often referenced to the typical strength of Earth’s gravity,
Span or range
Figure 2: The meaning of full-span (FS) and full-span output (FSO), as shown on a sketch of a transfer function.
The full span is the range of measurands that can be accepted by the sensor. The full span is also called the range of a sensor. This property is illustrated in Figure 2.
Often the maximum of the range is caused by physical limits of the underlying sensor, e.g. a pressure sensor will have a maximum rated pressure that it can withstand. The minimum value is often limited by the sensor’s resolution, but could also be affected by physical limits, for example, a temperature sensor may not operate at extremely low temperatures.
Error, bias, precision and accuracy
A measurement error is when the measured value differs from the measurand. In practice there is always some degree of measurement error because no sensor system can ever be perfect.
There are several ways to define error. In simplest terms, the error is the raw difference between the measured and true value:
The error has the same units as the measurement. A positive error means that the measurement is too large, and a negative error means that the measurement is too small.
The absolute error is the absolute value of the error:
The relative error is scaled in proportion to the magnitude of the true value:
The relative error is dimensionless, and is often represented as a percentage. In this case it can be called a percentage error. Some authors will take the absolute value of the relative error so that it is always a positive number.
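The three definitions of error can be sketched in a few lines of Python. The function names and the numbers are illustrative assumptions, not values from these notes.

```python
def error(measured, true_value):
    """Signed error: positive means the measurement is too large."""
    return measured - true_value


def absolute_error(measured, true_value):
    """Magnitude of the error, in the same units as the measurement."""
    return abs(measured - true_value)


def relative_error(measured, true_value):
    """Signed error scaled by the true value (dimensionless)."""
    return (measured - true_value) / true_value


# Hypothetical example: a true value of 100.0 units measured as 101.5 units.
measured, true_value = 101.5, 100.0
e = error(measured, true_value)
e_abs = absolute_error(measured, true_value)
e_rel = relative_error(measured, true_value)

print(f"error = {e}, absolute error = {e_abs}, relative error = {e_rel:.1%}")
```

Note that the relative error is undefined when the true value is zero, which is one reason the raw error is often preferred near the bottom of a sensor's range.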
The error is a property of a single specific measurement. However, it is often useful to “zoom out” and discuss the statistical properties of measurement errors that occur over multiple uses of the sensor. Let us define measurement as a probabilistic process. For instance, let
represent the probability of measuring
Figure 3: The probability distribution of obtaining a certain measurement given a particular true measurand. For the purposes of explaining the different definitions of accuracy, a single measurement is also indicated with a green circle. Labelled here are the key definitions related to measurement error and uncertainty.
We define the key properties as follows.
Bias, also called trueness, is a constant offset in the sensor response. In the probabilistic interpretation of measurement,
where
Suppose you have a sensor with a simple linear transfer function of the form
Precision is how much information is gained from a single measurement. If there is a lot of measurement noise then a single measurement would have a wide range of uncertainty, and would be less precise. A formal definition is
The key point is that a precise sensor always gives the same result, regardless of whether that result is true.
However, the word ‘precision’ is also used to refer to the resolution of a measurement, especially when specifying the number of significant figures. It is important to be aware of the difference between limited resolution and random measurement noise.
The term accuracy is often used when discussing measurement. Unfortunately, there are several different definitions of accuracy, and its meaning is not always clear. Probably the most common definition of accuracy is that it means the same thing as ‘bias’. In this definition, an accurate sensor is one where the expected value of the measurement is close to the measurand. Notably, the sensor can be accurate (have low bias) even if it is not precise (i.e. not repeatable).
Another common meaning of ‘accuracy’ is as a qualitative label of how ‘good’ a sensor is. In this case, both bias and precision are important.
Finally, ‘accuracy’ is also sometimes used to discuss a specific measurement (as opposed to the entire sensor). In this case accuracy can be synonymous with ‘error’.
Noise
Noise is an unwanted signal that interferes with a measurement. The most common mathematical model of noise that we will consider is additive white Gaussian noise (AWGN).
Define
where
Importantly, the distribution of the noise has zero mean. If you find this objectionable, consider the following. Suppose that there were a physical process that resulted in the additive noise term having a non-zero mean. Such a process could be treated as a bias and simply subtracted away. Once the bias is subtracted off, the remaining component necessarily has zero mean. Hence we can always apply a calibration process such that the only remaining noise is that with zero mean. Overall, the purpose of the noise model is to account for physical processes that cannot be removed by calibration.
The terminology “additive white Gaussian noise” indicates several essential properties of this type of noise:
- It is “additive”, meaning that the noise is mathematically added to the underlying signal, and hence linear relationships are preserved.
- It is “white”, meaning that it has no frequency dependence. In other words, it has a flat power spectrum across the entire measurement bandwidth. This implies that the noise cannot be easily filtered out with a suitable high-pass or low-pass filter.
- It is “Gaussian”, meaning that in the time domain, its values are sampled from a Gaussian (aka normal) distribution.
Example 1.2
Use the definition of root-mean-square (RMS) to prove that the RMS intensity of Gaussian noise is equal to its standard deviation.
Hint: RMS intensity, in the limit of large numbers of samples, is defined to be
and the standard deviation is defined to be
Solution
Let
Since the noise has zero mean, we have
We recognise this as the standard deviation, and therefore
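The result can also be checked numerically. The sketch below draws zero-mean Gaussian samples and compares the RMS value with the standard deviation. The standard deviation of 0.5, the sample count, and the random seed are arbitrary choices for this illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

sigma = 0.5  # hypothetical noise standard deviation
noise = [random.gauss(0.0, sigma) for _ in range(100_000)]

# RMS: square root of the mean of the squared samples.
rms = math.sqrt(sum(x * x for x in noise) / len(noise))

# Standard deviation: same calculation, but about the sample mean.
std = statistics.pstdev(noise)

print(f"RMS = {rms:.4f}, standard deviation = {std:.4f}")
```

The two values agree closely because the sample mean of zero-mean noise is nearly zero. If the noise had a non-zero mean, the RMS would be larger than the standard deviation.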
Signal to noise ratio (SNR) and dynamic range (DR)
The signal to noise ratio (SNR) is the ratio of the power in the signal to the power in the noise. Generally, if the SNR is large then obtaining high precision measurements is straightforward, whereas if the SNR is low then the measurement becomes more challenging and more subject to uncertainty.
The SNR is often expressed in decibels:
where
It is common to plot the transfer function on log-log axes when analysing signal to noise ratios. This is because of the algebraic identity
Figure 4: When the transfer function is plotted on log-log axes, the SNR and dynamic range have a simple geometric interpretation. Both are proportional to the distances indicated on this plot.
The dynamic range (DR) is the ratio between the largest and smallest values of the measurement:
again where P is power and M is RMS magnitude.
In cases where the minimum is 0 (e.g. absolute scales like light intensity, sound pressure, etc), the lower value is the noise floor. In this case the DR is simply the SNR for the largest possible measurand:
For reference, human hearing has a DR of roughly 140 dB, and human eyesight has a DR of roughly 90 dB.
Example 1.3
For a given operating condition, a sensor outputs a constant voltage of 100 mV. However, the measurement is corrupted by additive Gaussian noise with a standard deviation of 18 μV. Find the SNR.
Solution
Recall from Example 1.2 that the standard deviation of AWGN is equal to its RMS magnitude. Hence the magnitude of the noise floor is 18 μV.
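As a numerical check, the calculation can be sketched as follows for the 100 mV signal and 18 μV noise of this example. This assumes the decibel form of the SNR based on RMS magnitudes, i.e. 20 log10 of the magnitude ratio, which is equivalent to 10 log10 of the power ratio. The function name is hypothetical.

```python
import math


def snr_db_from_rms(signal_rms, noise_rms):
    """SNR in decibels from RMS magnitudes.

    The factor of 20 arises because power is proportional to the square
    of the magnitude: 10*log10(M_s**2 / M_n**2) = 20*log10(M_s / M_n).
    """
    return 20.0 * math.log10(signal_rms / noise_rms)


signal = 100e-3  # 100 mV constant sensor output
noise = 18e-6    # 18 uV RMS noise (equal to its standard deviation)

snr_db = snr_db_from_rms(signal, noise)
print(f"SNR = {snr_db:.1f} dB")  # approximately 74.9 dB
```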
Hysteresis
Figure 5: The impact of hysteresis is that the transfer function is different when sweeping up vs sweeping down the range of measurands. The effect here is exaggerated for educational purposes. A common test for hysteresis is to vary the measurand at a fixed sweep rate in both positive and negative directions and plot the two measured transfer functions on the same axes for comparison.
Hysteresis is a history dependence, meaning that different measurement results can be obtained despite the measurand being the same. Specifically the measurement depends upon the recent history of the sensor (as shown in Figure 5).
There are several reasons for hysteresis, for instance:
- Slow response time, so that the output is a weighted average over its recent history.
- Temperature changes in a sensing element.
- Backlash in gears, for instance in a rotation sensor that uses gears to couple to the axle being monitored. When the direction of rotation changes, there will be a small amount of movement in the new direction before the gear teeth ‘bite.’ This is called backlash, and will cause the rotation sensor to display hysteresis because the position of the sensor will be slightly offset depending upon the direction of rotation.
- Other chemical or physical properties changing in the sensing element in response to the environment being measured.
Hysteresis is sometimes deliberately introduced to avoid rapid switching near a threshold. A circuit called a Schmitt trigger is sometimes used for this purpose.
Response time
Sensors do not respond immediately to an input stimulus.
The response time or rise time is the time required to reach a given threshold, typically 90% of the final value. If the sensor’s underlying physics results in an exponential response (
Bandwidth
Figure 6: The bandwidth of a sensor is the frequency at which the output power has dropped by half from its DC (low frequency) value.
If the measurand is time-varying, then it is important to consider whether the sensor can respond quickly enough to keep up with the system being measured. The ability of a sensor to track a changing measurand is determined by its bandwidth.
To formally define bandwidth, we consider the frequency response of the sensor system. Much like you can analyse the frequency response of a circuit such as a low-pass filter, so too can you define the frequency response of a sensor. The frequency response of a sensor gives the measurement response when the measurand input is a sinusoid of a given frequency. The frequency response of a typical sensor system is sketched in Figure 6.
Sensors typically have low-pass characteristics, i.e. there is some maximum frequency at which the output voltage or output current starts to drop. This defines the bandwidth of the sensor. Precisely, we define the bandwidth to be the frequency at which the output power has dropped by half. Recall that half power corresponds to a drop of approximately 3 dB.
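For illustration, suppose the sensor behaves like a first-order low-pass system. This is an assumption made only for this sketch, as is the cutoff frequency of 1 kHz. Under that model the normalised power response is 1 / (1 + (f/f_c)^2), and the half-power point falls exactly at the cutoff frequency.

```python
import math


def first_order_power_gain(f, f_c):
    """Normalised power response |H(f)|^2 of a first-order low-pass system."""
    return 1.0 / (1.0 + (f / f_c) ** 2)


f_c = 1_000.0  # hypothetical cutoff frequency in Hz

gain_at_cutoff = first_order_power_gain(f_c, f_c)
drop_db = 10.0 * math.log10(gain_at_cutoff)

print(f"power gain at cutoff = {gain_at_cutoff}")
print(f"drop at cutoff       = {drop_db:.2f} dB")
```

The drop at the cutoff is about -3.01 dB, which is why the bandwidth is often called the "3 dB point".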
Measurement is a statistical estimation problem
Recall that the central problem of sensing is to use measurements to obtain information about the measurand. This is fundamentally a question of statistical estimation. If the measurand is not changing, then each measurement is a random variable drawn from the same distribution. The randomness represents the measurement noise. (We will consider the case of a time-varying measurand in the next chapter.)
Review of probability and statistics
We will typically use a Bayesian interpretation of probability, i.e. probability represents a degree of belief. The probability reflects the degree to which a statement is supported by the available evidence.
Expected value
The expected value is the probability-weighted average of all possible outcomes. In the context of measurement, it is the value that we believe best represents the true measurand based upon the available evidence.
Mathematically, it is defined as follows. Let
Adjust the limits of integration if the probability density is defined over a different range. We will often write the expected value with a bar, e.g.
Sometimes we want the expected value of some calculated result instead of the expected value of the measurement itself. In this case the expected value of some function of X is given by:
For a finite sample of measurements, the expected value is just the average computed in the usual way:
Variance and standard deviation
The variance represents how far samples are from the mean. There are two related quantities:
The definition of variance is:
Translated into words, this indicates the “average of the squared distance from the mean”.
For an entire population the variance can be calculated using
Note that the above definition applies to an entire population. Most often in statistics we deal with a finite sample drawn from the larger population, in which case the correct (“unbiased”) estimator of variance is
The proof of this estimator is outside the scope of this class, so refer to a statistics book for more details. The idea of the proof is to treat
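The difference between the population formula (dividing by N) and the unbiased sample estimator (dividing by N - 1) can be seen with Python's standard `statistics` module. The five readings below are hypothetical repeated measurements invented for this sketch.

```python
import statistics

# Hypothetical repeated measurements of a constant measurand.
samples = [9.8, 10.1, 10.0, 10.3, 9.9]

mean = statistics.fmean(samples)

# Population variance: divides the sum of squared deviations by N.
pop_var = statistics.pvariance(samples)

# Sample ("unbiased") variance: divides by N - 1 instead.
sample_var = statistics.variance(samples)

print(f"mean            = {mean:.3f}")
print(f"population var  = {pop_var:.4f}")
print(f"sample var      = {sample_var:.4f}")
print(f"sample std dev  = {statistics.stdev(samples):.4f}")
```

The sample variance is always larger than the population formula applied to the same data, by a factor of N / (N - 1). The difference becomes negligible for large N.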
Visual illustration of the mean and standard deviation
The impact of the mean and standard deviation can be explored using interactive Figure 7. This figure plots the normal distribution (also called the Gaussian distribution), which has probability density function
where
Drag the sliders to adjust the mean and standard deviation.
Figure 7: The probability density function of a one dimensional Gaussian distribution, for various values of mean and standard deviation.
Correlation
Correlation is the tendency of two variables to exhibit a linear relationship. Mathematically:
Correlation is always in the range [-1, 1].
It is best explained visually (Figure 8). Drag the slider to see different levels of correlation.
Random variables
Covariance
Covariance is similar to correlation but not normalised. While correlation is a dimensionless number between
Let
Equation (25) shows that covariance is equivalent to the correlation multiplied by the standard deviation of each variable. The relationship to correlation provides an intuitive explanation, as shown in Figure 9. When two variables have nonzero covariance, the area of overlap is reduced, overall allowing for better precision in the joint measurement of the entire system.
Storing information about covariance allows for more precise understanding of interactions between variables. The two variables
Note that the covariance of a variable with itself is the variance:
Since the covariance is defined between pairs of variables, it is convenient to list all pairwise combinations in a matrix. For instance, given two variables
This matrix is symmetric because
In the general case of an arbitrary number of random variables, we can form the covariance matrix as follows. Firstly define a vector
and then the covariance matrix is
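A covariance matrix can be estimated from sampled data as sketched below. The helper names and the two data series are hypothetical, and the N - 1 normalisation is assumed because the data are treated as a finite sample. The sketch also recovers the correlation coefficient as the covariance divided by the product of the standard deviations.

```python
import math
import statistics


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (N - 1 normalisation)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def covariance_matrix(variables):
    """Covariance matrix (list of lists) for a list of equal-length samples."""
    return [[covariance(a, b) for b in variables] for a in variables]


# Hypothetical paired observations of two variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.7]

cov = covariance_matrix([x, y])

# Correlation = covariance normalised by both standard deviations.
rho = cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])

for row in cov:
    print(row)
print(f"correlation = {rho:.4f}")
```

The diagonal entries are the variances of each variable, and the matrix is symmetric because the covariance does not depend on the order of its arguments.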
Visual illustration of the covariance matrix for a 2D Gaussian distribution
Figure 10 shows a two-dimensional Gaussian with the specified correlation coefficient
Figure 10: Two-dimensional Gaussian distribution with the specified covariance matrix. Experiment with this figure to gain an understanding of the meaning of correlation, variance, and covariance.
Error propagation
Defining error
There are two ways that we can define error:
- Absolute error giving a range of possible values, e.g.
V, without specifying the likelihood of values within that range. The limits can be thought of as giving worst case scenarios.
- Error characterised through a probability distribution, for example, as a normal distribution with a given standard deviation. The probability distribution gives the relative likelihood of each amount of error. We will typically use this approach.
Introduction to error propagation
Suppose that you perform a measurement and obtain a result
An intuitive picture is shown in Figure 11. The measurement
An intuitive explanation of why error propagation depends upon the derivative of the function. Notice that the error bars on
Derivation of the variance formula
Let
Suppose we perform a measurement and obtain the result
The notation
Our goal is to find
Consequently,
Therefore
This is the formula for propagating variance.
Example of variance propagation
The energy stored in a 1 μF capacitor is
Solution: the energy is
Calculate the derivative and substitute the known values:
Now use the variance propagation law
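The first-order variance propagation law, in which the variance of the result is approximately the squared derivative times the variance of the input, can be checked against a Monte Carlo simulation. The sketch below uses the capacitor energy E = (1/2) C V^2 with C = 1 μF. The voltage of 5 V and its standard deviation of 0.1 V are hypothetical values chosen for this sketch, not the values of the example above.

```python
import math
import random
import statistics

C = 1e-6        # capacitance in farads (1 uF)
V_mean = 5.0    # hypothetical measured voltage in volts
V_std = 0.1     # hypothetical standard deviation of the voltage in volts


def energy(v):
    """Energy stored in the capacitor, E = 0.5 * C * V^2."""
    return 0.5 * C * v ** 2


# First-order propagation: var(E) ~ (dE/dV)^2 * var(V), with dE/dV = C * V.
dE_dV = C * V_mean
sigma_E_linear = math.sqrt(dE_dV ** 2 * V_std ** 2)

# Monte Carlo check: sample the voltage and look at the spread of the energy.
random.seed(1)
energies = [energy(random.gauss(V_mean, V_std)) for _ in range(100_000)]
sigma_E_mc = statistics.pstdev(energies)

print(f"E              = {energy(V_mean) * 1e6:.2f} uJ")
print(f"sigma (linear) = {sigma_E_linear * 1e6:.3f} uJ")
print(f"sigma (MC)     = {sigma_E_mc * 1e6:.3f} uJ")
```

The two estimates agree closely here because the uncertainty is small compared with the voltage itself, so the function is nearly linear over the range of likely values. For larger relative uncertainties the linear approximation becomes less reliable.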
Variance propagation with multiple variables
Let
Represent these variables as a column vector
Let the expected values be
Let the covariance matrix of these variables be
Define some calculation
If
Derivation
Let
Define the Jacobian as:
where the measurement result
Then the Taylor series expansion of
Notice that the matrix product
Proceeding as above, we find
Hence
Hence the covariance matrix is
To summarise, the variance propagation law for a multivariable function
where
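The multivariable law, in which the output covariance is the matrix product of the Jacobian, the input covariance matrix, and the Jacobian transposed, can be sketched without any external libraries. The example function P = V × I (electrical power), the operating point, and the covariance values are all hypothetical choices for this illustration, as are the helper names.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def propagate_covariance(jacobian, cov_x):
    """First-order propagation: cov_y = J * cov_x * J^T."""
    return matmul(matmul(jacobian, cov_x), transpose(jacobian))


# Hypothetical measurement: V = 10 V (std 0.2 V), I = 2 A (std 0.01 A).
V, I = 10.0, 2.0
cov_uncorrelated = [[0.2 ** 2, 0.0],
                    [0.0, 0.01 ** 2]]

# Jacobian of P = V * I evaluated at the measured values: [dP/dV, dP/dI].
J = [[I, V]]

var_P = propagate_covariance(J, cov_uncorrelated)[0][0]

# The same calculation with a hypothetical covariance of 0.001 between V and I.
cov_correlated = [[0.2 ** 2, 0.001],
                  [0.001, 0.01 ** 2]]
var_P_corr = propagate_covariance(J, cov_correlated)[0][0]

print(f"var(P), uncorrelated = {var_P:.4f} W^2")
print(f"var(P), correlated   = {var_P_corr:.4f} W^2")
```

A positive covariance between the inputs increases the variance of this particular result, because both partial derivatives are positive. The off-diagonal terms therefore cannot be ignored unless the variables are known to be uncorrelated.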
Example of multivariate error propagation
Given the function
find the standard deviation in
Solution
The state vector is
The covariance matrix is
Given that we have
Hence the variance in
Root sum of squares error vs absolute error
There is a simplification to the above method in the case that the variables are uncorrelated. The simplification (which you will prove in the tutorial questions) is as follows:
If
This is sometimes called the root sum of squares (RSS) method.
However, if the uncertainties are treated as absolute errors (instead of standard deviations) then another formula is sometimes used. This is the absolute error combination formula:
Use the absolute error method only if the uncertainties are not derived from a standard deviation.
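The two combination rules can be compared side by side. This sketch assumes the usual forms: the RSS method takes the square root of the sum of squared (partial derivative × uncertainty) terms, while the absolute error method sums the magnitudes of those terms. The partial derivatives and uncertainties are hypothetical numbers, and the function names are not from these notes.

```python
import math


def rss_error(partials, sigmas):
    """Root sum of squares: for uncorrelated standard deviations."""
    return math.sqrt(sum((p * s) ** 2 for p, s in zip(partials, sigmas)))


def absolute_error_bound(partials, deltas):
    """Worst-case combination: for absolute error limits."""
    return sum(abs(p) * d for p, d in zip(partials, deltas))


# Hypothetical partial derivatives and input uncertainties.
partials = [2.0, 10.0]
uncertainties = [0.2, 0.01]

rss = rss_error(partials, uncertainties)
bound = absolute_error_bound(partials, uncertainties)

print(f"RSS combination      = {rss:.4f}")
print(f"absolute combination = {bound:.4f}")
```

The absolute combination is never smaller than the RSS result, which reflects its interpretation as a worst case in which every error takes its extreme value at the same time.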
Introduction to calibration
Calibration is the process of estimating a sensor’s transfer function. Often there is prior knowledge of the functional form but some coefficients need to be adjusted to account for manufacturing variability, change in sensor properties over time, etc.
One point calibration
One point calibration is performed using one value of the measurand. Hence it is the simplest calibration method. Its role is to estimate bias (so it can be subtracted away).
Figure 12: One point calibration uses a single point to estimate sensor bias.
This method assumes that the sensitivity and functional form of the transfer function are correctly known, and the only issue is with bias. Specifically this method addresses miscalibrations of the form
We calibrate the sensor by finding the bias
The calibration for the sensor is then given by
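A one point calibration can be sketched as follows, assuming the bias is expressed in the same units as the measurement and is simply subtracted from every later reading. The function names and the thermometer readings are hypothetical.

```python
def estimate_bias(raw_reading, reference_value):
    """Bias = what the sensor reported minus the known reference value."""
    return raw_reading - reference_value


def apply_one_point_calibration(raw_reading, bias):
    """Correct a later reading by subtracting the estimated bias."""
    return raw_reading - bias


# Hypothetical example: a thermometer reads 0.7 degC in an ice bath,
# where the reference value is known to be 0.0 degC.
bias = estimate_bias(raw_reading=0.7, reference_value=0.0)

# A later raw reading of 25.9 degC is corrected using that bias.
corrected = apply_one_point_calibration(25.9, bias)

print(f"estimated bias    = {bias} degC")
print(f"corrected reading = {corrected:.1f} degC")
```

This correction is only valid if the sensitivity is already correct. An error in the slope of the transfer function would not be removed by subtracting a constant.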
Two point calibration
Two point calibration is appropriate for linear sensors. The idea is to measure the sensor response at two known points, and fit a straight line between the points (Figure 13).
Figure 13: Two point calibration fits a linear transfer function between two known points.
The measurand can be determined using another sensor, or by measuring particular calibration samples whose characteristics are already known.
The calibrated transfer function is the equation of a straight line that passes through these two points:
Sensor noise can be handled by repeatedly sampling at the reference points. In this case, replace
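A two point calibration, including averaging of repeated readings at each reference point to reduce the effect of noise, can be sketched as follows. The function name, the reference temperatures, and the voltage readings are hypothetical.

```python
import statistics


def two_point_calibration(r1, m1, r2, m2):
    """Return a function mapping sensor response to measurand.

    The returned function is the straight line passing through the
    points (r1, m1) and (r2, m2), where r is the sensor response and
    m is the known measurand.
    """
    if r1 == r2:
        raise ValueError("the two calibration responses must differ")
    slope = (m2 - m1) / (r2 - r1)

    def calibrated(r):
        return m1 + slope * (r - r1)

    return calibrated


# Hypothetical repeated readings (volts) at two known temperatures (degC).
readings_at_0 = [0.101, 0.099, 0.100]
readings_at_100 = [1.098, 1.102, 1.100]

# Average the repeated readings to suppress zero-mean noise.
r1 = statistics.fmean(readings_at_0)
r2 = statistics.fmean(readings_at_100)

to_temperature = two_point_calibration(r1, 0.0, r2, 100.0)

print(f"0.6 V corresponds to {to_temperature(0.6):.1f} degC")
```

Choosing reference points near the two ends of the range of interest generally makes the fitted line less sensitive to noise in the individual readings.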
Conclusion
In this chapter, we have introduced the key specifications and performance characteristics of sensor systems. It is important that you familiarise yourself with this terminology because it will be used in future weeks, especially when we compare and contrast different sensor types in the latter part of the course. We also pointed out the essential fact that every measurement is an estimation of the true state of the world. We have formalised our discussion of measurement uncertainty, and learned how to mathematically analyse the impact of measurement error on the calculations that we perform. Systematic analysis of a sensor system requires characterisation of its measurement uncertainty. In this subject we will specify uncertainty with a covariance matrix, which represents the variance of each variable as well as information about linear correlations between variables.
The central result of this week is the method of propagating variance through calculations. Given the covariance of the measurements (
where
References
Jacob Fraden, Handbook of Modern Sensors: Physics, Designs, and Applications, 5th edition, Springer, 2016, Chapter 3.
Clarence W. de Silva, Sensor Systems: Fundamentals and Applications, CRC Press, 2017, Chapter 5, Sections 6.4, 6.5 and Appendix C.