Measurement and Uncertainty
In the Parallax lab, you were asked to estimate the uncertainty in your
results, from your guesses about the sources of error in the measurement.
This lab explores, in a little more detail, how one can express the
level of confidence associated with a scientific measurement.
Introduction
All of our knowledge of the physical world is obtained from our
observations of that world,
and from extensions to those observations
that allow us to predict physical behavior. These extensions
are called theory, and theories must be testable
by measurements of predicted behavior.
So the process is a circle, from measurement to theory to
prediction to measurement again, which leads to improvement
in the theory, new predictions and tests. Each loop
increases our understanding, so it is perhaps useful to
think of the process more as a spiral, where every time
around we make a little progress.
For example: a simple observation is that the Sun "goes down"
sometimes—but, after giving us time for a nap, it
"comes up" again on the other side of the house.
A theory based on several such observations
would be something like "every time the Sun goes down, it will
come back up on the other side of the world." This can be
tested by watching. Better observations would include
measurements of the times between sunset and sunrise,
using some sort of clock, and measurements of the time that
the sun is "up." The theory could then be improved to provide
predictions of when the Sun would rise and set. You can see how
this understanding can get more precise, as we make measurements
from various locations on Earth, and at various times of "year."
Hmm, what is a "year"? This observation, that the days get
longer, then shorter, and that this cycle repeats too, can lead
to a whole new theory.
Crucial to this process is the understanding that no measurement
is exact: there is always some uncertainty in a measurement,
and a statement of the result of a measurement is incomplete
without a statement of its uncertainty. In fact, as we have
already seen, it is usually harder to establish the uncertainty
of a measurement than to make the measurement itself. Some parts
of the uncertainty have to do with the quality of the measurement
tools and our use of these tools; these effects can be explored
and quantified by making a set of repeated measurements. Other
sources of uncertainty have more to do with our understanding
of the measurement technique, and can be very difficult to evaluate.
The words "uncertainty" and "error" are used interchangeably
in this context. This is a misuse of the word "error", because
we never mean "mistake". Mistakes, like confusing the inches
scale with the cm scale, can be avoided and need to be rectified
before determining the experimental error.
First, we need a few definitions:
- Accuracy: This is the amount by which your measurement is
in fact different from the true value. In any interesting
situation, you will not know the "true" value, so there will be
no way to absolutely establish the measurement's accuracy.
- Precision: This is the extent to which you can specify
the exactness of a measurement. For example, a measurement given
as 12.14 +/- 0.01 cm is more precise than one given as 12 +/- 1 cm.
Higher precision does not necessarily imply higher accuracy.
- Statistical (random) Uncertainty: This is the inescapable
fact that every time you repeat a measurement, you will get
a slightly different value. The values will be distributed
about the mean (average) value, and the way they are distributed
can be used to establish the statistical uncertainty of the measurement.
We will explore techniques for handling statistical uncertainty
(random errors) in this lab.
- Systematic Uncertainty: Everything else that causes
your measurement to lose accuracy. This includes instrumental effects,
effects you have not taken into account, and gross (stupid) errors.
Significant figures
There are two conventions that are used to communicate the
confidence level (or lack of confidence, the uncertainty)
in a measurement. First, the result of the measurement is
always accompanied by an explicitly stated value for the
uncertainty. The usual form is 12 +/- 1, where the characters "+/-"
are read "plus or minus" and are used here to cope with
the limitations of HTML. Secondly, the number of digits
used to express the result is chosen to properly reflect its
uncertainty. If your measurement gives the value 12.33, but
you believe the uncertainty is about 0.1, you must write it
as 12.3 +/- 0.1—that is, don't include digits whose
magnitude is smaller than the uncertainty.
Zeros can cause confusion. Leading zeros are not significant
figures: 0.00004 has one significant figure. Trailing zeros
without a decimal point may or may not be significant: 400 may
have one, two or three significant digits. This can be cleared
up by an explicit statement of the uncertainty (e.g. +/- 10),
or by putting in a decimal point: 400. is conventionally
taken to mean that there are three significant digits.
Expressing the number in powers of ten notation makes it easier
to tell which zeros are significant: 4 x 10^2 has
one significant figure; 4.00 x 10^2 has three.
The uncertainty is usually expressed as a single
digit (sometimes two), of the
same order of magnitude (i.e. same decimal place) as the last
significant digit of the value. For example:
8.45 +/- 0.03
10.0 +/- 1.5
5 +/- 2
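The rounding convention above can be sketched in a few lines of Python. This is only an illustration, not part of the lab procedure: the function name format_measurement and its sig_digits parameter are invented for this example, and the sketch assumes the uncertainty is a positive number.

```python
def format_measurement(value, uncertainty, sig_digits=1):
    """Round the uncertainty to sig_digits significant digits and the
    value to the same decimal place, then return 'value +/- uncertainty'.
    (Hypothetical helper, written only to illustrate the convention.)"""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Scientific notation gives the power of ten of the leading digit,
    # already accounting for rounding (0.096 is treated as 1e-01).
    exponent = int(f"{uncertainty:.{sig_digits - 1}e}".split("e")[1])
    # Number of decimal places to keep (negative means round to tens, hundreds, ...).
    decimals = sig_digits - 1 - exponent
    rounded_unc = round(uncertainty, decimals)
    rounded_val = round(value, decimals)
    places = max(decimals, 0)
    return f"{rounded_val:.{places}f} +/- {rounded_unc:.{places}f}"


print(format_measurement(12.33, 0.1))      # 12.3 +/- 0.1
print(format_measurement(8.4512, 0.0312))  # 8.45 +/- 0.03
```

The default of one significant digit in the uncertainty follows the "usually a single digit" rule; pass sig_digits=2 for the "sometimes two" case.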
The following numbers are all incorrectly written: write the
correct expression next to them.
83.45 +/- 0.023815
100.0 +/- 2
5 +/- 0.5
0.00034 +/- 0.0001
When you are combining numbers to get a result, as we did for
the parallax measurements, you can keep an extra digit for
the computations to avoid rounding errors. Your calculator keeps
lots of extra digits, of course. But be sure to trim off the
meaningless digits when you express the result. In particular,
converting from one unit to another does not change the uncertainty:
if you measure a length to be 15.5 +/- 0.5 feet and want to convert
it to cm, the value should be written as something like 470 +/- 15 cm
(0.5 feet is about 15 cm), even though the calculator says 472.44.
Notice that practically every single container in the grocery
store gets this wrong when they convert from fluid ounces to ml.
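As a quick check of the feet-to-cm example, here is a minimal Python sketch. The helper name feet_to_cm is made up for this illustration; the only outside fact used is that one foot is exactly 30.48 cm.

```python
CM_PER_FOOT = 30.48  # exact, by definition of the international foot


def feet_to_cm(value_ft, uncertainty_ft):
    """Convert a length and its uncertainty from feet to cm.
    Both are scaled by the same factor."""
    return value_ft * CM_PER_FOOT, uncertainty_ft * CM_PER_FOOT


value_cm, unc_cm = feet_to_cm(15.5, 0.5)

# What the calculator shows: far more digits than the measurement supports.
print(f"calculator: {value_cm:.2f} +/- {unc_cm:.2f} cm")

# The fractional uncertainty is the same in either unit.
print(f"fractional: {0.5 / 15.5:.3f} (feet), {unc_cm / value_cm:.3f} (cm)")

# Trim the meaningless digits before reporting the result.
print(f"reported:   {round(value_cm, -1):.0f} +/- {round(unc_cm):.0f} cm")
```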
Systematic Errors
Systematic errors shift the measurements all one way. Incorrect
calibration of test equipment would be an example of a source
of systematic error. Actual variation of the thing you are measuring
would be another: a variable star's brightness cannot be measured
accurately without taking into account its variation. Or the length
of an object may depend on the ambient temperature or humidity.
A goal in any experiment is to reduce the magnitude of systematic
errors below the size of the random errors.
Random Errors
Random errors can occur for a variety of reasons, all of which
lead to the measurements fluctuating about a mean value.
Random errors may be reduced by improving the measurement
apparatus (like getting a more precise voltmeter) or the
technique (reading the scale with a magnifying glass), but they
cannot be eliminated. The size of the random uncertainty may
be obtained only by making a set of repeated, independent
observations.
Mean and Standard Deviation
The mean Xmean of a set of measured values Xi
is simply the sum of the Xi, divided by the number N
of measurements. This is the best estimate of the true value,
based on this set of measurements.
The variance of a set of measured values is
the average of the squared deviations from the mean:
variance = (sum of (Xi - Xmean)^2) / N
and the standard deviation SD is the square root of the variance.
If the errors are truly random, and a fairly large number
of measurements are taken, they will scatter symmetrically about
the mean value, with more of them close to the mean and a smaller
number farther from the mean. This distribution is called a
Gaussian or normal distribution. In a normal distribution, 68%
of the measurements will lie within one standard deviation of the
mean and 95% of them will be within two standard deviations.
This means that if you make one more identical measurement,
it has a 68% probability of being within one standard deviation
of the previously calculated mean, and a 95% probability of
being within two standard deviations.
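The definitions of mean, variance, and standard deviation translate directly into code. The sketch below is one possible Python version; the function names and the list of readings are hypothetical, chosen only for illustration. It divides by N, as in the formula above.

```python
import math


def mean(values):
    """Sum of the values divided by the number of values N."""
    return sum(values) / len(values)


def variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


# Hypothetical repeated readings of one length, in cm.
readings = [12.1, 12.3, 11.9, 12.0, 12.2]

print(f"mean     = {mean(readings):.2f} cm")
print(f"variance = {variance(readings):.3f} cm^2")
print(f"SD       = {standard_deviation(readings):.2f} cm")
```

Python's standard library offers statistics.pstdev, which also divides by N, and statistics.stdev, which divides by N - 1 and so gives a slightly larger value for small data sets.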
Calculate the mean and standard deviation of the
following set of numbers. Write them below.
74, 75, 79, 77, 74, 65, 64, 78, 75, 74
Standard Deviation of the Mean
When we average a set of measurements, we get a better estimate
of the true value than we have from a single measurement.
The parameter that expresses this improvement is called the
standard deviation of the mean (we'll call it SDM), which is
SD divided by the square root of N. So if N is 9, SDM = SD / 3.
You can see what this means: if you were to take another set
of measurements and calculate their mean, there is a 68%
likelihood that the second mean value would be within one SDM
of the first mean, and a 95% likelihood that it would be
within 2 SDM of the first mean.
Another way to express
this is that we are 95% confident that the true value (always
assuming, of course, that we've removed all the sources of
systematic error!) is between Xmean - 2 SDM and
Xmean + 2 SDM.
Assuming the above numbers are a sequence
of measured values, write down the 95% confidence interval
(lower bound and upper bound) within which the true
value is expected to be.
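For reference, the standard deviation of the mean and the corresponding 95% interval (Xmean - 2 SDM to Xmean + 2 SDM) can be computed as in the following Python sketch. The function names and the sample readings are hypothetical, and the sketch assumes purely random errors with no systematic contribution.

```python
import math


def mean_and_sdm(values):
    """Return the mean and the standard deviation of the mean (SD / sqrt(N))."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in values) / n)
    return m, sd / math.sqrt(n)


def confidence_interval_95(values):
    """Lower and upper bounds of the interval mean +/- 2 SDM."""
    m, sdm = mean_and_sdm(values)
    return m - 2 * sdm, m + 2 * sdm


# Hypothetical repeated readings of one length, in cm.
readings = [12.1, 12.3, 11.9, 12.0, 12.2]

m, sdm = mean_and_sdm(readings)
low, high = confidence_interval_95(readings)
print(f"mean = {m:.2f} cm, SDM = {sdm:.2f} cm")
print(f"95% interval: {low:.2f} to {high:.2f} cm")
```

Because SDM shrinks only as the square root of N, quadrupling the number of measurements is needed to halve the width of the interval.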
Absolute and Fractional Error
Suppose you measure your weight on a spring-type bathroom scale,
where the needle sticks a little and the readings are not all
exactly the same. You take several measurements and determine
that your weight is 105 +/- 2 lbs. The error estimate of 2 lbs
is the standard deviation of your set of measurements.
So we say the absolute error is 2 lbs—but the fractional
error is 2 / 105 or about 0.02 or two parts in 100. If you then
weigh your cat (you have an extremely docile cat) on the same
scale, you get a mean value of 10.5 lbs. But the readings
still vary the same amount, so the absolute error is the same: 2 lbs.
However, the fractional error is now 2 / 10.5, or about 0.2.
This kind of result is typical: to measure small values
precisely you often need better tools.
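The bathroom-scale comparison is simple arithmetic, shown here as a short Python sketch (the function name fractional_error is invented for this illustration):

```python
def fractional_error(value, absolute_error):
    """Absolute error divided by the size of the measured value."""
    return absolute_error / abs(value)


# Same scale, same absolute error of 2 lbs, very different fractional errors.
you = fractional_error(105, 2)
cat = fractional_error(10.5, 2)
print(f"you: {you:.3f}")
print(f"cat: {cat:.3f}")
```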
Combining Measurements
Suppose the result you want depends on more than one measurement--
like the parallax measurement where you needed to measure both
the baseline and the angle. (In fact that experiment also required
the measurement of the length of the cross-staff, though we
assumed its length to be known pretty accurately.) How do you
combine the individual uncertainties to get the uncertainty
in the result?
The rule is that when you add or subtract two or more measured values,
the absolute error in the result is the square root of the sum
of the squares of the individual absolute errors. And when you multiply
or divide two values, you do the same thing but using the fractional
errors: the fractional error of the result is the square root of
the sum of the squares of the individual fractional errors.
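Both rules can be written compactly in code. The Python sketch below is one way to do it; the function names and the values of P and Q are hypothetical, and the square-root-of-sum-of-squares rule assumes the individual errors are independent of each other.

```python
import math


def quadrature(*errors):
    """Square root of the sum of the squares of the given errors."""
    return math.sqrt(sum(e ** 2 for e in errors))


def add_with_error(a, da, b, db):
    """Sum a + b; absolute errors combine in quadrature.
    (The same error applies to the difference a - b.)"""
    return a + b, quadrature(da, db)


def divide_with_error(a, da, b, db):
    """Quotient a / b; fractional errors combine in quadrature.
    (The same fractional error applies to the product a * b.)"""
    q = a / b
    fractional = quadrature(da / a, db / b)
    return q, abs(q) * fractional


# Hypothetical measurements: P = 20.0 +/- 0.3 and Q = 5.0 +/- 0.4
s, ds = add_with_error(20.0, 0.3, 5.0, 0.4)
q, dq = divide_with_error(20.0, 0.3, 5.0, 0.4)
print(f"P + Q = {s:.1f} +/- {ds:.1f}")
print(f"P / Q = {q:.1f} +/- {dq:.1f}")
```

Note that the less precise measurement dominates: in the quotient, nearly all of the uncertainty comes from Q, whose fractional error (0.08) is much larger than that of P (0.015).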
Given two measurements X = 10.0 +/- 0.7
and Y = 3.1 +/- 0.4, what are the uncertainties in
computed values A = X + Y and B = X / Y?
Laboratory Measurements
Angle measurement with a cross-staff
Use one of the A110L cross-staffs. Stand at the end of the hall and
use the cross-staff to measure the angular width of the door at
the far end of the hall. Make ten independent measurements and
record all the values. Try very hard to ignore previous values
when measuring or recording a subsequent value, because the statistical
analysis only works if the measurements are really independent. Move one paper
clip before taking each measurement, then adjust the other clip
to make the reading. Do each observation as carefully as you can,
but pay no attention to what you got for other observations.
Compute and record the mean and standard deviation. How does the
standard deviation compare with your previous estimate (in the
parallax lab) of how well you could read the cross-staff?
Compute the standard deviation of the mean. This is your estimate
of the uncertainty in the final, averaged result.
Volume of platinum brick
All the resources of the Astronomy 110L program are contained in
a platinum brick. We need to measure its volume carefully, so
we can determine if we can afford new flashlights for next term.
A precision ruler is available. We know that the volume of a
rectangular prism is V = L x W x H. To get the best possible
measurement, each person will use the precision ruler to measure
the three dimensions of the brick and record them. Please do this
completely independently and do not share your results.
For consistency, use units of cm, but record your measurement
to as much precision as you can manage. When
all the measurements are complete, we will tabulate them on the board.
Now copy the tabulated measurements, and for each dimension
compute the mean, SD, and SDM. Compute the volume of the brick
in cubic cm, and the value of the uncertainty in the volume.
Which dimension is most important to measure precisely?
Consider possible sources of systematic error in measuring the
volume of the brick. Not dumb mistakes, but real possibilities
that could cause the result to be systematically too large or
too small. List two of these, and suggest for each a way to
evaluate the error, or reduce its impact.
mickey@ifa.hawaii.edu
Last modified: April 7, 2005
http://www.ifa.hawaii.edu/users/mickey/ASTR110L_S05/measurement.html