Abstract: The accuracy of image-based length and angle measurements made using StereoCore™ PhotoLog 3 was tested by processing a set of artificially generated photographs previously used to test StereoCore™ PhotoLog 2. The simulated core segments in the images were of known length, and the simulated structures had known (α, β) angles. The measured values were compared with the known values and the errors were calculated. It was confirmed that, as in Version 2, StereoCore™ PhotoLog 3 measures α angles with an average error of 1°, but that when the α angle is less than 5° the β measurement can be inaccurate. The error data was used to calculate summary statistics showing that the length of core segments is measured with an average error of 4mm (0.8%) and plane poles are measured with an average error of 2.5° if α angles less than 5° are excluded, or 5.0° if α angles less than 5° are included.
We're about to release the latest and greatest version of StereoCore™ PhotoLog, so as part of our internal quality assurance we needed to run a few tests to make sure that the software was accurate. For this test, I used the same initial data as for the bench test of StereoCore™ PhotoLog version 2. The thinking was that at the same time as testing the new software I could check that the measurements were of a comparable standard to those of version 2, which I did confirm.
Precision and accuracy
Wikipedia has more on this topic, but basically precision refers to how repeatable your measurements are, and accuracy refers to how close you are to the actual values. If we think of an analogy of hitting a target then a precise but inaccurate gun will hit the same spot on the target but might not hit the bullseye, whereas an imprecise but accurate gun will on average hit the dead centre of the target, but the shots will be scattered all over the target. We usually want our tools to be both accurate and precise.
To measure accuracy we measure errors - the difference between the bullseye and where we hit on the target, and we average the errors. If we have a low mean error it means our gun is accurate.
To measure precision, we look at the "spread" of the shots. If they all hit in a tight grouping then we're happy that the gun is precise; if not, it is imprecise. In statistics the tightness of the grouping is reflected by the standard deviation σ. σ is a number in the same units as the original measurement, and provided that the data follows a "normal" distribution, then, quoting from Wikipedia: "About 68% of values drawn from a normal distribution are within one standard deviation σ away from the mean; about 95% of the values lie within two standard deviations; and about 99.7% are within three standard deviations."
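As a rough illustration of the two ideas above, here is a minimal Python sketch. The error values are made up for the example (they are not data from the bench test), and the helper name fraction_within is my own. The mean of the errors stands in for accuracy, the standard deviation for precision, and the last part reproduces the 68% / 95% / 99.7% figures from the standard normal distribution.

```python
import statistics
from statistics import NormalDist

# Hypothetical error values in mm, invented for illustration only.
errors_mm = [3.0, 5.0, 4.0, 2.0, 6.0, 4.0]

mean_error = statistics.mean(errors_mm)  # accuracy: average size of the error
spread = statistics.stdev(errors_mm)     # precision: sample standard deviation


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


print(f"mean error = {mean_error:.1f} mm, standard deviation = {spread:.2f} mm")
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
```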
Absolute and relative error
If I take a tape measure and measure a length which I know to be 1m (because I'm psychically able to measure distance), and it shows the length to be 0.99m, then my absolute error is 0.01m, or 1cm. Absolute error is the actual difference between the actual length and the measured length. If I now express that absolute error as a fraction of the actual length - by saying that the error is 1% - then I have stated a relative error. I like to state my relative errors as percentages.
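The tape measure example can be written out as a small Python sketch. The function names are hypothetical and chosen just for this illustration.

```python
def absolute_error(actual, measured):
    """Size of the difference between the measured and the actual value."""
    return abs(measured - actual)


def relative_error(actual, measured):
    """Absolute error expressed as a fraction of the actual value."""
    return absolute_error(actual, measured) / abs(actual)


actual_m = 1.00    # the length I "know" to be 1m
measured_m = 0.99  # what the tape measure shows

abs_err = absolute_error(actual_m, measured_m)
rel_err = relative_error(actual_m, measured_m)
print(f"absolute error = {abs_err:.2f} m, relative error = {rel_err:.0%}")
```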
Frequency, relative frequency and cumulative frequency
In this analysis I made a lot of relative frequency histograms. I had a lot of error measurements and I wanted to get an idea of how well StereoCore™ PhotoLog was measuring things. Mean and standard deviation give you a reasonably good idea, but nothing beats a visual display.
Let's start with frequency, and stick to length measurements for now. For this bench test I took 300 length measurements and calculated the error associated with each. I then counted (well, Excel did the counting) how many error measurements were in the range 0mm to 2mm, 2mm to 4mm and so forth.
To calculate relative frequency, all I do is normalize the previous measurements to the size of the sample, so that my data is now in terms of percentages. So for example I had 58 length measurements within 2mm of their actual values, which translates to 19% of the total of 300 measurements.
Cumulative frequency and cumulative relative frequency are also useful statistics. Again using the length measurements, 58 (19%) had an error of 0mm to 2mm and 99 (33%) had an error of 2mm to 4mm, which means that 157 (52%) had an error of 0mm to 4mm. Similarly, 252 (84%) had an error of 0mm to 6mm.
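The same bookkeeping can be sketched in Python instead of Excel. The first two bin counts are the ones quoted above. The 4mm to 6mm count (95) and the "over 6mm" count (48) are not quoted directly; I have derived them from the totals given (252 - 157 and 300 - 252). Lumping everything above 6mm into one bin is a simplification for this sketch.

```python
from itertools import accumulate

bin_labels = ["0mm to 2mm", "2mm to 4mm", "4mm to 6mm", "over 6mm"]
counts = [58, 99, 95, 48]  # frequency: number of error measurements per bin

total = sum(counts)                      # size of the sample
relative = [c / total for c in counts]   # relative frequency
cumulative = list(accumulate(counts))    # cumulative frequency
cumulative_relative = [c / total for c in cumulative]

for label, n, rel, cum, cum_rel in zip(
    bin_labels, counts, relative, cumulative, cumulative_relative
):
    print(f"{label:>10}: {n:3d} ({rel:.0%}), cumulative {cum:3d} ({cum_rel:.0%})")
```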
Bench testing method
As explained in previous articles on the bench testing method that I use, the basic idea is very simple. You want to know how well your tool measures things, so you take a bunch of items - in our case core segments and structures - for which you already know the measurements (lengths and α, β angles) through some independent means, measure them with your tool, and see how well the tool's measurements compare with the actual values.
Unfortunately getting reliable independent measurements of actual core segments and structures to the accuracy that I need for bench testing StereoCore™ PhotoLog is hard, so I made a compromise by generating artificial photographs (see Figure 1) with simulated core and structures, for which I could control the lengths and angles. To remove the element of human bias (if I know what lengths and angles I want to measure I will subconsciously manipulate the test), I randomized the core segment lengths and the structure plane pole directions.
The artificial photographs had simulated lens distortion and were taken from a randomized camera position as well, to more accurately simulate the kind of conditions that StereoCore™ PhotoLog is intended to be used in.
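To give a feel for what randomizing the test inputs can look like, here is a minimal Python sketch. It is not the generator used for the photographs: the function name, the length range, the fixed seed and the choice to sample pole directions uniformly over a hemisphere are all assumptions made for illustration.

```python
import math
import random


def random_test_case(rng, min_len_mm=300.0, max_len_mm=700.0):
    """Return a random (segment length, unit pole vector) pair.

    The length range is an assumed one; the pole is sampled uniformly
    over the upper hemisphere (z >= 0), which is also an assumption.
    """
    length_mm = rng.uniform(min_len_mm, max_len_mm)
    z = rng.uniform(0.0, 1.0)            # uniform z gives uniform area on a sphere
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    pole = (r * math.cos(phi), r * math.sin(phi), z)
    return length_mm, pole


rng = random.Random(42)  # fixed seed so the same test set can be regenerated
cases = [random_test_case(rng) for _ in range(5)]
for length_mm, pole in cases:
    print(f"length = {length_mm:6.1f} mm, pole = ({pole[0]:+.3f}, {pole[1]:+.3f}, {pole[2]:+.3f})")
```

Drawing the values from a random number generator, rather than picking them by hand, is what keeps the person doing the logging from knowing the "right" answers in advance.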
The generated photographs were undistorted and calibrated as per StereoCore™ PhotoLog standard operating procedure, and then segment lines and structures were marked on each photograph. Once "logging" was complete, the data was exported to Excel.
The measured data was then compared to the original data and errors were calculated.
For the length measurements, mean absolute error was 4.0mm with a standard deviation of 2.1mm.
Mean relative error was 0.8% with a standard deviation of 0.4%.
84.0% of the data was within 6mm and 99.7% of the data was within 10mm.
Charts of the data are presented below.
I think in this case it's better to look at the absolute error rather than the relative error, because the length measurements are mainly limited by the operator's ability to place the segment lines correctly, not by the length of the segments themselves. If the lengths had been on the order of a metre or so, I expect that the absolute error histogram would look much the same, with errors of 4mm or so, whereas the relative error would seem to have halved, from an average of 0.8% (when using lengths of around 0.5m) to an average of 0.4% (using lengths of roughly 1m).
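The arithmetic behind that argument is short enough to check in a few lines of Python. The assumption, taken from the paragraph above, is that the absolute error stays at roughly 4mm regardless of segment length; the function name is my own.

```python
def relative_error_for(abs_error_mm, length_mm):
    """Relative error (as a fraction) for a fixed absolute error."""
    return abs_error_mm / length_mm


typical_abs_error_mm = 4.0  # roughly the mean absolute error reported above

for length_mm in (500.0, 1000.0):
    rel = relative_error_for(typical_abs_error_mm, length_mm)
    print(f"{length_mm:.0f} mm segment: relative error = {rel:.1%}")
```

Doubling the segment length halves the relative error even though the measurement itself is no better, which is why the absolute error is the more honest number here.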
In the table below I present two sets of statistics for the angle measurements: one set includes α angles of less than 5°, and the other excludes them.
The really important columns of Table 1 are the ones labelled "Angle Diff". They show the summary statistics for the angular error between the measured and actual plane poles.
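For readers who want to see how an angular difference between two plane poles can be computed from (α, β) pairs, here is a minimal Python sketch. It is not necessarily how StereoCore™ PhotoLog or my spreadsheet does it. It assumes the common convention that α is the angle between the plane and the core axis (so the pole sits 90° - α away from the axis) and that β is a rotation about the core axis. It also assumes that poles are treated as undirected lines, so a pole and its opposite count as the same direction. The function names are hypothetical.

```python
import math


def pole_from_alpha_beta(alpha_deg, beta_deg):
    """Unit pole vector in core coordinates (core axis along z).

    Assumed convention: alpha is the plane-to-axis angle, beta rotates about the axis.
    """
    a = math.radians(alpha_deg)
    b = math.radians(beta_deg)
    return (math.cos(a) * math.cos(b), math.cos(a) * math.sin(b), math.sin(a))


def pole_angle_difference(measured, actual):
    """Angle in degrees between two poles given as (alpha, beta) pairs."""
    p = pole_from_alpha_beta(*measured)
    q = pole_from_alpha_beta(*actual)
    dot = sum(x * y for x, y in zip(p, q))
    # abs() treats a pole and its opposite as the same line (an assumption);
    # min() guards against rounding pushing the value just above 1.
    dot = min(1.0, abs(dot))
    return math.degrees(math.acos(dot))


# The same 20 degree beta error at a small and at a large alpha angle.
small_alpha = pole_angle_difference((3.0, 20.0), (3.0, 0.0))
large_alpha = pole_angle_difference((80.0, 20.0), (80.0, 0.0))
print(f"alpha = 3 deg:  pole difference = {small_alpha:.1f} deg")
print(f"alpha = 80 deg: pole difference = {large_alpha:.1f} deg")
```

Under these assumptions, a given β error passes almost entirely into the pole error when α is small and matters much less when α is large, which would fit with the pole statistics getting worse when the small α angles are included.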
Again we can look at cumulative relative frequencies for the data as well. α angle measurements are the most impressive. For all data, 84% of measurements are within 2° and 97% are within 5°. For all data, the pole angle difference is less than 5° for 80% of the data and less than 10° for 96% of the data. For the data with α < 5° removed, the pole angle difference is less than 5° for 87% of the data and less than 10° for 100% of the data.
As always there are a few things to bear in mind with this data. Firstly, this is artificial data, and the structures that one sees in the core shed rarely look as pleasantly elliptical as the ones in the generated photographs. Secondly, I did not use a new dataset; I simply reused the same dataset as in the bench test I did for version 2. I will probably perform another bench test using freshly generated data at some later date. It's a little bit paranoid, but the thinking behind this is that perhaps the dataset I used just so happened to be one which StereoCore™ PhotoLog was freakishly well suited to measure. Thirdly, the generated structure images show the whole structure ellipse, and I haven't yet done the test with structures showing only a partial ellipse.
The result of this bench test shows that StereoCore™ PhotoLog makes both precise and accurate length and angle measurements, although measurements which have an α angle of less than 5° should be treated with more caution since the β measurement can be inaccurate in this case. α angles, however, are measured with consistent accuracy regardless of their magnitude.
StereoCore™ PhotoLog Lead Programmer