One question that I have received time and time again in my career is
"How accurate is your calibration process?"
This always bothers me. If I answer "very accurate, of course!", what does that... mean? What should they expect from results? What am I even claiming?
Even more interesting: a question that I've never been asked, but that has equal or greater importance, is "How precise is this calibration?" Accuracy is always considered, while precision is never questioned. What makes this so?
There are probably a few reasons for this:
Both of these points are a shame; one should have a good understanding of accuracy and precision in order to get the most out of their perception systems, or any calibrated system for that matter. In order to do this, though, we need some context.
Caveat: Statistics is a very dense field of study. There's a lot of nuance here that won't get covered due to the inordinate amount of detail behind the subject. However, the thesis should hopefully be clear: most current calibration processes leave out some important stuff.
If you're the type of person to dive deep into this kind of thing, and are looking for a career in perception... you know where to find us.
Imagine that we have a set of points on a graph. We would like to predict the output value (the Y axis) of every input value (the X axis).
😎 We do this because we are totally normal and cool and have fun in our free time.
We can do this by fitting a model to the data. Good models give us a general idea of how data behaves. From the looks of it, these points seem to follow a straight line; a linear model would probably work here. Let's do this:
The input-output points that are on our graph are close to or on the line that we derived. This means that our linear model has a high accuracy given this data.
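As a rough illustration of what "fitting a line and checking its accuracy" means, here is a minimal sketch using ordinary least squares. The data values and the function names `fit_line` and `rmse` are made up for this example; they are not taken from the post's own source code.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def rmse(xs, ys, slope, intercept):
    """Root mean squared distance between measured and predicted outputs."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Hypothetical measurements that roughly follow a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
fit_rmse = rmse(xs, ys, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} rmse={fit_rmse:.3f}")
```

A small RMSE here says the line is accurate with respect to the points we measured. It says nothing yet about how well those points themselves were measured.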
Yet there's something that's not shown in this graph. When we collected our datapoints, we used a tool that only measured so well. In other words, it was impossible to get exact measurements; we just used the best measurements we could get. This means that every datapoint has an uncertainty, or variance, associated with it. This is a representation of the precision of the input space.
When we factor in the input space variance, our graph of points really looks like this:
Now the linear model we derived looks a little less perfect, huh? Yet this is not reflected in our results at all. If every input measurement was actually a bit different from what's shown, the true model for this data could be something else entirely, and our predictions could be off without our knowing it.
The good news is that there are ways to compensate for this uncertainty. Through the magic of statistics, we can translate the input variance into variance of our learned model parameters:
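One common way to make this translation is weighted least squares. The sketch below assumes, for simplicity, that each datapoint's uncertainty can be written as an independent standard deviation on its measured value; the function name `weighted_line_fit` and the numbers are illustrative only, and the post's own code may take a different approach.

```python
def weighted_line_fit(xs, ys, sigmas):
    """Weighted least-squares fit of y = slope * x + intercept.

    Assumes each measurement i has an independent standard deviation
    sigmas[i]. Returns the parameters and their 2x2 covariance matrix,
    ordered as (slope, intercept).
    """
    ws = [1.0 / (s * s) for s in sigmas]  # weight = inverse variance
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))

    # Normal matrix is [[swxx, swx], [swx, sw]]; invert it in closed form.
    det = swxx * sw - swx * swx
    slope = (sw * swxy - swx * swy) / det
    intercept = (swxx * swy - swx * swxy) / det
    cov = [[sw / det, -swx / det],
           [-swx / det, swxx / det]]
    return slope, intercept, cov

# Hypothetical data: same points, two different measurement qualities.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept, cov_good = weighted_line_fit(xs, ys, [0.5] * 5)
_, _, cov_poor = weighted_line_fit(xs, ys, [1.0] * 5)

print("slope variance (sigma=0.5):", cov_good[0][0])
print("slope variance (sigma=1.0):", cov_poor[0][0])
```

The fitted line is identical in both cases, but doubling the measurement standard deviation quadruples the variance of the slope. The accuracy looks the same; the precision does not.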
This gives us a much richer understanding of our input-output space. Now we know that our model can be trusted much more with certain inputs than with others. If we're trying to predict in a region outside of our input data range, we have a good idea of how much we can trust (or be skeptical of) the prediction. Predicting outside the data range is called extrapolation; it's a hard thing to get right, but most people do it all the time without considering the precision of their model.
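To see why extrapolation deserves extra skepticism, consider the textbook expression for the uncertainty of a fitted line's prediction when every measurement shares the same known standard deviation: the variance at `x` is `sigma^2 * (1/n + (x - mean_x)^2 / Sxx)`. The sketch below assumes that simplified setting; `prediction_std` is a name invented for this example.

```python
import math

def prediction_std(x, xs, sigma):
    """Standard deviation of the fitted line's prediction at x.

    Assumes a straight-line least-squares fit where every measurement
    has the same known standard deviation `sigma`.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    variance = sigma ** 2 * (1.0 / n + (x - mean_x) ** 2 / sxx)
    return math.sqrt(variance)

# Hypothetical inputs spanning 0..4, each measured to +/- 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
sigma = 0.5

std_inside = prediction_std(2.0, xs, sigma)    # middle of the data
std_outside = prediction_std(10.0, xs, sigma)  # far outside the data

print(f"prediction std at x=2:  {std_inside:.3f}")
print(f"prediction std at x=10: {std_outside:.3f}")
```

The uncertainty is smallest at the center of the data and grows with the distance from it, which is exactly the widening band one would expect to see around the fitted line.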
🖥️ You can try this for yourself by playing around with the source code used to develop these graphs. Find it at the Tangram Visions Blog repository.
There's a strong moral here: accuracy isn't everything. Our model looked accurate, and given the input data, it was! However, it didn't tell the whole story. We needed the precision of our model to truly understand our predictive abilities.
The same logic that we used in the line-fitting example applies directly to calibration. After all, calibration is about fitting a model to available data in order to predict the behavior of outputs from our inputs. We get better prediction with a better understanding of both the calibration accuracy and precision.
To demonstrate this, let's calibrate a camera using a target.
Our target provides metric data that we can use to connect the camera frame with the world frame; we call this metric data the object space. An image taken by the camera of the target captures a projection of the object space onto the image plane; this is called image space. We produce a new set of object space inputs to image space outputs with every image.
The object-to-image data in each camera frame act like the points in our linear fit example above. The camera model we refine with this data provides a prediction about how light bends through the camera's lens.
To dive into the mathematics of this process, see our blog post on camera calibration in Rust.
Just like in our graph, the accuracy of our derived camera model is a comparison of predicted and actual outputs. For camera calibration, this is commonly measured by the reprojection error, the distance in image space between the measured and predicted points.
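Here is a minimal sketch of that measurement. It assumes an ideal pinhole camera with no lens distortion and object points already expressed in the camera frame; the intrinsics (`fx`, `fy`, `cx`, `cy`) and all point values are made up for illustration.

```python
import math

def project(point_cam, fx, fy, cx, cy):
    """Project a 3D point in the camera frame with an ideal pinhole model."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

def reprojection_rmse(points_cam, measured_px, fx, fy, cx, cy):
    """RMSE of the image-space distance between predicted and measured points."""
    total_sq = 0.0
    for point, (u_meas, v_meas) in zip(points_cam, measured_px):
        u_pred, v_pred = project(point, fx, fy, cx, cy)
        total_sq += (u_pred - u_meas) ** 2 + (v_pred - v_meas) ** 2
    return math.sqrt(total_sq / len(points_cam))

# Hypothetical target points (meters, camera frame) and detections (pixels).
points_cam = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.0, 0.2, 2.0)]
measured_px = [(320.3, 240.4), (370.0, 239.5), (319.5, 290.0)]

error = reprojection_rmse(points_cam, measured_px,
                          fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(f"reprojection RMSE: {error:.3f} px")
```

A real calibration would also include lens distortion and the target-to-camera pose, but the accuracy metric has the same shape: predict, compare, and summarize the distances.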
🚀 For those who are familiar with camera calibration: It's important to note that reprojection error calculated on the input data alone doesn't tell the whole story.
In order to actually get accuracy numbers for a calibration model, one should calculate the reprojection error again using data from outside of the training set. This is why some calibration processes put aside every other reading or so; they can use those readings to validate the model at the end of optimization.
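The hold-out idea can be sketched with a deliberately tiny stand-in for calibration: estimating a single focal length `f` from the model `u = f * r`, where `r` is a normalized image coordinate and `u` is the pixel offset from the principal point. Everything here (the model, the data, the function names) is a simplification invented for this example.

```python
import math

def split_every_other(readings):
    """Put aside every other reading for validation."""
    return readings[0::2], readings[1::2]

def fit_focal_length(readings):
    """Least-squares estimate of f in the toy model u = f * r."""
    return sum(r * u for r, u in readings) / sum(r * r for r, _ in readings)

def rmse(readings, f):
    """RMSE in pixels between measured u and predicted f * r."""
    return math.sqrt(sum((u - f * r) ** 2 for r, u in readings) / len(readings))

# Hypothetical (normalized coordinate, pixel offset) readings.
readings = [(-0.20, -100.4), (-0.15, -74.7), (-0.10, -50.2), (-0.05, -24.9),
            (0.05, 25.3), (0.10, 49.8), (0.15, 75.4), (0.20, 99.7)]

train, validation = split_every_other(readings)
f_hat = fit_focal_length(train)

train_rmse = rmse(train, f_hat)            # error on data the fit has seen
validation_rmse = rmse(validation, f_hat)  # error on held-out data

print(f"f={f_hat:.1f} train RMSE={train_rmse:.3f} validation RMSE={validation_rmse:.3f}")
```

With this made-up data, the held-out error comes out noticeably larger than the training error. That gap is the part of the story the training-set error hides.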
A common rule of thumb is if the reprojection root mean squared error (RMSE) is under 1 pixel, the camera is "calibrated". This is the point where most calibration systems stop. Does anything seem off with this?
The answer is yes! Red flags abound, and they all lie in our uncertainties. Except this time, it's more than just our inputs; in camera calibration, we have knowledge, and therefore uncertainty, about our parameters as well.
Let's start with the object space, i.e. our input. Traditional calibration targets are mounted on a flat surface. This gives us metric points on a plane.
However, what happens when this board bends? The assumption of a planar metric space is no longer valid. Luckily, we can account for this through a variance value for each object space point.
Very few calibration processes take this step. Unless the target is mounted on a perfectly flat surface like a pane of glass, there will be inconsistencies in the plane structure. This will lead to a calibration that is imprecise. However, given that the target points are taken as ground truth, the calibration model will give the illusion of accuracy. Whoops.
🚀 "Perfectly flat" here is relative. If the deviation from the plane is small enough that we can't observe it... it's flat enough.
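To get a feel for when a bend becomes observable, we can push an object space variance through the projection with first-order error propagation. The sketch below assumes an ideal pinhole model, a target roughly facing the camera, and a bend that moves points along the camera's depth axis; the function name and all numbers are hypothetical.

```python
def pixel_std_from_depth_std(x, z, fx, sigma_z):
    """First-order image-space std caused by depth uncertainty on a target point.

    For u = fx * x / z + cx, the sensitivity to depth is
    du/dz = -fx * x / z**2, so sigma_u is approximately |du/dz| * sigma_z.
    """
    return abs(fx * x / (z * z)) * sigma_z

fx = 500.0        # focal length in pixels (assumed)
z = 1.0           # target distance in meters (assumed)
sigma_z = 0.002   # 2 mm of out-of-plane uncertainty (assumed)

center_std = pixel_std_from_depth_std(0.0, z, fx, sigma_z)  # on the optical axis
edge_std = pixel_std_from_depth_std(0.3, z, fx, sigma_z)    # 30 cm off-axis

print(f"center: {center_std:.3f} px, edge: {edge_std:.3f} px")
```

Under these assumptions, a 2 mm bow is invisible at the image center but worth a few tenths of a pixel toward the edge of the target, which is not negligible next to a 1 pixel RMSE threshold.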
In our linear model example above, we didn't know what real-world effect our line was modeling. Any guess at our parameter values before the line fit would have probably been way off. But what if we were confident in our initial guesses? How would we convey this information to our model fitting optimization?
Good news: variances do this, too! Assigning variances to our parameters tells the model optimization how certain we are of the initial guess. It's a straightforward way to tune the optimization, as we'll see later.
As an example applied to camera calibration: the principal point of a camera lens is often found around the center of the image. We're fairly certain that our lens will follow this paradigm, so we can assign a low variance to these parameters. If we have enough of an understanding of our model, we can do this for every parameter value.
Now our optimization process has a bit more information to work with; it knows that some parameters have more room to deviate from their initial guess than others.
We can get a better idea of this effect by looking at its extreme. Let's assign all of our parameters a variance of 0.00, i.e. no uncertainty at all.
Now, when we calibrate over our object space with these parameter values, we find that... our model doesn't change! All of our parameters have been fixed.
This makes sense, since the 0.00 variance told the model optimization that our initial parameters had no uncertainty. Conversely, a larger variance value allows the parameter value to move around more during optimization. If we're very uncertain about our guess, we should assign a large initial variance.
Our parameter variances will change as we fit our model to the input data. The updated variances become a good measure of model precision, just like in our linear fit example. We're now armed with enough knowledge to say how far we can trust our model's predictions in a given scenario.
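The effect of a prior variance is easiest to see on a single parameter. The sketch below fuses a Gaussian prior on one value (think of the principal point's x coordinate) with direct noisy observations of it. This is a strong simplification, since a real calibration observes its parameters only indirectly through the projection model, and the function name and numbers are invented for illustration.

```python
def fuse_prior_with_observations(prior_mean, prior_var, observations, obs_var):
    """Combine a Gaussian prior with independent Gaussian observations.

    Returns the updated (mean, variance) of the parameter. A prior variance
    of zero means "no uncertainty", so the parameter stays fixed.
    """
    if prior_var == 0.0:
        return prior_mean, 0.0
    # Precisions (inverse variances) add; the mean is precision-weighted.
    precision = 1.0 / prior_var + len(observations) / obs_var
    mean = (prior_mean / prior_var + sum(observations) / obs_var) / precision
    return mean, 1.0 / precision

# Hypothetical: we guess cx = 320 px, but the data suggests about 325 px.
observations = [324.0, 326.0, 325.0, 325.0]
obs_var = 4.0

fixed_mean, fixed_var = fuse_prior_with_observations(320.0, 0.0, observations, obs_var)
tight_mean, tight_var = fuse_prior_with_observations(320.0, 1.0, observations, obs_var)
loose_mean, loose_var = fuse_prior_with_observations(320.0, 1e6, observations, obs_var)

print(f"variance 0:   cx={fixed_mean:.2f}, updated variance={fixed_var:.2f}")
print(f"variance 1:   cx={tight_mean:.2f}, updated variance={tight_var:.2f}")
print(f"variance 1e6: cx={loose_mean:.2f}, updated variance={loose_var:.2f}")
```

With zero variance the parameter never moves; with a small variance it lands between the guess and the data; with a huge variance the data wins. In the two non-fixed cases the updated variance is smaller than the prior variance, and that updated number is the precision we can report for the parameter.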
With all of that said, when was the last time you measured the precision in your calibration process? Odds are that the answer is "never". Most calibration systems don't include any of these factors! That's a huge problem; as we've seen, accuracy won't paint the whole picture. Factoring in precision gives
...among other useful features. Being aware of these factors can save you a lot of heartache down the road, even if you continue to use a calibration process that only uses accuracy as a metric.
You might have guessed, but Tangram Vision is already addressing this heartache through our SDK by providing both accuracy and precision for every calibration process we run. Calibration is a core part of our platform, and we hope to offer the best experience available. Did you find this post interesting? Tweet at us and let us know!