This post will be part of an ongoing academic project on GPS error. This is not exactly breakthrough research, but it’s a fun project that I’ll share here for anyone else who recognizes their inner geek, and seeing as today is Pi Day the timing is perfect! Many of us rely on GPS to either follow an existing path, record a path we’re currently travelling on, or perhaps to find a hidden treasure while out geocaching – and we certainly rely on it to point us to the nearest coffee shop or burger joint when in unfamiliar territory and in need of sustenance. It’s probably not news, but your GPS isn’t telling you the whole story. You can’t handle the truth!!! Or, can you? Let’s go there.

## Eyes In the Skies

Just very quickly, let’s talk about what GPS actually is. “GPS” = Global Positioning System. Circling the Earth are a few dozen satellites, each transmitting a radio signal. Our GPS devices receive those signals in order to determine their position on the planet. A device must receive signals from at least 4 satellites for a full 3D fix (3 is enough for a rough 2D position), and preferably more to do this successfully. GPS is actually a USA-based system, but it is accessible around the world. Russia has deployed a similar system known as GLONASS. Since both systems provide worldwide coverage, they fall under the umbrella of GNSS, Global Navigation Satellite Systems. Holy acronyms!

## GPS Error Overview

Like all measuring devices, GPS devices are prone to a certain amount of error. The error is due to several factors, but as it turns out it is often less than 6 meters from the actual position. When navigating from one location to another, or finding the physical location of a specific latitude and longitude, the error is not usually significant enough to be relevant. However, in situations where distances along a path are being measured, there is a *cumulative effect*; the total distance reported by the GPS is likely to be overestimated, and more so as the number of recorded points increases. This situation is common in tracking activities such as hiking, running, or recording a path while travelling in a vehicle, on a bicycle, or by some other mode of transport. There are multiple contributing factors of error…

**Baseline Accuracy of the Device:** The baseline accuracy can also be viewed as the best possible accuracy. Quantifying this value will usually require knowing which GPS chipset the device manufacturer has used (chipsets are almost always manufactured by a third party) and then finding the specification of the chipset. If in doubt, or for lack of good information, using a value of +/- 3 meters (radial) is probably a good “close guess”.

**Receiver Sensitivity:** Yet again, a chipset spec. The sensitivity is usually given in dBm or dBW (the two scales differ by a fixed 30 dB, so they are not interchangeable; check which one the spec uses) and will appear as a negative number. This indicates how weak a signal the receiver can pick up, so “more negative” is actually good when it comes to receiver sensitivity. Under heavy tree cover, in canyons, and in cities with tall buildings the signal can be degraded and difficult or impossible to receive (or to trust anyway).

**Poor Implementation:** Without being specific, there seems to be empirical evidence that the implementation of GPS technology by end-user device manufacturers (Garmin, Magellan, Epson, Nike, Samsung, Apple, etc…) has some bearing (possibly quite a bit) on what kind of performance to expect. This sort of thing is tough to gauge outside of user data, reviews, and experiences posted on the web.

## Improving GPS Accuracy

There are several ways a GPS measurement can be improved, even if the GPS has a fix on an ideal number of satellites…

**Satellite Based Augmentation Systems:** SBAS for short, these are region-specific systems that devices *may* be able to receive (check the device’s spec, this would almost certainly be mentioned as a selling point). The USA has WAAS, Europe has EGNOS, Japan has MSAS, and there are and will be others. Devices that are able to leverage SBAS can have accuracy in the 2 meter range (radial), possibly even better.

**Assisted GPS:** Also known as AGPS. A somewhat cryptic term, given that we’re now up to speed on the above. AGPS is usually associated with smartphones or other devices that are capable of communicating with cellular networks. The network helps the receiver out, typically by supplying satellite data so it gets a fix faster, and the cell towers themselves can be used to estimate position. You’ve seen the movies where the bad guy pulls the battery out of his cellphone after making that intimidating call to the police department… that’s because he doesn’t want them triangulating his position via cell signal. You might want to take advantage of that capability though if you live in a big city and like to go running. It’s obviously quite useless in the back country where there is no cell signal.

**Differential GPS:** DGPS (last new acronym for now) uses multiple land-based receivers at known locations to correct GPS error. DGPS-style corrections are part of the SBAS systems mentioned above. Stand-alone DGPS is used by surveyors or in one-off projects where high local accuracy is needed… not something the “civilian user” will encounter.

## Simulation

This Google Spreadsheet contains a simulation that can be tinkered with. Currently it’s a very simple model, but it demonstrates the concept at hand. Eventually a more advanced model or two will be added, and this post will be updated.

### Inputs

Polling Frequency:

The number of locations (points) recorded per minute. The time interval between recorded points varies from device to device, and in some cases can be set by the user.

Circular Error Probability:

A number given in meters that represents the radius of error of a measured GPS point. This is a specification of the GPS chipset. Since most device manufacturers do *not* manufacture the chipset, and they do not disclose which chipset is used in a given device, it can be difficult to find this specification. However, 3 meters is a decent “best guess” for the CEP value. This translates to the error being within a radius of 3 meters 50% of the time.

Average Velocity:

This input is strictly a parameter of the simulation. Its purpose is to determine the total number of points that will be recorded over a fixed distance. The faster the GPS unit travels, the fewer the recorded points for a given distance.

Total Distance:

The distance we are interested in measuring for our simulation(s). This parameter can demonstrate how error accumulates over greater distances vs. shorter distances.
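To make the relationship between these inputs concrete, here is a minimal Python sketch of how the point count could be derived. The function name `point_count`, the units (meters and meters per minute), and the choice to count the starting point are all assumptions for illustration; the spreadsheet may do it slightly differently.

```python
import math


def point_count(polls_per_minute, speed_m_per_min, distance_m):
    """Estimate how many points get recorded over a fixed distance.

    Assumes one point at the start plus one per completed polling interval.
    """
    travel_minutes = distance_m / speed_m_per_min
    return math.floor(travel_minutes * polls_per_minute) + 1


# Same 1 km stretch, same device, two hypothetical speeds.
slow = point_count(polls_per_minute=60, speed_m_per_min=80, distance_m=1000)
fast = point_count(polls_per_minute=60, speed_m_per_min=160, distance_m=1000)
print(slow, fast)
```

Doubling the speed roughly halves the number of recorded points, which is the only job the velocity input has in this model.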

### The Data

The relevant data regarding errors introduced is shown just below the inputs (top left of the spreadsheet). All the other data is generated by the simulation and is only left visible for those who are really interested in what’s going on under the hood. If that doesn’t sound like you, skip down to “Take Aways”.

Path:

In the real world no path will be a straight line, however it’s convenient to use one as our test path for purposes of a simulation. The spreadsheet will calculate the number of locations to measure based on the polling frequency, average velocity, and total distance (each is an input parameter). With the total number of points determined, the “Actual X” position is incremented by a fixed distance and the “Actual Y” value is held at 0, to represent an “actual path” moving in a straight line along the X-axis. This can be pictured as walking down a straight road on the center dividing line.
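As a rough sketch of that straight test path in Python (the helper name `straight_path` is made up for this example, and the even spacing between the first and last point is an assumption):

```python
def straight_path(n_points, total_distance_m):
    """Return evenly spaced (x, y) points along the X-axis, with y held at 0."""
    if n_points < 2:
        raise ValueError("need at least a start point and an end point")
    step = total_distance_m / (n_points - 1)  # fixed X increment between points
    return [(i * step, 0.0) for i in range(n_points)]


# Five points over 100 meters, so one point every 25 meters.
path = straight_path(5, 100.0)
print(path)
```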

Circular Error Probability (“CEP”):

CEP describes a *distribution*: 50% of the points fall within a radius of error of 0 to n meters, roughly 43% of the points fall between n and 2n meters, and roughly 7% of the points fall between 2n and 3n meters. This ratio is simulated with a random variable ranging between 1 and 100. The ranges 1-50, 51-93, and 94-100 are used to determine the range of error for each recorded point. The spreadsheet references these ranges as simply 0, 1, or 2 respectively in the “CEP” column. There is also a sanity check displayed showing the actual percentages of CEP 0, 1, and 2 to ensure they closely follow the pattern of 50%, 43%, and 7% (they do, and the ratio varies slightly between trials, which would also be expected in the real world).
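Here is one way the band selection could look in Python. This is a sketch, not a copy of the spreadsheet: the function name `band_for_roll` is hypothetical, and the cut-offs (50 and 93) are chosen so that the three bands come out at the 50% / 43% / 7% split described above.

```python
import random
from collections import Counter


def band_for_roll(roll):
    """Map a roll from 1 to 100 onto CEP band 0, 1, or 2."""
    if roll <= 50:
        return 0  # error radius between 0 and n
    if roll <= 93:
        return 1  # error radius between n and 2n
    return 2      # error radius between 2n and 3n


# Every possible roll exactly once: the band sizes are the target percentages.
exact = Counter(band_for_roll(roll) for roll in range(1, 101))

# Sanity check on random rolls, like the one shown on the spreadsheet.
rng = random.Random(314)  # fixed seed so the run is repeatable
trials = 10_000
sampled = Counter(band_for_roll(rng.randint(1, 100)) for _ in range(trials))
shares = {band: sampled[band] / trials for band in (0, 1, 2)}
print(exact, shares)
```

The sampled shares wander a little from run to run (change the seed to see it), which matches the small trial-to-trial variation mentioned above.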

So, the CEP determines the minimum and maximum possible *range of error* for a given recorded point. Armed with this, a random variable “r” is generated for each point which defines the error radius for that point. As a reminder, the error radius always lies within the range set by the CEP value of the point (0 to n, n to 2n, or 2n to 3n, where “n” is the input listed as “Circular Error Probability”). With the radius now defined, the last random variable “theta” represents the polar angle (0 to 2 pi) that determines where on that radius the point will lie. Finally, with a radius and polar angle in hand, the X and Y values of the point are determined, resulting in the “GPS X” and “GPS Y” coordinates. In other words, the point that the GPS *thinks* it is at.
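The radius-and-angle step is just a polar-to-Cartesian conversion added onto the actual point. A minimal Python sketch follows; the name `gps_point` is invented for this example, and drawing “r” uniformly within its band is an assumption, since the exact distribution inside a band isn’t spelled out here.

```python
import math
import random


def gps_point(actual_x, actual_y, band, cep_m, rng):
    """Offset an actual point by a random error radius and polar angle.

    band is 0, 1, or 2; the radius falls between band*cep_m and (band+1)*cep_m.
    """
    r = rng.uniform(band * cep_m, (band + 1) * cep_m)  # assumed uniform in band
    theta = rng.uniform(0.0, 2.0 * math.pi)            # polar angle in radians
    return actual_x + r * math.cos(theta), actual_y + r * math.sin(theta)


rng = random.Random(314)
actual = (10.0, 0.0)
gps_x, gps_y = gps_point(actual[0], actual[1], band=1, cep_m=3.0, rng=rng)
error_m = math.hypot(gps_x - actual[0], gps_y - actual[1])
print(gps_x, gps_y, error_m)
```

With a CEP of 3 meters and band 1, the reported point always ends up between 3 and 6 meters away from the actual one.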

Error Measurement:

The amount of error is shown in two different ways on the spreadsheet. First, the “XY Error” is the distance between the Actual (X,Y) point and the GPS (X,Y) point. Second, the “GPS Distance” column shows the cumulative distance of the GPS path, which can be compared to the actual path. This distance is calculated by taking the distance between each pair of consecutive GPS locations and adding it to the previous running total. Because this simulation uses a straight line to model the path, errors can only make the measured distance longer (the shortest distance between two points is a straight line, so the actual path is always the shortest one).
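The running total is easy to sketch in Python. The function name `cumulative_distance` and the tiny three-point paths below are made up for illustration; the GPS path just has its middle point pushed 4 meters sideways.

```python
import math


def cumulative_distance(points):
    """Return the running total of point-to-point distance along a path."""
    totals = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        totals.append(totals[-1] + math.hypot(x1 - x0, y1 - y0))
    return totals


actual_path = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
gps_path = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]  # middle point is 4 m off

actual_total = cumulative_distance(actual_path)[-1]
gps_total = cumulative_distance(gps_path)[-1]
print(actual_total, gps_total)
```

One bad point turns a 6 meter walk into a 10 meter one, and on a straight path there is no way for an error to make the total shorter.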

Errors are visually represented using the scatter plot, which shows the random sampling of points in the XY plane (usually referred to as “lateral” or “horizontal” in GPS-speak) where the GPS device is located at 0,0. The line chart shows the Actual Path plotted in green, and the GPS Path plotted in red. Note: The scale of errors is exaggerated due to compression of the distance traveled (x‐axis). Visualize each chart as an aerial view and they’ll make a little more sense.

## Take Aways

In a theoretical world where our path is a straight line, we would only use two points to record it, minimizing GPS error. Adding any intermediate points would only introduce more error. In the real world we rarely travel in a straight line and we need to use more than just a few points to represent a path. This model, while very simple, does demonstrate how GPS error can accumulate over distances simply due to greater numbers of points being recorded. There is some validity to the argument for using fewer points to measure a path, however.

As an additional thinking point, an illustration is included on the spreadsheet that represents a circular path. Several examples are shown, each with a different sampling frequency “f”. The line segments (which are more apparent when the value of “f” is small) represent the GPS Path, while the perfect circle represents the Actual Path. The sum of the lengths of the line segments is divided by the circumference of the circle to provide a ratio of how accurate the measurement is for a given value of “f”.

It becomes clear that at f=8 the representation of the distance around the circle is becoming very accurate, and by f=16 the GPS errors introduced will likely have a negative impact on the resulting measurement. In this situation we would ideally want to travel at a speed that, in combination with the polling frequency, would result in at least 8 and no more than 16 measured points.
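For the curious, the circle ratio can be worked out directly. The Python sketch below assumes “f” is the number of evenly spaced points around the circle and that each point is recorded with no GPS error, so the GPS path is a regular polygon inscribed in the circle. The function name `polygon_to_circle_ratio` is made up for this example.

```python
import math


def polygon_to_circle_ratio(f):
    """Perimeter of a regular f-sided inscribed polygon divided by circumference.

    Each side has length 2*R*sin(pi/f), so the radius R cancels out.
    """
    return f * math.sin(math.pi / f) / math.pi


for f in (4, 8, 16, 32):
    print(f, round(polygon_to_circle_ratio(f), 4))
```

Under those assumptions the ratio is about 0.974 at f=8 and about 0.994 at f=16, so going past 16 points buys well under 1% of extra geometric accuracy while each extra point brings its own GPS error along.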

## Limitations and Suggestions for Further Study

The main limitation of this model is that it can never result in an underestimated distance. While underestimation is uncommon in real world situations, it does happen. It has been known to occur when the traveled path has many tight curves or “switchbacks” and the device happens to measure a straighter path through them due to the polling frequency and/or GPS errors that “shortcut” the path. Lastly, this study does not account for other factors that increase the probability of error, such as weak signal, or those that can have less predictable effects on the recorded data, such as “path smoothing”, which may be done at the time the GPS track is written to a file (either by the device, by software, or by way of settings that the user may be unaware of). For further reading of potential interest, get comfortable, then hit this up.

Future study on this subject will include both the circular path example (mentioned above) as well as a more complex path with multiple curves, in an attempt to demonstrate randomness and the effect of shortcutting. If you enjoy this type of geeky stuff (or not) please let us know in the comments. Insights on a technical level, or otherwise, are always appreciated too! **-JD**