GPSD Numbers Matter

ABSTRACT

Geodesy tortures numbers harder than almost any other discipline. It requires that very large numbers be known to very small precision. This leads to some unexpected, sometimes perplexing, choices in how gpsd handles numbers. This white paper will explore many of those choices.

Numeric Precision

Computers work with either floating point numbers in [IEEE754] format or binary numbers. Binary numbers are exact, floating point numbers are not. For example, the quantity 1/10 (one tenth) can not be represented exactly in IEEE 754. IEEE 754 is also known as IEC 60559,

An IEEE 754 single precision (float) has 7.22 decimal digits of precision (24 binary bits of precision).

An IEEE 754 double precision (double) has 15.95 decimal digits of precision (53 binary bits of precision)

Appendix F of the C99 standard requires the use of strict IEEE 754. Any standard conforming floating point program will result in the exact same results, regardless of distribution, architecture, C compiler, etc. C headers include "#define STDC_IEC_559 1" to indicated standards compliance. See C99 Section 6.10.8.2.

Beware: since about 2011, some distributions have decided to trade a bit of standards compliance for a bit of speed. C allows this deviant behavior as long as the C headers do not define "STDC_IEC_559" as 1.

gpsd regression tests only pass when "STDC_IEC_559" is 1.

printf() format %f defaults to %.6f. Below you will see how that may cause problems.

Python floats are very similar to IEEE 754 doubles, plus some annoying bugs related to [NaN]. More on that later.

A 32 bit integer has 9.33 decimal digits of precision. That is 31 binary digits of precision, plus a sign bit.

A 64 bit integer has 18.96 digits of decimal precision. That is 63 binary digits of precision, plus a sign bit.

Latitude and Longitude

Your GNSS receiver starts with really big, numbers. Like the Earth’s polar radius: 6356752.314245 m. Then with the help of a lot of math, computes your position to a high precision. The u-blox ZED-F9P reports 0.1 mm (1e-9 or 0.000000001 degree) precision. That is about 12 decimal digits of precision. It is certainly not that accurate, maybe soon. No matter, gpsd wants to not lose the precision of the data it is given.

Table 1. Degrees
Resolution D	D	DM	DMS	Distance	Units
1.0E+00	1°	1° 0'	1° 0' 0"	111.319	km
1.0E-01	0.1°	0° 6'	0° 6' 0"	11.132	km
1.0E-02	0.01°	0° 0.6'	0° 0' 36"	1.113	km
1.0E-03	0.001°	0° 0.06'	0° 0' 3.6"	111.319	m
1.0E-04	0.0001°	0° 0.006'	0° 0' 0.36"	11.132	m
1.0E-05	0.00001°	0° 0.0006'	0° 0' 0.036"	1.113	m
1.0E-06	0.000001°	0° 0.00006'	0° 0' 0.0036"	111.319	mm
1.0E-07	0.0000001°	0° 0.000006'	0° 0' 0.00036"	11.132	mm
1.0E-08	0.00000001°	0° 0.0000006'	0° 0' 0.000036"	1.113	mm
1.0E-09	0.000000001°	0° 0.00000006'	0° 0' 0.0000036"	111	µm

Table 2. Degrees Minutes
Resolution M	D	DM	DMS	Distance	Units
1.0E+00	0.016…°	0° 1'	0° 1' 0"	1.855	km
1.0E-01	0.0016…°	0° 0.1'	0° 0' 6.0"	185.533	m
1.0E-02	0.00016…°	0° 0.01'	0° 0' 0.6"	18.553	m
1.0E-03	0.000016…°	0° 0.001'	0° 0' 0.06"	1.855	m
1.0E-04	0.0000016…°	0° 0.0001'	0° 0' 0.006"	185.533	mm
1.0E-05	0.00000016…°	0° 0.00001'	0° 0' 0.0006"	18.553	mm
1.0E-06	0.000000016…°	0° 0.000001'	0° 0' 0.00006"	1.855	mm
1.0E-07	0.0000000016…°	0° 0.0000001'	0° 0' 0.000006"	185	µm

Table 3. Degrees Minutes Seconds
Resolution S	D	DM	DMS	Distance	Units
1.0E+00	0.00027…°	0° 0.016…'	0° 0' 1"	30.922	m
1.0E-01	0.000027…°	0° 0.0016…'	0° 0' 0.1"	3.092	m
1.0E-02	0.0000027…°	0° 0.00016…'	0° 0' 0.01"	309.221	mm
1.0E-03	0.00000027…°	0° 0.000016…'	0° 0' 0.001"	30.922	mm
1.0E-04	0.000000027…°	0° 0.0000016…'	0° 0' 0.0001"	3.092	mm
1.0E-05	0.0000000027…°	0° 0.00000016…'	0° 0' 0.00001"	309	µm

Source: [DD]

u-blox firmware since at least protocol version 4 (Antaris 4) has reported latitude and longitude to 0.0000001° (1e-7) with the UBX-NAV-POSLLH message. At that time, 1e-7 was wildly optimistic.

Starting with protocol version 20, u-blox devices with High Precision firmware support the UBX-NAV-HPPOSECEF message. That message reports to 0.1 mm precision.

Note this will not fit in a 32-bit integer or single precision float, but fits easily in a 64-bit integer or double precision float.

NMEA

More interesting is how precision relates to how NMEA reports latitude and longitude. NMEA sentences report those in ddmm.mmmmmm. Where d is in degrees and m is in minutes with a decimal fraction.

You can use Table 2 (Degrees Minutes) to see how the decimal fraction of minutes relates to NMEA reported latitude and longitude.

7 places after the decimal point is almost as precise as a ZED-F9P with the High Precision (HP) firmware can report.

printf() format %f defaults to %.6f, which will reduce the ZED-F9P precision of latitude and longitude. So print with %.7f, or even %.9f, if you have a survey grade GPS.

See [DD] for more information on Decimal Degrees and precision.

Altitude

Altitude, once you decide which altitude is which, is numerically easy for gpsd. The u-blox F9P can report altitudes (HAE and MSL) to a precision of 0.1 mm. [CoCom] limits the maximum altitude of civilian GNSS receivers to a maximum altitude of 18,000 m. That fits comfortably in 28 binary digits. 29 if you want to go to below Sea Level. This will not fit in a "float", but fits in a 32 integer (scaled) or a "double".

If you are a rocket scientist, and can get a GNSS receiver that works in Geostationary orbit (35,786 km) then you would need 39 binary digits of precision. This will not fit in a "float" or a 32 integer but fits in a 64 bit integer (scaled) or a "double".

Time

In the "Latitude and Longitude" section above we learned that C doubles are just fine for holding position information. The same can not be said for "Time". There is loss of precision when storing time as a double!

A double is 53 significant bits.
POSIX time to nanoSec precision is 62 significant bits
POSIX time to nanoSec precision after 2038 is 63 bits
POSIX time as a double is only micro second precision

That is why POSIX time as a double and PPS do not play well together.

WARNING: Loss of precision telling time as a double!

That is why gpsd tells time using struct timespec. That look like this:

  struct timespec {
      time_t  tv_sec;   // Seconds
      long    tv_nsec;  // Nanoseconds
  };

time_t is usually a 64-bit integer. Older systems, and some 32-bit systems, define time_t as a 32-bit integer, which is deprecated. A 32-bit integer will overflow at: 03:14:07 UTC on 19 January 2038. Plan for that apocalypse now. Source: [Y2038]

In 2021 cosmologists believe the age of the universe is just under 14 billion years. That is 4.4 e17 seconds, which fits comfortably into a 59 bit unsigned integer. A 64-bit time_t will be good enough for a while longer.

The other part of timespec_t is a long, carrying the nanosecond part of the time stamp. In 2021, a GNSS receiver can tell you the start of every second to about 1 nano second (1 e-9 seconds) accuracy. That fits comfortably into a 30 bit unsigned integer. As long integer in C is always at least 32 bits.

A timespec_t fails when you need to measure time to better than 1 nano second. The SI second is defined as 9,192,631,770 cycles of radiation from a caesium-133 (Cs) atom. Close to 0.1 nano seconds. That requires a 34 bit unsigned integer.

In 2021 the smallest frequency difference that can be measured is about 1 second in 400 million years, one part in about 1.23 e16. That involves clocks composed of strontium atoms, and measuring time differences with optical combs. The time difference between those two is thus 7.9 e-17 seconds per second. Needing a 54 bit unsigned integer fraction of a second to hold it.

Time Accuracy

Just because gpsd can represent a number does not mean the number means what you think it does. The u-blox ZED-F9T data sheet says it can output absolute PPS to 5ns. But the fine print says: "1-sigma, fixed position mode, depends on temperature, atmospheric conditions, baseline length, GNSS antenna, multipath conditions, satellite visibility and geometry".

There are many distinct embodiments of Universal Coordinated Time (UTC). In the USA there are two, one kept by the National Institute of Standards and Technology (NIST] and one by the US Naval Observatory (USNO). These are referred to as UTC(NIST) and UTC(USNO). The primary source for UTC(NIST) is in Fort Collins Colorado. Their secondary (backup) source is in Gaithersburg Maryland. According to [NIST2187], in 2021, the secondary UTC(NIST) site is only plus/minus 25 nano seconds aligned with the primary source. Don’t expect to tell time better than the NIST.

UTC(USNO) supplies the master clock for the GPS system. In 2020, NIST said that UTC(USNO) differed from UTC(NIST) by plus/minus 20 nano seconds. See [NIST-USNO]. Even if you could track GPS time perfectly, and it tracked UTC(USNO) perfectly, you are still off by plus/minus 20 nano seconds.

The biggest obstacle to gpsd and ntpd keeping accurate time is the granularity of the local host clock. The gpsd release includes a program called clock_test, and the NTPsec release includes a program in the attic called clocks. Both can characterize your system clock.

Using these programs you can determine the granularity of you system clock. Some examples:

CPU

Clock speed

Clock granularity

Standard deviation

Raspberry Pi 3B

1.2GHz

155 ns

120 ns

Raspberry Pi 4B

1.5GHz

56 ns

90 ns

Xeon E5-1620 v3

3.50GHz

14 ns

46 ns

Core i5-4570

3.20GHz

11 ns

231 ns

Core i7-8750H

2.2GHz

18 ns

19 ns

Ryzen 5 3600

3.6 GHz

10 ns

60 ns

Consider these best cases. Any load, reduced clock speed, I/O interrupts, interrupt latency, etc. will reduce the accuracy with which he system clock can be read or set. Your goal, and that of NIST stated in [NIST2187], is that you can tell time to less than 1 micro second.

Not Your Grandmother’s Pi

The Babylonians knew about Pi 4,000 years ago. You know pi as an irrational number that randomly continues forever. Slowly converging on the ratio between the radius and the circumference of a circle. At the start of the 20th century, about 500 digits of pi were known.

Here are the first 74 digits of pi:

3.1415926535897932384626433832795028841971693993751058209749445923078164062…

Forget that pi. GNSS systems do not use that pi.

USA Space Systems Command has decreed, in [IS-GPS-200], that pi is exactly 3.1415926535898. 45 binary digits of precision, about 14 decimal digits of precision. Galileo [OS_SIS_ICD], BeiDou [ICD_B1I], and QZSS [IS-QZSS] also mandate that pi.

Of course GLONASS <ICD-GLONASS>> has a different opinion. It uses 3.14159265358979. 49 binary digits of precision. One more decimal digit than GPS, et. al.

Both flavors of pi fit comfortably in the 53 bits of binary precision in an [IEEE754] double.

How much does this matter? SBAS satellites orbit in a Geosynchronous Equatorial Orbit (GEO). The radius of a GEO orbit is 42,164 km. The circumference of that orbit, computed using the GPS pi, and computed with the 74 digit pi above,, differ by 0.285 microns (micro meters).

The circumference of that orbit, computed using the GPS pi, and the GLONASS pi, differs by 0.422 microns (micro meters). Worse than using GPS pi.

According to the [IS-GPS-200] an orbit error of one meter equates to a position error of about one meter. So we can assume, for now, the selection of pi to use (GPS, GLONASS, etc.) is of minimal importance. Unless you are studying gravity waves.

NaN ain’t your Nana

The most obviously confounding choice is the use in gpsd of NaNs (Not A Number). gpsd keeps track of a large number of individual numbers, most of them are invalid at any one time. To keep track of which integers are valid, a bit field is used. When an integer is valid, a corresponding bit is set. Keeping track of which bit matches which integer is challenging. NaNs from [IEEE754] eliminate that problem with floating point numbers.

When gpsd marks a floating point number invalid, it sets the value to [NaN]. So before using any gpsd floating point number, check that it is valid. Standard C obeys [IEEE754]. Some distributions turn off strict compliance in exchange for a bit of speed. Those distributions will fail the gpsd regression tests. Python, and Golang, sort of obey [IEEE754], enough for our purposes.

C NaN

A little C program will make the behavior of [NaN] easy to see:

1
2
3
4
5
6
7
8
9
// Compile with: gcc nan.c -o nan
#include <stdio.h>     // for printf()

int main(int argc, char **argv)
{
    double a = 1.0 / 0.0;
    double b = -1.0 / 0.0;
    printf("a: %f b: %f\n", a, b);
}

What do you expect to see when that program is run? Try it:

~ $ gcc nan.c -o nan
~ $ ./nan
a: inf b: -inf

1.0 divided by 0.0 is infinity. -1.0 divided by 0.0 is negative infinity.

Any program that printed out a lot of "inf" or -inf" would annoy the users and they would complain. To avoid that, gpsd clients check, and print out "n/a" instead.

Here is a common solution:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
// Compile with: gcc nan.c -o nan
#include <math.h>      // for isnan()
#include <stdio.h>     // for printf()

int main(int argc, char **argv)
{
    double a = 1.0 / 0.0;
    if (isnan(a)) {
        printf("a: n/a\n");
    } else {
        printf("a: %f\n", a);
    }
}

What do you expect to see when that program is run? Try it:

~ $ gcc  nan.c -o nan
~ $ ./nan
a: inf

Whoops. All [NaN]s are not [NaN]s! Very confusing, rather than try to explain, I’ll send you to the Wikipedia explanation: [NaN]. But there is a simple solution. We do not really care if a number if [NaN], or if it is infinity. We care that it is finite, and that is easy to test for:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
// Compile with: gcc nan.c -o nan
#include <math.h>      // for isfinite()
#include <stdio.h>     // for printf()

int main(int argc, char **argv)
{
    double a = 1.0 / 0.0;
    if (isfinite(a)) {
        printf("a: %f\n", a);
    } else {
        printf("a: n/a\n");
    }
}

What do you expect to see when that program is run? Try it:

~ $ gcc  nan.c -o nan
~ $ ./nan
a: n/a

Exactly the desired result. Now you know why isfinite() is all over gpsd client code.

Python NaN

Python is similar, it almost follows [IEEE754], but has many undocumented "features" that conflict with [IEEE754]:

# python
>>> a = 1.0 / 0.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero

For shame. It does provide a sideways method to set a variable to various [NaN]s:

~ $ python
>>> Inf = float('inf')
>>> Ninf = float('-inf')
>>> NaN = float('NaN')
>>> print("Inf: %f Ninf: %f NaN: %f" % (Inf, Ninf, NaN))
Inf: inf Ninf: -inf NaN: nan

And math.isnan() and math.isfinite() work as expected. Continuing the previous example:

>>> import math
>>> math.isnan(Inf)
False
>>> math.isnan(NaN)
True
>>> math.isfinite(NaN)
False
>>> math.isfinite(Inf)
False

And that is why gpsd uses math.isfinite() instead of math.isnan().

[NaN]s have many other interesting properties, be sure to read up on the subject. The [IEEE754] document is a closed source standard. For a public description look at the Wikipedia [NaN] article.

Golang NaN

Golang has similar quirks to Python, "features" that conflict with [IEEE754], many undocumented. Criminally Go does not allow NaN to be used as a constant:

$ cat nan.go
package main

const myNaN = 1.0/0.0

$ go build nan.go
# command-line-arguments
./nan.go:3:19: invalid operation: division by zero

Or ven as a computed variable:

# cat nan.go
package main

func init() {
    myNaN := 1.0 / 0.0
}
kong /tmp # go build nan.go
# command-line-arguments
./nan.go:6:19: invalid operation: division by zero

Go provides the function math.NaN() as a roundabout way to get NaN. But it cannot be used as a constant:

# cat nan.go
package main

import "math"  // for math.NaN()

const myNaN = math.NaN()
kong /tmp # go build nan.go
# command-line-arguments
./nan.go:5:15: math.NaN() (value of type float64) is not constant

You can only use math.NaN() as a variable, you can’t make it a constant. Like this:

$ cat nan.go
package main

import "math"  // for math.NaN()

func init() {
    myNaN := math.NaN()
}
kong /tmp # go build nan.go
$ command-line-arguments
./nan.go:6:5: myNaN declared and not used

Go does not provide isFinite(), so you gpsd rolls its own:

func IsFinite(x float64) bool {
        if math.IsNaN(x) || math.IsInf(x, 0) {
                return false
        }
        return true
}

See above for why you should always use isFinite() before using a floating point number.

Baying at the moon

All the preceding hasa been about how precise numbers can, and should, be. But GNSS receivers are more or less random in the least significant digits. And not random in a nice convenient Bayesian way.

To be perfectly clear: GNSS errors are not normally distributed. You can not use Bayesian inferences on GNSS numbers. Forget them if you learned them. The Standard Distribution (sigma) no longer behaves as you would expect. For example, the "68–95–99.7 rule" fails.

In the true spirit of statistical analysis, here we will ignore the causes of errors, and only look at the shape of the end result data (latitude, longitude, altitude, etc.).

CEP

The primary measure of GNSS goodness is Circular Error Probability (CEP). If Y = CEP(X) then X% of the measurements will be within a circle of Y radius from the mean measurement. CEP(50), CEP(95) and CEP(99) are commonly used. gpsprof will compute those, and other statistics, for you.

Statisticians prefer to use the term Median Average Deviation (MAD) instead of CEP(50). But CEP(50) and MAD are the same thing.

Receiver manufacturers like to quote CEP(50), but who wants a receiver that is right only half of the time? CEP(99) is more interesting. You can use the CEP(99) to design a car navigation system that only leaves the road 1% of the time. About once every one hundred minutes.

It is easy to verify that your receiver errors do not have a normal distribution. With a normal distribution sigma / CEP(50) = 1.4826. I just used gpsprof to compute the CEP(50) and sigma ov my receiver over 12 hours. From that data: sigma / CEP(50) = 1.6446.

Skewness

On casual inspection, you should notice that data with a Bayesian Distribution is symmetric about the center. But a quick look at a gpsprof plot that GNSS data is not symmetric about the center.

In statistics, this property is called skewness[SKEW]. A normal distribution has a skewness of zero. The skewness of latitude, longitude, and altitude is often between negative 3 and negative 5 when measured at my office. Another confirmation that GNSS data is not Bayesian.

Kurtosis

Another property of a data distribution is the shape of the "tail" of the data as it gets further from the peak. In statistics, this property is called kurtosis[KURTOSIS].

A normal distribution has a kurtosis of three. A kurtosis greater than three means there are more outliers than in a normal distribution. The kurtosis of latitude, longitude, and altitude is often between nine and fifteen when measured at my office. More confirmation that GNSS data is not Bayesian.

REFERENCES

GPSD Project web site: https://gpsd.io/

[IEEE754] https://standards.ieee.org/standard/754-2019.html
[DD] https://en.wikipedia.org/wiki/Decimal_degrees
[CoCom] https://en.wikipedia.org/wiki/Coordinating_Committee_for_Multilateral_Export_Controls
[Y2038] https://en.wikipedia.org/wiki/Year_2038_problem
[NaN] https://en.wikipedia.org/wiki/NaN
[NIST2187] https://nvlpubs.nist.gov/nistpubs/TechnicalNotes/NIST.TN.2187.pdf
[NIST-USNO] https://www.nist.gov/pml/time-and-frequency-division/time-services/nist-usno/nist-usno-2020-archive
[IS-GPS-200] https://navcen.uscg.gov/gps-technical-references
[OS_SIS_ICD] https://galileognss.eu/galileo-gnss-documents/
[ICD_B1I] http://en.beidou.gov.cn/SYSTEMS/ICD/
[IS-QZSS] https://qzss.go.jp/en/technical/ps-is-qzss/ps-is-qzss.html
[ICD-GLONASS] https://glonass-iac.ru/en/documents/
[SKEW] https://en.wikipedia.org/wiki/Skewness
[KURTOSIS] https://en.wikipedia.org/wiki/Kurtosis

COPYING

This file is Copyright by Gary E. Miller This file is Copyright by the GPSD project
SPDX-License-Identifier: BSD-2-clause