3 Importing the Stage Route Data

The starting point for this adventure is an XML text file downloaded from the WRC website. The file contains data that describes the stage routes for the 2021 Rallye Monte Carlo using the KML file format. This format (“Keyhole Markup Language”) was originally developed as a way of loading data into the application that became Google Earth following its acquisition by Google. KML is now an international standard maintained by the Open Geospatial Consortium, Inc. (OGC).

If you see the .kmz file suffix attached to a file, the file is a compressed (zipped) KML file.
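Because a .kmz file is an ordinary zip archive, one way of getting at the KML inside it is to unpack it with base R's unzip() function. The following is a minimal sketch rather than a definitive recipe: the helper name extract_kml() and the example file name stages.kmz are hypothetical, and the sketch assumes the archive contains at least one .kml file.

```r
# Hypothetical helper: unpack a .kmz archive and return the path(s)
# of the KML file(s) it contains
extract_kml <- function(kmz_path, exdir = tempdir()) {
  stopifnot(file.exists(kmz_path))

  # List the archive contents without extracting anything yet
  contents <- unzip(kmz_path, list = TRUE)$Name
  kml_names <- grep("\\.kml$", contents, ignore.case = TRUE, value = TRUE)
  stopifnot(length(kml_names) > 0)

  # Extract just the KML file(s); unzip() returns the extracted file paths
  unzip(kmz_path, files = kml_names, exdir = exdir)
}

# Example usage (assumes a file called stages.kmz exists locally):
# kml_paths <- extract_kml("stages.kmz")
# kml_paths[1] can then be opened like any other KML file
```

The extracted file can then be loaded with the same tools used for the uncompressed KML file in the rest of this chapter.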

The first part of this chapter is the boring part, describing how to load the data in and some differences that arise depending on how you load it in. (If you do a web search, you’ll find there are various tools for opening KML files into an R program.) The second part shows how we can actually start to preview the data in a graphical way.

3.1 Downloading the Stage Data File

Let’s start by downloading the data from its web location:

file.url = 'https://webapps.wrc.com/2020/web/obc/kml/montecarlo_2021.xml'
downloaded.filename = 'montecarlo_2021.xml'

# Download the file from a specified web location to a specifically named file
download.file(file.url, downloaded.filename)

3.2 Opening KML Geodata Files

A wide variety of geodata file formats can be opened using general purpose geodata packages as well as certain specialist packages.

Two powerful general purpose packages are rgdal and sf, the simple features package.

3.2.1 Using rgdal and sp to Open Geodata Files

One way of reading in the file we have just downloaded is to use the readOGR() function found in the rgdal package. We can call this function explicitly from the package as rgdal::readOGR() or we can import the package and then access the function simply by calling it by name:

# Import the rgdal package
library(rgdal)

kml.file <- "montecarlo_2021.xml"

kml_sp = readOGR(kml.file)
## OGR data source with driver: KML 
## Source: "/Users/tonyhirst/Documents/GitHub/visualising-rally-stages/montecarlo_2021.xml", layer: "Meine Orte"
## with 9 features
## It has 2 fields
## Warning in readOGR(kml.file): Z-dimension discarded

The readOGR() function is capable of loading in a wide variety of geo-related file formats and automatically detecting what sort of format the file represents.

We can see what sort of object is loaded in by inspecting its class():

class(kml_sp)
## [1] "SpatialLinesDataFrame"
## attr(,"package")
## [1] "sp"

In this case, the KML file is loaded in and parsed into a SpatialLinesDataFrame object, although, as you may have noticed from the warning message when the file was loaded, the Z (altitude) dimension has been discarded (as we shall see later, it actually contains only zero values anyway).

The SpatialLinesDataFrame datatype, along with other spatial datatypes, is defined in the R sp package. To a certain extent, the classes (that is, data types) declared by this package have been superseded by a different datatype hierarchy defined by the more recent sf package. However, the sp package is still a dependency of many of R’s spatial data packages and some functions rely on being presented with SpatialLinesDataFrame object data, for example.

3.2.2 Using sf to Open Geodata Files

The sf package is a more recently created package for working with geodata primitives and is maintained under the auspices of the r-spatial GitHub organisation.

We can load in data from a wide range of geodata file formats using the sf::st_read() function:

library(sf)

kml_sf = st_read(kml.file)
## Reading layer `Meine Orte' from data source `/Users/tonyhirst/Documents/GitHub/visualising-rally-stages/montecarlo_2021.xml' using driver `KML'
## Simple feature collection with 9 features and 2 fields
## geometry type:  LINESTRING
## dimension:      XYZ
## bbox:           xmin: 5.243488 ymin: 43.87633 xmax: 6.951953 ymax: 44.81973
## z_range:        zmin: 0 zmax: 0
## geographic CRS: WGS 84

In this case, we notice that the data has been loaded into a simple feature collection.

The data loaded into each object is the same, but it is represented differently. There are ways of converting between various forms of the two representations as we shall see later.

One thing to note in each case is that the data appears to have been loaded in from a particular layer. The KML file format is capable of grouping various sets of data together in different ways. Where the datafile contains only one element that is decoded as a “layer”, that layer is loaded in by default. If multiple layers are detected, they will be reported and can then be loaded in and “unpacked” by name.

We can also review the contents of the file by opening it with the sf::st_layers() function:

# Preview the file layers
st_layers(kml.file)
## Driver: KML 
## Available layers:
##   layer_name  geometry_type features fields
## 1 Meine Orte 3D Line String        9      2

If there is more than one layer, we can load a particular layer in by name:

kml_sf = st_read(kml.file, "Meine Orte")
## Reading layer `Meine Orte' from data source `/Users/tonyhirst/Documents/GitHub/visualising-rally-stages/montecarlo_2021.xml' using driver `KML'
## Simple feature collection with 9 features and 2 fields
## geometry type:  LINESTRING
## dimension:      XYZ
## bbox:           xmin: 5.243488 ymin: 43.87633 xmax: 6.951953 ymax: 44.81973
## z_range:        zmin: 0 zmax: 0
## geographic CRS: WGS 84

3.2.2.1 Reviewing the sf Feature Collection

The layer contains a feature collection with features containing linestrings in 3-dimensions (XYZ). There are a couple of key things to note:

  • the coordinate reference system is WGS 84, the common “lat long” system
  • the Z-range (altitude, or elevation) appears to be zeroed.

If we refer back to the data object loaded in using the rgdal::readOGR() function, we note that it does not contain the Z co-ordinate. We can see this more clearly if we convert that sp SpatialLinesDataFrame object to a simple features object using the sf::st_as_sf() function:

st_as_sf(kml_sp)
## Simple feature collection with 9 features and 2 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: 5.243488 ymin: 43.87633 xmax: 6.951953 ymax: 44.81973
## geographic CRS: WGS 84
##       Name Description                       geometry
## 0     SS 1             LINESTRING (5.894486 44.735...
## 1     SS 2             LINESTRING (6.09604 44.8040...
## 2   SS 3/6             LINESTRING (5.722938 44.487...
## 3   SS 4/7             LINESTRING (5.355052 44.500...
## 4     SS 5             LINESTRING (5.518181 44.283...
## 5  SS 9/11             LINESTRING (6.30413 44.4428...
## 6    SS 10             LINESTRING (6.57791 44.6499...
## 7 SS 12/14             LINESTRING (6.900294 43.950...
## 8 SS 13/15             LINESTRING (6.77335 43.8763...

A conversion also exists back from the sf object to the sp representation:

round_trip = as( st_as_sf(kml_sp), "Spatial")

However, if we try the same conversion on the simple features collection created directly from the parsed KML file, we get an error:

#round_trip2 = as( kml_sf, "Spatial")

The problem appears to be the Z dimension. If we drop the zeroed Z dimension manually, whilst preserving the coordinate reference system:

kml_sf = st_zm(kml_sf, drop = TRUE, what = "ZM")

we can then convert this simple feature collection to a SpatialLinesDataFrame object:

round_trip2 = as( kml_sf, "Spatial")
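To see what st_zm() does in isolation, here is a minimal sketch using a toy two-point route rather than the stage data. The object names (pts, toy_xyz, toy_xy) and the feature name are invented for illustration. The first element of each geometry's class records its dimension, so we can check it before and after dropping the Z values:

```r
library(sf)

# A toy two-point route with a zeroed Z coordinate (illustrative values)
pts <- matrix(c(5.894486, 44.73562, 0,
                5.894618, 44.73580, 0),
              ncol = 3, byrow = TRUE)
toy_xyz <- st_sf(Name = "toy stage",
                 geometry = st_sfc(st_linestring(pts), crs = 4326))

# The first element of the geometry class records the dimension
class(st_geometry(toy_xyz)[[1]])[1]
## [1] "XYZ"

# Drop the Z (and any M) values
toy_xy <- st_zm(toy_xyz, drop = TRUE, what = "ZM")
class(st_geometry(toy_xy)[[1]])[1]
## [1] "XY"

# The coordinate reference system is carried over unchanged
st_crs(toy_xy) == st_crs(toy_xyz)
## [1] TRUE
```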

3.2.3 Accessing Route Data as a geojson String

GeoJSON is a widely used format for getting geodata into web pages. We can get the route for a stage from the routes spatial collection and cast it to JSON using the geojsonio::geojson_json function:

# Retrieve the geojson for a single stage and
# from within that, the linestring geometry,
# casting it to a geojson string
stage_route_gj = geojsonio::geojson_json(kml_sf[1,]$geometry)

3.2.4 Saving Simple Features Data to Various Geodata File Formats

The st_write function has a range of output drivers for writing geodata to different output types, although data may be lost and/or the conversion of a dataset loaded from one format and written out to another may not be as meaningful as desired.

For example, we can write an object out to a geojson file:

geojson_filename = 'montecarlo_2021.geojson'

# The st_write function can update files or create new ones, but not
# replace existing ones. So let's make sure the file doesn't exist
# by deleting it if it does...
if (file.exists(geojson_filename)) {
  #Delete file if it exists
  file.remove(geojson_filename)
}
## [1] TRUE
st_write(kml_sf, geojson_filename, driver='geojson')
## Writing layer `montecarlo_2021' to data source `montecarlo_2021.geojson' using driver `geojson'
## Writing 9 features with 2 fields and geometry type Line String.
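As an alternative to removing the file by hand, st_write() also accepts a delete_dsn argument that asks for any existing data source to be deleted before writing. The following sketch uses a small toy object and a temporary file rather than the stage data. The object and file names are made up for illustration.

```r
library(sf)

# A toy simple features object with two points (illustrative values)
toy <- st_sf(Name = c("A", "B"),
             geometry = st_sfc(st_point(c(5.9, 44.7)),
                               st_point(c(6.1, 44.8)),
                               crs = 4326))

toy_filename <- tempfile(fileext = ".geojson")

# Check that the GeoJSON driver is available in this GDAL build
"GeoJSON" %in% st_drivers("vector")$name

# delete_dsn = TRUE removes any existing file before writing,
# so repeating the call does not fail on the second attempt
st_write(toy, toy_filename, driver = "GeoJSON", delete_dsn = TRUE, quiet = TRUE)
st_write(toy, toy_filename, driver = "GeoJSON", delete_dsn = TRUE, quiet = TRUE)

# Read the file back in to confirm both features were written
toy_back <- st_read(toy_filename, quiet = TRUE)
nrow(toy_back)
## [1] 2
```

The quiet = TRUE argument simply suppresses the progress messages shown in the earlier examples.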

Equally, we can write the loaded data out as a GPX data file (GPX data files are often used to share data collected from cycling or running route logging applications and devices):

gpx_filename = 'route.gpx'

# Remove any previous instances of this file
if (file.exists(gpx_filename)) {
  #Delete file if it exists
  file.remove(gpx_filename)
}
## [1] TRUE
st_write(kml_sf, gpx_filename, 
         driver='GPX', dataset_options ="GPX_USE_EXTENSIONS=yes" )
## options:        GPX_USE_EXTENSIONS=yes 
## Writing layer `route' to data source `route.gpx' using driver `GPX'
## Writing 9 features with 2 fields and geometry type Line String.

If we have access to car telemetry data in a simple tabular form, the GPX format may offer a convenient way of serialising that data.

3.3 Important Geodata File Formats

The sf::st_read() function (as well as the rgdal::readOGR() function) is capable of reading in data from a wide variety of file formats.

We have already seen how it can load in data from a KML file, so let’s see how it copes with some other file formats.

3.3.1 Loading geojson Data

As well as KML files, route data may be available in the GeoJSON text format. We can read a GeoJSON data file into R using the sf::st_read() function, which returns the data as a spatial object:

geojson_sf = sf::st_read(geojson_filename)
## Reading layer `montecarlo_2021' from data source `/Users/tonyhirst/Documents/GitHub/visualising-rally-stages/montecarlo_2021.geojson' using driver `GeoJSON'
## Simple feature collection with 9 features and 2 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: 5.243488 ymin: 43.87633 xmax: 6.951953 ymax: 44.81973
## geographic CRS: WGS 84

As before, we can convert the simple features object to a Spatial dataframe by dropping the z-axis and then converting:

geojson_sp = as(st_zm(geojson_sf, drop = TRUE, what = "ZM"), "Spatial")

If you have a geojson string, you can cast it to a spatial object using the geojsonio::geojson_sp(geojson_str) function:

geojson_str = '{"type": "Point","coordinates": [-105.01621,39.57422]}'

class( geojsonio::geojson_sp(geojson_str) )
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
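If a simple features object is wanted rather than an sp object, sf::st_read() can also be handed the GeoJSON text itself in place of a file path. This is a sketch of that alternative route, reusing the same point string. The name pt_sf is just an illustrative choice.

```r
library(sf)

geojson_str <- '{"type": "Point","coordinates": [-105.01621,39.57422]}'

# st_read() accepts GeoJSON text in place of a file path
pt_sf <- st_read(geojson_str, quiet = TRUE)

# The result is a simple feature collection containing a single point
class(pt_sf)
st_coordinates(pt_sf)
```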

3.3.2 Loading GPX Data

Route data collected by personal GPS devices is often shared using GPX files.

We can read in GPX files using sf::st_read:

st_read(gpx_filename)
## Multiple layers are present in data source /Users/tonyhirst/Documents/GitHub/visualising-rally-stages/route.gpx, reading layer `waypoints'.
## Use `st_layers' to list all layer names and their type in a data source.
## Set the `layer' argument in `st_read' to read a particular layer.
## Warning in evalq((function (..., call. = TRUE, immediate. = FALSE, noBreaks. =
## FALSE, : automatically selected the first layer in a data source containing more
## than one.
## Reading layer `waypoints' from data source `/Users/tonyhirst/Documents/GitHub/visualising-rally-stages/route.gpx' using driver `GPX'
## Simple feature collection with 0 features and 23 fields
## bbox:           xmin: NA ymin: NA xmax: NA ymax: NA
## geographic CRS: WGS 84
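The empty result above arises because the first layer in the file, waypoints, contains no features. Linestrings written with the GPX driver are typically stored in the routes layer, so we need to ask for that layer by name. The following sketch demonstrates this with a small toy file rather than the stage data. The object names, file name and coordinate values are invented for illustration.

```r
library(sf)

# Two toy routes; "name" is one of the standard GPX route fields
toy_routes <- st_sf(
  name = c("SS 1", "SS 2"),
  geometry = st_sfc(
    st_linestring(matrix(c(5.894486, 44.73562,
                           5.894618, 44.73580), ncol = 2, byrow = TRUE)),
    st_linestring(matrix(c(6.09604, 44.80400,
                           6.09620, 44.80420), ncol = 2, byrow = TRUE)),
    crs = 4326))

toy_gpx <- tempfile(fileext = ".gpx")
st_write(toy_routes, toy_gpx, driver = "GPX", quiet = TRUE)

# List the layers the GPX driver exposes for this file
st_layers(toy_gpx)

# Read the routes layer explicitly rather than the default first layer
toy_back <- st_read(toy_gpx, layer = "routes", quiet = TRUE)
nrow(toy_back)
```

The same layer argument could be tried with the route.gpx file created earlier, for example st_read(gpx_filename, layer = "routes").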

The plotKML package, which has a wide range of tools for creating KML files and rendering KML into Google Earth, also has a custom function for loading in GPX files:

#http://plotkml.r-forge.r-project.org/readGPX.html
library(plotKML)

gpx = readGPX(gpx_filename)

In this particular case, the GPX file contains multiple routes which we index by name.

A dataframe of point values, one point per row, is associated with each route:

head(gpx$routes$`SS 1`, 3)
##        lon      lat name  cmt desc  sym type
## 1 5.894486 44.73562 <NA> <NA> <NA> <NA> <NA>
## 2 5.894618 44.73580 <NA> <NA> <NA> <NA> <NA>
## 3 5.894778 44.73602 <NA> <NA> <NA> <NA> <NA>

In a “born GPX” file, we might expect to see more of the columns populated. As it currently stands, the GPX file we created from the original data, which was more or less limited to simple 2D linestrings, contains just the latitude and longitude data, albeit still in distinct, stage-identifiable routes.
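If we want to get from one of these per-route dataframes back to a simple features object, one option is to build a linestring from the lon and lat columns. The sketch below uses a stand-in dataframe containing the first three points shown above rather than the full route. The names route_df, route_line and route_sf are illustrative.

```r
library(sf)

# Stand-in for gpx$routes$`SS 1`, using the first three points shown above
route_df <- data.frame(lon = c(5.894486, 5.894618, 5.894778),
                       lat = c(44.73562, 44.73580, 44.73602))

# Coordinates go in as a matrix in x (lon), y (lat) order
route_line <- st_linestring(unname(as.matrix(route_df[, c("lon", "lat")])))

# Wrap the linestring in a simple features object with a WGS 84 CRS
route_sf <- st_sf(Name = "SS 1",
                  geometry = st_sfc(route_line, crs = 4326))
route_sf
```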

3.3.3 Loading Flight Data Using IGC format GPS Files

GPS route data contained in IGC formatted flight data files can be loaded in using the geoviz::read_igc("path/to/your/file.igc") function.

3.3.4 Reading Data from GPS Devices

The pgirmess (Spatial Analysis and Data Mining for Field Ecologists) package provides a range of tools for retrieving data from GPS devices and then analysing it. The pgirmess::gps2gpx() function supports retrieving GPS data from a range of devices via the GPSBabel application (http://www.gpsbabel.org/). Waypoint or track data can be written to GPX files in local storage with the pgirmess::writeGPX() function, and GPX data can be uploaded back to Garmin GPS devices with pgirmess::uploadGPS().