5 Finding Pace Across Stages

Average speed on a rally is all very well, but it’s not the most useful of metrics for making sense of what’s actually going on in a rally. Far more useful is the notion of pace the reciprocal of a speed like measure, that tells you how many seconds it’s taking each driver to cover one kilometer.

Knowing the pace allows you to make more direct comparisons between drivers, as well as simplifying rule of thumb calculations, like what sort of pace advantage a driver needs to make up the 2s to the leader over the remaining 100 kilometers available in the final four stages…

In this chapter, we look at some simple pace calculations, rebase pace values relative to a specified driver, and explore a couple of ways of visualising differential pace over the course of a rally in the form of pace maps and off-the-pace charts.

5.1 Load Base Data

To get the stage data from a standing start, we can load in the current season list, select the rally we want, look up the itinerary from the rally, extract the sections and then the stages, and from that access the stage ID for the stage or stages we are interested in.

Load in the helper functions:

source('code/wrc-api.R')

And get the base data:

s = get_active_season()
eventId = get_eventId_from_name(s, 'arctic')

itinerary = get_itinerary(eventId)
sections = get_sections(itinerary)
stages = get_stages(sections)
stages_lookup = get_stages_lookup(stages)

# Driver details
entries = get_rally_entries(eventId)
cars = get_car_data(entries)

Get a sample stage ID:

stageId = stages_lookup[['SS3']]

5.2 Defining Pace

With variable stage distances on a stage rally, metrics such as average speed provide one way of comparing performances across stage, calculated as \(\textrm{stage_time}/\textrm{stage_distance}\) with units of kilometers or miles per hour.

A more useful measure, particularly in rally terms, is the notion of pace, typically given with units of seconds per kilometer. Speed tells us much quickly a car covers distance in unit time; pace gives us an indication of how much time is required to travel a unit distance.

When used as a rebased difference measure between drivers, pace difference allows us to rapidly calculate how much time a driver is likely to gain or lose over a particular stage distance as per the word equation \(\textrm{time_gain}=\textrm{stage_distance}\cdot\textrm{pace_difference}\).

Basic pace itself is given as \(\textrm{pace}=\textrm{time}/\textrm{distance}\).

Developing our rally algebra, we might identify the stage distance for stage \(S\) as \(d_S\). For a stage time by driver \(i\) of \({_S}t_i\) the stage pace \({_S}p_i\) for driver \(i\) on stage \(S\) is then given as:

\[ {_S}p_i = \frac{{_S}t_i}{d_S} \]

5.3 Calculating Stage Pace

We can calculate stage pace from stage times and stage distances.

We can find stages distances directly from the stages dataframe:

stages %>% select(c('code','distance')) %>% head(3)
##   code distance
## 1  SS1    31.05
## 2  SS2    31.05
## 3  SS3    24.43

5.3.1 Calculating Pace for a Single Stage

Let’s start by looking a single stage using a recipe we have used before:

# Example stage code
stage_code = 'SS3'

stageId = stages_lookup[[stage_code]]

# Get the stage distance
stage_distance = stages[stages['code']==stage_code, 'distance']

# Get driver metadata
cars = get_car_data(entries)

# Create stage times with merged in driver metadata
stage_times = get_stage_times(eventId, stageId) %>%
                      arrange(position) %>%
                      head(10) %>%
                      # Merge in the entries data
                      merge(cars, by='entryId')  %>%
                      # Convert milliseconds to seconds
                      mutate(TimeInS = elapsedDurationMs/1000)  %>%
                      # Limit columns and set column order
                      select(c('position', 'identifier',
                               'code', 'TimeInS')) %>%
                      # The merge may upset the row order
                      # so reset the order again
                      arrange(position) %>%
                      # Improve column names by renaming them
                      rename(Pos=position,
                             Car = identifier,
                             Code = code,
                             `Time (s)` = TimeInS)

formattable(stage_times )
Pos Car Code Time (s)
1 8 TÄN 834.5
2 11 NEU 835.5
3 2 SOL 839.1
4 69 ROV 840.2
5 42 BRE 841.9
6 1 OGI 842.0
7 18 KAT 843.2
8 7 LOU 844.3
9 33 EVA 845.8
10 3 SUN 846.9

We can now calculate pace as the stage time divided by the stage distance:

stage_times$pace = stage_times$'Time (s)' / stage_distance

stage_times
##    Pos Car Code Time (s)     pace
## 1    1   8  TÄN    834.5 34.15882
## 2    2  11  NEU    835.5 34.19975
## 3    3   2  SOL    839.1 34.34711
## 4    4  69  ROV    840.2 34.39214
## 5    5  42  BRE    841.9 34.46173
## 6    6   1  OGI    842.0 34.46582
## 7    7  18  KAT    843.2 34.51494
## 8    8   7  LOU    844.3 34.55997
## 9    9  33  EVA    845.8 34.62137
## 10  10   3  SUN    846.9 34.66639

5.3.2 Calculating Pace for Multiple Stages

First, let’s get the data for all the stages:

stage_list = get_stage_list(stages)

multi_stage_times = get_multi_stage_times(stage_list)
  
multi_stage_times %>% tail(2)
##     stageTimeId stageId entryId elapsedDurationMs  elapsedDuration    status
## 539       96810    1749   21571           1301693 00:21:41.6930000 Completed
## 540       96793    1749   21541                NA             <NA>       DNS
##      source position diffFirstMs        diffFirst diffPrevMs         diffPrev
## 539 Default       52      699224 00:11:39.2240000     153590 00:02:33.5900000
## 540 Default       NA          NA             <NA>         NA             <NA>

We can generate the pace by adding the stage distance as an extra column and performing the pace calculation.

We’ll also take the opportunity to merge in driver metadata and limit cars to WRC group entries:

get_multi_stage_pace = function(multi_stage_times, cars) {
  multi_stage_times %>%
                    merge(stages[,c('stageId' ,'distance',
                                    'number', 'code')],
                          by='stageId') %>%
                    mutate(elapsedDurationS = elapsedDurationMs / 1000,
                            pace = elapsedDurationS / distance) %>%
                    merge(cars[,c('entryId','drivername',
                                  'code', 'groupname')],
                                    by='entryId',
                          suffixes=c('','_driver')) %>%
                    filter(groupname=='WRC') %>%
                    select(c('stageId', 'number', 'code_driver',
                             'elapsedDurationS', 'pace', 'code'))  %>%
                    arrange(number, elapsedDurationS)
  
}

multi_stage_pace = get_multi_stage_pace(multi_stage_times, cars)

multi_stage_pace %>% head(3)
##   stageId number code_driver elapsedDurationS     pace code
## 1    1747      1         TÄN            957.8 30.84702  SS1
## 2    1747      1         BRE            961.4 30.96296  SS1
## 3    1747      1         ROV            968.4 31.18841  SS1

Create a mapping from stage ID to stage codes and cast the ordered list of stage Ids to an ordered list of stage codes:

get_stage_codes = function(stages){
     # Create a stage code mapping function
    stages_lookup_code = get_stages_lookup(stages, 'stageId', 'code')
    stage_code_map = function(stageId)
      stages_lookup_code[[as.character(stageId)]]
    
    # Map stage ID column names to stage codes
    stage_codes = unlist(purrr::map(stage_list,
                                    function (x) stage_code_map(x)))
    stage_codes 
}

stage_codes = get_stage_codes(stages)

Use the generic widener function to widen the pace dataframe to give the pace for each driver on each stage:

pace_wide = get_multi_stage_generic_wide(multi_stage_pace,
                                         stage_codes, 'pace',
                                         # Unique group keys required
                                         # Driver code not guaranteed unique
                                         group_key=c('code_driver'),
                                         spread_key='code')

pace_wide %>% head(3)
##   code_driver      SS1      SS2      SS3      SS4      SS5      SS6      SS7
## 1         BER 32.66345 32.84380 58.71879 57.58915 48.77529 58.83340 57.69965
## 2         BRE 30.96296 31.09501 34.46173 27.69965 27.33382 34.84241 28.04119
## 3         EVA 31.38486 31.18196 34.62137 27.45354 27.10260 34.94065 27.75490
##        SS8      SS9     SS10
## 1 49.27746 28.69159 28.64174
## 2 28.14668 27.32532 26.82381
## 3 28.32731 27.03605 27.05955

5.4 Rebasing Stage Pace

We can rebase the stage pace according to a specific driver:

example_driver = pace_wide[2,]$code_driver

pace_wide_rebased = rebase(pace_wide, example_driver, stage_codes,
                           id_col='code_driver')

pace_wide_rebased %>% head(3)
##   code_driver       SS1        SS2        SS3        SS4        SS5         SS6
## 1         BER 1.7004831 1.74879227 24.2570610 29.8895028 21.4414740 23.99099468
## 2         BRE 0.0000000 0.00000000  0.0000000  0.0000000  0.0000000  0.00000000
## 3         EVA 0.4219002 0.08695652  0.1596398 -0.2461075 -0.2312139  0.09823987
##          SS7        SS8        SS9      SS10
## 1 29.6584631 21.1307803  1.3662661 1.8179350
## 2  0.0000000  0.0000000  0.0000000 0.0000000
## 3 -0.2862883  0.1806358 -0.2892746 0.2357365

More abstractly, the rebased pace, \({_S}p_i^j\), for driver \(i\) relative to driver \(j\) on stage \(S\) is given as:

\[ {_S}p_i^j = {_S}p_i - {_S}p_j = \frac{{_S}t_i - {_S}t_j}{d_S} = \frac{{_S}t_i^j}{d_S} \]

5.5 The Ultimate Rally

Finally, in passing, it is worth noting that we can calculate an “ultimate rally” time from the sum of th fastest stage times completed on the rally, by any driver. This gives us the fastest possible rally time from recorded stage times against which we can compare the performance of the rally winner. (Of course, it might be that a particularly fast time on one stage by a particular driver ruined the rest of their loop!)

Furthermore, when split times are available, we can go even further and construct and ultimate ultimate (sic) rally time from ultimate stage times that have themselves been constructed from ultimate split times on the stage.