8 Visualising Pace Across Splits
We have already seen how to perform pace calculations on stage-level data, and how to use pace maps and off-the-pace charts to visualise pace over the course of a rally.
But in WRC rallies at least, the stages are often long enough, and the promoter well resourced enough, to merit the collection of split data at various split points along a stage. So in this chapter, we’ll review how to create pace charts and apply the same techniques to plotting progress within a stage, across stage splits.
8.1 Load Base Data
As ever, load in the helper functions:
source('code/wrc-api.R')
source('code/wrc-wrangling.R')
source('code/wrc-charts.R')
And get the base data:
s = get_active_season()
eventId = get_eventId_from_name(s, 'arctic')
itinerary = get_itinerary(eventId)
sections = get_sections(itinerary)
stages = get_stages(sections)
stages_lookup = get_stages_lookup(stages)
# Quick Lookups
stage_list = get_stage_list(stages)
stage_codes = stages$code
# Driver details
entries = get_rally_entries(eventId)
cars = get_car_data(entries)
Get a sample stage ID and associated splits:
# Get example stage ID
stageId = stages_lookup[['SS3']]
# Get splits for the stage
splits = get_splits(eventId, stageId)
splits_locations = get_split_locations(splits)
splits_list = splits_locations$splitPointId
split_names = splits_locations$splitname
# Get wide format data
splits_wide = get_splits_wide(splits) %>%
relabel_times_df2(splits_list, cars, typ='split')
splits_wide %>% head(2)
## code split_1 split_2 split_3 split_4 split_5
## 1 OGI 161.7 272.3 471.3 690.4 789.2
## 2 EVA 162.3 273.7 472.3 692.1 792.4
Get long form splits data for one or more stages, in this case, just a single stage:
splits_long = get_multi_split_times(stageId)
8.1.1 Obtaining Split Distances
We can find the distance between each split as the difference between consecutive values. Let’s augment the splits_locations with these values as well as with section start distances:
splits_locations$start_dist = lag(splits_locations$distance,
default=0)
splits_locations$section_dist = c(splits_locations$distance[1],
diff(splits_locations$distance))
splits_locations
## splitPointId stageId number distance splitname start_dist section_dist
## 1 3615 1750 1 4.83 split_1 0.00 4.83
## 2 3601 1750 2 9.02 split_2 4.83 4.19
## 3 3621 1750 3 14.87 split_3 9.02 5.85
## 4 3617 1750 4 20.63 split_4 14.87 5.76
## 5 3593 1750 5 23.21 split_5 20.63 2.58
We can also retrieve these section distances as a named vector, labelled either by split name (as here) or by splitPointId:
split_distances = splits_locations$section_dist
# Label distances using split names
names(split_distances) = split_names
# Alternatively, label the values using splitPointId
#names(split_distances) = splits_locations$splitPointId
split_distances
## split_1 split_2 split_3 split_4 split_5
## 4.83 4.19 5.85 5.76 2.58
We recall that the split points do not include the final timing line (the finish), so a complete set of distances also means we need to access the overall stage distance and account for that:
stage_dist = stages[stages['stageId']==stageId,'distance']
stage_dist
## [1] 24.43
The complete set of intermediate distances is then:
full_split_distances = c(split_distances, stage_dist-sum(split_distances))
names(full_split_distances) = c(split_names, 'total')
full_split_distances
## split_1 split_2 split_3 split_4 split_5 total
## 4.83 4.19 5.85 5.76 2.58 1.22
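As a cross-check on the distances above, the following minimal sketch rebuilds the section table in base R, treating the finish line as one more timing point. The distances are copied from the output shown above; the names `timing_points`, `section_table` and the `finish` label are illustrative assumptions rather than part of the helper library.

```r
# Cumulative distance (km) into the stage at each split point,
# copied from the splits_locations output above
split_dist <- c(split_1 = 4.83, split_2 = 9.02, split_3 = 14.87,
                split_4 = 20.63, split_5 = 23.21)
stage_dist <- 24.43  # overall stage distance (km)

# Treat the finish line as a final timing point ("finish" is an illustrative label)
timing_points <- c(split_dist, finish = stage_dist)

# Each section starts where the previous one ended; the first starts at 0
start_dist <- c(0, head(timing_points, -1))
section_dist <- timing_points - start_dist

section_table <- data.frame(point = names(timing_points),
                            start_dist = unname(start_dist),
                            end_dist = unname(timing_points),
                            section_dist = round(unname(section_dist), 2))
print(section_table)
```

By construction, the section distances in this table sum to the overall stage distance, which makes a convenient sanity check on the split location data.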
8.2 Calculating Splits Pace
To calculate pace between two split points we need to get the elapsed time between those two points as well as the distance between split points.
We can obtain the split differences by finding differences between the columns of the wide format dataframe using the get_split_duration() function we created previously:
#split_cols = get_split_cols(splits)
split_durations_wide = get_split_duration(splits_wide, split_names,
id_col='code')
split_durations_wide %>% head(3)
## code split_1 split_2 split_3 split_4 split_5
## 1 OGI 161.7 110.6 199.0 219.1 98.8
## 2 EVA 162.3 111.4 198.6 219.8 100.3
## 3 NEU 159.1 109.5 197.2 216.9 100.0
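For readers without the helper files to hand, the differencing step can be sketched in base R as follows. This is not the get_split_duration() implementation, just a hypothetical stand-in (`split_duration_sketch`) that assumes the wide dataframe holds accumulated elapsed times in seconds, with the demo values copied from the splits_wide output shown earlier.

```r
# Elapsed times (s) at each split, copied from the splits_wide output above
splits_wide_demo <- data.frame(
  code = c("OGI", "EVA"),
  split_1 = c(161.7, 162.3), split_2 = c(272.3, 273.7),
  split_3 = c(471.3, 472.3), split_4 = c(690.4, 692.1),
  split_5 = c(789.2, 792.4)
)

# Hypothetical stand-in for get_split_duration():
# section duration = elapsed time at a split minus elapsed time at the previous split
split_duration_sketch <- function(df, split_cols, id_col = "code") {
  elapsed <- as.matrix(df[, split_cols, drop = FALSE])
  # The first section runs from the start line, so its duration is the first split time
  durations <- cbind(elapsed[, 1],
                     elapsed[, -1, drop = FALSE] -
                       elapsed[, -ncol(elapsed), drop = FALSE])
  colnames(durations) <- split_cols
  # Round to the 0.1 s resolution of the source timing data
  data.frame(df[id_col], round(durations, 1))
}

durations_demo <- split_duration_sketch(splits_wide_demo,
                                        paste0("split_", 1:5))
print(durations_demo)
```

The result should agree with the first two rows of the split_durations_wide output above.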
We can then find the pace by dividing the split section times through by the split distances:
section_pace_wide = split_durations_wide
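# Use a loop variable other than s, which already holds the season data
for (split_name in split_names) {
  section_pace_wide[,split_name] = section_pace_wide[,split_name] / split_distances[split_name]
}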
section_pace_wide %>% head(2)
## code split_1 split_2 split_3 split_4 split_5
## 1 OGI 33.47826 26.39618 34.01709 38.03819 38.29457
## 2 EVA 33.60248 26.58711 33.94872 38.15972 38.87597
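The same division can also be written without an explicit loop. The sketch below is one possible alternative, assuming section durations in seconds and section distances in kilometres (so pace is in seconds per kilometre); it uses base R's sweep() and demo values copied from the outputs above, with the `_demo` names being illustrative.

```r
# Section durations (s), copied from the split_durations_wide output above
durations_demo <- data.frame(
  code = c("OGI", "EVA"),
  split_1 = c(161.7, 162.3), split_2 = c(110.6, 111.4),
  split_3 = c(199.0, 198.6), split_4 = c(219.1, 219.8),
  split_5 = c(98.8, 100.3)
)

# Section distances (km), copied from the split_distances output above
split_distances_demo <- c(split_1 = 4.83, split_2 = 4.19, split_3 = 5.85,
                          split_4 = 5.76, split_5 = 2.58)
split_cols <- names(split_distances_demo)

# Divide each duration column by the matching section distance (pace in s/km)
pace_demo <- durations_demo
pace_demo[split_cols] <- sweep(as.matrix(durations_demo[split_cols]), 2,
                               split_distances_demo[split_cols], "/")
print(pace_demo)
```

Indexing both the dataframe and the distance vector by `split_cols` keeps the columns and distances aligned by name rather than by position.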
8.3 Visualising the Splits Pace
To visualise the pace over each of the split sections, we can use exactly the same techniques that we used to visualise the stage pace, including pace maps and off-the-pace charts.
There are several different ways in which we might try to visualise pace. First, we can visualise absolute or rebased pace. Second, we can visualise pace within sections, using the time taken to get from one split point to the next, or across the stage as a whole, using the accumulated stage time.
8.3.1 Pace Over Each Section
One quick way of inspecting the pace over each section is to use a box plot:
section_pace_long = section_pace_wide %>%
head(10) %>%
gather(splitname, pace, split_names) %>%
merge(splits_locations[,c('splitname',
'start_dist', 'distance')],
by='splitname')
section_pace_long %>% head(3)
## splitname code pace start_dist distance
## 1 split_1 OGI 33.47826 0 4.83
## 2 split_1 EVA 33.60248 0 4.83
## 3 split_1 NEU 32.93996 0 4.83
ggplot(section_pace_long[section_pace_long$pace<40,],
aes(x=distance, y=pace)) +
geom_boxplot(aes(group=distance))
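Recalling that pace is measured in seconds per kilometre, so that lower values are faster, this suggests that the section between the first and second splits is the quickest, while the final sections, which have the highest pace values, are slower and may be more technical.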
Comparing section times against route metrics as described in Visualising Rally Stages will be the focus of a future unbook. Comparing manufacturer performance against different section and stage route types might also be worth further investigation.
8.3.2 Splits Sections Pace Maps
To generate the pace map, let’s first rebase the split times with respect to a specified driver:
example_driver = section_pace_wide[2,]$code
section_pace_wide_rebased = rebase(section_pace_wide, example_driver,
split_names, id_col='code')
section_pace_wide_rebased %>% head(3)
## code split_1 split_2 split_3 split_4 split_5
## 1 OGI -0.1242236 -0.1909308 0.06837607 -0.1215278 -0.5813953
## 2 EVA 0.0000000 0.0000000 0.00000000 0.0000000 0.0000000
## 3 NEU -0.6625259 -0.4534606 -0.23931624 -0.5034722 -0.1162791
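To make the rebasing step concrete, here is a minimal base R sketch of what a rebase operation might look like. It is an assumption based on the outputs shown in this chapter, not the actual rebase() implementation: each driver's value has the reference driver's value subtracted from it, and a `flip` argument (mirroring the one used later in this chapter) reverses the sign. The function name `rebase_sketch` and the demo dataframe are hypothetical.

```r
# Section pace (s/km) for three drivers over the first two sections,
# rebuilt from the section durations and distances shown earlier
section_pace_demo <- data.frame(
  code = c("OGI", "EVA", "NEU"),
  split_1 = c(161.7, 162.3, 159.1) / 4.83,
  split_2 = c(110.6, 111.4, 109.5) / 4.19
)

# Hypothetical stand-in for rebase(): subtract the reference driver's row
rebase_sketch <- function(df, ref_code, cols, id_col = "code", flip = FALSE) {
  is_ref <- df[[id_col]] == ref_code
  stopifnot(sum(is_ref) == 1)  # exactly one reference row is expected
  ref <- unlist(df[is_ref, cols, drop = FALSE])
  out <- df
  # Driver value minus reference value, column by column
  out[cols] <- sweep(as.matrix(df[cols]), 2, ref, "-")
  # Optionally reverse the sign (reference value minus driver value)
  if (flip) out[cols] <- -out[cols]
  out
}

rebased_demo <- rebase_sketch(section_pace_demo, "EVA",
                              c("split_1", "split_2"))
print(rebased_demo)
```

With this convention, the reference driver's row is all zeros, and a negative value indicates a driver who needed fewer seconds per kilometre than the reference driver over that section.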
To plot the pace map, we need to get the data into a long format:
section_pace_long_rebased = section_pace_wide_rebased %>%
head(10) %>%
gather(splitname, pace,
as.character(split_names)) %>%
merge(splits_locations[,c('splitname',
'start_dist', 'distance')],
by='splitname')
section_pace_long_rebased %>% head()
## splitname code pace start_dist distance
## 1 split_1 OGI -0.1242236 0 4.83
## 2 split_1 EVA 0.0000000 0 4.83
## 3 split_1 NEU -0.6625259 0 4.83
## 4 split_1 ROV -0.5590062 0 4.83
## 5 split_1 KAT -0.1863354 0 4.83
## 6 split_1 GRE 0.7246377 0 4.83
We can now view the rebased pace over the splits:
section_pace_long_rebased %>%
pace_map( xstart='start_dist',
drivers=c('KAT','ROV'),
xend='distance', id_col='code', lines=FALSE, label_dodge=2)
8.3.3 Off-the-Pace Splits Pace Mapping
To review the off-the-pace performance over the splits on a stage, we can apply the off-the-pace chart function to rebased elapsed time data.
Let’s get some rebased data using the accumulated stage time at each split. As a temporary workaround, we flip the basis of the rebase (flip=TRUE) until such time as the off-the-pace chart function is better behaved:
wide_splits_rebased = splits_wide %>%
head(10) %>%
rebase(example_driver,
splits_locations$splitname,
id_col='code', flip=TRUE)
wide_splits_rebased %>% head(3)
## code split_1 split_2 split_3 split_4 split_5
## 1 OGI 0.6 1.4 1.0 1.7 3.2
## 2 EVA 0.0 0.0 0.0 0.0 0.0
## 3 NEU 3.2 5.1 6.5 9.4 9.7
We can convert this to long form and add in distance information:
long_splits_rebased = wide_splits_rebased %>%
pivot_longer(splits_locations$splitname,
names_to = "splitname",
values_to = "sectionDurationS") %>%
merge(splits_locations[,c('splitname','distance')],
by='splitname')
long_splits_rebased %>% head(3)
## splitname code sectionDurationS distance
## 1 split_1 OGI 0.6 4.83
## 2 split_1 BRE 3.9 4.83
## 3 split_1 ROV 2.7 4.83
At the start of the chart, it’s convenient to add some zeroed values, so let’s create a dataframe to help us add those data points:
zero_df = data.frame(code=unique(long_splits_rebased$code))
zero_df$distance = 0
zero_df$sectionDurationS=0
zero_df$splitname = 'split_0'
And add them in:
long_splits_rebased = bind_rows(long_splits_rebased, zero_df)
The off-the-pace chart is intended to show how much time is lost over the course of a stage, the gradient of the slope in each section being an indicator of the pace differential within that section (i.e. between two consecutive split points).
The off-the-pace chart is most easily generated from a long dataframe containing the accumulated stage time rather than the sectional times.
For example, using the long form data created above (in which, despite its name, the sectionDurationS column holds the rebased accumulated stage time), we can co-opt the off-the-pace chart function to render the times for us:
long_splits_rebased %>%
off_the_pace_chart(dist='distance',
t='sectionDurationS',
label_typ='ggrepel',
code='code')
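The gradient interpretation of the off-the-pace chart can be checked numerically. The following sketch takes NEU's rebased accumulated times from the wide_splits_rebased output above, prepends the zeroed start point, and computes the slope of each line segment. The variable names are illustrative; the values are copied from the outputs shown earlier.

```r
# Distance into the stage (km) at the start line and at each split point
dist_km <- c(0, 4.83, 9.02, 14.87, 20.63, 23.21)

# NEU's rebased accumulated time (s) relative to EVA, using the flipped basis,
# with a zero prepended for the start line
neu_delta <- c(0, 3.2, 5.1, 6.5, 9.4, 9.7)

# Slope of each chart segment: change in time delta per km within the section
gradient <- diff(neu_delta) / diff(dist_km)
names(gradient) <- paste0("split_", 1:5)
print(round(gradient, 4))
```

Up to sign (the elapsed times were rebased with the basis flipped) and the 0.1 s resolution of the timing data, these gradients match NEU's rebased section pace values in the section_pace_wide_rebased output shown earlier, which is what we would expect if the slope within a section represents the pace differential over that section.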