Data Enrichment and Derived Tables
The basic dataframes provide a good start for analysis, but there are a few gaps in the data.
For example, the withdrawals data does not directly name the class a withdrawn vehicle belongs to; it only carries an opaque class identifier.
To find the class, we need to merge in data from the clazz dataset.
from dakar_rallydj.getter import DakarAPIClient

dakar = DakarAPIClient(
    use_cache=True,
    backend='memory',
    # cache_name='dakar_cache',
    expire_after=3600  # Cache for 1 hour
)
Enriching the Withdrawal Data
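Let’s enrich the base withdrawal dataframe by adding a column that identifies the vehicle class. To do this, we can use the team.clazz identifier in the withdrawn teams dataframe, which corresponds to the _id column of the clazz dataframe.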
withdrawals_df, withdrawn_competitors_df, withdrawn_teams_df = dakar.get_withdrawals()
withdrawn_teams_df.head()
| | team.bib | team.brand | team.model | team.vehicle | team.vehicleImg | team.clazz | team.w2rc |
|---|---|---|---|---|---|---|---|
| 0 | 202 | MINI | JCW RALLY 3.0I | X-RAID MINI JCW TEAM | https://img.aso.fr/core_app/img-motorSports-da... | 96c0869600e0013dbf5f86f60e5c4da4 | False |
| 1 | 205 | TOYOTA | HILUX IMT EVO | TOYOTA GAZOO RACING | https://img.aso.fr/core_app/img-motorSports-da... | 96c0869600e0013dbf5f86f60e5c4da4 | False |
| 2 | 206 | TOYOTA | HILUX IMT EVO | TOYOTA GAZOO RACING | https://img.aso.fr/core_app/img-motorSports-da... | 96c0869600e0013dbf5f86f60e5c4da4 | False |
| 3 | 208 | TOYOTA | HILUX | GURTAM TOYOTA GAZOO RACING BALTICS | https://img.aso.fr/core_app/img-motorSports-da... | 96c0869600e0013dbf5f86f60e5c4da4 | False |
| 4 | 213 | MD | OPTIMUS | MD RALLYE SPORT | https://img.aso.fr/core_app/img-motorSports-da... | f00d7ec8d2d96e9cf11aa515109376cf | False |
clazz_df = dakar.get_clazz()
clazz_df.head()
| | refueling | promotionalDisplay | reference | label | position | shortLabel | _bind | _id | _parent | $group | color | tinyLabel | ar | en | es | fr | category | categoryClazz |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | True | 2025-A-T1-+ | + | 4 | cat.name.A_T1_+ | allClazz-2025-A | 96c0869600e0013dbf5f86f60e5c4da4 | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | NaN | NaN | T1+: Prototype Cross-Country Cars 4x4 | T1+: Prototype Cross-Country Cars 4x4 | T1+: Prototype Cross-Country Cars 4x4 | T1+ : Voitures Tout-terrain Prototypes 4x4 | A | 2025-A-T1 |
| 1 | 0 | True | 2025-A-T1-1 | 1 | 0 | cat.name.A_T1_1 | allClazz-2025-A | f666973e89db183ecfefc75c3af8ffb1 | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | NaN | NaN | T1.1 Prototype Cross-Country Cars 4x4 | T1.1 Prototype Cross-Country Cars 4x4 | T1.1 Prototype Cross-Country Cars 4x4 | T1.1 : Voitures Tout-terrain Prototypes 4x4 | A | 2025-A-T1 |
| 2 | 0 | True | 2025-A-T1-2 | 2 | 1 | cat.name.A_T1_2 | allClazz-2025-A | f00d7ec8d2d96e9cf11aa515109376cf | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | NaN | NaN | T1.2 Prototype Cross-Country Cars 4x2 | T1.2 Prototype Cross-Country Cars 4x2 | T1.2 Prototype Cross-Country Cars 4x2 | T1.2 : Voitures Tout-terrain Prototypes 4x2 | A | 2025-A-T1 |
| 3 | 0 | True | 2025-A-T1-3 | 3 | 2 | cat.name.A_T1_3 | allClazz-2025-A | f071b5dbfd586a4ba46100196a98a9c4 | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | NaN | NaN | T1.3 FIA: النتيجة | T1.3: SCORE | T1.3 : SCORE | T1.3 : SCORE | A | 2025-A-T1 |
| 4 | 0 | True | 2025-A-T1-U | U | 3 | cat.name.A_T1_U | allClazz-2025-A | 1501ebcbaf3ad27e72aecfba7faa8037 | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | categoryGroup-2025-A:b49155b3f5670d2a907aa01e3... | NaN | NaN | T1.U: "Ultimate" Prototype Cross-Country Cars | T1.U: "Ultimate" Prototype Cross-Country Cars | T1.U: "Ultimate" Prototype Cross-Country Cars | T1.U : Voitures Tout-Terrain Prototypes "Ultim... | A | 2025-A-T1 |
import pandas as pd

# Match each team's class ID (team.clazz) against the clazz table's _id
clazz_map = pd.merge(
    withdrawn_teams_df[["team.bib", "team.clazz"]],
    clazz_df[["_id", "reference", "categoryClazz", "en"]],
    left_on="team.clazz", right_on="_id",
).drop(columns=["team.clazz", "_id"]).rename(columns={"en": "clazz_label"})
clazz_map.head()
| | team.bib | reference | categoryClazz | clazz_label |
|---|---|---|---|---|
| 0 | 202 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 |
| 1 | 205 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 |
| 2 | 206 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 |
| 3 | 208 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 |
| 4 | 213 | 2025-A-T1-2 | 2025-A-T1 | T1.2 Prototype Cross-Country Cars 4x2 |
clazz_map['reference'].unique()
array(['2025-A-T1-+', '2025-A-T1-2', '2025-A-T1-1', '2025-A-T3-1',
'2025-A-T4-SSV1', '2025-A-T4-T4', '2025-A-T5-1', '2025-A-T5-2'],
dtype=object)
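One thing to watch with the merge above: pd.merge defaults to an inner join, so a withdrawn team whose team.clazz value has no match in the clazz data would silently drop out of clazz_map. The following is a minimal sketch of how that could be checked. It uses small made-up dataframes in place of withdrawn_teams_df and clazz_df (the id_* values and bib 999 are illustrative, not taken from the API), together with the how, indicator and validate options of pd.merge.

```python
import pandas as pd

# Toy stand-ins for withdrawn_teams_df and clazz_df (values are illustrative only)
teams = pd.DataFrame({
    "team.bib": [202, 213, 999],
    "team.clazz": ["id_a", "id_b", "id_missing"],
})
clazz = pd.DataFrame({
    "_id": ["id_a", "id_b"],
    "reference": ["2025-A-T1-+", "2025-A-T1-2"],
})

# how="left" keeps every team; indicator=True adds a "_merge" column;
# validate raises a MergeError if a class _id appears more than once in clazz
checked = pd.merge(
    teams, clazz,
    left_on="team.clazz", right_on="_id",
    how="left", indicator=True, validate="many_to_one",
)

# Bib numbers of teams whose class ID has no match in the clazz table
unmatched = checked.loc[checked["_merge"] == "left_only", "team.bib"].tolist()
print(unmatched)
```

If the list is empty for the real data, the inner join used above loses nothing.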
We can also pull in further information from the groups data.
groups_df = dakar.get_groups()
groups_df.head()
| | position | shortLabel | reference | label | tinyLabel | promotionalDisplay | _bind | _origin | _id | _parent | color | ar | en | es | fr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 0 | cat.name.A_T1 | 2025-A-T1 | T1 | ULT | True | allGroups-2025 | categoryGroup-2025-A | b49155b3f5670d2a907aa01e319876b8 | category-2025:63b4f5da4591200d0a4cc239245eb03a | #EBBC4E | Ultimate | Ultimate | Ultimate | Ultimate |
| 7 | 1 | cat.name.A_T2 | 2025-A-T2 | T2 | STK | True | allGroups-2025 | categoryGroup-2025-A | 4dac064bf100bc806b91e7f2e7758297 | category-2025:63b4f5da4591200d0a4cc239245eb03a | #C7C9C7 | Stock | Stock | Stock | Stock |
| 5 | 4 | cat.name.A_T3 | 2025-A-T3 | T3 | CHG | True | allGroups-2025 | categoryGroup-2025-A | 15f329900afa29e3e6b099ae681ebe12 | category-2025:63b4f5da4591200d0a4cc239245eb03a | #E04E39 | Challenger | Challenger | Challenger | Challenger |
| 6 | 5 | cat.name.A_T4 | 2025-A-T4 | T4 | SSV | True | allGroups-2025 | categoryGroup-2025-A | 423ea731fdcba5cda62c8334985889b0 | category-2025:63b4f5da4591200d0a4cc239245eb03a | #A7C6ED | SSV | SSV | SSV | SSV |
| 9 | 6 | cat.name.A_T5 | 2025-A-T5 | T5 | TRK | True | allGroups-2025 | categoryGroup-2025-A | f1a437ac1135c9d9a5e33f5096f95259 | category-2025:63b4f5da4591200d0a4cc239245eb03a | #2D2926 | شاحنة | Truck | Camión | Camion |
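The color column holds a hex colour code for each group, which is presumably intended for display and may be handy when plotting. As a small sketch, here is one way to turn it into a lookup dictionary. The toy dataframe below copies a few values from the table above rather than calling the API.

```python
import pandas as pd

# Toy stand-in for groups_df, using values copied from the table above
groups = pd.DataFrame({
    "reference": ["2025-A-T1", "2025-A-T2", "2025-A-T3"],
    "tinyLabel": ["ULT", "STK", "CHG"],
    "color": ["#EBBC4E", "#C7C9C7", "#E04E39"],
})

# Map each group reference to its colour code
group_colors = groups.set_index("reference")["color"].to_dict()
print(group_colors["2025-A-T1"])
```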
For convenience, let’s merge some of the group information into clazz_map:
clazz_map = pd.merge(
    clazz_map,
    groups_df[["reference", "tinyLabel", "label", "color", "en"]].rename(
        columns={"reference": "categoryClazz"}),
    on="categoryClazz",
).rename(columns={"en": "group_label"})

# Sort by bib number and reset the index.
# Open question: should we keep the original index instead,
# so that we can readily merge back on it?
clazz_map.sort_values("team.bib", inplace=True)
clazz_map.reset_index(drop=True, inplace=True)
clazz_map.head()
| | team.bib | reference | categoryClazz | clazz_label | tinyLabel | label | color | group_label |
|---|---|---|---|---|---|---|---|---|
| 0 | 202 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 | ULT | T1 | #EBBC4E | Ultimate |
| 1 | 205 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 | ULT | T1 | #EBBC4E | Ultimate |
| 2 | 206 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 | ULT | T1 | #EBBC4E | Ultimate |
| 3 | 208 | 2025-A-T1-+ | 2025-A-T1 | T1+: Prototype Cross-Country Cars 4x4 | ULT | T1 | #EBBC4E | Ultimate |
| 4 | 213 | 2025-A-T1-2 | 2025-A-T1 | T1.2 Prototype Cross-Country Cars 4x2 | ULT | T1 | #EBBC4E | Ultimate |
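The two merges could be wrapped into a single reusable function. What follows is only a sketch: build_clazz_map is a hypothetical helper name, not part of dakar_rallydj, and the toy dataframes stand in for the real API results (column names follow the tables above; the id_* values are placeholders). Unlike the cells above, it uses left joins so that teams without a matching class are kept rather than dropped.

```python
import pandas as pd

def build_clazz_map(teams_df, clazz_df, groups_df):
    """Hypothetical helper: map each team bib to its class and group labels."""
    # Step 1: look up the class record for each team via its class ID
    out = pd.merge(
        teams_df[["team.bib", "team.clazz"]],
        clazz_df[["_id", "reference", "categoryClazz", "en"]],
        left_on="team.clazz", right_on="_id", how="left",
    ).drop(columns=["team.clazz", "_id"]).rename(columns={"en": "clazz_label"})
    # Step 2: add group-level details, keyed on the group reference
    groups = groups_df[["reference", "tinyLabel", "label", "color", "en"]].rename(
        columns={"reference": "categoryClazz", "en": "group_label"})
    out = pd.merge(out, groups, on="categoryClazz", how="left")
    # Sort by bib number and renumber the rows from zero
    return out.sort_values("team.bib", ignore_index=True)

# Toy inputs (illustrative values only)
teams = pd.DataFrame({"team.bib": [213, 202], "team.clazz": ["id_b", "id_a"]})
clazz = pd.DataFrame({
    "_id": ["id_a", "id_b"],
    "reference": ["2025-A-T1-+", "2025-A-T1-2"],
    "categoryClazz": ["2025-A-T1", "2025-A-T1"],
    "en": ["T1+: Prototype Cross-Country Cars 4x4",
           "T1.2 Prototype Cross-Country Cars 4x2"],
})
groups = pd.DataFrame({
    "reference": ["2025-A-T1"], "tinyLabel": ["ULT"], "label": ["T1"],
    "color": ["#EBBC4E"], "en": ["Ultimate"],
})

toy_map = build_clazz_map(teams, clazz, groups)
# Count teams per group
print(toy_map["group_label"].value_counts())
```

With the real data, the same call would be build_clazz_map(withdrawn_teams_df, clazz_df, groups_df). Because later joins would key on team.bib rather than on row position, renumbering the index here should not get in the way of merging the result back onto other tables.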